13

Bilibili / YouTube horizontal thumbnails

⏱️ 20 min

Long-video platforms play by the opposite thumbnail rules from short-video.

Short-video viewers decide in 1 second: swipe or stop. So thumbnails need to be in-your-face.

Long-video viewers take 3-5 seconds to decide whether to click. So thumbnails need to "tell the story + trigger curiosity."

And the home feed is way more competitive: YouTube's homepage shows 30+ videos at once, Bilibili's recommendation feed shows 20+, and each one has under 0.5 seconds in front of your eyes.

This chapter gives you the full prompt templates for Bilibili and YouTube horizontal thumbnails (16:9 / 16:10), CTR-optimized color psychology, the 3-second recognition rule, plus the tonal differences between Bilibili's anime aesthetic and YouTube's business-style covers.


1. Key Dimensions + File Size Limits

| Platform | Ratio | Recommended Pixels | File Size Limit | Notes |
|---|---|---|---|---|
| Bilibili | 16:10 / 16:9 | 1146×717 / 1280×720 | < 5MB | Bottom-left gets covered by category tag |
| YouTube | 16:9 | 1920×1080 | < 2MB | Bottom-right gets covered by duration overlay |
| YouTube Shorts list | 9:16 | 1080×1920 | < 2MB | Vertical also supported (but 16:9 is still mainstream) |

Warning: YouTube has a hard 2MB limit. Anything bigger gets crushed into a blurry mess, so compress every generated image with TinyPNG or Squoosh before you upload.
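A quick pre-upload check can catch a wrong size or an oversized file before the platform does. The sketch below hard-codes the limits from the table in this section. The `LIMITS` dictionary, the `check_thumbnail` name, and the reading of "MB" as 1,000,000 bytes are illustrative assumptions, not anything the platforms publish as an API.

```python
# Limits copied from the table in this section.
# Assumption: "MB" is read as 1,000,000 bytes; use 1024 * 1024 if you prefer.
LIMITS = {
    "bilibili": {"sizes": {(1146, 717), (1280, 720)}, "max_bytes": 5_000_000},
    "youtube": {"sizes": {(1920, 1080)}, "max_bytes": 2_000_000},
    "youtube_shorts": {"sizes": {(1080, 1920)}, "max_bytes": 2_000_000},
}


def check_thumbnail(platform: str, width: int, height: int, size_bytes: int) -> list[str]:
    """Return human-readable problems; an empty list means the file passes."""
    rule = LIMITS[platform]
    problems = []
    if (width, height) not in rule["sizes"]:
        expected = ", ".join(f"{w}x{h}" for w, h in sorted(rule["sizes"]))
        problems.append(f"{width}x{height} is not a recommended size ({expected})")
    if size_bytes >= rule["max_bytes"]:
        problems.append(
            f"{size_bytes} bytes is not under the {rule['max_bytes']}-byte limit"
        )
    return problems
```

Calling `check_thumbnail("youtube", 1920, 1080, 3_500_000)` returns a single size-limit problem, which is the signal to compress before uploading.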


2. CTR-Optimized Color Psychology

Color combinations top YouTube creators have actually A/B tested for high CTR:

| Color Combo | CTR Data | Best For | gpt-image-2 Keywords |
|---|---|---|---|
| Red + Yellow + Black | High (+30%) | Tutorials / Reviews / News / Drama | high contrast, red yellow black, news thumbnail style |
| Blue + Orange | Medium-High (+22%) | Tech / Education / Business | tech thumbnail, vibrant blue + orange complementary |
| Purple + Cyan | Medium (+15%) | Gaming / Anime / Virtual | neon purple + cyan, gamer aesthetic |
| White + Single Color | Low (baseline) | Minimal / Thoughtful / Premium | minimal flat thumbnail, single hue accent |

Bottom line: Unless you're deliberately going for the "premium niche" vibe, mainstream creators use the first two combos (red-yellow-black or blue-orange).
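If you generate thumbnails for several channels, the color table can live in a small lookup so the right keywords get pasted into each prompt. This is a minimal sketch: the keyword strings are copied from the table, while the `PALETTES` structure and the `palette_keywords` helper are hypothetical names chosen for illustration.

```python
# Keyword strings copied from the color table; category keys are the
# lowercased "Best For" entries. The structure itself is an assumption.
PALETTES = [
    ({"tutorials", "reviews", "news", "drama"},
     "high contrast, red yellow black, news thumbnail style"),
    ({"tech", "education", "business"},
     "tech thumbnail, vibrant blue + orange complementary"),
    ({"gaming", "anime", "virtual"},
     "neon purple + cyan, gamer aesthetic"),
    ({"minimal", "thoughtful", "premium"},
     "minimal flat thumbnail, single hue accent"),
]


def palette_keywords(category: str) -> str:
    """Look up the prompt keywords for a video category (case-insensitive)."""
    key = category.strip().lower()
    for categories, keywords in PALETTES:
        if key in categories:
            return keywords
    raise KeyError(f"no palette listed for category {category!r}")
```

For example, `palette_keywords("Tech")` returns the blue + orange keyword string, ready to append to the Style paragraph of a prompt.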


3. YouTube Horizontal Prompt Template

The most-used MrBeast-style template (the high-CTR formula):

Horizontal video thumbnail, 16:9, 1920×1080.

Left half (60%): a developer in a hoodie pointing
dramatically at the right side, exaggerated surprised face,
eyebrows raised, mouth open in shock,
saturated color grade with slight HDR pop.

Right half (40%): three big text blocks vertically stacked,
each text outlined in thick black for readability:
- Top huge red text (with yellow outline): "I QUIT"
- Middle yellow text: "Cursor for"
- Bottom white text: "30 Days"

Background: gradient from saturated red (left) to deep navy (right),
subtle motion lines for energy.

Style: MrBeast-style YouTube thumbnail, vibrant saturated colors,
dramatic facial expression, high contrast text outlined in black,
slight HDR effect, attention-grabbing first-frame.

Exact text only. Bottom-right 200×100 area must be clean
(YouTube duration overlay area).
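Since the usual workflow is to keep the layout and swap only the subject and the three text lines, the template can be parameterized. The sketch below condenses the prompt above into a format string. `YOUTUBE_TEMPLATE` and `build_youtube_prompt` are hypothetical names, and the function only builds the prompt text; it does not call any image API.

```python
# Condensed from the prompt template above; only the subject and the three
# text lines are parameterized. Names here are illustrative assumptions.
YOUTUBE_TEMPLATE = """Horizontal video thumbnail, 16:9, 1920×1080.

Left half (60%): {subject},
saturated color grade with slight HDR pop.

Right half (40%): three big text blocks vertically stacked,
each text outlined in thick black for readability:
- Top huge red text (with yellow outline): "{top}"
- Middle yellow text: "{middle}"
- Bottom white text: "{bottom}"

Background: gradient from saturated red (left) to deep navy (right),
subtle motion lines for energy.

Style: MrBeast-style YouTube thumbnail, vibrant saturated colors,
dramatic facial expression, high contrast text outlined in black,
slight HDR effect, attention-grabbing first-frame.

Exact text only. Bottom-right 200×100 area must be clean
(YouTube duration overlay area)."""


def build_youtube_prompt(subject: str, top: str, middle: str, bottom: str) -> str:
    """Fill the template; reject text containing double quotes, which would
    break the quoted "exact text" markers in the prompt."""
    for text in (top, middle, bottom):
        if '"' in text:
            raise ValueError(f"text must not contain double quotes: {text!r}")
    return YOUTUBE_TEMPLATE.format(subject=subject, top=top, middle=middle, bottom=bottom)
```

A call such as `build_youtube_prompt("a chef holding a whisk, exaggerated surprised face", "I QUIT", "Takeout for", "30 Days")` yields a prompt with the same layout and a new subject and headline.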

Three YouTube CTR Formulas

  1. Big-face close-up + exaggerated expression: takes up 50%+ of the frame, eyebrows up, mouth open
  2. 3-5 word title + strong contrast: skip long sentences, 3-5 words read fastest
  3. Numbers / contrast / reversal phrases: "I QUIT" / "$50,000" / "I tried for 30 days" hit 5x harder than "Tutorial Sharing"

4. Bilibili Anime Prompt Template

The Bilibili "kichiku / anime / meme" route:

Bilibili-style horizontal thumbnail, 16:9, 1280×720.

Subject: anime-style chibi character (big head, small body)
holding a laptop, exaggerated surprised expression with sparkle eyes,
small star elements floating around,
overall cute kawaii feel.

Background: pastel pink + light blue gradient,
with floating sparkles + small stars + small heart shapes,
slightly noisy texture for dimension.

Top huge bold Chinese text (with strong drop shadow,
yellow text outlined in white-then-pink):
"AI 写代码,我直接 emo 了"

Style: bilibili anime aesthetic, kawaii cute,
soft pastel palette with neon accent,
slight glow on text, attention-grabbing for young audience.

Bottom-left 200×80 area must be clean
(Bilibili category tag overlay area).

Bilibili Anime vs. Knowledge Category Tone

Bilibili isn't only anime. The knowledge category, food category, and finance category each have their own visual language.

| Bilibili Category | Tone | gpt-image-2 Keywords |
|---|---|---|
| Knowledge / Tech | Editorial + dense info | editorial tech, data-driven, infographic style |
| Food | Warm + close-up ingredients | warm food photography, close-up, golden hour |
| Finance | Business + data | business editorial, charts overlay, navy + gold |
| Anime / Kichiku | Cartoon + high contrast | anime aesthetic, kawaii, vibrant pastel |

5. The 3-Second Recognition Rule

A thumbnail that gets clicked lands 3 elements within 3 seconds:

  1. The headline (your hook) — readable in 1 second
  2. The subject (person / object) — recognizable in 2 seconds ("what's this about")
  3. Strong contrast color blocks — the "should I click" decision lands by second 3

Test method: shrink your finished thumbnail to feed size (240×135, or 168×94 for YouTube's smallest recommendation slots) and check whether you can name all 3 elements in 3 seconds. Can't? Redo it.
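The shrink test is easy to script so that every candidate gets a feed-size preview. This is a minimal sketch assuming the third-party Pillow package is installed; `fit_within` and `save_preview` are hypothetical helper names, and the default box is the 240×135 size mentioned in the test method.

```python
def fit_within(width: int, height: int, box_w: int, box_h: int) -> tuple[int, int]:
    """Scale (width, height) to fit inside the box, preserving aspect ratio."""
    scale = min(box_w / width, box_h / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def save_preview(src_path: str, dst_path: str,
                 box: tuple[int, int] = (240, 135)) -> tuple[int, int]:
    """Write a feed-size copy of the thumbnail and return its pixel size."""
    # Pillow is third-party (pip install Pillow); it is imported lazily so
    # fit_within stays usable without it.
    from PIL import Image

    with Image.open(src_path) as img:
        size = fit_within(img.width, img.height, *box)
        img.resize(size, Image.Resampling.LANCZOS).save(dst_path)
    return size
```

Open the saved preview at 100% zoom rather than letting your image viewer enlarge it; otherwise the test is easier than the real feed.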


6. Real-World Failures

Failure 1: YouTube file over 2MB

ChatGPT raw outputs are often 3-4MB. Upload to YouTube and it gets crushed into a blurry image — text edges get JPEG artifacts.

Fix:

  • After generation, run it through TinyPNG / Squoosh and get it under 1.8MB
  • For complex backgrounds, drop the saturation a bit (high saturation = bigger files)
  • Or generate at medium quality from the start ($0.053/image vs $0.211/image)
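TinyPNG and Squoosh are manual tools. If you would rather script the compression step, one option is to search for the highest JPEG quality that still fits the budget. The sketch below is one way to do that, not the tools' own method: it assumes encoded size does not shrink as quality rises, reads 1.8MB as 1,800,000 bytes, and uses hypothetical helper names. The `jpeg_encoder` helper relies on the third-party Pillow package.

```python
import io
from typing import Callable, Optional


def highest_quality_under(
    encode: Callable[[int], bytes], limit_bytes: int, lo: int = 40, hi: int = 95
) -> Optional[int]:
    """Binary-search the highest quality in [lo, hi] whose output fits.

    Assumes encoded size is non-decreasing in quality. Returns None if even
    the lowest quality is too large.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if len(encode(mid)) <= limit_bytes:
            best = mid       # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1     # too big: try a lower quality
    return best


def jpeg_encoder(path: str) -> Callable[[int], bytes]:
    """Build an encode(quality) callback backed by Pillow (imported lazily)."""
    from PIL import Image

    img = Image.open(path).convert("RGB")  # JPEG has no alpha channel

    def encode(quality: int) -> bytes:
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        return buf.getvalue()

    return encode
```

Usage would look like `highest_quality_under(jpeg_encoder("thumb.png"), 1_800_000)`. A `None` result means quality alone cannot hit the budget, so simplify the background or lower the resolution instead.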

Failure 2: Bilibili thumbnail text blocked by category tag

Bilibili automatically overlays a category tag in the bottom-left (like "Tech / Knowledge"), which covers your key text.

Fix: The bottom-left 200×80 area is off-limits for key text or the subject's face. Background textures and decorative elements are fine to put there.

Failure 3: YouTube duration covers your text

YouTube automatically overlays the video duration in the bottom-right, blocking text in that corner.

Fix: Leave the bottom-right 200×100 area clear. Don't put your CTA or title there.
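If you composite text yourself, or know roughly where the model placed it, the reserved corners from Failures 2 and 3 can be checked with a rectangle overlap test. This is a minimal sketch: `Rect`, `reserved_zone`, and `overlaps` are hypothetical names, coordinates use a top-left origin, and the 200×100 and 200×80 zones are assumed to be measured on the same canvas you pass in.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int  # left edge, pixels from the left of the canvas
    y: int  # top edge, pixels from the top of the canvas
    w: int
    h: int


def reserved_zone(platform: str, canvas_w: int, canvas_h: int) -> Rect:
    """Corner that the platform covers with its own overlay."""
    if platform == "youtube":    # duration badge, bottom-right 200×100
        return Rect(canvas_w - 200, canvas_h - 100, 200, 100)
    if platform == "bilibili":   # category tag, bottom-left 200×80
        return Rect(0, canvas_h - 80, 200, 80)
    raise ValueError(f"unknown platform: {platform!r}")


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)
```

For a 1920×1080 YouTube canvas, a title box at `Rect(1500, 900, 300, 120)` overlaps the reserved zone and should be moved, while `Rect(1100, 100, 700, 600)` is safe.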

Failure 4: Thumbnail too small to read

Your cover looks great at 480×270 on desktop, but YouTube's recommendation feed shrinks it to 168×94 and the text turns into mush.

Fix:

  • Type sizes need to be really big (taking up 1/3 of the frame)
  • Stroke + drop shadow on type for readability
  • After you generate, shrink it down to 168×94 yourself and look — that's the actual YouTube thumbnail size
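The arithmetic behind the "really big type" advice is worth seeing once. The sketch below scales a text height from the full canvas down to the feed preview. It assumes "1/3 of the frame" means one third of the frame height, and `preview_text_height` is a hypothetical helper name.

```python
def preview_text_height(text_px: float, canvas_h: int, preview_h: int) -> float:
    """Height in pixels that text of `text_px` on the canvas has in the preview."""
    return text_px * preview_h / canvas_h


# Text at one third of a 1080-pixel-tall canvas, shown at 94 pixels tall:
big = preview_text_height(360, 1080, 94)     # about 31 px

# "Normal" 120 px headline text on the same canvas:
small = preview_text_height(120, 1080, 94)   # about 10 px
```

Text that looks generous at full size ends up only about 10 pixels tall in the feed, while type at one third of the frame keeps roughly 31 pixels, which is why the oversized headline survives the shrink.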

7. What We've Seen

One AI-tutorial creator we coach ran A/B tests on YouTube thumbnails for 4 months:

| Thumbnail Style | Avg CTR | Monthly Sub Growth |
|---|---|---|
| Minimal + single color + small text | 3.2% | +120 |
| Red-yellow-black + big text + facial expression | 6.8% | +480 |
| MrBeast-style + numerical contrast | 9.4% | +820 |

Bottom line: The "restrained" YouTube thumbnail aesthetic is a creator's romantic ideal, but CTR is the platform algorithm's hard currency.
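To make the gap concrete, the figures can be expressed relative to the minimal style. The sketch below only does that division on the numbers from this one creator's tests. `RESULTS` and `relative_to_baseline` are illustrative names.

```python
# (average CTR in percent, monthly subscriber growth) from the A/B tests above.
RESULTS = {
    "Minimal + single color + small text": (3.2, 120),
    "Red-yellow-black + big text + facial expression": (6.8, 480),
    "MrBeast-style + numerical contrast": (9.4, 820),
}


def relative_to_baseline(results: dict, baseline: str) -> dict:
    """Return (CTR multiple, sub-growth multiple) versus the baseline style."""
    base_ctr, base_subs = results[baseline]
    return {
        name: (ctr / base_ctr, subs / base_subs)
        for name, (ctr, subs) in results.items()
    }


multiples = relative_to_baseline(RESULTS, "Minimal + single color + small text")
```

The red-yellow-black style comes out at roughly 2.1x the CTR and 4x the subscriber growth of the minimal style, and the MrBeast style at roughly 2.9x the CTR and about 6.8x the growth. Subscriber growth rises faster than CTR in this sample.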

But Bilibili doesn't work the same way. On Bilibili, minimal knowledge-category thumbnails actually beat kichiku-style ones on CTR (people come to Bilibili for depth). Two platforms, two playbooks.


8. Next Steps

The next chapter moves into English-speaking social media (Ch 14 LinkedIn), with yet another tone — restrained, business, one-line tagline — that's nothing like YouTube or Bilibili.

If you're rushing to make a YouTube thumbnail right now:

  1. Decide MrBeast-style or minimalist (depends on your category)
  2. Grab the §3 prompt template, swap out the subject and text
  3. Generate 8 in one go, pick 1
  4. Shrink to 168×94 and stare for 3 seconds — if you can identify 3 things, you're good
  5. Compress to < 2MB (YouTube) or < 5MB (Bilibili) before uploading

The hidden game-decider for long-video thumbnails is this: getting found ≠ getting clicked. SEO drops your video into the recommendation feed, but the thumbnail decides the CTR.


Long-Video Thumbnail Creative Examples

From awesome-gpt-image (CC BY 4.0). Two real cases that illustrate how long-video thumbnails are won by "contrast + hook," not "aesthetics."

Case 1: YouTube Time-Travel Thumbnail (High-CTR Contrast Formula)

[Images: Frame 1, Frame 2, Frame 3]

Prompt:

Screenshot of a YouTube video showing someone who time-traveled to the Middle Ages

The model produces 3 frames of YouTube-style video screenshots in one shot ("time-traveler in the Middle Ages" theme), with consistent tone, UI, and emotion. Using contrast (modern person + ancient setting) on a thumbnail gets 1.5-2x the CTR of standard story thumbnails. This "time-travel / contrast / temporal mismatch" pattern is a battle-tested high-click formula on YouTube.

Creator: @flowersslop · Curated by: awesome-gpt-image

Case 2: Game Stream Screenshot (Bilibili Esports Tone)

League of Legends Mid Lane Screenshot

Prompt:

Help me generate a screenshot of Trump versus Khamenei in the mid lane in League of Legends

Drop real-world figures into a game scene ("Trump vs. Khamenei mid-lane brawl"). This kind of meme thumbnail is the traffic cheat code for Bilibili's esports and gaming categories — pair it with caption text that mixes game terminology with political figure names, and CTR blows past standard esports thumbnails.

Creator: @underwoodxie96 · Curated by: awesome-gpt-image

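❓ FAQ

The most frequently searched questions about this chapter's topic.

What are the thumbnail dimensions for Bilibili and YouTube?

Bilibili: 16:10 (1146×717) or 16:9 (1280×720), file < 5MB. YouTube: 16:9 (1920×1080), file < 2MB (a hard limit; anything larger gets compressed into a blur).

Which color combos get high CTR on YouTube?

Red + yellow + black (tutorials / reviews, +30% CTR) / blue + orange (tech / education, +22%) / purple + cyan (gaming, +15%) / white + single color (minimal, baseline). Mainstream creators use the first two.

How do I write a Bilibili anime thumbnail prompt?

Keywords: bilibili anime aesthetic + chibi character (big head, small body) + pastel palette + sparkles + kawaii cute. Leave the bottom-left 200×80 area clear (it is where Bilibili overlays the category tag).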