Xiaohongshu covers & carousels at scale

⏱️ 30 min

Anyone who's been creating on Xiaohongshu (RED) for a year shares the same nightmare: covers.

Copy takes 30 minutes. Covers take 30 minutes to an hour — hunting down photos, cutting backgrounds, adding text, tweaking colors, redoing it 7 times. Need 5 carousel images for a paid push? That's two hours plus. Three posts a week? Half a day gone.

After gpt-image-2 dropped, this whole thing collapses to 5-6 minutes per image. There are three reasons, and each one is something Midjourney / Flux just can't do:

Chinese title text rendered directly — no Photoshop pass to add text afterwards
Multi-image style locking — one prompt set produces 5 carousel images with consistent vibe
3:4 vertical 1242×1660, 8 outputs at once — pick the one that looks right

This chapter is about turning "making a Xiaohongshu cover" into a 5-minute job. It covers platform aesthetic patterns, the Chinese title prompt formula, the workflow for cranking out 30 same-style images from one prompt, and real failure stories.

1. Xiaohongshu covers ≠ regular posters

A lot of people drag their Midjourney-era "premium design" mindset into Xiaohongshu, ship something, and watch the numbers tank.

Reason: the platform aesthetic plays by different rules.

Dimension	Regular poster / WeChat MP header	Xiaohongshu cover
Composition	Horizontal, lots of whitespace, sparse info	Vertical 3:4, dense info, full frame
Font size	Headline takes 1/8 of image	Headline starts at 1/3 of image, "in-your-face"
Subject	Abstract / conceptual	Real people / objects / concrete scenes
Color	Neutral grays, Morandi tones	High saturation or soft pinks, avoid neutral grays
Text	Beauty first	Pain point first ("Must-read before 30")

These aren't aesthetic preferences. They're hard constraints because a phone scroller decides in 0.3 seconds whether to keep swiping or stop. Small text on cover = nobody reads; tasteful and refined = looks like an ad = swipe past.

Your prompt has to bake these patterns in.

Style type	Fits which topics	gpt-image-2 keywords
Real-person aesthetic	Personal growth / how-to / experience posts	photorealistic, soft daylight, candid, real-person aesthetic
Vintage hand-drawn	Food / travel / lifestyle	hand-drawn watercolor, illustrated, warm pastel palette
Glitch art	Tech / gadgets / job hunting	y2k aesthetic, glitch type, vibrant gradient, chrome accent
Minimal text	Reflection / notes / knowledge	minimal flat design, single bold typography, monochrome
Magazine editorial	Beauty / fashion / outfits	editorial fashion, magazine cover style, dramatic lighting
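If you keep these keywords in code rather than retyping them, a small lookup is enough. This is a minimal sketch: the keyword strings are copied from the table above, while the dict keys and the `style_line` helper are hypothetical names made up for illustration.

```python
# Keyword strings come from the style table; the keys are hypothetical labels.
STYLE_KEYWORDS = {
    "real_person": "photorealistic, soft daylight, candid, real-person aesthetic",
    "vintage_hand_drawn": "hand-drawn watercolor, illustrated, warm pastel palette",
    "glitch_art": "y2k aesthetic, glitch type, vibrant gradient, chrome accent",
    "minimal_text": "minimal flat design, single bold typography, monochrome",
    "magazine_editorial": "editorial fashion, magazine cover style, dramatic lighting",
}


def style_line(style: str) -> str:
    """Return the style line of a cover prompt for one of the five styles."""
    if style not in STYLE_KEYWORDS:
        raise KeyError(f"unknown style {style!r}; choose from {sorted(STYLE_KEYWORDS)}")
    return f"Style: {STYLE_KEYWORDS[style]}."


print(style_line("glitch_art"))
```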

2. Title text prompt formula (99% Chinese rendering)

This is the biggest gpt-image-2 vs Midjourney differentiator. In the MJ era, every Xiaohongshu creator had to manually add text in Photoshop because MJ got Chinese characters wrong almost every time. gpt-image-2 gets Chinese text right about 99% of the time once the prompt is tuned (our own 4-week run in §5 measured 97%), which means the prompt produces a usable cover directly.

But to get it to behave, you need to hit three conditions:

1. Wrap literal text in double quotes
Don't write "the headline is 30 天学会 ChatGPT". Write Headline: "30 天学会 ChatGPT".
The double quotes tell the model "render this part exactly as written".

2. Use role hints to control hierarchy
Use English role words like headline / subhead / footer / caption.
gpt-image-2 reads these to decide font size, weight, and position. Skip the role hint and the model guesses.

3. Explicit position + color + typography style
"directly below headline, centered horizontally" beats "below the headline" every time.

Full prompt template

Vertical 3:4 social media cover, 1242×1660.

[场景] A young Chinese woman sitting in a Sydney café,
bright daylight, candid laptop session, blurred coffee cup foreground.

[文字层]
Headline (top center, large bold, white with subtle shadow): "30 天学会 ChatGPT"
Subhead (directly below, smaller, soft yellow): "保姆级实操指南"

[风格] warm autumn palette, slight vintage film grain, real-person aesthetic.

[约束] Exact Chinese text only. No extra words. No duplicate copy.

Write the four pieces in this order (scene [场景], text layer [文字层], style [风格], constraints [约束]) and first-pass accuracy is 90%+; one tweak round pushes it to about 99%.
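The three conditions and the four-piece order can be sketched as a small prompt builder. This is an illustration under assumptions, not an official tool: `TextLayer` and `build_cover_prompt` are hypothetical names, and the output is just a string you would paste into ChatGPT.

```python
from dataclasses import dataclass


@dataclass
class TextLayer:
    role: str    # English role hint: "Headline", "Subhead", "Footer", "Caption"
    layout: str  # explicit position + size + color
    text: str    # literal copy to render, without surrounding quotes

    def render(self) -> str:
        # Double quotes mark the copy as literal text to render exactly.
        return f'{self.role} ({self.layout}): "{self.text}"'


def build_cover_prompt(scene: str, layers: list[TextLayer], style: str,
                       size: str = "1242×1660") -> str:
    """Assemble the pieces in template order: scene, text, style, constraints."""
    text_block = "\n".join(layer.render() for layer in layers)
    return "\n\n".join([
        f"Vertical 3:4 social media cover, {size}.",
        f"[场景] {scene}",
        f"[文字层]\n{text_block}",
        f"[风格] {style}",
        "[约束] Exact Chinese text only. No extra words. No duplicate copy.",
    ])


prompt = build_cover_prompt(
    scene="A young Chinese woman sitting in a Sydney café, bright daylight.",
    layers=[
        TextLayer("Headline", "top center, large bold, white with subtle shadow",
                  "30 天学会 ChatGPT"),
        TextLayer("Subhead", "directly below headline, centered horizontally, "
                  "smaller, soft yellow", "保姆级实操指南"),
    ],
    style="warm autumn palette, slight vintage film grain, real-person aesthetic.",
)
print(prompt)
```

Keeping the literal copy in its own field means the quoting rule is applied in one place, so a headline can't accidentally end up unquoted.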

3. Cranking out 30 same-style images from one prompt

What Xiaohongshu creators actually need isn't 1 image — it's a series of same-style images.

Concrete example:

Monday you've planned a "30-day AI tools challenge" series, one post a day, 30 posts = 30 covers.
Every cover has to feel consistent or your whole account looks scattered.
Old way (Canva / MJ + PS): 30 images × 30 minutes = 15 hours.
gpt-image-2 multi-turn editing: 30 images × 90 seconds ≈ 45 minutes.

The workflow has two steps.

Image 1: full prompt to set the tone

Write a thorough prompt (using the §2 template above), generate 4 options, pick 1. This first image is your "style anchor".

Images 2-30: reference + short instructions

Don't rewrite the whole prompt. Just tell ChatGPT:

Same style as previous, change to:

Day 2 — A laptop on a wooden desk with morning sunlight,
no person, focus on screen showing Cursor IDE.
Headline: "Day 2 · Cursor 入门"
Subhead: "5 分钟出第一个项目"

The model holds onto the color tone, typography, and vibe — it just swaps the subject and text. Generation runs 3-4× faster than the first image.
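For a 30-day series, the short follow-up instruction can be generated from a content plan. A minimal sketch, assuming you paste each message into the same chat that holds the anchor image; `follow_up` is a hypothetical helper, and the "Day N:" separator is a small formatting choice of this sketch.

```python
def follow_up(day: int, scene: str, headline: str, subhead: str) -> str:
    """Short instruction for images 2-30; relies on the anchor image in the same chat."""
    return (
        "Same style as previous, change to:\n\n"
        f"Day {day}: {scene}\n"
        f'Headline: "{headline}"\n'
        f'Subhead: "{subhead}"'
    )


message = follow_up(
    day=2,
    scene=("A laptop on a wooden desk with morning sunlight, "
           "no person, focus on screen showing Cursor IDE."),
    headline="Day 2 · Cursor 入门",
    subhead="5 分钟出第一个项目",
)
print(message)
```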

Practical rhythm:

Sunday: spend 1 hour anchoring (tweaking the prompt until it looks right)
Monday through Sunday: 90 seconds per image, 1 per day
Monthly output: about 30 covers per series at this pace, so pushing past 90 covers means running several series in parallel

4. Failure log (don't skip this part)

Failure 1: title text came out as Japanese

First time I asked gpt-image-2 for a "30 天学会 ChatGPT" cover, the headline came back with the Japanese hiragana "だ" mixed in.

Tracked it down: the prompt had "Japanese aesthetic" as a style word, and the model read that as "include Japanese characters".

Fix: change the style word to "Chinese minimalist" or "editorial Asian aesthetic". Or just skip country-level style words and only describe elements (lighting, color palette).
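One way to catch this before generating is a quick lint pass over the prompt. This is a rough sketch: the word list is a hypothetical starting point, not an exhaustive one, and quoted literal text is skipped so headline copy is never flagged.

```python
import re

# Hypothetical word list; extend it with whatever country-level terms you use.
COUNTRY_STYLE_WORDS = ("japanese", "korean", "french", "nordic", "american")


def find_country_style_words(prompt: str) -> list[str]:
    """Return country-level style words found outside double-quoted literal text."""
    # Drop quoted spans first so literal headline copy is never flagged.
    unquoted = re.sub(r'"[^"]*"', "", prompt).lower()
    return [word for word in COUNTRY_STYLE_WORDS
            if re.search(rf"\b{word}\b", unquoted)]


risky = 'Japanese aesthetic, soft light. Headline: "30 天学会 ChatGPT"'
print(find_country_style_words(risky))
```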

Failure 2: subhead position drifted

Asked the model to put the subhead under the headline. Out of 8 images, 3 had the subhead floating to the bottom right.

Fix: position words have to be explicit. "directly below the headline, centered horizontally" beats "below the headline". You can also add "aligned with headline".

Failure 3: 3:4 ratio came out as 1:1

On the ChatGPT web app, prompts with "vertical 3:4" sometimes return a 1:1 square.

Fix:

Explicit pixel size: "1242×1660" is more reliable than "3:4"
Stack the hints: "vertical poster, 3:4 ratio, portrait orientation, 1242 by 1660 pixels"
Hit the API: pass size directly and you can't get this wrong
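The stacked hints are easy to generate and sanity-check in code. A minimal sketch covering only the prompt side: `size_hints` is a hypothetical helper, and the 0.01 tolerance is an assumption, needed because 1242×1660 is close to 3:4 but not exactly 3:4. The API route is not shown here, since the accepted `size` values depend on the API you call.

```python
def size_hints(width: int, height: int, ratio: str = "3:4",
               tolerance: float = 0.01) -> str:
    """Return stacked size hints for a portrait cover, or raise if inputs disagree."""
    if height <= width:
        raise ValueError("expected a portrait size (height > width)")
    ratio_w, ratio_h = (int(part) for part in ratio.split(":"))
    # Guard against pairing a pixel size with a ratio it does not match.
    if abs(width / height - ratio_w / ratio_h) > tolerance:
        raise ValueError(f"{width}x{height} is not close to {ratio}")
    return (f"vertical poster, {ratio} ratio, portrait orientation, "
            f"{width} by {height} pixels")


print(size_hints(1242, 1660))
```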

Failure 4: carousel image 5 drifts in style

A 5-image carousel: first 3 stay on tone, then images 4-5 quietly shift colors.

Fix: reset the full style block every 3 images instead of leaning on reference the whole way. Treat the first 3 images as one batch, then re-paste the prompt anchor for image 4.
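The reset rhythm can be sketched as a prompt planner. This is an illustration only: `carousel_prompts` is a hypothetical helper, and the style block and shot descriptions are whatever you settled on for your anchor.

```python
def carousel_prompts(style_block: str, shots: list[str],
                     reset_every: int = 3) -> list[str]:
    """Build one prompt per image, re-pasting the full style block at each reset."""
    prompts = []
    for index, shot in enumerate(shots):
        if index % reset_every == 0:
            # Images 1, 4, 7, ... restate the anchor instead of relying on reference.
            prompts.append(f"{style_block}\n\n{shot}")
        else:
            prompts.append(f"Same style as previous, change to:\n\n{shot}")
    return prompts


plan = carousel_prompts(
    style_block="[风格] warm autumn palette, slight vintage film grain.",
    shots=["Shot 1", "Shot 2", "Shot 3", "Shot 4", "Shot 5"],
)
for number, text in enumerate(plan, start=1):
    print(number, "anchor" if text.startswith("[风格]") else "reference")
```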

5. What we found

The JR Academy Xiaohongshu account @JR 匠人学院 ran three workflows in parallel for 4 weeks:

Workflow	Time per cover	Consistency	Monthly cover output	Chinese accuracy
Canva templates	25 min	Medium	~80	100% (typed manually)
Midjourney + PS	35 min	Low	~40	100% (typed manually)
gpt-image-2 + tweaking	6 min	High	~280	97%

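gpt-image-2 hits 3.5× the monthly output of Canva templates and 7× that of Midjourney + PS, and consistency actually goes up (locking a reference is more stable than humans eyeballing a template). The trade-off is Chinese accuracy: 97% versus 100% for manually typed text.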
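The output multiples follow directly from the table. A quick check, treating the approximate "~" figures as exact; `output_ratio` is a hypothetical helper name.

```python
# Approximate monthly cover counts from the table ("~" values treated as exact).
monthly_covers = {
    "Canva templates": 80,
    "Midjourney + PS": 40,
    "gpt-image-2 + tweaking": 280,
}


def output_ratio(workflow: str, baseline: str) -> float:
    """How many times more covers `workflow` produced than `baseline` per month."""
    return monthly_covers[workflow] / monthly_covers[baseline]


for baseline in ("Canva templates", "Midjourney + PS"):
    ratio = output_ratio("gpt-image-2 + tweaking", baseline)
    print(f"vs {baseline}: {ratio:.1f}x")
```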

There's a hidden cost though: the first hour figuring out the prompt formula isn't saved. If you're new to writing full prompts, your first image might take 5-6 tries. Once you've built up 5 templates of your own (people / scenes / text / infographics / glitch art), it's smooth from there.

We eventually locked our 5 prompt templates into Notion and shared them across the team.

What's next

The next chapter is about WeChat MP — header images plus the in-article image rhythm. WeChat MP sizing (2.35:1 horizontal), the platform's headline whitespace requirements, and keeping 6 images on tone are all completely different from Xiaohongshu, so the prompts have to be rewritten from scratch.

If you're in a hurry to crank out a few Xiaohongshu covers right now:

Take the prompt template from §2
Swap in your subject description and headline text
Generate 8 at once, pick 1
If the text is wrong, fix it against the 4 failures in §4
Once you've nailed the "style anchor", use the §3 reference workflow to mass-produce

Want more style prompt templates? v2 will add 15 sub-style chapters (photorealistic / Chinese-style / glitch art / minimalist etc.). Subscribe to JR Academy WeChat MP @JR 匠人学院 to get them first.

📷 Xiaohongshu creative case study

From awesome-gpt-image (CC BY 4.0). Dropping historical figures into modern social platforms is one of the hottest content formats right now, and here's a great Xiaohongshu version.

Case: Ancient figures posting on Xiaohongshu (a series content goldmine)

[Image: Ancient Figures Xiaohongshu Post]

Prompt:

Generate a screenshot of [historical figure name] on [platform name]

Swap [historical figure name] with Li Bai / Du Fu / Wang Zhaojun / Empress Dowager Cixi, set [platform name] to Xiaohongshu, and the model fills in the note cover, cover text, and page UI that figure might post. One prompt can carry a 30-episode "ancient figures on Xiaohongshu" series. Pair it with copy that blends historical stories into modern marketing language, and you've got a dependable goldmine for cultural creators on Xiaohongshu.

📷 Creator: @MrLarus · Indexed in: awesome-gpt-image
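Filling the two placeholders for a whole series is a one-liner. A minimal sketch: the template text and figure names come from the case above, while `series_prompts` is a hypothetical helper.

```python
TEMPLATE = "Generate a screenshot of {figure} on {platform}"
FIGURES = ["Li Bai", "Du Fu", "Wang Zhaojun", "Empress Dowager Cixi"]


def series_prompts(figures: list[str], platform: str = "Xiaohongshu") -> list[str]:
    """Fill the template once per historical figure."""
    return [TEMPLATE.format(figure=name, platform=platform) for name in figures]


for line in series_prompts(FIGURES):
    print(line)
```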

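❓ FAQ

The most frequently searched questions about this chapter's topic.

What size is a Xiaohongshu cover?

Vertical 3:4, standard 1242×1660 pixels. Feed thumbnails on a phone are small, so the headline has to take up at least 1/3 of the image (the "in-your-face" effect). Below 1/4 it's unreadable.

How do you keep a Xiaohongshu carousel in one style?

Write a full prompt for image 1, generate 4 options, and pick 1 as the "style anchor". For images 2-30, use a short instruction such as "Same style as previous, change scene to: ..." to swap the subject. The model keeps the color tone, typography, and vibe, and only changes the scene.

How do Xiaohongshu and WeChat MP prompts differ?

Xiaohongshu is "in-your-face": large text, high saturation, dense info, real-person feel. WeChat MP is "restrained": room left for the platform's headline, an editorial feel, medium font size. The two prompt sets can't be swapped.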