Xiaohongshu covers & carousels at scale
Anyone who's been creating on Xiaohongshu (RED) for a year shares the same nightmare: covers.
Copy takes 30 minutes. Covers take 30 minutes to an hour — hunting down photos, cutting backgrounds, adding text, tweaking colors, redoing it 7 times. Need 5 carousel images for a paid push? That's two hours plus. Three posts a week? Half a day gone.
After gpt-image-2 dropped, this whole thing collapses to 5-6 minutes per image. There are three reasons, and each one is something Midjourney / Flux just can't do:
- Chinese title text rendered directly — no Photoshop pass to add text afterwards
- Multi-image style locking — one prompt set produces 5 carousel images with consistent vibe
- 3:4 vertical 1242×1660, 8 outputs at once — pick the one that looks right
This chapter is about turning "making a Xiaohongshu cover" into a 5-minute job. It covers platform aesthetic patterns, the Chinese title prompt formula, the workflow for cranking out 30 same-style images from one prompt, and real failure stories.
1. Xiaohongshu covers ≠ regular posters
A lot of people drag their Midjourney-era "premium design" mindset into Xiaohongshu, ship something, and watch the numbers tank.
Reason: the platform aesthetic plays by different rules.
| Dimension | Regular poster / WeChat MP header | Xiaohongshu cover |
|---|---|---|
| Composition | Horizontal, lots of whitespace, sparse info | Vertical 3:4, dense info, full frame |
| Font size | Headline takes 1/8 of image | Headline starts at 1/3 of image, "in-your-face" |
| Subject | Abstract / conceptual | Real people / objects / concrete scenes |
| Color | Neutral grays, Morandi tones | High saturation or soft pinks, avoid neutral grays |
| Text | Beauty first | Pain point first ("Must-read before 30") |
These aren't aesthetic preferences. They're hard constraints because a phone scroller decides in 0.3 seconds whether to keep swiping or stop. Small text on cover = nobody reads; tasteful and refined = looks like an ad = swipe past.
Your prompt has to bake these patterns in.
| Style type | Fits which topics | gpt-image-2 keywords |
|---|---|---|
| Real-person aesthetic | Personal growth / how-to / experience posts | photorealistic, soft daylight, candid, real-person aesthetic |
| Vintage hand-drawn | Food / travel / lifestyle | hand-drawn watercolor, illustrated, warm pastel palette |
| Glitch art | Tech / gadgets / job hunting | y2k aesthetic, glitch type, vibrant gradient, chrome accent |
| Minimal text | Reflection / notes / knowledge | minimal flat design, single bold typography, monochrome |
| Magazine editorial | Beauty / fashion / outfits | editorial fashion, magazine cover style, dramatic lighting |
2. Title text prompt formula (99% Chinese rendering)
This is the biggest gpt-image-2 vs Midjourney differentiator. In the MJ era, every Xiaohongshu creator had to add text manually in Photoshop because MJ got Chinese characters wrong almost every time. gpt-image-2 reaches about 99% accuracy after one tweak round (97% in our own four-week test, see §5), which means the prompt can produce a usable cover directly.
But to get it to behave, you need to hit three conditions:
1. Wrap literal text in double quotes
Don't write: the headline is 30 天学会 ChatGPT. Write: Headline: "30 天学会 ChatGPT".
The double quotes tell the model "render this part exactly as written".
2. Use role hints to control hierarchy
Use English role words like headline / subhead / footer / caption.
gpt-image-2 reads these to decide font size, weight, and position. Skip the role hint and the model guesses.
3. Explicit position + color + typography style
"directly below headline, centered horizontally" beats "below the headline" every time.
Full prompt template
Vertical 3:4 social media cover, 1242×1660.
[场景] A young Chinese woman sitting in a Sydney café,
bright daylight, candid laptop session, blurred coffee cup foreground.
[文字层]
Headline (top center, large bold, white with subtle shadow): "30 天学会 ChatGPT"
Subhead (directly below, smaller, soft yellow): "保姆级实操指南"
[风格] warm autumn palette, slight vintage film grain, real-person aesthetic.
[约束] Exact Chinese text only. No extra words. No duplicate copy.
Write the four pieces in this order (scene, text layer, style, constraints) and your first-pass accuracy is 90%+, with one tweak round to push it to 99%.
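The template can also be assembled programmatically, which keeps the block order fixed and the literal text quoted. What follows is a minimal sketch, not an official client: `TextLine`, `ROLES`, and `build_cover_prompt` are hypothetical names, and the bracketed Chinese labels are copied unchanged from the template above. It only builds the prompt string; sending it to the model is left out.

```python
from dataclasses import dataclass

# Role words from the formula above; capitalized as in the template.
ROLES = ("Headline", "Subhead", "Footer", "Caption")


@dataclass
class TextLine:
    role: str       # one of ROLES; signals the text hierarchy
    placement: str  # explicit position, color, and typography hints
    text: str       # literal copy to render, without surrounding quotes

    def render(self) -> str:
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")
        if '"' in self.text:
            raise ValueError("literal text must not contain double quotes")
        # Double quotes mark the part to be rendered exactly as written.
        return f'{self.role} ({self.placement}): "{self.text}"'


def build_cover_prompt(scene: str, lines: list[TextLine], style: str,
                       size: str = "1242×1660") -> str:
    """Assemble scene, text layer, style, and constraints in template order."""
    parts = [
        f"Vertical 3:4 social media cover, {size}.",
        f"[场景] {scene}",
        "[文字层]",
        *(line.render() for line in lines),
        f"[风格] {style}",
        "[约束] Exact Chinese text only. No extra words. No duplicate copy.",
    ]
    return "\n".join(parts)


prompt = build_cover_prompt(
    scene="A young Chinese woman sitting in a Sydney café, bright daylight.",
    lines=[
        TextLine("Headline", "top center, large bold, white with subtle shadow",
                 "30 天学会 ChatGPT"),
        TextLine("Subhead", "directly below, smaller, soft yellow",
                 "保姆级实操指南"),
    ],
    style="warm autumn palette, slight vintage film grain, real-person aesthetic.",
)
print(prompt)
```

Rejecting unknown roles and stray double quotes up front catches the two mistakes that are hardest to spot once the prompt has been pasted into a chat window.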
3. Cranking out 30 same-style images from one prompt
What Xiaohongshu creators actually need isn't 1 image — it's a series of same-style images.
Concrete example:
Monday you've planned a "30-day AI tools challenge" series, one post a day, 30 posts = 30 covers.
Every cover has to feel consistent or your whole account looks scattered.
Old way (Canva / MJ + PS): 30 images × 30 minutes = 15 hours.
gpt-image-2 multi-turn editing: 30 images × 90 seconds ≈ 45 minutes.
The workflow has two steps.
Image 1: full prompt to set the tone
Write a thorough prompt (using the §2 template above), generate 4 options, pick 1. This first image is your "style anchor".
Images 2-30: reference + short instructions
Don't rewrite the whole prompt. Just tell ChatGPT:
Same style as previous, change to:
Day 2 — A laptop on a wooden desk with morning sunlight,
no person, focus on screen showing Cursor IDE.
Headline: "Day 2 · Cursor 入门"
Subhead: "5 分钟出第一个项目"
The model holds onto the color tone, typography, and vibe — it just swaps the subject and text. Generation runs 3-4× faster than the first image.
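For a 30-post series, the short follow-up instruction can be generated from a small table of scenes and titles. This is a sketch under the assumption that each follow-up carries only a scene, a headline, and an optional subhead; `followup_prompt` is a hypothetical helper name, and the example reuses the Day 2 values from above.

```python
def followup_prompt(scene: str, headline: str, subhead: str = "") -> str:
    """Build the short per-image instruction that reuses the anchor's style."""
    lines = [
        "Same style as previous, change to:",
        scene,
        f'Headline: "{headline}"',  # literal text stays in double quotes
    ]
    if subhead:
        lines.append(f'Subhead: "{subhead}"')
    return "\n".join(lines)


day_2 = followup_prompt(
    scene=("A laptop on a wooden desk with morning sunlight, "
           "no person, focus on screen showing Cursor IDE."),
    headline="Day 2 · Cursor 入门",
    subhead="5 分钟出第一个项目",
)
print(day_2)
```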
Practical rhythm:
- Sunday: spend 1 hour anchoring (tweaking the prompt until it looks right)
- Monday through Sunday: 90 seconds per image, 1 per day
- Monthly output pushes past 90 covers
4. Failure log (don't skip this part)
Failure 1: title text came out as Japanese
First time I asked gpt-image-2 for a "30 天学会 ChatGPT" cover, it came back as "30天学会ChatGPT" with the Japanese hiragana "だ" mixed in.
Tracked it down: the prompt had "Japanese aesthetic" as a style word, and the model read that as "include Japanese characters".
Fix: change the style word to "Chinese minimalist" or "editorial Asian aesthetic". Or just skip country-level style words and only describe elements (lighting, color palette).
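A quick pre-flight check can flag this kind of style word before a prompt is sent. The sketch below is only illustrative: `COUNTRY_STYLE_WORDS` is an assumed, non-exhaustive list, and `country_style_words` is a hypothetical helper that ignores anything inside double quotes so the literal headline text is never flagged.

```python
import re

# Assumption: an illustrative list of risky style words, not an exhaustive one.
COUNTRY_STYLE_WORDS = ("japanese", "korean", "french")


def country_style_words(prompt: str) -> list[str]:
    """Return risky country-level words found outside double-quoted text."""
    unquoted = re.sub(r'"[^"]*"', "", prompt)          # drop literal copy
    words = re.findall(r"[a-z]+", unquoted.lower())    # plain ASCII words only
    return [word for word in COUNTRY_STYLE_WORDS if word in words]


risky = country_style_words(
    'Japanese aesthetic, soft daylight. Headline: "30 天学会 ChatGPT"'
)
print(risky)  # ['japanese']
```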
Failure 2: subhead position drifted
Asked the model to put the subhead under the headline. Out of 8 images, 3 had the subhead floating to the bottom right.
Fix: position words have to be explicit. "directly below the headline, centered horizontally" beats "below the headline". You can also add "aligned with headline".
Failure 3: 3:4 ratio came out as 1:1
On the ChatGPT web app, prompts with "vertical 3:4" sometimes return a 1:1 square.
Fix:
- Explicit pixel size: "1242×1660" is more reliable than "3:4"
- Stack the hints: "vertical poster, 3:4 ratio, portrait orientation, 1242 by 1660 pixels"
- Hit the API: pass size directly and you can't get this wrong
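The stacked hint string can be derived from the pixel size so the numbers never disagree with each other. This is a minimal sketch with a hypothetical helper name, `size_hints`; it rounds to the nearest simple ratio because 1242×1660 is close to, but not exactly, 3:4. It does not call any API.

```python
from fractions import Fraction


def size_hints(width: int, height: int) -> str:
    """Stack redundant aspect-ratio hints for a cover of the given pixel size."""
    # Round to the nearest ratio with a denominator of 4 or less.
    ratio = Fraction(width, height).limit_denominator(4)
    if width < height:
        shape, orientation = "vertical poster", "portrait orientation"
    elif width > height:
        shape, orientation = "horizontal banner", "landscape orientation"
    else:
        shape, orientation = "square image", "square orientation"
    return (f"{shape}, {ratio.numerator}:{ratio.denominator} ratio, "
            f"{orientation}, {width} by {height} pixels")


hints = size_hints(1242, 1660)
print(hints)
# vertical poster, 3:4 ratio, portrait orientation, 1242 by 1660 pixels
```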
Failure 4: carousel image 5 drifts in style
A 5-image carousel: first 3 stay on tone, then images 4-5 quietly shift colors.
Fix: reset the full style block every 3 images instead of leaning on reference the whole way. Group the first 3 into one carousel, then re-paste the prompt anchor for image 4.
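The reset rule is easy to automate when prompts are prepared in a batch. The sketch below assumes the style block and the per-image changes are already written as plain strings; `plan_prompts` and `reset_every` are hypothetical names.

```python
def plan_prompts(style_block: str, changes: list[str],
                 reset_every: int = 3) -> list[str]:
    """Return one prompt per image, re-pasting the style block periodically."""
    prompts = []
    for index, change in enumerate(changes):
        if index % reset_every == 0:
            # Images 1, 4, 7, ...: full style block plus this image's change.
            prompts.append(f"{style_block}\n{change}")
        else:
            # Images in between lean on the previous image as reference.
            prompts.append(f"Same style as previous, change to:\n{change}")
    return prompts


carousel = plan_prompts(
    style_block="warm autumn palette, slight vintage film grain, real-person aesthetic.",
    changes=[f"Image {n} scene and text" for n in range(1, 6)],
)
for number, text in enumerate(carousel, start=1):
    print(number, text.splitlines()[0])
```

With the default of 3, a 5-image carousel gets the full style block on images 1 and 4, which matches the grouping described above.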
5. What we found
The JR Academy Xiaohongshu account @JR匠人学院 ran three workflows in parallel for 4 weeks:
| Workflow | Time per cover | Consistency | Monthly cover output | Chinese accuracy |
|---|---|---|---|---|
| Canva templates | 25 min | Medium | ~80 | 100% (typed manually) |
| Midjourney + PS | 35 min | Low | ~40 | 100% (typed manually) |
| gpt-image-2 + tweaking | 6 min | High | ~280 | 97% |
gpt-image-2 hits 3.5× the monthly output of Canva templates and 7× that of Midjourney + PS, and consistency actually goes up (locking a reference is more stable than humans eyeballing a template).
There's a hidden cost though: the first hour figuring out the prompt formula isn't saved. If you're new to writing full prompts, your first image might take 5-6 tries. Once you've built up 5 templates of your own (people / scenes / text / infographics / glitch art), it's smooth from there.
We eventually locked our 5 prompt templates into Notion and shared them across the team.
What's next
The next chapter is about WeChat MP — header images plus the in-article image rhythm. WeChat MP sizing (2.35:1 horizontal), the platform's headline whitespace requirements, and keeping 6 images on tone are all completely different from Xiaohongshu, so the prompts have to be rewritten from scratch.
If you're in a hurry to crank out a few Xiaohongshu covers right now:
- Take the prompt template from §2
- Swap in your subject description and headline text
- Generate 8 at once, pick 1
- If the text is wrong, fix it against the 4 failures in §4
- Once you've nailed the "style anchor", use the §3 reference workflow to mass-produce
Want more style prompt templates? v2 will add 15 sub-style chapters (photorealistic / Chinese-style / glitch art / minimalist etc.). Subscribe to JR Academy WeChat MP @JR匠人学院 to get them first.
📷 Xiaohongshu creative case study
From awesome-gpt-image (CC BY 4.0). Dropping historical figures into modern social platforms is one of the hottest content formats right now, and here's a great Xiaohongshu version.
Case: Ancient figures posting on Xiaohongshu (a series content goldmine)
Prompt:
Generate a screenshot of [historical figure name] on [platform name]
Swap [historical figure name] with Li Bai / Du Fu / Wang Zhaojun / Empress Dowager Cixi, set [platform name] to Xiaohongshu, and the model fills in the note cover, cover text, and page UI that figure might post. One prompt carries 30 episodes of the "ancient figures on Xiaohongshu" series. Pair it with copy that blends historical stories into modern marketing language, and you've got the most reliable goldmine for cultural creators on Xiaohongshu.
📷 Creator: @MrLarus · Indexed in: awesome-gpt-image
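Filling the two placeholders for a whole series is a one-liner. A minimal sketch, assuming the bracketed placeholders are replaced verbatim; `TEMPLATE` and `series_prompts` are hypothetical names, and the figures are the four named above.

```python
TEMPLATE = "Generate a screenshot of {figure} on {platform}"

FIGURES = ["Li Bai", "Du Fu", "Wang Zhaojun", "Empress Dowager Cixi"]


def series_prompts(figures: list[str], platform: str = "Xiaohongshu") -> list[str]:
    """Fill the case-study prompt once per historical figure."""
    return [TEMPLATE.format(figure=figure, platform=platform) for figure in figures]


episodes = series_prompts(FIGURES)
for episode in episodes:
    print(episode)
```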
❓ FAQ
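What size is a Xiaohongshu cover?
3:4 vertical, standard 1242×1660 pixels. Thumbnails in the mobile feed are small, so the headline has to take up at least 1/3 of the image (the "in-your-face" effect); below 1/4 it's unreadable.
How do you keep a Xiaohongshu carousel stylistically consistent?
Write a full prompt for image 1, generate 4 options, and pick 1 as the "style anchor". For images 2-30, use a short instruction like "Same style as previous, change scene to: ..." to swap the subject. The model keeps the color tone, typography, and vibe and only changes the scene.
How do Xiaohongshu and WeChat MP prompts differ?
Xiaohongshu is "in-your-face": big text, high saturation, dense info, real-person feel. WeChat MP is "restrained": room left for the platform headline, editorial feel, medium font size. The two prompt sets aren't interchangeable.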