"AI Image Prompting 2026 — The 8-Element Formula and How Each Tool Differs"
Subject · Scene · Camera · Lighting · Style — one structure that works across Nano Banana, Midjourney, and GPT Image
Key summary
- Audience: You read the tool comparison and picked one — but your prompts keep collapsing into the same generic look or missing the brief.
- What you'll get: 1) The 8-element prompt formula that works in all three tools, 2) tool-specific differences (Midjourney parameters, Nano Banana's natural language, GPT Image's conversational edits), 3) before/after concrete examples, 4) seven common mistakes, 5) weights and negative prompts.
- Mental model: A prompt isn't a list of words. It's a director's note to a camera crew — lens, light, blocking, mood — not "make it pretty."
1. The 8-element formula (works in all tools)
The official 2026 guides from Black Forest Labs, Anthropic, and Midjourney all converge on the same order:
[Subject] + [Scene/Environment] + [Composition/Shot] + [Camera/Lens]
+ [Lighting] + [Style/Medium] + [Color/Mood] + [Quality/Negatives]
1.1 What goes where
| # | Element | Example phrasing |
|---|---|---|
| 1 | Subject | "Korean woman in her 30s," "black cat," "lighthouse on a cliff" |
| 2 | Scene | "rainy Tokyo alley," "sunlit cafe window seat" |
| 3 | Composition | "close-up," "wide-angle," "rule of thirds," "over-the-shoulder" |
| 4 | Camera/Lens | "35mm film," "85mm portrait lens," "shallow depth of field" |
| 5 | Lighting | "golden hour," "rim light," "softbox front," "neon back light" |
| 6 | Style/Medium | "photorealistic," "watercolor," "oil painting," "Studio Ghibli style" |
| 7 | Color/Mood | "muted pastel palette," "high contrast," "moody, melancholic" |
| 8 | Quality/Negatives | "8K, sharp focus" / "no text, no watermark" |
1.2 Order matters
Diffusion models put more weight on words near the front (2026 guide). Put your primary subject and key action in the first 10–15 words.
1.3 Full example
"A 30-something Korean woman in a beige trench coat [Subject] walking through a rainy Seoul alley at night [Scene] medium shot, slight low angle [Composition] shot on 50mm prime, shallow depth of field [Camera] soft neon backlight from shop signs [Lighting] photorealistic, cinematic still [Style] muted teal and amber palette, melancholic [Mood] 8K, sharp focus, no watermark [Quality]"
This works as-is in Nano Banana 2, Midjourney, and GPT Image 1.5.
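The element order above can be kept consistent in code. What follows is a minimal sketch, not any tool's official API: `PromptSpec` and `build_prompt` are hypothetical names, and the only rule they encode is the ordering from section 1 (subject first, quality and negatives last).

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSpec:
    """Hypothetical container; field order mirrors the 8-element formula."""
    subject: str
    scene: str = ""
    composition: str = ""
    camera: str = ""
    lighting: str = ""
    style: str = ""
    mood: str = ""
    quality: str = ""


def build_prompt(spec: PromptSpec) -> str:
    # fields() yields the fields in definition order, so the subject
    # always lands at the front of the prompt and quality tags at the end.
    parts = [getattr(spec, f.name).strip() for f in fields(spec)]
    # Skip elements that were left empty instead of emitting stray commas.
    return ", ".join(p for p in parts if p)


if __name__ == "__main__":
    spec = PromptSpec(
        subject="A 30-something Korean woman in a beige trench coat",
        scene="walking through a rainy Seoul alley at night",
        composition="medium shot, slight low angle",
        camera="shot on 50mm prime, shallow depth of field",
        lighting="soft neon backlight from shop signs",
        style="photorealistic, cinematic still",
        mood="muted teal and amber palette, melancholic",
        quality="8K, sharp focus, no watermark",
    )
    print(build_prompt(spec))
```

Passing keyword arguments in any order still produces the same element order, which is the point of the dataclass.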
2. Where the tools diverge
2.1 Nano Banana 2 (natural-language friendly)
Plain prose works fine. Text rendering is strong, so you can directly request on-image text.
"A book cover for 'AI for Beginners' — minimalist white background, serif title in black, geometric illustration of a circuit board with leaves growing out of it, soft gradient orange-to-yellow accent, clean modern editorial design"
2.2 Midjourney v7 (parameter-driven precision)
Use parameters to fine-tune style strength, diversity, and consistency (official parameters).
| Parameter | Effect | Recommended values |
|---|---|---|
| --s (stylize) | Aesthetic strength | 100 (default), 50 (faithful to prompt), 750 (heavy stylization) |
| --c (chaos) | Diversity across the four outputs | 0–50 normal, 50–100 experimental |
| --ar (aspect ratio) | Aspect ratio | 16:9, 2:3, 1:1 |
| --seed | Lock the seed | Vary one element while keeping the base |
| --sref | Style-reference URL | Mimic the style of another image |
| --oref | Character-reference URL | Keep a person consistent across prompts |
Example:
A medieval castle on a cliff, sunrise, cinematic, fog --ar 16:9 --s 250 --c 30
V7 specials: personalization profiles (--p) apply your trained taste; Draft Mode (--draft) gives 10× faster ideation.
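If you assemble Midjourney prompts in a script, a small helper can append the parameters and catch out-of-range values before you paste them. This is a sketch only: `midjourney_suffix` and `with_params` are hypothetical helpers, and the ranges checked here (0 to 1000 for --s, 0 to 100 for --c) are assumptions that should be verified against the current Midjourney parameter list.

```python
from typing import Optional


def midjourney_suffix(
    ar: Optional[str] = None,
    stylize: Optional[int] = None,
    chaos: Optional[int] = None,
    seed: Optional[int] = None,
) -> str:
    """Build a parameter suffix such as '--ar 16:9 --s 250 --c 30'."""
    flags = []
    if ar is not None:
        flags.append(f"--ar {ar}")
    if stylize is not None:
        # Assumed range; check the official parameter list.
        if not 0 <= stylize <= 1000:
            raise ValueError("stylize is expected to be between 0 and 1000")
        flags.append(f"--s {stylize}")
    if chaos is not None:
        # Assumed range; check the official parameter list.
        if not 0 <= chaos <= 100:
            raise ValueError("chaos is expected to be between 0 and 100")
        flags.append(f"--c {chaos}")
    if seed is not None:
        flags.append(f"--seed {seed}")
    return " ".join(flags)


def with_params(prompt: str, **params) -> str:
    # Parameters always go after the descriptive text.
    return f"{prompt} {midjourney_suffix(**params)}".strip()


if __name__ == "__main__":
    print(with_params(
        "A medieval castle on a cliff, sunrise, cinematic, fog",
        ar="16:9", stylize=250, chaos=30,
    ))
```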
2.3 GPT Image 1.5 (conversational)
Its strength isn't the first generation — it's iterative editing. Use the 8-element formula on the first call, then plain conversation afterwards.
1st: "A young man holding an espresso cup, café window seat, morning light, photorealistic, 50mm lens, shallow depth of field"
2nd: "Same image, but change the cup to a glass of orange juice"
3rd: "Now add a dog sleeping under the table"
Each edit persists: the model works from the previous image, which gives the strongest cross-edit consistency of the three tools.
3. Before / After
Before (vague)
"Pretty landscape photo"
After 1 (specific)
"A photorealistic landscape of a quiet mountain lake at golden hour, mirror-like water reflection, autumn maple trees at the shore, mist rising from the surface, wide-angle composition, shot on 24mm lens, warm orange and teal palette, sharp focus, 8K"
After 2 (Midjourney parameters added)
A photorealistic landscape of a quiet mountain lake at golden hour, mirror-like water reflection, autumn maple trees at the shore, mist rising from the surface, wide-angle composition, shot on 24mm lens, warm orange and teal palette, sharp focus --ar 21:9 --s 200 --c 20
After 3 (GPT Image conversational)
After the first generation: "Same scene but at twilight with a faint full moon over the mountains."
Same intent, 5–10× quality gap.
4. Seven common mistakes
| Mistake | Result | Fix |
|---|---|---|
| 1. Stacking adjectives ("amazing, beautiful, stunning") | Mostly ignored | Replace with concrete description ("misty rim light, gold-tipped autumn leaves") |
| 2. Negatives ("not blurry") | Ignored or reversed | Use positive form ("sharp focus, fine detail") |
| 3. Too many elements at once | Some get dropped | Keep 3–5 priorities, push the rest into edits |
| 4. Repeating the same word | No effect | Use weights: ((emphasis)) in Stable Diffusion-style UIs or word::2 in Midjourney |
| 5. Generic quality tags ("8K, ultra-realistic") | Weak signal | Describe actual detail ("pores visible on skin, fabric texture") |
| 6. Missing detail anchors for people | Hands and eyes break | "natural hands, anatomically correct, sharp eyes" |
| 7. Not pinning a seed | Can't iterate | --seed (Midjourney) or save a generated image to lock in GPT Image |
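Several of the mistakes above are mechanical enough to check before you submit a prompt. The sketch below is a rough heuristic, not a validated tool: `lint_prompt` is a hypothetical helper, and the word lists are illustrative assumptions that you would extend for your own prompts. It covers mistakes 1, 2, 4, and 5 only.

```python
import re
from collections import Counter
from typing import List

# Illustrative word lists (assumptions, not an exhaustive vocabulary).
VAGUE_ADJECTIVES = {"amazing", "beautiful", "stunning", "pretty", "gorgeous"}
GENERIC_QUALITY_TAGS = {"8k", "4k", "ultra-realistic", "hyper-realistic"}


def lint_prompt(prompt: str) -> List[str]:
    """Return human-readable warnings for a few common prompt mistakes."""
    text = prompt.lower()
    # Keep hyphenated words such as 'ultra-realistic' as a single token.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text)
    warnings = []

    # Mistake 1: two or more vague adjectives stacked together.
    vague = [t for t in tokens if t in VAGUE_ADJECTIVES]
    if len(vague) >= 2:
        warnings.append("stacked adjectives: " + ", ".join(vague))

    # Mistake 2: negative phrasing such as 'not blurry'.
    negated = re.findall(r"\bnot\s+\w+", text)
    if negated:
        warnings.append("negative phrasing: " + ", ".join(negated))

    # Mistake 4: the same longer word repeated three or more times.
    repeats = [w for w, n in Counter(tokens).items() if n >= 3 and len(w) > 3]
    if repeats:
        warnings.append("repeated words: " + ", ".join(repeats))

    # Mistake 5: generic quality tags instead of described detail.
    generic = [t for t in tokens if t in GENERIC_QUALITY_TAGS]
    if generic:
        warnings.append("generic quality tags: " + ", ".join(generic))

    return warnings


if __name__ == "__main__":
    for warning in lint_prompt("amazing, beautiful, stunning landscape, not blurry, 8K"):
        print(warning)
```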
5. Weights and negative prompts
5.1 Weights (Midjourney)
:: followed by a number sets the relative weight of the prompt part before it.
red sports car::3, urban street::1, neon signs::0.5
→ The car gets 3× weight, the street is baseline, neon is downweighted.
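To see how the numbers relate to each other, you can parse a weighted prompt and express each part as a share of the total. This is a sketch for illustration: `parse_weights` and `relative_share` are hypothetical helpers that follow the comma-separated layout of the example above, and the shares only show proportions. How Midjourney applies the weights internally is not something this code models.

```python
from typing import Dict


def parse_weights(prompt: str) -> Dict[str, float]:
    """Split 'text::weight, text::weight' into a {text: weight} mapping."""
    weights = {}
    for chunk in prompt.split(","):
        text, sep, raw = chunk.strip().partition("::")
        text = text.strip()
        if not text:
            continue  # ignore empty chunks such as a trailing comma
        # A part without an explicit number is treated as weight 1.
        weights[text] = float(raw) if sep and raw.strip() else 1.0
    return weights


def relative_share(weights: Dict[str, float]) -> Dict[str, float]:
    """Express each weight as a fraction of the sum of all weights."""
    total = sum(weights.values())
    return {text: w / total for text, w in weights.items()}


if __name__ == "__main__":
    parsed = parse_weights("red sports car::3, urban street::1, neon signs::0.5")
    for text, share in relative_share(parsed).items():
        print(f"{text}: {share:.0%}")
```

With weights 3, 1, and 0.5 the total is 4.5, so the car accounts for two thirds of the total weight.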
5.2 Negative prompts (Stable Diffusion-family, Midjourney --no)
--no text, watermark, signature, blur, low quality
Midjourney doesn't auto-honor negative phrasing — use the --no parameter. Nano Banana 2 and GPT Image have weaker negative-prompt support; prefer positive phrasing.
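One way to apply this rule across tools in a script is sketched below. Everything here is an assumption for illustration: `apply_negatives` is a hypothetical helper, the tool names are plain labels rather than API identifiers, and the `POSITIVE_FORM` lookup is a tiny hand-written table based on the "sharp focus, fine detail" fix from section 4.

```python
from typing import Iterable

# Hand-written mapping from a negative term to its positive phrasing.
# Illustrative only; extend it for your own vocabulary.
POSITIVE_FORM = {
    "blur": "sharp focus",
    "low quality": "fine detail",
}


def apply_negatives(prompt: str, negatives: Iterable[str], tool: str) -> str:
    negatives = list(negatives)
    if not negatives:
        return prompt
    if tool == "midjourney":
        # Midjourney takes exclusions through the --no parameter.
        return f"{prompt} --no {', '.join(negatives)}"
    # Other tools: prefer positive phrasing. Terms without a known
    # positive form are dropped here rather than sent as negatives.
    positives = [POSITIVE_FORM[n] for n in negatives if n in POSITIVE_FORM]
    return ", ".join([prompt] + positives)


if __name__ == "__main__":
    unwanted = ["text", "watermark", "blur", "low quality"]
    print(apply_negatives("A medieval castle on a cliff", unwanted, "midjourney"))
    print(apply_negatives("A medieval castle on a cliff", unwanted, "gpt-image"))
```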
5.3 Reference images
| Tool | How |
|---|---|
| Nano Banana 2 | Attach an image + natural language ("in this style") |
| Midjourney | --sref [URL] for style, --oref [URL] for character consistency |
| GPT Image | Attach an image, then say "in this style" |
6. Copy-paste starter templates
Portrait
[Person description] in [Location], [shot type] shot, [Lighting],
shot on [Lens] with shallow depth of field, photorealistic,
[Mood] mood, 8K, sharp focus, natural hands and eyes
Landscape
A photorealistic landscape of [Subject] at [Time of day],
[weather/atmosphere], [composition type] composition,
shot on [Lens], [color palette] palette, sharp focus, 8K
Illustration / concept art
[Subject] in [Setting], [art style — e.g., Studio Ghibli / Moebius /
watercolor], [color palette], [lighting], detailed line art,
[mood], --ar 16:9 --s 400
Product mockup
[Product] on [Surface], studio lighting with softbox front,
[background — clean white / wooden table], 50mm macro lens,
shallow depth of field, photorealistic, commercial photography
Book cover / poster
A book cover design for "[Title]" — [layout description],
[typography — serif/sans, color], [illustration concept],
[color palette], minimalist editorial design, --ar 2:3
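The bracketed slots in these templates map directly onto Python format fields. The sketch below rewrites the landscape template with `{braces}` and refuses to render if a slot is left unfilled. `LANDSCAPE` and `fill` are hypothetical names, and the field names are my own choice.

```python
from string import Formatter

# The landscape template from above, with [slots] rewritten as {fields}.
LANDSCAPE = (
    "A photorealistic landscape of {subject} at {time_of_day}, "
    "{atmosphere}, {composition} composition, "
    "shot on {lens}, {palette} palette, sharp focus, 8K"
)


def fill(template: str, **values: str) -> str:
    """Render a template, failing loudly if any field is missing."""
    # Formatter().parse yields (literal, field_name, spec, conversion) tuples.
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = needed - set(values)
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return template.format(**values)


if __name__ == "__main__":
    print(fill(
        LANDSCAPE,
        subject="a quiet mountain lake",
        time_of_day="golden hour",
        atmosphere="mist rising from the surface",
        composition="wide-angle",
        lens="24mm lens",
        palette="warm orange and teal",
    ))
```

Checking for missing fields up front matters in batch runs, where a half-filled prompt would otherwise be submitted silently.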
Developer notes
- Templatize prompts: Python f-strings or LangChain PromptTemplate with {subject}, {lighting}, etc. Essential when generating 100+ images.
- Automate quality scoring: GPT-5 Vision or Claude Vision can score "prompt fidelity." Auto-regenerate anything below threshold.
- Save (seed, prompt, model_version): a small DB makes good outputs reproducible.
- Nano Banana 2 batch: API supports n=4 per call → choose the best automatically.
- Midjourney --sref/--oref automation: not recommended via unofficial bots because of ToS and stability concerns. Stick with OpenAI/Gemini for production automation.
- IP-safety filter: pre-filter prompts for real-person and brand mentions before submission.
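For the "save (seed, prompt, model_version)" note, the standard-library `sqlite3` module is enough. The sketch below is one possible layout, not a prescribed schema: the table name, the `score` column (meant to hold a fidelity score from whichever scorer you use), and the helper names are all assumptions.

```python
import sqlite3
from typing import List, Optional, Tuple


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the generation log. ':memory:' is for demos only."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS generations (
               id INTEGER PRIMARY KEY,
               seed INTEGER,
               prompt TEXT NOT NULL,
               model_version TEXT NOT NULL,
               score REAL
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, seed: Optional[int], prompt: str,
           model_version: str, score: Optional[float] = None) -> int:
    """Store one generation and return its row id."""
    cur = conn.execute(
        "INSERT INTO generations (seed, prompt, model_version, score) "
        "VALUES (?, ?, ?, ?)",
        (seed, prompt, model_version, score),
    )
    conn.commit()
    return cur.lastrowid


def best(conn: sqlite3.Connection, min_score: float) -> List[Tuple]:
    """Return (seed, prompt, model_version) rows at or above a score."""
    return conn.execute(
        "SELECT seed, prompt, model_version FROM generations "
        "WHERE score >= ? ORDER BY score DESC",
        (min_score,),
    ).fetchall()


if __name__ == "__main__":
    conn = open_log()
    record(conn, 42, "A medieval castle on a cliff, sunrise", "v7", 0.9)
    record(conn, 7, "A quiet mountain lake at golden hour", "v7", 0.4)
    print(best(conn, 0.8))
```

Pass a file path to `open_log` to keep the log between runs.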
References
- Midjourney — Prompt Basics
- Midjourney — Parameter List
- Black Forest Labs — Prompting Guide
- ImprovePrompt — 2026 Image Prompting Guide
- QuestStudio — Camera & Lighting Cheatsheet
This is part 5-2 of 11 in the AI Basics series. Next: AI voice/video — Suno, Runway, and Sora.