"AI Image Prompting 2026 — The 8-Element Formula and How Each Tool Differs"

April 25, 2026

Subject · Scene · Camera · Lighting · Style — one structure that works across Nano Banana, Midjourney, and GPT Image

Key takeaways

Audience: You read the tool comparison and picked one — but your prompts keep collapsing into the same generic look or missing the brief.
What you'll get: 1) The 8-element prompt formula that works in all three tools, 2) tool-specific differences (Midjourney parameters, Nano Banana's natural language, GPT Image's conversational edits), 3) before/after concrete examples, 4) seven common mistakes, 5) weights and negative prompts.
Mental model: A prompt isn't a list of words. It's a director's note to a camera crew — lens, light, blocking, mood — not "make it pretty."

1. The 8-element formula (works in all tools)

The 2026 official guides from Black Forest Labs, Anthropic, and Midjourney all converge on the same order:

[Subject] + [Scene/Environment] + [Composition/Shot] + [Camera/Lens]
+ [Lighting] + [Style/Medium] + [Color/Mood] + [Quality/Negatives]

1.1 What goes where

#	Element	Example phrasing
1	Subject	"Korean woman in her 30s," "black cat," "lighthouse on a cliff"
2	Scene	"rainy Tokyo alley," "sunlit cafe window seat"
3	Composition	"close-up," "wide-angle," "rule of thirds," "over-the-shoulder"
4	Camera/Lens	"35mm film," "85mm portrait lens," "shallow depth of field"
5	Lighting	"golden hour," "rim light," "softbox front," "neon back light"
6	Style/Medium	"photorealistic," "watercolor," "oil painting," "Studio Ghibli style"
7	Color/Mood	"muted pastel palette," "high contrast," "moody, melancholic"
8	Quality/Negatives	"8K, sharp focus" / "no text, no watermark"
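The table maps naturally onto a small data structure. The following is a minimal Python sketch, not part of any tool's API: `PromptSpec` and `build_prompt` are hypothetical names, and joining the elements with commas is an assumption about formatting.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSpec:
    """The eight elements, declared in formula order (hypothetical helper)."""
    subject: str
    scene: str = ""
    composition: str = ""
    camera: str = ""
    lighting: str = ""
    style: str = ""
    mood: str = ""
    quality: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Join the non-empty elements in declaration order, comma-separated."""
    parts = [getattr(spec, f.name).strip() for f in fields(spec)]
    return ", ".join(p for p in parts if p)


spec = PromptSpec(
    subject="black cat",
    scene="rainy Tokyo alley",
    composition="close-up",
    lighting="neon back light",
    style="photorealistic",
)
print(build_prompt(spec))
```

Because the field order is fixed in one place, empty elements drop out without disturbing the order of the rest.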

1.2 Order matters

Diffusion models put more weight on words near the front (2026 guide). Put your primary subject and key action in the first 10–15 words.
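One way to enforce this in a pipeline is a simple word-position check. This is a rough sketch under stated assumptions: `subject_is_front_loaded` is a hypothetical helper, the 15-word limit comes from the guideline above, and exact phrase matching is a simplification.

```python
def subject_is_front_loaded(prompt: str, subject: str, limit: int = 15) -> bool:
    """Return True if the first occurrence of `subject` ends within `limit` words."""
    words = prompt.lower().replace(",", " ").split()
    target = subject.lower().replace(",", " ").split()
    if not target:
        return False
    for start in range(len(words) - len(target) + 1):
        if words[start:start + len(target)] == target:
            return start + len(target) <= limit
    return False


good = "A black cat on a windowsill, rainy Tokyo alley behind it, close-up"
bad = (
    "Photorealistic, cinematic, moody, muted teal palette, 35mm film, "
    "shallow depth of field, rainy Tokyo alley at night, neon back light, "
    "a black cat"
)
print(subject_is_front_loaded(good, "black cat"))
print(subject_is_front_loaded(bad, "black cat"))
```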

1.3 Full example

"A 30-something Korean woman in a beige trench coat [Subject] walking through a rainy Seoul alley at night [Scene] medium shot, slight low angle [Composition] shot on 50mm prime, shallow depth of field [Camera] soft neon backlight from shop signs [Lighting] photorealistic, cinematic still [Style] muted teal and amber palette, melancholic [Mood] 8K, sharp focus, no watermark [Quality]"

With the bracketed labels removed, this works as-is in Nano Banana 2, Midjourney, and GPT Image 1.5.

2. Where the tools diverge

2.1 Nano Banana 2 (natural-language friendly)

Plain prose works fine. Text rendering is strong, so you can directly request on-image text.

"A book cover for 'AI for Beginners' — minimalist white background, serif title in black, geometric illustration of a circuit board with leaves growing out of it, soft gradient orange-to-yellow accent, clean modern editorial design"

2.2 Midjourney v7 (parameter-driven precision)

Use parameters to fine-tune style strength, diversity, and consistency (official parameters).

Parameter	Effect	Recommended values
`--s` (stylize)	Aesthetic strength	100 (default), 50 (faithful to prompt), 750 (heavy stylization)
`--c` (chaos)	Diversity across the four outputs	0–50 normal, 50–100 experimental
`--ar` (aspect ratio)	Aspect ratio	`16:9`, `2:3`, `1:1`
`--seed`	Lock the seed	Vary one element while keeping the base
`--sref`	Style-reference URL	Mimic the style of another image
`--oref`	Omni-reference URL	Keep a person or object consistent across prompts

Example:

A medieval castle on a cliff, sunrise, cinematic, fog --ar 16:9 --s 250 --c 30

V7 specials: personalization profiles (--p) apply your trained taste; Draft Mode (--draft) gives 10× faster ideation.
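If you assemble Midjourney prompts in code, a small helper can append the parameters and catch out-of-range values early. This is only a sketch: `add_midjourney_params` is a hypothetical name, the 0–100 chaos range follows the table above, and the 0–1000 stylize bound is an assumption to check against the current Midjourney documentation.

```python
import re
from typing import Optional


def add_midjourney_params(
    prompt: str,
    ar: Optional[str] = None,
    stylize: Optional[int] = None,
    chaos: Optional[int] = None,
) -> str:
    """Append --ar, --s, and --c to a prompt after basic validation."""
    parts = [prompt.strip()]
    if ar is not None:
        if not re.fullmatch(r"\d+:\d+", ar):
            raise ValueError(f"aspect ratio must look like 16:9, got {ar!r}")
        parts.append(f"--ar {ar}")
    if stylize is not None:
        if not 0 <= stylize <= 1000:  # assumed bound, see lead-in
            raise ValueError("stylize must be between 0 and 1000")
        parts.append(f"--s {stylize}")
    if chaos is not None:
        if not 0 <= chaos <= 100:
            raise ValueError("chaos must be between 0 and 100")
        parts.append(f"--c {chaos}")
    return " ".join(parts)


castle = add_midjourney_params(
    "A medieval castle on a cliff, sunrise, cinematic, fog",
    ar="16:9", stylize=250, chaos=30,
)
print(castle)
```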

2.3 GPT Image 1.5 (conversational)

Its strength isn't the first generation — it's iterative editing. Use the 8-element formula on the first call, then plain conversation afterwards.

1st: "A young man holding an espresso cup, café window seat, morning light, photorealistic, 50mm lens, shallow depth of field"
2nd: "Same image, but change the cup to a glass of orange juice"
3rd: "Now add a dog sleeping under the table"

Each edit persists. The model works from the previous image, which gives the strongest cross-edit consistency of the three tools.
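The pattern of feeding each result back in as the base for the next instruction can be sketched offline. Everything here is hypothetical: `EditSession` is an illustrative wrapper, and `fake_backend` stands in for a real image API call, whose actual signature you would need to take from the provider's documentation.

```python
from typing import Callable, List, Optional

# Hypothetical backend signature: (instruction, previous_image) -> new_image.
Backend = Callable[[str, Optional[bytes]], bytes]


class EditSession:
    """Keeps the latest image so each instruction edits the previous result."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend
        self.image: Optional[bytes] = None
        self.history: List[str] = []

    def step(self, instruction: str) -> bytes:
        """Send the instruction together with the current image, then store the result."""
        self.image = self.backend(instruction, self.image)
        self.history.append(instruction)
        return self.image


def fake_backend(instruction: str, previous: Optional[bytes]) -> bytes:
    """Stand-in for a real image API: records the chain of instructions as bytes."""
    prefix = previous + b" | " if previous else b""
    return prefix + instruction.encode()


session = EditSession(fake_backend)
session.step("A young man holding an espresso cup, café window seat")
session.step("Same image, but change the cup to a glass of orange juice")
session.step("Now add a dog sleeping under the table")
print(len(session.history))
```

Swapping `fake_backend` for a real client call keeps the session logic unchanged, which makes the edit chain easy to test without spending API credits.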

3. Before / After

Before (vague)

"Pretty landscape photo"

After 1 (specific)

"A photorealistic landscape of a quiet mountain lake at golden hour, mirror-like water reflection, autumn maple trees at the shore, mist rising from the surface, wide-angle composition, shot on 24mm lens, warm orange and teal palette, sharp focus, 8K"

After 2 (Midjourney parameters added)

A photorealistic landscape of a quiet mountain lake at golden hour, mirror-like water reflection, autumn maple trees at the shore, mist rising from the surface, wide-angle composition, shot on 24mm lens, warm orange and teal palette, sharp focus --ar 21:9 --s 200 --c 20

After 3 (GPT Image conversational)

After the first generation: "Same scene but at twilight with a faint full moon over the mountains."

The intent is identical in every version; the author's rough estimate of the resulting quality gap is 5–10×.

4. Seven common mistakes

Mistake	Result	Fix
1. Stacking adjectives ("amazing, beautiful, stunning")	Mostly ignored	Replace with concrete description ("misty rim light, gold-tipped autumn leaves")
2. Negatives ("not blurry")	Ignored or reversed	Use positive form ("sharp focus, fine detail")
3. Too many elements at once	Some get dropped	Keep 3–5 priorities, push the rest into edits
4. Repeating the same word	No effect	Use weights: `((emphasis))` (Stable Diffusion-style UIs) or `word::2` (Midjourney)
5. Generic quality tags ("8K, ultra-realistic")	Weak signal	Describe actual detail ("pores visible on skin, fabric texture")
6. Missing detail anchors for people	Hands and eyes break	"natural hands, anatomically correct, sharp eyes"
7. Not pinning a seed	Can't iterate	`--seed` (Midjourney) or save a generated image to lock in GPT Image
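Several of these mistakes can be caught mechanically before a prompt is submitted. The following is a rough sketch, not a complete linter: `lint_prompt` and the word lists are hypothetical, and the heuristics cover only mistakes 1, 2, 4, and 5.

```python
import re
from collections import Counter
from typing import List

# Illustrative word lists; extend them for your own prompts.
VAGUE_ADJECTIVES = {"amazing", "beautiful", "stunning", "pretty", "gorgeous"}
GENERIC_QUALITY = {"8k", "4k", "ultra-realistic", "masterpiece"}


def lint_prompt(prompt: str) -> List[str]:
    """Return a list of warnings for common prompt mistakes."""
    text = prompt.lower()
    words = re.findall(r"[a-z0-9][a-z0-9-]*", text)
    warnings = []
    # Mistake 1: two or more vague adjectives stacked together.
    if sum(w in VAGUE_ADJECTIVES for w in words) >= 2:
        warnings.append("stacked adjectives: describe concrete detail instead")
    # Mistake 2: negative phrasing such as "not blurry".
    if re.search(r"\b(?:not|no|without)\s+\w+", text):
        warnings.append("negative phrasing: prefer the positive form")
    # Mistake 5: generic quality tags.
    if any(w in GENERIC_QUALITY for w in words):
        warnings.append("generic quality tag: describe actual detail")
    # Mistake 4: the same longer word used three or more times.
    repeated = [w for w, n in Counter(words).items() if n >= 3 and len(w) > 3]
    if repeated:
        warnings.append("repeated words: " + ", ".join(sorted(repeated)))
    return warnings


for warning in lint_prompt("amazing, beautiful, stunning landscape, not blurry, 8K"):
    print(warning)
```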

5. Weights and negative prompts

5.1 Weights (Midjourney)

:: followed by a number sets the relative weight of the prompt segment before it.

red sports car::3, urban street::1, neon signs::0.5

→ The car gets 3× weight, the street is baseline, neon is downweighted.
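To see how the numbers compare, you can parse the weighted segments and express each as a share of the total. This is an illustrative sketch: `relative_shares` is a hypothetical helper, it only handles segments with an explicit numeric weight, and treating weights as simple proportions is an assumption about how to read them, not a description of Midjourney's internals.

```python
import re
from typing import Dict


def relative_shares(prompt: str) -> Dict[str, float]:
    """Parse 'text::weight' segments and return each weight divided by the total."""
    pairs = {
        m.group(1).strip(" ,"): float(m.group(2))
        for m in re.finditer(r"([^:]+?)::\s*(-?\d+(?:\.\d+)?)", prompt)
    }
    total = sum(pairs.values())
    if not pairs or total == 0:
        raise ValueError("no usable weights found")
    return {text: weight / total for text, weight in pairs.items()}


shares = relative_shares("red sports car::3, urban street::1, neon signs::0.5")
for text, share in shares.items():
    print(f"{text}: {share:.0%}")
```

With weights 3, 1, and 0.5 the total is 4.5, so the car accounts for two thirds of the weighted emphasis.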

5.2 Negative prompts (Stable Diffusion-family, Midjourney `--no`)

--no text, watermark, signature, blur, low quality

Midjourney doesn't auto-honor negative phrasing — use the --no parameter. Nano Banana 2 and GPT Image have weaker negative-prompt support; prefer positive phrasing.
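A pipeline that targets several tools can route unwanted traits accordingly. The sketch below is hypothetical throughout: `apply_negatives`, the tool labels, and the `POSITIVE_FORM` mapping are illustrative, and any term without a known positive form is handed back for manual rewriting rather than guessed.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from an unwanted trait to positive phrasing.
POSITIVE_FORM: Dict[str, str] = {
    "blur": "sharp focus",
    "low quality": "fine detail",
}


def apply_negatives(prompt: str, unwanted: List[str], tool: str) -> Tuple[str, List[str]]:
    """Return (prompt, leftover terms that still need a manual positive rewrite)."""
    if tool == "midjourney":
        # Midjourney takes the terms directly through --no.
        return f"{prompt} --no {', '.join(unwanted)}", []
    # Other tools: swap in positive phrasing where a mapping exists.
    extras = [POSITIVE_FORM[u] for u in unwanted if u in POSITIVE_FORM]
    leftover = [u for u in unwanted if u not in POSITIVE_FORM]
    return ", ".join([prompt] + extras), leftover


print(apply_negatives("castle at sunrise", ["text", "blur"], "midjourney"))
print(apply_negatives("castle at sunrise", ["text", "blur"], "gpt-image"))
```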

5.3 Reference images

Tool	How
Nano Banana 2	Attach an image + natural language ("in this style")
Midjourney	`--sref [URL]` for style, `--oref [URL]` for character consistency
GPT Image	Attach an image, then say "in this style"

6. Copy-paste starter templates

Portrait

[Person description] in [Location], [shot type] shot, [Lighting], 
shot on [Lens] with shallow depth of field, photorealistic, 
[Mood] mood, 8K, sharp focus, natural hands and eyes

Landscape

A photorealistic landscape of [Subject] at [Time of day], 
[weather/atmosphere], [composition type] composition, 
shot on [Lens], [color palette] palette, sharp focus, 8K

Illustration / concept art

[Subject] in [Setting], [art style — e.g., Studio Ghibli / Moebius / 
watercolor], [color palette], [lighting], detailed line art, 
[mood], --ar 16:9 --s 400

Product mockup

[Product] on [Surface], studio lighting with softbox front, 
[background — clean white / wooden table], 50mm macro lens, 
shallow depth of field, photorealistic, commercial photography

Book cover / poster

A book cover design for "[Title]" — [layout description], 
[typography — serif/sans, color], [illustration concept], 
[color palette], minimalist editorial design, --ar 2:3
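These templates translate directly into Python format strings. Here is a minimal sketch using the landscape template, assuming the bracketed slots are renamed to `{field}` placeholders; `fill` is a hypothetical helper that refuses to render when a field is missing.

```python
from string import Formatter

LANDSCAPE = (
    "A photorealistic landscape of {subject} at {time_of_day}, "
    "{atmosphere}, {composition} composition, "
    "shot on {lens}, {palette} palette, sharp focus, 8K"
)


def fill(template: str, **values: str) -> str:
    """Render the template, raising KeyError if any placeholder is unfilled."""
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = needed - set(values)
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return template.format(**values)


lake = fill(
    LANDSCAPE,
    subject="a quiet mountain lake",
    time_of_day="golden hour",
    atmosphere="mist rising from the surface",
    composition="wide-angle",
    lens="24mm lens",
    palette="warm orange and teal",
)
print(lake)
```

Checking for missing fields up front avoids sending a half-filled prompt when one value in a large batch is empty.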

Developer notes

Templatize prompts: Python f-strings or LangChain PromptTemplate with {subject}, {lighting}, etc. — essential when generating 100+ images.
Automate quality scoring: GPT-5 Vision or Claude Vision can score "prompt fidelity." Auto-regenerate anything below threshold.
Save (seed, prompt, model_version): a small DB makes good outputs reproducible.
Nano Banana 2 batch: API supports n=4 per call → choose the best automatically.
Midjourney --sref / --oref automation: not recommended via unofficial bots — ToS and stability concerns. Stick with OpenAI/Gemini for production automation.
IP-safety filter: pre-filter prompts for real-person and brand mentions before submission.
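The scoring and logging notes above can be combined into one retry loop. This is a sketch under stated assumptions: `generate` and `score` are hypothetical callables standing in for a real image API and a vision-model scorer, the 0.8 threshold is arbitrary, and the `runs` table layout is only one possible schema.

```python
import sqlite3
from typing import Callable, Optional, Tuple


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small table of (seed, prompt, model_version, score)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "seed INTEGER, prompt TEXT, model_version TEXT, score REAL)"
    )
    return conn


def generate_until_good(
    conn: sqlite3.Connection,
    prompt: str,
    model_version: str,
    generate: Callable[[str, int], bytes],
    score: Callable[[str, bytes], float],
    threshold: float = 0.8,
    max_tries: int = 4,
) -> Optional[Tuple[int, bytes]]:
    """Try seeds 0..max_tries-1, log every attempt, return the first that passes."""
    for seed in range(max_tries):
        image = generate(prompt, seed)
        value = score(prompt, image)
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?, ?)",
            (seed, prompt, model_version, value),
        )
        if value >= threshold:
            conn.commit()
            return seed, image
    conn.commit()
    return None


def fake_generate(prompt: str, seed: int) -> bytes:
    """Stand-in generator: encodes the prompt and seed instead of making an image."""
    return f"{prompt}#{seed}".encode()


def fake_score(prompt: str, image: bytes) -> float:
    """Stand-in scorer: later seeds score higher (0.5, 0.7, 0.9, ...)."""
    return 0.5 + 0.2 * int(image.rsplit(b"#", 1)[1])


conn = open_log()
result = generate_until_good(conn, "black cat, close-up", "demo-model", fake_generate, fake_score)
print(result[0] if result else "no attempt passed")
```

Logging every attempt, not just the winner, keeps failed seeds on record so they are not retried later.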

This is part 5-2 of 11 in the AI Basics series. Next: AI voice/video — Suno, Runway, and Sora.

MaJu Tech Notes