AI Agents I Built (6/7) — Dual-Language Blog Operation: Image Reuse and Dual Publishing
A pipeline design that reduces cost and redundant work when publishing the same content to Korean and English blogs simultaneously.
What This Post Covers
- How to configure a Blogger API v3 dual-publishing pipeline with a single --blog flag for Blog ID switching
- The conditions and automation script structure that bring English blog image generation cost to $0 via image reuse
- A ratio-based mapping algorithm for placing images correctly when Korean and English posts differ in section count
- The Blogger API post size ceiling when using base64 inline images, and mitigation options
Background: Why Dual Operation Needs Automation
Korean technical blogs have a hard audience ceiling set by population. Topics like AI agents, Claude Code, and local LLMs have disproportionately larger search demand in English, making bilingual publishing an effective audience expansion strategy.
The problem: maintaining both languages doubles every step — writing, image generation, publishing, and table of contents management. This post documents an implementation across four axes (pipeline branching, rewrite strategy, image reuse, position mapping) that eliminates most of that duplication cost.
1. Dual Publishing Pipeline — One Flag, One Code Path
Adding a --blog flag to blogger_publish.py allows publishing to both blogs through the same OAuth token and the same code path, switching only the Blog ID.
- Korean: Blog ID 8353076619669124413 (maju-not.blogspot.com)
- English: Blog ID 2499508976191968092 (maju-tech.blogspot.com)
Blogger API v3 uses the same OAuth scope and the same endpoint templates for every blog; only the Blog ID in the request path changes. A single token grants access to all blogs the account owns, so authentication does not need to be duplicated. Publish, update, delete, and list operations all run through a single code path.
Publishing flow:
1. Write Korean post → save to drafts/blog/
2. Publish Korean: python3 blogger_publish.py draft.md --blog kr
3. Save English rewrite
4. Publish English: python3 blogger_publish.py draft_en.md --blog en
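The flag handling described above can be sketched as follows. This is a minimal illustration, not the actual blogger_publish.py: the function names, the default of kr, and the decision to build the request without sending it are assumptions. The Blog IDs come from this post, and the URL follows the Blogger API v3 posts.insert path. Markdown-to-HTML conversion and OAuth token handling are omitted.

```python
import argparse
import json
import urllib.request

# Blog IDs quoted in this post; the keys are the accepted --blog values.
BLOG_IDS = {
    "kr": "8353076619669124413",
    "en": "2499508976191968092",
}

API_ROOT = "https://www.googleapis.com/blogger/v3"


def parse_args(argv=None):
    """Parse the draft path and the --blog flag (default 'kr' is an assumption)."""
    parser = argparse.ArgumentParser(description="Publish a draft to one of two blogs")
    parser.add_argument("draft", help="path to the draft file")
    parser.add_argument("--blog", choices=sorted(BLOG_IDS), default="kr")
    return parser.parse_args(argv)


def build_insert_request(blog_key, title, html, access_token):
    """Build, but do not send, a posts.insert request for the selected blog."""
    url = f"{API_ROOT}/blogs/{BLOG_IDS[blog_key]}/posts/"
    body = json.dumps({"kind": "blogger#post", "title": title, "content": html})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Only the dictionary lookup differs between the two blogs, which is what keeps the rest of the code path shared.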
2. English Posts — Rewriting, Not Translating
Korean and English technical writing follow different stylistic conventions. Korean technical prose conventionally uses formal polite endings; English technical writing favors active voice, short sentences, and direct instruction. Machine translation flattens this difference and produces unnatural output.
The unit of work is therefore "preserve structure and argument + rewrite in style", not translation. Section structure and claims from the original are retained, but prose is rebuilt sentence-by-sentence to match English technical blog conventions. During this process:
- Korean-specific service names and community references are replaced with globally intelligible equivalents
- Indirect phrasing natural in Korean becomes direct assertion
- Background context meaningful only to Korean readers (domestic policy, local market specifics) is condensed or removed in the English version
3. Image Reuse — Conditions for $0 Additional Cost
This is the largest saving. Generating 228 images for 76 Korean blog posts (three per post) with Nano Banana 2 cost approximately $15.3. Reusing the same images for the English blog brings its image generation cost to $0.
Three preconditions must hold for image reuse to work:
- All prompts were written in English from the start: Korean-language prompts can produce Korean text in generated images
- All text within images is in English: diagram labels, speech bubbles, and captions must all be in English
- Content is predominantly technical diagrams: abstract and technical visuals that do not depend on cultural context are language-neutral and reusable
If any precondition breaks — Korean captions, culturally specific illustrations — image reuse is not viable and separate English assets must be generated.
The copy_images_to_en.py automation script:
1. Fetch post lists from both the Korean and English blogs
2. Match Korean and English posts by publication order
3. Extract base64 image divs from Korean post HTML
4. Insert identical images at corresponding positions in English post HTML
5. Update via Blogger API
Processing all 36 English posts takes approximately 2 minutes. The bottleneck is API call rate, not image generation — because image generation is absent from this step entirely.
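Steps 2 and 3 of the script could look roughly like the sketch below. It is not the real copy_images_to_en.py. The regular expression assumes each image is an img tag with a base64 data URI wrapped directly in a div, which may not match the actual markup. Posts are assumed to be dicts carrying the published timestamp from the Blogger post resource, with both blogs using the same UTC offset so that string sorting reflects time order.

```python
import re

# Assumed markup: <div ...><img ... src="data:image/...;base64,..."></div>
IMAGE_DIV = re.compile(
    r'<div[^>]*>\s*<img[^>]*src="data:image/[a-z0-9.+-]+;base64,[^"]+"[^>]*>\s*</div>',
    re.IGNORECASE,
)


def extract_image_divs(html):
    """Return every base64 image div found in the post HTML, in document order."""
    return IMAGE_DIV.findall(html)


def pair_by_publication_order(kr_posts, en_posts):
    """Pair the n-th oldest Korean post with the n-th oldest English post.

    zip() stops at the shorter list, so unmatched posts are silently dropped.
    """
    kr_sorted = sorted(kr_posts, key=lambda post: post["published"])
    en_sorted = sorted(en_posts, key=lambda post: post["published"])
    return list(zip(kr_sorted, en_sorted))
```

Pairing by position is simple, but it only holds while both blogs publish the same posts in the same order.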
4. Ratio-Based Position Mapping — When Section Structure Differs
During rewriting, the English post's section count or order may diverge from the original. Mechanically inserting images at "the same index" produces misaligned context.
The solution is heading-based ratio mapping:
- Calculate the ratio position of each image relative to total h2/h3 headings in the Korean post
- Multiply that ratio by the English post's heading count to determine insertion position
Example: if a Korean post has 10 sections and an image appears before the 3rd section, ratio = 3 / 10 = 30%. If the English post has 8 sections, the target is 8 × 0.30 = 2.4, which rounds to 2, so the image is placed before the 2nd section. The alignment is not exact, but intro-region images land near the intro and conclusion-region images land near the conclusion, preserving contextual coherence.
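A minimal sketch of the ratio mapping, assuming image positions are expressed as "before the k-th heading" (1-based) and that rounding to the nearest heading with clamping is acceptable. The function names and the heading regex are illustrative and not taken from the original script.

```python
import re

# Matches the start of any h2 or h3 opening tag.
HEADING = re.compile(r"<h[23][\s>]", re.IGNORECASE)


def map_position(kr_index, kr_total, en_total):
    """Map 'before the kr_index-th Korean heading' to an English heading index."""
    if kr_total <= 0 or en_total <= 0:
        raise ValueError("both posts need at least one heading")
    ratio = kr_index / kr_total
    # round() sends exact halves to the nearest even integer.
    en_index = round(ratio * en_total)
    # Clamp so the image always lands before an existing heading.
    return max(1, min(en_total, en_index))


def insert_before_heading(html, image_html, en_index):
    """Insert image_html immediately before the en_index-th h2/h3 (1-based)."""
    starts = [match.start() for match in HEADING.finditer(html)]
    offset = starts[en_index - 1]
    return html[:offset] + image_html + html[offset:]
```

With the numbers from the example, map_position(3, 10, 8) returns 2. The clamp keeps an image that sits very early in a long Korean post from mapping to position 0 in a shorter English post.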
5. Cost Structure
| Item | Korean Blog | English Blog | Total |
|---|---|---|---|
| Image generation | $15.3 | $0 | $15.3 |
| Blogger API | Free | Free | Free |
| Domain | blogspot (free) | blogspot (free) | Free |
| Content writing | Claude API | Claude API | Claude API cost only |
When image reuse preconditions hold, the marginal cost of operating an additional English blog converges to Claude API call cost for the rewrite. Additional cost for images, hosting, and publishing infrastructure is $0.
Limitations and Improvement Directions
Actual rewrite cost: Machine translation cannot reliably meet quality standards, so each English post requires effort roughly equivalent to rewriting a substantial portion of the original. A two-stage pipeline — translation API for a first draft, then style correction with an LLM — is a potential alternative.
Post-order matching fragility: The current image copy script matches Korean and English posts by publication order. Adding posts to only one blog misaligns the sequence and causes images to be inserted into the wrong posts. The improvement path is either title-similarity matching (embedding-based) or an explicit kr_post_id → en_post_id mapping table maintained at publish time.
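The explicit mapping table option could be as small as a JSON file updated whenever an English post is published. The sketch below is one hypothetical shape for it; the file layout and both function names are assumptions, not part of the existing scripts.

```python
import json
from pathlib import Path


def record_mapping(table_path, kr_post_id, en_post_id):
    """Store kr_post_id -> en_post_id in a JSON file and return the full table."""
    path = Path(table_path)
    table = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    table[kr_post_id] = en_post_id
    path.write_text(json.dumps(table, indent=2), encoding="utf-8")
    return table


def lookup_en_post(table_path, kr_post_id):
    """Return the English post ID for a Korean post, or None if it has no pair."""
    path = Path(table_path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get(kr_post_id)
```

A missing entry returns None, so a post that exists on only one blog is skipped instead of shifting every later pairing.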
Base64 inline image size ceiling: Inlining images as base64 inflates post HTML rapidly. Distributing 228 images across posts resulted in some post HTML exceeding 2 MB. Blogger API enforces a per-request post size limit of approximately 1 MB, causing update failures on affected posts. Mitigation options: (1) upload images to an external CDN and replace inline data with URL references; (2) cap images per post; (3) reduce base64 payload size via image resize and WebP conversion.
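Before calling the update endpoint, a script can check whether a post would exceed the ceiling. The sketch below assumes the limit applies to the UTF-8 byte size of the post HTML and uses the approximate 1 MB figure quoted above, which is not an officially documented value. The helper names are illustrative.

```python
import math

# The "approximately 1 MB" figure from this post; treat it as an assumption.
ASSUMED_LIMIT_BYTES = 1_000_000


def base64_length(raw_bytes):
    """Length of the padded base64 encoding of raw_bytes bytes (about 4/3 larger)."""
    return 4 * math.ceil(raw_bytes / 3)


def post_size_bytes(html):
    """Size of the post HTML when encoded as UTF-8."""
    return len(html.encode("utf-8"))


def exceeds_limit(html, limit=ASSUMED_LIMIT_BYTES):
    """True when the post HTML is larger than the assumed ceiling."""
    return post_size_bytes(html) > limit
```

Because base64 adds roughly one third to each image, a 300,000-byte image costs 400,000 bytes of HTML, so two or three such images are enough to cross the assumed limit.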
Scope and Open Questions
This structure extends directly to the general problem of scaling the same content across N languages. Expanding the --blog flag from kr/en to kr/en/ja and maintaining image reuse preconditions (English prompts, language-neutral diagrams) brings Japanese blog image cost to $0 as well.
Two open questions remain:
- How far can the rewrite process be automated? Does a two-stage pipeline (translation API + style-correction LLM) approach the quality of manual rewriting?
- When switching post matching to title similarity, how should the similarity threshold be calibrated to correctly exclude posts that exist on only one blog?
Resolving both would bring bilingual operation to the level of "write once, publish twice."
Series overview: Series index