AI Agents I Built (5/7) — Building an Automated Blogger API Publishing System

Designing a Markdown-Based Auto-Publish Pipeline with Blogger API v3 + OAuth 2.0


Key Summary

  • Blogger API v3 + OAuth 2.0 enables a pipeline that converts Markdown files into publishable HTML posts
  • blogger_publish.py handles the full flow — frontmatter parsing → body cleanup → HTML conversion → CSS injection → API post — in a single script
  • For bulk publishing, quota management (delay, retry, TOC suppression) is the critical design concern

Platform Selection Criteria

API access is mandatory for AI agent-driven content publishing. A comparison of platforms:

| Platform | API | Cost | AdSense | Decision |
|---|---|---|---|---|
| WordPress.com | REST API | Paid plan required | Paid only | Cost overhead |
| Ghost | REST API | Self-hosted or paid | Manual setup | Ops overhead |
| Blogger | v3 API | Free | Native integration | Adopted |

Blogger is free, exposes a REST API, and integrates natively with AdSense. For personal blog automation, this combination is the most practical.


Body

1. OAuth 2.0 Authentication Flow

Blogger API v3 requires OAuth 2.0 authentication. The full flow:

  1. Create OAuth 2.0 client ID in Google Cloud Console → download client_secret.json
  2. blogger_auth.py runs local server auth via InstalledAppFlow (port 8080)
  3. User logs into Google account in browser and grants permissions
  4. token.json is generated — contains access token + refresh token
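  5. Subsequent requests auto-refresh using the refresh token

from google_auth_oauthlib.flow import InstalledAppFlow
# client_secret: path to client_secret.json; scopes: e.g. ["https://www.googleapis.com/auth/blogger"]
flow = InstalledAppFlow.from_client_secrets_file(client_secret, scopes)
creds = flow.run_local_server(port=8080)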

Browser authentication is required only once. After that, the refresh token in token.json handles all renewals automatically — enabling the agent to publish without human intervention.
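The load-or-refresh behaviour described above could be sketched as follows. This is a minimal sketch, not the contents of blogger_auth.py: the function name load_credentials and its defaults are assumptions, while the library calls are the standard google-auth and google-auth-oauthlib ones.

```python
from pathlib import Path

SCOPES = ["https://www.googleapis.com/auth/blogger"]


def load_credentials(token_path="token.json", secret_path="client_secret.json"):
    """Return valid credentials, opening the browser only when no usable token exists."""
    # Imported lazily so this module loads even where the Google libraries are absent.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if Path(token_path).exists():
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)

    if creds and creds.valid:
        return creds

    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())  # silent renewal, no browser involved
    else:
        flow = InstalledAppFlow.from_client_secrets_file(secret_path, SCOPES)
        creds = flow.run_local_server(port=8080)  # one-time browser consent

    # Persist the (possibly renewed) tokens for the next run.
    Path(token_path).write_text(creds.to_json())
    return creds
```

An agent would call load_credentials() at the start of every run; the browser branch is reached only on first use or after the refresh token has been revoked.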

Token expiry warning: If the OAuth consent screen's publishing status is "Testing" in Google Cloud Console (external user type), the refresh token expires after 7 days. For long-term operation, the app must be switched to "In production" so that the refresh token persists.

2. blogger_publish.py — Core Publishing Script

This script is the center of the auto-publish pipeline. It accepts a Markdown file as input and publishes it to Blogger.

Processing flow:

Markdown file → frontmatter parse → body cleanup → HTML convert → CSS inject → API publish

Frontmatter parsing: Extracts title and labels from the --- block at the top of the Markdown file. Supports both Korean and English keys: 제목: (title), 라벨: (labels), 태그: (tags), etc.
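A frontmatter parser of this kind might look like the sketch below. The alias sets, the comma-separated label format, and the name parse_frontmatter are assumptions for illustration; the original script's exact rules are not shown in the text.

```python
import re

# Korean and English key aliases; the exact alias set is an assumption.
TITLE_KEYS = {"title", "제목"}
LABEL_KEYS = {"labels", "label", "tags", "라벨", "태그"}

# A leading block delimited by two lines consisting of "---".
FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)", re.DOTALL)


def parse_frontmatter(text):
    """Split a Markdown document into (title, labels, body)."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        return None, [], text

    title, labels = None, []
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip().lower(), value.strip()
        if key in TITLE_KEYS:
            title = value.strip("\"'")
        elif key in LABEL_KEYS:
            # Accept "a, b" as well as "[a, b]".
            parts = value.strip("[]").split(",")
            labels = [p.strip(" \"'") for p in parts if p.strip()]
    return title, labels, text[match.end():]
```

A file without a leading --- block comes back unchanged with an empty label list, so the caller can decide whether a missing title is an error.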

Body cleanup (clean_body):

  • H1 removal: Blogger renders the post title separately, making a Markdown H1 redundant. clean_body removes H1 to prevent double display.
  • Replaces 핵심 요약 with "Key Summary"
  • Converts [Figure: description] notation into image placeholder divs
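The three cleanup rules can be expressed with a few regular expressions. The sketch below assumes the H1 is the first non-blank line of the body and uses a hypothetical figure-placeholder class name; the original clean_body may differ on both points.

```python
import html
import re

# Only an H1 that opens the document is removed, so "# comment" lines
# inside later code blocks are left alone (an assumption of this sketch).
LEADING_H1_RE = re.compile(r"\A\s*#[ \t]+[^\n]*\n?")
FIGURE_RE = re.compile(r"\[Figure:\s*(.+?)\]")


def _figure_div(match):
    # The class name "figure-placeholder" is hypothetical.
    caption = html.escape(match.group(1))
    return f'<div class="figure-placeholder">{caption}</div>'


def clean_body(body):
    """Apply the three cleanup rules to a Markdown body."""
    body = LEADING_H1_RE.sub("", body, count=1)          # 1. drop the duplicate title
    body = body.replace("핵심 요약", "Key Summary")        # 2. normalise the summary heading
    body = FIGURE_RE.sub(_figure_div, body)              # 3. figure notes become divs
    return body.lstrip("\n")
```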

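HTML conversion: Uses the Python markdown library with the extra, tables, and fenced_code extensions. The extra extension already bundles tables and fenced_code, so listing them separately is redundant but harmless.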

CSS injection (BLOG_CSS): Injects a consistent style across all posts:

| Element | Style |
|---|---|
| Body text | line-height 1.9, 16px |
| Code blocks | Dark terminal style (background #1e1e2e) |
| Table headers | Blue background (#4a90d9) |
| Blockquote | Left blue border + light background, used for subtitles |
| Divider | 2px solid #e0e0e0 |
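As a rough illustration, the styles in the table could be injected like this. Only the values listed in the table come from the source; the selectors, the wrapper class, the text colours, paddings, and the light blockquote background are assumptions of this sketch.

```python
# Values from the style table; everything else in the rules is assumed.
BLOG_CSS = """
<style>
.post-md { line-height: 1.9; font-size: 16px; }
.post-md pre { background: #1e1e2e; color: #e0e0e0; padding: 1em; overflow-x: auto; }
.post-md th { background: #4a90d9; color: #ffffff; }
.post-md blockquote { border-left: 4px solid #4a90d9; background: #f5f9ff; padding: 0.5em 1em; }
.post-md hr { border: 0; border-top: 2px solid #e0e0e0; }
</style>
"""


def inject_css(html_body):
    """Prefix the post HTML with the shared style block, once."""
    if "<style>" in html_body:
        return html_body  # already styled, so repeated runs change nothing
    return f'{BLOG_CSS.strip()}\n<div class="post-md">\n{html_body}\n</div>'
```

Scoping every rule under a wrapper class keeps the injected styles from leaking into the rest of the Blogger theme.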

Without direct CSS injection, Blogger's default theme styles apply to code blocks and tables, significantly reducing readability.
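The last stage of the flow, the API post, maps onto the posts.insert method of Blogger API v3. The sketch below is an assumption about how that call might be wrapped, not the script's actual code: insert_post and build_post_body are hypothetical names, and service is expected to be the object returned by googleapiclient.discovery.build("blogger", "v3", credentials=creds).

```python
def build_post_body(title, html, labels):
    """Assemble the JSON body for a Blogger v3 posts.insert call."""
    return {
        "kind": "blogger#post",
        "title": title,
        "content": html,
        "labels": list(labels),
    }


def insert_post(service, blog_id, title, html, labels=(), draft=False):
    """Create a post and return the API response as a dict."""
    request = service.posts().insert(
        blogId=blog_id,
        body=build_post_body(title, html, labels),
        isDraft=draft,  # True keeps the post unpublished
    )
    return request.execute()
```

Passing the service object in as a parameter keeps the function easy to exercise with a stub, without touching the network.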

3. Command-Line Interface

python3 blogger_publish.py draft.md                    # publish a new post

python3 blogger_publish.py draft.md --draft            # save as a draft

python3 blogger_publish.py draft.md --update POST_ID   # update an existing post

python3 blogger_publish.py --list                      # list posts

python3 blogger_publish.py --delete POST_ID            # delete a post

python3 blogger_publish.py draft.md --blog en          # publish to the English blog

The --blog kr/en flag switches between the Korean and English blogs. Each blog has a separate Blog ID; both share a single token.json for authentication.
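An interface like this is straightforward to express with argparse. The following is a sketch under assumptions: the Blog IDs are placeholders (the real ones are not given in the text), and build_parser and resolve are hypothetical names.

```python
import argparse

# Placeholder IDs; the real Blog IDs are not given in the text.
BLOG_IDS = {"kr": "KR_BLOG_ID", "en": "EN_BLOG_ID"}


def build_parser():
    parser = argparse.ArgumentParser(prog="blogger_publish.py")
    parser.add_argument("file", nargs="?", help="Markdown file to publish")
    parser.add_argument("--draft", action="store_true", help="save as a draft")
    parser.add_argument("--update", metavar="POST_ID", help="update an existing post")
    parser.add_argument("--list", action="store_true", help="list posts")
    parser.add_argument("--delete", metavar="POST_ID", help="delete a post")
    parser.add_argument("--blog", choices=sorted(BLOG_IDS), default="kr")
    parser.add_argument("--no-toc", action="store_true", help="skip the TOC update")
    return parser


def resolve(argv):
    """Parse argv and return (args, blog_id) for the selected blog."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not (args.file or args.list or args.delete):
        parser.error("a Markdown file is required unless --list or --delete is given")
    return args, BLOG_IDS[args.blog]
```

Because only the Blog ID changes between the two blogs, the same credentials object can be reused for either value of --blog.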

4. Automatic TOC Update (update_toc.py)

Each time a post is published, the blog's table of contents page is updated automatically.

Operation:

  1. Fetch full post list via Blogger API
  2. Classify posts into categories by label (priority: Embedded > Claude Code > Local LLM > OpenClaw > AI Agents)
  3. Generate HTML table of contents (by category + by series + by subgroup)
  4. Update TOC page via Blogger Pages API

Series posts are grouped separately. OpenClaw posts are classified into subgroups: "Getting Started / Installation & Setup / Architecture & Advanced."
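The priority rule in the classification step means a post carrying several labels lands in exactly one category. A minimal sketch, assuming label names match the category names exactly and that uncategorised posts fall into a hypothetical "Other" bucket:

```python
from collections import defaultdict

# Priority order from the text: the first matching category wins.
CATEGORY_PRIORITY = ["Embedded", "Claude Code", "Local LLM", "OpenClaw", "AI Agents"]
FALLBACK = "Other"  # hypothetical bucket for posts with no known label


def classify(labels):
    """Return the highest-priority category found among a post's labels."""
    label_set = set(labels)
    for category in CATEGORY_PRIORITY:
        if category in label_set:
            return category
    return FALLBACK


def group_posts(posts):
    """Group post titles by category, keeping the priority order of the keys."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[classify(post.get("labels", []))].append(post["title"])
    ordered = CATEGORY_PRIORITY + [FALLBACK]
    return {name: grouped[name] for name in ordered if name in grouped}
```

The title and labels keys match the fields of a Blogger API post resource, so the output of a post list fetch can be passed in directly.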

blogger_publish.py automatically calls update_toc.py after a successful publish. Use the --no-toc flag to suppress this behavior.

5. Bulk Publishing — Quota Management Design

Use batch_publish.py for bulk migration of existing posts or batch publishing.

import time

for filepath in sorted_files:  # Markdown files in publish order
    title, labels, html = parse_blog_post(filepath)
    publish_post(title, html, labels)
    time.sleep(10)  # 10-second delay between posts

Three quota design principles:

① 10-second delay between posts: This is the baseline interval for avoiding Blogger API rate limits, and it keeps a batch within the free quota.

② 429 retry handling: On rate limit (HTTP 429), wait 60 seconds and retry. Without this logic, failures accumulate during bulk publishing.

③ TOC call suppression during batch operations: Per-post API call structure:

  • 1 publish call
  • TOC update: 1 full post list fetch + 1 page update

Updating the TOC on every publish triples the API call count. For bulk operations, suppress TOC updates with --no-toc and run it once after all posts are published — this is the quota-efficient design.
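The three principles combine naturally into one loop. The sketch below is illustrative rather than the contents of batch_publish.py: RateLimited is a hypothetical stand-in for whatever error the API client raises on HTTP 429, and publish and update_toc are injected callables so the loop can be exercised without network access.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by the API client."""


def publish_with_retry(publish, item, retry_wait=60, max_retries=3, sleep=time.sleep):
    """Call publish(item); on a 429, wait retry_wait seconds and try again."""
    for attempt in range(max_retries + 1):
        try:
            return publish(item)
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(retry_wait)


def batch_publish(items, publish, update_toc, delay=10, sleep=time.sleep):
    """Publish every item with a delay in between, then refresh the TOC once."""
    results = []
    for index, item in enumerate(items):
        if index:  # no need to wait before the first post
            sleep(delay)
        results.append(publish_with_retry(publish, item, sleep=sleep))
    if results:
        update_toc()  # one list fetch + one page update for the whole batch
    return results
```

With the call structure above, publishing N posts with a TOC update each time costs 3N calls, while this loop costs N + 2. For 50 posts that is 150 calls versus 52, assuming the post list fits in a single fetch.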

Bulk style update for existing posts: If CSS injection was added to the pipeline after posts were already published, those posts lack styling. batch_update_style.py injects CSS into all existing posts in bulk.
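A bulk restyle pass needs two things: walking every post across result pages, and patching only the content field. The following sketch assumes a service object built with googleapiclient for Blogger v3; iter_posts and restyle_all are hypothetical names, and the "<style" marker check is an assumed way to skip posts that were already styled.

```python
def iter_posts(service, blog_id, page_size=50):
    """Yield every post of a blog, following nextPageToken across pages."""
    token = None
    while True:
        page = service.posts().list(
            blogId=blog_id, maxResults=page_size, pageToken=token
        ).execute()
        yield from page.get("items", [])
        token = page.get("nextPageToken")
        if not token:
            break


def restyle_all(service, blog_id, css, marker="<style"):
    """Prepend css to each post that does not already carry a style block."""
    updated = []
    for post in iter_posts(service, blog_id):
        content = post.get("content", "")
        if marker in content:
            continue  # already styled, so re-running the script is safe
        service.posts().patch(
            blogId=blog_id, postId=post["id"], body={"content": css + content}
        ).execute()
        updated.append(post["id"])
    return updated
```

Each patch is a separate API call, so the same delay and 429 retry handling used for batch publishing would apply to a real run.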

6. Dual-Language Operation (Korean + English)

Two blogs operated simultaneously:

| Blog | Purpose | Flag |
|---|---|---|
| Korean blog | Primary blog, Korean content | --blog kr (default) |
| English blog | English content, same topics | --blog en |

The same topic is written in both Korean and English and published to each respective blog. The script switches Blog IDs internally, so a single token.json handles authentication for both.


Conclusion

The core structure of the Blogger API auto-publish pipeline is three steps: Markdown → HTML → API. No complex CMS required — write in Markdown, publish with one script.

The most critical design consideration is quota management. Free APIs offer broad accessibility but impose daily call limits. For bulk operations, delay and retry logic are mandatory. Suppressing unnecessary chained API calls — such as updating the TOC on every single publish — is the key to stable long-term operation.

Series overview: Series index
