I Automated a Knowledge Base with 6 Serverless Agents
The knowledge base had 365 articles. The product team shipped features faster than anyone could document them. Only 1% of released features had matching documentation. So I built a pipeline: 6 serverless agents that turn project tickets into published knowledge base articles. Here's the full architecture.
The Problem
There were 4,640 feature tickets spanning 10 months. Each one represented a shipped product change. Some were major features, some were small fixes. Almost none of them had a corresponding knowledge base article.
The numbers were grim:
- 4,640 feature tickets analyzed
- 1% had a linked documentation article
- 192 tickets in "Ready for QA" or later — shipped but undocumented
- 824 tickets stuck in design sub-stages — a bottleneck I couldn't control
The documentation gap wasn't laziness. The team was small, features shipped fast, and writing knowledge base articles was always the thing that got cut when deadlines hit. I needed a system that could detect when documentation was needed, draft articles automatically, and route them through human review — all without anyone having to remember to do it.
The Pipeline at a Glance
Six automated agents. Three human checkpoints. Everything runs on Supabase Edge Functions and pg_cron — no servers, no containers, no infrastructure to maintain. Here's each agent in detail.
The 6 Agents
1. Ticket Refresh (daily, 6:30 AM)
Syncs tickets from 3 project boards into a normalized pipeline_tickets table (842+ rows). Detects late-stage features (Ready for QA, Final Release Prep) as documentation candidates. Entry point for the entire pipeline.
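The late-stage detection described here can be sketched as a small pure function. The `RawTicket` shape and the `isDocCandidate` flag are illustrative assumptions, not the real schema:

```typescript
// Sketch: normalize a raw board ticket and flag late-stage statuses
// as documentation candidates. Field names are assumptions.
type RawTicket = { id: string; board: string; status: string; title: string };
type PipelineTicket = RawTicket & { isDocCandidate: boolean };

// Statuses the article names as documentation triggers
const LATE_STAGE = new Set(["Ready for QA", "Final Release Prep"]);

function normalize(t: RawTicket): PipelineTicket {
  return { ...t, isDocCandidate: LATE_STAGE.has(t.status) };
}
```

The point is that candidacy is a deterministic status check, not a judgment call.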
2. Triage (daily, 7:00 AM)
Gemini Flash classifies each new ticket in batches of 25 (temperature 0.1 for consistency). Returns: needs_article, priority tier (1-4), suggested template, and reasoning. Creates candidate rows for human approval.
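A minimal sketch of the batching and the result shape this step implies. The Gemini call itself is elided; the `TriageResult` fields mirror the ones listed above, but the exact names are assumed:

```typescript
// Result shape the triage agent returns per ticket (names assumed)
type TriageResult = {
  ticketId: string;
  needs_article: boolean;
  tier: 1 | 2 | 3 | 4;
  template: string;
  reasoning: string;
};

// Chunk tickets into batches of 25 before sending them to the model
function batch<T>(items: T[], size = 25): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}
```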
3. Source Collector (on demand)
Aggregates 5-6 data sources per article: ticket description, comments, attachments, linked tickets, keyword-matched existing articles, PM recordings. Assesses completeness: complete, partial, or pm_call_needed.
4. Writer v2 (on demand)
Gemini Pro generates dual output: a Google Doc draft + CMS-ready HTML. Reads the style guide (32 rules), component library (49 components), and template. For edits: fetches existing CMS HTML and performs a merge guard to prevent missing sections.
5. Ticket Sync (on demand)
Manages the lifecycle: transitions ticket statuses, reassigns owners per stage, creates UX review tasks, posts doc links and staging URLs back to ticket comments. Sends Slack notifications at each transition.
6. CMS Push (on demand)
Two-phase publish: draft (PATCH to staging) then live (item-level POST to production). For edits: runs a merge guard comparing headers before overwriting. Auto-generates slugs for new articles.
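The merge guard might look roughly like this: diff the section headers of the existing article against the new draft and refuse the push if anything would silently disappear. The header-regex approach is my assumption, not the pipeline's actual implementation:

```typescript
// Extract h2/h3 header text from CMS HTML (illustrative regex approach)
function headers(html: string): string[] {
  return [...html.matchAll(/<h[23][^>]*>(.*?)<\/h[23]>/gi)].map((m) => m[1].trim());
}

// Refuse an overwrite that would drop a section from the live article
function mergeGuard(oldHtml: string, newHtml: string): { ok: boolean; missing: string[] } {
  const next = new Set(headers(newHtml));
  const missing = headers(oldHtml).filter((h) => !next.has(h));
  return { ok: missing.length === 0, missing };
}
```

Adding sections passes the guard; only deletions block the push.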
The 9 Stages
The pipeline has a deterministic contract — a reference document that defines exactly what happens at each stage, who owns it, and what the exit criteria are. No ambiguity. This is what makes it possible for AI agents to participate reliably.
Triage
Automated: Gemini Flash classifies tickets. Human: I review classifications and approve or reject.
Scoping
Human: Refine the article count, tier, template, and target platforms. Some tickets need multiple articles.
Source Collection
Automated: Agent gathers 5-6 sources. Human: Schedule a PM walkthrough if sources are incomplete.
AI Draft
Automated: Gemini Pro generates draft text + CMS HTML, a dual output from a single API call. Human: Quick quality review.
Editor Review
Human: A UX writer polishes wording, links, and tips, and flags accuracy concerns. This is where the article becomes publishable.
PM Review
Human: The product manager verifies technical accuracy. They can approve, request minor fixes, send back for major rework, or change scope entirely.
CMS Build
Human: Replace screenshot placeholders with real captures, set taxonomy, push to staging via API.
Staging QA
Human: Visual review on the staging site, checking component rendering, links, images, and mobile responsiveness.
Published
Automated: Item-level publish pushes the specific article to production without touching anything else on the site.
The Architecture Decisions That Mattered
Why "deterministic contracts" over "just let AI figure it out"
The biggest lesson from this project: AI agents need contracts, not vibes.
I wrote a 54KB reference document (I call it the "harness") that defines every stage, every decision tree, every edge case. When the triage agent classifies a ticket, it doesn't "think about" whether an article is needed — it evaluates against explicit criteria. When the source collector runs, it doesn't "try to find information" — it checks 5 specific sources in a specific order and returns a completeness score.
This sounds rigid, and it is. That's the point. AI agents are great at executing well-defined tasks. They're terrible at deciding which tasks to execute. The contract does the deciding. The agents do the executing.
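One way to encode such a contract is as data plus predicates: each stage declares its owner and its exit criteria, and an agent checks conditions instead of deciding for itself. This is a sketch under that assumption; the stage names and criteria here are illustrative, not the actual 54KB harness:

```typescript
// A stage contract: who owns it, and an explicit exit condition
type Stage = {
  name: string;
  owner: "automated" | "human";
  exitCriteria: (ctx: Record<string, unknown>) => boolean;
};

const stages: Stage[] = [
  { name: "Triage", owner: "automated", exitCriteria: (c) => c.approved === true },
  { name: "AI Draft", owner: "automated", exitCriteria: (c) => typeof c.draftHtml === "string" },
];

// An agent never "decides" to advance; it evaluates the contract
function canAdvance(stage: Stage, ctx: Record<string, unknown>): boolean {
  return stage.exitCriteria(ctx);
}
```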
Why dual output (text + HTML) from a single call
The writer agent generates two outputs from one Gemini Pro call:
- Draft text — structured for a Google Doc (headings, paragraphs, bullet lists). This is what the editor and PM review.
- Draft HTML — CMS-ready markup with the exact component library tags. This goes directly into the CMS.
Why not generate text first, then convert to HTML? Because the conversion step is where quality dies. CMS component syntax is specific and brittle — a `<button>` tag gets stripped by the rich text editor, so you need `<div role="button">`. Complex `<div>` structures inside `<li>` elements get mangled. These are rules the AI can learn from the style guide on the first pass, but a post-hoc converter would miss them every time.
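A hedged sketch of consuming that dual output, assuming the prompt asks the model for a single JSON object carrying both renditions (the `docText`/`cmsHtml` names are my placeholders):

```typescript
// One model response, two renditions of the same article
type DualDraft = { docText: string; cmsHtml: string };

function parseDualDraft(raw: string): DualDraft {
  // Strip an optional ```json fence the model may wrap around its answer
  const json = raw.replace(/^```(?:json)?\s*|\s*```$/g, "");
  const parsed = JSON.parse(json) as Partial<DualDraft>;
  if (typeof parsed.docText !== "string" || typeof parsed.cmsHtml !== "string") {
    throw new Error("model response missing docText or cmsHtml");
  }
  return parsed as DualDraft;
}
```

Validating both fields up front means a malformed response fails loudly instead of producing a half-usable draft.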
Why source completeness scoring
The source collector doesn't just gather information — it tells you when there isn't enough. Each article gets a completeness assessment:
- Complete: ticket has substantive description, comments from PM, and linked mockups. Proceed to draft.
- Partial: ticket description is empty or vague. I add manual notes before drafting.
- PM call needed: Tier 3-4 feature with no structured data anywhere. Schedule a recorded walkthrough with the PM.
This single feature prevents the #1 failure mode of AI content generation: hallucination from insufficient context. If the AI doesn't have enough real information, it will confidently make things up. The completeness flag surfaces data gaps before the AI ever touches the content.
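A minimal sketch of that scoring logic; the thresholds and field names are assumptions, not the pipeline's actual rules:

```typescript
// Inputs gathered by the source collector (shape assumed)
type Sources = {
  description: string;
  pmComments: number;
  mockups: number;
  tier: 1 | 2 | 3 | 4;
};

function completeness(s: Sources): "complete" | "partial" | "pm_call_needed" {
  const hasSubstance = s.description.trim().length > 80; // illustrative threshold
  if (hasSubstance && s.pmComments > 0 && s.mockups > 0) return "complete";
  if (s.tier >= 3 && !hasSubstance && s.pmComments === 0) return "pm_call_needed";
  return "partial";
}
```

The value is in the gate, not the exact thresholds: drafting only starts once the score says the context is real.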
Why item-level CMS publish
This was a March discovery that changed everything. The CMS API has two publish modes:
- Site-wide publish — publishes everything, including your staging sandbox, test pages, and half-finished drafts. Dangerous.
- Item-level publish — publishes a specific article without touching anything else. Safe.
```
// Item-level publish — only this article goes live
POST /collections/{collection_id}/items/publish
Body: { "itemIds": ["article_123"] }
```
This sounds obvious, but I didn't know it existed for weeks. I was doing site-wide publishes and accidentally pushing sandbox content to production. Item-level publish eliminated that entire class of errors.
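In an Edge Function, the same call might be built like this. `buildPublishRequest` only constructs the request so it can be inspected or logged; actually sending it is left to the caller, and the host and token handling are assumptions:

```typescript
// Build (but do not send) the item-level publish request
function buildPublishRequest(collectionId: string, itemIds: string[], token: string) {
  return {
    url: `https://api.webflow.com/v2/collections/${collectionId}/items/publish`,
    method: "POST" as const,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ itemIds }),
  };
}

// Usage sketch:
// const req = buildPublishRequest("col_1", ["article_123"], apiToken);
// await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```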
The Reference Data Layer
The agents don't operate in a vacuum. They read from 5 reference tables that encode institutional knowledge:
- Components (49 rows) — every CMS component with its HTML syntax, usage rules, and rendering behavior
- SVG Icons (58 rows) — available icons for use in articles
- Style Guide (32 rows) — brand rules, tone of voice, formatting standards
- Templates (6 rows) — article templates (new article, edit, how-to, troubleshooting, etc.)
- Taxonomy (124 rows) — category and subcategory mappings for the CMS
This is essentially a knowledge graph that the AI agents query at runtime. When the writer agent needs to format a step-by-step guide, it reads the component table to find the correct HTML syntax for step blocks, substep blocks, and collapsible sections. When it needs to assign a category, it reads the taxonomy table.
Maintaining these tables is cheap (I update them when components change) and the payoff is huge: the AI generates CMS-ready HTML on the first try instead of requiring manual fixups.
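At runtime the lookup can be as simple as this sketch, with a plain map standing in for the Supabase table; the component names and HTML here are invented placeholders, not rows from the real reference data:

```typescript
// One row of the component reference table (shape assumed)
type Component = { name: string; html: string; usage: string };

// In the real pipeline this would be a Supabase query, not an in-memory map
const componentTable = new Map<string, Component>([
  ["step_block", { name: "step_block", html: '<div class="step">...</div>', usage: "numbered steps" }],
]);

function componentHtml(name: string): string {
  const c = componentTable.get(name);
  if (!c) throw new Error(`unknown component: ${name}`);
  return c.html;
}
```

Failing on an unknown component is deliberate: better a loud error at draft time than invented markup in the CMS.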
The Stack
Agents: 6 Supabase Edge Functions (Deno runtime)
Scheduling: pg_cron (ticket refresh daily at 6:30 AM, triage daily at 7:00 AM; everything else runs on demand)
AI: Gemini Flash (triage classification), Gemini Pro (article drafting)
Database: Supabase Postgres (20+ tables, 7 schemas)
CMS: Webflow Data API v2 (draft + item-level publish)
Integrations: project tracker API, Google Docs, Slack notifications
Tracker UI: React SPA on Vercel (Kanban board for article status)
Cost: Supabase free tier + Gemini API usage. No servers.
What I Learned
1. Project tickets are the worst data source
This was the biggest surprise. You'd think the ticket that describes the feature would be the best source of information. It's not. Most project tickets have vague descriptions, no attachments, and scattered comments. The "golden triangle" of sources is: recorded PM walkthrough + design mockups + UX copy. The project tracker is just the index.
2. The AI triage is 90% accurate
Gemini Flash at temperature 0.1 does remarkably well at classifying whether a ticket needs documentation. But the 10% it gets wrong are the important ones — edge cases where a "small fix" actually changes user-facing behavior. The human approval gate catches these. Never skip it.
3. CMS component syntax is the hardest part
Getting the AI to write good prose is easy. Getting it to output valid CMS HTML with the right component tags, the right nesting rules, and the right attribute syntax — that's where the reference tables earn their keep. Without the 49-row component table, every draft would need manual HTML fixups.
4. Source completeness prevents hallucination
If I could give one piece of advice to anyone building AI content pipelines: score your sources before generating. The completeness check (complete / partial / pm_call_needed) is the single most impactful feature in the entire pipeline. It prevents bad articles from ever being generated.
5. Item-level publish is a must
Site-wide CMS publishes are a footgun. One wrong push and your sandbox test page is on production. Item-level publish eliminates this. If your CMS supports it, use it. If it doesn't, build a guard.
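If you have to build that guard yourself, a minimal version is an allowlist check that refuses an empty publish (which would fall back to site-wide) or any unapproved item. This is a sketch, not the pipeline's actual guard:

```typescript
// Refuse publishes that are empty or include non-approved item ids
function assertSafePublish(itemIds: string[], approved: Set<string>): void {
  if (itemIds.length === 0) throw new Error("refusing empty publish (would be site-wide)");
  const rogue = itemIds.filter((id) => !approved.has(id));
  if (rogue.length) throw new Error(`unapproved items: ${rogue.join(", ")}`);
}
```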
6. The review chain is the bottleneck (and that's okay)
The PM review stage has a 3-5 day SLA. That's the slowest part of the pipeline by far. But it's also the most important — it's where technical accuracy gets verified. I tried shortening it. Don't. Let the humans do the thing humans are good at: judging accuracy. Let the machines do the thing machines are good at: drafting, formatting, and pushing buttons.
The Numbers
4,640 project tickets scanned
365+ existing knowledge base articles (our baseline)
49 CMS components in the reference table
32 style guide rules
124 taxonomy categories
6 Edge Functions
2 pg_cron jobs (daily automated runs)
0 servers to maintain
Would I Build This Again?
Yes, but I'd change two things:
- Start with the reference tables. I built the agents first and the style guide / component table later. Everything improved when I added them. They should have been day-one work.
- Build the tracker UI earlier. A Kanban board showing where every article is in the pipeline changed how the team interacted with the system. Before the board, articles got stuck in stages because nobody could see them. After the board, bottlenecks became visible and people moved things along.
(Full disclosure: I used Brain Kit — my own MCP memory server — to keep track of all the architecture decisions, component rules, and pipeline stage contracts across sessions while building this. Persistent AI memory across tools was essential for a project this complex.)
The pipeline itself? It works. Articles that used to take 2-3 days of manual writing now take 2-3 hours of review. The AI draft isn't publishable on its own — but it's 80% of the way there, and the remaining 20% is the kind of editing humans are actually good at.
"Wait, the AI wrote this? It reads like our existing articles."
— the PM reviewing the first AI-drafted article
That's the compliment. Not "the AI wrote something amazing." The compliment is "I can't tell the difference." That's what the style guide and component table buy you — output that matches your existing standards, not output that sounds like a chatbot.
Brain Kit ($29)
Brain Kit uses the same serverless architecture — Supabase Edge Functions + pgvector. Deploy your own semantic knowledge base in 10 minutes.
Get Brain Kit — $29
Like what I build? Check out the shop — deploy-ready kits starting at $14.
More from the build log
I write about the honest experience of building AI-powered tools and automations. No hype, just what works.