Real-World Builds & Monetization
Capstone: Multi-Agent Content Pipeline
Outcome: By the end of this lesson you will have shipped a running CrewAI pipeline of 4 specialized agents — Researcher → Writer → Fact-Checker → Publisher — that takes a topic as input, gathers sources, drafts a post, verifies every factual claim, and saves the finished markdown (or posts it to your CMS via a webhook). This is the multi-agent orchestration you've been building toward for five modules.
What you'll ship — this command runs the full crew:
```bash
python pipeline.py --topic "The state of open-source agent frameworks in Q2 2026"
```
Output (to stdout and `./output/{slug}.md`):

```text
[Researcher] Found 8 sources. Top sources: crewAI GitHub 2026 star history,
  Anthropic SDK 2026 release notes, LangChain v1.0 announcement...
[Writer] Drafted 1840-word post. 4 sections, 12 claims marked for verification.
[Fact-Checker] Verified 10/12 claims. Flagged 2 for revision:
  - "CrewAI has 80K GitHub stars" — actual is ~32K. Corrected.
  - "LangChain v1.0 shipped April 2026" — actual is June 2026. Corrected.
[Publisher] Saved → ./output/state-of-open-source-agent-frameworks.md
  Webhook POST → https://your-cms.com/api/drafts → 201 Created
```
The architecture you're building
```json
{
  "type": "architecture",
  "title": "Multi-Agent Content Pipeline — CrewAI Sequential Process",
  "direction": "top-down",
  "layers": [
    {
      "label": "Input",
      "color": "blue",
      "components": [
        { "label": "--topic \"...\"", "description": "CLI arg; one editorial prompt per run" }
      ]
    },
    {
      "label": "Researcher (Claude Sonnet 4.6)",
      "color": "amber",
      "components": [
        { "label": "Web search tool (Tavily)", "description": "Live web, last 90 days" },
        { "label": "Output: research brief", "description": "3-5 findings + 6-10 cited sources + flagged claims" }
      ]
    },
    {
      "label": "Writer (Claude Sonnet 4.6)",
      "color": "purple",
      "components": [
        { "label": "Receives: research brief", "description": "Via CrewAI task.context" },
        { "label": "Output: 1500-2000 word draft", "description": "Every claim tagged [^N] for fact-checker" }
      ]
    },
    {
      "label": "Fact-Checker (Claude Opus 4.7)",
      "color": "red",
      "components": [
        { "label": "Web search tool (Tavily)", "description": "Re-verify every [^N] claim" },
        { "label": "Output: corrected markdown + change log", "description": "Wrong claims fixed, unverifiable claims hedged" }
      ]
    },
    {
      "label": "Publisher (Claude Haiku 4.6 — cheap)",
      "color": "green",
      "components": [
        { "label": "Adds frontmatter YAML", "description": "title, slug, date, tags" },
        { "label": "publish_to_cms webhook", "description": "POSTs to your CMS; saves to ./output/{slug}.md" }
      ]
    }
  ]
}
```
Part 1 — Project setup (5 min)
```text
content-crew/
├── pipeline.py            # Orchestrates the 4 agents
├── agents/
│   ├── researcher.py      # Defines the Researcher agent + its tools
│   ├── writer.py
│   ├── fact_checker.py
│   └── publisher.py
├── tools/
│   ├── web_search.py      # Tavily search wrapper
│   └── cms_webhook.py     # POST to your CMS
├── requirements.txt
├── .env.example
└── output/                # Generated posts land here
```
`requirements.txt`:

```text
crewai==0.95.0
crewai-tools==0.25.0
anthropic==0.42.0
tavily-python==0.7.1
httpx==0.27.2
python-dotenv==1.0.1
```
`.env.example`:

```text
ANTHROPIC_API_KEY=sk-ant-...
TAVILY_API_KEY=tvly-...                          # tavily.com — 1000 free searches/month
CMS_WEBHOOK_URL=https://your-cms.com/api/drafts  # optional; leave empty to skip publish
CMS_WEBHOOK_TOKEN=                               # bearer token for your CMS
```
Install:

```bash
pip install -r requirements.txt
cp .env.example .env  # fill in keys
```
Part 2 — Custom tools (10 min)
CrewAI agents use the same `@tool` decorator pattern from Module 3. Two tools power this crew: web search (Researcher) and a CMS webhook (Publisher).
`tools/web_search.py`:

```python
import os

from crewai.tools import tool
from tavily import TavilyClient

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])


@tool("web_search")
def web_search(query: str, max_results: int = 6) -> str:
    """Search the live web. Returns a numbered list of results with title, URL, and snippet."""
    resp = tavily.search(
        query=query,
        max_results=max_results,
        search_depth="advanced",
        include_answer=True,
        days=90,  # last 3 months only — we want current
    )
    lines = [f"Summary: {resp.get('answer', '')}\n"]
    for i, r in enumerate(resp.get("results", []), 1):
        lines.append(f"[{i}] {r['title']}\n    {r['url']}\n    {r['content'][:300]}")
    return "\n".join(lines)
```
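If you want to sanity-check the formatting loop without spending Tavily credits, you can run it over a mocked response dict. A sketch — the dict below mirrors the response shape the tool assumes, not an official Tavily schema:

```python
# Mocked Tavily-style response. The shape is assumed from the tool above,
# not taken from Tavily's documentation.
resp = {
    "answer": "Agent frameworks are consolidating.",
    "results": [
        {"title": "Example post", "url": "https://example.com/a", "content": "x" * 500},
    ],
}

# Same formatting logic as the web_search tool body.
lines = [f"Summary: {resp.get('answer', '')}\n"]
for i, r in enumerate(resp.get("results", []), 1):
    lines.append(f"[{i}] {r['title']}\n    {r['url']}\n    {r['content'][:300]}")

formatted = "\n".join(lines)
print(formatted)
```

Note that each snippet is truncated to 300 characters before it reaches the agent, which keeps the tool's output within a predictable token budget.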
`tools/cms_webhook.py`:

```python
import os

import httpx
from crewai.tools import tool


@tool("publish_to_cms")
def publish_to_cms(slug: str, title: str, markdown: str) -> str:
    """
    Publish the finished post to the CMS via webhook.
    Returns the webhook response or a skip message if no URL is configured.
    """
    url = os.environ.get("CMS_WEBHOOK_URL", "").strip()
    if not url:
        return "CMS_WEBHOOK_URL not set — skipping publish (saved to disk only)."
    headers = {"Content-Type": "application/json"}
    token = os.environ.get("CMS_WEBHOOK_TOKEN", "").strip()
    if token:
        headers["Authorization"] = f"Bearer {token}"
    r = httpx.post(url, json={"slug": slug, "title": title, "markdown": markdown}, headers=headers, timeout=30)
    return f"POST {url} → {r.status_code} {r.reason_phrase}"
```
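The skip-or-authenticate decision is easy to verify in isolation. Here is the same logic factored into a pure helper (a sketch; `webhook_plan` is my name for it, not anything in CrewAI):

```python
def webhook_plan(env: dict) -> dict:
    """Decide whether to publish, and with which headers, from env vars alone.

    Mirrors the decision logic of publish_to_cms; the function name and return
    shape are illustrative assumptions.
    """
    url = env.get("CMS_WEBHOOK_URL", "").strip()
    if not url:
        return {"publish": False, "reason": "CMS_WEBHOOK_URL not set"}
    headers = {"Content-Type": "application/json"}
    token = env.get("CMS_WEBHOOK_TOKEN", "").strip()
    if token:
        # Bearer auth only when a token is configured
        headers["Authorization"] = f"Bearer {token}"
    return {"publish": True, "url": url, "headers": headers}


print(webhook_plan({}))
print(webhook_plan({"CMS_WEBHOOK_URL": "https://example.com/api", "CMS_WEBHOOK_TOKEN": "tok"}))
```

Keeping the tool tolerant of missing configuration means a fresh clone still runs end to end; the crew simply reports the skip instead of crashing mid-pipeline.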
Part 3 — The four agents (15 min)
Each agent has one job — that's the whole point of orchestration. An agent with three responsibilities quickly turns into an agent that does none of them well.
`agents/researcher.py`:

```python
from crewai import Agent

from tools.web_search import web_search

researcher = Agent(
    role="Research Analyst",
    goal="Gather 6-10 authoritative, recent sources on the assigned topic",
    backstory=(
        "You are a veteran research analyst. You trust primary sources (vendor docs, "
        "GitHub repos, official announcements) over secondary summaries. You are "
        "suspicious of round numbers and recent dates — both are red flags for "
        "fabrication, so you flag them for the fact-checker to verify."
    ),
    tools=[web_search],
    llm="anthropic/claude-sonnet-4-6",
    verbose=True,
    allow_delegation=False,
)
```
`agents/writer.py`:

```python
from crewai import Agent

writer = Agent(
    role="Technical Writer",
    goal="Draft a 1500-2000 word post that is accurate, scannable, and cites every numerical claim",
    backstory=(
        "You write for developers. You never pad, never use buzzwords, never say "
        "'revolutionary'. You structure posts as: TL;DR (3 bullets) → one H2 per main "
        "idea → a concluding 'what to watch'. For every statistic, version number, "
        "date, or comparison, you include an inline [^N] citation that the "
        "fact-checker will verify. You draft the footnote block at the bottom "
        "referencing the Researcher's sources."
    ),
    llm="anthropic/claude-sonnet-4-6",
    verbose=True,
    allow_delegation=False,
)
```
`agents/fact_checker.py`:

```python
from crewai import Agent

from tools.web_search import web_search

fact_checker = Agent(
    role="Fact Checker",
    goal=(
        "Verify every numerical claim, version number, date, and 'first/only/largest' "
        "superlative in the draft against live sources. Correct or flag anything wrong."
    ),
    backstory=(
        "You are paranoid by trade. You assume every stat is wrong until you've found "
        "an authoritative source confirming it. You treat negative claims ('X doesn't "
        "have Y', 'X has no API') with extra suspicion — these are the single most "
        "common source of errors, because absence of evidence isn't evidence of absence. "
        "You correct inline by editing the draft, preserving the rest of the writer's voice."
    ),
    tools=[web_search],
    llm="anthropic/claude-opus-4-7",  # larger model for nuanced verification
    verbose=True,
    allow_delegation=False,
)
```
`agents/publisher.py`:

```python
from crewai import Agent

from tools.cms_webhook import publish_to_cms

publisher = Agent(
    role="Publisher",
    goal="Save the final post to disk and push to the CMS webhook",
    backstory=(
        "You are the last pair of eyes. You add frontmatter YAML (title, slug, date, tags), "
        "save the post to ./output/{slug}.md, and POST it to the CMS webhook. "
        "You do not edit content — that's the writer's and fact-checker's job."
    ),
    tools=[publish_to_cms],
    llm="anthropic/claude-haiku-4-6",  # cheap model for mechanical work
    verbose=True,
    allow_delegation=False,
)
```
The model choice is deliberate:
- Researcher: Sonnet — broad reasoning + tool use
- Writer: Sonnet — quality drafting, style control
- Fact-checker: Opus — highest accuracy, worth the cost for correctness
- Publisher: Haiku — tiny mechanical task, ~5-6× cheaper
Matching model power to task difficulty cuts your crew's total cost by ~60% vs. using Sonnet for every agent (pattern from Module 5 Lesson 1).
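The per-crew cost is easy to estimate yourself. A sketch of the arithmetic — the per-million-token prices and output-token budgets below are illustrative assumptions, not real rate cards or measured usage:

```python
# Illustrative per-million-output-token prices (assumptions, not a price sheet).
PRICE_PER_MTOK = {"opus": 15.00, "sonnet": 3.00, "haiku": 0.80}

# Rough output-token budget per agent for one post (also an assumption).
AGENTS = {
    "researcher": ("sonnet", 2_000),
    "writer": ("sonnet", 3_000),
    "fact_checker": ("opus", 4_000),
    "publisher": ("haiku", 1_500),
}


def crew_cost(agents: dict) -> float:
    """Sum of (tokens / 1M) * price across all agents."""
    return sum(toks / 1_000_000 * PRICE_PER_MTOK[model] for model, toks in agents.values())


print(f"estimated output cost per post: ${crew_cost(AGENTS):.4f}")
```

Swap the model assignments in `AGENTS` to compare configurations; the point is that the mechanical Publisher step contributes almost nothing to the total, so downgrading it is nearly free quality-wise.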
Part 4 — The orchestrating crew (10 min)
`pipeline.py`:

```python
import argparse
import re
from pathlib import Path

from dotenv import load_dotenv
from crewai import Crew, Task, Process

from agents.researcher import researcher
from agents.writer import writer
from agents.fact_checker import fact_checker
from agents.publisher import publisher

load_dotenv()
Path("output").mkdir(exist_ok=True)


def slugify(title: str) -> str:
    s = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"[\s_-]+", "-", s).strip("-")[:80]


def build_crew(topic: str) -> Crew:
    research_task = Task(
        description=(
            f"Research the topic: '{topic}'.\n\n"
            "Deliverable: a structured brief containing:\n"
            "1. 3-5 key findings (one sentence each)\n"
            "2. 6-10 source citations in format [N] Title — URL\n"
            "3. Any claim you flag as 'potentially unreliable' (round numbers, recent dates, superlatives)\n\n"
            "Prioritize sources from the last 90 days. Avoid aggregator blogs — go to primary sources."
        ),
        expected_output="A research brief with findings, cited sources, and flagged claims.",
        agent=researcher,
    )

    write_task = Task(
        description=(
            "Using the research brief from the Researcher, draft a 1500-2000 word blog post.\n\n"
            "Structure:\n"
            "- TL;DR: 3 bullets\n"
            "- 3-5 H2 sections, each ~300 words\n"
            "- Concluding 'What to watch' section\n\n"
            "For every numerical claim or date, include a [^N] inline citation and list "
            "the matching footnote at the bottom using the Researcher's source URLs. "
            "Do NOT invent claims not supported by the research brief."
        ),
        expected_output="A complete markdown blog post with inline citations and a footnote block.",
        agent=writer,
        context=[research_task],
    )

    factcheck_task = Task(
        description=(
            "Review the draft from the Writer. For EVERY [^N] citation:\n"
            "1. Re-search the cited fact using web_search.\n"
            "2. Compare what the draft says against the sources.\n"
            "3. If wrong, edit the draft inline to fix it. Preserve the rest of the writer's voice.\n"
            "4. Flag any claim you couldn't verify — rewrite it with hedged language like "
            "   'as of [date]' or 'reportedly' rather than stating it as fact.\n\n"
            "Also scan for unchecked negative claims ('X has no Y') — these are the #1 "
            "source of errors. Search each one independently before accepting it.\n\n"
            "Output: the corrected markdown, plus a change log listing which claims "
            "you fixed and which you hedged."
        ),
        expected_output="Corrected markdown draft + change log of edits.",
        agent=fact_checker,
        context=[research_task, write_task],
    )

    publish_task = Task(
        description=(
            "Take the fact-checker's corrected markdown and:\n"
            "1. Prepend frontmatter YAML with: title (infer from H1), slug, date (today), "
            "   tags (3-5 inferred from content), featured: true.\n"
            "2. Use the `publish_to_cms` tool with (slug, title, full_markdown).\n"
            "3. Report the final status (CMS response or skip message).\n\n"
            "Do NOT modify the body. Only add frontmatter."
        ),
        expected_output="Final markdown with frontmatter + webhook status.",
        agent=publisher,
        context=[factcheck_task],
    )

    return Crew(
        agents=[researcher, writer, fact_checker, publisher],
        tasks=[research_task, write_task, factcheck_task, publish_task],
        process=Process.sequential,
        verbose=True,
    )


def save_output(topic: str, final_markdown: str) -> Path:
    slug = slugify(topic)
    path = Path("output") / f"{slug}.md"
    path.write_text(final_markdown, encoding="utf-8")
    print(f"\n✓ Saved → {path}")
    return path


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--topic", required=True, help="Blog post topic (1-2 sentences)")
    args = parser.parse_args()

    crew = build_crew(args.topic)
    result = crew.kickoff()
    # result.raw holds the final task's output — here, the Publisher's markdown with frontmatter
    save_output(args.topic, str(result.raw))
```
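The `slugify` helper deserves a quick sanity check on its own: it lowercases, strips punctuation, and collapses runs of whitespace, underscores, and hyphens into single hyphens before truncating to 80 characters.

```python
import re


def slugify(title: str) -> str:
    # Same helper as in pipeline.py: drop punctuation, then collapse
    # whitespace/underscores/hyphens into single hyphens.
    s = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"[\s_-]+", "-", s).strip("-")[:80]


print(slugify("Hello, World: Q2 2026!"))   # hello-world-q2-2026
print(slugify("  --weird__input--  "))     # weird-input
```

Because the slug doubles as the output filename, this normalization is what guarantees `./output/{slug}.md` is always a safe path regardless of how messy the `--topic` string is.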
Part 5 — Run it (2 min)
```bash
python pipeline.py --topic "The state of open-source agent frameworks in Q2 2026"
```
You should see the 4 agents run sequentially:
```text
[Researcher] Working on research task...
  → web_search called (3 times)
  → brief with 8 sources returned
[Writer] Working on draft task...
  → 1840-word draft returned
[Fact-Checker] Working on verification task...
  → web_search called (12 times)
  → 2 corrections, 0 hedged
[Publisher] Working on publish task...
  → publish_to_cms called
  → Saved → output/state-of-open-source-agent-frameworks.md
  → POST https://your-cms.com/... → 201 Created
```
Total runtime: 3–6 minutes depending on web search latency. Total cost on Anthropic: $0.40–$0.80 per post at 2026 prices.
Part 6 — What changes when you swap the topic
The pipeline is topic-agnostic. To adapt for a different editorial beat:
| Change | Where |
|---|---|
| Different research focus | Agent backstory + `description` on `research_task` |
| Different post length | Writer task description (1500-2000 → 800-1200) |
| Different CMS (WordPress, Ghost, Notion, Obsidian) | Swap the `publish_to_cms` implementation in `tools/cms_webhook.py` |
| Add an SEO agent before Publisher | One new `Agent(...)` + one new `Task(...)` with `context=[factcheck_task]` |
| Make it a scheduled job | Wrap `pipeline.py` in a GitHub Actions cron (`0 9 * * 1-5`) |
Adding a 5th agent is ~15 lines of code — that's the power of CrewAI's structure.
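The scheduled-job row above can be sketched as a GitHub Actions workflow. This is a minimal sketch, not a tested config: the secret names, file path, and hard-coded topic are my assumptions, and you would normally generate the topic dynamically.

```yaml
# .github/workflows/content-crew.yml (sketch; adapt secret names to your repo)
name: content-crew
on:
  schedule:
    - cron: "0 9 * * 1-5"   # weekdays at 09:00 UTC
  workflow_dispatch: {}      # allow manual runs too
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: python pipeline.py --topic "Weekly agent-framework roundup"
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
          CMS_WEBHOOK_URL: ${{ secrets.CMS_WEBHOOK_URL }}
          CMS_WEBHOOK_TOKEN: ${{ secrets.CMS_WEBHOOK_TOKEN }}
```

Since the Publisher already POSTs to your CMS, the workflow needs no artifact handling; the run log plus the CMS draft are the output.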
Part 7 — Troubleshooting matrix
| Symptom | First check | Typical cause |
|---|---|---|
| Researcher returns 0 results | Tavily dashboard | Free-tier exhausted, or `days=90` too restrictive |
| Writer output is very short (<500 words) | Writer backstory | Agent `goal` phrased as "summary" rather than "1500-2000 word post" |
| Fact-checker doesn't actually correct the draft | CrewAI version + `model_config` | Older CrewAI versions don't pass prior task outputs as editable context — upgrade to 0.95+ |
| Publisher fails webhook | `CMS_WEBHOOK_URL` | Endpoint wrong, auth header missing, or endpoint doesn't accept the payload shape |
| All agents use the same model even though you set different `llm` on each | env override | `CREWAI_DEFAULT_MODEL` env var trumps per-agent config — unset it |
| Crew hangs forever | Task descriptions | Ambiguous `expected_output` keeps agents iterating — tighten it |
Build checkpoint — finish this before claiming the certificate
- Ship the crew. `python pipeline.py --topic "anything"` runs through all 4 agents without errors.
- Ship the write path. The final markdown in `./output/` has correct frontmatter and real content.
- Ship a catch. Run on a topic where the Writer is likely to hallucinate stats (e.g. "Recent GitHub star counts for agent frameworks"). Confirm the Fact-Checker's change log shows at least 1 correction.
- Ship the webhook. Point `CMS_WEBHOOK_URL` at a real endpoint (or `httpbin.org/post` for testing). Confirm the Publisher POSTs and the response logs.
- Ship the proof. Screenshot the crew running, the final markdown, and the webhook response as your proof of work.
You've just shipped a multi-agent pipeline that does real editorial work on a topic-agnostic basis. Every orchestration pattern from the last five modules — agent specialization, task sequencing, tool routing, model-cost matching, fact-checking guardrails — is now running in production.
Next (optional): add a Slack integration so the crew drops a daily digest into your team channel, or hook it to an RSS feed so new tech announcements get turned into drafts automatically.