Measure and optimize your visibility in AI search.
Measure how often AI assistants cite, recommend, and mention your content, then iteratively optimize it to rank higher. AutoAEO tests against Claude, GPT, and Gemini simultaneously.
Search is moving to AI. When someone asks ChatGPT, Claude, or Gemini a question, your content either shows up or it doesn't. Traditional SEO metrics don't capture this.
AutoAEO gives you a score (0–100) for how visible your content is to AI assistants, then rewrites it to score higher — in a loop, until it converges.
┌──────────────────────────────────────────────────────────────┐
│  AutoAEO                                                     │
│                                                              │
│  1. Generate queries     "best project management tools"     │
│  2. Ask each model       Claude, GPT, Gemini (with search)   │
│  3. Extract citations    domains cited as sources            │
│  4. Adjudicate mentions  Claude Haiku → brand + sentiment    │
│  5. Aggregate            mentioned · cited · sentiment       │
│  6. Optimize (optional)  rewrite content → re-score → repeat │
└──────────────────────────────────────────────────────────────┘
For a given topic and URL, the tool:
- Generates realistic search queries using Claude (informational, comparison, recommendation)
- Sends each query to every model with their native search tools enabled
- Extracts cited domains from each response (source URLs and inline links)
- Runs an adjudicator (Claude Haiku) over each response to detect brand mentions and classify sentiment as positive / neutral / negative — for the target plus the top-cited competitor domains
- Aggregates three signals into a composite score (0–100):
  - Mentioned (50%): % of responses that mention your brand
  - Cited (35%): % of responses that cite your domain as a source
  - Sentiment (15%): average sentiment when mentioned (the weight scales up with mention count and is redistributed to Cited when mentions are sparse)
- Reports your score plus a competitive leaderboard — every adjudicated domain ranked by composite score, so you can see how you stack up against the brands AI assistants surface for the same queries
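
The weighting in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not AutoAEO's actual implementation: the function name `compositeScore`, the `FULL_WEIGHT_MENTIONS` cutoff, the linear ramp, and the mapping of sentiment onto a 0 to 1 scale are all hypothetical. The source only states the base weights and that unused sentiment weight shifts to Cited.

```typescript
// Hypothetical sketch of the composite score; names and the ramp shape are assumptions.
interface Signals {
  mentionRate: number;  // fraction of responses that mention the brand, 0..1
  citeRate: number;     // fraction of responses that cite the domain, 0..1
  avgSentiment: number; // average sentiment when mentioned, mapped to 0..1 (assumed scale)
  mentionCount: number; // number of responses that mention the brand
}

// Base weights as stated in the list above.
const BASE = { mentioned: 0.5, cited: 0.35, sentiment: 0.15 };

// Assumed: sentiment reaches its full weight once this many mentions exist.
const FULL_WEIGHT_MENTIONS = 10;

function compositeScore(s: Signals): number {
  // Scale the sentiment weight linearly with mention count (assumed ramp).
  const ramp = Math.min(s.mentionCount / FULL_WEIGHT_MENTIONS, 1);
  const sentimentWeight = BASE.sentiment * ramp;

  // Whatever weight sentiment does not use is redistributed to Cited.
  const citedWeight = BASE.cited + (BASE.sentiment - sentimentWeight);

  const score =
    BASE.mentioned * s.mentionRate +
    citedWeight * s.citeRate +
    sentimentWeight * s.avgSentiment;

  // Report on the 0-100 scale.
  return Math.round(score * 100);
}
```

With no mentions at all, the sentiment weight drops to zero in this sketch and Cited carries 50% of the score, so the three weights always sum to 1.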
The optimize command extends measurement with an iterative improvement loop:
- Scrapes your page content (Readability)
- Searches for competitor content (Exa API)
- Scores baseline content against competitors
- Rewrites content with Claude to maximize AI visibility
- Re-scores — keeps improvements, reverts regressions
- Repeats until convergence or budget exhausted
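
The keep-or-revert behavior of the loop above can be sketched like this. It is an illustration only: `optimize`, `LoopOptions`, and the injected `rewrite` and `score` callbacks are hypothetical stand-ins for the Claude rewriter and the scorer, scores are assumed to lie on a 0 to 1 scale, and the stopping rule (stop once a kept improvement is smaller than the threshold) is one plausible reading of "convergence". The cost budget is left out for brevity.

```typescript
// Hypothetical sketch of a keep-or-revert optimization loop; not AutoAEO's actual code.
interface LoopOptions {
  maxIterations: number;        // mirrors --max-iterations
  convergenceThreshold: number; // mirrors --convergence-threshold (0..1 scale assumed)
}

interface LoopResult {
  content: string;
  score: number;
  iterations: number;
}

function optimize(
  initial: string,
  rewrite: (content: string, iteration: number) => string, // stand-in for the rewriter
  score: (content: string) => number,                      // stand-in for the scorer
  opts: LoopOptions,
): LoopResult {
  let best = initial;
  let bestScore = score(initial); // baseline score
  let iterations = 0;

  while (iterations < opts.maxIterations) {
    iterations++;
    const candidate = rewrite(best, iterations);
    const candidateScore = score(candidate);
    const gain = candidateScore - bestScore;

    // Regression or no change: revert by leaving `best` untouched.
    if (gain <= 0) continue;

    // Improvement: keep the candidate as the new best.
    best = candidate;
    bestScore = candidateScore;

    // Assumed stopping rule: a kept gain below the threshold counts as converged.
    if (gain < opts.convergenceThreshold) break;
  }

  return { content: best, score: bestScore, iterations };
}
```

Passing the rewriter and scorer in as functions keeps the loop itself easy to test with deterministic stubs.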
| Model | Provider | Search |
|---|---|---|
| Claude Sonnet 4.6 | Anthropic | webSearch |
| Claude Haiku 4.5 | Anthropic | webSearch |
| GPT-5.5 | OpenAI | web_search |
| GPT-5.4 Mini | OpenAI | web_search |
| Gemini 3.1 Pro | Google | google_search |
| Gemini 3 Flash | Google | google_search |
Each model uses its native search integration via the AI SDK, so responses reflect real search-augmented behavior.
git clone https://github.com/ymansurozer/auto-aeo.git
cd auto-aeo
pnpm install
cp .env.example .env

ANTHROPIC_API_KEY=sk-...
OPENAI_API_KEY=sk-...
GOOGLE_GENERATIVE_AI_API_KEY=...
EXA_API_KEY=... # required for optimize only

You only need keys for the models you want to test. Skip a provider and those models are excluded.
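
One way the "skip a provider" behavior could work is to filter the model list by which keys are present. The sketch below is an assumption about the mechanism, not the project's code: `ModelEntry`, `KEY_FOR_PROVIDER`, and `availableModels` are hypothetical names, and only the environment variable names are taken from the `.env` listing above.

```typescript
// Hypothetical sketch: drop models whose provider has no API key configured.
type Provider = "anthropic" | "openai" | "google";

interface ModelEntry {
  name: string;
  provider: Provider;
}

// Environment variable names as listed in the .env example.
const KEY_FOR_PROVIDER: Record<Provider, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
};

function availableModels(
  registry: ModelEntry[],
  env: Record<string, string | undefined>,
): ModelEntry[] {
  // A model is usable only if its provider's key is present and non-empty.
  return registry.filter((m) => {
    const key = env[KEY_FOR_PROVIDER[m.provider]];
    return key !== undefined && key.trim() !== "";
  });
}
```

In a Node or Bun process this would typically be called as `availableModels(registry, process.env)`.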
bun src/index.ts measure --topic "project management" --url "https://example.com"
bun src/index.ts optimize --topic "project management" --url "https://example.com"

| Command | What it does |
|---|---|
| `measure --url <u> [--topic <t> \| --queries-list <q>]` | Measure AI visibility across all models |
| `optimize --url <u> [--topic <t> \| --queries-list <q>]` | Iteratively optimize content for AI visibility |
| `report --list` | List past reports |
| `report --session <id>` | View a specific report |
| Flag | Default | Description |
|---|---|---|
| `--models <ids>` | all | Comma-separated model IDs to test |
| `--queries <n>` | 15 | Number of queries to generate (when using `--topic`) |
| `--queries-list <queries>` | none | Comma-separated queries (alternative to `--topic`) |
| `--runs <n>` | 3 | Runs per query per model |
| `--timeout <seconds>` | 60 | Per-judge-call timeout in seconds |
| `--max-iterations <n>` | 20 | Max optimization iterations |
| `--max-cost <usd>` | none | Cost budget cap |
| `--convergence-threshold <n>` | 0.02 | Min improvement to continue (2%) |
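
The defaults above multiply quickly, which is worth keeping in mind when setting `--max-cost`. As a rough sketch, assuming every query is sent `--runs` times to every selected model and ignoring query generation and adjudicator calls, the number of search-augmented calls for one measurement pass is queries × runs × models. The helper name `judgeCalls` is hypothetical.

```typescript
// Rough count of search-augmented model calls for one measurement pass.
// Assumption: each query is sent `runs` times to each selected model;
// query generation and adjudicator calls are not counted here.
function judgeCalls(queries: number, runs: number, models: number): number {
  return queries * runs * models;
}

// Defaults from the flag table with all six listed models selected.
const defaultCalls = judgeCalls(15, 3, 6);
console.log(defaultCalls); // 270
```

Narrowing `--models` or lowering `--runs` reduces this count proportionally.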
All output goes to the output/ directory:
- `report-{session}.html`: measurement or optimization report with score breakdowns and a competitive leaderboard
- `optimized-content-{session}.md`: rewritten content (optimize only)
- `experiment-log.jsonl`: every judge call with full metadata
src/
├── cli/ CLI layer (Commander.js)
│ └── commands/ measure, optimize, report
├── measurement/ Measurement runner
├── optimization/ Optimization loop, content rewriter, search simulator
├── queries/ Query generation (Claude)
├── models/ Model registry, judge (AI SDK)
├── scoring/ Citation extraction, brand adjudicator (Claude Haiku), aggregation
├── experiment/ JSONL experiment logging
├── scraper/ Content extraction (Readability)
├── search/ Competitor search (Exa)
├── report/ HTML report generation
├── config/ Zod schema, defaults
└── utils/ Cost tracking
| Package | Purpose |
|---|---|
| AI SDK | LLM calls with native search tools |
| Exa | Competitor content search |
| Readability | Content extraction |
| Commander | CLI framework |
| Zod | Config validation |
| async-caller | Parallelized API calls with concurrency control |