AI Engine Optimization (AEO) for Magento 2 / Adobe Commerce. Generates
spec-compliant llms.txt, llms-full.txt, and JSONL files so ChatGPT,
Claude, Gemini, Perplexity, and other LLM-powered crawlers can ingest your
catalog efficiently.
After install, your storefront serves:
| URL | What it is |
|---|---|
https://shop/llms.txt |
Spec-compliant llmstxt.org file (compact markdown) |
https://shop/llms-full.txt |
Same structure, full sanitized descriptions inline |
https://shop/llms.jsonl |
One JSON record per line, for vector indexing |
https://shop/{url-key}.md |
On-the-fly Markdown mirror of any product/category/CMS page |
Generation happens via cron (daily by default), CLI, or the admin "Generate
Now" button. The output is streamed to disk with bounded memory, atomically
renamed on completion, and served with proper ETag / Cache-Control headers.
LLM crawlers can ingest a typical Magento storefront — full theme, JS, image sprites, navigation chrome — but that's wasteful for everyone. The llmstxt.org standard defines a clean text format optimized for AI ingestion: stable links, structured headings, descriptions in their natural prose form rather than buried in product cards.
This module produces that format for Magento, with care taken for the things Magento makes hard: multi-store layout, Page Builder content, CMS directive resolution, customer-group pricing, and very large catalogs.
composer require angeo/module-llms-txt:^3.0
bin/magento module:enable Angeo_LlmsTxt
bin/magento setup:upgrade
bin/magento setup:di:compile # only in production mode
bin/magento setup:static-content:deploy adminhtml # only in production mode
bin/magento cache:flushThen generate your first batch:
bin/magento angeo:llms:generateVisit https://your-store.tld/llms.txt.
All settings live at Stores → Configuration → Angeo → LLMs.txt.
| Field | Default | Notes |
|---|---|---|
| Enable | Yes | Master switch. |
| Exclude This Scope | No | Available at website + store scope. Skips generation for this scope. |
| Store Summary | — | One-line summary used as the spec-compliant blockquote. If empty, falls back to Design → HTML Head → Default Description. |
| Field | Default | Notes |
|---|---|---|
| Include Categories | Yes | |
| Include CMS Pages | Yes | |
| Include Products | Yes | |
Products under ## Optional |
Yes | Recommended. Lets context-budget-constrained AI clients drop products without losing categories / pages. |
| Product Limit | 5000 | 0 = unlimited. |
| Exclude Out-of-Stock Products | No | |
| CMS Identifiers to Exclude | no-route, enable-cookies, privacy-policy-cookie-restriction-mode |
Comma- or newline-separated. |
| Customer Group for Pricing | NOT LOGGED IN | Which group's final price (with special / group prices) is exposed. |
| Field | Default | Notes |
|---|---|---|
| Generate llms.txt | Yes | |
| Generate llms-full.txt | No | 5–50× larger; enable only if you actually want it. |
| Generate JSONL | Yes | One record per line; embeds-ready. |
Serve /url-key.md Mirrors |
No | Per-entity Markdown rendering; on-the-fly, no disk. |
| Field | Default | Notes |
|---|---|---|
| Resolve CMS Directives | Yes | Renders {{widget}}, {{block}}, {{var}} via Magento's frontend filter. |
| Page Builder Strategy | Exclude | See below. |
| Excluded Content-Types | products, banner, slider, slide, video, map, buttons, button-item, block, dynamic-block, divider, spacer |
Used under Exclude strategy. |
| Allowed Content-Types | text, heading, html, tabs, tab-item, row, column, column-group |
Used under Allow strategy. |
| Strategy | Effect |
|---|---|
| Preserve | Keep all Page Builder content; only strip wrapper attributes. |
| Exclude | Drop elements whose data-content-type is in the excluded list. Default. |
| Allow | Drop everything EXCEPT data-content-type in the allowed list. |
| Strip | Drop ALL elements that carry a data-content-type attribute. |
The filter parses content with DOMDocument (not regex), so nested Page
Builder containers are handled correctly. Known content-types include:
row, column-group, column, tabs, tab-item, text, heading,
html, image, video, map, divider, spacer, buttons, button-item,
banner, slider, slide, products, block, dynamic-block.
| Field | Default | Notes |
|---|---|---|
| Collection Page Size | 1000 | Lower if hitting memory limits on shared hosting. |
| Field | Default | Notes |
|---|---|---|
| Cache-Control TTL (s) | 3600 | Sent as public, max-age=… on the served files. |
| Field | Default | Notes |
|---|---|---|
| Cron Expression | 0 2 * * * |
Daily at 02:00 server time. |
# Generate everything for all eligible stores
bin/magento angeo:llms:generate
# Single store, skip JSONL
bin/magento angeo:llms:generate --store=default --no-jsonl
# Per-store/per-format last-run status
bin/magento angeo:llms:status
# Lint generated files for spec compliance
bin/magento angeo:llms:validateDrop a new section into llms.txt (e.g. a "Brands" list, a "Recent Posts"
section, etc.) by implementing Angeo\LlmsTxt\Api\ProviderInterface and
registering it via di.xml.
namespace Vendor\Module\Provider\Llms;
use Angeo\LlmsTxt\Api\OutputContextInterface;
use Angeo\LlmsTxt\Model\Provider\AbstractProvider;
class BrandsProvider extends AbstractProvider
{
public function provide(OutputContextInterface $context): iterable
{
yield "## Brands\n\n";
foreach ($this->brandRepo->getList($context->getStore()->getId()) as $brand) {
$label = $this->escapeMarkdown($brand->getName());
yield "- [{$label}]({$brand->getUrl()})\n";
}
yield "\n";
}
}<!-- etc/di.xml -->
<type name="Angeo\LlmsTxt\Model\Generator\LlmsTxtGenerator">
<arguments>
<argument name="providers" xsi:type="array">
<item name="brands" xsi:type="object">Vendor\Module\Provider\Llms\BrandsProvider</item>
</argument>
</arguments>
</type>The base class gives you escapeMarkdown(), encodeJsonl(), isJsonl(),
isFullTxt(), and isApplicable() overridable to opt out per-format.
Insert your own filter between Page Builder and HTML stripping (e.g. to
remove <script> data attributes, redact phone numbers, etc.) by implementing
Angeo\LlmsTxt\Api\SanitizerFilterInterface and re-declaring the pipeline
in di.xml.
<type name="Angeo\LlmsTxt\Model\Sanitizer\Sanitizer">
<arguments>
<argument name="filters" xsi:type="array">
<item name="cms_directive" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\CmsDirectiveFilter</item>
<item name="page_builder" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\PageBuilderFilter</item>
<item name="redact_pii" xsi:type="object">Vendor\Module\Sanitizer\Filter\PiiRedactionFilter</item>
<item name="html" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\HtmlFilter</item>
<item name="whitespace" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\WhitespaceFilter</item>
</argument>
</arguments>
</type>Hook in via observers — three events are dispatched per store/format pass:
| Event | Data |
|---|---|
angeo_llms_generation_before |
store, format, context |
angeo_llms_generation_after |
store, format, file, bytes, items, duration |
angeo_llms_generation_failed |
store, format, error |
- Old files in
media/llms/can be deleted (output now lives inmedia/angeo/llms/). - Any custom
ProviderInterfaceimplementations must change from returning astringto yieldingiterable<string>. See Extending — custom providers. - Drop any reverse-proxy / Nginx rewrites pointing at the old paths.
- Re-run Stores → Configuration → Angeo → LLMs.txt to set the new fields (Page Builder strategy, customer group, etc.).
- External tooling that called the GET
/admin/angeo_llms/generate/indexURL must switch to the CLI command (the admin endpoint is now POST + CSRF).
MIT — see LICENSE.
- GitHub Issues: https://github.com/angeo-dev/module-llms-txt/issues
- Email: support@angeo.dev