From 6bb5b1acaa720d4963bdeec3876175f140d644b1 Mon Sep 17 00:00:00 2001
From: AIMLPM
Date: Sat, 4 Apr 2026 08:23:31 -0700
Subject: [PATCH] Add MarkCrawl - web crawler for RAG pipelines with chunking and vector upload

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 11e2bb0..9304ba3 100644
--- a/README.md
+++ b/README.md
@@ -247,6 +247,10 @@ Processing: A Survey**
   *CocoIndex is an open-source ETL framework to index data for AI, such as RAG; with realtime incremental updates and support custom logic like lego.*
   [`Website`](https://cocoindex.io/)
 
+- **MarkCrawl**
+  *Turn any website into clean Markdown for RAG pipelines in one command. Crawl, chunk, embed, and upload to Supabase/pgvector — with built-in LLM extraction, MCP server, and LangChain tools.*
+  [`Website`](https://github.com/AIMLPM/markcrawl) [`GitHub`](https://github.com/AIMLPM/markcrawl)
+
 ## Other Collections
 
 - [Awesome LLM RAG](https://github.com/jxzhangjhu/Awesome-LLM-RAG)