A LangChain-based Python wrapper for running LLM agents with multi-provider support, multi-agent orchestration, file attachments, and MCP server integration.
- Orchestrate multiple agents in plain Python — branch conversations, roll back history, run agents in parallel, hand off between models; no graph framework needed
- Talk to any LLM — Ollama (local), Anthropic Claude, OpenAI GPT, Google Gemini, and others through a single unified API
- Connect MCP servers — extend the agent with tools: file editing, browser automation, Git, vector memory, web search, TTS, notifications, and more. Register any server in a few lines — plain Python tools, stdio MCP, HTTP/SSE MCP, or Docker-hosted MCP. Per-server system prompts, tool filtering, and schema patches are supported out of the box
- Attach files — just pass the file and don't think about model compatibility; PDFs, images, Office documents, audio, and video are each delivered in the best format the model supports, and automatically converted to text or transcribed via Whisper when the model has no native support
- Persistent chat history — maintained automatically across calls; warns or auto-summarizes when the context window fills up; since history is just a Python list, you can branch it, merge histories from multiple agents, or surgically remove any entry
```python
from ailist import AIListBase, Provider

# Local model via Ollama
ai = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)

# Anthropic Claude
ai = AIListBase("anthropic:claude-opus-4-6", context_limit=180000, provider=Provider.ANTHROPIC)

# OpenAI
ai = AIListBase("openai:gpt-4o", context_limit=128000, provider=Provider.OPENAI)

ai.system = "You are a helpful assistant."
print(ai.run("Explain quantum entanglement in simple terms."))
```

History is kept automatically across calls. Clear it when you need a fresh start:
ai.run("My name is Alex.")
print(ai.run("What is my name?")) # → "Your name is Alex."
ai.chat_history = []Pass any file to a single message or make it a permanent part of every prompt:
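Since the history is a plain Python list, you can also edit it surgically instead of clearing it wholesale. A minimal sketch — the slice and indices are illustrative:

```python
# chat_history is an ordinary list of messages, so standard list operations apply
del ai.chat_history[-2:]        # e.g. drop the most recent exchange
branch = list(ai.chat_history)  # shallow copy to branch the conversation
```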
Pass any file to a single message or make it a permanent part of every prompt:

```python
# One message
response = ai.run({
    0: {"prompt": "Summarize this document and extract all dates."},
    1: {"file": "contract.docx"},
})

# Permanent — attached to every subsequent message
ai.prompts[1] = {"file": "report.pdf"}
response = ai.run("What are the key risks mentioned?")
```

Every call also has an async variant:

```python
import asyncio

response = asyncio.run(ai.run_async("Hello!"))
```

Connect external tools at runtime — the agent is automatically rebuilt with the new capabilities.
```python
import asyncio
from pathlib import Path
from ailist import AIListDemo

async def main():
    async with AIListDemo() as ai:
        ai.workspace_dir = Path(r"C:\MyProject")  # set working directory before connecting
        await ai.mcp_connects([
            {"name": "workspace"},
            {"name": "git"},
            {"name": "serena", "dirs": [r"C:\MyProject"]},  # LSP analysis, ide mode by default
        ])
        response = await ai.run_async("List all Python files and show the git status.")
        print(response)

asyncio.run(main())
```
`workspace_dir` scopes all file tools to a single directory. Set it before calling `mcp_connect`. Default is `Path.cwd()`.
| Name | What it does | Requires |
|---|---|---|
| `workspace` | Read/write/search files, run commands and Python code — scoped to `workspace_dir` | — |
| `skills` | Browse agent skill files from a local directory | — |
| `git` | Git operations (status, log, diff, commit, push, pull, branch) | Node.js |
| `playwright` | Control a browser (navigate, click, screenshot, scrape) | Node.js |
| `memory-plus` | Persistent knowledge graph in a JSONL file | Node.js |
| `qdrant` | Semantic vector storage and search | uv, Docker |
| `searxng` | Privacy-friendly web search | Docker |
| `serena` | LSP-based semantic code analysis: find/rename/replace symbols across 30+ languages, persistent project memory | uv |
| `sympy` | Symbolic math and equation solving | sympy |
| `piper` | Text-to-speech, generates WAV files | piper-tts |
| `apprise` | Push notifications (Telegram, Discord, Email…) | apprise |
| `file-converter` | Convert files to text (PDF, Office, audio, video) | — |
Register any server by adding an `MCPServerDef` entry in your `AIList` subclass. Four launcher types are supported:
```python
from ailist import AIList, MCPServerDef, Provider

class MyAgent(AIList):
    def __init__(self):
        super().__init__("openai:gpt-4o", int(128000 * 0.8), Provider.OPENAI)

        # Plain Python tools — no MCP protocol, just @tool functions
        self.MCPServers["mytools"] = MCPServerDef(
            package="", launcher="builtin",
            builtin_tools=[my_tool_a, my_tool_b],
            description="My custom tools",
        )

        # stdio MCP server via npx
        self.MCPServers["mytool"] = MCPServerDef(
            package="@myorg/my-mcp-server",
            description="My npx MCP server",
        )

        # HTTP/SSE MCP server already running elsewhere
        self.MCPServers["mysse"] = MCPServerDef(
            package="", launcher="sse",
            url="http://localhost:8000/sse",
            description="My SSE MCP server",
        )

        # Docker-hosted MCP (SSE inside container)
        self.MCPServers["myservice"] = MCPServerDef(
            package="", launcher="sse",
            url="http://localhost:9000/sse",
            docker={"image": "myorg/myservice", "name": "myservice", "ports": ["9000:9000"]},
            description="My Docker MCP server",
        )
```

Per-server customisation after connecting:
```python
async with MyAgent() as ai:
    await ai.mcp_connect("mytool",
        exclude_tools=["tool_i_dont_need"],  # hide specific tools from the model
        schema_patches={"some_tool": {"param": {"description": "clearer description"}}},
    )
    # add a system prompt hint for this server's tools
    ai.system_tool_instructions += "\nWhen using mytool, prefer X over Y."
```

`workspace` handles all file operations (~20 tools: read, edit, search, run commands). `serena` adds LSP-level code intelligence on top: find symbols by name across the codebase, rename with scope awareness, replace a function body without knowing its exact text, persistent per-project memory.
`serena` defaults to `context="ide"`, which disables its own file tools — no conflicts:
```python
ai.workspace_dir = Path(r"C:\MyProject")
await ai.mcp_connects([
    {"name": "workspace"},
    {"name": "serena", "dirs": [r"C:\MyProject"]},
])

# Second run — project already indexed, skip onboarding:
# {"name": "serena", "dirs": [...], "modes": ["no-onboarding", "interactive", "editing"]}
```

Approximate tool counts by combination:
| Combination | Tools |
|---|---|
| `workspace` only | ~20 |
| `workspace` + `git` | ~28 |
| `workspace` + `serena` | ~27 |
| `workspace` + `serena` + `git` | ~35 |
> Note: `serena` (`context="agent"`) and `workspace` together cause tool name conflicts — always use the default `context="ide"` when both are connected.
Example — push notifications via `apprise`:

```python
async with AIListDemo() as ai:
    await ai.mcp_connect("apprise")
    ai.apprise.urls.append("tgram://YOUR_BOT_TOKEN/YOUR_CHAT_ID")
    await ai.run_async(
        "Analyze all CSV files in C:\\Data, write a summary report, "
        "and notify me when done."
    )
```

Supports Telegram, Discord, email, Slack, and 100+ other services.
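apprise fans each notification out to every registered URL, so several services can be targeted at once. A sketch with placeholder credentials — the URL schemes follow the apprise documentation:

```python
ai.apprise.urls.append("discord://WEBHOOK_ID/WEBHOOK_TOKEN")  # Discord webhook (placeholders)
ai.apprise.urls.append("mailto://user:password@gmail.com")    # email (placeholders)
```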
Each `AIList` instance is a plain Python object — `chat_history` is just a list. No graph framework, no special primitives. Branch, merge, hand off, run in parallel — it's all regular Python.
Save history before a risky step; restore if the result is unsatisfactory:
```python
ai = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
ai.run("Analyze the project structure and suggest a refactoring plan.")

checkpoint = list(ai.chat_history)

result_a = ai.run("Apply the plan — rename modules and update imports.")
if "error" in result_a.lower():
    ai.chat_history = checkpoint
    result_b = ai.run("Apply only the safe parts of the plan, skip renames.")
```

Run multiple agents simultaneously, each with its own context:
```python
async def main():
    ai_docs = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
    ai_code = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
    ai_tests = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)

    docs, code, tests = await asyncio.gather(
        ai_docs.run_async("Write documentation for this module."),
        ai_code.run_async("Refactor this module for readability."),
        ai_tests.run_async("Write unit tests for this module."),
    )
```

Draft with a fast local model, polish with a powerful one:
```python
draft_ai = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
final_ai = AIListBase("anthropic:claude-opus-4-6", context_limit=180000, provider=Provider.ANTHROPIC)

draft_ai.run("Write a first draft of the executive summary based on these notes.")
draft_ai.run("Expand the risks section.")
draft_ai.run("Add a conclusion.")

# Hand the full conversation to Claude for the final pass
final_ai.chat_history = list(draft_ai.chat_history)
result = final_ai.run("Polish the entire document: fix tone, tighten language, ensure consistency.")
```

A lightweight model decides which specialist handles the request:
```python
async def main():
    router = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
    router.system = "Reply with exactly one word: CODE, DATA, or SEARCH."
    decision = await router.run_async(user_query)

    async with AIListDemo() as ai:
        ai.workspace_dir = Path(r"C:\MyProject")
        if "CODE" in decision:
            await ai.mcp_connects([{"name": "workspace"}, {"name": "git"}])
        elif "DATA" in decision:
            await ai.mcp_connect("qdrant")
        elif "SEARCH" in decision:
            await ai.mcp_connect("searxng")
        result = await ai.run_async(user_query)
```

Combine what multiple agents learned into a single context:
```python
async def main():
    ai_a = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)
    ai_b = AIListBase("ollama:llama3.2", context_limit=8192, provider=Provider.LLAMA)

    await asyncio.gather(
        ai_a.run_async("Research the market opportunity for this product idea."),
        ai_b.run_async("Research the main technical risks for this product idea."),
    )

    final = AIListBase("anthropic:claude-opus-4-6", context_limit=180000, provider=Provider.ANTHROPIC)
    final.chat_history = ai_a.chat_history + ai_b.chat_history
    result = await final.run_async("Based on everything above, write an investment memo.")
```

Native (passed as-is when the provider supports it): PNG, JPG, JPEG, WEBP, GIF, BMP, TIFF, PDF, MP3, WAV, OGG, M4A, FLAC, AAC, MP4, MOV, AVI, WEBM.
Auto-converted to text: DOCX, DOC, XLSX, XLS, PPTX, HTML, and all plain text formats (TXT, MD, PY, JS, TS, GO, RS, CPP, C, JAVA, JSON, CSV, YAML, SQL, and more).
Audio and video files are automatically transcribed via faster-whisper and passed to the model as text — even when the provider has no native audio support. For video, evenly spaced frames are also extracted.
```python
response = ai.run({
    0: {"prompt": "Who is speaking and what are the main points?"},
    1: {"file": "meeting.mp3"},
})
```

Transcription runs on CPU by default. Enable GPU acceleration:
```python
from ailist import FileConverter  # assumption: FileConverter is importable from ailist

ai.transcript_on_cuda = True
ai._converter = FileConverter(use_cuda=True)
```

Every call to `run()` assembles the model's context in this order:
```
[systems] → [prompts] → [chat_history] → [current run() input]
```
`systems` becomes the system message. `prompts` are prepended to every human message as permanent context. `chat_history` is the conversation so far — each `run()` appends the new exchange to it automatically, so the model remembers everything until you clear it. The current `run()` input lands at the very end.

This structure matters for quality: models read the beginning (`systems` + `prompts`) and the end (current input) most reliably, and tend to under-weight the middle (history). Use `prompts` for reference material that should always be in focus — a project overview, a style guide, a PDF the model should consult on every question — rather than burying it in history.

`system` / `prompt` are shorthands for `systems[0]` / `prompts[0]`. The dictionary form accepts any integer key — entries are sent in ascending key order, which lets you compose a system prompt from independent parts without string concatenation:
ai.system = "You are a code reviewer." # systems[0]
ai.systems[1] = {"prompt": policy_text, "tag": "policy"} # wrapped in <policy>...</policy>
ai.systems[2] = {"file": "conventions.md"} # injected after policy
ai.prompt = "Focus on security issues." # prompts[0]
ai.prompts[1] = {"file": "codebase_overview.md"} # always in context, always freshMCP tool instructions are appended to the system message automatically and never interfere with your entries.
```python
ai.context_limit = 8192   # max tokens sent to the model — set to the model's context window
                          # minus expected response length, to leave room for the answer
                          # (passed to the constructor; can be changed at any time)

ai.auto_summarize_history = True         # compress old history when context_limit is approached (default: False)
ai.auto_summarize_history_keep_last = 4  # keep last N exchanges verbatim when summarizing (default: 1)
ai.control_context_limit = True          # raise an error instead of overflowing silently (default: True)

# Manual summarization
ai.summarize_history(keep_last=2)

# Token estimate before sending (no API call)
_, estimated = ai.estimate_tokens_withfactor("My next question")
print(estimated, "/", ai.context_limit)  # e.g. 3241 / 8192

# Clear history
ai.chat_history = []
```
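The estimate and manual summarization compose into a simple pre-flight guard. A sketch — the 0.9 safety margin is an arbitrary choice for illustration:

```python
question = "Review the full diff and summarize the risky changes."
_, estimated = ai.estimate_tokens_withfactor(question)
if estimated > ai.context_limit * 0.9:  # arbitrary threshold
    ai.summarize_history(keep_last=2)   # compress old exchanges before sending
response = ai.run(question)
```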
All token counts are available after every `run()` / `run_async()` — useful for cost tracking and capacity planning:

```python
# How full is the context window?
print(ai.input_tokens, "/", ai.context_limit)

# Actual tokens billed this call (includes all intermediate tool calls)
print("Billed input: ", ai.full_input_tokens)
print("Billed output:", ai.full_output_tokens)

# Cumulative totals since object creation
print("Total input: ", ai.full_input_tokens_total)
print("Total output:", ai.full_output_tokens_total)
```

`input_tokens` reflects how much of the context window your request occupies. `full_input_tokens` sums all API round-trips in a single `run()` call, including intermediate tool calls — use it for cost estimation. Cumulative totals let you track spend across a long session without external instrumentation.
Supported on Anthropic Claude, OpenAI o-series, and Qwen/DeepSeek reasoning models. Each provider exposes reasoning through a different API parameter — `apply_thinking_mode` handles all the differences internally, so you just pick a level:

```python
config = ai.apply_thinking_mode(thinking="high")
response = await ai.run_async("Solve this step by step.", config=config)

print(ai.last_thinking)  # the model's internal reasoning
print(response)          # the final answer
```

Available levels:
| Level | Description |
|---|---|
| `off` | Thinking disabled (or minimum, if the provider doesn't support full off) |
| `low` | Minimal reasoning budget |
| `medium` | Moderate reasoning budget |
| `high` | Extended reasoning — good default for hard tasks |
| `max` | Maximum budget the provider supports |
```python
ai.loglevel = 0  # silent (default in AIList)
ai.loglevel = 1  # timing and token stats
ai.loglevel = 2  # + message text per round (default in AIListDemo)
ai.loglevel = 3  # + full JSON history

print(ai.log)           # accumulated log
print(ai.last_message)  # last log entry

ai.get_systemprompt_log()  # snapshot current system prompt to log (useful after mcp_connect)
```

```
AIListBase — core engine: history, files, providers, token counting
 └── AIList — adds MCP servers, workspace tools, policy prompts, variable substitution
      └── AIListDemo — ready-to-run config for development (local Ollama model, loglevel=2)
```
Use `AIListBase` for scripts that don't need MCP. Use `AIList` as the base for your own production subclass. Use `AIListDemo` for quick experiments — and as a reference for how to structure your own subclass.

The recommended pattern is to subclass `AIList` once per project, baking in your model, workspace, MCP servers, and any prompt adjustments. This keeps all agent configuration in one place and makes call-site code minimal:
```python
from pathlib import Path
from ailist import AIList, Provider

class MyAgent(AIList):
    def __init__(self, tools=None):
        super().__init__(
            modelName     = "openai:gpt-4o",
            context_limit = int(128000 * 0.8),
            provider      = Provider.OPENAI,
            tools         = tools or [],
        )
        self.loglevel = 1
        self.workspace_dir = Path(r"C:\MyProject")

        # Adjust or disable built-in policy prompts:
        # self.systems[-3] = {}  # disable MINIMAL CHANGE POLICY
        # self.systems[-2] = {}  # disable VERIFICATION POLICY
        # self._prompt_workspace = "Custom workspace instructions..."

# Usage is then just:
async with MyAgent() as ai:
    await ai.mcp_connects([{"name": "workspace"}, {"name": "git"}])
    result = await ai.run_async("Refactor the auth module.")
```

`AIList` automatically injects workspace boundary and behaviour policies into the system prompt. Variables are resolved before each `run()`:
```python
from pathlib import Path

ai.workspace_dir = Path(r"C:\MyProject")  # → {WORKSPACE_DIR} in prompts
ai.attachments_dir = "attachments"        # → {ATTACHMENTS_DIR} = workspace/attachments
ai.skills_dir = "skills"                  # → {SKILLS_DIR} = workspace/skills

# Policies are stored as editable strings:
ai._prompt_workspace       # [WORKSPACE BOUNDARY POLICY]
ai._prompt_minimal_change  # [MINIMAL CHANGE POLICY]
ai._prompt_verification    # [VERIFICATION POLICY]
ai._prompt_attachments     # [ATTACHMENT WORKSPACE POLICY]

# Toggle attachment policy at runtime:
ai.set_use_attachments(True)
```

| Model / family | Constant | Vision | Audio | PDF | Thinking |
|---|---|---|---|---|---|
| Anthropic Claude | `Provider.ANTHROPIC` | ✓ | — | ✓ | ✓ |
| OpenAI GPT / o-series | `Provider.OPENAI` | ✓ | ✓ | — | ✓ |
| Google Gemini | `Provider.GEMINI` | ✓ | ✓ | ✓ | ✓ |
| Qwen / DeepSeek | `Provider.QWEN` | ✓ | — | — | ✓ |
| Llama 3.x | `Provider.LLAMA` | ✓ | — | — | — |
| Mistral / Mixtral | `Provider.MISTRAL` | — | — | — | — |
| Text + thinking, no vision | `Provider.TEXT_REASONING` | — | — | — | ✓ |
| Text only | `Provider.TEXT_ONLY` | — | — | — | — |
For a model not listed above, pass a `ProviderCaps` instance directly instead of a `Provider` constant. `ProviderCaps` is a dataclass — set only the flags your model actually supports:
```python
from ailist import AIListBase, ProviderCaps

my_caps = ProviderCaps(
    supports_content_blocks = True,
    supports_binary         = True,
    image_format            = "image_url",
    pdf_format              = None,   # no native PDF → extract text via pypdf
    supports_audio          = False,
    supports_video          = False,
)

ai = AIListBase("ollama:my-custom-model", context_limit=32768, provider=my_caps)
```

Skills are Markdown files that teach the agent how to perform a specific task — which tools to call, in what order, what to watch out for. They are one of the most powerful ways to improve agent reliability without touching the model or the code.
Any skill file from the community works as-is: drop it into your skills directory and the agent can read it on demand via the `skills` MCP server.
```python
async with MyAgent() as ai:
    ai.skills_dir = Path(r"C:\MyProject\skills")
    await ai.mcp_connects([
        {"name": "workspace"},
        {"name": "skills"},  # exposes list_skills / get_skill tools to the agent
    ])

    # The agent will look up the relevant skill automatically,
    # or you can load one explicitly:
    result = await ai.run_async(
        "Read the skill 'write_tests' and follow it to write tests for auth.py."
    )
```

Skills can also be pre-loaded into the system prompt as a permanent prompt block, so the agent always has them in context without a tool call:
```python
skill_text = Path(r"C:\MyProject\skills\write_tests.md").read_text()
ai.prompts[10] = {"prompt": skill_text, "tag": "skill_write_tests"}
```

Requirements:
- Python 3.11 or 3.12
```bash
pip install langchain langchain-core pydantic platformdirs
pip install langchain-ollama langchain-anthropic langchain-openai langchain-google-genai
pip install mcp tiktoken
pip install pypdf python-docx openpyxl python-pptx
pip install "beautifulsoup4[lxml]"
pip install faster-whisper opencv-python moviepy
pip install Pillow xhtml2pdf markdown
pip install piper-tts apprise sympy
pip install fastmcp
pip install uv huggingface-hub

uv tool install --python 3.12 mcp-server-qdrant
uv python install 3.13   # needed for Serena

npx -y @cyanheads/git-mcp-server --help
npx -y @executeautomation/playwright-mcp-server --help
npx -y @modelcontextprotocol/server-memory --help
```

- Node.js (for git/playwright/memory): https://nodejs.org/
- Docker (for Qdrant/SearXNG): https://www.docker.com/
Optional — pre-download the models below; otherwise they are downloaded automatically on first use:
python -c "from faster_whisper import WhisperModel; WhisperModel('turbo')"
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"| What | Where | When | Size |
|---|---|---|---|
Whisper turbo (audio/video transcription) |
~/.cache/ailist/whisper |
First audio or video file attached | ~800 MB |
| Piper voice model (TTS) | ~/.cache/ailist/piper |
First piper_synthesize call |
~50–100 MB per voice |
| Serena language servers (LSP) | managed by uv | First symbol operation in a new language | varies |
| Qdrant embedding model | managed by uv | First await ai.mcp_connect("qdrant") |
~90 MB |
> All models are downloaded once and shared across projects. Override the cache location with the `AILIST_CACHE_DIR` environment variable or by setting `ai._cache_dir = Path("/my/cache")` before first use.
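For example, to point every project at a shared cache (the path is illustrative):

```python
import os
os.environ["AILIST_CACHE_DIR"] = "/srv/ml-cache"  # must be set before the first model download
```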