foundry-kit

A Python framework for building, managing, and serving Azure AI Foundry agents — with a local FastAPI dev server, Azure AI Search RAG, OpenAI Responses API streaming, a ToolBuilder SDK, and a full document ETL pipeline.


Features

| Capability | Description |
| --- | --- |
| CLI | foundry-kit CLI to scaffold config, provision agents, manage indexes, run a dev server, and deploy to Azure Functions |
| REST API | FastAPI server with /chat/{agent}, /agents, and /health endpoints |
| Threads engine | Azure AI Foundry Hosted Agents (threads/runs); the default engine |
| Responses engine | OpenAI Responses API with tool-calling loop and SSE streaming; opt-in per agent |
| ToolBuilder | @tool decorator, FunctionTool, ApiTool, SearchTool, ToolRegistry; zero-boilerplate OpenAI function schemas |
| Document ETL | Extract → chunk → AI-enrich → embed → upsert pipeline for PDF, DOCX, XLSX, ZIP |
| Azure AI Search | HNSW vector index management, batch upsert, semantic query |
| Azure Functions | One-command deploy via the func CLI |

Installation

pip install foundry-kit

For document ingestion support (PDF, DOCX, XLSX):

pip install "foundry-kit[documents]"

Quickstart

1. Initialise project config

foundry-kit config init
# → writes foundry-kit.yaml

2. Create an agent

foundry-kit agent init my-bot \
  --prompt "You are a helpful assistant." \
  --model gpt-4o

3. Start local dev server

foundry-kit serve
# → http://127.0.0.1:8000

4. Chat

curl -X POST http://localhost:8000/chat/my-bot \
     -H "Content-Type: application/json" \
     -d '{"message": "Hello!"}'
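
The same request can be sent from Python. The sketch below uses only the standard library and assumes the dev server from step 3 is listening on port 8000. `build_chat_request` and `send_chat` are hypothetical helpers, not part of foundry-kit, and the reply is returned as a plain dict because its exact shape is not shown here.

```python
import json
import urllib.request


def build_chat_request(base_url: str, agent: str, message: str) -> urllib.request.Request:
    """Build the POST /chat/{agent} request shown in the curl example."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/{agent}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(base_url: str, agent: str, message: str) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_chat_request(base_url, agent, message)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with `foundry-kit serve` running in another terminal:
# print(send_chat("http://localhost:8000", "my-bot", "Hello!"))
```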

5. Deploy to Azure Functions

foundry-kit deploy --app-name my-function-app

Configuration (foundry-kit.yaml)

foundry_endpoint: "https://myhub.services.ai.azure.com/api/projects/myproject"
search_endpoint:  "https://mysearch.search.windows.net"
openai_endpoint:  "https://myoai.openai.azure.com"
embedding_model:  "text-embedding-3-small"
embedding_dimensions: 1536

agents:
  # Standard agent — uses Azure AI Foundry threads/runs
  - name: my-bot
    model: gpt-4o
    instructions: "You are a helpful assistant."
    search_index: support-docs       # optional RAG index
    foundry_agent_id: null           # filled by `foundry-kit agent init`

  # Responses API agent — uses OpenAI Responses API with tool-calling
  - name: policy-bot
    model: gpt-4o
    instructions: "You are an expert insurance analyst."
    search_index: policies
    engine: responses                # opt-in to Responses API engine

All fields can be overridden with environment variables named FOUNDRY_ plus the
upper-cased field name (e.g. FOUNDRY_FOUNDRY_ENDPOINT overrides foundry_endpoint).
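
To make the naming rule concrete, here is a minimal sketch of how such an override lookup could work for top-level fields. It is an illustration only, not foundry-kit's actual loader: `apply_env_overrides` is a hypothetical function, values are kept as strings, and how the real loader handles typed fields (such as embedding_dimensions) or the nested agents list is not covered here.

```python
import os
from typing import Mapping

ENV_PREFIX = "FOUNDRY_"


def apply_env_overrides(config: dict, environ: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of config where FOUNDRY_<FIELD> variables replace top-level fields."""
    merged = dict(config)
    for field in config:
        env_name = ENV_PREFIX + field.upper()  # foundry_endpoint -> FOUNDRY_FOUNDRY_ENDPOINT
        if env_name in environ:
            merged[field] = environ[env_name]
    return merged
```

The doubled prefix in FOUNDRY_FOUNDRY_ENDPOINT follows directly from this rule: the field itself already starts with foundry_.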


Document Ingestion

Ingest an entire folder of PDFs, DOCX, XLSX, or ZIP archives into Azure AI Search:

foundry-kit index ingest ./docs/ --index policies
# Ingested 12 file(s), 147 chunk(s), 0 error(s).

# Preview without writing
foundry-kit index ingest ./docs/ --index policies --dry-run

# Disable LLM metadata enrichment (faster, no AI cost)
foundry-kit index ingest ./docs/ --index policies --no-ai

The pipeline:

  1. Extract — pdfplumber/pypdf for PDF; python-docx for DOCX; openpyxl for XLSX; recursive ZIP with path-traversal protection
  2. Chunk — paragraph-aware chunker with configurable --chunk-size and --overlap
  3. Enrich — AI metadata detection (document type, named insured, policy number, dates, carrier) with regex fallback
  4. Embed — Azure OpenAI embeddings
  5. Upsert — deterministic SHA-256 chunk IDs for idempotent re-ingestion
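
Steps 2 and 5 can be sketched in a few lines. This is not the library's implementation: `chunk_paragraphs` and `chunk_id` are hypothetical names, sizes are measured in characters (the unit used by --chunk-size is an assumption), and the fields hashed into the ID are a guess at what makes a chunk unique.

```python
import hashlib


def chunk_paragraphs(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Pack whole paragraphs into chunks of roughly chunk_size characters.

    Each new chunk starts with the last `overlap` characters of the previous
    chunk. A paragraph is never split, so a chunk can exceed chunk_size.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            # Carry the tail of the finished chunk into the next one.
            tail = current[-overlap:] if overlap > 0 else ""
            current = f"{tail}\n\n{paragraph}" if tail else paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chunk_id(source_path: str, chunk_index: int, chunk_text: str) -> str:
    """Derive a stable ID: the same file, position, and text give the same hash."""
    payload = f"{source_path}\x00{chunk_index}\x00{chunk_text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Because the ID depends only on the inputs, re-ingesting an unchanged file produces the same IDs, so an upsert overwrites existing documents instead of creating duplicates.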

ToolBuilder

Build OpenAI function-call tools with zero boilerplate:

from foundry_kit import tool, FunctionTool, ApiTool, SearchTool, ToolRegistry

# 1. Python function tool — schema auto-generated from type hints + docstring
@tool
def get_policy(policy_number: str, include_endorsements: bool = False) -> str:
    "Retrieve a policy document by its number."
    # fetch_from_db is a placeholder for your own data-access function
    return fetch_from_db(policy_number)

# 2. HTTP API tool
claims_tool = ApiTool(
    name="submit_claim",
    description="Submit a new insurance claim.",
    method="POST",
    url_template="https://api.example.com/claims",
    parameters={
        "type": "object",
        "properties": {
            "policy_number": {"type": "string"},
            "description": {"type": "string"},
        },
        "required": ["policy_number", "description"],
    },
    headers={"Authorization": "Bearer MY_TOKEN"},
)

# 3. Azure AI Search tool
from foundry_kit import SearchManager
search = SearchManager(search_endpoint="...", openai_endpoint="...")
search_tool = SearchTool(
    name="search_policies",
    description="Search the policy knowledge base.",
    index_name="policies",
    search_manager=search,
    top_k=5,
)

# 4. Registry — used by ResponsesEngine
registry = ToolRegistry()
registry.register(get_policy)
registry.register(claims_tool)
registry.register(search_tool)
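
For readers curious how a schema can be derived from type hints and a docstring, the following sketch shows the general technique with the standard inspect module. It is not the @tool decorator's source: `function_schema` is a hypothetical helper, it handles only a few primitive types, and the flat output shape follows the function-tool format used by the Responses API.

```python
import inspect
from typing import get_type_hints

# Python annotation -> JSON Schema type; anything else falls back to "string".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_schema(func) -> dict:
    """Build an OpenAI-style function schema from a function's signature."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value must be supplied by the model.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }
```

Applied to get_policy above, this yields a string parameter policy_number (required) and a boolean parameter include_endorsements (optional).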

Responses API Engine

Use the OpenAI Responses API with automatic tool-calling (instead of the default Azure AI Foundry threads engine):

from openai import AzureOpenAI
from foundry_kit import ResponsesEngine, ToolRegistry, SearchTool

oai = AzureOpenAI(azure_endpoint="https://myoai.openai.azure.com")
registry = ToolRegistry()
registry.register(search_tool)  # search_tool as defined in the ToolBuilder section

engine = ResponsesEngine(openai_client=oai, tool_registry=registry)

# Non-streaming. Both calls below use await / async for, so run them inside an async function.
reply, sources = await engine.chat(
    model="gpt-4o",
    instructions="You are a policy expert.",
    message="What does my flood policy cover?",
)

# SSE streaming — yields SSEEvent objects
async for event in engine.stream(model="gpt-4o", instructions="...", message="..."):
    if event.type == "tool_use":
        print(f"Calling tool: {event.data['name']}")
    elif event.type == "text_delta":
        print(event.data["delta"], end="", flush=True)
    elif event.type == "sources":
        print(f"\nSources: {event.data['sources']}")

HTTP streaming via the REST API:

curl -N -X POST "http://localhost:8000/chat/policy-bot?stream=true" \
     -H "Content-Type: application/json" \
     -d '{"message": "What is covered under flood insurance?"}'
# data: {"type":"tool_use","data":{"name":"search_knowledge_base"},"agent":"policy-bot"}
# data: {"type":"text_delta","data":{"delta":"Flood insurance typically covers..."},"agent":"policy-bot"}
# data: {"type":"sources","data":{"sources":[...]},"agent":"policy-bot"}
# data: {"type":"done","data":{},"agent":"policy-bot"}
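
A client has to turn those data: lines back into events. The sketch below parses them with the standard library, assuming each event arrives as a single data: line containing one JSON object, as in the sample output above. `parse_sse_lines` is a hypothetical helper, not a foundry-kit API.

```python
import json
from typing import Iterable, Iterator


def parse_sse_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each data: line, stopping after "done"."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        event = json.loads(line[len("data:"):].strip())
        yield event
        if event.get("type") == "done":
            break
```

Any iterable of text lines works as input, for example the decoded lines of a streaming HTTP response body.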

CLI Reference

Config

foundry-kit config init [--output foundry-kit.yaml]

Agents

foundry-kit agent init <name> [--prompt TEXT] [--model MODEL]
foundry-kit agent update <name> [--prompt TEXT] [--model MODEL]
foundry-kit agent list
foundry-kit agent show <name>
foundry-kit agent delete <name>

Index

foundry-kit index create <index-name>
foundry-kit index upsert --index <name> --file <path>
foundry-kit index query --index <name> --query TEXT [--top-k N]
foundry-kit index ingest <path> --index <name> [--chunk-size N] [--overlap N] [--dry-run] [--no-ai]

Server

foundry-kit serve [--host 0.0.0.0] [--port 8000] [--reload]

Deploy

foundry-kit deploy [--app-name NAME] [--resource-group RG]

Python SDK

import asyncio

from foundry_kit import (
    FoundryClient,
    SearchManager,
    ResponsesEngine,
    DocumentProcessor,
    create_app,
    load_config,
    # Models
    ChatRequest, ChatResponse, Source,
    SSEEvent, IngestResult, DocumentMetadata, ProcessedChunk,
    # Tools
    tool, FunctionTool, ApiTool, SearchTool, ToolRegistry,
)

# Load config
config = load_config("foundry-kit.yaml")

# Foundry client (threads/runs)
client = FoundryClient(config.foundry_endpoint)
reply, sources = asyncio.run(client.chat("asst_abc123", "Hello!"))

# Search
search = SearchManager(
    search_endpoint=config.search_endpoint,
    openai_endpoint=config.openai_endpoint,
    embedding_model=config.embedding_model,
)
results = search.query("my-index", "cloud AI", top_k=5)

# Document ingestion (requires foundry-kit[documents])
result = search.ingest("./documents/", index_name="policies")
print(f"{result.processed} files → {result.chunks} chunks")

# FastAPI app
app = create_app("foundry-kit.yaml")

REST API

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check |
| GET | /agents | List all configured agents |
| GET | /agents/{name} | Get agent info |
| POST | /chat/{name} | Send a message (JSON body: ChatRequest) |
| POST | /chat/{name}?stream=true | SSE streaming chat |

ChatRequest schema:

{
  "message": "Hello!",
  "history": [],
  "context_override": null,
  "previous_response_id": null
}

Development

git clone https://github.com/corvis-labs/foundry-kit
cd foundry-kit
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,documents]"

# Run tests
pytest

# Lint + format
ruff check foundry_kit/
ruff format foundry_kit/ tests/

# Type check
mypy foundry_kit/ --ignore-missing-imports

License

MIT © Corvis Labs


Configuration (foundry-kit.yaml)

See foundry-kit.yaml.example for full schema documentation.


Authentication

foundry-kit authenticates with DefaultAzureCredential. Provide credentials in one of these ways:

  • AZURE_CLIENT_ID + AZURE_CLIENT_SECRET + AZURE_TENANT_ID (service principal)
  • az login (developer workstation)
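
When running under a service principal, a quick preflight check can catch a missing variable before the first Azure call fails. This is a minimal sketch with a hypothetical helper name. It only checks that the three variables are set, not that the credentials are valid, and a non-empty result is fine if you rely on az login instead.

```python
import os
from typing import Mapping

SERVICE_PRINCIPAL_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def missing_service_principal_vars(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the service principal variables that are unset or empty."""
    return [name for name in SERVICE_PRINCIPAL_VARS if not environ.get(name)]
```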

Requirements

  • Python 3.11+
  • Azure AI Foundry project (pre-provisioned)
  • Azure AI Search resource (for RAG features)
  • Azure OpenAI resource with an embeddings deployment (for vector search)
  • Azure Functions Core Tools (func) for deployment
