BUFI Minions — Self-Improving Financial-Ops Agents

Devpost submission for "Building Agents for Real-World Challenges" (Arize track). Gemini 3-powered background agents for BUFI, a stablecoin-first financial workspace for global teams, with production-grade Arize Phoenix observability and a working self-improvement loop: the agents read their own traces, recall curated fixes, and get measurably better every morning.

The real-world challenge

BUFI's ops team drowns in recurring engineering/back-office missions (reconciliation checks, incident triage, repo chores). This fork turns Vercel's open-agents into BUFI's autonomous "minion" workforce:

Linear daily plan ──(cron 12:00)──► BUFI bridge ──► open-agents (Gemini 3 Pro)
                                                        │  plan (todo) · subagents (task)
                                                        │  human gate (ask_user_question)
                                                        ▼
                                            OpenInference traces ──► Arize Phoenix
                                                        │   ▲
                                   agent introspects ───┘   │ LLM-as-judge evals
                                   (Phoenix MCP +           │ (gemini-3-flash)
                                    recall/find_resolved_gap)
                                                        ▼
                    morning digest (cron 13:00) ──► Slack #bu-minions
                    + promotes successes/fixes to Phoenix datasets ⟲ self-improvement

Arize track requirements → where they live

| Requirement | Implementation |
| --- | --- |
| Code-owned Gemini agent runtime | `packages/agent/open-agent.ts`: AI SDK v6 `ToolLoopAgent`, default `google/gemini-3-pro-preview` via AI Gateway |
| OpenInference instrumentation → Phoenix | `packages/arize-phoenix/otel.ts` + `apps/web/instrumentation.ts` + `experimental_telemetry` (sessionId/chatId/source/linearTaskId/repo metadata) |
| Phoenix MCP in the agent | `packages/agent/tools/phoenix-mcp.ts`: `@arizeai/phoenix-mcp` over stdio via `@ai-sdk/mcp`, 14 trace/span/session/dataset tools merged into the toolset (`PHOENIX_MCP_ENABLED`) |
| Evals on traces | `scripts/phoenix-eval.ts`: Gemini LLM-as-judge → Phoenix span annotations + eval badges in the sessions UI |
| Self-improvement loop | `packages/agent/tools/phoenix-introspection.ts` (`recall_similar_runs`, `find_resolved_gap`) + `app/api/bufi/digest/promote` (auto-curation to `bufi-recall` / `bufi-resolved-gaps` datasets) |
| Multi-step + human-in-control | `todo_write` planning, `task` subagents, `ask_user_question` HITL, sessions UI |
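
The `experimental_telemetry` row can be illustrated with a small sketch of how per-run metadata might be assembled before it is handed to an AI SDK call. The metadata keys come from the table; the `RunContext` shape, the `buildTelemetry` helper, and the `"bufi-minion"` function ID are assumptions made for illustration, not the repo's actual code.

```typescript
// Hypothetical context shape; field names mirror the metadata keys in the table.
interface RunContext {
  sessionId: string;
  chatId: string;
  source: string; // where the run came from (example value below is made up)
  linearTaskId?: string;
  repo?: string;
}

// Subset of the AI SDK telemetry settings used in this sketch.
interface TelemetrySettings {
  isEnabled: boolean;
  functionId: string;
  metadata: Record<string, string>;
}

// Build the telemetry settings, leaving out optional keys that are not set
// so that spans do not carry empty attributes.
function buildTelemetry(ctx: RunContext): TelemetrySettings {
  const metadata: Record<string, string> = {
    sessionId: ctx.sessionId,
    chatId: ctx.chatId,
    source: ctx.source,
  };
  if (ctx.linearTaskId) metadata.linearTaskId = ctx.linearTaskId;
  if (ctx.repo) metadata.repo = ctx.repo;
  // "bufi-minion" is a placeholder function ID, not a name taken from the repo.
  return { isEnabled: true, functionId: "bufi-minion", metadata };
}
```

The returned object would be passed as the `experimental_telemetry` option of the model call, so every span exported to Phoenix can be filtered by session, chat, or Linear task.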

Demo flow (scripts/demo-dispatch.ts + scripts/phoenix-seed-resolved-gap.ts): a treasury-reconciliation mission fails on run 1 (missing script), an engineer curates the fix into the bufi-resolved-gaps dataset, and run 2 self-heals — the agent's find_resolved_gap call returns the curated fix and it completes the mission, citing its own trace history.
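
The two-run demo can be sketched in a few lines. This is a minimal in-memory model, assuming a naive keyword-overlap match; the real `find_resolved_gap` tool reads from the `bufi-resolved-gaps` Phoenix dataset and may match very differently. The `ResolvedGap` shape, the `findResolvedGap` function, and the error and fix strings are all hypothetical.

```typescript
// Hypothetical record shape; the real dataset schema may differ.
interface ResolvedGap {
  symptom: string; // failure text seen in an earlier run
  fix: string; // remediation curated by an engineer
}

// Distinct lowercase words longer than two characters.
function tokens(text: string): string[] {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2);
  return Array.from(new Set(words));
}

// Return the curated entry whose symptom shares the most words with the
// failure, or undefined when no entry shares at least `minOverlap` words.
function findResolvedGap(
  failure: string,
  gaps: ResolvedGap[],
  minOverlap = 2,
): ResolvedGap | undefined {
  const failureTokens = new Set(tokens(failure));
  let best: ResolvedGap | undefined;
  let bestScore = 0;
  for (const gap of gaps) {
    const score = tokens(gap.symptom).filter((t) => failureTokens.has(t)).length;
    if (score >= minOverlap && score > bestScore) {
      best = gap;
      bestScore = score;
    }
  }
  return best;
}

// Run 1: nothing curated yet, so the lookup comes back empty.
const gaps: ResolvedGap[] = [];
const failure = "reconcile script not found: scripts/reconcile.ts"; // made-up error text
const run1 = findResolvedGap(failure, gaps);

// An engineer curates the fix; run 2 now retrieves it.
const curatedFix: ResolvedGap = {
  symptom: "reconcile script not found",
  fix: "Add the missing reconciliation script before re-running the mission.",
};
gaps.push(curatedFix);
const run2 = findResolvedGap(failure, gaps);
```

The point of the sketch is the control flow, not the matching: the first lookup returns nothing, a human adds one record, and the second lookup returns that record so the agent can act on it.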

Hosted at https://open-agents-bay.vercel.app · BUFI-side crons live in BuFi007/desk-v1 (apps/app/src/app/api/cron/daily-plan-coffee, coffee-digest). Cloud Run-deployable (standard Next.js standalone build); hosted on Vercel for the demo.

Open Agents

Open Agents is an open-source reference app for building and running background coding agents on Vercel. It includes the web UI, the agent runtime, sandbox orchestration, and the GitHub integration needed to go from prompt to code changes without keeping your laptop involved.

The repo is meant to be forked and adapted, not treated as a black box.

What it is

Open Agents is a three-layer system:

Web -> Agent workflow -> Sandbox VM

The web app handles auth, sessions, chat, and streaming UI.
The agent runs as a durable workflow on Vercel.
The sandbox is the execution environment: filesystem, shell, git, dev servers, and preview ports.

The key architectural decision: the agent is not the sandbox

The agent does not run inside the VM. It runs outside the sandbox and interacts with it through tools like file reads, edits, search, and shell commands.

That separation is the main point of the project:

agent execution is not tied to a single request lifecycle
sandbox lifecycle can hibernate and resume independently
model/provider choices and sandbox implementation can evolve separately
the VM stays a plain execution environment instead of becoming the control plane
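
A minimal sketch of that separation, assuming a hypothetical `Sandbox` interface (the real abstraction lives in `packages/sandbox` and is not reproduced here). The tools are ordinary functions that hold a sandbox handle; the agent loop that calls them runs somewhere else entirely. The calls are synchronous only to keep the sketch short; a real sandbox client would be asynchronous.

```typescript
// Hypothetical interface, not the repo's actual sandbox API.
interface Sandbox {
  readFile(path: string): string;
  writeFile(path: string, content: string): void;
  exec(command: string): { exitCode: number; stdout: string };
}

// In-memory stand-in so the sketch runs without a VM.
class FakeSandbox implements Sandbox {
  private files: Map<string, string>;

  constructor(initial: Record<string, string> = {}) {
    this.files = new Map(Object.entries(initial));
  }

  readFile(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`ENOENT: ${path}`);
    return content;
  }

  writeFile(path: string, content: string): void {
    this.files.set(path, content);
  }

  exec(command: string): { exitCode: number; stdout: string } {
    // Pretends every command succeeds; a real sandbox would run it in the VM.
    return { exitCode: 0, stdout: `ran: ${command}` };
  }
}

// Tools close over a sandbox handle. Swapping the sandbox implementation
// does not change the agent, and swapping the model does not change the sandbox.
function createTools(sandbox: Sandbox) {
  return {
    read: (path: string) => sandbox.readFile(path),
    edit: (path: string, from: string, to: string) => {
      const before = sandbox.readFile(path);
      if (!before.includes(from)) throw new Error(`"${from}" not found in ${path}`);
      // Replaces the first occurrence only.
      sandbox.writeFile(path, before.replace(from, to));
    },
    shell: (command: string) => sandbox.exec(command),
  };
}
```

Because the tools depend only on the interface, the same agent code can be exercised against `FakeSandbox` in tests and against a real VM in production.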

Current capabilities

chat-driven coding agent with file, search, shell, task, skill, and web tools
durable multi-step execution with Workflow SDK-backed runs, streaming, and cancellation
isolated Vercel sandboxes with snapshot-based resume
repo cloning and branch work inside the sandbox
optional auto-commit, push, and PR creation after a successful run
session sharing via read-only links
optional voice input via ElevenLabs transcription

Runtime notes

A few details that matter for understanding the current implementation:

Chat requests start a workflow run instead of executing the agent inline.
Each agent turn can continue across many persisted workflow steps.
Active runs can be resumed by reconnecting to the stream for the existing workflow.
Sandboxes expose ports 3000, 5173, 4321, 8000, and 5001, use a build-prewarmed deployment template on Vercel, and hibernate after inactivity.
Auto-commit and auto-PR are supported, but they are preference-driven features, not always-on behavior.
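
The first three notes describe a run that outlives the request that started it. The following is a rough in-memory model of that idea, not the Workflow SDK API: `RunStore` and all of its methods are hypothetical names, and real runs are persisted durably rather than kept in a `Map`.

```typescript
type RunStatus = "running" | "completed" | "cancelled";

interface WorkflowRun {
  id: string;
  chatId: string;
  status: RunStatus;
  steps: string[]; // persisted step outputs, in order
}

class RunStore {
  private runs = new Map<string, WorkflowRun>();
  private counter = 0;

  // A chat request creates a run and returns immediately; the agent turn
  // proceeds independently of the request that started it.
  startRun(chatId: string): WorkflowRun {
    this.counter += 1;
    const run: WorkflowRun = { id: `run_${this.counter}`, chatId, status: "running", steps: [] };
    this.runs.set(run.id, run);
    return run;
  }

  // Each step's output is recorded before the next step begins.
  persistStep(runId: string, output: string): void {
    const run = this.mustGet(runId);
    if (run.status !== "running") throw new Error(`run ${runId} is ${run.status}`);
    run.steps.push(output);
  }

  finish(runId: string, status: "completed" | "cancelled"): void {
    this.mustGet(runId).status = status;
  }

  // A reconnecting client says how many steps it has already seen and
  // receives the remainder plus the current status.
  reconnect(runId: string, seen = 0): { status: RunStatus; missed: string[] } {
    const run = this.mustGet(runId);
    return { status: run.status, missed: run.steps.slice(seen) };
  }

  private mustGet(runId: string): WorkflowRun {
    const run = this.runs.get(runId);
    if (!run) throw new Error(`unknown run: ${runId}`);
    return run;
  }
}
```

A client that drops its connection after the first step would call `reconnect(run.id, 1)` and pick up from the second step, without the agent having restarted.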

Environment variables

See apps/web/.env.example for the full list. Summary:

Minimum runtime

POSTGRES_URL=
BETTER_AUTH_SECRET=

Required for sign-in (Vercel OAuth)

NEXT_PUBLIC_VERCEL_APP_CLIENT_ID=
VERCEL_APP_CLIENT_SECRET=

Required for GitHub repo access, pushes, and PRs

NEXT_PUBLIC_GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
GITHUB_APP_ID=
GITHUB_APP_PRIVATE_KEY=
NEXT_PUBLIC_GITHUB_APP_SLUG=
GITHUB_WEBHOOK_SECRET=

Optional

REDIS_URL=
KV_URL=
OPEN_AGENTS_RESOURCE_PROFILE=
VERCEL_PROJECT_PRODUCTION_URL=
NEXT_PUBLIC_VERCEL_PROJECT_PRODUCTION_URL=
VERCEL_SANDBOX_BASE_SNAPSHOT_ID=
ELEVENLABS_API_KEY=

REDIS_URL / KV_URL: optional skills metadata cache (falls back to in-memory when not configured).
OPEN_AGENTS_RESOURCE_PROFILE: optional deployment resource profile. Set to hobby to use Hobby-compatible defaults for chat and sandbox resources; leave unset for standard behavior.
VERCEL_PROJECT_PRODUCTION_URL / NEXT_PUBLIC_VERCEL_PROJECT_PRODUCTION_URL: canonical production URL for metadata and some callback behavior.
VERCEL_SANDBOX_BASE_SNAPSHOT_ID: optional explicit base snapshot override for fresh sandboxes. Vercel deployments normally resolve their automatically prewarmed named template without this value. Outside a Vercel deployment, leaving it unset starts from the standard Sandbox runtime.
ELEVENLABS_API_KEY: voice transcription.
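
The grouping above can be checked at startup with a small helper. This is a sketch under stated assumptions: the variable names are taken from the lists above, but `missingByGroup`, `cacheBackend`, and the group labels are hypothetical and do not reflect how the app itself validates its environment.

```typescript
type Group = "runtime" | "signIn" | "github";
type Env = Record<string, string | undefined>;

// Variable names as listed in this section, grouped by the feature they unlock.
const ENV_GROUPS: Record<Group, readonly string[]> = {
  runtime: ["POSTGRES_URL", "BETTER_AUTH_SECRET"],
  signIn: ["NEXT_PUBLIC_VERCEL_APP_CLIENT_ID", "VERCEL_APP_CLIENT_SECRET"],
  github: [
    "NEXT_PUBLIC_GITHUB_CLIENT_ID",
    "GITHUB_CLIENT_SECRET",
    "GITHUB_APP_ID",
    "GITHUB_APP_PRIVATE_KEY",
    "NEXT_PUBLIC_GITHUB_APP_SLUG",
    "GITHUB_WEBHOOK_SECRET",
  ],
};

// List the unset or blank variables in each group.
function missingByGroup(env: Env): Record<Group, string[]> {
  const isMissing = (name: string) => !env[name]?.trim();
  return {
    runtime: ENV_GROUPS.runtime.filter(isMissing),
    signIn: ENV_GROUPS.signIn.filter(isMissing),
    github: ENV_GROUPS.github.filter(isMissing),
  };
}

// The skills metadata cache falls back to in-memory when neither URL is set.
function cacheBackend(env: Env): "remote" | "memory" {
  return env.REDIS_URL?.trim() || env.KV_URL?.trim() ? "remote" : "memory";
}
```

In a Node process the helper would be called as `missingByGroup(process.env)`; an empty `runtime` list means the app can boot, while non-empty `signIn` or `github` lists indicate which features are not yet configured.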

Deploy your own copy on Vercel

Fork this repo.
Import the repo into Vercel. Neon Postgres is auto-provisioned if you use the deploy button above.

Generate a secret for session signing:

openssl rand -base64 32   # BETTER_AUTH_SECRET

Add env vars in Vercel project settings:
```
POSTGRES_URL=
BETTER_AUTH_SECRET=
```
Deploy once to get a stable production URL.

Create a Vercel OAuth app with callback URL:

https://YOUR_DOMAIN/api/auth/callback/vercel

Add these env vars and redeploy:

NEXT_PUBLIC_VERCEL_APP_CLIENT_ID=
VERCEL_APP_CLIENT_SECRET=

If you want the full GitHub-enabled coding-agent flow, create a GitHub App using:
- Homepage URL: https://YOUR_DOMAIN
- Callback URL: https://YOUR_DOMAIN/api/auth/callback/github
- Setup URL: https://YOUR_DOMAIN/api/github/app/callback
In the GitHub App settings:
- use the GitHub App's Client ID and Client Secret for NEXT_PUBLIC_GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET
- make the app public if you want org installs to work cleanly
Add the GitHub App env vars and redeploy.
Optionally add Redis/KV, OPEN_AGENTS_RESOURCE_PROFILE=hobby for Hobby-compatible resource defaults, the canonical production URL vars, and VERCEL_SANDBOX_BASE_SNAPSHOT_ID only if you need to override the automatically prewarmed sandbox template.

Local setup

Install dependencies:
```
corepack enable
pnpm install
```
Create your local env file:
```
cp apps/web/.env.example apps/web/.env
```
Fill in the required values in apps/web/.env.
Start the app:
```
pnpm web
```

If you already have a linked Vercel project, you can pull env vars locally with vc env pull.

OAuth and integration setup

Vercel OAuth

Authentication is handled by Better Auth with Vercel and GitHub as social providers. All auth routes are served from the /api/auth/[...all] catchall.

Create a Vercel OAuth app and use this callback:

https://YOUR_DOMAIN/api/auth/callback/vercel

For local development, use:

http://localhost:3000/api/auth/callback/vercel

Then set:

NEXT_PUBLIC_VERCEL_APP_CLIENT_ID=...
VERCEL_APP_CLIENT_SECRET=...

GitHub App

You do not need a separate GitHub OAuth app. Open Agents uses the GitHub App's OAuth credentials as a Better Auth social provider, plus the App's installation tokens for repo access.

Create a GitHub App for installation-based repo access and configure:

Homepage URL: https://YOUR_DOMAIN
Callback URL: https://YOUR_DOMAIN/api/auth/callback/github
Setup URL: https://YOUR_DOMAIN/api/github/app/callback
Visibility: public, if you want org installs to work cleanly

For local development, use http://localhost:3000 as the homepage URL, http://localhost:3000/api/auth/callback/github as the callback URL, and http://localhost:3000/api/github/app/callback as the setup URL.
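
Since every callback follows the same pattern, the URLs for any environment can be derived from one base URL. A minimal sketch, with the paths taken from this section; the `integrationUrls` helper and its field names are hypothetical.

```typescript
interface IntegrationUrls {
  homepageUrl: string;
  githubCallbackUrl: string;
  githubSetupUrl: string;
  vercelCallbackUrl: string;
}

// Derive the URLs to paste into the Vercel OAuth app and GitHub App settings.
function integrationUrls(baseUrl: string): IntegrationUrls {
  // URL#origin keeps scheme, host, and port and drops any path or trailing slash.
  const origin = new URL(baseUrl).origin;
  return {
    homepageUrl: origin,
    githubCallbackUrl: `${origin}/api/auth/callback/github`,
    githubSetupUrl: `${origin}/api/github/app/callback`,
    vercelCallbackUrl: `${origin}/api/auth/callback/vercel`,
  };
}
```

Calling `integrationUrls("http://localhost:3000")` yields the local development values listed above; passing the production domain yields the deployed ones.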

Then set:

NEXT_PUBLIC_GITHUB_CLIENT_ID=...   # GitHub App Client ID
GITHUB_CLIENT_SECRET=...           # GitHub App Client Secret
GITHUB_APP_ID=...
GITHUB_APP_PRIVATE_KEY=...
NEXT_PUBLIC_GITHUB_APP_SLUG=...
GITHUB_WEBHOOK_SECRET=...

GITHUB_APP_PRIVATE_KEY can be stored as the PEM contents with escaped newlines or as a base64-encoded PEM.
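
Both storage forms can be normalized to a real multi-line PEM before use. The sketch below is one way to do that, not necessarily how the app does it: `normalizePrivateKey` is a hypothetical helper, and it relies on the global `atob`, which is available in browsers and in current Node, Deno, and Bun runtimes.

```typescript
// Accepts either PEM text with literal "\n" sequences or a base64-encoded PEM
// and returns PEM text with real newlines.
function normalizePrivateKey(raw: string): string {
  const value = raw.trim();
  if (value.includes("-----BEGIN")) {
    // Already PEM: turn escaped newlines into real ones.
    return value.replace(/\\n/g, "\n");
  }
  // Otherwise assume base64. atob throws on characters outside the alphabet.
  const decoded = atob(value);
  if (!decoded.includes("-----BEGIN")) {
    throw new Error("GITHUB_APP_PRIVATE_KEY is neither a PEM nor a base64-encoded PEM");
  }
  return decoded;
}
```

Checking for the `-----BEGIN` marker first keeps the common case cheap and avoids trying to base64-decode text that is already a PEM.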

Useful commands

pnpm web                    # run dev server
pnpm check                  # lint + format check
pnpm fix                    # lint + format fix
pnpm typecheck              # typecheck all packages
pnpm run ci                 # full CI: check, typecheck, tests, migration check
pnpm harness:smoke:sandbox:create # create a caller-owned sandbox for harness smoke tests
pnpm harness:smoke:codex    # run one Codex turn against an existing sandbox
pnpm sandbox:snapshot-base  # manually layer a new sandbox snapshot from an existing snapshot

Repo layout

apps/web         Next.js app, workflows, auth, chat UI
packages/agent   agent implementation, tools, subagents, skills
packages/sandbox sandbox abstraction and Vercel sandbox integration
packages/shared  shared utilities
