harinadh76/PilotCLI

PilotCLI

A powerful, lightweight, terminal-native autonomous coding agent built in TypeScript. It uses a Plan-Act-Observe loop powered by the Llama 3.3 model, served through the Groq API, to autonomously edit code, read files, and execute shell commands directly in your local environment.
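To make the Plan-Act-Observe idea concrete, here is a minimal sketch of such a loop. It is not PilotCLI's actual harness: the `Step`, `Planner`, and `Executor` types and the `runLoop` function are hypothetical names chosen for illustration, and a stub planner stands in for the LLM so the example runs without an API key.

```typescript
// Hypothetical types: the real PilotCLI harness may structure these differently.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Step = { kind: "tool"; call: ToolCall } | { kind: "done"; summary: string };

// Plan: decide the next step from the task and the observations so far.
type Planner = (task: string, observations: string[]) => Step;
// Act: execute one tool call and return its textual result.
type Executor = (call: ToolCall) => string;

function runLoop(
  task: string,
  plan: Planner,
  act: Executor,
  maxTurns: number
): { summary: string; turns: number } {
  const observations: string[] = [];
  for (let turn = 1; turn <= maxTurns; turn++) {
    const step = plan(task, observations);
    if (step.kind === "done") {
      return { summary: step.summary, turns: turn };
    }
    // Observe: record the tool result so the next planning step can use it.
    observations.push(`${step.call.tool} -> ${act(step.call)}`);
  }
  return { summary: "stopped: turn budget exhausted", turns: maxTurns };
}

// Stub planner standing in for the LLM: list the directory once, then finish.
const stubPlanner: Planner = (_task, observations) =>
  observations.length === 0
    ? { kind: "tool", call: { tool: "list_dir", args: { path: "." } } }
    : { kind: "done", summary: `finished after seeing: ${observations[0]}` };

// Stub executor standing in for the real tools.
const stubExecutor: Executor = (call) =>
  call.tool === "list_dir" ? "server.js" : "unknown tool";

const loopResult = runLoop("inspect project", stubPlanner, stubExecutor, 5);
```

In this sketch the turn limit doubles as a safety stop, which is the same role a turn budget plays in an agent that calls a paid API.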

✨ Features

  • Interactive REPL: A continuous interactive shell to converse with and issue commands to the agent.
  • Chat Mode: Use chat <msg> to talk directly with the LLM without triggering any tool executions.
  • Session Saving & Loading: Never lose your context. Use save <session_name> to serialize the agent's state, plan, and chat history to the local .sessions/ folder. Resume exactly where you left off with load <session_name>.
  • Interactive Tool Editing: Complete control over the agent. When the agent proposes a tool call, you can press e to open the exact JSON payload in your default $EDITOR (or nano), fix any typos the AI made, and execute your corrected version seamlessly!
  • Dynamic Configuration: Change the API key, model, or provider URL on the fly without restarting the CLI (using provider, model, and apikey commands).
  • Auto-Evals (LLM-as-a-Judge): A cutting-edge evaluation pipeline. The eval command reads your project directory, uses the LLM to generate synthetic tasks tailored to your code, runs them, and then uses the LLM to judge whether each task succeeded!
  • Memory Compaction: A built-in garbage collector (compact) that summarizes the agent's history and clears raw trace data to save tokens while retaining high-level project context.
  • Dynamic Budgets: Prevent runaway API costs by configuring PILOT_MAX_TURNS, PILOT_MAX_TOKENS, and PILOT_MAX_DOLLARS in your .env file.
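The session feature above can be pictured as a JSON round trip. The following is a rough sketch only: the on-disk format is not documented here, so the `SessionState` shape, the `.json` extension, and the helper names `sessionPath`, `serializeSession`, and `parseSession` are all assumptions made for illustration.

```typescript
// Hypothetical shape of a saved session; the real file format may differ.
interface SessionState {
  plan: string[];
  history: { role: "user" | "assistant" | "tool"; content: string }[];
}

// Map a session name to a path under .sessions/, rejecting names that
// could escape the folder (for example "../secrets").
function sessionPath(sessionName: string): string {
  if (!/^[A-Za-z0-9_-]+$/.test(sessionName)) {
    throw new Error(`invalid session name: ${sessionName}`);
  }
  return `.sessions/${sessionName}.json`;
}

// Serialize the state to pretty-printed JSON, ready to be written to disk.
function serializeSession(state: SessionState): string {
  return JSON.stringify(state, null, 2);
}

// Parse a saved session and check that the expected fields are present.
function parseSession(json: string): SessionState {
  const data = JSON.parse(json) as Partial<SessionState> | null;
  if (!data || !Array.isArray(data.plan) || !Array.isArray(data.history)) {
    throw new Error("malformed session file");
  }
  return { plan: data.plan, history: data.history };
}

const exampleSession: SessionState = {
  plan: ["create server.js", "switch to port 8080"],
  history: [{ role: "user", content: "create a basic express server" }],
};
const restoredSession = parseSession(serializeSession(exampleSession));
```

Validating the session name before building a path is a small design choice worth keeping in any real implementation, since the name comes straight from user input at the REPL.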

🛠️ Agent Capabilities

The agent is equipped with native tools to achieve your goals:

  • read_file & list_dir: Safely navigate and read local project files.
  • edit_file: Surgically modify specific lines of code without replacing entire files (saves massive amounts of context window tokens).
  • grep_search: Power search your codebase using regex patterns natively.
  • search_web: The agent can autonomously query the DuckDuckGo Instant Answer API when it encounters unfamiliar syntax or cryptic errors!
  • run_shell: Execute bash/shell commands.
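As an illustration of why a line-level edit_file tool saves tokens, the sketch below replaces a line range inside a file's text instead of rewriting the whole file. The `LineEdit` payload shape and the `applyLineEdit` function are hypothetical; the actual edit_file arguments in PilotCLI may look different.

```typescript
// Hypothetical argument shape; the actual edit_file payload may differ.
interface LineEdit {
  startLine: number; // 1-indexed, inclusive
  endLine: number; // 1-indexed, inclusive
  replacement: string;
}

// Replace lines startLine..endLine of `content` with the replacement text.
function applyLineEdit(content: string, edit: LineEdit): string {
  const lines = content.split("\n");
  if (
    edit.startLine < 1 ||
    edit.endLine < edit.startLine ||
    edit.endLine > lines.length
  ) {
    throw new RangeError(
      `line range ${edit.startLine}-${edit.endLine} is outside the file`
    );
  }
  const replacementLines = edit.replacement.split("\n");
  lines.splice(
    edit.startLine - 1,
    edit.endLine - edit.startLine + 1,
    ...replacementLines
  );
  return lines.join("\n");
}

const sourceBefore = "const port = 3000;\napp.listen(port);\n";
const sourceAfter = applyLineEdit(sourceBefore, {
  startLine: 1,
  endLine: 1,
  replacement: "const port = 8080;",
});
```

With this approach the model only has to emit the changed line and its position, so the size of the tool call stays roughly constant no matter how large the file is.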

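Prerequisites

  • Node.js and npm (used by the setup commands below)
  • A Groq API key (set as GROQ_API_KEY in your .env file)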

Setup

  1. Install Dependencies: Navigate to the code/ directory and install the required npm packages:

    cd code
    npm install
  2. Configure Environment: Create a .env file in the code/ directory and add your keys/budgets:

    GROQ_API_KEY=gsk_your_api_key_here
    
    # Optional Agent Limits
    PILOT_MAX_TURNS=50
    PILOT_MAX_TOKENS=200000
    PILOT_MAX_DOLLARS=5.0
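For readers curious how limits like these might be enforced, here is a small sketch. The three variable names come from the .env example above; everything else is an assumption, including the `loadBudgets` and `exceededBudget` helpers and the choice to treat a missing variable as "no limit". The environment is passed in as a plain object so the example does not depend on how PilotCLI actually loads its .env file.

```typescript
interface Budgets {
  maxTurns: number;
  maxTokens: number;
  maxDollars: number;
}

interface Usage {
  turns: number;
  tokens: number;
  dollars: number;
}

type Env = Record<string, string | undefined>;

// Read a positive number from the environment map.
// Assumption: a missing or empty variable means "no limit".
function readLimit(env: Env, key: string): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return Infinity;
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`${key} must be a positive number, got "${raw}"`);
  }
  return value;
}

function loadBudgets(env: Env): Budgets {
  return {
    maxTurns: readLimit(env, "PILOT_MAX_TURNS"),
    maxTokens: readLimit(env, "PILOT_MAX_TOKENS"),
    maxDollars: readLimit(env, "PILOT_MAX_DOLLARS"),
  };
}

// Return the name of the first exhausted budget, or null if all have room left.
function exceededBudget(usage: Usage, budgets: Budgets): string | null {
  if (usage.turns >= budgets.maxTurns) return "PILOT_MAX_TURNS";
  if (usage.tokens >= budgets.maxTokens) return "PILOT_MAX_TOKENS";
  if (usage.dollars >= budgets.maxDollars) return "PILOT_MAX_DOLLARS";
  return null;
}

const exampleBudgets = loadBudgets({
  PILOT_MAX_TURNS: "50",
  PILOT_MAX_TOKENS: "200000",
  PILOT_MAX_DOLLARS: "5.0",
});
```

A check like `exceededBudget` would typically run once per agent turn, before the next model call is made.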

Usage

Start the interactive REPL:

cd code
npm run pilotcli

REPL Commands

  • run <task>: Give the agent a new autonomous task to perform.
  • chat <msg>: Chat directly with the model.
  • save <file>: Save current session state and chat history.
  • load <file>: Load a previous session.
  • eval: Run the autonomous offline evaluation test suite.
  • compact: Summarize the agent's recent history to save tokens.
  • provider <url>: Set an OpenAI-compatible endpoint URL.
  • model <name>: Switch the LLM model (e.g., llama-3.3-70b-versatile).
  • apikey <key>: Update your API key live.
  • status: Show current configuration.
  • quit / exit: Exit the REPL.
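The command list above maps naturally onto a small parser that splits the first word from the rest of the line. This is a sketch under assumptions, not the project's real REPL code: `parseCommand` and `ParsedCommand` are hypothetical names, and rejecting a command that is missing its required argument is an illustrative choice.

```typescript
interface ParsedCommand {
  name: string;
  arg: string;
}

// Commands taken from the list above, grouped by whether they expect an argument.
const COMMANDS_WITH_ARG = new Set([
  "run", "chat", "save", "load", "provider", "model", "apikey",
]);
const COMMANDS_WITHOUT_ARG = new Set(["eval", "compact", "status", "quit"]);

// Parse one REPL line; returns null for empty, unknown, or malformed input.
function parseCommand(line: string): ParsedCommand | null {
  const match = /^(\S+)\s*([\s\S]*)$/.exec(line.trim());
  if (!match) return null; // empty input
  // Treat "exit" as an alias for "quit".
  const name = match[1] === "exit" ? "quit" : match[1];
  const arg = match[2];
  if (COMMANDS_WITH_ARG.has(name)) {
    return arg === "" ? null : { name, arg };
  }
  if (COMMANDS_WITHOUT_ARG.has(name)) {
    return arg === "" ? { name, arg: "" } : null;
  }
  return null;
}

const parsedRun = parseCommand("run create a file called server.js");
const parsedExit = parseCommand("exit");
```

Keeping the parser separate from command execution makes it easy to test without starting an interactive shell.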

Example Workflow

agent> run create a file called server.js with a basic express server
Starting agent loop for task: create a file called server.js with a basic express server
...
agent> run modify it to run on port 8080 instead
Starting agent loop for task: modify it to run on port 8080 instead
...
agent> compact
Compacting memory (this may take a few seconds)...
[Memory Compacted]
The agent successfully created an Express server in server.js and updated it to run on port 8080.
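The compact step in the workflow above can be thought of as replacing the raw trace with a single summary message. The sketch below illustrates that idea only; `Message`, `Summarizer`, and `compactHistory` are hypothetical names, and a stub summarizer stands in for the LLM call that PilotCLI would make.

```typescript
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

// In a real agent this would be an LLM call; here it is an injected function.
type Summarizer = (messages: Message[]) => string;

// Replace the full trace with one summary message.
function compactHistory(messages: Message[], summarize: Summarizer): Message[] {
  if (messages.length === 0) return [];
  const summary = summarize(messages);
  return [{ role: "assistant", content: `[Memory Compacted]\n${summary}` }];
}

// Stub summarizer: reports how many messages were folded into the summary.
const stubSummarizer: Summarizer = (messages) =>
  `Summary of ${messages.length} earlier messages.`;

const rawTrace: Message[] = [
  { role: "user", content: "create a file called server.js" },
  { role: "tool", content: "wrote server.js (12 lines)" },
  { role: "user", content: "modify it to run on port 8080 instead" },
  { role: "tool", content: "edited server.js line 3" },
];
const compactedTrace = compactHistory(rawTrace, stubSummarizer);
```

The trade-off is that tool output details are lost after compaction, so it fits best between tasks rather than in the middle of one.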

Architecture

  • Model: llama-3.3-70b-versatile (via Groq API)
  • Harness: Custom TypeScript harness with tsx for execution.
  • Hooks: Lifecycle hooks (SessionStart, PreToolUse, PostToolUse, Stop) for telemetry and extending agent capabilities.
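To show how lifecycle hooks of this kind can support telemetry, here is a minimal registry sketch. Only the four event names come from the list above; the `HookRegistry` class, its `on` and `emit` methods, and the payload shape are assumptions for illustration and may not match PilotCLI's actual hook API.

```typescript
type HookEvent = "SessionStart" | "PreToolUse" | "PostToolUse" | "Stop";
type HookPayload = Record<string, unknown>;
type HookHandler = (payload: HookPayload) => void;

// Hypothetical registry: handlers run in registration order for each event.
class HookRegistry {
  private handlers = new Map<HookEvent, HookHandler[]>();

  on(event: HookEvent, handler: HookHandler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  emit(event: HookEvent, payload: HookPayload = {}): void {
    for (const handler of this.handlers.get(event) ?? []) {
      handler(payload);
    }
  }
}

// Example: collect a simple telemetry trail for one tool call.
const hookRegistry = new HookRegistry();
const telemetryLog: string[] = [];

hookRegistry.on("SessionStart", () => { telemetryLog.push("start"); });
hookRegistry.on("PreToolUse", (p) => { telemetryLog.push(`pre:${String(p["tool"])}`); });
hookRegistry.on("PostToolUse", (p) => { telemetryLog.push(`post:${String(p["tool"])}`); });
hookRegistry.on("Stop", () => { telemetryLog.push("stop"); });

hookRegistry.emit("SessionStart");
hookRegistry.emit("PreToolUse", { tool: "read_file" });
hookRegistry.emit("PostToolUse", { tool: "read_file" });
hookRegistry.emit("Stop");
```

A PreToolUse handler is also a natural place for a confirmation or policy check before a shell command runs, though whether PilotCLI uses it that way is not stated here.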

About

A terminal-based coding agent 🤖
