A complete autonomous vehicle stack implementing perception, trajectory planning, and control for a Unity 3D simulation. The project provides single-camera lane following, ground truth path following, and a comprehensive set of analysis tools.
- Perception: Segmentation-based lane detection (default) with CV fallback (color masks, edge detection, Hough lines) and polynomial fitting
- Trained Segmentation Model: Supports running a trained checkpoint via --segmentation-checkpoint
- Trajectory Planning: Rule-based path planning with reference point smoothing and bias correction
- Control: Pure Pursuit (default) with PID/Stanley alternatives; feedforward (path curvature) + feedback (error correction)
- Speed Planning: Jerk-limited speed planner for smooth curve and limit transitions
- ACC / Lead Following: Same-lane lead following on validated highway scenarios with producer-side lead association and continuity tracking
- Single-Actor Wrong-Target Coverage: Isolated adjacent-lane same-direction and oncoming straight reject scenarios for ACC validation
- Ground Truth Following: Direct velocity control mode for precise ground truth path following
- Data Recording: Automatic HDF5 recording of all frames, vehicle state (including Unity time/frame count), control commands, and ground truth data
- Analysis Tools: Comprehensive analysis suite for evaluating drive performance
- Debug Visualizer: Web-based tool for visualizing recorded data with overlays
- Testing: Extensive test suite covering control, trajectory, perception, and integration scenarios
- Standalone Unity Player Workflow: Build and run the Unity player directly from scripts for automated testing
See docs/ODD.md for the system's operational design domain: track constraints, sensor assumptions, control modes, and known limitations.
Current traffic-model scope:
- one scripted traffic actor at a time
- same-lane lead following supported
- isolated wrong-target reject scenarios supported
- mixed traffic is not yet supported
Unity Simulator (C#)
  ↓ (camera feed, vehicle state, ground truth)
Python Bridge/API (FastAPI)
  ↓
Perception (Segmentation default, CV fallback) → Lane Detection
  ↓ (lane line coefficients, positions)
Trajectory Planner (Rule-based) → Path Planning
  ↓ (reference point: x, y, heading)
Control Stack (Pure Pursuit default + Feedforward) → Steering/Throttle/Brake
  ↓ (control commands)
Unity Simulator (C#) → Vehicle Control
  ↓
Data Recorder (HDF5) → All sensor data + commands + ground truth
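To make the flow above concrete, here is a minimal sketch of one cycle of the loop. It is illustrative only: the real orchestration lives in av_stack/orchestrator.py (the AVStack class), and the component/method names used here (get_latest, detect_lanes, plan, compute_control, send_command, record) are assumptions, not the actual API.

```python
# Illustrative per-frame loop only; see av_stack/orchestrator.py for the real
# AVStack implementation. All method names below are assumed for the sketch.
def run_one_cycle(bridge, perception, planner, controller, recorder):
    frame, state = bridge.get_latest()            # camera image + vehicle state from Unity
    lanes = perception.detect_lanes(frame)        # lane line coefficients / positions
    ref = planner.plan(lanes, state)              # reference point: x, y, heading
    cmd = controller.compute_control(ref, state)  # steering / throttle / brake
    bridge.send_command(cmd)                      # back to the Unity simulator
    recorder.record(frame, state, lanes, ref, cmd)  # everything goes to HDF5
```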
- Perception: perception/inference.py - Segmentation default with CV fallback and temporal filtering
- Trajectory: trajectory/inference.py - Rule-based planner with reference point smoothing
- Control: control/pid_controller.py - Pure Pursuit (default), PID/Stanley alternatives; feedforward + feedback
- Bridge: bridge/server.py - FastAPI server for Unity-Python communication
- Data: data/recorder.py - HDF5 recording with ground truth support
- Main Stack: av_stack/orchestrator.py - Integration of all components (AVStack class)
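The control item above pairs a feedforward term derived from path curvature with a feedback correction on tracking error. The sketch below shows that split in a simplified pure-pursuit/bicycle-model form; it is not the tuned controller in control/pid_controller.py, and wheelbase, lookahead, and k_fb are assumed illustrative parameters.

```python
import math

def steering_command(lateral_error_m, path_curvature, wheelbase=2.5,
                     lookahead=6.0, k_fb=0.1):
    """Simplified feedforward + feedback steering sketch (not the repo's controller).

    Feedforward: steering angle that tracks the path curvature (bicycle model).
    Feedback: pure-pursuit-style correction toward the path, scaled by k_fb.
    """
    ff = math.atan(wheelbase * path_curvature)
    fb = math.atan2(2.0 * wheelbase * k_fb * lateral_error_m, lookahead ** 2)
    return ff + fb
```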
See docs/ARCHITECTURE.md for the overall system design, layer responsibilities,
methods at each layer (perception, trajectory, control), and interface definitions.
For a canonical script map (what to run for each intent), see docs/SCRIPT_RUNBOOK.md.
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Follow the detailed instructions in setup_unity.md
Option A: Standard AV Stack (Perception → Trajectory → Control)
# Basic startup (segmentation default)
./start_av_stack.sh
# With Unity auto-launch and auto-play
./start_av_stack.sh --launch-unity --unity-auto-play
# Run for specific duration (e.g., 60 seconds)
./start_av_stack.sh --duration 60 --launch-unity
# Force CV-only mode
./start_av_stack.sh --use-cv
# Use a trained segmentation checkpoint
./start_av_stack.sh --segmentation-checkpoint /path/to/checkpoint.pt
# Force kill existing processes on port 8000
./start_av_stack.sh --force
Option B: Ground Truth Follower (Direct GT Path Following)
# One-command ground truth run (build optional)
./start_ground_truth.sh --track-yaml tracks/oval.yml --duration 60 --speed 8.0
# Constant speed (no GT PID braking)
./start_ground_truth.sh --track-yaml tracks/oval.yml --duration 60 --speed 8.0 --constant-speed
# Randomized start with reproducible seed
./start_ground_truth.sh --track-yaml tracks/oval.yml --random-start --random-seed 50
# Replay-quality GT capture (full stack running, low-overhead logging)
./start_ground_truth.sh --track-yaml tracks/s_loop.yml --duration 60 --strict-gt-pose --stream-sync-policy latest --log-level error
# Diagnosis run (more logs, better temporal pairing across streams)
./start_ground_truth.sh --track-yaml tracks/s_loop.yml --duration 60 --strict-gt-pose --stream-sync-policy aligned --diagnostic-logging
# Promote a recording to canonical golden naming (non-destructive copy by default)
./tools/promote_golden_gt.sh --source data/recordings/recording_YYYYMMDD_HHMMSS.h5 --track sloop --duration 45s --sync-policy latest
Option C: Standalone Unity Player (Automated Workflow)
# Build and run Unity player for a 60s test (no editor interaction)
./start_av_stack.sh --build-unity-player --skip-unity-build-if-clean --run-unity-player --duration 60
Script behavior reference: See docs/SCRIPT_RUNBOOK.md for canonical script definitions and mode defaults.
# Terminal 1 - Bridge Server
python -m bridge.server
# Terminal 2 - AV Stack
python av_stack.py # Data recording enabled by default
# Or Ground Truth Follower
python tools/ground_truth_follower.py --duration 60
Option A: Auto-launch (Recommended)
./start_av_stack.sh --launch-unity --unity-auto-play
Option B: Manual
- Open Unity project (unity/AVSimulation)
- Load the SampleScene scene
- Select your Car GameObject
- In the AV Bridge component → Check "Enable AV Control"
- Press ▶ PLAY
The system will automatically:
- Capture camera frames (30 FPS)
- Run perception model
- Plan trajectory
- Control the vehicle
- Record all data to HDF5 files in data/recordings/
Quick performance overview:
# Analyze latest recording
python tools/analyze/analyze_drive_overall.py --latest
# Analyze specific recording
python tools/analyze/analyze_drive_overall.py data/recordings/recording_YYYYMMDD_HHMMSS.h5
# List available recordings
python tools/analyze/analyze_drive_overall.py --list
Detailed diagnostics:
# Comprehensive analysis with root cause identification
python tools/analyze/analyze_recording_comprehensive.py --latest
Projection/trajectory baseline workflow (curve diagnosis):
# 1) Run GT capture (s_loop example)
./start_ground_truth.sh --track-yaml tracks/s_loop.yml --duration 30 --strict-gt-pose --stream-sync-policy latest --log-level error
# 2) Promote recording to golden tag
./tools/promote_golden_gt.sh --source data/recordings/recording_YYYYMMDD_HHMMSS.h5 --track sloop --duration 30s --sync-policy latest --force
# 3) Save baseline metrics artifact (projection + planner-vs-oracle gap)
# Example output path:
# tmp/analysis/gt_projection_baseline_recording_YYYYMMDD_HHMMSS.json
The baseline JSON should include:
- Right-lane fiducial reprojection error (mean/p95/max pixels; 5m/10m/15m bins)
- Planner-vs-oracle lateral gap stats (mean/p95 abs/max abs at 5m/10m/15m)
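A hypothetical shape for that artifact is sketched below. Both the field names and the numbers are placeholders for illustration, not real results or a schema the analysis tools guarantee.

```json
{
  "recording": "recording_YYYYMMDD_HHMMSS.h5",
  "reprojection_error_px": {
    "5m":  {"mean": 0.0, "p95": 0.0, "max": 0.0},
    "10m": {"mean": 0.0, "p95": 0.0, "max": 0.0},
    "15m": {"mean": 0.0, "p95": 0.0, "max": 0.0}
  },
  "planner_vs_oracle_lateral_gap_m": {
    "5m":  {"mean": 0.0, "p95_abs": 0.0, "max_abs": 0.0},
    "10m": {"mean": 0.0, "p95_abs": 0.0, "max_abs": 0.0},
    "15m": {"mean": 0.0, "p95_abs": 0.0, "max_abs": 0.0}
  }
}
```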
To keep visualization and debugging fundamentals reliable, this project now uses a sync-first workflow:
- Treat the time-alignment contract as the primary quality gate (camera vs trajectory/control timestamp agreement).
- Treat frame-cadence indicators (for example, frame-id deltas) as secondary diagnostics, not primary pass/fail.
- Require deterministic replay parity and acceptance reports before concluding behavior quality.
This ordering avoids conflating projection correctness with temporal misalignment and mirrors production AV debugging practice.
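As an example of what a primary time-alignment check can look like, the sketch below pairs camera and control timestamps from a recording and reports offset statistics. The dataset paths (camera/timestamp, control/timestamp) are assumptions about the recording layout; check data/formats/ for the actual names.

```python
import h5py
import numpy as np

def timestamp_alignment_stats(path,
                              cam_ds="camera/timestamp",
                              ctrl_ds="control/timestamp"):
    """Camera-vs-control timestamp agreement for one recording (dataset names assumed)."""
    with h5py.File(path, "r") as f:
        cam = np.asarray(f[cam_ds])
        ctrl = np.asarray(f[ctrl_ds])
    # For each control sample, find the nearest camera timestamp and measure the gap.
    idx = np.searchsorted(cam, ctrl).clip(1, len(cam) - 1)
    nearest = np.where(np.abs(ctrl - cam[idx - 1]) < np.abs(ctrl - cam[idx]),
                       cam[idx - 1], cam[idx])
    offsets = ctrl - nearest
    return {"mean_s": float(offsets.mean()),
            "p95_abs_s": float(np.percentile(np.abs(offsets), 95)),
            "max_abs_s": float(np.abs(offsets).max())}
```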
See tools/analyze/README.md for all analysis tools.
Start the visualizer server:
cd tools/debug_visualizer
python server.py
The server runs on http://localhost:5001 and serves both the API and the HTML.
Open the visualizer:
# Recommended: Use the Flask server (serves HTML + API from port 5001)
open http://localhost:5001/
# Alternative: Separate static server (set port 8000, then open index.html)
# The visualizer will use API_BASE to reach port 5001 for API calls
cd tools/debug_visualizer && python -m http.server 8000
# Then open http://localhost:8000/index.html
Features:
✅ Phase 1: Frame-Level Diagnostics (Complete)
- Polynomial Inspector: Analyze polynomial fitting for any frame
- Shows recorded vs. re-run detection
- Full system validation (what av_stack.py would do)
- Explains why detections would be rejected
- Provides recommendations for fixes
- On-Demand Debug Overlays: Generate edges, yellow_mask, and combined for ANY frame
- No longer limited to every 30th frame
- Visualize detected points/edges that led to bad polynomial fits
- Frame-by-frame navigation with keyboard controls
- Visual overlays for lane lines, trajectory, and ground truth
- Data side panel showing all frame data
- Export frames as PNG
🚧 Phase 2: Recording-Level Analysis (In Progress)
- Recording Summary tab (overall metrics and health graphs)
- Issues Detection (auto-detect problematic frames and jump to them)
- Trajectory vs Steering Diagnostic (identify which component is failing)
See tools/debug_visualizer/README.md for full details and tools/debug_visualizer/CONSOLIDATION_PLAN.md for the consolidation roadmap.
# Trajectory accuracy analysis
python tools/analyze/analyze_trajectory.py --latest
# Oscillation root cause analysis
python tools/analyze/analyze_oscillation_root_cause.py --latest
# Jerkiness analysis
python tools/analyze/analyze_jerkiness.py --latest
# Perception quality analysis
python tools/analyze/analyze_perception_questions.py --latest
analyze_perception_questions.py now reports:
- Q1-Q7 scored checks (including heading correctness using vehicle/car_heading_deg vs ground_truth/desired_heading)
- Q8 diagnostic-only planner/heading contract check (trajectory/reference_point_heading vs vehicle/heading_delta_deg)
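For reference, a minimal version of the heading-correctness comparison could look like the sketch below. It assumes the two dataset paths named above exist per frame, both in degrees; the actual scoring in analyze_perception_questions.py may differ.

```python
import h5py
import numpy as np

def heading_error_deg(path):
    """Compare recorded vehicle heading against the ground-truth desired heading.

    Assumes per-frame datasets in degrees; errors are wrapped to [-180, 180).
    """
    with h5py.File(path, "r") as f:
        actual = np.asarray(f["vehicle/car_heading_deg"])
        desired = np.asarray(f["ground_truth/desired_heading"])
    n = min(len(actual), len(desired))
    err = (actual[:n] - desired[:n] + 180.0) % 360.0 - 180.0
    return {"mean_abs_deg": float(np.abs(err).mean()),
            "p95_abs_deg": float(np.percentile(np.abs(err), 95))}
```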
av/
├── unity/                        # Unity project files
│   └── AVSimulation/
│       ├── Assets/
│       │   ├── Scripts/          # C# scripts (AVBridge, CarController, etc.)
│       │   ├── Scenes/           # Unity scenes
│       │   ├── Materials/        # Lane marking materials
│       │   └── Prefabs/          # Car prefab with camera
│       └── .unity_autoplay       # Auto-play flag file
├── perception/                   # Perception module
│   ├── inference.py              # Segmentation + CV fallback
│   └── models/                   # Model definitions (checkpoints are gitignored)
├── trajectory/                   # Trajectory planning
│   ├── inference.py              # Trajectory planning inference
│   └── models/                   # Trajectory planning models
├── control/                      # Control stack
│   ├── pid_controller.py         # Pure Pursuit (default) / PID / Stanley
│   └── vehicle_model.py          # Bicycle model
├── bridge/                       # Unity-Python communication
│   ├── server.py                 # FastAPI server
│   └── client.py                 # Unity bridge client
├── data/                         # Data recording and replay
│   ├── recorder.py               # HDF5 data recorder
│   ├── formats/                  # Data format definitions
│   └── recordings/               # HDF5 recording files
├── tools/                        # Analysis and utility tools (see tools/README.md)
│   ├── analyze/                  # Analysis scripts (see tools/analyze/README.md)
│   │   ├── analyze_drive_overall.py           # Primary overall analysis
│   │   ├── analyze_recording_comprehensive.py # Detailed diagnostics
│   │   └── ...                                # Specialized analysis tools
│   ├── debug_visualizer/         # Web-based debug visualizer (see tools/debug_visualizer/README.md)
│   │   ├── server.py             # Visualizer backend server
│   │   ├── index.html            # Visualizer frontend
│   │   ├── visualizer.js         # Visualization logic
│   │   ├── backend/              # Analysis backend modules (Phase 2)
│   │   └── CONSOLIDATION_PLAN.md # Tool consolidation roadmap
│   ├── ground_truth_follower.py  # Ground truth path follower
│   ├── replay_perception.py      # Perception replay tool
│   └── calibrate_perception.py   # Perception calibration
├── tests/                        # Test suite (see tests/README.md)
│   ├── test_control.py           # Control system tests
│   ├── test_trajectory.py        # Trajectory planning tests
│   ├── test_perception_*.py      # Perception tests
│   └── test_integration.py       # Integration tests
├── start_av_stack.sh             # Primary startup script
├── launch_unity.sh               # Unity launcher script
├── av_stack.py                   # Main AV stack integration
└── config/av_stack_config.yaml   # Configuration file
Data recording is enabled by default when running av_stack.py or ground_truth_follower.py. All frames, vehicle state, control commands, and ground truth data are automatically saved to HDF5 files in data/recordings/.
Recording format:
- camera/: Camera frames (images)
- vehicle_state/: Position, speed, heading, etc.
- perception/: Lane detection results
- trajectory/: Trajectory planning results
- control/: Control commands (steering, throttle, brake)
- ground_truth/: Ground truth lane positions (when available)
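A quick way to peek inside a recording with h5py. The group names follow the list above, but the exact dataset names inside each group may differ from this sketch (check data/formats/); the file path is the usual timestamped placeholder.

```python
import h5py

# Inspect a recording's structure and dataset sizes.
with h5py.File("data/recordings/recording_YYYYMMDD_HHMMSS.h5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape} dtype={obj.dtype}")
    f.visititems(show)                       # list every dataset in the file
    print("camera datasets:", list(f["camera"].keys()))  # group from the format above
```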
View recordings:
# List recordings
python tools/list_recordings.py
# Replay recording
python -m data.replay --file data/recordings/recording_YYYYMMDD_HHMMSS.h5
# Run all tests
pytest tests/
# Run with verbose output
pytest tests/ -v
# Run specific test file
pytest tests/test_control.py -v
# Run tests by category (using markers)
pytest tests/ -m unit # Fast unit tests
pytest tests/ -m integration # Integration tests
pytest tests/ -m control # Control system tests
pytest tests/ -m trajectory # Trajectory planning tests
pytest tests/ -m perception # Perception tests
# Run with coverage
pytest tests/ --cov=perception --cov=trajectory --cov=control --cov-report=term-missing
# Run tests and drop into debugger on failure
pytest tests/ --pdb
- Control Tests (-m control): PID controller, steering logic, integral accumulation
- Trajectory Tests (-m trajectory): Reference point calculation, smoothing, bias correction
- Perception Tests (-m perception): Lane detection, coordinate conversion
- Integration Tests (-m integration): End-to-end scenarios, system stability
- Unit Tests (-m unit): Fast, isolated unit tests
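If you add new marker categories, they are typically registered so pytest does not warn about unknown markers. The snippet below is a hypothetical pytest.ini illustrating the markers listed above; the repository's actual pytest configuration may live elsewhere (for example in pyproject.toml) and may differ.

```ini
# Hypothetical marker registration; the repo's real pytest config may differ.
[pytest]
markers =
    unit: fast, isolated unit tests
    integration: end-to-end scenarios and system stability
    control: control system tests
    trajectory: trajectory planning tests
    perception: perception tests
```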
See tests/README.md for comprehensive test documentation.
- Reproduce the bug - Create a minimal test case
- Write a failing test - Test should fail before fix, pass after
- Fix the bug - Address root cause, not symptoms
- Run full test suite - Ensure no regressions: pytest tests/
- Verify in Unity - Test in actual simulation if applicable
Important: Every bug fix must include a test that reproduces the original issue.
Configuration is managed through config/av_stack_config.yaml. Key sections:
- control/: PID gains, steering limits, rate limiting, longitudinal comfort (accel/jerk caps)
- trajectory/: Lookahead distance, smoothing parameters, bias correction, speed planner
- safety/: Emergency stop thresholds, bounds checking
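As a rough illustration of how these sections might be laid out in config/av_stack_config.yaml, see the sketch below. Key names and values here are examples only, not the repository's actual defaults; the authoritative reference is CONFIG_GUIDE.md.

```yaml
# Illustrative structure only; real keys and values are documented in CONFIG_GUIDE.md.
control:
  steering_limit_deg: 30.0
  steering_rate_limit_deg_s: 90.0
  pid: {kp: 0.8, ki: 0.05, kd: 0.1}
  longitudinal_comfort:
    max_accel_mps2: 1.5
    max_jerk_mps3: 1.0
trajectory:
  lookahead_m: 6.0
  smoothing_alpha: 0.3
  bias_correction_m: 0.0
safety:
  emergency_stop_lateral_error_m: 1.5
  bounds_check: true
```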
See CONFIG_GUIDE.md for detailed configuration options (including Longitudinal Comfort S1-M39).
- perception/: Lane detection models and inference
- trajectory/: Path planning algorithms
- control/: Vehicle control (PID with feedforward)
- bridge/: Unity-Python communication layer
- data/: Data recording and replay utilities
- tools/: Analysis and utility tools
Install pre-commit hooks to run tests and linting automatically:
pip install pre-commit
pre-commit install
This will run tests, format code, and check for issues before each commit.
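For reference, a minimal .pre-commit-config.yaml along these lines is sketched below. The repository's actual hook set is not shown here and may differ.

```yaml
# Example only; the repository's actual .pre-commit-config.yaml may differ.
repos:
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
  - repo: local
    hooks:
      - id: pytest
        name: run unit tests
        entry: pytest tests/ -m unit
        language: system
        pass_filenames: false
```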
- Python 3.8+
- PyTorch 1.12+ (required for segmentation model)
- Unity 2021.3 LTS or later
- FastAPI
- NumPy, OpenCV, h5py
See requirements.txt for complete list.
- README.md - This file (project overview and quick start)
- setup_unity.md - Unity setup instructions
- CONFIG_GUIDE.md - Configuration system guide
- docs/README.md - Documentation index
- docs/TODO.md - Active and backlog TODO tracker
- docs/ROADMAP.md - Unified robust full-stack roadmap with layered phases, incremental lane-keeping ladder, status tracking, and promotion gates
- docs/README_STARTUP.md - Detailed startup instructions and troubleshooting
- docs/DEVELOPMENT_GUIDELINES.md - Development best practices and critical lessons learned
- docs/AI_MEMORY_GUIDE.md - AI assistant memory and context guide
- docs/archive/ - Historical analysis and investigation notes (archived)
- tools/README.md - Tools directory documentation (data collection, tuning, diagnostics)
- tools/analyze/README.md - Analysis tools documentation
- tools/debug_visualizer/README.md - Debug visualizer documentation
- tests/README.md - Test suite documentation
See docs/ROADMAP.md for current stages, phases, and promotion gates (single source of truth).
Completed: Unity setup, bridge, recorder, lane detection (segmentation + CV), trajectory planner, control (Pure Pursuit/PID), ground truth following, analysis tools, PhilViz, test suite.
Future: Multi-camera, lidar, radar, sensor fusion, MPC trajectory planning, reinforcement learning.
MIT License
Contributions welcome! Please read docs/DEVELOPMENT_GUIDELINES.md before submitting PRs.