FLPerformance: Foundry Local Model Benchmark Tool

A local application with a web UI for benchmarking multiple small language models (SLMs) running via Microsoft Foundry Local.

Read the full story, How we built FLPerformance, to learn about the architecture decisions, the challenges faced, and how to obtain real-world LLM performance metrics on your local hardware.

Easy Startup Script

Windows users: If you have Node.js installed, run .\START_APP.ps1 to start everything. The script opens two terminals and launches the browser automatically.

Features

Complete Benchmark System: Full end-to-end benchmarking with accurate metrics
Enhanced Visualisations: Performance cards, comparison charts, and radar graphs
Non-blocking Model Loading: Models download and load in the background with real-time status polling
Real-time Progress: Polling-based status updates every two seconds during runs
Pre-test Validation: Test button to verify model inference before benchmarking
Results Export: JSON and CSV export functionality
Hardware Detection: Comprehensive system information capture
Storage System: JSON-based storage with optional SQLite support
Custom Cache Support: Switch between model cache directories via the Cache tab
Multi-Model Comparison: Side-by-side performance analysis with visual insights

Overview

FLPerformance enables you to:

Manage the Foundry Local service using the official JavaScript SDK
Load and benchmark multiple models simultaneously
Run standardised benchmark tests across models
Display clear performance statistics with tables and charts
Export results for analysis

Dashboard showing system status, benchmark run history, and quick actions

Models page for loading, testing, and managing AI models

Benchmarks page with suite selection, model selection, and configuration

Comprehensive results with performance scores, comparison charts, and detailed metrics

Cache management page for switching model directories and viewing cached models

Settings page with system information, API endpoint, and application details

Quick Start

Before You Begin

Required: Install Microsoft Foundry Local first

# Windows
winget install Microsoft.FoundryLocal

# macOS
brew tap microsoft/foundrylocal
brew install foundrylocal

# Or download from: https://aka.ms/foundry-local-installer

Verify installation:

foundry --version
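If you would rather run this check from a Node.js script (for example, as a pre-flight step), the following is a minimal sketch. The helper name getCliVersion is hypothetical and is not part of FLPerformance; it only assumes that the foundry command is on your PATH.

```javascript
const { execFile } = require("node:child_process");

// Hypothetical helper: runs `<command> --version` and resolves with the
// trimmed output, or with null if the command is missing or exits with an error.
function getCliVersion(command) {
  return new Promise((resolve) => {
    execFile(command, ["--version"], { timeout: 5000 }, (error, stdout) => {
      resolve(error ? null : String(stdout).trim());
    });
  });
}

// Usage: a null result suggests Foundry Local is not installed or not on PATH.
// getCliVersion("foundry").then((version) => console.log(version ?? "foundry not found"));
```

The helper never rejects, so a missing CLI can be handled as an ordinary null value rather than as an exception.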

Installation (4 Steps)

Step 1: Navigate to project directory

cd C:\Users\YourUsername\path\to\FLPerformance

Step 2: Install Node.js (if not already installed)

# Windows - Install Node.js LTS
winget install --id OpenJS.NodeJS.LTS --accept-package-agreements --accept-source-agreements

# After installation, RESTART YOUR TERMINAL for PATH updates

macOS:

brew install node

Or download from: https://nodejs.org/

Step 3: Run installation script

# Windows 
.\scripts\install.ps1

# macOS/Linux
chmod +x scripts/install.sh && ./scripts/install.sh

Note: The installation script passes the --no-optional flag, which skips the optional SQLite dependency (it requires native build tools).
Results are saved as JSON files instead, and all features work without SQLite.
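The fallback described in the note can be pictured with a short sketch. This is an illustration of the general "optional dependency" pattern, not the code in storage.js; the function name chooseStorageBackend and the returned object shape are assumptions. Only the package name better-sqlite3 comes from this README.

```javascript
// Sketch only: picks a storage backend depending on whether the optional
// better-sqlite3 module can be loaded. `loader` is injectable so the
// behaviour can be exercised without installing or removing the package.
function chooseStorageBackend(loader = require) {
  try {
    const driver = loader("better-sqlite3");
    return { kind: "sqlite", driver };
  } catch (error) {
    // Module not installed (for example after `npm install --no-optional`).
    return { kind: "json", driver: null };
  }
}

// Reports which backend would be selected in the current environment.
console.log(chooseStorageBackend().kind);
```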

Step 4: Start the application

# Easy Mode - Opens 2 terminals + browser automatically (Windows)
.\START_APP.ps1

# Manual Mode - Starts both servers
npm run dev

Access the Application

Once the server starts, open your browser:

http://localhost:3000

You will see:

Models tab: Add and load AI models
Benchmarks tab: Run performance tests
Results tab: View comparison charts
Cache tab: Switch to custom model cache directories

First Time Setup (In the UI)

Click Models, then Initialise Foundry Local (one-time setup)
Click Add Model and select phi-3-mini-4k-instruct
Click Load Model (downloads roughly 2 GB; the model loads in the background while you see real-time status)
Go to Benchmarks, select your model, and click Run Benchmark
View results in the Results tab
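The background loading in the Load Model step relies on status polling. A minimal sketch of such a loop is shown below. The status strings ("downloading", "loading", "loaded", "error") and the function name pollUntilSettled are hypothetical, and the two-second default is borrowed from the progress interval mentioned in the feature list; the real UI may differ.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls getStatus() repeatedly until it reports a terminal state
// ("loaded" or "error") or the attempt budget is exhausted.
async function pollUntilSettled(getStatus, { intervalMs = 2000, maxAttempts = 150 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const status = await getStatus();
    if (status === "loaded" || status === "error") return status;
    await sleep(intervalMs);
  }
  return "timeout";
}

// Simulated status source standing in for a real (hypothetical) status endpoint.
const fakeStatuses = ["downloading", "loading", "loaded"];
pollUntilSettled(async () => fakeStatuses.shift(), { intervalMs: 10 })
  .then((status) => console.log("final status:", status)); // final status: loaded
```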

Custom Models (Optional)

Use the Cache tab to switch the Foundry cache directory
Point to directories containing custom ONNX models
Custom models appear in the Models dropdown with a wrench badge
Benchmark custom models in the same way as catalogue models

Alternative: Manual Installation

If the automated installation script does not work, follow these manual steps:

Required Software

Microsoft Foundry Local
- Download from: https://aka.ms/foundry-local-installer
- Verify installation: foundry --version
- Note: Foundry Local CLI must be in your PATH
Node.js & NPM
- Node.js v18 or higher
- NPM v9 or higher
- Download from: https://nodejs.org/
- Verify: node --version and npm --version
System Requirements
- Windows 10/11, macOS, or Linux
- Minimum 16 GB RAM (32 GB+ recommended for multiple models)
- GPU with CUDA support (optional but recommended)
- Adequate disk space for model storage (varies by model, typically 5-50 GB per model)

Installation Steps

1. Install Dependencies

# Skip optional SQLite (requires build tools)
npm install --no-optional

# Install frontend dependencies
cd src/client
npm install
cd ../..

# Create results directory
mkdir results

Want SQLite database support? Install Visual Studio Build Tools first:

# Windows only - needed for better-sqlite3
winget install Microsoft.VisualStudio.2022.BuildTools --silent --override "--wait --passive --add Microsoft.VisualStudio.Workload.VCTools"

# Then install with optional dependencies
npm install

# Create results directory
mkdir results

2. Start the Application

# Development mode (with hot reload)
npm run dev

Once both servers are running, the application is available at:

Frontend UI: http://localhost:3000
Backend API: http://localhost:3001

Using the Application

Open the UI at http://localhost:3000
Navigate to the Models tab
Click Initialise Foundry Local to start the service
Click Add Model
Select a model from the available Foundry Local catalogue (for example, phi-3-mini-4k-instruct)
Click Load Model to download (if needed) and load the model into memory

Note: Foundry Local uses a single service instance that can load multiple models simultaneously. Models are differentiated by their model ID when making inference requests.

Run Your First Benchmark

Navigate to the Benchmarks tab
Select the default benchmark suite
Choose one or more models to benchmark
Configure settings (iterations, concurrency, and so on)
Click Run Benchmark
Watch live progress as tests execute
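To illustrate how the iterations and concurrency settings interact, here is a hedged sketch of a concurrency-limited runner. It is not the benchmark engine in benchmark.js; runWithConcurrency and the result shape are assumptions made for the example. Each task is a function that returns a promise, and failures are recorded rather than thrown so that one failed request does not abort the run.

```javascript
// Runs tasks with at most `concurrency` in flight at any time.
// Results keep the order of the input tasks.
async function runWithConcurrency(tasks, concurrency) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const index = next;
      next += 1;
      try {
        results[index] = { ok: true, value: await tasks[index]() };
      } catch (error) {
        results[index] = { ok: false, error: String(error) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Four placeholder "iterations" executed two at a time.
const demoTasks = [1, 2, 3, 4].map((n) => async () => n * 10);
runWithConcurrency(demoTasks, 2).then((results) => {
  console.log(results.map((r) => r.value));
});
```

Raising concurrency increases load on the model service, which is why the troubleshooting advice for timeouts is to lower it.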

Viewing Results

Navigate to the Results tab
View comparison tables and charts
Filter by run, model, or benchmark type
Export results as JSON or CSV
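If you post-process an exported JSON file yourself, converting it to CSV takes only a few lines. The sketch below is a generic converter, not the application's exporter; the field names model, tps, and ttftMs are hypothetical placeholders for whatever your export contains.

```javascript
// Quotes a value when it contains a comma, a quote, or a line break.
function escapeCsv(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds a CSV string with a header row from an array of flat objects.
function toCsv(rows, columns) {
  const header = columns.map(escapeCsv).join(",");
  const lines = rows.map((row) => columns.map((c) => escapeCsv(row[c])).join(","));
  return [header, ...lines].join("\n");
}

// Hypothetical rows; real exports may use different field names.
const rows = [
  { model: "phi-3-mini-4k-instruct", tps: 42.5, ttftMs: 200 },
  { model: "custom, local build", tps: 30, ttftMs: 350 },
];
console.log(toCsv(rows, ["model", "tps", "ttftMs"]));
```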

Project Structure

FLPerformance/
├── src/
│   ├── server/              # Backend API
│   │   ├── index.js         # Express server entry point
│   │   ├── orchestrator.js  # Foundry Local service orchestration
│   │   ├── benchmark.js     # Benchmark engine
│   │   ├── cacheManager.js  # Model cache management (filesystem-based)
│   │   ├── storage.js       # Results storage (JSON + SQLite)
│   │   └── logger.js        # Structured logging
│   └── client/              # Frontend UI (React + Vite)
│       └── src/
│           ├── pages/       # Page views
│           └── utils/       # Client utilities
├── benchmarks/
│   └── suites/
│       └── default.json     # Default benchmark suite definition
├── docs/
│   ├── architecture.md      # System architecture
│   ├── api.md               # REST API reference
│   ├── setup.md             # Setup documentation
│   ├── BENCHMARK_GUIDE.md   # Troubleshooting guide
│   ├── QUICK_REFERENCE.md   # Commands and code patterns cheat sheet
│   ├── TESTING_CHECKLIST.md # Comprehensive test cases
│   ├── VALIDATION_STEPS.md  # Validation procedures
│   └── images/              # Screenshots and diagrams
├── scripts/
│   └── helpers/            # Utility scripts
├── results/
│   └── example/            # Example benchmark results
├── package.json
└── README.md

Key Features

Model & Service Management

Unified service management using foundry-local-sdk
Add or remove models from the Foundry Local catalogue
Load multiple models simultaneously in a single service
Custom Model Support: Benchmark custom ONNX models from alternate cache directories via the Cache tab
Monitor model health and status in real time
Automatic model download and caching

Benchmark Suite

Throughput (TPS): Tokens generated per second (overall)
Latency: Time to first token (TTFT), time per output token (TPOT), and end-to-end completion time
Generation Speed (GenTPS): Token generation rate after the first token (1000/TPOT, with TPOT in milliseconds)
Percentile Metrics: P50, P95, and P99 latency measurements for reliability analysis
Performance Scoring: 0-100 score based on throughput, latency, and reliability
Stability: Error rate and timeout tracking
Resource Usage: CPU, RAM, and GPU utilisation (platform-dependent)
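The timing metrics above can be derived from three timestamps and a token count. The sketch below shows one common way to do so. Only the GenTPS relationship (1000/TPOT) is taken from this README; the TPOT definition (average gap between tokens after the first) and the function name computeTimingMetrics are assumptions, and the benchmark engine may calculate these values differently.

```javascript
// All timestamps are in milliseconds; outputTokens is the number of generated tokens.
function computeTimingMetrics({ startMs, firstTokenMs, endMs, outputTokens }) {
  const ttftMs = firstTokenMs - startMs; // time to first token
  const totalMs = endMs - startMs;       // end-to-end completion time
  // Assumed definition: average gap between tokens after the first one.
  const tpotMs = outputTokens > 1 ? (endMs - firstTokenMs) / (outputTokens - 1) : null;
  const genTps = tpotMs ? 1000 / tpotMs : null;                    // GenTPS = 1000 / TPOT
  const tps = totalMs > 0 ? (outputTokens * 1000) / totalMs : null; // overall throughput
  return { ttftMs, tpotMs, genTps, tps, totalMs };
}

// 51 tokens, first token after 200 ms, finished after 1200 ms.
console.log(computeTimingMetrics({ startMs: 0, firstTokenMs: 200, endMs: 1200, outputTokens: 51 }));
// { ttftMs: 200, tpotMs: 20, genTps: 50, tps: 42.5, totalMs: 1200 }
```

Note that TPS includes the wait for the first token while GenTPS does not, so GenTPS is normally the higher of the two.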

Results & Comparison

Performance Score Cards: Visual 0-100 ratings for each model
"Best Model For..." Cards: Automatic recommendations for throughput, latency, reliability, and TTFT
Side-by-side Comparison Table: Detailed metrics with colour-coded scores
Interactive Charts:
- Throughput comparison (TPS)
- Latency comparison (P50/P95/P99)
- Generation performance (TTFT, TPOT, GenTPS)
- Performance radar chart showing multidimensional analysis
Detailed Results Table: Per-scenario breakdowns with all metrics
Export Options: JSON and CSV export for further analysis
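For readers who want to reproduce the P50/P95/P99 figures from exported latencies, here is a small sketch that uses the nearest-rank method. Which percentile method FLPerformance itself uses is not stated here, so treat the choice of method as an assumption; other methods (for example, linear interpolation) give slightly different values on small samples.

```javascript
// Nearest-rank percentile: the smallest value such that at least p percent
// of the samples are less than or equal to it. Expects 0 < p <= 100.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

const latenciesMs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
console.log(percentile(latenciesMs, 50)); // 50
console.log(percentile(latenciesMs, 95)); // 100
console.log(percentile(latenciesMs, 99)); // 100
```

With only ten samples, P95 and P99 both land on the slowest request, which is one reason to raise the iteration count when tail latency matters.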

Configuration

Default settings can be modified in the Settings tab:

Default iterations per benchmark
Concurrency level
Request timeout values
Results storage path
Streaming mode (if supported)

Architecture

FLPerformance uses the official foundry-local-sdk JavaScript package to manage the Foundry Local service:

Single Service Instance: One Foundry Local service handles all models
Multiple Loaded Models: Models are loaded on-demand and run simultaneously
OpenAI-Compatible API: Standard OpenAI client for inference requests
Model Differentiation: Models are identified by their model ID in API calls
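As an illustration of the last two points, the sketch below builds a standard OpenAI-style chat completion request in which the model field selects the loaded model. The base URL is an assumption: pass whatever endpoint your Foundry Local service reports (the example value is a placeholder), and note that the helper names buildChatRequest and sendChat are hypothetical. The sketch uses the global fetch available in Node.js v18 and later.

```javascript
// Builds the URL and fetch options for an OpenAI-compatible chat completion.
// `baseUrl` is assumed to already include the API prefix (for example ".../v1").
function buildChatRequest(baseUrl, modelId, prompt, { maxTokens = 128 } = {}) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: modelId, // the model ID is what distinguishes loaded models
        messages: [{ role: "user", content: prompt }],
        max_tokens: maxTokens,
      }),
    },
  };
}

// Sends the request and returns the text of the first choice.
async function sendChat(baseUrl, modelId, prompt) {
  const { url, options } = buildChatRequest(baseUrl, modelId, prompt);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const data = await response.json();
  return data.choices[0].message.content;
}

// Placeholder endpoint; replace with the endpoint of your own service.
// sendChat("http://localhost:8080/v1", "phi-3-mini-4k-instruct", "Hello").then(console.log);
```

Because every request names its model, the same function can target two loaded models in turn simply by changing the modelId argument.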

See Architecture Documentation for details.

Troubleshooting

Service fails to start

Ensure Foundry Local is installed: foundry --version
Verify Foundry Local CLI is in your PATH
Check that port 8080 is available (default Foundry Local port)
View logs in the Models tab for specific error messages
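The port check in the list above can be scripted. The following sketch tries to bind a temporary TCP server to the port and reports whether that succeeded; the function name isPortFree is hypothetical, and binding to 127.0.0.1 only is an assumption about how the service listens.

```javascript
const net = require("node:net");

// Resolves true if a temporary server can bind to the port, false otherwise
// (for example when another process is already listening on it).
function isPortFree(port, host = "127.0.0.1") {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once("error", () => resolve(false));
    server.once("listening", () => server.close(() => resolve(true)));
    server.listen(port, host);
  });
}

// Usage: check the default Foundry Local port mentioned above.
// isPortFree(8080).then((free) => console.log(free ? "port 8080 is free" : "port 8080 is in use"));
```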

Model fails to load

Verify sufficient disk space for model download
Check network connectivity for first-time downloads
Ensure adequate RAM for model size
Try manually loading with Foundry Local CLI: foundry model run <model-name>

Benchmark timeouts

Increase timeout values in Settings
Reduce concurrency level
Check system resource availability (RAM, GPU memory)
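To make the role of the timeout setting concrete, here is a minimal sketch of a request wrapper that gives up after a fixed time. It is an illustration only, not the application's implementation; withTimeout is a hypothetical name. The wrapped function receives an AbortSignal so that a fetch call can be cancelled when the limit is reached.

```javascript
// Runs promiseFactory(signal) and rejects if it has not settled within timeoutMs.
function withTimeout(promiseFactory, timeoutMs) {
  const controller = new AbortController();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`Timed out after ${timeoutMs} ms`));
      controller.abort(); // lets the underlying request stop its work
    }, timeoutMs);
  });
  return Promise.race([promiseFactory(controller.signal), timeout])
    .finally(() => clearTimeout(timer));
}

// Simulated slow request (100 ms) with a 20 ms limit.
withTimeout(() => new Promise((resolve) => setTimeout(resolve, 100)), 20)
  .catch((error) => console.log(error.message)); // Timed out after 20 ms
```

A timed-out request would typically be counted towards the error and timeout tracking rather than retried silently, so that the stability figures stay honest.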

Test Models Before Benchmarking

Use the Test button in the Models tab to verify that inference works
A successful test indicates that the model is ready for benchmarking
The test validates both model loading and the inference response
It is a quick way to catch configuration issues early

Installation Issues

Run the appropriate installation script (install.ps1 or install.sh) for detailed diagnostics
Check Quick Start Guide for common installation issues
Verify Node.js version: node --version (must be v18+)
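The version requirement can also be checked programmatically, for example at the top of a helper script. This is a small sketch; meetsMinimumNode is a hypothetical name, and the default of 18 mirrors the requirement stated above.

```javascript
// Returns true when the version string (with or without a leading "v")
// has a major version of at least minimumMajor.
function meetsMinimumNode(versionString, minimumMajor = 18) {
  const match = /^v?(\d+)\./.exec(versionString);
  return match !== null && Number(match[1]) >= minimumMajor;
}

if (!meetsMinimumNode(process.version)) {
  console.error(`Node.js v18 or higher is required, found ${process.version}`);
}
```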

Documentation

For more detailed information, see:

Quick Start Guide - Comprehensive getting started guide
Quick Reference - Commands and code patterns cheat sheet
Architecture Documentation - System design and SDK integration
API Reference - REST API endpoint documentation
Setup Guide - Detailed installation and configuration
Benchmark Guide - Troubleshooting and testing guide
Testing Checklist - Comprehensive test cases

Support

For issues or questions:

Check the documentation in /docs
Review logs in the UI under each service
Examine results in the /results directory

License

MIT License
