MirageIA

Lire en francais

Intelligent pseudonymization proxy for LLM APIs.

The API never sees your real data — it sees a mirage.

Your app  -->  MirageIA (local proxy :3100)  -->  LLM API (Anthropic / OpenAI)
                │                                    │
                ├─ Detects PII (regex + ONNX)        │
                ├─ Pseudonymizes before sending       │
                └─ Restores in the response  <────────┘

The Problem

When you use Claude, ChatGPT, or any other LLM via API, your data travels in plain text to external servers: names, emails, IP addresses, API keys, phone numbers... This sensitive data is exposed without you knowing.

The Solution

MirageIA sits between your application and the LLM API. It automatically detects sensitive data, replaces it with consistent fake values, and restores the originals in the response.

Original data	What the API receives	What you get back
`jean.dupont@acme.fr`	`alice@example.com`	`jean.dupont@acme.fr` (restored)
`192.168.1.22`	`10.0.84.12`	`192.168.1.22` (restored)
`06 12 34 56 78`	`06 47 91 28 53`	`06 12 34 56 78` (restored)
`sk-abc123def456...`	`sk-xR9mK2pL7wQ4...`	`sk-abc123def456...` (restored)

The LLM works with fake but consistent data — its response is just as relevant, and your data never left your machine.

Quick Start

Install

Download the binary for your OS from GitHub Releases:

# Linux / macOS
curl -sSfL https://github.com/ctardy/mirageia/releases/latest/download/mirageia-linux-x86_64.tar.gz \
  | tar xz -C ~/.local/bin/

# Or from source (requires Rust 1.75+)
git clone https://github.com/ctardy/mirageia.git && cd mirageia && cargo build --release

On Windows, download mirageia-windows-x86_64.zip from the Releases page.

Setup

# The wizard guides you: port, LLM providers, whitelist, shell
mirageia setup

Usage

# Start the proxy
mirageia

# Use Claude Code via the proxy (this session only)
mirageia wrap -- claude

# Monitor requests in real time (in another terminal)
mirageia console

# Web dashboard
# Open http://localhost:3100/dashboard in your browser

Per-session activation — mirageia wrap launches your command with the proxy enabled, without modifying your shell. When the command exits, the proxy is no longer used:

mirageia wrap -- claude          # Claude Code protected
mirageia wrap -- python app.py   # Python script protected
claude                           # Claude Code direct (no proxy)

Temporarily Disable

# Option 1: Passthrough mode (proxy relays without pseudonymizing)
mirageia proxy --passthrough

# Option 2: Stop the proxy — does NOT affect apps launched normally
# Only those launched via `mirageia wrap` go through the proxy

Verification

# Health check
curl http://localhost:3100/health
# -> {"status":"ok","passthrough":false,"pii_mappings":0}

# Test request (requires an Anthropic API key)
curl -X POST http://localhost:3100/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "My email is jean@acme.fr and my IP is 192.168.1.50"}]
  }'

In the MirageIA logs, you will see the detected and pseudonymized PII. The request sent to Anthropic will contain neither the original email nor the IP.

Configuration

MirageIA works without configuration (zero config). To customize, create ~/.mirageia/config.toml:

[proxy]
listen_addr = "127.0.0.1:3100"  # Listening address
log_level = "info"               # debug, info, warn, error
add_header = false               # Add X-MirageIA: active header to requests
fail_open = true                 # true = forward unmasked on error (logs warning); false = return HTTP 502
passthrough = false              # Passthrough mode: relay without pseudonymizing
proxy_token = ""                 # Optional bearer token (leave empty to disable auth)

[detection]
confidence_threshold = 0.75      # Confidence threshold (0.0-1.0)
whitelist = [                    # Terms to never pseudonymize
    "localhost",
    "127.0.0.1",
    "Thomas Edison",
]

Environment variables take precedence over the file:

Variable	Description
`MIRAGEIA_LISTEN_ADDR`	Listening address (e.g., `0.0.0.0:3100`)
`MIRAGEIA_ANTHROPIC_URL`	Anthropic base URL
`MIRAGEIA_OPENAI_URL`	OpenAI base URL
`MIRAGEIA_LOG_LEVEL`	Log level
`MIRAGEIA_PASSTHROUGH`	Enable passthrough mode (any value = enabled)
`MIRAGEIA_PROXY_TOKEN`	Bearer token required on all LLM requests (optional)

Detected PII Types

The regex detector (v1) covers fixed-pattern PII:

Type	Examples	Generated pseudonym
Email	`jean@acme.fr`	`alice@example.com`
IPv4	`192.168.1.50`	`10.0.84.12`
IPv6	`2001:db8::1`	`fd00::a1b2:c3d4`
Phone	`06 12 34 56 78`	`06 47 91 28 53` (format preserved)
Credit card	`4111 1111 1111 1111`	`4892 7631 0458 2173` (Luhn valid)
IBAN	`FR7612345678901234567890`	`FR8398765432109876543210`
API key / token	`sk-abc123def456...`	`sk-xR9mK2pL7wQ4...` (prefix preserved)
Social security #	`1 85 07 75 123 456 78`	`2 91 03 69 847 215 34`

The contextual ONNX detector (v2, in progress) will add detection of person names, postal addresses, and will understand context ("Thomas Edison" in a history lesson = not masked).

Architecture

src/
├── main.rs                  CLI (proxy / setup / detect / wrap / console)
├── lib.rs                   Public modules
├── config/
│   └── settings.rs          AppConfig, TOML + env loading
├── proxy/
│   ├── server.rs            axum handler, full pipeline, dashboard
│   ├── router.rs            Anthropic / OpenAI routing by path
│   ├── client.rs            Upstream HTTP client (reqwest)
│   ├── extractor.rs         JSON extraction/rebuild per provider
│   └── error.rs             Proxy error types
├── detection/
│   ├── regex_detector.rs    PII detector via regex (v1)
│   ├── types.rs             PiiType, PiiEntity, label_to_pii_type
│   ├── model.rs             ONNX model (feature-gated, v2)
│   ├── tokenizer.rs         HuggingFace tokenizer, segmentation
│   ├── postprocess.rs       Softmax, BIO decode, entity merging
│   └── error.rs             Detection errors
├── pseudonymization/
│   ├── generator.rs         Pseudonym generator by type
│   ├── replacer.rs          Text replacement (offsets)
│   ├── depseudonymizer.rs   De-pseudonymization (AhoCorasick)
│   └── dictionaries.rs      Embedded first/last names
├── mapping/
│   ├── table.rs             Bidirectional table (SHA-256 + AES-256-GCM)
│   ├── crypto.rs            Encryption/decryption, zeroization
│   └── error.rs             Mapping errors
└── streaming/
    ├── sse_parser.rs        Parse/rebuild SSE Anthropic/OpenAI
    └── buffer.rs            Buffer for pseudonyms split between tokens

Processing Pipeline

INCOMING REQUEST
    │
    ▼
[JSON Extraction]    <- extractor.rs (Anthropic/OpenAI text fields)
    │
    ▼
[PII Detection]      <- regex_detector.rs (emails, IPs, phones, CC, IBAN, keys)
    │                   + whitelist filtering
    ▼
[Pseudonymization]   <- replacer.rs (descending positions)
    │                   + generator.rs (consistent pseudonyms by type)
    │                   + mapping/table.rs (AES-256-GCM in memory)
    ▼
[Reconstruction]     <- extractor.rs (rebuild JSON)
    │
    ▼
[Forward]            <- client.rs -> upstream API
    │
    ▼
UPSTREAM RESPONSE
    │
    ▼
[De-pseudonymize]    <- depseudonymizer.rs (AhoCorasick, longest-first)
    │                   or buffer.rs (streaming SSE, split between tokens)
    │                   Note: complete tokens are restored. If the LLM
    │                   decomposes a value char-by-char in an array, the
    │                   pseudonymized chars remain — proof the real data
    │                   never reached the API.
    ▼
CLIENT RESPONSE (original data restored for complete tokens)

Tests

# All tests (233)
cargo test

# Unit tests only
cargo test --lib

# E2E tests (proxy + mock upstream)
cargo test --test e2e_proxy

# Specific module tests
cargo test -- detection::regex_detector
cargo test -- mapping::crypto
cargo test -- pseudonymization

Test Coverage

Module	Tests	Coverage
config	6	Default config, TOML parsing, partial, empty, passthrough
proxy/router	8	Anthropic/OpenAI routing, URLs
proxy/extractor	9	JSON extraction/rebuild, content string/array, system
detection/types	8	Labels, thresholds, aliases, display
detection/postprocess	13	Softmax, extraction, merging, multi-token, thresholds
detection/tokenizer	6	Segmentation, overlap, progression
detection/regex_detector	29	Email, IP, phone, CC, IBAN, API key, password, whitelist
detection/model	2 + 7	Models directory, missing files, model manager
detection/validator	18	Luhn, MOD-97 IBAN, secret detection
detection/mod	4	Label map loading
extraction/pdf	3	PDF text extraction
extraction/docx	4	DOCX text extraction
extraction/tests	6	Multipart, nested content
mapping/crypto	6	AES-256-GCM roundtrip, nonces, unicode
mapping/table	9	Bidirectional, concurrent, unique IDs
pseudonymization/generator	13	All PII types, Luhn, format
pseudonymization/replacer	9	Positions, session coherence
pseudonymization/depseudonymizer	6	Roundtrip, longest-first
pseudonymization/fragment_restorer	13	IP octets, CC groups, NSS segments (SPB)
pseudonymization/dictionaries	2	Name pools, uniqueness
streaming/sse_parser	7	Anthropic, OpenAI, DONE, rebuild
streaming/buffer	8	Split pseudonym, flush, phone with spaces
setup/wizard	6	Config generation, shell detection
update/tests	5	Version check, download
e2e	26	Full pipeline, passthrough, SSE, events, dashboard, PDF/DOCX, password
Total	233

Project Status

Component	Status	Notes
HTTP proxy (explicit client config)	Done	axum, Anthropic/OpenAI routing — requires setting `ANTHROPIC_BASE_URL` / `OPENAI_BASE_URL` in the client
Regex PII detection	Done	8 PII types
Reversible pseudonymization	Done	AES-256-GCM mapping
Response de-pseudonymization	Done	Non-streaming + SSE buffer
TOML config + whitelist	Done	~/.mirageia/config.toml
Fail-open	Done	Passthrough on error
Passthrough mode	Done	`--passthrough` / config / env var
Per-session activation	Done	`mirageia wrap -- claude`
Monitoring console	Done	`mirageia console` (real-time SSE)
Web dashboard	Done	`/dashboard` embedded in binary
Docker + deployment	Done	Dockerfile, ops guide, Apache reverse proxy
E2E tests	Done	233 tests (207 unit + 26 e2e)
Contextual ONNX detection	Structured	Code ready (optional feature), ONNX Runtime blocked by MSVC toolchain on Windows; model ~337 MB downloaded separately
Tauri dashboard	Dropped	Removed — dashboard is a web page served by Axum at `/dashboard`

Documentation

	FR	EN
Installation	`docs/fr/installation.md`	`docs/en/installation.md`
Ops Deployment	`docs/fr/deploiement-ops.md`	`docs/en/deployment-ops.md`
Distribution	`docs/fr/distribution.md`	`docs/en/distribution.md`
Contributing	`docs/fr/contribution.md`	`docs/en/contributing.md`
Architecture	`docs/fr/architecture/`	`docs/en/architecture/`
Security	`docs/fr/securite/`	`docs/en/security/`
Technical	`docs/fr/technique/`	`docs/en/technical/`
Research	`docs/recherche/`
Tickets	`docs/tickets/`

License

Free for personal and non-commercial use under the PolyForm Noncommercial 1.0.0 license.

Commercial use (within a company, for revenue-generating activities, or as part of a paid product/service) requires a commercial license. Contact: contact@uitguard.com

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
.github/workflows		.github/workflows
bucket		bucket
dictionaries		dictionaries
docker		docker
docs		docs
src		src
tests		tests
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
config.example.toml		config.example.toml
install.sh		install.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MirageIA

The Problem

The Solution

Quick Start

Install

Setup

Usage

Temporarily Disable

Verification

Configuration

Detected PII Types

Architecture

Processing Pipeline

Tests

Test Coverage

Project Status

Documentation

License

About

Uh oh!

Releases 34

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MirageIA

The Problem

The Solution

Quick Start

Install

Setup

Usage

Temporarily Disable

Verification

Configuration

Detected PII Types

Architecture

Processing Pipeline

Tests

Test Coverage

Project Status

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 34

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages