Improve callsign matching for spelled-out vs. numeric variants

Whisper transcribes callsigns inconsistently depending on how the controller/pilot speaks and how the audio comes through: "United two five", "United 25", and "UAL25" can all refer to the same aircraft. Because each variant is treated as a different callsign, analysis of a single aircraft is split across them, and ADS-B correlation (which matches on callsign) is weakened.

## Goal
Add a normalization pass that canonicalizes each callsign into a consistent form (ICAO airline prefix + flight number, e.g. `UAL25`) **before** the transcript batch is sent to Gemini and before ADS-B correlation.

## Suggested approach
- Map airline telephony names → ICAO prefixes (e.g. `United` → `UAL`, `Speedbird` → `BAW`, `Cathay` → `CPA`).
- Convert spoken digits to numerals (`two five` → `25`, `niner` → `9`).
- Produce a canonical token (e.g. `UAL25`) while preserving the raw transcript text for display.
- Be conservative: when confidence is low or no airline match is found, leave the raw text untouched rather than guessing.
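
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the name `normalize_callsign`, the two lookup tables, and the 1 to 4 digit limit on flight numbers are all assumptions, and a real telephony table would cover far more carriers.

```python
import re
from typing import Optional

# Hypothetical lookup table (assumption): telephony name -> ICAO prefix.
TELEPHONY_TO_ICAO = {"united": "UAL", "speedbird": "BAW", "cathay": "CPA"}
ICAO_CODES = set(TELEPHONY_TO_ICAO.values())

# Spoken digits, including common radiotelephony pronunciations.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "tree": "3",
    "four": "4", "five": "5", "fife": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9", "niner": "9",
}


def normalize_callsign(raw: str) -> Optional[str]:
    """Return a canonical token such as 'UAL25', or None when unsure.

    None signals "no confident match", so the caller keeps the raw text.
    """
    text = raw.strip().lower()

    # Case 1: already in ICAO form, e.g. "UAL25" or "ual 25".
    match = re.fullmatch(r"([a-z]{3})\s*([0-9]{1,4})", text)
    if match and match.group(1).upper() in ICAO_CODES:
        return match.group(1).upper() + match.group(2)

    # Case 2: telephony name followed by digits or digit words.
    tokens = text.split()
    if len(tokens) < 2 or tokens[0] not in TELEPHONY_TO_ICAO:
        return None

    digits = []
    for token in tokens[1:]:
        if token.isascii() and token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            return None  # Unknown word: do not guess.

    number = "".join(digits)
    if not 1 <= len(number) <= 4:  # Assumed flight-number length limit.
        return None
    return TELEPHONY_TO_ICAO[tokens[0]] + number
```

A caller would store the result alongside the original text rather than replacing it, for example `canonical = normalize_callsign(raw) or raw`, so the observation card can still show what Whisper produced. Grouped numbers such as "twenty five" are deliberately left unhandled here and fall through to `None`.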

## Where to look
- `backend/core/batcher.py` (batch assembly + `AIRPORT_GEO` / ADS-B correlation)
- The Gemini prompt assembly path

## Acceptance
- A short unit test covering several spelled-out / numeric / ICAO variants resolving to the same canonical callsign.
- Raw transcript text remains visible on the observation card.
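
One possible shape for the unit test, using the standard library's `unittest`. The `normalize_callsign` defined here is only a stand-in so the snippet runs on its own; in the repository it would be replaced by an import of the real normalizer, whose name and module path are not yet decided.

```python
import unittest
from typing import Optional

# Stand-in (assumption): replace with an import of the real normalizer.
_STUB_TABLE = {
    "united two five": "UAL25",
    "united 25": "UAL25",
    "ual25": "UAL25",
}


def normalize_callsign(raw: str) -> Optional[str]:
    return _STUB_TABLE.get(raw.strip().lower())


class CallsignVariantsTest(unittest.TestCase):
    def test_variants_resolve_to_same_callsign(self):
        variants = ["United two five", "United 25", "UAL25"]
        for raw in variants:
            with self.subTest(raw=raw):
                self.assertEqual(normalize_callsign(raw), "UAL25")

    def test_unknown_airline_is_left_alone(self):
        # No confident match: the caller is expected to keep the raw text.
        self.assertIsNone(normalize_callsign("Mystery two five"))
```

Running `python -m unittest` against the module containing this class executes both cases; `subTest` reports each failing variant separately instead of stopping at the first.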

This maps directly to the "callsign matching isn't perfect yet" limitation in the README — one of the most impactful accuracy fixes available.
