Lexora Academy

A full-stack language learning ecosystem for English, Ukrainian, Greek, and Polish — built on Odoo 18 Community with spaced repetition science, AI-powered vocabulary intelligence, real-time PvP duels, and a Chrome Extension that turns the entire web into a classroom.

1. Concept

Language acquisition research shows that vocabulary sticks when learners encounter words in authentic contexts — not flashcard drills in isolation. Lexora is built around this principle:

Immersion-first capture: the Chrome Extension watches every page you visit. See an unknown word on YouTube, Netflix, or any website? One click saves it with full sentence context, automatic translation, and LLM-generated enrichment.
Spaced repetition science: the SM-2 algorithm schedules each review for the point where you are about to forget the word.
Social pressure through competition: PvP duels with real opponents or a bot put your vocabulary under fire and award XP that feeds a visible leaderboard.
Ecosystem integration: the same word you saved from a YouTube subtitle shows up in your morning new-tab practice card, your grammar exercises, and as a distractor option in an opponent's PvP round.

The entire stack runs on a single low-resource VPS. No cloud GPU, no per-request API fees, no external databases — just Docker Compose on a CPU-only Linux server.

2. Feature Catalogue

Core Learning

Feature	Details
Vocabulary manager	Add words, phrases, and sentences in EN/UK/EL; automatic dedup via normalisation pipeline; sharing toggle
Auto-translation	On every save: async `deep_translator` (Google Translate / MyMemory fallback) for all learning languages
LLM enrichment	On-demand: Qwen2.5-1.5B Q4_K_M via `llama-cpp-python`; synonyms, antonyms, 3–7 example sentences, explanation — always in the source language
Anki import	`.apkg` + `.txt` formats; auto field-mapping; Zstd-compressed modern decks; embedded audio extraction; persistent dedup import log
Audio	Browser mic recording + Microsoft Edge TTS (online, zero-RAM, en/uk/el/pl with `pl-PL-ZofiaNeural`); Whisper `base` STT transcription; all stored in Odoo filestore
Spaced repetition	SM-2 algorithm; `/my/practice` flashcard portal; due-card counter on portal home
PDF export	Printable cheat sheets: personal vocabulary, Gold Vocabulary by CEFR level, Grammar sections
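
The catalogue names SM-2 without spelling out the update rule, so here is a minimal sketch of the classic published algorithm. It assumes the original 0..5 quality scale; how Lexora maps its own grades onto that scale is not stated in this README, and `CardState` and `sm2_review` are illustrative names rather than the project's actual model or method.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 easiness factor, never below 1.3


def sm2_review(state: CardState, quality: int) -> CardState:
    """Apply one review using the classic SM-2 quality scale (0..5)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be in 0..5")
    if quality < 3:
        # Failed recall: restart the sequence and leave the ease untouched.
        return CardState(repetitions=0, interval_days=1, ease=state.ease)
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        # Assumption: the interval grows by the ease held before this update.
        interval = round(state.interval_days * state.ease)
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    return CardState(state.repetitions + 1, interval, max(1.3, state.ease + delta))
```

A perfect answer raises the ease by 0.1, a quality-4 answer leaves it unchanged, and a quality-3 answer lowers it by 0.14, which is what spreads easy and hard words apart over time.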

Community & Social

Feature	Details
Posts & articles	Draft → moderator review → publish flow; rich-text body; @mention comments
Public channels	Language-specific discuss channels (English / Ukrainian / Greek / Polish); visible to all registered users
Private DMs	1-to-1 chat initiated from user profile; "Save to My List" inline popup from any message
Copy-to-list	Select text in any post or chat message → floating popup → creates `language.entry` + auto-queues translation

Gamification

Feature	Details
XP system	Earned from practice reviews (5/10/15/20 XP by grade), duel wins, grammar practice, sentence builder
Levels	`level = 1 + floor(sqrt(xp / 50))`, capped at 20; displayed as a badge on the leaderboard
Daily streaks	Consecutive-day learning streaks; resets on missed day; frozen by Streak Freeze shop item
XP Shop	Spend XP on: Streak Freeze (50 XP), Profile Frame (100 XP), Double XP Booster (80 XP / 5 reviews)
Leaderboard	Top-20 by XP, paginated; current-user highlight; language-pair filter
PvP arena	Real-time async word duels; matchmaking by language pair; Lexora Bot opponent; 10-round battles; XP stake
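
The level formula in the table can be checked with a few lines of Python. This is a sketch of the stated formula only; the function and constant names are illustrative, and negative XP is rejected as an assumption.

```python
import math

XP_DIVISOR = 50  # from the formula: level = 1 + floor(sqrt(xp / 50))
MAX_LEVEL = 20   # cap stated in the table


def level_for_xp(xp: int) -> int:
    """Return the leaderboard level for a non-negative integer XP total."""
    if xp < 0:
        raise ValueError("xp must be non-negative")
    # isqrt(xp // 50) equals floor(sqrt(xp / 50)) for non-negative integers
    # and avoids floating-point rounding near the thresholds.
    return min(MAX_LEVEL, 1 + math.isqrt(xp // XP_DIVISOR))
```

Under this formula level n starts at 50 × (n − 1)² XP, so level 2 needs 50 XP, level 3 needs 200 XP, and the level-20 cap is reached at 18,050 XP.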

Practice Modes

Feature	Details
Grammar Pro	110 cloze-test exercises (EN A1–B2 + Greek A1–A2); instant green/red feedback; CEFR / category filters
Sentence Builder	Word-ordering game using grammar dataset sentences; click-to-order tiles; XP reward
AI Roleplay	6 scenarios (café, job interview, doctor, hotel, airport, market); LLM native speaker with inline grammar corrections; conversation history persisted
AI Speaking Coach	Browser-mic recording → Faster-Whisper sync transcription → Qwen2.5-1.5B feedback (corrections / synonyms / improved version). 90 s soft cap; works in en/uk/el/pl; sessions persisted at `/my/speaking/<id>`
Lexora Writer (browser)	Floating "L" FAB on every focused `<textarea>` / `[contenteditable]` across the web. One click → grammar fixes + polished version + Apply-to-text. Compatible with React/Vue controlled inputs; strict eligibility skips passwords, search, code editors
Slang & Idiom Explainer (browser)	"💡 Explain Slang/Idiom" button in the Quick Look + YouTube subtitle overlays. Classifies the phrase (idiom / slang / phrasal verb / literal) and renders figurative + literal meaning + a usage example in the user's chosen native language
Webpage Shadowing (browser)	"🎤 Practice Pronunciation" button in the Quick Look + YouTube overlays. ▶ Play Original streams Edge TTS for the selected sentence; click-to-toggle 🎙 Start/Stop Recording captures the user's voice on a `chrome.offscreen` document (mic permission granted once per extension). Whisper transcribes; deterministic Python word-diff scores accuracy and flags missed/mispronounced words; LLM writes the localised feedback line
Phrasebook	6 tourist kits × ~15 phrases × 3 languages; one-click "Practice in Roleplay"
Idioms Hub	100+ phrasal verbs (EN) + idioms (UK/EL); flip-card UI; save-to-vocabulary button

Library & Tools

Feature	Details
AI Translator	Google-Translate-style `/translator` page; en↔uk↔el; "Add to Vocabulary" CTA
Gold Vocabulary	3,184 most common English words with CEFR level, POS, UK + EL translations; tabbed by level
Grammar Encyclopedia	6 sections: 12 tenses, 200 irregular verbs, articles, conditionals, modals, passive/reported speech

3. The Browser & Mobile Ecosystem (M22–M37)

The Chrome Extension is the centrepiece of the immersion strategy. It turns every browser tab into a capture and practice surface.

M22 — Companion Extension Scaffold

A Manifest V3 Chrome Extension with a glassmorphism popup that lets users save vocabulary without leaving the current tab.

Glassmorphism popup: type a word, select language (EN/UK/EL), optionally add context and a translation, click Add — the entry lands in Lexora with translation auto-queued.
Options page: configure the Lexora server URL (default http://localhost:5433).
Session bridge: the popup reads the Odoo session cookie via chrome.cookies.get and forwards it as X-Lexora-Session-Id so the Odoo API controller recognises the user without a CORS/SameSite issue.
Odoo API endpoint POST /lexora_api/add_word: auth='none' with manual session resolution; returns {"status":"ok","entry_id":N} or {"status":"duplicate"}.

M23 — Contextual Capture

Right-click any selected text on any page → "Add to Lexora" context menu item.

The background service worker captures the surrounding sentence by walking the DOM text node containing the selection and splitting on .!? boundaries.
The word and context are posted to /lexora_api/add_word directly from the background script (no CORS restrictions for background fetch).
A glassmorphism toast notification slides in from the bottom-right with a shrinking progress bar confirming the save (✓), duplicate (=), or error (!).
Opening the popup immediately after shows the word pre-filled from the context-menu capture via chrome.storage.session.
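
The sentence-boundary rule described above can be sketched as follows. The extension itself is JavaScript; this Python version only models the splitting logic, and `sentence_around` is a hypothetical name. It assumes the selection offsets are already known within the text node's content.

```python
import re

_SENTENCE_END = re.compile(r"[.!?]")


def sentence_around(text: str, sel_start: int, sel_end: int) -> str:
    """Return the sentence of `text` that contains text[sel_start:sel_end]."""
    # Walk left: the sentence starts right after the last terminator
    # that appears before the selection.
    start = 0
    for match in _SENTENCE_END.finditer(text, 0, sel_start):
        start = match.end()
    # Walk right: the sentence ends at the first terminator after the selection.
    match = _SENTENCE_END.search(text, sel_end)
    end = match.end() if match else len(text)
    return text[start:end].strip()
```

A plain split on .!? will cut at abbreviations such as "e.g.", so the captured context is best treated as approximate.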

M24 — YouTube Subtitle Integration

Every word in a YouTube subtitle track becomes clickable.

How it works:

A MutationObserver watches .ytp-caption-window-container for subtitle DOM changes.
On each new subtitle line, each word is wrapped in a <span class="lx-word"> element.
Clicking a word pauses the video and opens a glassmorphism definition overlay positioned adjacent to the clicked word.
The overlay fetches GET /lexora_api/define?word=X&lang=Y:
- First checks the user's own saved translations.
- If none found: calls the Translation Service's /translate endpoint synchronously (live translation, not persisted — user controls persistence via "Add").
- Returns definition + live badge if translation was on-the-fly.
"Add to Vocabulary" in the overlay saves with source_url set to <youtube_url>#t=<timestamp> so the entry links back to the exact moment.
Retry button: if the definition lookup times out (5 s), a retry button re-runs the full lookup cycle.

Additionally, a Quick Look overlay (Shadow DOM, fully isolated CSS) activates on any text selection ≥ 2 characters across all pages. A floating "L" icon appears at the right edge of the selection; clicking it opens the same definition overlay with language auto-detection via Unicode block ranges (Cyrillic → uk, Greek → el, Polish-diacritics [ąćęłńóśźż] → pl, else en).
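
A minimal Python sketch of the Unicode-range detection order described above (Cyrillic, then Greek, then Polish diacritics, else English). The extension implements this in JavaScript; the exact code-point ranges, the inclusion of upper-case diacritics, and the function name are assumptions made for illustration.

```python
POLISH_DIACRITICS = set("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ")  # upper-case forms are an assumption


def _has_range(text: str, lo: int, hi: int) -> bool:
    """True if any character of `text` falls inside the code-point range."""
    return any(lo <= ord(ch) <= hi for ch in text)


def detect_language(text: str) -> str:
    """Guess en / uk / el / pl from the scripts present in `text`."""
    if _has_range(text, 0x0400, 0x04FF):  # Cyrillic block
        return "uk"
    if _has_range(text, 0x0370, 0x03FF) or _has_range(text, 0x1F00, 0x1FFF):
        return "el"  # Greek and Coptic, Greek Extended
    if any(ch in POLISH_DIACRITICS for ch in text):
        return "pl"
    return "en"
```

The heuristic is cheap but coarse: any Cyrillic text is treated as Ukrainian, and a Polish sentence without diacritics falls through to English.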

M25 — Premium New Tab Dashboard

Replaces the browser's default new tab page with a Lexora vocabulary card.

Animated dark gradient background with two floating orbs (same design language as the Lexora portal hero).
Live clock (hours:minutes, updates every second).
Personalised greeting: fetches the user's name from /lexora_api/whoami; "Good morning / afternoon / evening, {first name}".
Daily vocabulary card: random entry from the user's own vocabulary with at least one completed translation (Priority 1), falling back to a random idiom (Priority 2), or an empty state prompting the user to add words.
The card shows: source word, source language flag, all completed translations with flags, and a "Practice →" CTA linking to /my/practice.
Refresh button: fetches a different card without reloading the tab.
Disable override: a toggle in the extension options restores Chrome's native new tab page without uninstalling the extension.
Authentication-aware: if the user is logged out, the card shows "Sign in to Lexora" and links to the portal.

M27 — Review in the Wild

Every webpage becomes a passive vocabulary review surface.

Automatic highlighting: the content script runs a single-pass TreeWalker over every text node via requestIdleCallback. Words that exist in the user's vocabulary receive a coloured dotted underline (border-bottom: 2px dotted) colour-keyed by SRS state: indigo = due for review, green = in learning, amber = new.
SRS-aware tooltip: hovering a highlighted word shows a glassmorphism card with the word, SRS age ("Reviewed 3 days ago"), and — simultaneously — the Ukrainian 🇺🇦, Greek 🇬🇷, and Polish 🇵🇱 translations rendered side by side.
Multi-language support: GET /lexora_api/get_learned_words returns a translations: {"uk": "...", "el": "..."} dict. The content script stores data-trans-uk / data-trans-el attributes on each <span> so the tooltip renders both translations without an extra network call.
15-minute local cache: word list is fetched once and stored in chrome.storage.local with a generated_at timestamp. Cache is automatically invalidated when the user adds a new word via the popup or context menu.
SPA-safe: a MutationObserver on document.body (debounced 500 ms) re-highlights after React/Vue/Angular route changes without thrashing the DOM.
YouTube safety: subtitle spans (lx-yt-word) are excluded from the walker so subtitle highlighting and the M24 Quick Look overlay are never double-applied.

M28 — AI Grammar Explainer

One click produces a 2-sentence linguistic explanation of any selected phrase, powered by the local Qwen2.5-1.5B model.

"Explain Grammar" button appears in both the global Quick Look overlay (any webpage text selection) and the YouTube subtitle word-click overlay.
LLM endpoint: POST /explain-grammar on the LLM service (FastAPI sync); max_tokens=150, temperature=0.3, repeat_penalty=1.1. System prompt requests a 2-sentence linguistics explanation in the same language as the input phrase.
Odoo proxy: POST /lexora_api/explain_grammar forwards to the LLM service with a 60-second timeout (same synchronous pattern as the /roleplay proxy).
Sentence-length support: _QL_MAX_LEN raised to 1000 characters so full sentences can be selected for grammar analysis (not just individual words).
Draggable overlays: both the Quick Look card (Shadow DOM) and the YouTube overlay (page DOM) are draggable by their header bars. Viewport-clamped repositioning; the YouTube overlay converts from bottom/transform to pure top/left positioning on first drag so delta arithmetic is clean.
Scrollable content: a flex-column sandwich layout (header → scroll body → footer) with !important on all structural flex/overflow properties survives YouTube's aggressive stylesheet overrides.
Latency UX: button shows "Explaining…" immediately; overlay stays open so the user can read translations while the model generates (~10–40 s on E5-2680v2 CPU).

M31 — Lexora Writer (Active Writing Assistant)

A floating "L" button appears beside every focused <textarea> and [contenteditable] element on every webpage — Reddit comment boxes, Gmail compose, GitHub PR descriptions, Notion, Odoo backend long-text fields. One click sends the field's value to the LLM and replaces it with a polished version.

Strict eligibility filter: skips [type=password], [role="search"] ancestors, code editors (Monaco / CodeMirror / ACE / github.dev / replit / codesandbox), login/signup forms, and our own Shadow-DOM overlays. False positives erode trust faster than false negatives erode utility.
Apply-to-text uses the canonical "native input setter" pattern, Object.getOwnPropertyDescriptor(HTMLTextAreaElement.prototype, 'value').set.call(input, improved), to bypass React's wrapped setter on controlled inputs. It then dispatches InputEvent('input', {bubbles: true, inputType: 'insertReplacementText'}) and Event('change', {bubbles: true}) so React/Vue/Svelte/Solid all pick up the change. Verified on Reddit's React-controlled textarea (the character counter visibly updates after Apply).
Server-side safety net: /analyze-writing synthesises a catch-all correction entry whenever the LLM returns a polished improved text but an empty corrections array, so the user always sees a documented reason for the text change (ADR-031).
Privacy disclosure in the popup footer: "Text is sent to your Lexora server for analysis." The Writer is proactive (FAB on every field), so the disclosure is non-optional.
Options-page toggle to disable the FAB globally; live-hides via chrome.storage.onChanged without refreshing open tabs.

M32 — Slang & Idiom Explainer

A new "💡 Explain Slang/Idiom" button alongside the M28 "Explain Grammar" button in both the Quick Look (any webpage) and YouTube subtitle overlays. Sends the selected phrase to POST /explain-slang and renders the response in an amber-themed scrollable block.

Five-key JSON contract: {kind, figurative_meaning, literal_meaning, example, confidence}. kind ∈ {idiom, slang, phrasal_verb, literal, unknown}; confidence ∈ {high, medium, low}. Both enums clamped server-side defensively.
Dual language clamp: figurative + literal explanations in the user's native language (set via the Options-page dropdown); example sentence in the phrase's source language so the user sees the idiom in its natural habitat.
kind:'literal' UI branch — when the model classifies a phrase as literal, the renderer shows "This phrase translates literally — no figurative meaning" instead of inventing a fake idiomatic reading. Better to acknowledge "this is just a sentence."
confidence:'low' UI branch — appends "⚠ AI is uncertain — consider checking a dictionary" so the user knows Qwen 1.5B may be wobbly on regional slang or obscure idioms.
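
The defensive enum clamp on the five-key contract could look like the sketch below. The key names and enum values come from the contract above; the fallback choices ("unknown" for an unrecognised kind, "low" for an unrecognised confidence) and the function name are assumptions, not the documented server behaviour.

```python
KINDS = {"idiom", "slang", "phrasal_verb", "literal", "unknown"}
CONFIDENCES = {"high", "medium", "low"}
TEXT_KEYS = ("figurative_meaning", "literal_meaning", "example")


def clamp_slang_response(raw: dict) -> dict:
    """Coerce a parsed LLM reply into the five-key contract."""
    kind = str(raw.get("kind", "")).strip().lower()
    confidence = str(raw.get("confidence", "")).strip().lower()
    result = {
        # Assumed fallbacks: anything outside the enums is downgraded.
        "kind": kind if kind in KINDS else "unknown",
        "confidence": confidence if confidence in CONFIDENCES else "low",
    }
    for key in TEXT_KEYS:
        value = raw.get(key)
        result[key] = value if isinstance(value, str) else ""
    return result
```

Downgrading to the most cautious values means a malformed reply triggers the "AI is uncertain" branch instead of a confident but unfounded explanation.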

M33 — Webpage Shadowing (Pronunciation Practice)

Brings the /my/speaking mic-and-feedback flow from the portal into the browser extension. Select a sentence on any webpage → "🎤 Practice Pronunciation" → expands a teal Shadowing block with two affordances:

▶ Play Original: Edge TTS audio streamed from POST /tts-sync, played inline via <audio> so the user hears a perfect rendering before practising.
🎙 Start / ⏹ Stop Recording (click-to-toggle): captures the user's voice on a chrome.offscreen document. Pivoted from hold-to-record in M33-S6-FIX2 because hold cut recordings off after 1-2 s on micro mouse movements (ADR-032 § 32e).

After Stop, the flow runs:

Whisper transcription via /lexora_api/shadow_evaluate → /transcribe-sync.
Pronunciation evaluation via /evaluate-pronunciation — the structured fields (score, missed_words, mispronounced_words) come from a deterministic Python word-diff (multiset-correct Counter-based, Levenshtein ≤ 2 OR shared 3-char prefix for "mispronounced"); only the localised feedback string comes from the LLM, with a per-language template fallback if the model drifts to the wrong script (ADR-032 § 32d).
UI render: a colour-tiered score badge (green ≥80 / amber 60-79 / red <60), the reference paragraph re-rendered with red strikethrough on missed words and amber wavy underline on mispronounced words, plus the feedback line.
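
A sketch of how the deterministic word-diff and the score tiers above might fit together. The Counter-based matching, the Levenshtein ≤ 2 rule, and the shared 3-character prefix rule come from the description; the score formula (share of exactly matched reference words) and all function names are assumptions.

```python
import re
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def _close(ref: str, heard: str) -> bool:
    shared_prefix = len(ref) >= 3 and len(heard) >= 3 and ref[:3] == heard[:3]
    return levenshtein(ref, heard) <= 2 or shared_prefix


def evaluate(reference: str, transcript: str) -> dict:
    ref = re.findall(r"\w+", reference.lower())
    leftover = Counter(re.findall(r"\w+", transcript.lower()))  # multiset of heard words
    matched, unmatched = 0, []
    for word in ref:
        if leftover[word] > 0:
            leftover[word] -= 1  # consume one exact occurrence
            matched += 1
        else:
            unmatched.append(word)
    missed, mispronounced = [], []
    for word in unmatched:
        near = next((h for h, n in leftover.items() if n > 0 and _close(word, h)), None)
        if near is None:
            missed.append(word)
        else:
            leftover[near] -= 1  # a heard word can explain only one reference word
            mispronounced.append(word)
    score = round(100 * matched / len(ref)) if ref else 0  # assumed formula
    return {"score": score, "missed_words": missed, "mispronounced_words": mispronounced}


def score_tier(score: int) -> str:
    """Badge colour: green >= 80, amber 60-79, red < 60."""
    return "green" if score >= 80 else "amber" if score >= 60 else "red"
```

Because the structured fields never pass through the LLM, the same recording always yields the same score, and the model's only job is phrasing the feedback line.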

MV3 mic permission: the offscreen-document strategy means users grant mic access once per extension instead of once per webpage origin (ADR-032 § 32a). When Chrome silently blocks the offscreen getUserMedia on first use, the Options page exposes a "🎙️ Grant Microphone Permission" button on a visible UI surface that explicitly registers the grant (ADR-032 § 32b).

Privacy default: shadowing attempts are not persisted — every record→evaluate cycle is ephemeral. The portal's /my/speaking (M30) remains the persistent surface for users who want to review progress over time (ADR-032 § 32f).

M34 — YouTube Vocab Radar

Turns YouTube viewing into a passive vocabulary review surface. The extension fetches the user's saved vocabulary via GET /lexora_api/my_vocab (capped 1000 entries, pvp_eligible=True, cached client-side for 15 min) and uses a main-world script injection to intercept the page's own /api/timedtext XHR / fetch requests for caption tracks. The injected script (extension/youtube_radar_inject.js) patches XMLHttpRequest.prototype and window.fetch, parses JSON3 (modern YouTube default) or SRV3 / SRV1 XML (fallback), and window.postMessages the normalised cue array [{startMs, endMs, text}, ...] back to the isolated-world content script (ADR-033 § 34b).

The content script (extension/youtube_radar.js) builds a hit timeline from the cue track using a longest-match sliding window — phrases sorted descending by token count are tried first, so kick the bucket beats kick when both are in vocabulary (ADR-033 § 34e). On <video>.timeupdate (throttled 250 ms), a binary search finds the next upcoming hit; if it lands within the look-ahead window (default 4 s) the radar:

Pauses the video before the word is spoken.
Renders a glassmorphism Shadow-DOM card with the matched word, all non-source-language translations (🇺🇦 / 🇬🇷 / 🇵🇱 / 🇬🇧 rows in fixed order), and the surrounding cue with the matched word highlighted in <mark>.
Offers four footer actions: ⏪ Rewind 5 s & Play (sets currentTime -= 5), ▶ Continue, 🔕 Skip this word (per-tab, in-memory), ✖ Disable radar for this video (per-video kill switch).
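
The longest-match timeline and the look-ahead lookup can be modelled in a few lines. The real implementation lives in extension/youtube_radar.js; this Python sketch uses hypothetical names, times each hit at its cue's start (word-level timing is not described here), and represents cues as (start_ms, end_ms, text) tuples.

```python
import bisect
import re


def build_hits(cues, vocab):
    """Return a sorted list of (start_ms, phrase) hits, longest phrase first."""
    phrases = {tuple(p.lower().split()) for p in vocab}
    phrases.discard(())
    max_len = max((len(p) for p in phrases), default=0)
    hits = []
    for start_ms, _end_ms, text in cues:
        tokens = re.findall(r"\w+", text.lower())
        i = 0
        while i < len(tokens):
            # Try the longest window first so a phrase beats its first word.
            for n in range(min(max_len, len(tokens) - i), 0, -1):
                window = tuple(tokens[i:i + n])
                if window in phrases:
                    hits.append((start_ms, " ".join(window)))
                    i += n
                    break
            else:
                i += 1  # no phrase starts at this token
    hits.sort()
    return hits


def next_hit(hits, now_ms, look_ahead_ms=4000):
    """Binary-search the next upcoming hit; None if it is outside the window."""
    idx = bisect.bisect_left(hits, (now_ms, ""))
    if idx < len(hits) and hits[idx][0] - now_ms <= look_ahead_ms:
        return hits[idx]
    return None
```

Building the timeline once per caption track keeps the per-tick work to a single binary search, which matters when the tick runs four times a second.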

Cooldown timer starts at OVERLAY CLOSE, not at fire (ADR-033 § 34c). A user reading the alert for 30 s doesn't lose 30 s of their 120 s cooldown — the clock starts ticking only after they dismiss the card. Implemented via an _overlayOpen flag that short-circuits the tick while a card is up, plus a single _lastFiredAt = performance.now() write inside _closeOverlay(). Initial value -Infinity (not 0) so the first fire on a fresh page isn't gated by performance.now() - 0 treating page-load time as cooldown.
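
The cooldown rule reads more easily as a tiny state machine. This Python sketch mirrors the described flags (_overlayOpen, _lastFiredAt) with illustrative names and takes the clock as an argument in seconds, where the extension uses performance.now().

```python
import math


class RadarCooldown:
    def __init__(self, cooldown_s: float = 120.0):
        self.cooldown_s = cooldown_s
        self.overlay_open = False
        # -inf rather than 0, so the very first check on a fresh page passes
        # regardless of how long ago the page loaded.
        self.last_fired_at = -math.inf

    def may_fire(self, now_s: float) -> bool:
        if self.overlay_open:
            return False  # short-circuit while a card is up
        return now_s - self.last_fired_at >= self.cooldown_s

    def open_overlay(self) -> None:
        self.overlay_open = True

    def close_overlay(self, now_s: float) -> None:
        self.overlay_open = False
        self.last_fired_at = now_s  # the cooldown clock starts here, not at fire time
```

With a 120 s cooldown, a card opened at t = 1 s and dismissed at t = 31 s blocks further alerts until t = 151 s.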

Options page surface (M34-S6): Enable Radar toggle, Cooldown seconds (10-3600, default 120), Look-ahead seconds (1-15, default 4). All three keys live in chrome.storage.sync; youtube_radar.js subscribes to chrome.storage.onChanged so changes take effect on the next tick without a page reload.

Top-level overlay host, not nested in the M24 click-on-word Quick Look (ADR-033 § 34d). The two surfaces have different trigger paths (auto-pause vs user-click) and live at independent z-indexes (2147483600 for the radar; 2147483601 for Quick Look — click-on-word always wins if both happen to be open). Teal+amber palette distinguishes the radar from the indigo M28 grammar block and the amber M32 slang block.

Privacy default: radar events are not persisted server-side. No DB write per fire, no language.review update, no telemetry log. The user's YouTube watch history stays in their browser — Lexora's backend only learns that the user has certain words saved (the static /my_vocab GET) (ADR-033 § 34f).

Verified end-to-end: browser smoke captured 3621 cues from a real YouTube video; the radar paused exactly 3.8 s before "donkey" was spoken on first test, then with the full UI in place auto-paused on the word "apparently" with the full glassmorphism card rendering all translations and footer buttons working as documented.

M35 — Multi-word YouTube Subtitle Selection

Lets the user look up multi-word phrases ("kick the bucket", "give up", "il est en train de") on YouTube subtitles via Ctrl-click (Cmd-click on macOS) multi-select, with every downstream Quick Look feature (Add to Vocabulary, M28 Explain Grammar, M32 Explain Slang/Idiom, M33 Practice Pronunciation) inheriting phrase support unchanged because the card reads its word from internal state rather than re-querying the caption DOM.

The pivot story. Strategy A (native browser selection via a user-select: text !important override, a capture-phase event firewall, and queueMicrotask getSelection() capture) was implemented end-to-end and committed (ad92887) but failed in browser smoke; the user's report, translated from Ukrainian, was "it just won't drag at all". Root cause analysis: YouTube re-applies user-select: none via JS on every cue render (so our static CSS rule loses to dynamically applied inline styles), and its selectstart interception runs below the event-listener level, so the selection never starts even when CSS wins. Strategy B (manual drag state machine on mouseenter) was rejected without smoke for the same cue-segment-volatility reason plus its custom-highlight UX downgrade. The user proposed Strategy C, Ctrl/⌘-click multi-select, which sidesteps the problem entirely by never calling getSelection() (ADR-034 § 34a-c).

State machine. Ctrl-clicked spans land in _multiWordSelection (an ordered buffer) and pick up the .lx-multi-selected highlight (stronger alpha than M24's :hover so the user can clearly see what they've selected). Toggle semantics: Ctrl-clicking an already-selected span removes it from the buffer (ADR-034 § 34e). The buffer is finalised when the user releases the last Ctrl/⌘ key:

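// Finalise the multi-word selection when the last Ctrl/⌘ key is released.
window.addEventListener('keyup', (e) => {
  // Ignore every key except the two modifiers that drive multi-select.
  if (e.key !== 'Control' && e.key !== 'Meta') return;
  // A modifier flag still set on keyup means the other side is still held.
  if (e.ctrlKey || e.metaKey) return;
  // Nothing was Ctrl-clicked, so there is nothing to look up.
  if (!_multiWordSelection.length) return;
  _finaliseMultiSelection();
}, true);  // capture phase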

Multi-key safe: releasing one side of Ctrl while the other is still held does NOT finalise — only the LAST release counts (ADR-034 § 34f). _finaliseMultiSelection concatenates the buffered spans in click order (NOT spatial order, ADR-034 § 34d), runs the result through _normalisePhrase, and dispatches to the same _openLookupOverlay(phrase, 'phrase') pipeline used by the M24 single-word click — so all four downstream Quick Look buttons just work on phrases.

Three escape hatches clear the buffer if the natural keyup finalise doesn't fire: a plain click (no modifier) anywhere on the page, the Escape key, and SPA navigation (yt-navigate-finish).

Why this approach won. Strategy C uses only click and keyup events — the exact event surface M24 has been running on stably since 2026. We never call getSelection(), so YT's user-select: none and selectstart interception are both irrelevant. The buffer stores DOM references, but _clearMultiSelection guards every classList.remove with try/catch in case YT recycled a cue mid-session. Deterministic (always whole-word, never partial), forgiving (toggle-out for misclicks), visually distinct (stronger-than-hover indigo).

Verified end-to-end: 16/16 state-machine cases pass in a Node sandbox (happy path, toggle in/out, out-of-order clicks preserved, deselect-in-middle, abort via _clearMultiSelection, empty-buffer no-op, single-Ctrl-click degenerate case, Polish / Greek / Ukrainian token preservation). Browser smoke confirmed by the user: multi-word phrases are looked up correctly, and all four downstream features (Add to Vocabulary, Grammar, Slang/Idiom, Shadowing) inherit phrase support unchanged.

The full Strategy A / B post-mortem is preserved in PLAN.md §M35 and ADR-034 — future readers will see exactly what we tried and why it failed before being tempted to re-attempt the same dead end.

M36 — Mobile PWA & Offline Sync

Lexora installs to the iPhone / Android home screen and grades SRS cards in airplane mode. Open /my/practice/mobile once online, then the entire flashcard review flow works offline — translations, flip-cards, swipe gestures, the lot. Reviews queue in IndexedDB and push to the server the moment Wi-Fi returns; the SM-2 state advances exactly as if the desktop /my/practice page had been used. Zero new services, zero new RabbitMQ queues, no new LLM endpoints — pure client-side machinery on top of two new Odoo routes.

This is the first milestone in which the codebase carries a Service Worker, an IndexedDB data plane, and an explicit offline-first sync protocol. Seven sub-decisions are locked in ADR-035; the architecture is summarised here.

Web App Manifest + Service Worker — served from controllers at /lexora.webmanifest and /sw.js (NOT from static/). The SW lives at the root path so its scope can intercept /my/practice/mobile/* without Service-Worker-Allowed header gymnastics. Both routes are auth='public' so the offline shell can be installed on the very first visit before login. The SW carries Cache-Control: no-cache so the browser revalidates on every page load — the update-detection floor (ADR-035 § 35a).

IndexedDB layer — Jake Archibald's idb library 7.1.1 vendored locally (ISC licence; ~5.8 KB total including the embedded licence text). A PWA whose offline mode is bootstrapped by a CDN is contradictory; we never depend on a third-party host (ADR-035 § 35b). Two object stores:

cards_to_review keyed by id — wholesale-replaced on every successful GET /lexora_api/offline_batch prefetch (default 7 days / 200 cards).
sync_queue keyed by client_uuid from crypto.randomUUID(): one row per offline grade. The server-side language.review.offline.log table mirrors this with a UNIQUE(user_id, client_uuid) constraint, so re-uploading the same batch after a mid-flight network drop is a clean no-op (ADR-035 § 35c). Six dedicated tests in test_offline_sync.py cover the idempotent-replay / foreign-user / grade-clamp / mixed-batch / clamping / translations matrix; the full suite runs 79 passed / 0 failed, with no regression on the 73 pre-M36 tests.
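
The idempotent-replay guarantee follows directly from the unique constraint. The sketch below demonstrates the idea with an in-memory SQLite table standing in for the Odoo/Postgres model; the table layout, column names other than user_id and client_uuid, and the function names are assumptions.

```python
import sqlite3


def open_log() -> sqlite3.Connection:
    """Create a stand-in for the offline review log (SQLite, not Postgres)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE offline_log (
               user_id     INTEGER NOT NULL,
               client_uuid TEXT    NOT NULL,
               card_id     INTEGER NOT NULL,
               grade       INTEGER NOT NULL,
               UNIQUE (user_id, client_uuid)
           )"""
    )
    return db


def apply_batch(db: sqlite3.Connection, user_id: int, rows: list) -> int:
    """Insert each queued grade once; return how many rows were new."""
    applied = 0
    for row in rows:
        cur = db.execute(
            "INSERT OR IGNORE INTO offline_log VALUES (?, ?, ?, ?)",
            (user_id, row["client_uuid"], row["card_id"], row["grade"]),
        )
        if cur.rowcount == 1:
            applied += 1  # only a fresh row would advance the SM-2 state
    db.commit()
    return applied
```

Replaying a batch after a dropped response inserts nothing the second time, so the client can retry the whole queue without tracking which rows the server already saw.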

Mobile UI — /my/practice/mobile renders a touch-first glassmorphism flashcard via a standalone QWeb template (no portal chrome — full viewport, looks native after Add-to-Home-Screen).

Tap the card → 3D flip (CSS transform: rotateY(180deg) on the inner; backface-visibility hidden to prevent bleed-through).
Swipe left or right → grade (60-px commit threshold, 600 ms time budget; vertical-scroll detection abandons the gesture cleanly so rage-scrolling never triggers an accidental grade).
Bottom action row: two huge buttons, Forgot (red, grade 0) and Remembered (green, grade 2). Each is ≥ 80 px tall, near-half-viewport wide, with a gradient background and an 8-px ring shadow. Built for one-handed thumb use on a metro train.
Mobile UI exposes only 2 grades (not desktop's 4) — deliberate trade-off for touch UX and cognitive load on the move (ADR-035 § 35e). Power users keep the full 4-grade UI on /my/practice.
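
The swipe rule (60-px commit threshold, 600 ms budget, vertical scrolls abandoned) can be expressed as a small classifier. This is a Python model of logic that runs in browser JavaScript; the vertical-dominance test and the function name are assumptions, and the sketch returns only the direction because the README does not say which direction maps to which grade.

```python
from typing import Optional


def classify_swipe(dx: float, dy: float, elapsed_ms: float,
                   commit_px: float = 60, budget_ms: float = 600) -> Optional[str]:
    """Return "left", "right", or None when the gesture should not grade."""
    if abs(dy) > abs(dx):
        return None  # mostly vertical: treat as a scroll (assumed heuristic)
    if elapsed_ms > budget_ms:
        return None  # too slow to count as a deliberate swipe
    if abs(dx) < commit_px:
        return None  # did not travel far enough to commit
    return "right" if dx > 0 else "left"
```

Returning None for every doubtful gesture is the conservative choice here, since an accidental grade changes the card's schedule while an ignored swipe only costs a second attempt.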

Service Worker caching — strictly scoped (ADR-035 § 35f):

7 stable static-asset URLs precached at install (manifest, CSS, 3 JS files, 2 icons) — cache-first with stale-while-revalidate.
/my/practice/mobile HTML — network-first with cache fallback, so the offline shell stays usable for at least the most recent visit.
/lexora_api/{offline_batch,sync_offline} — always network. The IndexedDB layer owns offline data; cached batches would surface stale due-dates and corrupt SM-2 scheduling.
Everything else — passthrough. The SW never serves a stale Odoo page outside its declared scope. 27/27 sandbox routing assertions pass covering precache hits, mobile-nav variants, API hits, default passthrough, method/origin guards, and four regex-strictness near-misses.
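
The four routing rules above amount to a pure function from request to strategy, which is what makes them testable in a sandbox. The sketch below models that function in Python; the real Service Worker is JavaScript, the regexes and the guard order are assumptions, and the precache set passed in is a placeholder rather than the actual list of seven URLs.

```python
import re
from urllib.parse import urlsplit

_MOBILE = re.compile(r"/my/practice/mobile(/.*)?")
_API = re.compile(r"/lexora_api/(offline_batch|sync_offline)")


def strategy(method: str, url: str, origin: str, precached: frozenset) -> str:
    """Pick a caching strategy for one request."""
    parts = urlsplit(url)
    if f"{parts.scheme}://{parts.netloc}" != origin:
        return "passthrough"    # origin guard
    if _API.fullmatch(parts.path):
        return "network-only"   # IndexedDB owns offline data
    if method != "GET":
        return "passthrough"    # method guard
    if parts.path in precached:
        return "cache-first"
    if _MOBILE.fullmatch(parts.path):
        return "network-first"
    return "passthrough"
```

Using fullmatch rather than a prefix test is what catches near-misses such as /my/practice/mobilex, which must fall through to passthrough.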

User-controlled update flow (ADR-035 § 35d): subsequent SW versions install but wait in the registration's installing → waiting chain. The mobile UI shows a bottom banner ("Lexora was updated — refresh to load the new version"); the user clicks Refresh when they're ready, which posts {type: 'SKIP_WAITING'} to the waiting SW; controllerchange fires; the page reloads with the new shell — never mid-review, never with an unflushed sync_queue. First-install gets skipWaiting() because there's no prior SW to displace; the state.hadControllerAtBoot snapshot suppresses the otherwise-annoying first-visit reload flash.

Authentication: same-origin session cookie (ADR-035 § 35g). No JWT, no PWA-specific token, no auth bridge. On 401, the queue is preserved (never dropped on auth failure) and a toast prompts re-sign-in.

Verified end-to-end: 79 tests / 0 failures; 27/27 sandbox routing assertions; curl smokes confirm /sw.js and /lexora.webmanifest serve with correct MIME + cache headers; all 7 precache URLs return 200 so the SW install succeeds; live browser smoke confirms airplane-mode grading queues to IDB, reconnect drains the queue, SM-2 advances on the desktop site within seconds; the update banner appears on a VERSION bump and the reload chain works cleanly.

4. Backend Architecture

                        ┌─────────────────────────────────────┐
                        │           Browser / Client           │
                        │  (Chrome Extension + Portal UI)      │
                        └────────────┬────────────────────────┘
                                     │ HTTP / WebSocket
                        ┌────────────▼────────────────────────┐
                        │          Nginx (reverse proxy)       │
                        │   SSL termination · WebSocket proxy  │
                        └────────────┬────────────────────────┘
                                     │
                        ┌────────────▼────────────────────────┐
                        │        Odoo 18 Community             │
                        │  website · portal · mail · auth      │
                        │  Custom modules: language_* (×11)    │
                        │  Odoo bus (WebSocket / long-poll)    │
                        └─┬──────┬──────┬──────────┬──────────┘
                          │      │      │          │
              RabbitMQ    │  Reads/     │     Redis PvP
              publish     │  writes     │     ephemeral state
                          │      │      │
              ┌───────────▼──┐   │  ┌───▼────┐  ┌─▼────────┐
              │  RabbitMQ    │   │  │Postgres│  │  Redis   │
              │  (event bus) │   │  │  (DB)  │  │   (PvP)  │
              └──┬──┬──┬──┬──┘   │  └────────┘  └──────────┘
                 │  │  │  │      │
       ┌─────────┘  │  │  └──────────────────────────┐
       │            │  │                              │
┌──────▼──────┐ ┌───▼───▼────┐ ┌──────────────┐ ┌───▼────────────┐
│ Translation │ │   Anki     │ │  LLM Service │ │  Audio/TTS     │
│  Service    │ │  Import    │ │  (llama.cpp  │ │  Service       │
│ (FastAPI +  │ │  Service   │ │  Qwen2.5-1.5B│ │  edge-tts +    │
│deep_trans.) │ │ (FastAPI)  │ │  enrichment) │ │  Whisper STT)  │
└─────────────┘ └────────────┘ └──────────────┘ └────────────────┘

Key design decisions:

Odoo is the single system of record. All business data — users, vocabulary, translations, enrichments, audio metadata, PvP results, leaderboard — lives in Postgres via Odoo ORM. External services are stateless processors.
RabbitMQ for async jobs. Translation, enrichment, Anki import, and TTS generation are all async. Each job carries a UUID job_id for idempotency. Odoo drains result queues via a 1-minute cron (ADR-023).
Redis for PvP ephemeral state only. Matchmaking queues, live round state, and reconnect grace timers live in Redis with short TTLs. Odoo persists the final result.
CPU-only throughout. No GPU is assumed anywhere in the stack. The LLM service (Qwen2.5-1.5B Q4_K_M via llama-cpp-python) runs on AVX-capable x86 CPUs.
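
The job_id idempotency rule above can be illustrated with a minimal in-memory sketch. It assumes a result message shaped as a dict with `job_id`, `entry_id`, and `result` keys; the `ResultDrainer` class and those field names are hypothetical, and a plain list stands in for the RabbitMQ result queue.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set
import uuid


@dataclass
class ResultDrainer:
    """Applies each job result at most once, keyed by job_id."""
    seen: Set[str] = field(default_factory=set)
    translations: Dict[int, str] = field(default_factory=dict)

    def apply(self, message: dict) -> bool:
        job_id = message["job_id"]
        if job_id in self.seen:
            return False  # duplicate delivery: ignore it
        self.translations[message["entry_id"]] = message["result"]
        self.seen.add(job_id)
        return True

    def drain(self, queue: List[dict]) -> int:
        """One 'cron tick': consume everything currently queued."""
        applied = 0
        while queue:
            if self.apply(queue.pop(0)):
                applied += 1
        return applied


job_id = str(uuid.uuid4())
msg = {"job_id": job_id, "entry_id": 42, "result": "привіт"}
queue = [msg, dict(msg)]          # the same job delivered twice
drainer = ResultDrainer()
applied = drainer.drain(queue)    # only the first copy is applied
```

In the real stack the "seen" set would presumably be a persisted job record in Postgres rather than process memory, so that deduplication survives restarts.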

5. Async Microservices

Translation Service (port 8001)

Library: deep_translator==1.11.4 (MIT)
Primary provider: GoogleTranslator (free, no API key, sub-second latency)
Fallback provider: MyMemoryTranslator (auto-engaged on primary error)
Languages: en, uk, el, pl — all 12 directional pairs handled directly, no two-hop routing (M29: Polish added with pl-PL MyMemory locale)
Sync endpoint: POST /translate for the AI Translator portal tool
Config: TRANSLATE_PROVIDER, TRANSLATE_FALLBACK_PROVIDER, TRANSLATE_TIMEOUT_SECONDS — swap to DeepL or Google Cloud in one env-var change
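
As a rough illustration of the primary/fallback behaviour and the "12 directional pairs" count, the sketch below enumerates the ordered language pairs and tries providers in order. The `translate_with_fallback` helper and both stub providers are hypothetical stand-ins so the example runs offline; the service itself uses deep_translator providers.

```python
from itertools import permutations
from typing import Callable, Sequence

LANGUAGES = ("en", "uk", "el", "pl")

# Every ordered (source, target) pair with source != target: 4 x 3 = 12.
PAIRS = list(permutations(LANGUAGES, 2))

Provider = Callable[[str, str, str], str]


def translate_with_fallback(text: str, source: str, target: str,
                            providers: Sequence[Provider]) -> str:
    """Try each provider in order and return the first successful result."""
    last_error = None
    for provider in providers:
        try:
            return provider(text, source, target)
        except Exception as exc:  # any provider error engages the next one
            last_error = exc
    raise RuntimeError("all translation providers failed") from last_error


# Hypothetical stand-ins for the real providers.
def flaky_primary(text: str, source: str, target: str) -> str:
    raise TimeoutError("primary provider timed out")


def stub_fallback(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"


result = translate_with_fallback("hello", "en", "pl",
                                 [flaky_primary, stub_fallback])
```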

LLM Enrichment Service (port 8002)

Runtime: llama-cpp-python with llama.cpp C++ engine (AVX-only compatible)
Model: Qwen/Qwen2.5-1.5B-Instruct-GGUF — qwen2.5-1.5b-instruct-q4_k_m.gguf (~0.95 GiB on disk, ~1.2 GiB resident)
Model delivery: downloaded on first start via huggingface_hub into a Docker named volume llm_models; subsequent restarts load from disk in ~1 s
Enrichment scope: always in the entry's source language — no translation, no cross-lingual output (ADR-028)
JSON enforcement: response_format={"type":"json_object"} + parse fallback to stub to prevent queue wedging
Sync endpoints: POST /roleplay for AI Roleplay; POST /explain-grammar for the Grammar Explainer button in the browser extension; POST /generate-topic and POST /analyze-speech for the AI Speaking Coach (M30); POST /analyze-writing for the Lexora Writer FAB (M31); POST /explain-slang for the Slang & Idiom Explainer button (M32); POST /evaluate-pronunciation for Webpage Shadowing (M33). All seven bypass RabbitMQ because the user can't proceed without the result. ADR-030 + ADR-031 + ADR-032 document the sync-over-async rule and the shared endpoint pattern (Pydantic + few-shot anchor + tolerant parser + defensive coerce + stub fallback + server-side status injection; eight sync endpoints across M17–M33 follow this exact shape)
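
The "tolerant parser + defensive coerce + stub fallback + server-side status injection" part of that pattern might look roughly like the sketch below. The key names (`synonyms`, `antonyms`, `examples`, `explanation`, `status`) and the `parse_enrichment` function are assumptions made for illustration, not the service's actual schema.

```python
import json
import re
from typing import Any, Dict


def _stub() -> Dict[str, Any]:
    """Fresh empty result, returned whenever the model output is unusable."""
    return {"synonyms": [], "antonyms": [], "examples": [],
            "explanation": "", "status": "stub"}


def parse_enrichment(raw: str) -> Dict[str, Any]:
    """Extract a JSON object from raw model text, never raising."""
    # Tolerant parse: ignore any chatter before or after the {...} span.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if not match:
        return _stub()
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return _stub()
    if not isinstance(data, dict):
        return _stub()

    result = _stub()
    for key in ("synonyms", "antonyms", "examples"):
        value = data.get(key)
        if isinstance(value, str):      # defensive coerce: "big" -> ["big"]
            value = [value]
        if isinstance(value, list):
            result[key] = [str(item) for item in value]
    if isinstance(data.get("explanation"), str):
        result["explanation"] = data["explanation"]
    result["status"] = "ok"             # set by the server, never by the model
    return result


good = parse_enrichment('Sure! {"synonyms": "big", "examples": ["a", "b"]} Done.')
bad = parse_enrichment("I could not produce JSON, sorry")
```

Returning a stub instead of raising means a malformed completion still produces a result message, which is what keeps a queue consumer from wedging on one bad job.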

Anki Import Service (port 8003)

Formats: .apkg (SQLite + zip, Zstd-compressed modern format supported) and .txt (tab-separated)
Auto field mapping: reads col.models JSON to detect Front/Back convention; falls back to user-defined mapping
Audio extraction: extracts MP3/OGG/WAV from .apkg media bundle; attaches to language.audio records as audio_type='imported'; extraction failures are logged but never block text import
Dedup: normalises each card through the same pipeline as manual entry saves; reports created/skipped/failed counts back to Odoo
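
A minimal sketch of the .txt path with dedup and the created/skipped/failed report is shown below. The `normalise` function is an assumed stand-in (NFC, whitespace collapse, casefold) and not the project's actual normalisation pipeline; `import_txt` is likewise a hypothetical name.

```python
import csv
import io
import unicodedata
from typing import Dict, Set


def normalise(text: str) -> str:
    """Assumed normalisation: NFC, collapsed whitespace, casefolded."""
    return " ".join(unicodedata.normalize("NFC", text).split()).casefold()


def import_txt(content: str, existing: Set[str]) -> Dict[str, int]:
    """Import tab-separated cards, skipping entries already known."""
    counts = {"created": 0, "skipped": 0, "failed": 0}
    reader = csv.reader(io.StringIO(content), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue                      # blank lines and header comments
        if len(row) < 2 or not row[0].strip():
            counts["failed"] += 1         # no front text or no back field
            continue
        key = normalise(row[0])
        if key in existing:
            counts["skipped"] += 1        # duplicate after normalisation
            continue
        existing.add(key)
        counts["created"] += 1
    return counts


sample = "#separator:tab\nHello\tПривіт\n hello \tпривіт\nno-back-field\ncat\tкіт\n"
report = import_txt(sample, existing={"cat"})
```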

Audio / TTS Service (port 8004)

TTS engine: edge-tts (Microsoft Edge online TTS API — no API key, zero RAM overhead, excellent quality for EN/UK/EL/PL)
TTS fallback: espeak-ng (system package, offline, lower quality)
STT engine: faster-whisper base model (~145 MB on disk, ~300 MB resident); int8 quantization on CPU; 2–4× faster than openai-whisper
Voice map: en → en-US-JennyNeural, uk → uk-UA-PolinaNeural, el → el-GR-AthinaNeural, pl → pl-PL-ZofiaNeural
Sync endpoints:
- POST /transcribe-sync (M30) for the Speaking Coach — multipart audio upload, returns {transcript, duration, language}. 90 s soft cap (AUDIO_SYNC_MAX_SECONDS) and 15 MB hard guard (AUDIO_SYNC_MAX_BYTES), both env-configurable. Reuses the loaded Whisper model — no extra RAM.
- POST /tts-sync (M33) for Webpage Shadowing's "▶ Play Original" — JSON body {text, language}, returns audio/mpeg bytes directly so the browser feeds them to an <audio> element. 500-char cap (TTS_SYNC_MAX_CHARS); 25 s safety timeout (TTS_SYNC_TIMEOUT_SEC) wrapping the existing _generate_tts helper via asyncio.wait_for(loop.run_in_executor(...)) so a hung Edge TTS network call can't block the FastAPI event loop.
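
The timeout wrapper described for /tts-sync follows a standard asyncio shape, sketched below. `slow_tts` is a hypothetical blocking function standing in for the service's `_generate_tts` helper, and returning empty bytes on timeout is an assumption made to keep the example short.

```python
import asyncio
import time


def slow_tts(text: str, delay: float) -> bytes:
    """Hypothetical blocking helper standing in for the real TTS call."""
    time.sleep(delay)
    return text.encode("utf-8")


async def tts_with_timeout(text: str, delay: float, timeout: float) -> bytes:
    loop = asyncio.get_running_loop()
    # The blocking call runs in the default thread pool, so the event loop
    # stays free; wait_for stops waiting once the timeout elapses.
    future = loop.run_in_executor(None, slow_tts, text, delay)
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        return b""  # a real endpoint would map this to an error response


fast = asyncio.run(tts_with_timeout("hello", delay=0.01, timeout=1.0))
late = asyncio.run(tts_with_timeout("hello", delay=0.3, timeout=0.05))
```

Note that the timeout only stops the coroutine from waiting: the worker thread itself still runs until the blocking call returns.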

6. Spaced Repetition (SM-2)

Lexora implements a four-grade variant of SM-2, the algorithm from which Anki's scheduler is also derived.

Review grades: Again (0) · Hard (1) · Good (2) · Easy (3)

Interval calculation:

EF  = ease factor (default 2.5, min 1.3, max 3.5)
n   = consecutive correct repetitions
I   = interval in days

Grade 0 (Again): n=0,   I=1,                     EF unchanged  → state=learning
Grade 1 (Hard):  n=0,   I=max(1, I×1.2),         EF−=0.15
Grade 2 (Good):  n+=1,  I=next_I(n, EF, I),      EF unchanged
Grade 3 (Easy):  n+=1,  I=next_I(n, EF, I)×1.3,  EF+=0.15      → state=review

next_I(1, ef, _) = 1
next_I(2, ef, _) = 4
next_I(n, ef, I) = round(I × ef)
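
The table above translates fairly directly into Python. The sketch below is one possible reading, with two assumptions that the table does not settle: fractional intervals are rounded to whole days, and the state is left unchanged for Hard and Good because only Again and Easy name a transition. The `Card` and `review` names are illustrative.

```python
from dataclasses import dataclass

EF_MIN, EF_MAX = 1.3, 3.5


@dataclass
class Card:
    ef: float = 2.5      # ease factor
    n: int = 0           # consecutive correct repetitions
    interval: int = 0    # days until the next review
    state: str = "new"


def next_interval(n: int, ef: float, interval: int) -> int:
    if n == 1:
        return 1
    if n == 2:
        return 4
    return round(interval * ef)


def review(card: Card, grade: int) -> Card:
    """Update the card in place for one review and return it."""
    if grade == 0:                                   # Again
        card.n, card.interval, card.state = 0, 1, "learning"
    elif grade == 1:                                 # Hard
        card.n = 0
        card.interval = max(1, round(card.interval * 1.2))
        card.ef = max(EF_MIN, card.ef - 0.15)
    elif grade == 2:                                 # Good
        card.n += 1
        card.interval = next_interval(card.n, card.ef, card.interval)
    elif grade == 3:                                 # Easy
        card.n += 1
        base = next_interval(card.n, card.ef, card.interval)
        card.interval = round(base * 1.3)
        card.ef = min(EF_MAX, card.ef + 0.15)
        card.state = "review"
    else:
        raise ValueError("grade must be 0, 1, 2 or 3")
    return card


card = Card()
for grade in (2, 2, 2):
    review(card, grade)
# Three "Good" answers give intervals of 1, 4, then round(4 * 2.5) = 10 days.
```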

State machine: new → learning → review

Cards for all user entries are auto-created on first visit to /my/practice. The portal shows one flashcard at a time: source text → "Show answer" reveals all completed translations + an enrichment example sentence snippet. Four grade buttons submit to POST /my/practice/review/<card_id>.

7. PvP Word Duels

Entry point: /my/arena — requires ≥10 PvP-eligible entries in the chosen practice language (configurable system parameter language.pvp.min_entries).

Match flow:

User creates an open challenge (practice language + native language + XP stake).
Another user in the same language pair accepts within the matchmaking window — or the challenger clicks "Challenge Lexora Bot".
10 rounds — each round: the system picks one of the current player's PvP-eligible vocabulary entries and presents it with 4 translation choices (1 correct + 3 distractors pulled from the player's own dictionary).
Both players answer independently.
After all rounds, the player with more correct answers wins and gains the staked XP; the loser loses the same amount (floored at 0). A draw changes no XP.
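
Round construction and XP settlement from the flow above could be sketched as follows. The `build_round` and `settle_xp` helpers and the dict-of-translations data shape are assumptions for illustration; the sketch also assumes the player has at least three other distinct translations to draw distractors from, which the ≥10-entry requirement is meant to guarantee.

```python
import random
from typing import Dict, List, Tuple


def build_round(entries: Dict[str, str],
                rng: random.Random) -> Tuple[str, List[str], str]:
    """Pick one entry; return (word, 4 shuffled choices, correct answer)."""
    word = rng.choice(sorted(entries))
    correct = entries[word]
    # Distractors come from the player's own dictionary, never the answer.
    pool = sorted({t for w, t in entries.items() if w != word and t != correct})
    choices = rng.sample(pool, 3) + [correct]
    rng.shuffle(choices)
    return word, choices, correct


def settle_xp(xp_a: int, xp_b: int, score_a: int, score_b: int,
              stake: int) -> Tuple[int, int]:
    """Winner gains the stake; loser loses it, floored at 0; draw is a no-op."""
    if score_a == score_b:
        return xp_a, xp_b
    if score_a > score_b:
        return xp_a + stake, max(0, xp_b - stake)
    return max(0, xp_a - stake), xp_b + stake


rng = random.Random(0)
entries = {"cat": "кіт", "dog": "пес", "house": "дім",
           "water": "вода", "bread": "хліб"}
word, choices, correct = build_round(entries, rng)
final_xp = settle_xp(100, 20, score_a=7, score_b=5, stake=50)  # (150, 0)
```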

Lexora Bot: a server-side opponent that answers with ~70% accuracy. Bot battles count in history, win rate, and XP. The bot user is created automatically and reactivated if archived.

PvP eligibility: an entry is PvP-eligible when it has at least one completed translation record.

8. Knowledge Library

Gold Vocabulary

3,184 most common English words seeded from the Volka frequency list. Each word has:

CEFR level (A1–C2)
Part of speech
Ukrainian translation (A1/A2 fully translated; B1–C2 seeded with metadata only)
Greek translation (same coverage)

Portal at /useful-words — tabbed by CEFR level, 50 words/page, "Add to My List" button per word. Printable PDF cheat sheet per level via /useful-words/print?level=A1.

Grammar Encyclopedia

6 sections with full HTML content:

All 12 English Tenses — form + usage + timeline + Ukrainian/Greek equivalents
Irregular Verbs — ~200 verbs (Base/Past/Past Participle + Ukrainian translation)
Articles (a/an/the/zero) — rules with EN/UK/EL examples
Conditionals 0–3 — form + usage + translation pairs
Modal Verbs — can/could/may/might/must/should/would + equivalents
Passive Voice & Reported Speech — transformation rules + examples

Portal at /grammar — sidebar navigation; printable PDF per section.

9. AI Roleplay Scenarios

6 conversation scenarios where the LLM acts as a native speaker:

Scenario	Setting
☕ Café	Ordering food and drinks
💼 Job Interview	Professional English practice
🏥 Doctor's Office	Medical vocabulary and describing symptoms
🏨 Hotel Check-In	Hospitality and travel phrases
✈️ Airport	Check-in, customs, directions
🛒 Market / Shop	Haggling, prices, product descriptions

Each scenario:

Has a purpose-built system prompt (plain prose under 100 words — critical for reliable output from a 1.5B model; numbered lists cause the model to echo the list format)
Persists conversation history in Postgres (language.scenario.session.chat_history as a JSON string) — context survives page reloads
Uses repeat_penalty=1.15 and max_tokens=200 to prevent looping/hallucination

The LLM call is synchronous (direct requests.post to POST /roleplay on the LLM service) because conversation turns require an immediate response — RabbitMQ async would produce an unusable UX.
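
To make the history persistence concrete, here is a small sketch of keeping chat turns in a JSON string and assembling a request body from it. Only `repeat_penalty=1.15` and `max_tokens=200` come from the text above; the turn shape (`role`/`content`), the payload field names, and both helper functions are hypothetical.

```python
import json
from typing import Dict, List


def append_turn(chat_history: str, role: str, content: str) -> str:
    """Decode the stored JSON string, add one turn, and re-encode it."""
    turns: List[Dict[str, str]] = json.loads(chat_history) if chat_history else []
    turns.append({"role": role, "content": content})
    return json.dumps(turns, ensure_ascii=False)


def build_payload(system_prompt: str, chat_history: str) -> dict:
    """Hypothetical request body for the roleplay endpoint."""
    return {
        "system_prompt": system_prompt,
        "messages": json.loads(chat_history) if chat_history else [],
        "repeat_penalty": 1.15,
        "max_tokens": 200,
    }


history = ""
history = append_turn(history, "user", "A flat white, please.")
history = append_turn(history, "assistant", "Sure! Anything to eat?")
payload = build_payload("You are a friendly barista in a small café.", history)
```

Because the history is re-read from the stored string on every turn, a page reload loses nothing: the next request is rebuilt from what was persisted.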

10. Deployment Guide

Hardware Profile

Lexora is explicitly optimised for low-resource CPU-only VPS hosting.

Minimum verified configuration:

Component	Spec
CPU	Intel Xeon E5-2680 v2 (AVX, no AVX2) · 6 vCPUs @ 2.8 GHz
RAM	8 GiB
Storage	40 GiB SSD (OS + Docker volumes)
Network	100 Mbit/s outbound (required for `deep_translator` + `edge-tts`)
GPU	None required

RAM budget at steady state (M25 stack):

Service	Resident RAM
Odoo (4 workers)	~1.5–2.0 GiB
PostgreSQL 15	~0.5–1.0 GiB
RabbitMQ (Erlang VM)	~0.3 GiB
Redis 7	~0.05 GiB
Translation Service	~0.1 GiB
LLM Service (Qwen2.5-1.5B Q4_K_M)	~1.2 GiB
Anki Service	~0.1 GiB
Audio Service	~0.4 GiB (Whisper base loaded)
Nginx	~0.05 GiB
Total	~4.2–5.2 GiB

Headroom of ~2.5–3.5 GiB on an 8 GiB host is sufficient for the M25 stack under normal portal traffic (< 10 concurrent users).

Why M26 (AI Helpdesk RAG) was postponed: The ai_mentor service adds ~1.5–2.0 GiB on top of the above (fastembed ONNX ~100 MB + Qwen2.5-1.5B GGUF second instance ~1.2 GiB + pgvector Postgres extension). Under peak portal traffic this pushes the host into swap, causing OOM kills on Odoo workers. The feature is architecturally complete (see the m26_ai_helpdesk git branch) and will be re-enabled when the server is upgraded to ≥16 GiB RAM.
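
The arithmetic behind the RAM table and the M26 decision can be checked with a few lines. The figures are copied from the table and the paragraph above; the variable names are illustrative, and the result ignores the host OS and page cache.

```python
# (low, high) resident RAM per service in GiB, from the table above.
BUDGET_GIB = {
    "Odoo (4 workers)": (1.5, 2.0),
    "PostgreSQL 15": (0.5, 1.0),
    "RabbitMQ": (0.3, 0.3),
    "Redis 7": (0.05, 0.05),
    "Translation Service": (0.1, 0.1),
    "LLM Service": (1.2, 1.2),
    "Anki Service": (0.1, 0.1),
    "Audio Service": (0.4, 0.4),
    "Nginx": (0.05, 0.05),
}
M26_EXTRA_GIB = (1.5, 2.0)   # ai_mentor service, from the paragraph above
HOST_GIB = 8.0

low = round(sum(lo for lo, _ in BUDGET_GIB.values()), 2)    # 4.2
high = round(sum(hi for _, hi in BUDGET_GIB.values()), 2)   # 5.2

with_m26 = (round(low + M26_EXTRA_GIB[0], 2),
            round(high + M26_EXTRA_GIB[1], 2))              # (5.7, 7.2)
left_with_m26 = (round(HOST_GIB - with_m26[1], 2),
                 round(HOST_GIB - with_m26[0], 2))          # (0.8, 2.3)
```

With M26 enabled, the worst case leaves under 1 GiB for the OS, page cache, and traffic spikes, which is consistent with the swap pressure described above.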

Production Checklist

Set workers = 4 in src/configs/odoo.conf (already set)
Configure Nginx SSL (Let's Encrypt or pre-provisioned cert)
Replace .env defaults with production secrets
Set POSTGRES_MAX_CONNECTIONS=500 (already set)
Enable Redis AOF persistence for PvP state durability across restarts
Configure RabbitMQ durable queues (already set via durable=True in publishers)
Set TRANSLATE_PROVIDER=google (or switch to a paid provider)
Set TTS_ENGINE=edge-tts (requires outbound HTTPS to Microsoft)
Verify LLM_AUTO_DOWNLOAD=1 (first start downloads ~0.95 GiB model)

11. Development Setup

Prerequisites

Docker Engine ≥ 24 and Docker Compose V2
GNU Make
8 GiB RAM recommended (4 GiB minimum with LLM service disabled)
Outbound HTTPS (required for deep_translator + edge-tts + HuggingFace downloads)

Quick Start

# 1. Clone the repository
git clone https://github.com/YuriiDorosh/Lexora.git
cd Lexora

# 2. Create the shared Docker network (required once)
docker network create backend

# 3. Copy and edit environment variables
cp env.example .env
# Edit .env — at minimum set POSTGRES_PASSWORD

# 4. Start the full development stack
make up-dev
# Services start in order: postgres → rabbitmq → redis → odoo+nginx → translation → llm → anki → audio
# LLM service downloads ~0.95 GiB model on first start (allow 2–5 min)

# 5. Wait for Odoo to be ready
curl http://localhost:5433/web/health
# → {"status": "pass"}

# 6. Create the Odoo database (first time only)
# Open http://localhost:5433 in your browser → complete the setup wizard
# Database name: lexora

# 7. Install all custom modules
docker exec odoo odoo --config /etc/odoo/odoo.conf \
  -d lexora \
  --init language_security,language_core,language_words,language_translation,\
language_enrichment,language_audio,language_anki_jobs,language_chat,\
language_dashboard,language_pvp,language_portal,language_learning,\
base_search_fuzzy,web_notify,password_security,\
website_menu_by_user_status,website_require_login \
  --stop-after-init

# 8. Restart Odoo to load all modules
docker restart odoo

# 9. Verify service health
curl http://localhost:8001/health   # Translation → {"provider":"google","ready":true}
curl http://localhost:8002/health   # LLM → {"llm_ready":true,"consumer_alive":true}
curl http://localhost:8003/health   # Anki → {"status":"ok","consumer_alive":true}
curl http://localhost:8004/health   # Audio → {"whisper_ready":true,"consumer_alive":true}
curl http://localhost:15672         # RabbitMQ management UI (guest/guest)
docker exec redis redis-cli ping    # → PONG
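
For repeated checks, the four HTTP health probes in step 9 could be scripted. The sketch below is a hypothetical helper, not part of the repository; the ports and JSON keys are taken from the curl examples above, and the fetch function is injectable so the logic can be exercised without a running stack.

```python
import json
import urllib.request
from typing import Callable, Dict

# service name -> (health URL, JSON key expected to be truthy)
HEALTH_CHECKS = {
    "translation": ("http://localhost:8001/health", "ready"),
    "llm": ("http://localhost:8002/health", "llm_ready"),
    "anki": ("http://localhost:8003/health", "consumer_alive"),
    "audio": ("http://localhost:8004/health", "whisper_ready"),
}


def http_fetch(url: str) -> dict:
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def check_services(fetch: Callable[[str], dict] = http_fetch) -> Dict[str, bool]:
    """Return {service: healthy?}; any error counts as unhealthy."""
    status = {}
    for name, (url, key) in HEALTH_CHECKS.items():
        try:
            status[name] = bool(fetch(url).get(key))
        except Exception:
            status[name] = False
    return status
```

Calling `check_services()` with no arguments probes the local stack; passing a fake `fetch` makes the function testable offline.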

Useful Make Targets

make up-dev          # Start full stack
make down-dev        # Stop full stack
make ps-dev          # Show running containers
make logs-dev        # Tail last 50 lines from every service
make logs-odoo       # Odoo logs only
make logs-llm        # LLM service logs (model loading progress)

make up-llm-no-cache         # Rebuild LLM service image
make up-translation-no-cache # Rebuild translation service image
make up-audio-no-cache       # Rebuild audio service image

make load-backup FILE=your_backup.dump  # Restore Postgres from pg_dump

12. Module Install Order

Custom Odoo modules must be installed in dependency order:

language_security
    └── language_core
            ├── language_words
            │       ├── language_translation
            │       ├── language_enrichment
            │       ├── language_audio
            │       └── language_anki_jobs
            ├── language_chat
            ├── language_dashboard
            ├── language_pvp
            ├── language_learning   ← SRS, XP, leaderboard, gamification, shop
            └── language_portal     ← all portal views, translator, roleplay, grammar, library
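
A valid install order can be derived mechanically from the tree with the standard-library `graphlib` module. The dependency map below is transcribed from the tree above and may not match every `depends` entry in the actual module manifests.

```python
from graphlib import TopologicalSorter

# module -> set of modules it depends on (parents in the tree above)
DEPENDS = {
    "language_security": set(),
    "language_core": {"language_security"},
    "language_words": {"language_core"},
    "language_translation": {"language_words"},
    "language_enrichment": {"language_words"},
    "language_audio": {"language_words"},
    "language_anki_jobs": {"language_words"},
    "language_chat": {"language_core"},
    "language_dashboard": {"language_core"},
    "language_pvp": {"language_core"},
    "language_learning": {"language_core"},
    "language_portal": {"language_core"},
}

# static_order() yields every module after all of its dependencies.
order = list(TopologicalSorter(DEPENDS).static_order())
print(",".join(order))
```

The printed list always starts with language_security; the relative order of siblings is not significant.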

OCA addons (present in src/addons/, must be explicitly installed):

base_search_fuzzy — fuzzy vocabulary search via pg_trgm
web_notify — browser push notifications
password_security — password strength enforcement
website_require_login — redirect unauthenticated visitors
website_menu_by_user_status — show/hide nav items by auth state

13. Environment Variables

Key variables in .env (see env.example for the full list):

Variable	Default	Description
`POSTGRES_DB`	`lexora`	Odoo database name
`POSTGRES_USER`	`odoo`	DB user
`POSTGRES_PASSWORD`	(required)	DB password
`RABBITMQ_USER`	`guest`	RabbitMQ user
`RABBITMQ_PASS`	`guest`	RabbitMQ password
`TRANSLATE_PROVIDER`	`google`	`google` or `mymemory`
`TRANSLATE_TIMEOUT_SECONDS`	`10`	Per-request timeout
`TRANSLATE_FALLBACK_PROVIDER`	`mymemory`	Auto-engaged on primary error
`LLM_MODEL_REPO`	`Qwen/Qwen2.5-1.5B-Instruct-GGUF`	HuggingFace model repo
`LLM_MODEL_FILENAME`	`qwen2.5-1.5b-instruct-q4_k_m.gguf`	GGUF filename
`LLM_N_CTX`	`2048`	LLM context window
`LLM_AUTO_DOWNLOAD`	`1`	`0` to disable auto-download (air-gapped)
`TTS_ENGINE`	`edge-tts`	`edge-tts` or `espeak-ng`
`WHISPER_MODEL`	`base`	`base`, `small`, `medium`
`AUDIO_TRANSCRIPTION_ENABLED`	`1`	Enable STT transcription
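
A service might read these variables along the following lines. This is a sketch: the `Settings` dataclass and `load_settings` function are hypothetical, the defaults are copied from the table above, and treating a missing `POSTGRES_PASSWORD` as a startup error is an assumption based on its "(required)" marker.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    postgres_password: str
    translate_provider: str
    translate_fallback_provider: str
    translate_timeout_seconds: int
    llm_n_ctx: int
    llm_auto_download: bool
    tts_engine: str
    whisper_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping, applying table defaults."""
    password = env.get("POSTGRES_PASSWORD")
    if not password:
        raise RuntimeError("POSTGRES_PASSWORD is required")
    return Settings(
        postgres_password=password,
        translate_provider=env.get("TRANSLATE_PROVIDER", "google"),
        translate_fallback_provider=env.get("TRANSLATE_FALLBACK_PROVIDER", "mymemory"),
        translate_timeout_seconds=int(env.get("TRANSLATE_TIMEOUT_SECONDS", "10")),
        llm_n_ctx=int(env.get("LLM_N_CTX", "2048")),
        llm_auto_download=env.get("LLM_AUTO_DOWNLOAD", "1") == "1",
        tts_engine=env.get("TTS_ENGINE", "edge-tts"),
        whisper_model=env.get("WHISPER_MODEL", "base"),
    )


settings = load_settings({"POSTGRES_PASSWORD": "change-me", "LLM_AUTO_DOWNLOAD": "0"})
```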

14. Implementation Status

Milestone	Status	Description
M0	✅ Complete	Docker Compose stack, all services boot
M1	✅ Complete	11 Odoo modules scaffold, auth groups, auto-assignment
M2	✅ Complete	Vocabulary CRUD, dedup, language detection, sharing
M3	✅ Complete	Translation service, RabbitMQ events, portal display
M4	✅ Complete	LLM enrichment service, portal enrich button
M4b	✅ Complete	Real CPU-only LLM (Qwen2.5-1.5B GGUF via llama-cpp)
M4c	✅ Complete	Translation pivot to deep_translator; LLM restricted to enrichment
M5	✅ Complete	Anki .apkg + .txt import, Zstd support, audio extraction, import log
M6	✅ Complete	Audio recording + edge-tts TTS + Whisper STT
M7	✅ Complete	Posts, articles, comments, @mentions, copy-to-list
M8	✅ Complete	Public channels, private DMs, save-from-chat
M9	✅ Complete	SM-2 spaced repetition, /my/practice, SRS backend views
M10	✅ Complete	PvP duels, Lexora Bot, XP system, personal dashboard
M11	✅ Complete	XP Shop (Streak Freeze, Profile Frame, Double XP Booster)
M12	✅ Complete	Gold Vocabulary (3,184 words), Grammar Encyclopedia (6 sections)
M13	✅ Complete	PDF export suite (vocabulary, gold vocab by CEFR, grammar)
M14	✅ Complete	Premium dark UI, glassmorphism, Avantgarde Systems branding
M15	✅ Complete	AI Translator (/translator), sync translation API
M16	✅ Complete	Proprietary license, professional README
M17	✅ Complete	AI Roleplay (6 scenarios, LLM native speaker, grammar corrections)
M18	✅ Complete	Grammar Pro cloze tests (110 exercises, EN + EL, CEFR filters)
M18.5	✅ Complete	Header dropdown redesign (Practice / Library / Tools)
M19	✅ Complete	Idioms Hub (100+ phrasal verbs + idioms, flip-card UI)
M20	✅ Complete	Survival Phrasebook (6 scenarios, 3 languages, copy-to-roleplay)
M21	✅ Complete	Sentence Builder word-ordering game
M22	✅ Complete	Chrome Extension scaffold, /lexora_api/* Odoo endpoints
M23	✅ Complete	Context menu "Add to Lexora", surrounding sentence capture, toast
M24	✅ Complete	YouTube clickable subtitles, global Quick Look overlay (Shadow DOM)
M25	✅ Complete	New Tab vocabulary card, live clock, animated dark gradient
M26	⏸ Postponed	AI Helpdesk RAG — requires ≥16 GiB RAM; preserved on `m26_ai_helpdesk`
M27	✅ Complete	Known vocabulary highlighted on any webpage; SRS-aware tooltip with simultaneous 🇺🇦/🇬🇷 translations; 15-min local cache; MutationObserver re-scan
M28	✅ Complete	"Explain Grammar" in Quick Look + YouTube overlays; Qwen 1.5B via Odoo proxy; draggable scrollable overlays
M29	✅ Complete	Polish (`pl` / 🇵🇱) as a first-class language across DB, services, extension, portal; 1055 entries backfilled; canonical `LANGUAGE_SELECTION` import enforced (ADR-029)
M30	✅ Complete	AI Speaking Coach — `/my/speaking` portal: browser-mic recording → Faster-Whisper sync transcription → Qwen2.5-1.5B feedback (corrections / synonyms / improved version). 90 s soft cap; 4-language support; sessions persisted in `language.speaking.session` (ADR-030)
M31	✅ Complete	Lexora Writer — floating "L" FAB on every focused `<textarea>` / `[contenteditable]`; sends field text to `/analyze-writing`; React-compatible Apply-to-text via the native HTMLTextAreaElement setter + `InputEvent`; strict eligibility skips passwords / search / code editors / login forms; Options-page toggle; server-side safety net guarantees every change is documented (ADR-031)
M32	✅ Complete	Slang & Idiom Explainer — "💡 Explain Slang/Idiom" button alongside the M28 grammar button in Quick Look + YouTube overlays; `/explain-slang` returns five-key JSON (kind / figurative / literal / example / confidence); dual language clamp (explanation in user's native language, example in source); honest UI for `kind:'literal'` and `confidence:'low'` branches; native-language picker in Options (ADR-031)
M33	✅ Complete	Webpage Shadowing — "🎤 Practice Pronunciation" button in Quick Look + YouTube overlays. ▶ Play Original streams Edge TTS via `/tts-sync`; click-to-toggle Start/Stop Recording captures voice on a `chrome.offscreen` document (mic permission once per extension); `/transcribe-sync` → `/evaluate-pronunciation` → score badge (green/amber/red) + per-word red-strikethrough/amber-wavy-underline annotation + localised feedback. Deterministic Python word-diff is the source of truth for the structured fields; LLM only writes feedback. Click-to-toggle UX + Options-page mic-grant button + no-persistence default (ADR-032)
M34	✅ Complete	YouTube Vocab Radar — main-world script injection patches `XMLHttpRequest.prototype` + `window.fetch` to sniff `/api/timedtext` responses (JSON3 / SRV3 / SRV1 parsers; idempotent guard; transparent to the page via `response.clone()`). Content script builds a longest-match sliding-window index from new `GET /lexora_api/my_vocab` (cached 15 min), binary-searches `<video>.timeupdate` for upcoming hits, and pauses 4 s before a known word with a glassmorphism Shadow-DOM card (teal+amber palette; multi-language translation rows; matched word highlighted in cue). Footer: ⏪ Rewind 5 s & Play / ▶ Continue / 🔕 Skip this word / ✖ Disable for this video. Cooldown timer (default 120 s) starts at overlay close, not fire — gated by `_overlayOpen` flag; `_lastFiredAt` sentinel `-Infinity` so first fire isn't gated. Three Options-page controls + per-tab skip set + per-video kill switch. No persistence by default (ADR-033)
M35	✅ Complete	Multi-word YouTube Subtitle Selection — Ctrl/⌘-Click multi-select on subtitle spans, finalised on the LAST Ctrl/Meta keyup (multi-key safe via post-event `e.ctrlKey \|\| e.metaKey` check). Selected spans pick up a stronger-than-hover `.lx-multi-selected` indigo highlight; toggle semantics on re-Ctrl-click allow undo without releasing the modifier. Buffer holds spans in click order (NOT spatial order — out-of-order Ctrl-clicks produce the click-ordered phrase). Finalisation concatenates via `join(' ')`, runs through `_normalisePhrase`, and dispatches to the same `_openLookupOverlay(phrase, 'phrase')` pipeline as M24's single-word click — so every downstream Quick Look feature (Add to Vocabulary, M28 Grammar, M32 Slang/Idiom, M33 Shadowing) inherits phrase support unchanged. Three escape hatches: plain click anywhere, Escape, `yt-navigate-finish` — all clear the buffer. Strategy A (native browser selection via `user-select: text` override) was implemented and reverted after failing browser smoke — YT re-applies `user-select: none` via JS on every cue render and the `selectstart` interception runs below the event-listener level (ADR-034)
M36	✅ Complete	Mobile PWA & Offline Sync — Lexora installs to the iPhone / Android home screen and grades SRS cards in airplane mode. Web App Manifest at `/lexora.webmanifest` + Service Worker served from an Odoo controller at `/sw.js` (root scope; `Cache-Control: no-cache` for update detection). IndexedDB via vendored `idb` 7.1.1 UMD (ISC licence, no CDN dependency) with two stores: `cards_to_review` (prefetched via `GET /lexora_api/offline_batch`, capped 200 cards / 30 days) and `sync_queue` (offline reviews, keyed by `crypto.randomUUID()` for idempotent replay). Mobile route `/my/practice/mobile` renders a touch-first 3D flip card with swipe gestures (60-px commit, 600 ms budget, vertical-scroll detection); two huge bottom buttons (Forgot grade 0 / Remembered grade 2 — simplified from desktop's 4-grade UI per ADR-035 § 35e). `POST /lexora_api/sync_offline` dedupes via `language.review.offline.log` on `UNIQUE(user_id, client_uuid)`. SW caching: 7 stable static-asset URLs precached (cache-first + SWR); HTML cached opportunistically (network-first-nav); `/lexora_api/*` always-network; everything else passthrough — sandbox routing matrix 27/27. User-controlled update banner (no `skipWaiting` for updates; `state.hadControllerAtBoot` snapshot suppresses first-install reload flash). Same-origin session-cookie auth; queue preserved on 401. 79 / 0 tests; 6 new offline-sync tests cover idempotency / foreign-user / grade-clamp / mixed-batch / clamping / translations (ADR-035)
M37	✅ Complete	Mobile PWA — Offline Dictionary tab. New `GET /lexora_api/offline_vocabulary` returns the user's full active vocabulary (default 2000 cap, 5000 max) ordered alphabetically; `auth='user'` + `Cache-Control: no-store`. Filter divergence from M34's `/my_vocab`: NO `pvp_eligible` constraint, so entries with pending translations still appear in the dictionary (rendered with a "translations pending" hint). IndexedDB schema v1 → v2 via additive `upgrade(db, oldVersion)` callback — new `vocabulary_cache` store sits alongside M36's two stores; cascading `if (oldVersion < N)` blocks mean M36 users get only the new store while fresh installs create all three. New DB methods `replaceVocabulary` + `getVocabulary` + `stats().vocabCount`. Native-app-style bottom navigation bar toggles between Practice (M36 swipe-card flow, unchanged) and Dictionary (new). Dictionary panel: sticky search bar (16-px font to prevent iOS focus-zoom) + glassmorphism row cards + iOS-style 3-px gradient indicator under the active tab. Search filter is O(n)-per-keystroke DOM-stable: pre-computed `_dictRows = [{row, haystack}]` array; toggles `display:none` rather than rebuilding the DOM (sub-millisecond on 2000 rows). Background `_prefetchVocabulary` fires on boot + reconnect; only re-renders if Dictionary tab is currently active — silent IDB update otherwise. SW VERSION bumped to `lexora-pwa-v2` (triggers M36-S5 update-banner flow in production for the first time); `LEXORA_API_RE` regex extended with `offline_vocabulary` as the third always-network alternative. 19/19 IDB sandbox + 12/12 SW routing assertions pass; 79/0 regression. Four sub-decisions locked in ADR-036 (separate route, additive IDB upgrade without data migration, DOM-stable filter, prefetch respects active-tab state)
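
The idempotent replay described in the M36 row, where offline reviews are deduplicated on UNIQUE(user_id, client_uuid), can be demonstrated with a self-contained sketch. SQLite stands in for Postgres here, and the table layout, column names beyond the two in the constraint, and the `sync_offline` function are assumptions for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE review_offline_log (
        user_id     INTEGER NOT NULL,
        client_uuid TEXT    NOT NULL,
        card_id     INTEGER NOT NULL,
        grade       INTEGER NOT NULL,
        UNIQUE (user_id, client_uuid)
    )
    """
)


def sync_offline(user_id: int, reviews: list) -> dict:
    """Apply each queued review once; replays of a known UUID are skipped."""
    applied = skipped = 0
    for review in reviews:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO review_offline_log VALUES (?, ?, ?, ?)",
            (user_id, review["client_uuid"], review["card_id"], review["grade"]),
        )
        if cursor.rowcount == 1:
            applied += 1      # first time this (user, uuid) pair is seen
        else:
            skipped += 1      # unique constraint hit: already applied
    conn.commit()
    return {"applied": applied, "skipped": skipped}


batch = [
    {"client_uuid": str(uuid.uuid4()), "card_id": 7, "grade": 2},
    {"client_uuid": str(uuid.uuid4()), "card_id": 9, "grade": 0},
]
first = sync_offline(1, batch)    # {"applied": 2, "skipped": 0}
replay = sync_offline(1, batch)   # {"applied": 0, "skipped": 2}
```

Because the client generates the UUID before going online, a retry after a dropped response is harmless: the second attempt hits the unique constraint and changes nothing.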

15. Roadmap

M26 — AI Helpdesk (Postponed): Full automated helpdesk with OdooBot replies generated by a local pgvector + Qwen2.5-1.5B RAG pipeline. Complete implementation exists on m26_ai_helpdesk branch. Blocked by server RAM constraints (requires ≥16 GiB).

Potential future milestones:

M37: PWA Push Notifications — Web Push subscriptions + an Odoo cron pinging due cards (deferred from M36 per ADR-035 revisit trigger)
M38: ELO rating system for PvP matchmaking
M39: Multi-language expansion (Spanish, German — Polish landed in M29)
M40: Collaborative vocabulary lists / class rooms
M41: Per-session pronunciation scoring on Speaking Coach (extends M30 with phoneme-level Whisper output)
M42: Touch-device long-press multi-select for M35 mobile users (ADR-034 revisit trigger)
M43: Netflix / Disney+ / Coursera adapters for the M34 radar (architecture is portable per ADR-033 revisit triggers)
M44: Opt-in radar history persistence — POST each YouTube fire to a new /lexora_api/radar_log endpoint for cross-device review continuity
M45: Offline vocabulary-add — queue + sync flow for new entries created without network (M36 ADR-035 revisit trigger)
M46: iOS share-target API — manifest.share_target so installed PWA appears in iOS Share Sheet (M36 revisit trigger)

16. License

This software is proprietary and confidential. No part of this codebase may be reproduced, distributed, or transmitted in any form or by any means without the prior written permission of the copyright holder.

For licensing inquiries: contact.yuriidorosh@gmail.com
