web-order/web-order-os

web-order-os

An LLM-as-kernel proof of concept. The model is the OS: a LARQL semantic graph is the filesystem, HTMX is the display pipeline, and every user event mutates the graph before the model re-renders the interface.

See project.md for the architecture rationale.

Requirements

  • Python 3.11+
  • Ollama running locally (default http://localhost:11434)
  • Model: ollama pull gemma4:e4b (or reuse an already-pulled gemma4:latest)

Install

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run

source .venv/bin/activate
uvicorn bridge.app:app --reload --port 8001

Open http://localhost:8001 — pick a preset seed (Blank / Notes / Trip Planner / Shell) or paste your own Turtle, then Boot. The model renders the interface from the resulting graph; each click or form submit mutates the graph and triggers a re-render.

If your pulled tag is gemma4:latest rather than gemma4:e4b, point the bridge at it:

OLLAMA_MODEL=gemma4:latest uvicorn bridge.app:app --reload --port 8001

How it works

User click / form submit
    │ HTMX hx-post
    ▼
Python bridge
    │  translator.ingest_event()  — writes LARQL triples
    │  graph.snapshot_turtle()    — serialises current state for prompt
    ▼
Ollama (gemma4:e4b) streams HTML
    │  renderer strips <|channel>thought ...<channel|> blocks on the fly
    │  graph.apply_larql()        — fenced ```larql``` blocks applied back
    ▼
HTMX swaps #os-root outerHTML → new interface

The graph is persisted to ./data/graph via pyoxigraph, so the session survives restarts.
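
The post-processing step in the diagram above can be sketched on a complete response as follows. This is a minimal, non-streaming illustration and not the code in bridge/renderer.py or bridge/graph.py; split_model_output is a hypothetical name, and the thinking delimiters are copied from the diagram as an assumption.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime so the fence text stays out of the source

# Delimiters as shown in the diagram; assumed, not verified against the model.
THOUGHT_RE = re.compile(r"<\|channel>thought.*?<channel\|>", re.DOTALL)
LARQL_RE = re.compile(FENCE + r"larql\s*\n(.*?)" + FENCE, re.DOTALL)


def split_model_output(text: str) -> tuple[str, list[str]]:
    """Return (html_fragment, larql_blocks) from one complete model response."""
    visible = THOUGHT_RE.sub("", text)               # drop thinking blocks
    blocks = [b.strip() for b in LARQL_RE.findall(visible)]
    html = LARQL_RE.sub("", visible).strip()         # what is left is the HTML
    return html, blocks
```

The real renderer strips thinking blocks while streaming, which needs a small state machine rather than a regex over the whole text; the sketch only shows which parts end up where.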

Layout

bridge/
  app.py            FastAPI routes: /, /seed, /event
  config.py         Env-driven config
  graph.py          pyoxigraph store, Turtle snapshot + mutation extractor
  translator.py     Event → LARQL triples (deterministic, fast path)
  renderer.py       Ollama streaming client + thinking-block stripper
templates/
  shell.html        Outer shell, seed picker, loads HTMX + /static/os.css
static/
  os.css            All styles. No inline CSS anywhere.
seeds/
  blank.ttl         Elicitation UI (model asks what to do)
  notes.ttl         Notes app that emerges from interaction
  trip_planner.ttl  Weekend trip planner
  shell.ttl         Shell-like REPL interface
data/               Runtime LARQL store (gitignored)
scripting/          Temporary / non-live helper scripts
documentation/      Design docs and plans

The contract the model must obey

Defined in bridge/renderer.py::SYSTEM_PROMPT_BASE:

  • Output a single <main id="os-root">...</main> HTML fragment.
  • Optionally append one fenced ```larql block of Turtle triples that record UI-state decisions (so the same graph re-renders the same interface next turn).
  • Interactive elements use HTMX:
    • hx-post="/event" hx-target="#os-root" hx-swap="outerHTML"
    • hx-vals='{"kind":"...","target":"...","value":"..."}'
    • Forms submit named inputs kind, target, value, intent.
  • No inline CSS, no inline JS. Only the utility classes the shell ships with: .card .row .col .btn .btn-primary .input .muted .stack .grid .pad .title .subtitle .pill.
  • British English; &amp; &mdash; &nbsp; where applicable.

If the model deviates, the whole contract is re-sent on the next turn — no server-side HTML rewriting.
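
A deviation check along these lines could decide when a response has broken the contract. It is a sketch under assumptions, not the project's code: check_contract and ContractChecker are hypothetical names, and only the root-element, inline-CSS and inline-JS rules are covered.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ContractChecker(HTMLParser):
    """Collects contract violations while parsing one fragment."""

    def __init__(self):
        super().__init__()
        self.violations = []
        self.roots = []   # (tag, id) of every top-level element
        self.depth = 0    # assumes reasonably balanced tags

    def handle_starttag(self, tag, attrs):
        names = dict(attrs)
        if self.depth == 0:
            self.roots.append((tag, names.get("id")))
        if tag in ("script", "style"):
            self.violations.append(f"<{tag}> element")
        for name in names:
            if name == "style":
                self.violations.append(f"inline CSS on <{tag}>")
            elif name.startswith("on"):
                self.violations.append(f"inline JS ({name}) on <{tag}>")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth = max(0, self.depth - 1)


def check_contract(fragment: str) -> list[str]:
    checker = ContractChecker()
    checker.feed(fragment)
    checker.close()
    if checker.roots != [("main", "os-root")]:
        checker.violations.append('expected a single <main id="os-root"> root')
    return checker.violations
```

An empty list means the fragment passed these checks; anything else could be logged before the contract is re-sent on the next turn.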

Config

All environment variables are optional:

Variable          Default                 Purpose
OLLAMA_URL        http://localhost:11434  Ollama HTTP endpoint
OLLAMA_MODEL      gemma4:e4b              Renderer model tag
LARQL_STORE_PATH  ./data/graph            Oxigraph store directory
THINKING          0                       1 prepends <|think|> to the system prompt to enable Gemma 4 reasoning mode
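
Env-driven config of this shape might be read as shown below. The variable names and defaults come from the table; the Config dataclass and load_config are hypothetical and need not match bridge/config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    ollama_url: str
    ollama_model: str
    store_path: str
    thinking: bool


def load_config(env=None) -> Config:
    """Build a Config from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    return Config(
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "gemma4:e4b"),
        store_path=env.get("LARQL_STORE_PATH", "./data/graph"),
        thinking=env.get("THINKING", "0") == "1",  # only the string "1" enables it
    )
```

Passing a plain dict instead of os.environ keeps the loader easy to test.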

Sampling

Per the Gemma 4 model card, the bridge uses temperature=1.0, top_p=0.95, top_k=64. Override in bridge/renderer.py if you need deterministic runs.
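
For reference, Ollama accepts sampling parameters in an options object on the request body. The sketch below assumes the /api/generate request shape; build_generate_payload is a hypothetical helper, and the bridge may call a different endpoint.

```python
# Values quoted from the paragraph above.
GEMMA4_SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 64}


def build_generate_payload(model: str, prompt: str, deterministic: bool = False, seed: int = 0) -> dict:
    """Build a request body for Ollama; nothing is sent here."""
    options = dict(GEMMA4_SAMPLING)
    if deterministic:
        # Temperature 0 plus a fixed seed is the usual way to get repeatable output.
        options.update({"temperature": 0.0, "seed": seed})
    return {"model": model, "prompt": prompt, "stream": True, "options": options}
```

Repeatable output makes regression checks on rendered HTML practical, at the cost of departing from the settings the model card recommends.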

Observed latency (cold start, Apple Silicon)

  • First render: ~25s (model warm-up dominates).
  • Subsequent re-renders: ~10–15s streamed.
  • The HTMX UI begins swapping as soon as the first non-thinking token arrives, so perceived latency is shorter than wall time.

Known limitations in the MVP

  • Single session, single model, no auth, no CSRF.
  • Event translation is rule-based, not LLM-driven — rich free-text intent extraction is a future upgrade.
  • Graph snapshot for the prompt is capped at 400 quads; no retrieval/summarisation yet.
  • No request cancellation on new input mid-stream; the renderer module supports it, the HTMX frontend doesn't trigger it yet.
  • No KV-cache slot pinning; every turn re-embeds the full Turtle snapshot. Migration path: swap Ollama for llama-server and pin a slot per session.
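
The missing cancellation could be wired on the server side roughly like this: a new event cancels whatever render is still in flight. RenderSlot, fake_render and demo are hypothetical, and fake_render merely stands in for the streaming renderer.

```python
import asyncio


class RenderSlot:
    """Holds at most one in-flight render task (hypothetical helper)."""

    def __init__(self):
        self._task = None

    def submit(self, coro):
        # Must be called from inside a running event loop.
        if self._task is not None and not self._task.done():
            self._task.cancel()          # abandon the stale render
        self._task = asyncio.create_task(coro)
        return self._task


async def fake_render(name: str, delay: float) -> str:
    await asyncio.sleep(delay)           # stands in for streaming from the model
    return name


async def demo():
    slot = RenderSlot()
    first = slot.submit(fake_render("first", 1.0))
    await asyncio.sleep(0)               # let the first render start
    second = slot.submit(fake_render("second", 0.01))
    result = await second
    return first.cancelled(), result
```

Running asyncio.run(demo()) shows the first task cancelled and the second completing. The frontend would still need to send the new event while a stream is open for this to have any effect.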

Security posture for the MVP

Intentionally local-only. Do not expose this to the internet as-is:

  • No CSRF tokens on the POST /event and POST /seed endpoints.
  • No CORS policy.
  • No authentication.
  • The model can emit arbitrary HTML into the user's browser — for a hostile-user scenario, a sanitiser/allow-list pass over the streamed fragment is required.
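
The allow-list pass mentioned in the last bullet might look like the sketch below, built on the standard-library HTML parser. The tag and attribute sets are assumptions chosen to fit the contract, sanitise is a hypothetical name, and a maintained sanitiser library would be the safer choice for real exposure.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"main", "div", "section", "p", "span", "h1", "h2", "h3", "ul", "ol",
                "li", "form", "label", "input", "textarea", "button", "br", "hr"}
ALLOWED_ATTRS = {"id", "class", "name", "type", "value", "placeholder", "for",
                 "hx-post", "hx-target", "hx-swap", "hx-vals"}
VOID_TAGS = {"input", "br", "hr"}
DROP_WITH_CONTENT = {"script", "style"}


class AllowListSanitiser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip += 1
            return
        if self._skip or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS:
                continue                     # drops style, on*, src, href, ...
            if name == "hx-post" and value != "/event":
                continue                     # only the bridge's own endpoint
            kept.append(f' {name}="{escape(value or "", quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip = max(0, self._skip - 1)
            return
        if self._skip or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip:
            self.out.append(escape(data, quote=False))


def sanitise(fragment: str) -> str:
    parser = AllowListSanitiser()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

Disallowed elements are dropped while their text content is kept, except for script and style, whose content is discarded as well. Applying this to a streamed fragment would mean buffering until a tag is complete, which the sketch does not attempt.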

Reset the session

rm -rf data/graph

File conventions for this project

  • snake_case for file and folder names.
  • All styles in static/os.css. No inline CSS or JS anywhere.
  • Temporary helpers live in scripting/. Docs and plans live in documentation/.

Licence

Copyright © 2026 Ilya Titov / Web Order Ltd.

Licensed under the GNU Affero General Public License v3.0 or later. See LICENSE and NOTICE. If you run a modified version of this program as a network service, AGPLv3 §13 obliges you to offer the corresponding source to users interacting with it over the network.
