mem

A local, searchable memory of your browser tabs, indexed by meaning instead of URLs.

Not bookmarks. Not notes. A time-indexed knowledge exhaust of what you read.

What it does

You press a shortcut. The page you're reading gets captured -- content extracted, stripped of navigation and ads, chunked, embedded as vectors, and stored in a local SQLite database. Later, you search by idea rather than by URL or title, and the system finds what you were reading even if you only vaguely remember the concept.

Zero-friction capture, high-quality recall. No cloud, no sync, no auth. Everything runs on your machine.

Capture (`Cmd+Shift+U`)

Press the shortcut on any page. A small overlay appears asking "Why are you saving this?" -- type a one-line note and press Enter, or press Escape to skip. The extension extracts the page content using Readability, sends it to the local API, and the backend chunks, embeds, and stores it. The whole flow takes under a second.

Search (`Cmd+Shift+M`)

Press the shortcut anywhere in the browser. An overlay appears with a single search bar. Type what you remember -- an idea, a problem, a concept -- and results appear ranked by semantic similarity. Arrow keys to navigate, Enter to open the original URL, Escape to close.

Architecture

Browser Extension (ClojureScript, Manifest V3)
   |
   | chrome.runtime.sendMessage
   v
Background Service Worker
   |
   | HTTP POST/GET to localhost
   v
Local API Server (Rust, Axum)  -->  127.0.0.1:7745
   |
   v
SQLite + FTS5 + sqlite-vec
   |
   +-- tab_artifacts      (one row per captured page)
   +-- artifact_chunks    (text split into ~500-word segments)
   +-- artifacts_fts      (FTS5 virtual table, auto-synced via triggers)
   +-- chunk_embeddings   (384-dim float vectors via sqlite-vec)

Everything stays local. The database lives at ~/.mem/mem.db.

How the search algorithm works

Search uses a hybrid approach that combines three signals: semantic similarity, keyword matching, and recency. The final score is a weighted blend that favors meaning over exact words.

Step 1: Embed the query

The query string is passed through the same embedding model used at capture time (all-MiniLM-L6-v2, 384 dimensions). This produces a single float vector representing the meaning of the query.

Step 2: Semantic path (weight: 70%)

The query vector is compared against all stored chunk embeddings using sqlite-vec's vector similarity search. This returns the top-K closest chunks by distance.
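
As a rough illustration of what this step computes, the sketch below does a brute-force nearest-neighbour scan in plain Rust. It is not how sqlite-vec is invoked; the function names `l2_distance` and `top_k` are hypothetical, and Euclidean distance is an assumption about the metric in use.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Return the `k` stored chunks closest to `query` as (chunk_id, distance),
/// nearest first. A linear scan stands in for the sqlite-vec query here.
fn top_k(query: &[f32], chunks: &[(i64, Vec<f32>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = chunks
        .iter()
        .map(|(id, v)| (*id, l2_distance(query, v)))
        .collect();
    // Smaller distance means a closer match.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

In the real system the scan is delegated to the vector index; the sketch only shows the shape of the result (chunk IDs paired with distances, nearest first).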

Results are grouped by artifact (a single page may have multiple chunks). For each artifact, the best distance is kept and up to 2 chunk snippets are collected. Distances are normalized to a [0, 1] similarity score:

semantic_score = 1.0 - (distance / max_distance)
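
A minimal sketch of the grouping and normalization described above. The types `ChunkHit` and `SemanticResult` are hypothetical, and two details are assumptions: `max_distance` is taken to be the largest distance among the returned hits, and hits are assumed to arrive nearest first so the first two snippets per artifact are the closest ones.

```rust
use std::collections::HashMap;

/// One hit from the vector search (hypothetical shape).
struct ChunkHit {
    artifact_id: i64,
    distance: f32,
    text: String,
}

/// Per-artifact result of the semantic path.
struct SemanticResult {
    best_distance: f32,
    snippets: Vec<String>,
    semantic_score: f32,
}

fn group_and_normalize(hits: &[ChunkHit]) -> HashMap<i64, SemanticResult> {
    let mut grouped: HashMap<i64, SemanticResult> = HashMap::new();
    for hit in hits {
        let entry = grouped.entry(hit.artifact_id).or_insert(SemanticResult {
            best_distance: f32::INFINITY,
            snippets: Vec::new(),
            semantic_score: 0.0,
        });
        // Keep the smallest distance seen for this artifact.
        entry.best_distance = entry.best_distance.min(hit.distance);
        // Collect at most two snippets per artifact.
        if entry.snippets.len() < 2 {
            entry.snippets.push(hit.text.clone());
        }
    }
    // Assumption: max_distance is the largest distance in the hit list.
    let max_distance = hits.iter().map(|h| h.distance).fold(0.0_f32, f32::max);
    for result in grouped.values_mut() {
        result.semantic_score = if max_distance > 0.0 {
            1.0 - result.best_distance / max_distance
        } else {
            1.0 // every hit is an exact match
        };
    }
    grouped
}
```

With this normalization the artifact that owns the farthest chunk only scores 0 if that chunk is also its best one; artifacts with a closer chunk keep a positive score.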

Step 3: FTS path (weight: 20%)

The raw query string is also run through SQLite's FTS5 full-text search with BM25 ranking. This catches exact keyword matches that the embedding model might not surface -- abbreviations, proper nouns, code identifiers.

FTS scores are normalized against the maximum score in the result set:

fts_score = bm25_rank / max_bm25_rank
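
One detail worth keeping in mind when implementing this: FTS5's `bm25()` returns numerically smaller (typically negative) values for better matches. The sketch below therefore flips the sign before dividing by the maximum. The function name `normalize_fts` and the sign-flip step are assumptions about how the normalization could be done, not a description of the project's code.

```rust
/// Normalize raw FTS5 bm25() values into [0, 1], where 1.0 is the best match.
/// Input and output are (artifact_id, score) pairs in the same order.
fn normalize_fts(raw: &[(i64, f64)]) -> Vec<(i64, f64)> {
    // Flip the sign so that larger means better; clamp stray positives to 0.
    let flipped: Vec<(i64, f64)> = raw
        .iter()
        .map(|(id, r)| (*id, (-*r).max(0.0)))
        .collect();
    let max = flipped.iter().map(|(_, s)| *s).fold(0.0_f64, f64::max);
    flipped
        .into_iter()
        .map(|(id, s)| (id, if max > 0.0 { s / max } else { 0.0 }))
        .collect()
}
```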

Step 4: Recency path (weight: 10%)

Each candidate gets a recency boost based on when it was captured. The decay function is:

recency = 1.0 / (1.0 + days_ago * 0.01)

This gives a gentle preference to recent pages without burying older ones. A page captured yesterday scores ~0.99, a page from a month ago (30 days) scores ~0.77, and a page from a year ago scores ~0.22.
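
The decay function is small enough to write out directly. In this sketch the helper `days_since` and the use of Unix-second timestamps are assumptions; the storage format of `created_at` is not specified here.

```rust
const SECONDS_PER_DAY: f64 = 86_400.0;

/// Age of a capture in (fractional) days, given Unix timestamps in seconds.
/// Captures dated in the future are treated as zero days old.
fn days_since(captured_at: i64, now: i64) -> f64 {
    ((now - captured_at).max(0) as f64) / SECONDS_PER_DAY
}

/// Recency boost: 1.0 for a page captured just now, decaying hyperbolically.
fn recency(days_ago: f64) -> f64 {
    1.0 / (1.0 + days_ago * 0.01)
}
```

Because the decay is hyperbolic rather than exponential, the score halves at 100 days and then falls off slowly, so old pages are never pushed all the way to zero.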

Step 5: Merge and rank

Candidates from the semantic and FTS paths are merged into a single map keyed by artifact ID, and each candidate also receives its recency boost. The final score for each artifact is:

score = 0.7 * semantic_score + 0.2 * fts_score + 0.1 * recency

If an artifact appears in both the semantic and FTS results, both contributions are added. Results are sorted by descending score and truncated to the requested limit.
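
The merge can be sketched as follows. The names `Candidate` and `merge_and_rank` are hypothetical, and two behaviours are assumptions: an artifact missing from one path scores 0 for that path, and ties are broken by artifact ID to keep the output deterministic.

```rust
use std::collections::HashMap;

/// Per-artifact scores collected from the three signals.
#[derive(Debug, Default, Clone, PartialEq)]
struct Candidate {
    semantic_score: f64,
    fts_score: f64,
    recency: f64,
}

impl Candidate {
    /// Weighted blend: 70% semantic, 20% FTS, 10% recency.
    fn score(&self) -> f64 {
        0.7 * self.semantic_score + 0.2 * self.fts_score + 0.1 * self.recency
    }
}

/// Merge per-path scores keyed by artifact ID, then return the top `limit`
/// artifacts as (artifact_id, score), best first.
fn merge_and_rank(
    semantic: &HashMap<i64, f64>,
    fts: &HashMap<i64, f64>,
    recency: &HashMap<i64, f64>,
    limit: usize,
) -> Vec<(i64, f64)> {
    let mut merged: HashMap<i64, Candidate> = HashMap::new();
    for (id, s) in semantic {
        merged.entry(*id).or_default().semantic_score = *s;
    }
    for (id, s) in fts {
        merged.entry(*id).or_default().fts_score = *s;
    }
    for (id, c) in merged.iter_mut() {
        c.recency = recency.get(id).copied().unwrap_or(0.0);
    }
    let mut ranked: Vec<(i64, f64)> =
        merged.iter().map(|(id, c)| (*id, c.score())).collect();
    // Descending by score; ascending by ID on ties.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked.truncate(limit);
    ranked
}
```

An artifact found only by keyword search can reach at most 0.2 + 0.1 = 0.3, so a strong semantic match will normally outrank it, which is the intended bias toward meaning over exact words.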

Why this matters

Your brain does not remember URLs or exact titles. It remembers ideas, problems, and contexts. Semantic embeddings align with human recall. The FTS fallback catches the cases where you do remember a specific term. Recency handles the "I just read something about this" scenario.

Embedding model

The primary embedder is fastembed-rs running the all-MiniLM-L6-v2 ONNX model locally. It produces 384-dimensional vectors and requires no API key or network access.

If fastembed fails to initialize (for example, because of a missing ONNX runtime or an unsupported platform) and the OPENAI_API_KEY environment variable is set, the system falls back to OpenAI's text-embedding-3-small API (1536 dimensions). This fallback is the only path that requires network access.
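
The selection logic amounts to a two-step decision, sketched below. `EmbedderChoice` and `choose_embedder` are hypothetical names, and the `Unavailable` outcome (neither backend usable) is an assumption, since the behaviour in that case is not described here.

```rust
/// Which embedding backend ends up being used.
#[derive(Debug, PartialEq)]
enum EmbedderChoice {
    /// Local fastembed model (all-MiniLM-L6-v2).
    Local { dims: usize },
    /// OpenAI text-embedding-3-small over the network.
    OpenAi { dims: usize },
    /// Assumption: no backend could be set up.
    Unavailable,
}

/// Prefer the local model; fall back to OpenAI only when a non-empty key exists.
fn choose_embedder(local_init_ok: bool, openai_api_key: Option<&str>) -> EmbedderChoice {
    if local_init_ok {
        return EmbedderChoice::Local { dims: 384 };
    }
    match openai_api_key {
        Some(key) if !key.trim().is_empty() => EmbedderChoice::OpenAi { dims: 1536 },
        _ => EmbedderChoice::Unavailable,
    }
}
```

A caller could pass the key with `std::env::var("OPENAI_API_KEY").ok().as_deref()`. Note that the two backends produce vectors of different sizes, so embeddings from one cannot be compared against embeddings from the other.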

Text chunking

Captured page content is split into chunks of approximately 500 words with a 50-word overlap between consecutive chunks. This overlap ensures that ideas spanning a chunk boundary are still captured in at least one chunk. Each chunk is embedded independently and stored alongside its parent artifact.
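
A minimal sketch of overlapping word-based chunking, assuming words are delimited by whitespace. The function name `chunk_words` is hypothetical; the values described above correspond to `chunk_words(text, 500, 50)`.

```rust
/// Split `text` into chunks of up to `chunk_size` words, where each chunk
/// repeats the last `overlap` words of the previous one.
fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With a 500-word window and a 50-word overlap the window advances 450 words at a time, so a 1,000-word page yields three chunks covering words 0-499, 450-949, and 900-999.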

Data model

tab_artifacts -- one row per captured page:

url (unique, upserted on re-capture)
title
content_text (full extracted text)
note (optional one-liner from the capture prompt)
created_at

artifact_chunks -- text segments for embedding:

artifact_id (foreign key)
chunk_index (ordering within the page)
chunk_text

chunk_embeddings -- vector storage via sqlite-vec:

chunk_id (matches artifact_chunks)
embedding (float[384])

artifacts_fts -- FTS5 virtual table over title + content_text, kept in sync via triggers on insert/update/delete.
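
For orientation, the three tables could be mirrored in Rust roughly as below. This is a sketch, not the project's actual types: the `id` fields, the integer timestamp, and the `ChunkEmbedding::new` validation helper are all assumptions.

```rust
/// Dimension of the local embedding model's output.
const EMBEDDING_DIMS: usize = 384;

/// One row of `tab_artifacts`.
#[derive(Debug, Clone, PartialEq)]
struct TabArtifact {
    id: i64,              // assumed primary key
    url: String,          // unique; re-capturing the same URL updates the row
    title: String,
    content_text: String, // full extracted text
    note: Option<String>, // optional one-liner from the capture prompt
    created_at: i64,      // assumed to be Unix seconds
}

/// One row of `artifact_chunks`.
#[derive(Debug, Clone, PartialEq)]
struct ArtifactChunk {
    id: i64,          // assumed primary key, referenced by chunk_embeddings
    artifact_id: i64, // foreign key into tab_artifacts
    chunk_index: u32, // ordering within the page
    chunk_text: String,
}

/// One row of `chunk_embeddings`.
#[derive(Debug, Clone, PartialEq)]
struct ChunkEmbedding {
    chunk_id: i64,
    embedding: Vec<f32>,
}

impl ChunkEmbedding {
    /// Reject vectors whose length does not match the float[384] column.
    fn new(chunk_id: i64, embedding: Vec<f32>) -> Result<Self, String> {
        if embedding.len() != EMBEDDING_DIMS {
            return Err(format!(
                "expected {} dimensions, got {}",
                EMBEDDING_DIMS,
                embedding.len()
            ));
        }
        Ok(Self { chunk_id, embedding })
    }
}
```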

Tech stack

Layer	Technology
Backend	Rust, Axum, rusqlite, sqlite-vec, fastembed-rs
Extension	ClojureScript, shadow-cljs, Manifest V3
Content extraction	@mozilla/readability
Storage	SQLite (WAL mode, FTS5, vec0)
Build	Cargo (Rust), shadow-cljs (ClojureScript), npm

Project structure

mem/
  crates/
    mem-core/          -- database, embedder, chunker, search algorithm
    mem-server/        -- Axum HTTP server (capture + search API, web UI)
  ui/
    src/dev/jotlabs/mem/
      extension/
        background.cljs      -- service worker (command dispatch, API proxy)
        capture.cljs         -- content script for capture overlay
        search_overlay.cljs  -- content script for search overlay
      web/
        app.cljs             -- standalone web search UI
    resources/
      extension/manifest.json
      web/index.html

Running

Start the backend:

cd mem
cargo run -p mem-server

The server binds to 127.0.0.1:7745. On first run it downloads the embedding model (~23MB).

Build the extension:

cd mem/ui
npm install
# build for one browser (run whichever applies)
npm run build:chrome:ext
npm run build:firefox:ext

Load the extension in Chrome: go to chrome://extensions, enable Developer mode, click "Load unpacked", and select mem/ui/dist/extension.

Environment variables

Variable	Purpose	Default
`MEM_DB_PATH`	Override database file location	`~/.mem/mem.db`
`MEM_WEB_DIR`	Override web UI assets directory	auto-detected
`OPENAI_API_KEY`	Enable OpenAI embedding fallback	not set
`RUST_LOG`	Control log verbosity (e.g. `debug`, `info`)	`info`
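
As an example of how an override with a default can be resolved, here is a small sketch for `MEM_DB_PATH`. The function name `resolve_db_path` is hypothetical, and treating an empty value as "not set" is an assumption.

```rust
use std::path::PathBuf;

/// Use the override when present and non-empty; otherwise fall back to
/// `<home_dir>/.mem/mem.db`.
fn resolve_db_path(override_path: Option<String>, home_dir: &str) -> PathBuf {
    match override_path {
        Some(p) if !p.is_empty() => PathBuf::from(p),
        _ => PathBuf::from(home_dir).join(".mem").join("mem.db"),
    }
}
```

A caller would typically pass `std::env::var("MEM_DB_PATH").ok()` as the first argument and the user's home directory as the second.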
