3 changes: 3 additions & 0 deletions .gitignore
@@ -4,6 +4,9 @@ venv/
ENV/
.env

# Bun / Node
node_modules/

# Python
__pycache__/
*.py[cod]
73 changes: 30 additions & 43 deletions README.md
@@ -1,10 +1,16 @@
# weft 🪢

A vim-like terminal reader to chat with your books
A vim-like, AI-native terminal reader for books and documents.

Weft starts with EPUBs today: it converts chapters into Markdown-like text, renders them in a fast Bun-powered terminal reader, and lets you navigate with keyboard-first controls.

The next direction is bigger: EPUB/PDF/Markdown in → snappy Markdown reading surface → durable highlights/comments → AI tools that understand the book, your current location, and where they looked.

<https://x.com/dpunjabi/status/1854361314040446995>

## Features
See [`docs/WEFT_REBOOT.md`](docs/WEFT_REBOOT.md) for the reboot plan.

## Current features

### Vim-like navigation

@@ -13,66 +19,47 @@ A vim-like terminal reader to chat with your books
- Jump to start/end: `g`/`G`
- See table of contents: `t`

### Chat with your books

- `a` - Chat with your current text
- `s` - Generate summary
- `r` - Listen text
- `>` - Listen to the compass
### AI-native direction

Uses [LLM](https://github.com/simonw/llm) to interface with OpenAI, Anthropic, and other providers. You can also install [plugins](https://llm.datasette.io/en/stable/other-models.html) to run local models on your machine.
The original Python prototype can chat, summarize, and read aloud. The Bun reboot is rebuilding that on top of a stronger document model first, so AI can operate over chapters, blocks, source spans, annotations, and search results instead of only the current page.

## Getting started

Clone this repo and setup & activate venv using either [uv](https://github.com/astral-sh/uv) (recommended)
## Reboot direction

```bash
uv venv
source .venv/bin/activate
```
Weft should stay small and sharp, but grow a real document spine:

Or, standard Python tools:
- **Normalized document model** — chapters, sections, blocks, and source spans instead of raw strings
- **Stable annotations** — highlights/comments in a sidecar file that can later export to Markdown
- **Reader-native AI tools** — `toc`, `current_location`, `get_section`, `search_text`, and eventually `repl_exec` over book blocks
- **Visible AI navigation** — show the reader what the model inspected, inspired by `recrsv`'s long-document exploration
- **Recrsv-style exploration rail** — web preview includes `toc`, `search_text`, and `context_get` slices so you can watch document tools move through the book
- **Global reading tape** — `d/u` turns pages across the book, `h/l` jumps sections, and `j/k` moves through actual passages/blocks
- **Minimal reading tracker** — book / section / page progress plus quiet estimated time remaining

```bash
python3 -m pip install virtualenv
python3 -m virtualenv .venv
source .venv/bin/activate
```
## Getting started

Install dependencies with:
Install dependencies with Bun:

```bash
uv pip install -r requirements.txt # if using `uv` - faster!
# or
pip install -r requirements.txt
bun install
```

Bring your keys from OpenAI (default):
Open the modern Markdown reader preview:

```bash
llm keys set OPENAI_API_KEY
bun run web path/to/book.epub
# then open http://localhost:4173
```

Or use Anthropic's Claude:
You can also point Weft at Gutenberg HTML URLs. Source page anchors like `#Page_100` are preserved and used as source-page coordinates:

```bash
llm install llm-claude-3
llm keys set ANTHROPIC_API_KEY
llm models default claude-3-5-sonnet-latest
bun run web 'https://www.gutenberg.org/files/56852/56852-h/56852-h.htm#Page_100'
```

Or, install a local model and run it on your machine:
Or use the minimal terminal reader:

```bash
llm install llm-gpt4all
llm models list # shows a list of available models
llm -m orca-mini-3b-gguf2-q4_0 '3 names for a pet cow' # tests the orca model locally (and downloads it first if needed)
bun run read path/to/book.epub
```

## Try it!

Get a book from [Project Gutenberg](https://www.gutenberg.org/) and try it out:

```bash
uv run reader.py path/to/book.epub
```
For now, the original Python prototype remains in `reader.py` as a reference implementation for chat/TTS experiments. The reboot path is Bun + TypeScript under `src/`.
78 changes: 78 additions & 0 deletions bun.lock

180 changes: 180 additions & 0 deletions docs/WEFT_REBOOT.md
@@ -0,0 +1,180 @@
# Weft Reboot

For the fuller shaped plan, see [`docs/shaped/ai-native-reader-reboot.md`](shaped/ai-native-reader-reboot.md).

Weft should stay Weft: a vim-like reader for books and documents, with AI woven into the act of reading instead of bolted on as a chat box.

## North star

Weft is an AI-native reader for EPUBs, PDFs, and long Markdown documents.

It turns source files into a fast, navigable Markdown-like reading surface where readers can move with vim keys, leave durable annotations, and ask AI questions that understand the book, their current location, and their reading history.

## What Weft already has

The current public repo has the right seed:

- EPUB ingestion via `ebooklib`
- HTML-to-Markdown conversion via `html2text`
- terminal rendering via Rich Markdown
- vim-like navigation across sections/pages
- current-page AI chat and summarization through `llm`
- text-to-speech and an audio “compass” guide

The reboot keeps that spirit, but moves the forward path to **Bun + TypeScript** so the document pipeline, AI tools, and terminal UI can share one typed model.

## Reference points

### recrsv / RLM

`recrsv` proves the important AI pattern: do not paste a whole long document into the model. Give the model tools to explore it strategically.

Useful ideas to bring into Weft:

- `context_get(offset, limit)` as a primitive read tool
- `repl_exec(code)` as a computational exploration tool
- reading timeline / dive mode for reviewing where the AI looked
- minimap-style coverage of document exploration

For Weft, raw character offsets should evolve into reader-native anchors: chapters, sections, blocks, pages, and source spans.
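
One way to picture that evolution is a lookup from a raw character offset to the block that contains it. The following is a minimal sketch, not Weft's implementation: `SpanBlock`, `blockAtOffset`, and `blocksInRange` are hypothetical names, and it assumes blocks are sorted, non-overlapping, and use an exclusive `char_end`.

```typescript
// Hypothetical shape: a block reduced to its id and character range.
// Assumption: char_end is exclusive and blocks are sorted by char_start.
interface SpanBlock {
  id: string;
  char_start: number;
  char_end: number;
}

// Binary search for the block containing a raw character offset.
// Returns undefined when the offset falls outside every block.
function blockAtOffset(blocks: readonly SpanBlock[], offset: number): string | undefined {
  let lo = 0;
  let hi = blocks.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const block = blocks[mid]!;
    if (offset < block.char_start) hi = mid - 1;
    else if (offset >= block.char_end) lo = mid + 1;
    else return block.id;
  }
  return undefined;
}

// Translate an (offset, limit) read into the ids of every block it touches,
// so a raw slice can be reported back to the reader as block anchors.
function blocksInRange(blocks: readonly SpanBlock[], offset: number, limit: number): string[] {
  const end = offset + limit;
  return blocks
    .filter((b) => b.char_start < end && b.char_end > offset)
    .map((b) => b.id);
}
```

With a mapping like this, an offset-based read could still be surfaced to the reader in terms of blocks rather than character positions.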

### Roughdraft

Roughdraft is adjacent, not competitive. It is a local-first Markdown review app for collaborating with agents through comments, replies, and suggestions stored in Markdown with CriticMarkup.

Useful ideas to bring into Weft:

- annotations as portable text, not trapped app state
- comments/replies/suggestions as first-class collaboration objects
- CLI/agent-friendly workflows
- optional export to Roughdraft-flavored Markdown

Weft should not become a Markdown review app. It should be a reader for books/docs whose annotations can round-trip to Markdown when useful.

## Product pillars

1. **Snappy Markdown reading**
- EPUB/PDF/Markdown in
- normalized Markdown-like blocks out
- fast terminal-first rendering
- vim navigation by page, section, heading, search result, and mark

2. **Stable source mapping**
- every rendered block has a durable id
- every annotation points at a source span or block id
- PDF/EPUB quirks are hidden behind a common document model

3. **Annotations as knowledge**
- highlights, comments, replies, and AI-suggested notes
- stored in a sidecar file first
- exportable to Markdown / Roughdraft-flavored Markdown later

4. **AI as a reading companion**
- ask about current page, section, chapter, or whole book
- summarize since last mark
- explain selected passage
- find motifs, definitions, contradictions, references
- show where the AI looked, not just what it answered

## Proposed document model

```text
Document
id
title
authors[]
source_path
source_type: epub | pdf | markdown | text
sections[]

Section
id
title
level
parent_id?
block_ids[]
source_span

Block
id
section_id
kind: heading | paragraph | quote | list | code | table | image | page_break
markdown
plain_text
source_span
```

A source span is intentionally abstract:

```text
SourceSpan
source_path
href? # EPUB item path
page? # PDF page when available
char_start?
char_end?
selector? # future: text quote selector / CFI / PDF coordinates
```
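
Since the reboot targets Bun + TypeScript, the outline above could be written as types roughly like this. It is a sketch under assumptions: the field names follow the outline, but `WeftDocument`, `BlockStore`, `findSection`, and `sectionBlocks` are illustrative names, and keeping blocks in a separate map keyed by id is a guess, because the outline does not say where blocks are stored.

```typescript
// Hypothetical TypeScript rendering of the outlined model.
type SourceType = "epub" | "pdf" | "markdown" | "text";
type BlockKind =
  | "heading" | "paragraph" | "quote" | "list"
  | "code" | "table" | "image" | "page_break";

interface SourceSpan {
  source_path: string;
  href?: string;       // EPUB item path
  page?: number;       // PDF page when available
  char_start?: number;
  char_end?: number;
  selector?: string;   // placeholder for a future selector format
}

interface Block {
  id: string;
  section_id: string;
  kind: BlockKind;
  markdown: string;
  plain_text: string;
  source_span: SourceSpan;
}

interface Section {
  id: string;
  title: string;
  level: number;
  parent_id?: string;
  block_ids: string[];
  source_span: SourceSpan;
}

interface WeftDocument {
  id: string;
  title: string;
  authors: string[];
  source_path: string;
  source_type: SourceType;
  sections: Section[];
}

// Assumption: blocks live in a separate store keyed by block id.
type BlockStore = ReadonlyMap<string, Block>;

// Look up a section by id; undefined when the id is unknown.
function findSection(doc: WeftDocument, sectionId: string): Section | undefined {
  return doc.sections.find((s) => s.id === sectionId);
}

// Resolve a section's block_ids to blocks in reading order, skipping unknown ids.
function sectionBlocks(section: Section, store: BlockStore): Block[] {
  const out: Block[] = [];
  for (const id of section.block_ids) {
    const block = store.get(id);
    if (block) out.push(block);
  }
  return out;
}
```

Keeping `block_ids` on the section and the block bodies in a store means a renderer can page through ids cheaply and fetch text only for what is on screen.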

## AI tool surface

Start with safe reader tools before embeddings:

```text
current_location()
toc()
get_block(block_id)
get_section(section_id)
get_near(anchor, before, after)
search_text(query)
list_annotations(filter?)
```

Then add heavier tools:

```text
summarize_range(start_anchor, end_anchor)
repl_exec(code) # over normalized blocks, inspired by recrsv
semantic_search(query) # later, optional
```

The model should cite block ids / sections in answers so the reader can jump there.
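
As a rough illustration of two of the lighter tools, `search_text` and `get_near` could be plain functions over an ordered block list that return block ids for citation. This is a sketch only: `ReaderBlock`, `SearchHit`, the camel-cased function names, the `limit` parameter, and the snippet width are all assumptions, not a defined Weft API.

```typescript
// Hypothetical minimal block shape for the tool sketch.
interface ReaderBlock {
  id: string;
  section_id: string;
  plain_text: string;
}

// Each hit carries the ids the model would cite in its answer.
interface SearchHit {
  block_id: string;
  section_id: string;
  snippet: string;
}

// search_text: case-insensitive substring match, one hit per matching block.
// Note: lowercasing can shift indices for a few Unicode characters,
// which is acceptable for a sketch but not for exact highlighting.
function searchText(blocks: readonly ReaderBlock[], query: string, limit = 10): SearchHit[] {
  const needle = query.toLowerCase();
  if (needle === "") return [];
  const hits: SearchHit[] = [];
  for (const block of blocks) {
    const at = block.plain_text.toLowerCase().indexOf(needle);
    if (at === -1) continue;
    const start = Math.max(0, at - 30);
    hits.push({
      block_id: block.id,
      section_id: block.section_id,
      snippet: block.plain_text.slice(start, at + needle.length + 30),
    });
    if (hits.length >= limit) break;
  }
  return hits;
}

// get_near: the anchor block plus up to `before` and `after` neighbours,
// in reading order. Returns an empty list for an unknown anchor.
function getNear(
  blocks: readonly ReaderBlock[],
  anchor: string,
  before: number,
  after: number,
): ReaderBlock[] {
  const index = blocks.findIndex((b) => b.id === anchor);
  if (index === -1) return [];
  return blocks.slice(Math.max(0, index - before), index + after + 1);
}
```

Returning ids alongside text is what would let the reader jump from an answer to the cited block.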

## First build slice

Keep it terminal-first and incremental, using Bun + TypeScript.

1. Extract EPUB ingestion into a document model module.
2. Render from blocks instead of raw section strings.
3. Add block ids and section ids to the reader state.
4. Add a sidecar annotations file:

```text
book.epub
book.weft.json
```

5. Add minimal commands:

```text
m mark current block
c comment on current block
n/N next/previous annotation
```

6. Upgrade AI context from current page text to current location + surrounding blocks + section metadata.
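
Steps 4 and 5 above can be sketched together. The plan does not define the sidecar schema, so everything here is hypothetical: the `Annotation` and `Sidecar` shapes, the `version` field, the id scheme, and the function names. The sketch keeps file I/O out and only shows the path convention, adding a mark, and the `n`/`N` jump.

```typescript
// Hypothetical sidecar schema; the plan only names the file, not its fields.
interface Annotation {
  id: string;
  block_id: string;
  kind: "mark" | "comment";
  body?: string;
  created_at: string; // ISO 8601 timestamp
}

interface Sidecar {
  version: 1;
  source_path: string;
  annotations: Annotation[];
}

// book.epub -> book.weft.json (replaces only the final extension).
function sidecarPathFor(sourcePath: string): string {
  return sourcePath.replace(/\.[^./\\]+$/, "") + ".weft.json";
}

// `m` / `c`: return a new sidecar with one more annotation on a block.
// A body makes it a comment; no body makes it a mark.
function annotate(sidecar: Sidecar, blockId: string, createdAt: string, body?: string): Sidecar {
  const annotation: Annotation = {
    id: `a${sidecar.annotations.length + 1}`,
    block_id: blockId,
    kind: body === undefined ? "mark" : "comment",
    created_at: createdAt,
    ...(body === undefined ? {} : { body }),
  };
  return { ...sidecar, annotations: [...sidecar.annotations, annotation] };
}

// `n` (direction 1) / `N` (direction -1): the nearest annotated block
// after or before the current one in reading order, if any.
function nextAnnotatedBlock(
  blockOrder: readonly string[],
  annotations: readonly Annotation[],
  current: string,
  direction: 1 | -1,
): string | undefined {
  const annotated = new Set(annotations.map((a) => a.block_id));
  const start = blockOrder.indexOf(current);
  if (start === -1) return undefined;
  for (let i = start + direction; i >= 0 && i < blockOrder.length; i += direction) {
    const id = blockOrder[i]!;
    if (annotated.has(id)) return id;
  }
  return undefined;
}
```

Because the sidecar is plain data, `JSON.stringify` and `JSON.parse` are enough to persist and reload it next to the book.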

## Non-goals for the reboot slice

- no web app yet
- no cloud sync
- no account system
- no embeddings until structure/source maps work
- no replacement of the terminal reader core
- no full Roughdraft clone

## Naming

This remains Weft.

The concept is not “another AI document app.” It is a woven reading surface: source file, rendered text, reader marks, and AI exploration all tied together by durable anchors.