Skip to content

samikki/better-tekstitv

Repository files navigation

Parempi Teksti-tv

A personal news digest built on Yle Teksti-TV. Crawls teletext pages via the Yle JSON API, filters out stale content, and sends the collected text to an LLM for a tailored summary — in flowing prose, not bullet points.

The editorial tone, summary style, and reader personalisation are all configurable through the web UI or config files. The default prompt leads with interesting and meaningful stories rather than doom and gloom.

Content license: Yle Teksti-TV content is published under CC BY-SA 3.0. Attribution: © Yle.

Parempi Teksti-tv web interface

Requirements

  • Python 3.10+
  • macOS or Linux (Windows: use WSL)
  • API keys for Yle (free) and OpenAI (paid)

Installation

git clone https://github.com/samikki/better-tekstitv.git
cd better-tekstitv

That's it. The run.sh script handles everything else automatically on first launch:

  • Creates a Python virtual environment (venv/)
  • Installs all dependencies
  • Creates config.py from the template

Getting API keys

Service How to get Cost
Yle API Register at tunnus.yle.fi/api-avaimet Free
OpenAI Sign up at platform.openai.com/api-keys Pay-per-use

Usage

Web interface (recommended)

./run.sh --web

Opens a local web server at http://127.0.0.1:5834. On first launch, a setup wizard walks you through entering your API keys — no need to edit config files manually.

Features:

  • LLM-generated news summary with inline links to original Yle Teksti-TV pages
  • Kuuntele / Listen — audio summary via Finnish neural TTS (generated in parallel, non-blocking)
  • Päivitä / Refresh — re-fetch and re-summarise
  • ⚙ Settings — configure everything: API keys, crawling parameters, LLM prompts, UI language, and personalisation profiles
  • Teksti-TV inspired dark design with scanline overlay

CLI mode

./run.sh

Prints the summary to the terminal and saves it to news_output_summary.txt. You'll need to edit config.py with your API keys first (the script creates it from the template on first run).

Child-friendly mode

./run.sh -pg            # CLI
./run.sh --web -pg      # Web

Produces a calm, age-appropriate summary suitable for primary school children.

Configuration

All settings live in config.py (gitignored, never committed). Edit through the web settings page (⚙ icon) or directly in the file.

Setting Description Default
YLE_APP_ID Yle API app ID
YLE_APP_KEY Yle API app key
OPENAI_API_KEY OpenAI API key
OPENAI_MODEL OpenAI model gpt-5.4
ROOT_PAGES Page numbers to start crawling from [100, 102, 130, 160]
MAX_DEPTH Link levels to follow (0 = root only) 1
MAX_AGE_HOURS Skip pages older than this 24
REQUEST_DELAY Seconds between API requests 0.15
SUMMARY_PERSONA System prompt — reporter persona see template
SUMMARY_PROMPT User prompt ({news_text} placeholder) see template
PG_PERSONA System prompt for child-friendly mode see template
PG_PROMPT User prompt for child-friendly mode see template
TASTE_PROFILE_FILE Reader taste profile path (or None) taste_profile.md
LOCAL_PROFILE_FILE Local focus profile path (or None) local_profile.md
LANGUAGE UI language: "fi" or "en" "fi"
PORT Web server port 5834
OUTPUT_FILE Output path for raw text news_output.txt

UI language

The web interface supports Finnish (fi, default) and English (en). Change the language in Settings → Language or set LANGUAGE in config.py. All UI labels, buttons, and messages switch accordingly. The news summary language is controlled by the LLM prompts, not this setting.

To add a new language, add a translation dict to i18n.py.

Personalisation profiles

Two optional markdown files sharpen the summary for a specific reader. Both are gitignored.

taste_profile.md — who you are as a reader

Describes your intellectual interests, reading style, and what makes content worth your time.

Copy taste_profile.example.mdtaste_profile.md and rewrite in your own voice.

local_profile.md — local focus

Defines geographic relevance (where local news matters), topic weights, and editorial preferences.

Copy local_profile.example.mdlocal_profile.md and customise.

Without profiles the summary is still useful — just not personalised.

Useful Teksti-TV page numbers

Page Section
100 Etusivu (front page)
102 Kotimaa (domestic)
130 Ulkomaat (foreign)
160 Talous (economy)
190 News in English
201 Urheilu (sports)
400 Sää (weather)
500–519 Alueuutiset (regional)
891 Viikkomakasiini (weekly)

How it works

  1. Crawl — fetches root pages from the Yle Teksti-TV JSON API, extracts page links, follows them up to MAX_DEPTH levels. Index pages are crawled for links but their navigation text is excluded.

  2. Filter — pages older than MAX_AGE_HOURS are skipped. Each page (with all its subpages) is fetched once.

  3. Collect — body text extracted from all subpages (headers stripped). Combined text saved to news_output.txt.

  4. Summarise — text sent to OpenAI with persona, prompt, and optional profiles. The LLM writes prose with markdown-style page links that become clickable in the web interface.

  5. Audio (web only) — TTS generated in parallel via edge-tts using Finnish neural voices. The page is viewable immediately; audio button activates when ready.

API rate limits (10 req/s, 300/hour) are respected via configurable delay and exponential backoff.

Project structure

better-tekstitv/
├── server.py                # Web interface — routes, pipeline, settings
├── main.py                  # CLI entry point
├── crawler.py               # Recursive page crawler with freshness filtering
├── scraper.py               # Yle API client with rate limiting and retries
├── parser.py                # Extracts text, links, timestamps from API JSON
├── summarizer.py            # OpenAI integration with profile injection
├── tts.py                   # Text-to-speech via edge-tts
├── i18n.py                  # UI translations (Finnish, English)
├── templates/
│   ├── index.html           # Main news view
│   └── settings.html        # Configuration / onboarding page
├── static/                  # Generated audio files
├── config.example.py        # Config template (committed)
├── taste_profile.example.md # Reader profile template
├── local_profile.example.md # Local focus template
├── run.sh                   # Setup & launch script
├── requirements.txt         # Python dependencies
├── .gitignore
└── README.md

Troubleshooting

Port already in use:

# Find and kill the process using port 5834
lsof -ti :5834 | xargs kill

Python not found or too old:

# macOS
brew install python

# Ubuntu / Debian
sudo apt install python3 python3-venv

# Fedora
sudo dnf install python3

Missing edge-tts (audio not working):

source venv/bin/activate
pip install edge-tts

License

Source code provided as-is. Yle Teksti-TV content is licensed under CC BY-SA 3.0 by Yle.

About

Personal news digest from Yle Teksti-TV — LLM-summarised, with a Teksti-TV inspired web interface

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors