A Python tool for scraping Telegram channels into local SQLite databases, with built-in real-time message forwarding between channels and a self-contained offline HTML viewer for browsing the results.
```
___________________ _________
\__ ___/ _____/ / _____/
 | | / \ ___ \_____ \
 | | \ \_\ \/ \
 |____| \______ /_______ /
        \/ \/
```
Scraping
- Scrape any channel you're a member of into a per-channel SQLite database (`<channel_id>/<channel_id>.db`)
- Captures message text, sender info, dates, replies, views, forwards, reactions, and post authors
- Optional media download (photos, videos, documents) with a toggle
- Continuous (live) scraping mode that keeps the DB up to date as new messages arrive
- Concurrent media downloads with configurable batch size and automatic FloodWait handling
- Rescrape and "fix missing media" tools to repair partial downloads
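The automatic FloodWait handling can be sketched as a small retry wrapper. This is an illustration, not the script's actual code; `FloodWaitError` below is a stand-in for Telethon's exception of the same name, which carries a `.seconds` attribute telling you how long to back off.

```python
import asyncio

class FloodWaitError(Exception):
    """Stand-in for telethon.errors.FloodWaitError, which carries .seconds."""
    def __init__(self, seconds):
        super().__init__(f"wait {seconds}s")
        self.seconds = seconds

async def with_flood_wait(coro_factory, max_retries=3):
    """Run a coroutine, sleeping out FloodWaitError for as long as Telegram asks."""
    for _ in range(max_retries):
        try:
            return await coro_factory()
        except FloodWaitError as e:
            # Telegram tells us exactly how long to back off before retrying.
            await asyncio.sleep(e.seconds)
    return await coro_factory()  # final attempt; exceptions propagate
```

Concurrent media downloads would then wrap each download coroutine with a helper like this and run them in batches via `asyncio.gather`.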
Forwarding
- Real-time forwarding rules from a source channel to a destination channel
- Filter by content type (text, images, videos, documents)
- Two modes: copy (sends as a new message, no "Forwarded from" header) or forward (preserves the header)
- Toggle rules on/off, edit, or delete without restarting
- Backfill historical messages over a chosen date range with adjustable delay
- Combined mode runs scraping and forwarding together in a single session
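The rule-matching logic behind forwarding can be sketched as a small pure function. The field names here (`source`, `dest`, `types`, `mode`, `enabled`) are illustrative, not the script's actual rule schema:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    source: int                       # source channel ID
    dest: int                         # destination channel ID
    types: set = field(default_factory=lambda: {"text", "image", "video", "document"})
    mode: str = "copy"                # "copy" drops the header, "forward" keeps it
    enabled: bool = True

def matches(rule, chat_id, content_type):
    """True if this rule should fire for a message of content_type in chat_id."""
    return rule.enabled and rule.source == chat_id and content_type in rule.types
```

In a live handler, a match with mode `"copy"` would re-send the content as a new message, while `"forward"` would use the native forward call that preserves the "Forwarded from" header.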
Search
- Query scraped databases by keyword/regex, date range, sender/username, or media type
- Advanced search combines multiple filters
- Search across one channel, several, or all of them at once
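Combining filters over a per-channel database amounts to assembling a WHERE clause. A minimal sketch against the `messages` table described in the schema section below, using parameter binding to stay injection-safe (keyword matching shown here uses `LIKE` rather than regex, which SQLite does not support natively):

```python
import sqlite3

def search(db_path, keyword=None, username=None, since=None, until=None, media_type=None):
    """Return rows matching every supplied filter (filters left as None are skipped)."""
    clauses, params = [], []
    if keyword:
        clauses.append("message LIKE ?"); params.append(f"%{keyword}%")
    if username:
        clauses.append("username = ?"); params.append(username)
    if since:
        clauses.append("date >= ?"); params.append(since)
    if until:
        clauses.append("date <= ?"); params.append(until)
    if media_type:
        clauses.append("media_type = ?"); params.append(media_type)
    sql = "SELECT message_id, date, username, message FROM messages"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    with sqlite3.connect(db_path) as conn:
        return conn.execute(sql + " ORDER BY date", params).fetchall()
```

Searching across several channels is then just running the same query over each channel's `.db` file and merging the results.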
Auth & networking
- Sign in via QR code (displayed as ASCII art in the terminal) or phone number
- Two-factor authentication supported
- Optional SOCKS5, SOCKS4, or HTTP proxy with auth, plus a built-in proxy connectivity test
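Telethon accepts a proxy at client construction time; a dict of the shape below is one of the forms its documentation describes, though the exact shapes accepted are Telethon's to define. The reachability check is a crude TCP connect, not a full proxy handshake:

```python
import socket

def build_proxy(scheme, host, port, username=None, password=None):
    """Assemble a Telethon-style proxy dict (one of the shapes Telethon accepts)."""
    proxy = {"proxy_type": scheme, "addr": host, "port": port, "rdns": True}
    if username:
        proxy["username"] = username
        proxy["password"] = password
    return proxy

def can_reach(host, port, timeout=3):
    """Crude TCP reachability check for the proxy endpoint (no proxy handshake)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The built-in connectivity test presumably goes further than this sketch, but a failed TCP connect is already enough to report a dead proxy before handing it to the client.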
Export
- One-shot export of every tracked channel to CSV and JSON, written next to the database
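The export step is conceptually a dump of the `messages` table into both formats. A minimal stdlib-only sketch (the script's actual file naming and column handling may differ):

```python
import csv, json, sqlite3

def export_channel(db_path, csv_path, json_path):
    """Dump the messages table to CSV and JSON; returns the row count."""
    with sqlite3.connect(db_path) as conn:
        conn.row_factory = sqlite3.Row  # rows behave like dicts
        rows = [dict(r) for r in conn.execute("SELECT * FROM messages ORDER BY message_id")]
    if rows:
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)
    return len(rows)
```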
Offline viewer (telegram-viewer.html)
- Single-file HTML app you can open directly in a browser, no server or install required
- Loads `.db` files via sql.js (compiled SQLite running in the browser)
- Drop in the channel folder to attach scraped media (photos and videos render inline)
- Load multiple chats at once and switch between them
- Per-chat search and a global search across all loaded chats
- Strict CSP, so it runs entirely offline with no network requests after load
- Python 3.9+
- A Telegram account
- Telegram API credentials (`api_id` and `api_hash`), available at https://my.telegram.org
```
pip install telethon qrcode
```

Optional, only needed if you want to route through a proxy:

```
pip install pysocks "python-socks[asyncio]"
```

- Install dependencies (above).
- Run the scraper:

  ```
  python telegram_scraper_with_forwarding.py
  ```

- On first launch it will ask for your `api_id` and `api_hash`, then walk you through QR or phone login. Credentials and session state are saved to `state.json` in the working directory.
```
[S] Scrape channels             [W] Forwarding rules
[C] Continuous scraping         [D] Search database
[B] Combined scrape + forward   [P] Proxy settings
[M] Media scraping: ON/OFF      [Q] Quit
[L] List & add channels
[R] Remove channels
[E] Export data (CSV + JSON)
[T] Rescrape media
[F] Fix missing media
```
Typical first-run flow: `[L]` to list and add the channels you're in, then `[S]` to scrape once, `[C]` for live scraping, or `[B]` if you also have forwarding rules configured.
Open with [W] from the main menu.
- `[A]` Add: pick source, pick destination (a tracked channel, a `@username`, or a raw `-100...` ID), pick content types, pick copy vs forward.
- `[E]` Edit, `[T]` toggle, `[D]` delete an existing rule.
- `[S]` Start forwarding: listens for new messages and forwards them according to enabled rules.
- `[H]` Backfill: replay messages from a date range through a rule (useful when you set up a rule after the fact). The source channel must already be scraped locally.
After running, your working directory will look something like:
```
.
├── telegram_scraper_with_forwarding.py
├── telegram-viewer.html
├── state.json                      # config, session state, rules
├── -1001234567890/                 # one folder per channel
│   ├── -1001234567890.db           # SQLite messages DB
│   ├── -1001234567890_<name>.csv   # created by [E] export
│   ├── -1001234567890_<name>.json  # created by [E] export
│   └── media/
│       ├── photos/
│       ├── videos/
│       └── documents/
└── channels_list.csv               # optional, from [L]
```
- Open `telegram-viewer.html` in any modern browser, or serve it over `localhost` (see below).
- Click Choose Files and select one or more `.db` files (and optionally exported `.json` files).
- To make media render inline, also point the media picker at the channel folder so the viewer can resolve the relative paths stored in the DB.
- Use the per-chat search box to filter messages within a chat, or the top search to search across every chat you've loaded.
The viewer is fully client-side. Nothing is uploaded anywhere; your DB never leaves the browser tab.
Some browsers restrict what a page opened via file:// can load. sql.js's WebAssembly, the directory picker, and media blobs all behave better when the viewer is served over HTTP. Run a tiny local server in the folder that contains telegram-viewer.html and your channel folders, then visit http://localhost:8000/telegram-viewer.html.
Linux / macOS

```
cd /path/to/your/scraper-folder
python3 -m http.server 8000
```

Windows (PowerShell or CMD)

```
cd C:\path\to\your\scraper-folder
python -m http.server 8000
```

Stop the server with Ctrl+C. The server only listens on your own machine; nothing is exposed to the network. If port 8000 is taken, pick another (for example, 8080).
If you don't have Python on the path, any other static server works too: npx serve, php -S localhost:8000, VS Code's "Live Server" extension, and so on.
Each `<channel>/<channel>.db` contains a single `messages` table:

| Column | Type | Notes |
|---|---|---|
| `message_id` | INTEGER UNIQUE | Telegram message ID |
| `date` | TEXT | ISO timestamp |
| `sender_id` | INTEGER | |
| `first_name`, `last_name`, `username` | TEXT | |
| `message` | TEXT | message body |
| `media_type` | TEXT | photo, video, document, etc. |
| `media_path` | TEXT | relative path to the downloaded file |
| `reply_to` | INTEGER | replied-to message ID |
| `post_author` | TEXT | for channels with signed posts |
| `views`, `forwards` | INTEGER | |
| `reactions` | TEXT | serialized reaction summary |
The DB uses WAL mode and has indexes on `message_id` and `date`. The script auto-migrates older DBs that are missing newer columns.
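An auto-migration of this kind boils down to comparing `PRAGMA table_info` against the expected columns and issuing `ALTER TABLE` for whatever is missing. A minimal sketch, with the expected column list abbreviated and not the script's actual code:

```python
import sqlite3

EXPECTED = {               # column -> SQL type (abbreviated for illustration)
    "message_id": "INTEGER",
    "date": "TEXT",
    "message": "TEXT",
    "reactions": "TEXT",   # a newer column that older DBs may lack
}

def migrate(conn):
    """Add any expected column missing from messages; returns the added names."""
    have = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    added = []
    for name, sqltype in EXPECTED.items():
        if name not in have:
            conn.execute(f"ALTER TABLE messages ADD COLUMN {name} {sqltype}")
            added.append(name)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_messages_date ON messages(date)")
    return added
```

Running it on an up-to-date DB is a no-op, which is what makes it safe to call unconditionally at startup.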
- You can only scrape and forward channels your account is a member of; Telegram enforces this server-side.
- Telegram applies aggressive rate limits. The scraper handles `FloodWaitError` automatically by sleeping for the requested duration, but very large backfills will still take time.
- `state.json` contains your `api_id`, `api_hash`, and references to your session, so treat it like a credential file.
- Be mindful of Telegram's Terms of Service and the rules of any channel you scrape or repost from.
- "Proxy dependencies not installed": run `pip install pysocks "python-socks[asyncio]"` and restart.
- Login keeps failing: delete the `.session` file in the working directory and log in again.
- Media is missing in the viewer: make sure you also picked the channel folder (or its parent) in the media picker so relative paths resolve.
- Some messages show no media even after scraping: use `[F]` Fix missing media from the main menu to retry just the missing files.