A clean, professional desktop app to scrape Reddit subreddits by flair.
Built with PySide6 and Playwright.
- Clean, modern desktop UI built with PySide6 — native-looking on Windows, macOS, and Linux
- Filter by flair — pull only posts tagged with a specific flair (case-sensitive, exact match)
- Fully dynamic — change subreddit, flair, sort, time range, limit, and fields on every run
- Rich field extraction — title, URL, author, created date, score, comments, flair, post body, and auto-extracted GitHub links
- Export to JSON or CSV with a native save dialog
- Live progress + log — watch posts stream into the results table as they're scraped
- Cancellable — hit Stop any time; partial results are preserved
- Standalone binaries — download and run, no Python required for end users
Add screenshots to docs/screenshots/ after your first run.
- Go to Releases
- Download the package for your OS:
  - RedScrape-Linux.tar.gz
  - RedScrape-Windows.zip
  - RedScrape-macOS.tar.gz
- Extract the archive
- Run the RedScrape executable inside
- On first run you may need to install Chromium (a one-time step):
python -m playwright install chromium
git clone https://github.com/nvssim950/redscrape.git
cd redscrape
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
python -m playwright install chromium
redscrape # or: python -m redscrape
- Launch RedScrape
- Fill in the form on the left:

| Field | Description |
|---|---|
| Subreddit | Subreddit name without r/ (e.g. n8n, python) |
| Flair filter | Exact flair text, e.g. Workflow - Github Included. Leave empty for no filter |
| Number of posts | 1–1000 |
| Sort by | new, hot, top, or rising |
| Time range | Only active for top: hour / day / week / month / year / all |
| Fields | Check every field you want in the output |
| Headless | Hide the browser window (default on) |

- Click ▶ Start
- Watch the progress bar fill and results appear in the Results table. Detailed activity appears in the Log tab.
- Click Export JSON or Export CSV to save your results.
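For readers curious what the two export formats roughly contain, here is a minimal sketch using only the Python standard library. The record keys (`title`, `author`, `score`, `github_links`) and the helper names are assumptions made for illustration; RedScrape's actual output schema and export code may differ.

```python
import csv
import io
import json

# Hypothetical scraped records; the keys are illustrative, not RedScrape's schema.
posts = [
    {"title": "My workflow", "author": "alice", "score": 12,
     "github_links": ["https://github.com/example/repo"]},
    {"title": "Question", "author": "bob", "score": 3, "github_links": []},
]


def to_json(records):
    """Serialize records as indented JSON, keeping non-ASCII text readable."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records, fields):
    """Serialize the chosen fields as CSV; list values are joined with spaces."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for record in records:
        row = {}
        for name in fields:
            value = record.get(name, "")
            row[name] = " ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()
```

JSON keeps list-valued fields such as links as real arrays, while CSV has to flatten them into a single cell, which is one reason to prefer JSON when the data will be processed further.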
Example run for your first scrape:
Subreddit: n8n
Flair filter: Workflow - Github Included
Number of posts: 100
Sort by: new
Fields: Title, URL, Permalink, Author, Created, Score, Comments, Flair, GitHub links
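The form rules above (subreddit without r/, 1–1000 posts, four sort modes, time range only for top, exact case-sensitive flair match) can be sketched as a small settings object. `ScrapeSettings` and its methods are hypothetical names, and the URL layout is an assumption based on how old.reddit.com listing pages are addressed, not a description of RedScrape's internals.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

SORTS = {"new", "hot", "top", "rising"}
TIME_RANGES = {"hour", "day", "week", "month", "year", "all"}


@dataclass
class ScrapeSettings:  # hypothetical class, for illustration only
    subreddit: str
    flair: str = ""
    limit: int = 100
    sort: str = "new"
    time_range: str = "all"

    def validate(self):
        """Raise ValueError if any field breaks the documented form rules."""
        if not self.subreddit or self.subreddit.lower().startswith("r/"):
            raise ValueError("subreddit must be given without the r/ prefix")
        if not 1 <= self.limit <= 1000:
            raise ValueError("number of posts must be between 1 and 1000")
        if self.sort not in SORTS:
            raise ValueError(f"sort must be one of {sorted(SORTS)}")
        if self.time_range not in TIME_RANGES:
            raise ValueError(f"time range must be one of {sorted(TIME_RANGES)}")

    def listing_url(self):
        """Build the listing URL; the time range is only applied to 'top'."""
        self.validate()
        url = f"https://old.reddit.com/r/{self.subreddit}/{self.sort}/"
        if self.sort == "top":
            url += "?" + urlencode({"t": self.time_range})
        return url

    def matches_flair(self, post_flair):
        """An empty filter matches everything; otherwise compare exactly."""
        return not self.flair or post_flair == self.flair


example = ScrapeSettings("n8n", flair="Workflow - Github Included", limit=100)
print(example.listing_url())
```

Because the flair comparison is exact, a post tagged "workflow - github included" would not match the example filter.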
| Field | Description |
|---|---|
| Title | Post title |
| URL | Outbound link (for link posts) or Reddit post URL |
| Permalink | Canonical Reddit URL for the post |
| Author | Username of the poster |
| Created (UTC) | ISO-8601 timestamp |
| Score | Net upvotes |
| Comments | Number of comments |
| Flair | Flair text (when visible on the listing) |
| Post body | Body text for self-posts — requires opening each post, slower |
| GitHub links | github.com URLs extracted from the post body and title link — requires opening each post, slower |
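As an illustration of the GitHub links field, the sketch below pulls github.com URLs out of free text with a regular expression. The pattern and the `extract_github_links` helper are assumptions for this example; RedScrape's own parser may use different rules (for instance, reading link targets from the page's HTML instead of plain text).

```python
import re

# Matches http(s) URLs on github.com, stopping at whitespace, quotes, or brackets.
GITHUB_URL = re.compile(
    r"https?://(?:www\.)?github\.com/[^\s<>()\[\]\"']+",
    re.IGNORECASE,
)


def extract_github_links(*texts):
    """Return unique github.com URLs from the given texts, in first-seen order."""
    links = []
    for text in texts:
        for match in GITHUB_URL.findall(text or ""):
            url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
            if url not in links:
                links.append(url)
    return links


body = (
    "Workflow is at https://github.com/example/repo. "
    "Older version (https://github.com/example/other), "
    "mirror at https://gitlab.com/example/repo"
)
title_link = "https://github.com/example/repo"
print(extract_github_links(body, title_link))
```

Passing both the post body and the title link mirrors the description in the table: duplicates across the two sources are collapsed, and non-GitHub hosts are ignored.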
See docs/usage.md for the complete user guide.
RedScrape drives a real Chromium browser via Playwright against old.reddit.com (stable HTML, scrape-friendly). Scraping runs on a background QThread so the UI stays fully responsive; results, progress, and log lines stream into the main window through Qt signals.
┌──────────────────────────┐ Qt signals ┌──────────────────────────┐
│ MainWindow (app.py) │ ◀──────────────▶ │ ScrapeWorker (worker) │
│ PySide6 UI │ │ QThread host │
└──────────────────────────┘ └─────────────┬────────────┘
│ asyncio.run
┌────────────▼────────────┐
│ RedditScraper │
│ (Playwright + parsers) │
└─────────────────────────┘
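The threading pattern in the diagram can be approximated with the standard library alone. In this sketch `threading.Thread` stands in for the QThread host and a `queue.Queue` stands in for Qt signals; the `scrape` coroutine, the `ScrapeThread` class, and the event names are all hypothetical and only mimic the shape of the real worker.

```python
import asyncio
import queue
import threading


async def scrape(limit, emit):
    """Stand-in scraper coroutine: emits one fake post and a progress value per step."""
    for index in range(1, limit + 1):
        await asyncio.sleep(0)  # a real scraper would await page loads here
        emit(("post", {"title": f"post {index}"}))
        emit(("progress", index * 100 // limit))
    emit(("finished", limit))


class ScrapeThread(threading.Thread):
    """Runs the asyncio event loop off the main thread, like the QThread host."""

    def __init__(self, limit, events):
        super().__init__(daemon=True)
        self.limit = limit
        self.events = events  # thread-safe channel, standing in for Qt signals

    def run(self):
        asyncio.run(scrape(self.limit, self.events.put))


def collect(limit):
    """Start the worker and gather its events, as a UI thread would consume them."""
    events = queue.Queue()
    worker = ScrapeThread(limit, events)
    worker.start()
    received = []
    while True:
        kind, payload = events.get(timeout=5)
        received.append((kind, payload))
        if kind == "finished":
            break
    worker.join()
    return received


for kind, payload in collect(3):
    print(kind, payload)
```

The key point is that `asyncio.run` blocks only the worker thread, so the thread that owns the UI stays free to handle input while results arrive through the thread-safe channel.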
git clone https://github.com/nvssim950/redscrape.git
cd redscrape
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m playwright install chromium
Run the app:
python -m redscrape
Run tests:
pytest
Build a standalone executable locally:
./build.sh # Linux/macOS
# or: pyinstaller --clean --noconfirm redscrape.spec
For architecture notes, see docs/development.md.
Pushing a tag like v0.1.0 to GitHub triggers the build workflow which:
- Builds standalone executables for Windows, macOS, and Linux
- Packages each as a .zip or .tar.gz
- Creates a GitHub Release with all three artifacts attached
Users can then download the binary for their OS from the Releases page — no Python install needed.
Pull requests welcome. See CONTRIBUTING.md for local setup, project layout, and coding guidelines.
RedScrape is intended for personal research, learning, and lightweight data collection. Please:
- Respect Reddit's User Agreement
- Keep the built-in per-page delay (1 second) in place
- For production or heavy use, switch to the official Reddit API via PRAW — it's more reliable and rate-limit aware
MIT © 2026 nvssim950