RedScrape

A clean, professional desktop app to scrape Reddit subreddits by flair.
Built with PySide6 and Playwright.

Features

Clean, modern desktop UI built with PySide6 — native-looking on Windows, macOS, and Linux
Filter by flair — pull only posts tagged with a specific flair (case-sensitive, exact match)
Fully dynamic — change subreddit, flair, sort, time range, limit, and fields on every run
Rich field extraction — title, URL, author, created date, score, comments, flair, post body, and auto-extracted GitHub links
Export to JSON or CSV with a native save dialog
Live progress + log — watch posts stream into the results table as they're scraped
Cancellable — hit Stop any time; partial results are preserved
Standalone binaries — download and run, no Python required for end users

Screenshots

Add screenshots to docs/screenshots/ after your first run.

Install

Option 1 — Download the release (recommended for users)

Go to Releases
Download the package for your OS:
- RedScrape-Linux.tar.gz
- RedScrape-Windows.zip
- RedScrape-macOS.tar.gz
Extract the archive
Run the RedScrape executable inside
On first run you may need to install Chromium (a one-time step):
```
python -m playwright install chromium
```

Option 2 — Run from source (recommended for developers)

git clone https://github.com/nvssim950/redscrape.git
cd redscrape

python -m venv .venv
source .venv/bin/activate            # Windows: .venv\Scripts\activate

pip install -e .
python -m playwright install chromium

redscrape                            # or:  python -m redscrape

Usage

Launch RedScrape

Fill in the form on the left:

Field	Description
Subreddit	Subreddit name without `r/` (e.g. `n8n`, `python`)
Flair filter	Exact flair text — e.g. `Workflow - Github Included`. Leave empty for no filter
Number of posts	1–1000
Sort by	`new`, `hot`, `top`, or `rising`
Time range	Only active for `top`: hour / day / week / month / year / all
Fields	Check every field you want in the output
Headless	Hide the browser window (default on)

Click ▶ Start
Watch progress fill the progress bar and results appear in the Results table. Detailed activity appears in the Log tab.
Click Export JSON or Export CSV to save your results.

Example run for your first scrape:

Subreddit:       n8n
Flair filter:    Workflow - Github Included
Number of posts: 100
Sort by:         new
Fields:          Title, URL, Permalink, Author, Created, Score, Comments, Flair, GitHub links

Fields you can extract

Field	Description
Title	Post title
URL	Outbound link (for link posts) or Reddit post URL
Permalink	Canonical Reddit URL for the post
Author	Username of the poster
Created (UTC)	ISO-8601 timestamp
Score	Net upvotes
Comments	Number of comments
Flair	Flair text (when visible on the listing)
Post body	Body text for self-posts — requires opening each post, slower
GitHub links	`github.com` URLs extracted from the post body and title link — requires opening each post, slower

See docs/usage.md for the complete user guide.

How it works

RedScrape drives a real Chromium browser via Playwright against old.reddit.com (stable HTML, scrape-friendly). Scraping runs on a background QThread so the UI stays fully responsive; results, progress, and log lines stream into the main window through Qt signals.

┌──────────────────────────┐     Qt signals     ┌──────────────────────────┐
│  MainWindow  (app.py)    │  ◀──────────────▶  │  ScrapeWorker (worker)   │
│  PySide6 UI              │                    │  QThread host            │
└──────────────────────────┘                    └─────────────┬────────────┘
                                                              │ asyncio.run
                                                 ┌────────────▼────────────┐
                                                 │  RedditScraper          │
                                                 │  (Playwright + parsers) │
                                                 └─────────────────────────┘

Development

git clone https://github.com/nvssim950/redscrape.git
cd redscrape
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m playwright install chromium

Run the app:

python -m redscrape

Run tests:

pytest

Build a standalone executable locally:

./build.sh          # Linux/macOS
# or:  pyinstaller --clean --noconfirm redscrape.spec

For architecture notes, see docs/development.md.

Distribution

Pushing a tag like v0.1.0 to GitHub triggers the build workflow which:

Builds standalone executables for Windows, macOS, and Linux
Packages each as a .zip or .tar.gz
Creates a GitHub Release with all three artifacts attached

Users can then download the binary for their OS from the Releases page — no Python install needed.

Contributing

Pull requests welcome. See CONTRIBUTING.md for local setup, project layout, and coding guidelines.

Responsible use

RedScrape is intended for personal research, learning, and lightweight data collection. Please:

Respect Reddit's User Agreement
Keep the built-in per-page delay (1 second) in place
For production or heavy use, switch to the official Reddit API via PRAW — it's more reliable and rate-limit aware

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
docs		docs
src/redscrape		src/redscrape
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
build.sh		build.sh
pyproject.toml		pyproject.toml
redscrape.spec		redscrape.spec
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RedScrape

Features

Screenshots

Install

Option 1 — Download the release (recommended for users)

Option 2 — Run from source (recommended for developers)

Usage

Fields you can extract

How it works

Development

Distribution

Contributing

Responsible use

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

RedScrape

Features

Screenshots

Install

Option 1 — Download the release (recommended for users)

Option 2 — Run from source (recommended for developers)

Usage

Fields you can extract

How it works

Development

Distribution

Contributing

Responsible use

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages