Scribus

Scribus is a small Python toolkit for turning audio into text and, for longer material, turning YouTube lectures into readable PDFs. It ships as a library plus two entry points: a command-line interface and an optional desktop GUI (PyQt5).

What it does

Capability	How it works	Best for
Local file → text (Google)	pydub loads audio; SpeechRecognition calls Google’s public Web Speech API.	Short clips, quick experiments. Needs internet. Subject to Google’s limits and terms; not a bulk production service.
YouTube → PDF (Whisper)	yt-dlp downloads audio; faster-whisper transcribes on your machine; ReportLab writes a structured PDF.	Long talks, lectures, interviews. Works offline after download; first run may fetch Whisper model weights.
GUI	PyQt5 front end around the YouTube → Whisper → PDF pipeline.	Users who prefer a window over a terminal.

You are responsible for complying with YouTube’s Terms of Service, the rights of content owners, and applicable law when downloading or reusing material.

Which path should I use?

I have a file on disk and only need a short transcript → use the Google CLI (minimal install is enough).
I have a YouTube link and want a full session as a PDF → use the GUI or scribus youtube (install with the default requirements.txt, or with the [gui] or [lecture] extras).
I need maximum control or automation → use the CLI with flags for model, device, and language.

Prerequisites

Python 3.10 or newer
ffmpeg on your PATH (used by pydub and by yt-dlp’s audio post-processing)
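
Both prerequisites can be checked from Python before installing anything else. The following is a minimal sketch using only the standard library; the function name missing_prerequisites is hypothetical and not part of Scribus.

```python
import shutil
import sys


def missing_prerequisites(tools=("ffmpeg",), min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems
```

Calling missing_prerequisites() with no arguments checks for Python 3.10+ and ffmpeg, the two requirements listed above.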

Installing ffmpeg

Ubuntu / Debian

sudo apt-get update && sudo apt-get install -y ffmpeg

macOS (Homebrew)

brew install ffmpeg

Windows
Install ffmpeg from a trusted build (for example the official downloads) and ensure the ffmpeg executable is on your PATH. Package managers such as Chocolatey (choco install ffmpeg) or Scoop are common choices.

Installation

Clone the repository, create a virtual environment, then install the package in editable form so the scribus and scribus-gui commands are available.

Default (recommended): full stack

requirements.txt lists pydub, SpeechRecognition, yt-dlp, faster-whisper, reportlab, and PyQt5, so one file covers the usual YouTube → PDF and GUI workflows.

git clone https://github.com/Leli254/Scribus.git
cd Scribus
python3 -m venv .venv
source .venv/bin/activate          # Windows CMD: .venv\Scripts\activate.bat
                                   # Windows PowerShell: .venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
python -m pip install -e .

Alternative: extras only (no `requirements.txt`)

Equivalent dependency set via pyproject.toml:

python -m pip install -e ".[gui]"

Use [lecture] if you want YouTube + Whisper + PDF without the PyQt5 GUI.

Minimal: Google-only CLI

Smaller download; no YouTube, Whisper, PDF export, or GUI:

python -m pip install -r requirements-minimal.txt
python -m pip install -e .

Optional dependency groups (reference)

Extra	Installs
`[youtube]`	yt-dlp
`[whisper]`	faster-whisper
`[pdf]`	reportlab
`[lecture]`	youtube + whisper + pdf
`[gui]`	lecture + PyQt5
`[dev]`	pytest and reportlab (for tests)

Quick start

After the default or alternative installation:

# Desktop app
scribus-gui

# Or run the module directly from the repo, without the installed script:
python -m scribus_gui

# YouTube → PDF (replace URL and output path)
scribus youtube "https://www.youtube.com/watch?v=VIDEO_ID" --pdf lecture.pdf --model small

# Local file, one segment, plain text (Google)
python -m scribus path/to/audio.mp3 --start 1:45 --end 1:55 --out transcript.txt

The first Whisper run can download large model files; pick a smaller --model (for example tiny or base) if you are constrained on RAM or disk.

Desktop application (GUI)

1. Start scribus-gui (or python -m scribus_gui).
2. Paste a YouTube watch URL.
3. Choose the output PDF path (use Browse…).
4. Optionally set the language (ISO-639-1, for example en); leave blank for automatic detection.
5. Choose a Whisper model (small is a reasonable default balance of speed and quality).
6. Click Start and watch the log and progress bar.

Cancel asks the pipeline to stop between major steps. A transcription call that is already running may not stop immediately.

Command line: YouTube → PDF (Whisper)

scribus youtube "https://www.youtube.com/watch?v=VIDEO_ID" --pdf lecture.pdf

Option	Description
`--pdf PATH`	Required. Output PDF file.
`--model NAME`	Whisper model: `tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3` (default: `small`).
`--language CODE`	ISO-639-1 (for example `en`). Omit for auto-detect.
`--device VALUE`	`auto`, `cpu`, or `cuda`.
`--compute-type VALUE`	Passed to faster-whisper (for example `int8`, `float16`); see upstream docs.
`-v`, `--verbose`	Debug logging.
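
For automation, the options above can be assembled into an argument list and passed to subprocess. This is a sketch under the assumption that scribus is installed and on your PATH; the helper names build_youtube_command and run_youtube are hypothetical, and only flags from the table are used.

```python
import subprocess


def build_youtube_command(url, pdf, model="small", language=None,
                          device=None, compute_type=None, verbose=False):
    """Assemble an argv list for `scribus youtube` from the documented flags."""
    cmd = ["scribus", "youtube", url, "--pdf", pdf, "--model", model]
    if language:
        cmd += ["--language", language]
    if device:
        cmd += ["--device", device]
    if compute_type:
        cmd += ["--compute-type", compute_type]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_youtube(url, pdf, **options):
    """Run the CLI and return its exit code (needs scribus installed)."""
    return subprocess.run(build_youtube_command(url, pdf, **options)).returncode
```

Passing a list rather than a shell string avoids quoting problems with URLs that contain & or ?.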

Command line: local audio → text (Google)

Works with any format pydub can decode (via ffmpeg). Requires network access for Google recognition.

python -m scribus path/to/audio.mp3 --start 1:45 --end 1:55 --out transcript.txt

Shim (same CLI):

python main.py path/to/audio.mp3 --start 1:45 --end 1:55 --out transcript.txt

If the package is not installed but the repo is on disk:

PYTHONPATH=. python -m scribus path/to/audio.mp3 --start 0s --end 30s --out out.txt

Time formats for `--start` / `--end`

Form	Example	Meaning
Milliseconds	`90000` or `90000ms`	90 seconds
Seconds	`90s` or `12.5s`	90 seconds; 12.5 seconds (fractions allowed)
Minutes and seconds	`1:45`	1 minute 45 seconds
Hours, minutes, and seconds	`0:1:05`	0 hours 1 minute 5 seconds

A bare integer is interpreted as milliseconds (for example 105000 is 1:45).
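
The rules in the table can be sketched as a small parser. This is an illustration of the documented formats, not Scribus's actual implementation; the name parse_time_ms is hypothetical.

```python
def parse_time_ms(value: str) -> int:
    """Convert a --start/--end style string to milliseconds."""
    text = value.strip().lower()
    if ":" in text:
        # M:SS or H:M:SS; the last field is always seconds.
        parts = [float(p) for p in text.split(":")]
        if len(parts) not in (2, 3):
            raise ValueError(f"unsupported time value: {value!r}")
        seconds = 0.0
        for part in parts:
            seconds = seconds * 60 + part
        return round(seconds * 1000)
    if text.endswith("ms"):  # must be checked before the plain "s" suffix
        return int(text[:-2])
    if text.endswith("s"):
        return round(float(text[:-1]) * 1000)
    return int(text)  # a bare integer means milliseconds
```

With this sketch, 1:45 and 105000 both resolve to the same position, matching the note above.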

Full file with Google (chunked)

Omit --start and --end to transcribe the entire file in chunks; --max-chunk-ms sets the maximum chunk length:

python -m scribus path/to/episode.mp3 --max-chunk-ms 55000 --out whole_episode.txt
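
To illustrate what a maximum chunk length implies, the sketch below splits a duration into consecutive fixed-size windows. It assumes simple fixed-size splitting, which may differ from how Scribus actually chooses chunk boundaries; chunk_bounds is a hypothetical name, and 55000 is taken from the example command rather than from a documented default.

```python
def chunk_bounds(total_ms: int, max_chunk_ms: int = 55_000):
    """Return (start_ms, end_ms) pairs that cover [0, total_ms) in order."""
    if max_chunk_ms <= 0:
        raise ValueError("max_chunk_ms must be positive")
    return [
        (start, min(start + max_chunk_ms, total_ms))
        for start in range(0, total_ms, max_chunk_ms)
    ]
```

A two-minute file (120000 ms) yields three chunks under this scheme: two of 55 seconds and a final one of 10 seconds.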

Useful flags (Google path)

Flag	Purpose
`--out FILE`	UTF-8 transcript with trailing newline
`--audio-out FILE`	Export WAV (selected segment, or full decode in full-file mode)
`--language CODE`	BCP-47 tag (default `en-US`)
`--max-chunk-ms N`	Maximum chunk length in full-file mode
`--max-segment-ms N`	Maximum allowed span for `--start`/`--end` (default: 30 minutes)
`--allow-huge-segment`	Allow spans wider than `--max-segment-ms`
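
The interaction between --max-segment-ms and --allow-huge-segment can be sketched as a guard on the requested span. This is an assumption about the behaviour based on the flag descriptions, not the project's code; check_segment is a hypothetical name.

```python
DEFAULT_MAX_SEGMENT_MS = 30 * 60 * 1000  # 30 minutes, per the flag table


def check_segment(start_ms, end_ms, max_segment_ms=DEFAULT_MAX_SEGMENT_MS,
                  allow_huge=False):
    """Return the span in ms, or raise ValueError if it is not acceptable."""
    if start_ms < 0 or end_ms <= start_ms:
        raise ValueError("start must be >= 0 and end must be after start")
    span = end_ms - start_ms
    if span > max_segment_ms and not allow_huge:
        raise ValueError(
            f"segment of {span} ms exceeds the limit of {max_segment_ms} ms"
        )
    return span
```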

Exit codes

Code	Meaning
`0`	Success
`1`	Validation, I/O, API, or processing error
`2`	Invalid CLI usage (argparse)
`130`	Interrupted (Ctrl+C)
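
When Scribus is called from a script, the exit codes above can be turned into readable messages. A minimal sketch; describe_exit and run_and_describe are hypothetical helpers, and the meanings are copied from the table.

```python
import subprocess

EXIT_MEANINGS = {
    0: "success",
    1: "validation, I/O, API, or processing error",
    2: "invalid CLI usage",
    130: "interrupted (Ctrl+C)",
}


def describe_exit(code: int) -> str:
    """Map an exit code to the meaning listed in the table."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_and_describe(argv):
    """Run a command given as an argv list and return (exit_code, meaning)."""
    code = subprocess.run(argv).returncode
    return code, describe_exit(code)
```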

Troubleshooting

“yt-dlp is required for YouTube downloads”
You are missing optional dependencies. Install them with:

python -m pip install -r requirements.txt

or python -m pip install -e ".[youtube]" / ".[lecture]" / ".[gui]", then try again.
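
To see which optional packages are missing from the active environment, you can probe for their import names without importing them. This is a sketch using the standard library; missing_packages is a hypothetical helper, and the mapping pairs each pip name from requirements.txt with the module name it normally installs.

```python
import importlib.util

# pip package name -> top-level import name
OPTIONAL_MODULES = {
    "yt-dlp": "yt_dlp",
    "faster-whisper": "faster_whisper",
    "reportlab": "reportlab",
    "PyQt5": "PyQt5",
}


def missing_packages(modules=OPTIONAL_MODULES):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(module_name) is None
    ]
```

Run it inside the same virtual environment you use for Scribus; an empty list means all four packages are importable.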

“Could not decode audio” / ffmpeg errors
Install ffmpeg and confirm ffmpeg -version works in the same terminal (and virtual environment) you use to run Scribus.

Whisper is slow or uses too much memory
Use a smaller --model (tiny or base). On CPU, keep the chunking defaults that faster-whisper already uses. On GPU, set --device cuda, provided your drivers and CUDA build match your environment.

GUI does not start
Confirm that PyQt5 is installed (python -m pip show PyQt5) and that you are launching from the environment where scribus-gui was installed (pip install -e . after the dependencies).

Project layout

Scribus/
├── Assets/                 # Branding and sample notes
├── scribus/                # Core package (CLI, audio, YouTube, Whisper, PDF, pipeline)
├── scribus_gui/            # PyQt5 application
├── tests/                  # pytest
├── main.py                 # Compatibility entry → CLI
├── pyproject.toml
├── requirements.txt        # Default (full) dependency set
├── requirements-minimal.txt
├── README.md
└── LICENSE

Development

python -m pip install -e ".[dev]"
pytest

License and responsible use

This project is released under the MIT License.

Use of third-party services (Google speech endpoints) and of YouTube content is governed by their respective terms. Scribus is a tool; you remain responsible for how you use it.
