Content of the Active Inference Journal — transcripts, metadata, captions, translations, and curated materials from the Active Inference Institute video library.
Learn more: https://www.activeinference.org/research/journal · Tooling: https://github.com/ActiveInferenceInstitute/Journal-Utilities
Content is source-namespaced so other channels and non-video sources can live
alongside, e.g. data/video/<other-channel>/… or data/<other-type>/<source>/…:
data/video/activeinferenceinstitute/<Series>/<item>/
  metadata.json     # canonical: series, item, parts[{video_id, url, title, duration, upload_date}]
  transcript.txt    # clean text (part-tagged when multi-part)
  transcript.json   # timestamped segments (where available)
  captions/         # original-language .srt
  translations/     # translated .srt (per language)
  assets/           # images, html, prose, appendices, bibliography, …
  README.md         # human nav: titles, links, contents
docs/               # technical documentation (SCHEMA.md, …)
INDEX.json          # machine entry point: every item, its videos, paths
INDEX.md            # human index, grouped by series
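As an illustration of the layout above, here is a minimal Python sketch that walks the tree and loads each item's metadata.json. It assumes only what the layout states: items live at data/video/<channel>/<Series>/<item>/ and metadata.json is a JSON object with a parts list. The function names iter_item_metadata and count_videos are hypothetical and are not part of Journal-Utilities; docs/SCHEMA.md remains the authority on the actual fields.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Tuple


def iter_item_metadata(repo_root: Path) -> Iterator[Tuple[Path, Dict[str, Any]]]:
    """Yield (item_dir, metadata) for each data/video/<channel>/<Series>/<item>/."""
    # Assumption: exactly three directory levels sit between data/video/ and metadata.json.
    pattern = "data/video/*/*/*/metadata.json"
    for meta_path in sorted(repo_root.glob(pattern)):
        with meta_path.open(encoding="utf-8") as fh:
            yield meta_path.parent, json.load(fh)


def count_videos(repo_root: Path) -> int:
    """Count videos by summing the length of each item's `parts` list."""
    # Assumption: `parts` is a top-level list; a missing key counts as zero.
    return sum(len(meta.get("parts", [])) for _, meta in iter_item_metadata(repo_root))
```

Calling count_videos(Path(".")) from the repository root would give a video count under these assumptions; whether it agrees with the totals reported in this README depends on the actual schema.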
573 items · 724 videos · 22 series. Every Institute channel video is represented
(uncategorized videos live under Other/).
main branch: everything above, without audio (lightweight to clone).
audio branch: main plus <item>/audio/<name>.64k.m4a (audio re-encoded to 64 kbps).
Run git checkout audio to get the media.
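Because audio exists only on the audio branch, code that reads an item directory may want to tolerate its absence. The following is a minimal sketch, assuming the <item>/audio/<name>.64k.m4a naming described above; find_audio is a hypothetical helper, not part of Journal-Utilities.

```python
from pathlib import Path
from typing import List


def find_audio(item_dir: Path) -> List[Path]:
    """Return the 64 kbps .m4a files for one item, or [] when no audio/ directory exists."""
    audio_dir = item_dir / "audio"
    if not audio_dir.is_dir():
        # Expected on a checkout of main, which carries no audio.
        return []
    return sorted(audio_dir.glob("*.64k.m4a"))
```

An empty result on main is therefore not an error; checking out the audio branch should make the same call return the media files.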
Transcripts and metadata are pulled completely and idempotently from the Institute
YouTube channel (captions, or local Whisper where captions are absent) by
Journal-Utilities (scripts/refactor_journal.py, scripts/download_channel.py).
See docs/SCHEMA.md.