tiktok-transcript

Command-line tool that prints the transcript (and optionally metadata) of a TikTok user's latest video — or all videos since a given date.

Pulls TikTok's native auto-generated captions directly from the public web page (__UNIVERSAL_DATA_FOR_REHYDRATION__ → cla_info.caption_infos[].url). No authentication, no API key, no video download, no local speech-to-text.

Requirements

Python 3.10 or newer
macOS, Linux, or Windows (tested on macOS)

Install

Using uv (recommended — gives you an isolated, global tiktok-transcript binary):

uv tool install .

Or with pip (prefer a virtualenv):

pip install .

After installation, the tiktok-transcript command is on your PATH.

Quick start

tiktok-transcript tiktok

Prints the description and auto-generated transcript of @tiktok's latest video to stdout.

Usage

tiktok-transcript USERNAME [options]

The leading @ is optional — tiktok-transcript @tiktok and tiktok-transcript tiktok behave identically.
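
As an illustration of that behavior, a handle could be normalized along these lines. This is a minimal sketch; `normalize_username` is a hypothetical name, not a function confirmed to exist in the tool.

```python
def normalize_username(raw: str) -> str:
    """Strip surrounding whitespace and a single leading '@' from a handle."""
    # Hypothetical helper: the tool's real normalization logic is not shown in this README.
    return raw.strip().removeprefix("@")
```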

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--since DATE` | none | Fetch all videos posted on or after `DATE`. Accepts `YYYY-MM-DD` or ISO-8601 datetime. Assumed UTC if no timezone. Triggers multi-video output. |
| `--max N` | `30` | With `--since`: cap on how many videos to examine before giving up. A warning is printed to stderr if the cap is hit. |
| `--lang CODE` | first available | Preferred caption language code (e.g. `eng-US`). |
| `--format {text,vtt,json}` | `text` | Output format. `vtt` emits raw WebVTT and is not compatible with `--since`. |
| `--output PATH` / `-o` | stdout | Write output to a file instead of stdout. |
| `--video-url` | off | Also print the source video URL(s) to stderr. |
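
The `--since` rules above (date or ISO-8601 datetime, UTC when no timezone is given) could be implemented roughly as follows. This is a sketch under those stated rules only; `parse_since` is a hypothetical name and the tool's actual parser may differ.

```python
from datetime import datetime, timezone


def parse_since(value: str) -> datetime:
    """Parse YYYY-MM-DD or an ISO-8601 datetime; naive values are treated as UTC."""
    text = value.strip()
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset to stay compatible with 3.10.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        raise ValueError(f"invalid --since value: {value!r}") from None
    if parsed.tzinfo is None:
        # No timezone supplied: assume UTC, as the option description says.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

A date-only value such as `2026-03-22` resolves to midnight UTC on that day.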

Examples

# Plain-text transcript of the latest video (description, then transcript)
tiktok-transcript tiktok

# Structured JSON with full metadata (stats, author, music, video specs, …)
tiktok-transcript tiktok --format json

# Raw WebVTT caption file
tiktok-transcript tiktok --format vtt

# All videos posted on or after a given date, as a JSON array
tiktok-transcript tiktok --since 2026-03-22 --format json

# All videos since a date, as concatenated text blocks with date/URL headers
tiktok-transcript tiktok --since 2026-03-22

# Go back further, raising the safety cap
tiktok-transcript tiktok --since 2025-01-01 --max 500

# Pick a specific caption language if available
tiktok-transcript tiktok --lang eng-US

# Save to a file
tiktok-transcript tiktok --format json -o tiktok-latest.json

Composing with other tools

stdout contains only the transcript / JSON / VTT. Diagnostics go to stderr, so piping is clean:

tiktok-transcript tiktok | wc -w
tiktok-transcript tiktok --format json | jq -r .transcript | pbcopy
tiktok-transcript tiktok --since 2026-04-01 --format json \
    | jq -r '.[] | "\(.create_time[:10])  \(.stats.plays) plays  \(.description)"'
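
For readers who prefer Python to jq, the last pipeline could be approximated as below. It is a sketch that relies only on the `create_time`, `stats.plays`, and `description` fields shown in the JSON schema; `summarize` is a hypothetical helper name.

```python
import json


def summarize(json_text: str) -> list[str]:
    """Return one 'date  N plays  description' line per video in the JSON output."""
    videos = json.loads(json_text)
    if isinstance(videos, dict):
        # Single-video mode emits one object rather than an array.
        videos = [videos]
    return [
        f"{v['create_time'][:10]}  {v['stats']['plays']} plays  {v['description']}"
        for v in videos
    ]
```

Feed it the captured stdout of a `--format json` run; because diagnostics go to stderr, the text parses as JSON without filtering.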

JSON schema

Single-video mode (--format json without --since) returns one object. --since returns an array of objects with the same shape.

{
  "username": "tiktok",
  "video_id": "7628619815690833183",
  "video_url": "https://www.tiktok.com/@tiktok/video/...",
  "description": "a reminder to romanticize the little moments ❤️",
  "create_time": "2026-04-14T14:21:33+00:00",
  "create_time_unix": 1776176493,
  "language": "en",
  "location": "US",
  "hashtags": [],
  "mentions": [],
  "is_ai_generated": false,
  "author": {
    "unique_id": "tiktok",
    "nickname": "TikTok",
    "id": "107955",
    "sec_uid": "MS4wLj...",
    "verified": true,
    "signature": "One TikTok can make a big impact"
  },
  "author_stats": {
    "followers": 93800000,
    "following": 3,
    "likes": 457300000,
    "videos": 1422
  },
  "stats": {
    "plays": 289500, "likes": 6674, "comments": 2188,
    "shares": 714, "bookmarks": 699, "reposts": 0
  },
  "music": {
    "title": "original sound", "author": "TikTok",
    "original": true, "duration_seconds": 48, "id": "7628..."
  },
  "video": {
    "duration_seconds": 48, "width": 720, "height": 1280,
    "format": "mp4", "codec": "h264",
    "cover_url": "https://...", "download_url": "https://..."
  },
  "transcript_lang": "eng-US",
  "transcript": "I would love to hear all of the like\nlittle ways that..."
}

When a video has no captions, transcript and transcript_lang are null; all other fields are still populated.
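
Consumers of the JSON should therefore treat `transcript` as nullable. The sketch below shows one way to do so, counting words per captioned video and listing the IDs without captions; `transcript_word_counts` is a hypothetical name used only for illustration.

```python
import json


def transcript_word_counts(json_text: str) -> tuple[dict[str, int], list[str]]:
    """Return ({video_id: word_count}, [video_ids without captions])."""
    videos = json.loads(json_text)
    if isinstance(videos, dict):
        # Single-video mode emits one object rather than an array.
        videos = [videos]
    counts: dict[str, int] = {}
    missing: list[str] = []
    for video in videos:
        transcript = video["transcript"]
        if transcript is None:
            # Videos without a caption track carry transcript: null.
            missing.append(video["video_id"])
        else:
            counts[video["video_id"]] = len(transcript.split())
    return counts, missing
```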

Exit codes

| Code | Meaning |
| --- | --- |
| `0` | Success |
| `1` | Fetch or parse error (unknown user, TikTok blocked us, schema changed) |
| `2` | Latest video has no native captions (single-video mode), or invalid flag combination (e.g. `--since --format vtt`) |
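
A calling script can branch on these codes. The following sketch wraps a command with `subprocess` and maps the return code to the meanings in the table above; `run_cli` and `describe_exit` are hypothetical helpers, and the wording of the messages is an assumption.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "success",
    1: "fetch or parse error",
    2: "no native captions, or invalid flag combination",
}


def run_cli(argv: list[str]) -> tuple[int, str, str]:
    """Run a command and return (exit code, stdout, stderr) as text."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def describe_exit(code: int) -> str:
    """Translate an exit code into a short human-readable description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")
```

For example, `run_cli(["tiktok-transcript", "tiktok", "--format", "json"])` returns the JSON on stdout when the code is `0`.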

How it works

yt-dlp enumerates the user's most recent posts via TikTok's internal signed feed API. It is used only to discover video IDs and URLs.
For each video, we fetch the public video page with curl_cffi, which impersonates a real Chrome browser's TLS and HTTP/2 fingerprint. TikTok's WAF answers requests that lack such a fingerprint with a "please wait…" stub.
The page embeds a JSON blob in a <script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"> tag; we parse out the cla_info.caption_infos[].url field, which points to a direct WebVTT file.
The VTT is downloaded and converted to plain text (or kept raw with --format vtt).
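
The last two steps can be sketched without any network access. The code below is an illustration, not the tool's implementation: all function names are hypothetical, the exact nesting of `cla_info` inside the blob is assumed unknown (hence the recursive search), and the VTT conversion simply keeps cue text and drops headers and timing lines.

```python
import json
import re

# Matches the rehydration <script> tag and captures its JSON body.
SCRIPT_RE = re.compile(
    r'<script[^>]*\bid="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_rehydration_data(html: str) -> dict:
    """Parse the JSON blob embedded in the video page."""
    match = SCRIPT_RE.search(html)
    if match is None:
        raise ValueError("Could not find __UNIVERSAL_DATA_FOR_REHYDRATION__ in page")
    return json.loads(match.group(1))


def find_caption_infos(node) -> list:
    """Collect every entry of every 'caption_infos' list, at any depth."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "caption_infos" and isinstance(value, list):
                found.extend(value)
            else:
                found.extend(find_caption_infos(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_caption_infos(item))
    return found


def vtt_to_text(vtt: str) -> str:
    """Keep only cue text from a WebVTT document, one caption line per line."""
    lines = []
    # Blocks are separated by blank lines; only blocks with a timing line are cues.
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        block_lines = block.split("\n")
        for index, line in enumerate(block_lines):
            if "-->" in line:
                # Everything after the timing line is the cue's text.
                lines.extend(t.strip() for t in block_lines[index + 1:] if t.strip())
                break
    return "\n".join(lines)
```

Each entry returned by `find_caption_infos` is expected to carry the `url` field named above, which points at the WebVTT file.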

Limitations

Schema drift: depends on TikTok's web JSON structure. If TikTok renames the caption fields or the WAF starts challenging curl_cffi's latest Chrome profile, the tool will break until updated.
Caption quality: transcripts are TikTok's machine-generated captions — generally good but not perfect, especially on music-heavy or multi-speaker content.
Missing captions: a small fraction of videos (~5–10%) have no caption track (music-only, very old, creator opted out). Single-video mode exits 2; multi-video mode includes them with transcript: null. An audio-transcription fallback via Whisper is planned (pip install tiktok-transcript[whisper]) but not yet implemented.
Rate limiting: TikTok's WAF will throttle rapid bursts of video-page requests. Single-video mode makes 2–3 HTTP calls and is fine for casual use. --since over a long window makes one request per video, so keep the window and --max modest and avoid running it in a tight loop.
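
If you script many runs yourself, spacing the calls out is one way to stay under the throttle. The sketch below enforces a minimum interval between successive calls; `throttled_map` is a hypothetical helper and the 2-second default is an arbitrary assumption, not a documented TikTok limit.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(
    fetch: Callable[[T], R],
    items: Iterable[T],
    min_interval: float = 2.0,  # assumed value; tune to what the site tolerates
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[R]:
    """Call fetch(item) for each item, at least min_interval seconds apart."""
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the interval since the previous call began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield fetch(item)
```

The `sleep` and `clock` parameters exist only so the pacing can be tested without real delays.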

Troubleshooting

error: Could not find __UNIVERSAL_DATA_FOR_REHYDRATION__ in page
TikTok's WAF challenged the request, usually because your curl_cffi Chrome-fingerprint profile has aged. Upgrade: uv tool upgrade tiktok-transcript, or pip install -U curl_cffi inside the venv.

error: User @X not found
The handle doesn't exist, is suspended, or was misspelled. Double-check on tiktok.com.

Hit the --max limit without finding a cutoff video
Re-run with a larger --max (e.g. --max 500). The warning is advisory; the results already collected are still emitted.
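
The interaction between `--since` and `--max` can be pictured as the loop below. It is a sketch that assumes the feed is iterated newest first, which this README does not state; `collect_since` and the warning text are hypothetical.

```python
import sys
from datetime import datetime, timezone
from typing import Iterable


def collect_since(
    videos: Iterable[tuple[datetime, str]],
    since: datetime,
    max_videos: int = 30,
) -> tuple[list[str], bool]:
    """Collect payloads posted on or after `since`; return (results, cap_was_hit)."""
    collected: list[str] = []
    feed = iter(videos)  # assumed to yield (create_time, payload), newest first
    for _ in range(max_videos):
        try:
            created, payload = next(feed)
        except StopIteration:
            return collected, False  # feed exhausted before the cap
        if created < since:
            return collected, False  # found the cutoff video; stop here
        collected.append(payload)
    # Cap reached without seeing a video older than `since`: warn, keep results.
    print(f"warning: examined {max_videos} videos without reaching the cutoff", file=sys.stderr)
    return collected, True
```

Whatever was collected before the cap is still returned, mirroring the advisory nature of the warning.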

License

MIT. See the LICENSE file in the repository root.
