A command line utility for working with Quake 1 / 2 .pak files,
SiN .sin archives, Daikatana .pak archives (read-only),
Quake 3 .pk3 files, and Doom 3 .pk4 files.
Why 'pakka', well pak files, and I have kids and Makka Pakka is their favourite In the Night Garden character.
I am a novice C programmer. This code is the result of a learning exercise. Given that C programs can be very brittle, and behave in unexpected ways, I would strongly suggest you do not use this software :) If you are feeling very brave and do choose to use this software and find a problem, please let me know!
$ git clone https://github.com/ajbonner/pakka.git
$ make
Requires GNU make. On BSD systems, install gmake and use gmake in place of
make.
make produces two artifacts:
./pakka— the command-line binary documented below.build/lib/libpakka.a— a static library exposing pakka's archive operations as a C99 API. The public header is atinclude/pakka.h; every exported symbol is prefixedpakka_andmake symbol-auditfails the build if anything else leaks out. See the C library section below.
The Makefile is Unix-only. On Windows, build via CMake + cl.exe from a
Developer Command Prompt (or after running vcvarsall.bat x64):
> cmake -B build\cmake -G Ninja .
> cmake --build build\cmake
Produces build\cmake\pakka.exe. The CMake build is additive — it compiles
the same src/*.c (including src/pk3file.c and the vendored
src/vendor/puff/puff.c INFLATE decoder) plus the
src/vendor/wingetopt (getopt) and src/vendor/dirent
(opendir/readdir) polyfills under _WIN32.
Pakka has 6 major modes (one per invocation), each working on .pak
(Quake 1 / 2), .sin (SiN), Daikatana .pak (read-only), .pk3
(Quake 3), and .pk4 (Doom 3) archives. PK3 and PK4 are byte-identical
ZIP containers; the extension only changes the label returned by
pakka_format(). Daikatana shares Quake's "PACK" magic — pakka
probes both directory layouts at open time, and --format daikatana
pins the decision when the archive is ambiguous.
The archive path is the first positional argument; any further
positionals are entry names (for -x / -d) or source files (for -a
/ -c). All option flags (including -C <dest>) must come before the
archive — pakka's getopt is POSIX-strict and stops scanning at the
first positional.
- List pak contents:
./pakka -l <archive>--treerenders the listing as a UTF-8 box-drawing directory tree.
- Extract:
./pakka -x [-C <destination>] <archive> [path...]- With no path arguments, every entry is extracted.
- With one or more path arguments, only those entries are extracted; pakka errors if any requested path isn't in the archive.
-Cselects a destination directory; defaults to the current working directory.
- Create:
./pakka -c <archive> [file/dir...] [--as <entry_name> <source_path> ...]- Format is picked from the destination extension (case-insensitive):
.pk3→ PK3,.pk4→ PK4,.sin→ SiN, anything else → PAK.--format <name>(pak,sin,pk3,pk4) overrides the extension.--format daikatanais rejected on-c(Daikatana archives are read-only).
- Format is picked from the destination extension (case-insensitive):
- Add to archive:
./pakka -a <archive> [file/dir...] [--as <entry_name> <source_path> ...]--as <entry_name> <source_path>adds a single file under an explicit entry name (the source path on disk and the name stored in the archive can differ). Repeatable; may be mixed with plain path arguments.
- Delete from archive:
./pakka -d <archive> [path...] - Verify:
./pakka --verify <archive>- Walks every entry, runs the same name-safety check used at extract
time, streams each payload to confirm the directory's
offset/lengthpoint at readable bytes, and flags entries that would collide after portable-union normalization (case fold, slash/backslash, trailing dot/space). Exits non-zero on any error-level finding.
- Walks every entry, runs the same name-safety check used at extract
time, streams each payload to confirm the directory's
./pakka -h (or --help) prints a usage summary; ./pakka -V (or
--version) prints the linked libpakka version, build date, and the
supported-format matrix.
| Format | Read | List | Extract | Create | Add | Delete | Verify |
|---|---|---|---|---|---|---|---|
| Quake 1 / 2 PAK | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
SiN (SPAK, 120-byte names) |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Daikatana (PACK, 72-byte entries, custom codec) |
✓ | ✓ | ✓ | — | — | — | ✓ |
| Quake 3 PK3 (STORED + DEFLATE) | ✓ | ✓ | ✓ | ✓ (STORED) | ✓ (STORED) | ✓ | ✓ |
| Doom 3 PK4 (STORED + DEFLATE) | ✓ | ✓ | ✓ | ✓ (STORED) | ✓ (STORED) | ✓ | ✓ |
PK3/PK4 write produces STORED-method entries only (no compression). Quake and Doom 3 engines accept STORED archives; the output is larger than a DEFLATE-compressed equivalent. DEFLATE encode support is deferred for both formats.
Daikatana is read-only: the custom byte-codec has no published encoder,
so pakka_create(PAKKA_FORMAT_DAIKATANA, ...) and the mutation modes
(-c/-a/-d with --format daikatana) return PAKKA_ERR_UNSUPPORTED
before any state changes. Compressed Daikatana entries decode through
pakka_dk_inflate (reference: yquake2/pakextract).
Refused for PK3/PK4 archives: ZIP64, encrypted entries, data descriptors
(general-purpose bit 3), multi-disk archives, and compression methods
other than STORED (0) and DEFLATE (8). Each refusal returns
PAKKA_ERR_UNSUPPORTED with a clear message.
include/pakka.h exposes 23 functions for opening, inspecting,
extracting from, and mutating pak archives:
- Version:
pakka_versionreturns the linked libpakka version string (e.g."1.4.0"), useful for bindings that want to feature-gate on the loaded library. - Archive lifecycle:
pakka_open/pakka_open_ex(lets the caller pin the format) /pakka_create/pakka_close(close implicitly commits on dirty) - Read introspection:
pakka_format/pakka_entry_count/pakka_entry_at/pakka_find_entry - Entry accessors over the opaque
pakka_entry_t:pakka_entry_name/pakka_entry_size/pakka_entry_offset - Streaming reads:
pakka_open_entry/pakka_reader_read/pakka_reader_close - Mutation:
pakka_add_file(with separate source path + entry name) /pakka_add_memory/pakka_delete/pakka_commit - Integrity:
pakka_verify(drives a caller-suppliedpakka_report_fncallback) - Tuning:
pakka_set_max_decompressed_size(caps PK3/PK4 DEFLATE and Daikatana decompressed payload size; default 64 MiB, 0 disables) - Memory convenience:
pakka_read_entry_alloc/pakka_free
Library functions never call exit and never write to stdout/stderr;
every failure returns a pakka_status_t and optionally populates a
caller-provided pakka_error_t with structured detail (errno or Win32
GetLastError, operation name, entry index, file offset, message).
For archives with compressed entries (PK3/PK4 DEFLATE and Daikatana),
pakka_set_max_decompressed_size(archive, max_bytes, err) caps the
bytes any single pakka_open_entry / pakka_read_entry_alloc will
inflate — refuses zip-bomb-style high-ratio entries before they hit
RAM. Default 64 MiB; pass 0 to disable. pakka_verify with the
PAKKA_VERIFY_DEEP flag adds per-entry CRC32 and decompression checks
on top of the structural walk (CRC for PK3/PK4; byte-count check for
Daikatana — the custom codec has no CRC).
tests/c_api_test.c exercises every one of those functions against
libpakka.a only (no internal headers) and is the canonical example
of call patterns.
Pak entry names are attacker-controlled data (up to 56 bytes for Quake
PAK / Daikatana, 120 bytes for SiN, up to the ZIP cap for PK3/PK4).
Extract, add, and verify share a fail-fast validator
(pakka_unsafe_entry_name) that is format-independent — it inspects
the byte content, not the field width — and rejects entry names that
would escape the destination or materialize as something dangerous on
disk. The same rules apply to the entry-name side of pakka_add_file
/ --as so a pak built with pakka can be re-extracted with pakka
without surprises.
Rejected entry names:
- empty names
- names beginning with
/or\(absolute paths) - drive-letter prefixes (
C:...,D:...) and UNC\\server\share - any path component that is exactly
..(slash- or backslash-separated) - any path component containing a colon (Windows alternate-data-stream
hazard, e.g.
file:stream) - any path component ending in
.or space (Windows silently strips these, sofoo.andfoowould collide) - control bytes (
0x00-0x1For0x7F) anywhere in the name - Windows reserved device names (
CON,PRN,AUX,NUL,COM1-COM9,LPT1-LPT9), including with extensions (NUL.txt) and in subdirectories (foo/CON).COM10is allowed — onlyCOM0-COM9are reserved.
Substring matches like foo..bar or ..png remain legal — only an exact
.. component is rejected. Validation runs before any mkdir, fread,
or fopen, so a malicious pak fails fast with no partial writes. Both
POSIX and Windows traversal forms are checked regardless of host OS,
since pak archives are portable.
Additional extraction-time protections:
- Normalized-collision preflight: before any byte hits disk, pakka
sorts the selected entries' normalized names (case-folded, with
\mapped to/and trailing dot/space stripped) and rejects the whole extraction if two entries would materialize to the same path on Windows or HFS+. Pre-fix this could silently overwrite files on case-insensitive filesystems. - Symlink / reparse-point rejection during write: every directory
descent into the
-Cdestination tree (and every leaf open) usesopenat/O_NOFOLLOWon modern POSIX, anfchdir(O_NOFOLLOW)emulation on legacy POSIX, and aGetFileAttributesA-based reparse check on Windows. A planted symlink in the destination cannot redirect a write outside the requested directory. - Symlink rejection on recursive add: when
-arecurses into a directory, any symlink or reparse point found inside the tree is reported and skipped rather than silently followed.
pakka --verify runs the same name-safety check and the same
normalized-collision scan without touching disk, so an archive can be
audited before extraction.
The repo includes an integration test suite that downloads id's shareware Quake
pak0.pak (with a pinned SHA256) and exercises pakka against it: round-trip,
extract-specific, add, delete (including the head, the tail, and every entry),
--verify, --as aliasing, and rejection of malformed paks.
$ make test
make test depends on two gates that run before any bats test:
make symbol-audit— runsnm -g build/lib/libpakka.aand fails the build if any defined global lacks thepakka_prefix. Keeps the library namespace-clean.tests/c_api_test.c— a 900-line C-API exerciser linked only againstlibpakka.a(no internal headers). Covers NULL tolerance, round-trips, structured-error population, and thepakka_open_entry/pakka_reader_readstreaming surface that the bats CLI tests can't reach.
Requires bats-core
(brew install bats-core, apt install bats, pkg install bats-core, etc.),
plus curl, openssl, and tar for fetching and verifying the test fixture.
There's no make test one-shot on Windows because the build (MSVC cl.exe)
and the test runner (bats under MSYS2 bash) live in different shells. After
building pakka.exe via CMake (see Installation), open the MSYS2 MSYS
shell, cd to the repo, and run:
$ make fixture # download + SHA-verify pak0.pak
$ export PAKKA="$PWD/build/cmake/pakka.exe" # must be a fully-qualified path
$ bats tests/
Gotcha: PAKKA has to be absolute, not relative. bats may chdir
into a temp directory inside setup_file before invoking pakka, so a
relative path resolves wrong and every test fails to find the binary.
bats-core isn't in any MSYS2 repo; install from upstream once:
$ git clone --depth 1 --branch v1.13.0 \
https://github.com/bats-core/bats-core.git /tmp/bats-core
$ /tmp/bats-core/install.sh /usr/local
MSYS2 packages needed alongside bats: pacman -S --needed bash coreutils curl diffutils git openssl tar make.
make slow-test is an opt-in heavier suite. It downloads id's Q3 demo
wrapper from archive.org (~93 MiB, SHA256-pinned), uses pakka itself to
extract the inner 47 MiB pak0.pk3 (1,274 real id-built entries), and
runs the full read / list / extract / structural-verify / deep-verify
sequence against it. Not part of make test because of the download
size and archive.org's intermittent availability.
make lint runs clang-tidy with
a curated check set (.clang-tidy at the repo root). On macOS, install via
brew install llvm and either add $(brew --prefix llvm)/bin to PATH or
invoke as make CLANG_TIDY=$(brew --prefix llvm)/bin/clang-tidy lint. On Linux:
apt install clang-tidy.
The code aims to be POSIX-portable across Linux, BSD, and macOS, both
endiannesses, both word sizes. The on-disk pak format is canonically
little-endian; reads and writes go through pakka_read_u32_le /
pakka_write_u32_le helpers so the same binary works on big-endian
hosts.
CI coverage currently includes:
| OS | Architecture | libc / toolchain | Coverage |
|---|---|---|---|
| macOS 26 | Intel x86_64 | Apple clang | full bats |
| macOS 26 | Apple Silicon arm64 | Apple clang | full bats |
| Ubuntu latest | x86_64 | glibc / gcc | full bats |
| Ubuntu 24.04 | arm64 | glibc / gcc | full bats |
| Alpine 3.23 | x86_64 | musl / gcc | full bats |
| Ubuntu latest | x86_64 (-m32) |
glibc / gcc (32-bit ABI) | full bats |
| Debian bookworm-slim (Docker / QEMU) | s390x | glibc / gcc, big-endian | full bats |
| FreeBSD 15.0 | amd64 | clang | full bats |
| FreeBSD 15.0 | amd64 (-m32) |
clang (32-bit ABI) | full bats |
| OpenBSD 7.8 | amd64 | clang | full bats |
| Windows Server 2025 (VS 2026) | x86_64 | MSVC cl.exe + CMake/Ninja | full bats |
| Debian Sarge-derived (Docker) | i386 | glibc 2.3.2 / gcc 3.3 / GNU make 3.79.1 | build + symlink-safe extract check |
| Red Hat Linux 9 rootfs (Docker) | i386 | glibc 2.3.2 / gcc 3.2.2 / GNU make 3.79.1 | build + symlink-safe extract check |
| NetBSD 3.0 (QEMU) | sparc | BSD libc / gcc 3.3 / GNU make 3.81, big-endian | build + symlink-safe extract check |
The -m32 jobs build 32-bit binaries on a 64-bit kernel/libc — they exercise
the 32-bit code path but aren't a true i386 OS.
The s390x and NetBSD/sparc jobs run under QEMU emulation on x86_64
runners. They are the slowest jobs in the matrix but exist deliberately:
s390x catches byte-order regressions in any new code that touches the
on-disk format, and NetBSD/sparc is the only job that exercises the
pre-openat (PAKKA_LEGACY_EXTRACT) BSD code path. Modern BSDs in the
matrix all take the openat/mkdirat path.
The legacy build jobs run make + a banner + symlink-safe extract
check (not the full bats suite — bash 2.05b / minimal userland on
the older guests). The Debian Sarge and Red Hat Linux 9 jobs keep
pakka's Linux legacy floor intact: gcc 3.0+ (first FSF release with
adequate C99 support), glibc 2.2.5+, GNU make 3.79.1+, Linux kernel
2.4+. Concrete distros that meet that floor with default packages
include Red Hat Linux 8.0 (Sept 2002), Red Hat Linux 9 (March 2003),
and Debian 3.1 Sarge (June 2005). The NetBSD/sparc job is a separate
class — BSD libc + big-endian SPARC + the pre-openat BSD branch of
PAKKA_LEGACY_EXTRACT (NetBSD < 6.0, FreeBSD < 8.0, OpenBSD < 5.0).
The Windows job builds pakka.exe via CMake/MSVC and runs the bats suite
through MSYS2 bash. POSIX builds remain Makefile-driven; CMake is the
Windows-only path.
- Fork it!
- Create your feature branch:
git checkout -b my-new-feature - Commit your changes:
git commit -am 'Add some feature' - Push to the branch:
git push origin my-new-feature - Submit a pull request :D
Created as an excuse to re-learn some C circa December '15–January '16.
Picked up again in 2026 for a substantial cleanup pass — warning-free under
-Wall --std=c99 --pedantic, several real bugs fixed, integration test suite
added, and multi-platform CI added.
John Carmack for not only creating quake, but then open sourcing it and its successor games. He is the reason I am a programmer today.
I am unsure who the correct author is, and the original link is now long since dead, but I followed Quake PAK File Format in building this utility.
The Yamagi Quake II team's
pakextract (BSD-2-Clause) is
the canonical reference for the SiN and Daikatana on-disk formats —
SiN's SPAK + 120-byte filename layout, Daikatana's 72-byte directory
entry, and especially Daikatana's custom byte-codec compression
(documented opcode-by-opcode at
src/dk_codec.c).
MIT