Open
Commits
4557 commits
a70cb46
Merge commit 'caa680a26f4a7ccaac13465e8ab7d407e524dfb8' as 'docs'
pirate May 31, 2026
113fae8
docs: rewrite Configuration.md (remove plugin tree, add Database/craw…
pirate May 31, 2026
249cbcd
fix: run crawl delete test through orchestrator
pirate May 31, 2026
76c8ca4
refactor: remove ENABLED_PLUGINS config, use PLUGINS as the single pl…
pirate May 31, 2026
3b9d50c
release: v0.9.33rc56
pirate May 31, 2026
01d6ae2
docs: Configuration.md accuracy pass + remove internal-only options
pirate May 31, 2026
deb0940
fix(snapshots): honor SNAPSHOTS_PER_PAGE without silent clamping; bum…
pirate May 31, 2026
5b78481
fix: adapt replay asset response handling
pirate May 31, 2026
46b547b
docs: fix stale config refs across wiki pages + minor typo fixes
pirate May 31, 2026
f493243
docs: regenerate sphinx apidocs to reflect current source tree
pirate May 31, 2026
9bcba41
release: archivebox 0.9.33rc58
pirate May 31, 2026
fb35fdb
Add ArchiveBox agent skill
pirate May 31, 2026
2694f43
release: archivebox 0.9.33rc59
pirate May 31, 2026
ff64b91
release: archivebox 0.9.33rc60
pirate May 31, 2026
d4b3ea3
release: archivebox 0.9.33rc61
pirate May 31, 2026
f8bfc4b
release: archivebox 0.9.33rc62
pirate May 31, 2026
3bb8ce5
release: archivebox 0.9.33rc63
pirate May 31, 2026
2a13b15
release: archivebox 0.9.33rc64
pirate May 31, 2026
608e448
release: archivebox 0.9.33rc65
pirate May 31, 2026
d458bf5
release: archivebox 0.9.33rc66
pirate May 31, 2026
6950399
release: archivebox 0.9.33rc67
pirate May 31, 2026
7062f9f
fix: derive binary provider runtime paths
pirate May 31, 2026
a412603
fix: derive browser cache path in multistage docker
pirate May 31, 2026
8450143
fix: avoid lib bin docker permissions
pirate May 31, 2026
a1e518e
docs: clarify root data dir warning
pirate May 31, 2026
81212cb
fix: keep web add redirects on admin host
pirate May 31, 2026
0953923
ci: rerun with latest plugin fixtures
pirate May 31, 2026
4c2181e
test: expect compact machine admin urls
pirate May 31, 2026
cb0066c
ci: rerun with browser provider fixes
pirate May 31, 2026
e7fbb06
ci: rerun with puppeteer provider fix
pirate May 31, 2026
d4662bd
ci: rerun with puppeteer chromium deps fix
pirate May 31, 2026
d6f1d17
ci: rerun with browser provider normalization
pirate May 31, 2026
1d80da3
ci: rerun with chromium provider fallback
pirate May 31, 2026
bc7fa44
release: archivebox 0.9.33rc68
pirate May 31, 2026
ec932aa
release: archivebox 0.9.33rc69
pirate May 31, 2026
e4fb5a9
release: archivebox 0.9.33rc70
pirate Jun 1, 2026
bebd2d6
release: archivebox 0.9.33rc71
pirate Jun 1, 2026
ef016ac
release: archivebox 0.9.33rc72
pirate Jun 1, 2026
5063290
release: archivebox 0.9.33rc73
pirate Jun 1, 2026
3b6625d
release: archivebox 0.9.33rc74
pirate Jun 1, 2026
84bb9a2
release: archivebox 0.9.33rc75
pirate Jun 1, 2026
344ca46
release: archivebox 0.9.33rc76
pirate Jun 1, 2026
211c3e2
release: archivebox 0.9.33rc77
pirate Jun 1, 2026
b268926
release: archivebox 0.9.33rc79
pirate Jun 1, 2026
28860d0
release: archivebox 0.9.33rc80
pirate Jun 1, 2026
cab05eb
Refactor plugins search progress and config flows
pirate Jun 1, 2026
72a67bd
Project abxpkg binary events
pirate Jun 1, 2026
3871a4a
release: archivebox 0.9.33rc68
pirate Jun 1, 2026
78da98e
release: archivebox 0.9.34rc1
pirate Jun 1, 2026
96ffc76
release: archivebox 0.9.34rc2
pirate Jun 1, 2026
a840b0c
release: archivebox 0.9.34rc3
pirate Jun 1, 2026
ad3c466
release: archivebox 0.9.34rc4
pirate Jun 1, 2026
2b3065b
release: archivebox 0.9.34rc5
pirate Jun 1, 2026
4900327
release: archivebox 0.9.34rc6
pirate Jun 1, 2026
8fd6743
fix: ensure archivewebpage is prebaked in docker image
pirate Jun 1, 2026
bfeaebc
release: archivebox 0.9.34rc7
pirate Jun 1, 2026
e8be088
fix: retry docker binary projection for prebaked deps
pirate Jun 1, 2026
8dc6577
release: archivebox 0.9.34rc8
pirate Jun 1, 2026
d553b6a
fix: project docker prebaked binary deps
pirate Jun 1, 2026
11c0e37
release: archivebox 0.9.34rc9
pirate Jun 1, 2026
a0a9958
fix: project liteparse docker binary deps
pirate Jun 1, 2026
0538348
release: archivebox 0.9.34rc10
pirate Jun 1, 2026
193e646
release: archivebox 0.9.34rc11
pirate Jun 1, 2026
003f05b
fix: project mercury docker binary deps
pirate Jun 1, 2026
569fa55
release: archivebox 0.9.34rc12
pirate Jun 1, 2026
6dbe37f
fix: project readability docker binary deps
pirate Jun 1, 2026
086abfa
release: archivebox 0.9.34rc13
pirate Jun 1, 2026
ad13744
fix: project papersdl docker binary deps
pirate Jun 1, 2026
9bfb6fd
release: archivebox 0.9.34rc14
pirate Jun 1, 2026
78e7604
fix: pre-bake gallery and ocr docker deps
pirate Jun 1, 2026
cc2a594
release: archivebox 0.9.34rc15
pirate Jun 1, 2026
f72f8c9
release: archivebox 0.9.34rc16
pirate Jun 1, 2026
579ede0
fix: pre-bake rss parser docker deps
pirate Jun 1, 2026
28df802
release: archivebox 0.9.34rc17
pirate Jun 1, 2026
26ada33
release: archivebox 0.9.34rc18
pirate Jun 1, 2026
c46e826
fix: pass abxpkg install timeout in docker builds
pirate Jun 1, 2026
fdf57cb
release: archivebox 0.9.34rc19
pirate Jun 1, 2026
7cbbe50
fix: avoid aiohttp source builds in docker
pirate Jun 1, 2026
492c897
release: archivebox 0.9.34rc20
pirate Jun 1, 2026
bb40308
release: archivebox 0.9.34rc21
pirate Jun 1, 2026
c77ef74
fix: install docker validation binaries
pirate Jun 1, 2026
dd96e56
release: archivebox 0.9.34rc22
pirate Jun 1, 2026
a5bbdd4
fix: skip optional captcha docker preinstall
pirate Jun 1, 2026
36047a4
release: archivebox 0.9.34rc23
pirate Jun 1, 2026
813723f
fix: disable docker pip release age gate
pirate Jun 1, 2026
f07bdba
release: archivebox 0.9.34rc24
pirate Jun 1, 2026
c789e98
fix: avoid optional docker binary validation
pirate Jun 1, 2026
c9bba69
release: archivebox 0.9.34rc25
pirate Jun 1, 2026
ddb861b
fix: clean docker data dir before init
pirate Jun 1, 2026
ae43b27
release: archivebox 0.9.34rc26
pirate Jun 1, 2026
4e38ab9
fix: pin docker validation binary paths
pirate Jun 1, 2026
323ab9a
release: archivebox 0.9.34rc27
pirate Jun 1, 2026
75441eb
fix: restart supervised runner from add view
pirate Jun 1, 2026
bc8d8bd
fix: support replay assets without base url
pirate Jun 1, 2026
453d998
fix: schedule background admin crawls
pirate Jun 1, 2026
c075d65
Consolidate runtime config handling
pirate Jun 1, 2026
40d9ab5
release: archivebox 0.9.34rc28
pirate Jun 1, 2026
051d166
Tighten state machine model typing
pirate Jun 1, 2026
065fcfc
Preserve queued index jobs during reindex
pirate Jun 1, 2026
5c3161a
release: archivebox 0.9.34rc29
pirate Jun 2, 2026
62fb2c8
release: archivebox 0.9.34rc30
pirate Jun 2, 2026
42fd16b
release: archivebox 0.9.34rc31
pirate Jun 2, 2026
e1c84e7
release: archivebox 0.9.34rc32
pirate Jun 2, 2026
24e7552
release: archivebox 0.9.34rc33
pirate Jun 2, 2026
ac6e018
release: archivebox 0.9.34rc34
pirate Jun 2, 2026
bcfb8c1
release: archivebox 0.9.34rc35
pirate Jun 2, 2026
00cf5c9
release: archivebox 0.9.34rc36
pirate Jun 2, 2026
7dd738b
release: archivebox 0.9.34rc37
pirate Jun 2, 2026
96437e1
Publish local ArchiveBox changes
pirate Jun 2, 2026
b46d142
test cleanup
pirate Jun 2, 2026
7e9d765
release: archivebox 0.9.34rc38
pirate Jun 2, 2026
1fb2d10
ci: allow empty archivebox test modules
pirate Jun 2, 2026
c7cd4de
ci: stabilize archivebox parallel shards
pirate Jun 2, 2026
1320d08
ci: tolerate transient sqlite locks in api workflow
pirate Jun 2, 2026
ce6dbdf
test: keep cli update and extract shards focused
pirate Jun 2, 2026
8aaa236
test: stabilize archivebox ci shards
pirate Jun 2, 2026
2dadc43
test: stabilize cli add status shards
pirate Jun 2, 2026
f3af8a1
test: tolerate queued api archive results
pirate Jun 2, 2026
69619db
test: tolerate sealed paused crawl resume
pirate Jun 2, 2026
2cd8090
test: install chrome deps for cli add coverage
pirate Jun 2, 2026
042339c
test: use local site for recursive cli add
pirate Jun 2, 2026
d8fec2e
test: fix ui add runtime cli env call
pirate Jun 2, 2026
cdec5ea
test: stabilize archivewebpage browser preview
pirate Jun 3, 2026
e9968b7
test: rely on archivewebpage plugin dependencies
pirate Jun 3, 2026
39eac65
Stabilize frozen config CLI test flows
pirate Jun 1, 2026
e4ec848
fix: keep background add crawls runnable
pirate Jun 3, 2026
ad43842
test: stabilize update api and title ci checks
pirate Jun 3, 2026
81de752
test: provision chrome for title config check
pirate Jun 3, 2026
9b5aa68
test: stabilize api workflow and chrome title checks
pirate Jun 3, 2026
f4ec16c
test: create extract snapshots via cli
pirate Jun 3, 2026
cd7e7cd
test: make extract cli setup deterministic
pirate Jun 3, 2026
43606e1
test: let recursive crawl finish before assertions
pirate Jun 3, 2026
acc830d
test: exercise extract cli with real outputs
pirate Jun 3, 2026
9f05448
test: require success in cli workflows
pirate Jun 3, 2026
4e9c8b1
ci: install chrome through archivebox
pirate Jun 3, 2026
3669133
fix: allow install to initialize collections
pirate Jun 3, 2026
83a2099
fix: scope update search backfill runner
pirate Jun 3, 2026
6da1d3f
ci: give archivebox chrome install a real timeout
pirate Jun 3, 2026
c0fb8eb
release: archivebox 0.9.34rc39
pirate Jun 4, 2026
e0c3ca6
release: archivebox 0.9.34rc40
pirate Jun 4, 2026
1ab9902
release: archivebox 0.9.34rc41
pirate Jun 4, 2026
7698155
release: archivebox 0.9.34rc42
pirate Jun 4, 2026
1cf0d18
release: archivebox 0.9.34rc43
pirate Jun 4, 2026
afa958a
release: archivebox 0.9.34rc44
pirate Jun 4, 2026
74690eb
fix: wait for released deps before pip build
pirate Jun 4, 2026
b8b70f3
release: archivebox 0.9.34rc45
pirate Jun 4, 2026
bcd10ea
fix: resolve local package deps in archivebox ci
pirate Jun 4, 2026
9e06c32
fix: install archivebox extras explicitly in ci
pirate Jun 4, 2026
50ff26b
release: archivebox 0.9.34rc46
pirate Jun 4, 2026
8e1c7dc
fix: wait for released deps before docker build
pirate Jun 4, 2026
5b74eb5
fix: publish pip builds from pypi environment
pirate Jun 4, 2026
77ee827
release: archivebox 0.9.34rc47
pirate Jun 4, 2026
327e378
release: archivebox 0.9.34rc48
pirate Jun 4, 2026
d26e27d
release: archivebox 0.9.34rc49
pirate Jun 4, 2026
bb0bf81
release: v0.9.34rc49
pirate Jun 4, 2026
40d5dc0
release: v0.9.34rc50
pirate Jun 4, 2026
2e9686b
release: v0.9.34rc51
pirate Jun 4, 2026
3c6d94d
release: v0.9.34rc51
pirate Jun 4, 2026
79b5994
release: v0.9.34rc52
pirate Jun 4, 2026
083ee8c
Remove abx-plugins Docker commit pin
pirate Jun 4, 2026
70b325f
release: archivebox 0.9.34rc53
pirate Jun 4, 2026
caf0acd
release: archivebox 0.9.34rc54
pirate Jun 4, 2026
5de536e
release: archivebox 0.9.34rc55
pirate Jun 4, 2026
8367578
Update snapshot admin selection behavior
pirate Jun 4, 2026
cf4bd95
fix private snapshot replay auth
pirate Jun 4, 2026
91f6106
release: archivebox 0.9.34rc56
pirate Jun 4, 2026
9a1919c
fix docker env syntax
pirate Jun 4, 2026
364efdf
fix release pypi retry loop
pirate Jun 4, 2026
ba895db
release: archivebox 0.9.34rc58
pirate Jun 4, 2026
bdff741
release: archivebox 0.9.34rc59
pirate Jun 4, 2026
b74861e
Support URLs up to 8000 chars via variable-length indexed TextField
claude Jun 4, 2026
e4df2d6
Keep MAX_URL_LENGTH at 65535; only switch url storage to TextField
claude Jun 5, 2026
7f8af63
fix index and binary runner lifecycle
pirate Jun 5, 2026
73587a1
use archivebox plugin discovery for extraction queues
pirate Jun 5, 2026
4e4ee8c
fix runner stdin and update maintenance lifecycle
pirate Jun 5, 2026
a99321c
fix disabled extractor test env scope
pirate Jun 5, 2026
16cd383
use archivebox installs in browser security tests
pirate Jun 5, 2026
466238d
Use published abx-dl image for Docker builds
pirate Jun 5, 2026
232de59
Fix ArchiveBox Docker runtime layering
pirate Jun 5, 2026
1b19736
Recover interrupted hook work by hook identity
pirate Jun 5, 2026
1987edb
Use shared env opencode binary and process-linked crawl test
pirate Jun 5, 2026
88dc690
Fix queued hook recovery tests
pirate Jun 5, 2026
f39c5d3
Speed up remove after flag test
pirate Jun 5, 2026
d6d479f
Merge branch 'dev' into claude/festive-hopper-0p7Bj
pirate Jun 6, 2026
7cfebc7
Simplify Docker image layering
pirate Jun 6, 2026
a0aec80
Support URLs up to 8000 chars with TextField storage (#1817)
pirate Jun 6, 2026
0c98465
Avoid Docker bytecode cleanup
pirate Jun 7, 2026
a25adf2
Build ArchiveBox on compact abx-dl image
pirate Jun 7, 2026
f9b13e3
release: archivebox 0.9.34rc61
pirate Jun 7, 2026
36e0e3e
release: archivebox 0.9.34rc62
pirate Jun 7, 2026
0a3c89e
release: archivebox 0.9.34rc63
pirate Jun 7, 2026
2009064
release: archivebox 0.9.34rc64
pirate Jun 7, 2026
d14dd37
release: archivebox 0.9.34rc65
pirate Jun 7, 2026
e580f1a
release: archivebox 0.9.34rc66
pirate Jun 7, 2026
8a22449
Update apt install documentation
pirate Jun 7, 2026
2b9a1a7
Use mkdir -p in install docs
pirate Jun 7, 2026
87b5183
release: archivebox 0.9.34rc67
pirate Jun 7, 2026
1ba5281
release: archivebox 0.9.34rc68
pirate Jun 7, 2026
ea82a36
release: archivebox 0.9.34rc69
pirate Jun 7, 2026
c306018
test: expect normalized snapshot permissions
pirate Jun 7, 2026
03879c4
ci: install sonic for runtime search shards
pirate Jun 7, 2026
6c497e5
ci: install sonic backend plugin for sonic shards
pirate Jun 7, 2026
7e62118
test: wait for stable daemon runner gate
pirate Jun 7, 2026
bb436c5
ci: pin uv setup version
pirate Jun 7, 2026
0816435
test: make sonic takeover fixtures explicit
pirate Jun 7, 2026
2659f20
fix runner takeover for scoped snapshot workers
pirate Jun 7, 2026
8a56553
ci: verify all published docker tags
pirate Jun 7, 2026
1540e66
test: relax search stream latency ceiling
pirate Jun 7, 2026
101c73f
Add bundled HTTPS ingress compose profiles
pirate Jun 7, 2026
667f165
release: archivebox 0.9.34rc70
pirate Jun 7, 2026
4fa90e4
release: archivebox 0.9.34rc71
pirate Jun 8, 2026
c57f0f2
Fix Docker chrome shim path
pirate Jun 8, 2026
608b934
Restore Docker chromium shim path
pirate Jun 8, 2026
4fe6780
Fix CI expectations for fast crawl and takeover logs
pirate Jun 8, 2026
1bb51ca
Repair abxbus cache ownership at Docker startup
pirate Jun 8, 2026
6b03472
Update WorkingDirectory and ExecStart in service file
pirate Jun 8, 2026
24677ae
Delete .claude directory
pirate Jun 8, 2026
2b71254
Restore Docker lib root under opt archivebox
pirate Jun 8, 2026
1eadff0
root dir cleanup
pirate Jun 8, 2026
4b0a143
Change copyright year to 2026
pirate Jun 8, 2026
7bf9bac
Delete CLAUDE.md
pirate Jun 8, 2026
a99f3bf
bump version
pirate Jun 8, 2026
0642d08
release: archivebox 0.9.35rc2
pirate Jun 8, 2026
3024793
Fix CI test expectations for current runtime
pirate Jun 8, 2026
cc701b6
release: archivebox 0.9.35rc3
pirate Jun 8, 2026
d1975c9
Fix paused crawl integration expectation
pirate Jun 8, 2026
c7ad2fe
Fix takeover log assertion
pirate Jun 8, 2026
c48531b
release: archivebox 0.9.35rc4
pirate Jun 8, 2026
5adec53
Expand add flow runtime handling
pirate Jun 9, 2026
f59af9c
Cover malicious feed add input
pirate Jun 9, 2026
a1449c2
Use plugin URL patterns for source imports
pirate Jun 9, 2026
b8c3ce3
Exercise import parsing through live add APIs
pirate Jun 9, 2026
2bfb3ad
release: archivebox 0.9.35rc5
pirate Jun 10, 2026
a5e419c
release: archivebox 0.9.35rc6
pirate Jun 10, 2026
98aecfa
release: archivebox 0.9.35rc8
pirate Jun 10, 2026
3328d9f
release: archivebox 0.9.35rc9
pirate Jun 10, 2026
9c29724
Allow selecting Docker release registries
pirate Jun 10, 2026
e38719f
Rename image build UID defaults
pirate Jun 10, 2026
1118d5f
release: archivebox 0.9.35rc10
pirate Jun 10, 2026
2127d07
Inherit crawl tags on snapshot creation
pirate Jun 10, 2026
77964bc
release: archivebox 0.9.35rc11
pirate Jun 10, 2026
303e992
Simplify Docker data ownership detection
pirate Jun 10, 2026
d3e3a5f
release: archivebox 0.9.35rc12
pirate Jun 10, 2026
7e54e3f
Stabilize API add import tag assertion
pirate Jun 10, 2026
1fd8c8b
release: archivebox 0.9.35rc13
pirate Jun 10, 2026
9149ee1
Reopen crawls before queuing discovered snapshots
pirate Jun 10, 2026
f468b5b
release: archivebox 0.9.35rc14
pirate Jun 10, 2026
2081fe5
Clarify Docker data directory ownership
pirate Jun 10, 2026
9b6b95c
release: archivebox 0.9.35rc15
pirate Jun 10, 2026
2f65977
Tighten tag JSONL export assertions
pirate Jun 10, 2026
92 changes: 87 additions & 5 deletions .dockerignore
@@ -1,6 +1,88 @@
output
__pycache__
.DS_Store
venv
.venv
data
._*
/.heartbeat.json
/package.json
/package-lock.json
__pycache__
**/__pycache__
*.pyc
*.pyo
*.py[cod]
*$py.class
.mypy_cache/
.pytest_cache/
.ruff_cache/
.uv-cache/
.github/
.pdm-build/
.pdm-python
.eggs/
.git/
!.git/
.git/*
.vscode/
!.git/HEAD
!.git/packed-refs
!.git/refs/
!.git/refs/heads/
!.git/refs/heads/**

venv/
.venv/
.venv-old/
.docker_venv/
.docker-venv/
node_modules/
abx-dl/
abxpkg/
abx-plugins/
abxbus/
chrome/
chromeprofile/
chrome_profile/
lib/
out/
users/
archive/
crawls/
snapshots/
logs/
archivebox-docker-smoke*/
archivebox-compose-smoke*/
docker-test/
docker-test*/
core
*.core

pdm.dev.lock
pdm.lock

docs/
build/
dist/
brew_dist/
deb_dist/
pip_dist/
assets/
docker/
website/
typings/

tmp/
.tmp/
data/
data*/
-
personas/
sources/
output/
index.sqlite3
index.sqlite3-wal
queue.sqlite3
*.sqlite*
data.*
.archivebox_id
ArchiveBox.conf
*.stdout
*.stderr
*.log
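
The `.git/` rules above rely on the `.dockerignore` convention that the last matching line decides whether a path is excluded, with `!` marking an exception. The following is a minimal Python sketch of that last-match-wins evaluation, not Docker's actual matcher: the `is_ignored` helper and `GIT_RULES` list are hypothetical names for illustration, and `fnmatch` is a stand-in whose `*` crosses directory separators, which Docker's own pattern matching does not do.

```python
from fnmatch import fnmatchcase

# Only the .git-related rules from the .dockerignore above, in file order.
GIT_RULES = [
    ".git/",
    "!.git/",
    ".git/*",
    "!.git/HEAD",
    "!.git/packed-refs",
    "!.git/refs/",
    "!.git/refs/heads/",
    "!.git/refs/heads/**",
]


def is_ignored(path: str, rules: list[str]) -> bool:
    """Return True if `path` ends up excluded; the last matching rule wins."""
    ignored = False
    for raw in rules:
        rule = raw.strip()
        if not rule or rule.startswith("#"):
            continue  # skip blank lines and comments
        negated = rule.startswith("!")
        if negated:
            rule = rule[1:]
        rule = rule.strip("/")
        # Simplification: a rule matches the path itself or anything below it.
        if fnmatchcase(path, rule) or fnmatchcase(path, rule + "/*"):
            ignored = not negated
    return ignored


for candidate in (".git/HEAD", ".git/refs/heads/dev", ".git/objects/ab/cdef"):
    print(candidate, "ignored" if is_ignored(candidate, GIT_RULES) else "kept")
```

Under these simplified rules, `.git/HEAD` and everything under `.git/refs/heads/` stay in the build context while the rest of `.git/` (such as the object store) is excluded, which appears to be the intent of the exception lines.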
26 changes: 26 additions & 0 deletions .github/.readthedocs.yaml
@@ -0,0 +1,26 @@
# Read the Docs config for https://docs.archivebox.io
# https://docs.readthedocs.io/en/stable/config-file/v2.html

version: 2

submodules:
  include: all
  recursive: true

build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
    #nodejs: "20" # not needed unless we need the full archivebox to run while building docs for some reason

sphinx:
  configuration: docs/conf.py

formats:
  - pdf
  - epub

# https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
39 changes: 38 additions & 1 deletion .github/CONTRIBUTING.md
@@ -1 +1,38 @@
Make sure to check in with me first, or confirm that your desired features line up with our roadmap: https://github.com/pirate/ArchiveBox#roadmap
# Contribution Process

1. Confirm that your desired features fit into our broader project goals (see the [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap)).
2. Open an issue describing your planned implementation so we can discuss it.
3. Check in with me before starting development to make sure your work won't conflict with or duplicate existing work.
4. Set up your dev environment, make some changes, and test using the test input files.
5. Commit, push, submit a PR, and wait for review feedback.
6. Have patience, don't abandon your PR! We love contributors, but we all have day jobs and don't always have time to respond to notifications instantly. If you want a faster response, ping @theSquashSH on Twitter or Patreon.

**Useful links:**

- https://github.com/ArchiveBox/ArchiveBox/issues
- https://github.com/ArchiveBox/ArchiveBox/pulls
- https://github.com/ArchiveBox/ArchiveBox/wiki/Roadmap
- https://github.com/ArchiveBox/ArchiveBox/wiki/Install#manual-setup

### Development Setup

```bash
git clone https://github.com/ArchiveBox/ArchiveBox
cd ArchiveBox
# Ideally do this in a virtualenv
pip install -e '.[dev]' # or use: pipenv install --dev
```

### Running Tests

```bash
./bin/lint.sh
./bin/test.sh
./bin/build.sh
```

For more common tasks see the `Development` section at the bottom of the README.

### Getting Help

Open an issue on GitHub or message me at https://sweeting.me/#contact.
2 changes: 2 additions & 0 deletions .github/FUNDING.yml
@@ -0,0 +1,2 @@
github: ["ArchiveBox", "pirate"]
custom: ["https://donate.archivebox.io", "https://swag.archivebox.io"]
198 changes: 198 additions & 0 deletions .github/ISSUE_TEMPLATE/1-bug_report.yml
@@ -0,0 +1,198 @@
name: 🐞 Bug report
description: Report a bug or error you encountered in ArchiveBox
title: "Bug: ..."
assignees:
  - pirate
type: 'Bug'
body:
  - type: markdown
    attributes:
      value: |
        *Please note:* it is normal to see errors occasionally for some extractors on some URLs (not every extractor will work on every type of page).
        Please report archiving errors if you are seeing them *consistently across many URLs* or if they are *preventing you from using ArchiveBox*.

  - type: textarea
    id: description
    attributes:
      label: Provide a screenshot and describe the bug
      description: |
        Attach a screenshot and describe what the issue is, what you expected to happen, and if relevant, the *URLs you were trying to archive*.
      placeholder: |
        Got a bunch of 'singlefile was unable to archive this page' errors when trying to archive URLs from this site: https://example.com/xyz ...
        I also tried to archive the same URLs using `singlefile` directly and some of them worked but not all of them. etc. ...
    validations:
      required: true

  - type: textarea
    id: steps_to_reproduce
    attributes:
      label: Steps to reproduce
      description: Please provide the exact steps you took to trigger the issue (including any shell commands run, URLs visited, buttons clicked, etc.).
      render: markdown
      placeholder: |
        1. Started ArchiveBox by running: `docker run -v $PWD:/data -p 8000:8000 archivebox/archivebox` in iTerm2
        2. Went to the https://127.0.0.1:8000/add/ page in Google Chrome
        3. Typed 'https://example.com/xyz' into the 'Add URL' input field
        4. Clicked the 'Add+' button
        5. Got a 500 error and saw the errors below in terminal
    validations:
      required: true

  - type: textarea
    id: logs
    attributes:
      label: Logs or errors
      description: "Paste any terminal output, logs, or errors (check `data/logs/errors.log` as well)."
      placeholder: |
        ╭─────────────────────────────────────────────────────────────────────────────────────────────────────────╮
        │ [2024-11-02 19:54:28] ArchiveBox v0.8.6rc0: archivebox add https://example.com#1234567 │
        ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────╯

        [+] [2024-11-02 19:54:29] Adding 1 links to index (crawl depth=0)...
        > Saved verbatim input to sources/1730577269-import.txt
        > Parsed 1 URLs from input (Generic TXT)
        ...
      render: shell
    validations:
      required: false

  - type: textarea
    id: version
    attributes:
      label: ArchiveBox Version
      description: |
        **REQUIRED:** Run the `archivebox version` command inside your collection dir and paste the *full output* here (*not just the version number*).
        For Docker Compose run: `docker compose run archivebox version`
        For plain Docker run: `docker run -v $PWD:/data archivebox/archivebox version`
      render: shell
      placeholder: |
        0.8.6
        ArchiveBox v0.8.6rc0 COMMIT_HASH=721427a BUILD_TIME=2024-10-21 12:57:02 1729515422
        IN_DOCKER=False IN_QEMU=False ARCH=arm64 OS=Darwin PLATFORM=macOS-15.1-arm64-arm-64bit PYTHON=Cpython (venv)
        EUID=502:20 UID=502:20 FS_UID=502:20 FS_PERMS=644 FS_ATOMIC=True FS_REMOTE=False
        DEBUG=False IS_TTY=True SUDO=False ID=dfa11485:aa78ad45 SEARCH_BACKEND=ripgrep LDAP=False

        Binary Dependencies:
        √ python 3.14.0 venv_pip ~/.venv/bin/python
        √ django 6.0 venv_pip ~/.venv/lib/python3.14/site-packages/django/__init__.py
        √ sqlite 2.6.0 venv_pip ~/.venv/lib/python3.14/site-packages/django/db/backends/sqlite3/base.py
        √ pip 24.3.1 venv_pip ~/.venv/bin/pip
        ...
    validations:
      required: true

  - type: dropdown
    id: install_method
    validations:
      required: true
    attributes:
      label: How did you install the version of ArchiveBox you are using?
      multiple: false
      options:
        - pip
        - apt
        - brew
        - nix
        - Docker (or Podman/LXC/K8s/TrueNAS/Proxmox/etc)
        - Other

  - type: dropdown
    id: operating_system
    validations:
      required: true
    attributes:
      label: What operating system are you running on?
      description: |
        Please note we are *unable to provide support for Windows users* unless you are using [Docker on Windows](https://github.com/ArchiveBox/archivebox#:~:text=windows%20without%20docker).
      multiple: false
      options:
        - Linux (Ubuntu/Debian/Arch/Alpine/etc.)
        - macOS (including Docker on macOS)
        - BSD (FreeBSD/OpenBSD/NetBSD/etc.)
        - Windows (including WSL, WSL2, Docker Desktop on Windows)
        - Other

  - type: checkboxes
    id: filesystem
    attributes:
      label: What type of drive are you using to store your ArchiveBox data?
      description: Are you using a [remote filesystem](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-Up-Storage#supported-remote-filesystems) or FUSE mount for `data/` OR `data/archive`?
      options:
        - label: "some of `data/` is on a local SSD or NVMe drive"
          required: false
        - label: "some of `data/` is on a spinning hard drive or external USB drive"
          required: false
        - label: "some of `data/` is on a network mount (e.g. NFS/SMB/Ceph/GlusterFS/etc.)"
          required: false
        - label: "some of `data/` is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/Google Drive/Dropbox/etc.)"
          required: false


  - type: textarea
    id: docker_compose_yml
    attributes:
      label: Docker Compose Configuration
      description: "If using Docker Compose, please share your full `docker-compose.yml` file. If using plain Docker, paste the `docker run ...` command you use."
      placeholder: |
        services:
          archivebox:
            image: archivebox/archivebox:latest
            ports:
              - 8000:8000
            volumes:
              - ./data:/data
            environment:
              - ADMIN_USERNAME=admin
              - ADMIN_PASSWORD=****<redact any passwords>****
              - ALLOWED_HOSTS=*
              - CSRF_TRUSTED_ORIGINS=https://archivebox.example.com
              - PUBLIC_INDEX=True
              - PUBLIC_SNAPSHOTS=True
              - PUBLIC_ADD_VIEW=False
              ...

          archivebox_scheduler:
            image: archivebox/archivebox:latest
            command: schedule --foreground --update --every=day
            environment:
              ...

          ...
      render: shell
    validations:
      required: false

  - type: textarea
    id: configuration
    attributes:
      label: ArchiveBox Configuration
      description: "Please share your full `data/ArchiveBox.conf` file here."
      render: shell
      placeholder: |
        [SERVER_CONFIG]
        SECRET_KEY = "*********<redact any secrets/passwords>************"

        WGET_RESTRICT_FILE_NAMES=windows
        USE_SYSTEM_WGET=true
        CHECK_SSL_VALIDITY=false
        ...
    validations:
      required: false


  - type: markdown
    attributes:
      value: |
        ---

        We strive to answer issues as quickly as possible; it usually takes us *about a week* to respond.
        Make sure your `data/` is [**fully backed up**](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#disk-layout) before trying anything suggested here, **we are not responsible for data loss**.

        In the meantime please consider:

        - 💰 [Donating to support ArchiveBox open-source](https://github.com/ArchiveBox/ArchiveBox/wiki/Donations)
        - 👨‍✈️ [Hiring us for corporate deployments](https://docs.monadical.com/s/archivebox-consulting-services) with professional support, custom feature development, and help with CAPTCHAs/rate-limits
        - 🔍 [Searching the Documentation](https://docs.archivebox.io/) for answers to common questions
        - 📚 Reading the [Troubleshooting Guide](https://github.com/ArchiveBox/ArchiveBox/wiki)
        - ✨ Testing out a newer [`BETA` release](https://github.com/ArchiveBox/ArchiveBox/releases) (issues are often already fixed in our latest `BETA` releases)
