Port: LitFinder v1.4.6 (bug fixes and features, no plugin system) #2
Merged
- Add info hash validation with magnet link fallback in scraper
- Extend exact-phrase fallback to manual queries, not just auto-generated ones
- Default language to 'en' when ABB listing has no language field, preventing valid results from being hidden by the language filter
…k label, notification proxy
- calibrain#999: strip query string/fragment from mirror URLs in normalize_http_url so appending search paths doesn't produce malformed URLs
- calibrain#1025: add RTORRENT_AUDIOBOOK_LABEL setting; falls back to book label if unset
- calibrain#956: inject configured proxy env vars before Apprise dispatch so Telegram/Discord notifications respect the app proxy setting
Cherry-picked from NemesisHubris/litfinder@8e854f9 with the Apprise app_id, description, and logo-URL strings reverted from "LitFinder" back to "Shelfmark".
Co-Authored-By: NemesisHubris <155838970+NemesisHubris@users.noreply.github.com>
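The calibrain#999 fix can be illustrated with the standard library. This is a minimal sketch, not the project's actual normalize_http_url: the trailing-slash handling and the example host are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_http_url(url: str) -> str:
    """Sketch: drop the query string and fragment from a mirror URL."""
    parts = urlsplit(url.strip())
    # Rebuild the URL from scheme, host, and path only.
    cleaned = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Assumption: also drop a trailing slash so "/search" can be appended cleanly.
    return cleaned.rstrip("/")


base = normalize_http_url("https://mirror.example/?ref=home#top")
search_url = base + "/search?q=dune"  # "https://mirror.example/search?q=dune"
```

Without the stripping step, appending the search path to the raw mirror URL would give `https://mirror.example/?ref=home#top/search?q=dune`, where the search path ends up inside the fragment.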
Cover the specific scenarios that were broken:
- normalize_http_url: 5 cases for query/fragment stripping (calibrain#999)
- RTorrentClient: audiobook label selection and book-label fallback (calibrain#1025)
- TransmissionClient: stopped status treated as complete (seeding ratio fix)
- _apprise_proxy_env: http/socks5 proxy injection and env-var preservation (calibrain#956)
These tests would catch any future regression to the exact failure modes that motivated each fix.
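One way to get both proxy injection and env-var preservation is a context manager that sets the variables and restores the previous values afterwards. The sketch below is an assumption about the general technique; the name `temporary_proxy_env` and the choice of variables are hypothetical and are not taken from _apprise_proxy_env.

```python
import os
from contextlib import contextmanager

# Assumption: these are the variables the HTTP client honours.
_PROXY_KEYS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


@contextmanager
def temporary_proxy_env(proxy_url):
    """Set proxy env vars for the duration of the block, then restore them."""
    saved = {key: os.environ.get(key) for key in _PROXY_KEYS}
    try:
        if proxy_url:
            for key in _PROXY_KEYS:
                os.environ[key] = proxy_url
        yield
    finally:
        # Put back whatever was there before, including "not set".
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
```

A notification dispatch would then run inside `with temporary_proxy_env("socks5://127.0.0.1:9050"): ...`, leaving the process environment unchanged once the block exits.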
All instances of `except X, Y:` (Python 2 syntax) replaced with `except (X, Y):` (Python 3). These were SyntaxErrors that would prevent the affected modules from importing at runtime.
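For reference, the parenthesized form looks like this. The function is a hypothetical example, not code from the affected modules; the tuple form is valid on every Python 3 release.

```python
def parse_port(raw):
    """Return the port as an int, or None when the value is unusable."""
    try:
        return int(raw)
    except (TypeError, ValueError):  # tuple form: catches either exception
        return None
```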
…fails
validate_destination() now tracks whether it created the directory itself. If the write probe then fails, the empty directory is removed so nothing is orphaned on disk.
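The create-then-clean-up behaviour can be sketched as follows. This is an illustrative version under stated assumptions, not the project's validate_destination(): the injectable `probe` parameter and the `_write_probe` helper are hypothetical and exist only to make the failure path easy to exercise.

```python
import contextlib
import tempfile
from pathlib import Path


def _write_probe(directory: Path) -> None:
    """Create and delete a temporary file to confirm the directory is writable."""
    with tempfile.NamedTemporaryFile(dir=directory):
        pass


def validate_destination(path: Path, probe=_write_probe) -> bool:
    created_here = False
    try:
        if not path.exists():
            path.mkdir(parents=True)
            created_here = True  # remember that this call made the directory
        probe(path)
        return True
    except OSError:
        if created_here:
            # Remove only the directory this call created. rmdir() refuses to
            # delete non-empty directories, and parents are left in place.
            with contextlib.suppress(OSError):
                path.rmdir()
        return False
```

Tracking `created_here` keeps the cleanup from ever touching a directory that already existed before the call.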
handleCancel only called fetchStatus(), leaving a gap where the item had left the live queue but not yet appeared in the DB snapshot. Both are now refreshed in parallel so cancelled items stay visible.
…t infinite loop
_extract_slow_download_url was calling _get_download_url recursively after each countdown wait with no depth limit. Pages that repeatedly return a countdown (common through Tor) caused an infinite loop. Retries are now bounded by _AA_COUNTDOWN_MAX_RETRIES (3).
Also fixed tor.sh rotation_monitor sending pkill -HUP twice in one cycle when DNS resolution failed.
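A bounded retry loop of this kind might look like the sketch below. Only the constant name and its value come from the commit message; `resolve_download_url`, the `fetch_page` callable, and its `(url, countdown_seconds)` return shape are assumptions for illustration, as is counting the first fetch separately from the three retries.

```python
import time

_AA_COUNTDOWN_MAX_RETRIES = 3


def resolve_download_url(fetch_page, max_retries=_AA_COUNTDOWN_MAX_RETRIES, sleep=time.sleep):
    """Fetch until a URL appears, giving up after max_retries countdown waits.

    fetch_page() is assumed to return (url, countdown_seconds), with url set
    to None while the page is still showing a countdown.
    """
    for attempt in range(max_retries + 1):  # one initial try plus the retries
        url, countdown = fetch_page()
        if url:
            return url
        if attempt < max_retries:
            sleep(countdown)  # wait out the countdown before trying again
    return None  # still a countdown after every retry: stop instead of looping
```

Replacing the recursion with a loop also makes the limit visible in one place, and the injectable `sleep` keeps tests from actually waiting.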
When DIRECT_DOWNLOAD_LANGUAGE_FROM_PATH is enabled, parse the file path column in AA search results (e.g. lgli/N:\...\[BD FR] Book.cbz) to infer language when AA's own metadata is missing or unknown.
Detection priority:
1. Explicit bracket tags: [FR], [BD FR], [En]
2. Keyed markers: "BD FR", "language: fr"
3. Full language names: "french", "deutsch"
4. Loose 2-3 char codes (ambiguous ones like "en"/"de" require bracket or key context to avoid false positives)
When enabled with a language filter, the server-side &lang= parameter is suppressed and filtering is done locally so lgli files without AA language metadata are not excluded before the path can be inspected.
Also relaxes _parse_search_result_row to only require title + format, allowing sparse lgli rows (missing author/publisher/year) to pass through.
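The four-step priority order can be sketched roughly as below. The code tables, regexes, and the function name `infer_language_from_path` are all assumptions made for illustration; the real implementation presumably covers many more languages and markers.

```python
import re
from typing import Optional

# Assumption: a tiny illustrative code table, not the project's real one.
KNOWN_CODES = {"en", "fr", "de", "es"}
AMBIGUOUS_CODES = {"en", "de", "es"}  # also ordinary words in some language
LANGUAGE_NAMES = {"english": "en", "french": "fr", "deutsch": "de", "german": "de"}

BRACKET_RE = re.compile(r"\[([^\]]+)\]")
KEYED_RE = re.compile(r"\b(?:bd\s+|lang(?:uage)?\s*[:=]\s*)([a-z]{2,3})\b", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z]+", re.IGNORECASE)


def infer_language_from_path(path: str) -> Optional[str]:
    # 1. Explicit bracket tags such as [FR], [BD FR], [En].
    for content in BRACKET_RE.findall(path):
        for token in WORD_RE.findall(content):
            if token.lower() in KNOWN_CODES:
                return token.lower()
    # 2. Keyed markers such as "BD FR" or "language: fr".
    for match in KEYED_RE.finditer(path):
        code = match.group(1).lower()
        if code in KNOWN_CODES:
            return code
    tokens = [token.lower() for token in WORD_RE.findall(path)]
    # 3. Full language names.
    for token in tokens:
        if token in LANGUAGE_NAMES:
            return LANGUAGE_NAMES[token]
    # 4. Loose codes, skipping ambiguous ones that lack bracket or key context.
    for token in tokens:
        if token in KNOWN_CODES and token not in AMBIGUOUS_CODES:
            return token
    return None
```

With this ordering, a path like `lgli/Le Monde de Sophie.epub` yields no language: the bare "de" is ignored because it has neither bracket nor key context.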
Propagates display_name alongside username through the full stack so the activity sidebar shows friendly names instead of raw usernames. Backend enriches queue status, download history, and request snapshots. Frontend uses display_name in meta lines, the user filter dropdown, and filter tooltips, falling back to username when display_name is absent. Also adds relative timestamps (e.g. "2h ago") beneath each activity card's meta line.
Adds a new output mode that skips post-download file moving entirely — the downloaded file stays wherever it landed in the temp directory. Useful for testing and for setups where an external process handles file placement.
…arison
Ported from InfiniteAvenger/shelfmark fork. Provides shared token normalization, stopword filtering, fuzzy title matching with configurable threshold, author surname extraction, and ISBN normalization.
…S delivery
When a torrent drops a flat mix of chapter M4Bs (Book 4 as 33 files) and standalone M4Bs (other books) into one directory, ABS treats every file as a separate library item. This detects when files share a common chapter-number prefix, groups them per-book, and routes each group to its own subfolder so ABS sees exactly one folder per book.
Falls back to existing part-numbering logic when no chapter-group pattern is detected (all-standalone or untitled files), preserving prior behavior.
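A rough sketch of the grouping idea follows. The filename convention is an assumption (a shared title followed by a trailing chapter number, e.g. "Book Four - 01.m4b"), and `group_chapter_files` is a hypothetical name; the real detection logic may key on a different pattern.

```python
import re
from collections import defaultdict

# Assumption: chapter files look like "<title><separator><number>".
CHAPTER_RE = re.compile(
    r"^(?P<title>.+?)[\s._-]+(?:ch(?:apter)?[\s._-]*)?(?P<num>\d{1,3})$",
    re.IGNORECASE,
)


def group_chapter_files(filenames):
    """Split filenames into per-book chapter groups and standalone files."""
    by_title = defaultdict(list)
    standalone = []
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        match = CHAPTER_RE.match(stem)
        if match:
            by_title[match.group("title").strip()].append((int(match.group("num")), name))
        else:
            standalone.append(name)
    groups = {}
    for title, items in by_title.items():
        if len(items) >= 2:
            # Several files share the title: treat them as chapters of one book.
            groups[title] = [name for _, name in sorted(items)]
        else:
            # A lone numbered file (e.g. "Catch 22") is a standalone book.
            standalone.extend(name for _, name in items)
    return groups, standalone
```

Each key in `groups` would map to one subfolder, while `standalone` files keep the existing handling.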
…atching
Books with subtitles, series info, or marketing fluff in the title (e.g. "Dune (Dune Chronicles, #1)", "Project Hail Mary: A Novel") were sending the full string to every source. IRC does near-exact filename matching and returned nothing. ABB similarly failed.
Add generate_title_search_variants() in text_match.py that strips common suffix patterns (parenthetical series, volume suffixes, genre descriptors, long colon subtitles, em-dash subtitles) to produce a clean short form. The search plan now generates [short_title, full_title] as variants so sources try the clean title first with the full title as fallback.
HTTP sources (ABB, Prowlarr) try all variants with stop-on-first or merge-all at no meaningful extra cost. IRC tries at most 2 sequential queries.
Add patterns for:
- (N of M) part indicators, mid-string (e.g. 'Title (1 of 2) Series 2')
- Bare 'Book N' suffix without separator ('The Hunter's Code Book 9')
- Bare '#N' suffix without parens ('Law of the Jungle #13')
- Square bracket content anywhere ('[Dramatized Adaptation]')
Reorder patterns so long colon subtitles are stripped before volume
suffixes, fixing cases like 'Shadows of Sparta: A Greek Mythology
Romantasy The Spartan Flame Trilogy, Book 1' → 'Shadows of Sparta'.
Switch to multi-pass stripping so compound suffixes are fully removed.
… volume suffixes
Add biography/autobiography/collection/anthology to genre subtitle pattern. Add edition subtitle pattern for '25th Anniversary Edition', 'Revised Edition' etc. Fix volume suffix pattern to handle 'Part 2 of 3' (was stopping at first token).
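The ordered, multi-pass stripping described in these title-matching commits can be sketched as below. The patterns are simplified assumptions and `title_search_variants` is a hypothetical stand-in for generate_title_search_variants(); the point is the structure: long colon subtitles are stripped before volume suffixes, and passes repeat until the title stops changing.

```python
import re

# Assumption: a reduced pattern list, applied in this order on every pass.
_SUFFIX_PATTERNS = [
    re.compile(r"\s*\[[^\]]*\]"),  # square-bracket content anywhere
    re.compile(r"\s*\([^)]*\)"),  # parenthetical series info, (N of M)
    re.compile(r"\s*[:\u2014]\s*.{12,}$"),  # long colon or em-dash subtitle
    re.compile(r"\s*:\s*an?\s+(?:novel|memoir|biography)$", re.IGNORECASE),
    re.compile(r"[\s,]+(?:book|vol(?:ume)?|part)\s+\d+(?:\s+of\s+\d+)?$", re.IGNORECASE),
    re.compile(r"\s+#\d+$"),  # bare '#N' suffix
]


def strip_title_suffixes(title: str) -> str:
    current = title.strip()
    while True:
        previous = current
        for pattern in _SUFFIX_PATTERNS:
            stripped = pattern.sub("", current).strip()
            if stripped:  # never strip a title down to nothing
                current = stripped
        if current == previous:  # a full pass changed nothing: done
            return current


def title_search_variants(title: str) -> list[str]:
    """Return [short_title, full_title], or just the title if nothing was stripped."""
    full = title.strip()
    short = strip_title_suffixes(full)
    return [short] if short == full else [short, full]
```

The second pass matters for compound suffixes: in "Some Story Book 2 #5" the "#5" is removed on the first pass, which exposes "Book 2" for removal on the next one.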
…age titles
AA changed their HTML structure: related-edition titles are now nested as child spans inside the main title span. Using get_text() on the outer span concatenated everything into a single garbled string. Fix extracts only the direct NavigableString children. Adds a filter to skip lgli catalog-format descriptor entries like "Book/Online Audio".
These were pulled in as duplicates of the real source files when cherry-picking the display-name and noop-output features. Without the custom-source plugin system (litfinder's bucket 6), the patches/ tree is unreferenced by any code path and just clutters the repo.
Cherry-picked code from NemesisHubris/litfinder used a slightly different formatting style; this re-applies the project's oxfmt rules.
The fix(abb) port (f189a5f) added SHA-1/SHA-256 hex validation on info hashes. The existing test fixtures used 'ABC123DEF456GHI789...' and 'ABC 123 DEF 456', neither of which is valid hex — pre-fix they passed trivially, post-fix they (correctly) fail validation. Replaced with valid 40-char hex strings; assertions still match on the partial 'ABC123DEF456' prefix so the cleanup-behavior assertion is preserved.
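A hex check of this kind is small enough to sketch in full. `is_valid_info_hash` is a hypothetical name rather than the function added by the port; the lengths follow from the digests themselves (SHA-1 is 40 hex characters, SHA-256 is 64).

```python
import re

# 40 hex chars for a SHA-1 digest, 64 for SHA-256.
_INFO_HASH_RE = re.compile(r"(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})")


def is_valid_info_hash(value: str) -> bool:
    """Return True only for a full-length hexadecimal info hash."""
    return bool(_INFO_HASH_RE.fullmatch(value.strip()))
```

This is why the old fixtures had to change: 'ABC123DEF456GHI789...' contains G, H, and I, which are not hex digits, and 'ABC 123 DEF 456' contains spaces and is far too short.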
Cherry-picked code from NemesisHubris/litfinder used slightly different ruff settings; this aligns with this repo's stricter config.
Auto-fixed (ruff --fix): import sorting in pipeline.py and transfer.py, quote style in audiobookbay/scraper.py regex, unused pytest imports in three test files.
Manual fixes:
- BLE001: narrow blind `except Exception:` to (AttributeError, KeyError, TypeError) in activity_routes.py and main.py display-name lookups
- SIM105: replace try/except/pass with contextlib.suppress(OSError) in destination.py write-probe cleanup
- TC003: move `from pathlib import Path` into TYPE_CHECKING block in postprocess/group.py (only used as annotation, file has `from __future__ import annotations`)
- RET504: inline final return in direct_download._normalize_candidate
- S105: noqa on `token == "all"` (language sentinel, not a credential)
- C405: list literal → set literal in test_text_match.py
- C408: dict() call → dict literal in test_postprocess_group_integration.py
- RUF059: prefix unused `final_paths` with `_` in one integration test
All 1942 backend tests still pass.
LitFinder uses slightly different formatting (line length, wrap style) than this repo. The 31 files that drifted are all touched by the port commits; `make python-format` was failing on them. Running `ruff format` brings everything back to this repo's existing config. No behavior changes; all 1942 backend tests still pass.
Port: LitFinder v1.4.6 bug fixes and features (excluding rebrand and plugin system)
Ports 22 commits from NemesisHubris/litfinder's v1.4.6 release (AGPL-3.0), preserving original authorship via `git cherry-pick`. LitFinder is a community fork of calibrain/shelfmark; this PR pulls in the substantive fixes and features while leaving the LitFinder rebranding and the runtime custom-source plugin system out.
Upstream issue fixes
The following commits address open issues on the upstream calibrain/shelfmark tracker:
- … `RTORRENT_AUDIOBOOK_LABEL`), falls back to book label if unset
Additional fixes
- … `en` when ABB listing has none, preventing valid results from being hidden
- … `lgli` catalog descriptor entries (e.g. "Book/Online Audio") that were polluting results
- … `except X, Y:` → `except (X, Y):` SyntaxErrors that prevented affected modules from importing at runtime
Deliberately not ported
- … `064f20c` and follow-ups): a significant feature in its own right; deserves a focused PR and review of its own rather than landing as part of a port pass.
- `cbcd9ad` (book-languages.json gitignore restore): addresses a gitignore/build interaction specific to LitFinder's repo layout; the file isn't dropped in this tree.
- `393f9a4` (destination.py Py2 syntax): the `except` in question lives in a LitFinder-specific code path that doesn't exist here; the broader `f3689df` sweep already covers the cases that do exist in `shelfmark`.
- `20f6532` (transmission stopped-state fix): the same root issue was independently resolved by "fix(transmission): handle completion when seed ratio is 0" (calibrain/shelfmark#1023), which is already in `main`.
New features
- … `Dune (Dune Chronicles, #1)`, `Project Hail Mary: A Novel`), a clean short-form query is generated first and falls back to the full title, significantly improving hit rates on IRC, ABB, Prowlarr, and Anna's Archive
Verification
- … `seleniumbase`-dependent test in local venv, unrelated to this port)
- … `patches/` files (LitFinder's internal mirror-copy convention, unused without the plugin system), reformatted one file with project's oxfmt rules, and updated `test_scraper.py` info-hash fixtures to valid hex (the new ABB validation correctly rejects the pre-existing non-hex placeholder strings).
Credit
Per-commit authorship preserved by cherry-pick; the `fix: three upstream bugs` commit ports the logic but reverts Apprise app-id/logo strings from "LitFinder" back to "Shelfmark" (noted in the commit body, with `Co-Authored-By` trailer). All other ported commits are unmodified from upstream. Thanks to @NemesisHubris for the work.