chore(main): release 0.16.0 by jack-arturo · Pull Request #154 · verygoodplugins/automem

jack-arturo · 2026-04-24T15:30:49Z

🤖 I have created a release beep boop

0.16.0 (2026-06-17)

Features

api: add admin backup endpoint (#162) (8b1f264)
api: support bulk memory associations (1221e36)
api: support bulk memory associations (#198) (28eb916)
benchmarks: LongMemEval failure-mode diagnosis harness + judge quota preflight (#183) (f99bece)
consolidation: expose cluster threshold and min size as env vars (#163) (7e731f3)
enrichment: expose classification fallback-rate metrics in /enrichment/status (#188) (0b522a9)
entity: harden identity cleanup and repair tooling (#176) (827dfbc)
eval: recall-quality optimization harness — lab foundation + design (#197) (431433e)
graph: support unbounded visualizer snapshots (#141) (c730128)
lab: add aged labelled distractor injection (cc5d546)
lab: add config_complexity simplicity metric (dfb10d9)
lab: add distractor_rate_at_k precision guardrail metric (872eab2)
lab: add lab_corpus with parameterized recall (5e1e071)
lab: add pick_winner scorecard decision rule (3187eac)
lab: add real consolidation pass helper (48a7d4a)
lab: isolate production clone restores (#171) (aef90c0)
lab: wire scorecard, distractors, recall params, consolidation into runner (589ec30)
recall: add metadata sidecar search (#177) (4e7956e)
recall: add state_mode=current|history recall alias (#173) (b1df86c)
recall: cap tag-score denominator to fix query-length bias (#193) (cefa516)
recall: date-aware ranking + latest-fact selection (#158, #159) (#187) (a6ed945)
recall: make recency decay window and curve configurable (#182) (dbb933f)
recall: ranking release — recency config, tag-score cap, relevance gate, date-aware ranking (#182, #193, #186, #187, #183, #184, #188) (#194) (337fe98)
scripts: safer reclassify_with_llm.py with provider flags + tighter prompt (#164) (a742602)

Bug Fixes

api: address copilot review on PR #198 (0466a1e)
api: handle grouped association write failures (cd93df9)
backup: make backup_automem.py runnable as python scripts/backup_automem.py (#175) (edd9742)
benchmarks: add publication verification bundle (#166) (420d721)
consolidation: skip eager first tick at startup to avoid FalkorDB load race (#165) (1b812cf)
docs: keep dispatch payload arrays stable (df6e9e8)
embedding: fall back to per-item real embeddings before placeholders in batch path (#189) (6e9c62c)
entity: restore person-shape exemption on the slug validation path (#179) (5e29960)
entity: stop validator over-rejecting real people, code tools, and event categories (#178) (193b730)
lab: address copilot review on PR #197 (45f80d6)
lab: align scorecard key contract (build_scorecard -> pick_winner) (7d91530)
mcp-sse: decouple /health liveness from upstream readiness (#151) (5bcfb8b)
mcp: cap association failure summary (ea4e08f)
mcp: surface stored metadata and updated_at in detailed recall format (#184) (230416e)
recall: address copilot review on PR #194 (50b1647)
recall: gate query-independent scoring on topical evidence within tag scope (#130) (#186) (c11b594)
recall: hydrate semantic recall summaries (#192) (76e845d)
recall: normalize graph keyword scores into the 0-1 component range (#191) (3653ddf)
recall: respect current memory state (#170) (ed36b98), closes #169 #158 #159

Documentation

bench: log full judged 500q LongMemEval ship-config run with churn attribution (41bf8d0)
eval: Plan A — lab metric foundation (TDD, 9 tasks) (0087dda)
eval: Plan B — parallel matrix harness (TDD, 9 tasks) (c8ddfb2)
evals: mark Memora/FAMA/WRIT lifecycle diagnostics as diagnostic-only (#174) (e8a3285)
eval: spec for recall-quality optimization harness (b1a1995)
note develop-branch contribution policy in README (ccf02dd)
positioning: add scout reference (#168) (922d23b)
refresh README and benchmark guidance (#157) (bba31cc)
runtime: align Docker viewer paths and setup guidance (#155) (bbda79b)

This PR was generated with Release Please. See documentation.

Copilot

Pull request overview

Updates the project changelog for the 0.15.3 release generated by Release Please.

Changes:

Add a new 0.15.3 section to CHANGELOG.md
Document the included bug fix: mcp-sse health/readiness decoupling (PR #151)

…nce gate, date-aware ranking (#182, #193, #186, #187, #183, #184, #188) (#194)

## Release: ranking & recall series (develop → main)

⚠️ **Merge with a MERGE COMMIT; do not squash.** release-please needs the individual conventional commits below to compute the version and changelog for PR #154.

### What's in this release

| PR | Change | Default behavior |
|---|---|---|
| #182 | `feat(recall)`: configurable recency decay window/curve | unchanged (env-gated) |
| #193 (replaces #185) | `feat(recall)`: tag-score denominator cap fixes query-length bias | unchanged (`SEARCH_TAG_SCORE_TOKEN_CAP=0`) |
| #186 | `fix(recall)`: relevance gate, query-independent scoring gated on topical evidence (#130) | unchanged (gate off) |
| #187 | `feat(recall)`: date-aware ranking, `recency_bias=off\|on\|auto`, latest-fact selection (#158, #159) | `RECALL_RECENCY_BIAS=off`; adds deterministic timestamp tiebreak for near-ties |
| #183 | `feat(benchmarks)`: failure-mode diagnosis harness + judge quota preflight | tooling only |
| #184 | `fix(mcp)`: surface stored metadata + `updated_at` in detailed recall format (#111) | additive |
| #188 | `feat(enrichment)`: classification fallback-rate metrics in `/enrichment/status` | additive |

Plus: CI now runs on `develop` pushes/PRs; benchmark experiment log + README contribution-policy note.

### Verification evidence

- **Unit/lint/npm**: 625 pytest + 16 mcp-sse-server tests green on develop head; CI green.
- **Default-preserve**: recall-lab baseline on the 10k-memory production snapshot: develop defaults vs main pooled baseline give identical aggregates (R@5 0.655 / R@10 0.710 / MRR 0.434 / NDCG@10 0.501). Two-stack probe run (main vs develop, defaults): 11/12 preserve-exact, remaining diffs are near-tie reorders (top-1 score deltas ≤ 5.4e-5, the #187 timestamp tiebreak).
- **Full judged 500q LongMemEval** (ship config: `RECALL_RECENCY_BIAS=auto` + `temporal-answer` harness): recall@5 96.6% (483/500), accuracy 86.0% (430/500), `judge_errors=0`, `memory_ingest_failures=0`.
- **Churn attribution** (targeted re-runs of all 17 churned questions on current-main-at-defaults and develop-at-defaults): 15/17 moved with #191 (already on main), so the April canonical 97.2% floor is stale; current main measures ~97.0%. Develop-at-defaults differs from current main by **1 question in 500** (a near-tie rank-5/6 flip from #187's deterministic tiebreak). Accuracy is within answerer replicate noise (identical-config reference runs flip 28/500 answers).
- Full detail: `benchmarks/EXPERIMENT_LOG.md` (2026-06-11 entry) and `benchmarks/results/lme_churn17_*` + `analyze_churn17.py`.

### Opt-in features shipped OFF

`RECALL_RELEVANCE_GATE` (validated at 0.40 on lab corpus; improves negative-probe precision) and `RECALL_RECENCY_BIAS=auto` (current-state query re-ranking). Neither affects default behavior; see `docs/ENVIRONMENT_VARIABLES.md`.

### After merging

release-please will update PR #154 (v0.16.0); merging *that* cuts the tag and publishes the `:stable` image, which is the actual user-facing deploy event for Railway template users.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
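As a reading aid for the verification figures quoted in the description above, here is a minimal sketch of how hit-style recall@k and MRR are commonly computed, plus a check of the quoted percentages. The function names and toy data are hypothetical and not taken from the automem benchmark harness, and the "any relevant item in the top k counts as a hit" definition is an assumption about how the per-question counts (such as 483/500) were tallied.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Hit-style recall: 1.0 if any relevant id is in the top k, else 0.0 (assumed definition)."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """1/rank of the first relevant id, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean, returning 0.0 for an empty sequence."""
    return sum(values) / len(values) if values else 0.0


# Toy queries (hypothetical ids): ranked results paired with the relevant set.
runs = [
    (["m3", "m1", "m7"], {"m1"}),  # first relevant hit at rank 2
    (["m2", "m9", "m4"], {"m2"}),  # first relevant hit at rank 1
    (["m5", "m6", "m8"], {"m0"}),  # no relevant result retrieved
]

hit_at_2 = mean([recall_at_k(ranked, rel, 2) for ranked, rel in runs])
mrr = mean([reciprocal_rank(ranked, rel) for ranked, rel in runs])

# The percentages quoted in the PR description follow from the raw counts.
reported_recall_at_5 = round(100 * 483 / 500, 1)
reported_accuracy = round(100 * 430 / 500, 1)

print(f"hit@2={hit_at_2:.3f} mrr={mrr:.3f}")
print(f"recall@5={reported_recall_at_5}% accuracy={reported_accuracy}%")
```

On the toy data, two of three queries hit within the top 2 and the reciprocal ranks are 1/2, 1 and 0, so the MRR is 0.5. The last two values reproduce the 96.6% and 86.0% figures from the counts given in the description.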

Copilot AI review requested due to automatic review settings April 24, 2026 15:30

jack-arturo added the autorelease: pending label Apr 24, 2026

Copilot started reviewing on behalf of jack-arturo April 24, 2026 15:31 View session

Copilot AI reviewed Apr 24, 2026

jack-arturo force-pushed the release-please--branches--main branch 6 times, most recently from f71263e to c02c93a Compare May 1, 2026 16:23

jack-arturo changed the title ~~chore(main): release 0.15.3~~ chore(main): release 0.16.0 May 14, 2026

jack-arturo force-pushed the release-please--branches--main branch from c02c93a to d535732 Compare May 14, 2026 02:01

jack-arturo force-pushed the release-please--branches--main branch from d535732 to 1959ac6 Compare May 22, 2026 07:39

jack-arturo force-pushed the release-please--branches--main branch 14 times, most recently from 26053b3 to f1010db Compare June 11, 2026 18:24

jack-arturo mentioned this pull request Jun 12, 2026

feat(recall): ranking release — recency config, tag-score cap, relevance gate, date-aware ranking (#182, #193, #186, #187, #183, #184, #188) #194

Merged

jack-arturo force-pushed the release-please--branches--main branch from f1010db to 349f8c6 Compare June 12, 2026 15:32

jack-arturo force-pushed the release-please--branches--main branch 2 times, most recently from a0643ef to 4186f2a Compare June 17, 2026 03:16

chore(main): release 0.16.0

bf28679

jack-arturo force-pushed the release-please--branches--main branch from 4186f2a to bf28679 Compare June 17, 2026 19:02

jack-arturo wants to merge 1 commit into main from release-please--branches--main