Periodic full-queue safety-net sweep for unmatched_external #87

Description

@koinsaari

Goal

A backstop sweep that periodically re-evaluates the entire unmatched_external queue against the entire places table, catching rows that the per-OSM-ingest sweep missed (e.g., from a sweep that crashed mid-loop, or rows in regions OSM hasn't touched recently).

Why

The per-ingest sweep (introduced for #72) only re-evaluates queue rows near places touched by the current OSM ingest. Queue rows in stale regions, or rows that a previous sweep failed to process due to a mid-flight crash, can sit forever waiting for a coincidental OSM update nearby.

Scope

  • Cron / scheduled job that runs the queue-wide sweep on a configurable cadence (daily / weekly / monthly — settle in brainstorm).
  • Same matcher logic as the per-ingest sweep: re-run identity.Match per queue row, attach + delete on match, bump attempts + last_attempted on no-match.
  • No spatial filter: the query scans the whole queue, so its cost is bounded only by queue size and an attempts cap (WHERE attempts < N).
  • Metrics / log summary: how many rows processed, how many attached, how many still unmatched.

Out of scope

  • Triage UI for high-attempt records.
  • Per-source sweep filters.

Acceptance

  • The backstop runs on schedule, drains queue rows that fall through the per-ingest sweep, and is idempotent with respect to the per-ingest sweep when both run close together.
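
One possible way to get the idempotency asked for above is to let whichever sweep removes the queue row first be the only one that attaches. The sketch below is an assumption about how that could look, not a description of the existing code: `Queue`, `Claim`, and `AttachOnce` are hypothetical, and the mutex-guarded map stands in for a database row that two sweeps race to delete.

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a hypothetical in-memory stand-in for the unmatched_external table.
type Queue struct {
	mu   sync.Mutex
	rows map[int]string // row ID -> external record name
}

// Claim removes the row and reports whether this caller was the one that removed it.
func (q *Queue) Claim(id int) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if _, ok := q.rows[id]; !ok {
		return false // another sweep already handled this row
	}
	delete(q.rows, id)
	return true
}

// AttachOnce runs attach only for the sweep that won the claim.
func AttachOnce(q *Queue, id int, attach func(id int)) bool {
	if !q.Claim(id) {
		return false
	}
	attach(id)
	return true
}

func main() {
	q := &Queue{rows: map[int]string{1: "cafe"}}

	var mu sync.Mutex
	attached := 0
	attach := func(int) {
		mu.Lock()
		attached++
		mu.Unlock()
	}

	// Two sweeps (per-ingest and backstop) reach the same row at about the same time.
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			AttachOnce(q, 1, attach)
		}()
	}
	wg.Wait()

	fmt.Println("attachments:", attached) // prints "attachments: 1"
}
```

Against a real database, the claim and the attach would presumably share one transaction, so that a crash between the two steps cannot drop a row without attaching it.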
