Added daily requeuing of missing frames. by cmccully · Pull Request #461 · LCOGT/banzai

cmccully · 2026-05-08T21:08:12Z

This PR adds a cron to requeue missing frames using the same cron container as the calibration scheduler. The query only relies on the archive api.

I also refactored some of the archive querying for better reuse of the retry logic and less code duplication.

Copilot

Pull request overview

Adds a Celery beat (cron) job to detect and requeue “missing” frames by querying the archive API, and refactors archive querying / queue routing to reduce duplication.

Changes:

Introduces a new celery.requeue_missing_frames periodic task and wires it into the existing cron container/entrypoint.
Adds a shared banzai.query module with retrying archive GET + archive frame pagination helpers.
Refactors queue selection into get_processing_queue() and reuses archive query helper in FITS downloading / BPM ingestion.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
uv.lock	Bumps `lco-banzai` to 1.36.0 and updates lock metadata.
pyproject.toml	Updates version and switches console script to new cron entrypoint.
helm-chart/banzai/templates/listener.yaml	Updates the cron container command to `banzai_cron`.
CHANGES.md	Adds 1.36.0 changelog entry for daily requeueing.
banzai/utils/observation_utils.py	Adds tenacity retry to calibration-block archive query.
banzai/utils/instrument_utils.py	Adds shared `get_processing_queue()` helper.
banzai/utils/fits_utils.py	Replaces direct `requests.get` with `archive_get` helper for downloads.
banzai/settings.py	Adds requeue cron configuration knobs.
banzai/scheduling.py	Adds the `requeue_missing_frames` Celery task and archive querying usage.
banzai/query.py	New module for archive querying helpers and cross-matching.
banzai/main.py	Adds new cron entrypoint scheduling and reuses `archive_get` / `get_processing_queue`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…d permissions.

jchate6

I would like to see more documentation here if possible.

New or heavily reworked functions would benefit from doc strings describing their purpose, use, and intended functionality.
I think we need to record this change of behavior somewhere easily accessible and referenced so that the precise details of Banzai's expected behavior are understood by a wider audience.
There are several changes here that I don't really understand why they were made (like the changes in fits_utils). Some comments in the PR explaining how these different changes are related would generally be appreciated.

Also, there are no test changes. Do we want to test that we catch and re-queue missing frames correctly?

I'm curious if we have a different strategy for the cron. Does it make more sense to run it more than once a day? What if we triggered for each site at local noon rather than one bulk re-queue in the West Coast morning? This would spread out the queries and queuing as well.

jchate6 · 2026-05-12T18:58:44Z

+
+REQUEUE_OBSTYPES = ['EXPOSE', 'STANDARD']
+
+REQUEUE_LOOKBACK_HOURS = 36


This feels like an odd lookback time.
This means that everything from 02:30-14:30 is checked twice while everything from 14:30 to 02:30 is only checked once.

It isn't clear to me why we would have redundancy for OGG and ELP, but not CPT and TFN...
It feels like this should be either 25 or 26 hours if we just want a little overlap to make sure we didn't miss anything during processing, or a full 48 hours if we want to check everything twice.

jchate6 · 2026-05-12T19:00:10Z


 REFERENCE_CATALOG_URL = os.getenv('REFERENCE_CATALOG_URL', 'http://phot-catalog.lco.gtn/')
+
+REQUEUE_MISSING_FRAMES_TIME = datetime.time(hour=14, minute=30)


Are we worried about this being too close to the 15:00 report timing? If there are more than a few files, is it possible these might be missed?

jchate6 · 2026-05-12T19:05:09Z

-
+    app.add_periodic_task(crontab(hour=runtime_context.REQUEUE_MISSING_FRAMES_TIME.hour,
+                                  minute=runtime_context.REQUEUE_MISSING_FRAMES_TIME.minute),
+                                  requeue_missing_frames.s(runtime_context=vars(runtime_context)),


I'm not sure what the .s is doing... is that a celery thing?

jchate6 · 2026-05-12T21:07:40Z

+        for obstype in runtime_context.REQUEUE_OBSTYPES:
+            # Get the raw frames that we took
+            start = datetime.now(timezone.utc) - timedelta(hours=runtime_context.REQUEUE_LOOKBACK_HOURS)
+            end = datetime.now(timezone.utc)


Is there any concern here about re-queuing things that are just in the middle of being processed?

jchate6 · 2026-05-12T21:10:56Z

+    return frames
+
+
+def cross_match_missing_frames(raw_frames, reduced_frames):


This will also consider any unbinned UBVRI frames as missing and re-queue them, correct?

If they are missing then yes, but it will not automatically requeue frames that already processed even if they are unbinned.

We don't process those at all unless specifically requested.
There are many warnings and it hasn't happened in months, but they will always be unprocessed.

Added daily requeuing of missing frames.

e48f316

cmccully requested review from Copilot and jchate6 May 8, 2026 21:08

Copilot started reviewing on behalf of cmccully May 8, 2026 21:08 View session

Copilot AI reviewed May 8, 2026

View reviewed changes

cmccully added 5 commits May 11, 2026 09:40

Fixed some typos in api calls.

1ea9bef

Removed code style whitespace.

c11d8c5

Remove auth headers to download url because it already includes signe…

5f1b648

…d permissions.

Fixes based on copilot comments.

07485a9

Typo fixes to passing around include_related_frames.

c2ed4d8

jchate6 requested changes May 12, 2026

View reviewed changes

jchate6 reviewed May 12, 2026

View reviewed changes


		REQUEUE_OBSTYPES = ['EXPOSE', 'STANDARD']

		REQUEUE_LOOKBACK_HOURS = 36


		REFERENCE_CATALOG_URL = os.getenv('REFERENCE_CATALOG_URL', 'http://phot-catalog.lco.gtn/')

		REQUEUE_MISSING_FRAMES_TIME = datetime.time(hour=14, minute=30)

		return frames


		def cross_match_missing_frames(raw_frames, reduced_frames):

Conversation

cmccully commented May 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jchate6 left a comment

Choose a reason for hiding this comment

Uh oh!

jchate6 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

jchate6 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

jchate6 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

jchate6 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

jchate6 May 12, 2026

Choose a reason for hiding this comment

Uh oh!

cmccully May 13, 2026

Choose a reason for hiding this comment

Uh oh!

jchate6 May 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants