fix(ml): set last_train_time from model's embedded training_timestamp on load#4077
Merged
Conversation
… on load When a cached model is loaded at startup, last_train_time was left None, causing the training-age check to treat the model as infinitely old and force an immediate retrain. Use predictor.training_timestamp (already restored from the .npz metadata by LoadPredictor.load) rather than the filesystem mtime, which can be wrong after file copies or backup restores. If training_timestamp is None (pre-dating that field), last_train_time stays None and a retrain correctly triggers as a safe fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes unnecessary ML retraining on startup by setting LoadMLComponent.last_train_time from the model’s embedded training_timestamp immediately after a successful load and validation, rather than leaving it unset (which makes the component assume the model is “old” on the next loop).
Changes:
- Set
last_train_timetoself.predictor.training_timestampwhen a cached model loads successfully and passes validation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
last_train_time#4072, using the correct data sourceLoadPredictor.load()succeeds,predictor.training_timestampis already populated from the.npzmetadata — use that directly instead ofos.path.getmtime()training_timestampisNone(model predates that field),last_train_timestaysNoneand a retrain correctly triggers as a safe fallbackWhy not the approach in #4072
PR #4072 reads
os.path.getmtime()to approximate the training time, but filesystem mtime is unreliable — it resets on file copy, backup restore,touch, or a filesystem with lower mtime resolution. The actual training timestamp is already serialised inside the.npzfile and is loaded byLoadPredictor.load()on every startup, so there is no need to fall back to the filesystem.The fix is a single line with no try/except needed.
Test plan
predbat_ml_model.npzand confirm no immediate retraining occurs in the logslast_trainedin the component status reflects the model's embedded training time, not the file's mtime🤖 Generated with Claude Code