This document describes the job runner, scheduled jobs, and reconciliation subsystem that keep channel data current without manual intervention.
The automation layer has three job types, each registered under a constant name:
| Job Name | Constant | Default Cron | Default Enabled |
|---|---|---|---|
| `playlist_sync` | `jobs.JobPlaylistSync` | `*/30 * * * *` | yes |
| `auto_prioritize` | `jobs.JobAutoPrioritize` | `30 3 * * *` | no |
| `dvr_lineup_sync` | `jobs.JobDVRLineupSync` | `*/30 * * * *` | no |
Runner-managed jobs are triggered two ways:
- Schedule (`triggered_by = "schedule"`): fired by the cron scheduler.
- Manual (`triggered_by = "manual"`): used by automation run endpoints for `playlist_sync` and `auto_prioritize`.
dvr_lineup_sync is currently started by the scheduler callback path; manual
DVR sync uses /api/admin/dvr/sync and does not create a job_runs row.
Source files:
- `internal/jobs/runner.go`: run lifecycle, locking, panic recovery
- `internal/jobs/store.go`: `Run` model and `Store` interface
- `internal/jobs/context.go`: `RunMetadata` context propagation
- `internal/jobs/metrics.go`: `StreamMetric` cache model
- `internal/jobs/playlist_sync.go`: playlist sync job
- `internal/jobs/auto_prioritize.go`: auto-prioritize job
- `internal/scheduler/scheduler.go`: cron engine wrapper
- `internal/reconcile/reconcile.go`: channel source reconciliation
- `internal/analyzer/ffmpeg.go`: ffprobe/ffmpeg stream analyzer
jobs.Runner (internal/jobs/runner.go) coordinates asynchronous job
execution with a global overlap lock and persisted run state.
By default globalLock is true. When enabled, tryLock rejects any new
job start if any other job is already running (not just the same job name).
This means at most one job runs at a time across all job types. The lock can
be toggled via SetGlobalLock(enabled).
Per-name locking is always active: two concurrent runs of the same job name are never allowed regardless of the global lock setting.
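The interaction between the per-name lock and the global lock can be illustrated with a small sketch. The `lockTable` type and its methods are hypothetical stand-ins, not the runner's actual fields; only the rejection rules follow the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// lockTable is a hypothetical stand-in for the runner's lock state.
type lockTable struct {
	mu         sync.Mutex
	running    map[string]bool
	globalLock bool
}

func newLockTable(globalLock bool) *lockTable {
	return &lockTable{running: make(map[string]bool), globalLock: globalLock}
}

// tryLock reports whether jobName may start, and marks it running if so.
func (t *lockTable) tryLock(jobName string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.running[jobName] {
		return false // per-name lock: always enforced
	}
	if t.globalLock && len(t.running) > 0 {
		return false // global lock: any running job blocks a new start
	}
	t.running[jobName] = true
	return true
}

// unlock releases the per-name lock once a run has finished.
func (t *lockTable) unlock(jobName string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.running, jobName)
}

func main() {
	t := newLockTable(true)
	fmt.Println("playlist_sync started:", t.tryLock("playlist_sync"))
	fmt.Println("auto_prioritize started:", t.tryLock("auto_prioritize"))
	t.unlock("playlist_sync")
	fmt.Println("auto_prioritize retried:", t.tryLock("auto_prioritize"))
}
```

With the global lock disabled, different job names can run side by side, while a second run of the same name is still rejected.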
Runner.Start(ctx, jobName, triggeredBy, fn) performs:
- `tryLock(jobName)`: acquires per-name and optional global lock under `mu`. On success, increments `wg` to track the goroutine.
- `store.CreateRun(...)`: persists a new run row with `status = "running"`.
- Creates a cancellable context via `context.WithCancelCause`.
- Launches `go r.run(...)` which calls `fn(ctx, runCtx)`.
RunContext is the handle passed to every JobFunc. It exposes:
- `RunID() int64`: persisted run identifier.
- `SetProgress(ctx, cur, max)`: persist progress counters to the store.
- `IncrementProgress(ctx, delta)`: convenience for `cur += delta`.
- `SetSummary(ctx, summary)`: persist a summary string.
- `Snapshot() (cur, max, summary)`: read in-memory progress without I/O.
Progress updates are guarded by a finalizing flag: once the deferred
cleanup begins, further updates return ErrRunFinalized.
WithRunMetadata(ctx, runID, jobName, triggeredBy) attaches a RunMetadata
value to the context, allowing downstream code to correlate log entries and
store operations back to the originating job run via
RunMetadataFromContext(ctx).
The deferred block in run():
- Recovers panics: sets `status = "error"` with the panic message.
- Calls `beginFinalization()` on `RunContext`: blocks further progress updates and captures final `(cur, max, summary)`.
- Releases the run cancel and per-name lock before persisting terminal state so that `IsRunning()` cannot outlive a non-running persisted status.
- Calls `store.FinishRun(...)` with the final status, error text, summary, and finished timestamp.
Runner.Close() prevents new runs (closed = true), cancels all active
runs with ErrRunnerClosed, and blocks on wg.Wait() until every in-flight
goroutine finishes its FinishRun persistence.
Defined in internal/jobs/store.go:
| Status | Meaning |
|---|---|
| `running` | Currently executing |
| `success` | Completed without error |
| `error` | Failed (including recovered panics) |
| `canceled` | Context was cancelled or deadline exceeded |
The Store interface (internal/jobs/store.go) defines persistence:
```go
// Parameter types are omitted for brevity.
type Store interface {
	CreateRun(ctx, jobName, triggeredBy, startedAt) (int64, error)
	UpdateRunProgress(ctx, runID, progressCur, progressMax, summary) error
	FinishRun(ctx, runID, status, errText, summary, finishedAt) error
	GetRun(ctx, runID) (Run, error)
	ListRuns(ctx, jobName, limit, offset) ([]Run, error)
}
```
The `Run` struct includes `ProgressCur`, `ProgressMax`, `Summary`, and optional `AnalysisErrorBuckets` parsed from auto-prioritize summary text.
PlaylistSyncJob (internal/jobs/playlist_sync.go) refreshes the channel
catalog from upstream M3U playlists and reconciles channel sources. The job
supports multiple playlist sources with bounded source-refresh concurrency.
The default remains sequential (playlist_sync_source_concurrency=1).
- Resolve source list: loads all enabled playlist sources from the `playlist_sources` table. If the job context includes a `source_id` scope (via `PlaylistSyncSourceIDFromContext`), only that source is processed. Falls back to legacy `playlist.url` for the primary source if its stored URL is blank.
- Refresh each source: iterates enabled sources in `order_index` order with bounded worker concurrency controlled by `PLAYLIST_SYNC_SOURCE_CONCURRENCY` (default `1`, max `16`). For each source, calls `PlaylistRefresher.RefreshForSource(ctx, source)` which fetches, parses, applies item-key namespacing for non-primary sources (`ps:<source_key>:<base_key>`), and runs source-scoped catalog upsert/deactivation. Records per-source outcome (item count, error), and emits per-source Prometheus metrics (sync duration, errors, items).
- Reconcile: calls `PlaylistReconciler.Reconcile(ctx, onProgress)`. Progress is reported per-channel with throttled persistence (every 5 channels or every 1 second, whichever comes first). Reconcile is skipped when all sources fail.
- DVR lineup reload (optional): if a `DVRLineupReloader` is configured via `SetPostSyncLineupReloader`, it is called when at least one source succeeded, through `ReloadLineupForPlaylistSyncOutcome`, which returns typed reload status/skip/failure metadata (`dvr.ReloadOutcome`). DVR reload is skipped on total source failure.
Both playlist sync and auto-prioritize use `progressPersistThrottle` to reduce database write pressure. The throttle fires when any of the following conditions is met:
- At least `persistEvery` channels have been processed since the last write.
- At least `persistInterval` has elapsed since the last write.
- The progress counter has reached `progressMax` (always persist on completion).
For playlist sync: persistEvery = 5, persistInterval = 1s.
On completion the run summary is set to a key=value string:
```text
playlist_sources attempted=N succeeded=M failed=K
requested_source_id=... playlist_source_results=<source1>:items=N,<source2>:err=...;
channels processed=X/Y; added_sources=A;
existing_sources=E; dynamic_blocks=B enabled=E added=A updated=U
retained=R removed=R truncated=T; dynamic_channels=C; dynamic_added=A;
dynamic_removed=R; dvr_lineup_reloaded=bool; dvr_lineup_reload_status=S;
dvr_lineup_reload_skip_reason=R
```
dvr_lineup_reload_skip_reason carries compact provider-scoped detail for
both skipped providers and provider-local reload/build failures.
AutoPrioritizeJob (internal/jobs/auto_prioritize.go) probes stream
sources with ffprobe, scores them by quality, and reorders each channel's
source list so the highest-quality source is tried first during streaming.
- Resolve analysis scope: reads settings:
  - `analyzer.autoprioritize.enabled_only` (default `true`): only analyze enabled sources.
  - `analyzer.autoprioritize.top_n_per_channel` (default `0` = unlimited, max `100`): limit analysis to the top N sources per channel by current sort order.
- Load channels and sources: lists all enabled channels, bulk-loads sources via `ListSourcesByChannelIDs`. Falls back to per-channel `ListSources` if bulk load encounters mutation drift errors.
- Build analysis queue: for each source, checks the metrics cache:
  - Cache hit (fresh): reuse cached `StreamMetric`. Freshness thresholds: successful metrics are fresh for `SuccessFreshness` (default 24h), errored metrics retry after `ErrorRetry` (default 30min).
  - Cache miss or stale: queue an `analysisTask` for probing.
  - Empty stream URL: record error metric immediately, skip probing.
- Resolve worker count: determines concurrent probe workers:
  - `"fixed"` mode: uses `FixedWorkers` (1–64), capped by available tuner slots when tuner capacity is configured.
  - `"auto"` mode (default): uses `TunerCount - InUseCount` when tuner capacity is configured, otherwise falls back to `DefaultWorkers` (default 4, max 64).
  - If the resolved worker count (after applying tuner/mode constraints) is zero and tasks exist, the job fails with a descriptive error.
- Analyze pending: runs a concurrent worker pool:
  - Workers read from a shared `taskCh` channel.
  - Each worker acquires a probe lease from the source's virtual tuner pool via `TunerUsage.AcquireProbeForSource(ctx, sourceID, ...)` when tuner-aware mode is active. Source-pool-aware probing ensures probe leases respect per-source capacity without cross-pool interference.
  - On `stream.ErrNoTunersAvailable`, performs up to 3 probe-slot acquire attempts with exponential backoff. Delays start at 250 ms and double per attempt, capped at 2 s.
  - On HTTP 429 errors, retries once after `HTTP429Backoff` (default 60s).
  - On `ErrProbePreempted`, sets a fatal error and cancels all workers.
  - After each completed non-retried probe attempt with a tuner lease, waits `ProbeTuneDelay` before processing the next task.
- Cache and persist: upserts analyzed `StreamMetric` entries and updates `SourceProfileUpdate` on each source record associated with that item key (last probe timestamp, resolution, FPS, codecs, bitrate).
- Score and reorder: for each channel, calls `orderSourcesByScore`:
  - Computes a quality score and applies a health penalty.
  - Reorders sources via `ReorderSources` if the order changed.
  - Tolerates mutation drift errors (channel/source not found, source set drift) by recording skip telemetry instead of failing.
Quality score for each source is the sum of three normalized components:
qualityScore = resNorm + fpsNorm + brNorm
Where:
- `resNorm = (width * height) / maxResolution`: resolution normalized against the highest resolution source in the channel.
- `fpsNorm = fps / maxFPS`: frame rate normalized against the highest FPS in the channel.
- `brNorm = bitrate / maxBitrate`: bitrate normalized against the highest bitrate in the channel. Uses `BitrateBPS`; falls back to `VariantBPS` when `BitrateBPS` is zero.
A source is scorable only when Error is empty and Width > 0,
Height > 0, and FPS > 0.
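As a rough sketch of the normalization, the function below scores the sources of a single channel. The `metric` struct is a trimmed, hypothetical subset of `StreamMetric`; giving unscorable sources a score of zero and skipping `brNorm` when no source reports a bitrate are assumptions made to keep the example well-defined.

```go
package main

import (
	"fmt"
	"math"
)

// metric is a trimmed, illustrative subset of StreamMetric.
type metric struct {
	Width, Height int
	FPS           float64
	BitrateBPS    int64
	VariantBPS    int64
	Error         string
}

func (m metric) scorable() bool {
	return m.Error == "" && m.Width > 0 && m.Height > 0 && m.FPS > 0
}

// bitrate prefers BitrateBPS and falls back to VariantBPS when it is zero.
func (m metric) bitrate() int64 {
	if m.BitrateBPS != 0 {
		return m.BitrateBPS
	}
	return m.VariantBPS
}

// qualityScores returns resNorm + fpsNorm + brNorm per source, normalized
// against the per-channel maxima. Unscorable sources get 0 (an assumption).
func qualityScores(ms []metric) []float64 {
	var maxRes, maxFPS, maxBR float64
	for _, m := range ms {
		if !m.scorable() {
			continue
		}
		maxRes = math.Max(maxRes, float64(m.Width*m.Height))
		maxFPS = math.Max(maxFPS, m.FPS)
		maxBR = math.Max(maxBR, float64(m.bitrate()))
	}
	scores := make([]float64, len(ms))
	for i, m := range ms {
		if !m.scorable() {
			continue
		}
		scores[i] = float64(m.Width*m.Height)/maxRes + m.FPS/maxFPS
		if maxBR > 0 {
			scores[i] += float64(m.bitrate()) / maxBR
		}
	}
	return scores
}

func main() {
	scores := qualityScores([]metric{
		{Width: 1920, Height: 1080, FPS: 60, BitrateBPS: 8_000_000},
		{Width: 1280, Height: 720, FPS: 30, VariantBPS: 4_000_000},
	})
	fmt.Printf("%.3f %.3f\n", scores[0], scores[1])
}
```

The best source on every axis scores 3.0; a 720p30 source at half the bitrate lands near 1.44 in this example.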
The final score is reduced by a health penalty based on recent streaming failures:
finalScore = qualityScore - healthPenalty
Health penalty components (cumulative, capped at 2.95):
| Condition | Penalty |
|---|---|
| `FailCount == 1` | +0.40 |
| `FailCount == 2` | +0.85 |
| `FailCount == 3` | +1.30 |
| `FailCount >= 4` | +1.90 |
| `CooldownUntil > now` | +0.90 |
| `LastFailAt` within 5 min | +0.75 |
| `LastFailAt` within 30 min | +0.50 |
| `LastFailAt` within 2 hours | +0.25 |
| `LastFailAt` older than 2 hours | +0.10 |
| `LastFailAt > LastOKAt` (last was fail) | +0.35 |
Sources with FailCount >= 2 are forcibly demoted (sorted after all
non-demoted sources regardless of score). No penalty is applied when
FailCount == 0 and LastFailAt <= LastOKAt (the source has recovered).
After scoring, sources are sorted by:
- Non-demoted sources before demoted sources.
- Higher score first (within 1e-9 tolerance).
- More recent `LastOKAt` first.
- Lower `SourceID` first (stable tiebreak).
Disabled sources are appended after all enabled sources.
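The ordering rules translate naturally into a multi-key comparator. The sketch below covers enabled sources only; `rankedSource` and `sortSources` are hypothetical names, and using `sort.SliceStable` is an illustrative choice rather than a statement about the real implementation.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// rankedSource carries the fields the comparator needs; names are illustrative.
type rankedSource struct {
	SourceID int64
	Score    float64
	LastOKAt int64 // Unix seconds
	Demoted  bool
}

// sortSources orders sources in place using the four documented keys.
func sortSources(s []rankedSource) {
	const eps = 1e-9
	sort.SliceStable(s, func(i, j int) bool {
		a, b := s[i], s[j]
		if a.Demoted != b.Demoted {
			return !a.Demoted // non-demoted first
		}
		if math.Abs(a.Score-b.Score) > eps {
			return a.Score > b.Score // higher score first
		}
		if a.LastOKAt != b.LastOKAt {
			return a.LastOKAt > b.LastOKAt // more recent success first
		}
		return a.SourceID < b.SourceID // stable tiebreak
	})
}

func main() {
	s := []rankedSource{
		{SourceID: 1, Score: 2.0, LastOKAt: 100, Demoted: true},
		{SourceID: 2, Score: 1.0, LastOKAt: 100},
		{SourceID: 3, Score: 1.0, LastOKAt: 200},
		{SourceID: 4, Score: 1.0, LastOKAt: 200},
	}
	sortSources(s)
	for _, src := range s {
		fmt.Print(src.SourceID, " ")
	}
	fmt.Println()
}
```

Note that the demoted source sorts last even though it has the highest score.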
StreamMetric (internal/jobs/metrics.go) stores cached probe results:
| Field | Type | Description |
|---|---|---|
| `ItemKey` | string | Catalog source identifier |
| `AnalyzedAt` | int64 | Unix timestamp of last analysis |
| `Width` | int | Video width in pixels |
| `Height` | int | Video height in pixels |
| `FPS` | float64 | Frames per second |
| `VideoCodec` | string | e.g. `"h264"`, `"hevc"` |
| `AudioCodec` | string | e.g. `"aac"`, `"mp3"` |
| `BitrateBPS` | int64 | Measured or metadata bitrate |
| `VariantBPS` | int64 | HLS variant bitrate from stream tags |
| `ScoreHint` | float64 | Reserved quality score hint (field exists but is not currently populated by the analyzer pipeline) |
| `Error` | string | Probe error message (empty on success) |
Default freshness: DefaultMetricsFreshness = 24h,
DefaultErrorRetry = 30min.
Probe errors are bucketed for summary reporting:
| Bucket | Match Pattern |
|---|---|
| `http_NNN` | `"server returned NNN"` (regex) |
| `decode_ffprobe_json` | `"decode ffprobe json"` |
| `ffprobe_no_video_streams` | `"ffprobe returned no video streams"` |
| `stream_url_empty` | `"stream url is empty"` |
| `probe_slot_unavailable` | `"probe slot unavailable"` |
| `timeout` | `"deadline exceeded"` / `"timed out"` / `"timeout"` |
| `dns_no_such_host` | `"no such host"` |
| `connection_refused` | `"connection refused"` |
| `ffmpeg_sample_failed` | `"ffmpeg sample failed"` |
| `ffprobe_failed` | `"ffprobe failed"` |
| `other` | Everything else |
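A classifier along these lines could implement the table. The function name `errorBucket`, the case-insensitive matching, and evaluating the rules in table order are all assumptions; only the bucket names and match patterns come from the table above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// httpStatusRe captures the three-digit status in "server returned NNN".
var httpStatusRe = regexp.MustCompile(`server returned (\d{3})`)

// bucketRules lists substring patterns in table order; the first match wins.
var bucketRules = []struct {
	bucket  string
	needles []string
}{
	{"decode_ffprobe_json", []string{"decode ffprobe json"}},
	{"ffprobe_no_video_streams", []string{"ffprobe returned no video streams"}},
	{"stream_url_empty", []string{"stream url is empty"}},
	{"probe_slot_unavailable", []string{"probe slot unavailable"}},
	{"timeout", []string{"deadline exceeded", "timed out", "timeout"}},
	{"dns_no_such_host", []string{"no such host"}},
	{"connection_refused", []string{"connection refused"}},
	{"ffmpeg_sample_failed", []string{"ffmpeg sample failed"}},
	{"ffprobe_failed", []string{"ffprobe failed"}},
}

// errorBucket maps a probe error message to a summary bucket name.
func errorBucket(msg string) string {
	m := strings.ToLower(msg) // case-insensitive matching is an assumption
	if sub := httpStatusRe.FindStringSubmatch(m); sub != nil {
		return "http_" + sub[1]
	}
	for _, rule := range bucketRules {
		for _, needle := range rule.needles {
			if strings.Contains(m, needle) {
				return rule.bucket
			}
		}
	}
	return "other"
}

func main() {
	for _, msg := range []string{
		"Server returned 404 Not Found",
		"context deadline exceeded",
		"dial tcp: lookup example.invalid: no such host",
		"unexpected EOF",
	} {
		fmt.Printf("%-50q -> %s\n", msg, errorBucket(msg))
	}
}
```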
When auto-prioritize encounters mutation drift (channels or sources modified concurrently), it records a skip reason instead of failing:
| Bucket | Condition |
|---|---|
| `source_load_channel_not_found` | Channel deleted between list and load |
| `source_load_source_not_found` | Source deleted between list and load |
| `source_load_source_set_drift` | Source set changed during bulk load |
| `reorder_channel_not_found` | Channel deleted before reorder |
| `reorder_source_not_found` | Source deleted before reorder |
| `reorder_source_set_drift` | Source set changed between score and reorder |
On completion the run summary is set to a key=value string:
```text
channels=N analyzed=N cache_hits=N reordered=N skipped_channels=N
analysis_errors=N analysis_error_buckets=bucket1:N,bucket2:N
skip_reason_buckets=bucket1:N,bucket2:N enabled_only=bool
top_n_per_channel=N limited_channels=N
```
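Because the summary is a whitespace-separated `key=value` string, bucket counts can be recovered with a small parser. `parseBuckets` is a hypothetical helper shown only to illustrate the format; it is not the parser that populates `AnalysisErrorBuckets`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBuckets extracts "name:count" pairs from the value of the given key.
// Malformed pairs and non-numeric counts are skipped.
func parseBuckets(summary, key string) map[string]int {
	out := map[string]int{}
	for _, field := range strings.Fields(summary) {
		k, v, ok := strings.Cut(field, "=")
		if !ok || k != key {
			continue
		}
		for _, pair := range strings.Split(v, ",") {
			name, count, ok := strings.Cut(pair, ":")
			if !ok {
				continue
			}
			if n, err := strconv.Atoi(count); err == nil {
				out[name] = n
			}
		}
	}
	return out
}

func main() {
	summary := "channels=3 analyzed=5 cache_hits=2 analysis_errors=3 " +
		"analysis_error_buckets=http_404:2,timeout:1 skip_reason_buckets="
	fmt.Println(parseBuckets(summary, "analysis_error_buckets"))
}
```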
The playlist subsystem types used by catalog refresh are documented in
CATALOG-PIPELINE.md under "Playlist Management":
Item, ItemStream, Group, and Query.
analyzer.FFmpegAnalyzer (internal/analyzer/ffmpeg.go) wraps ffprobe and
optionally ffmpeg to extract stream quality metrics.
| Field | Default | Description |
|---|---|---|
| `FFprobePath` | `"ffprobe"` | Path to ffprobe binary |
| `FFmpegPath` | `"ffmpeg"` | Path to ffmpeg binary |
| `ProbeTimeout` | 7s | Per-probe context timeout |
| `AnalyzeDurationUS` | 1,500,000 | ffprobe `-analyzeduration` (microseconds) |
| `ProbeSizeBytes` | 1,000,000 | ffprobe `-probesize` (bytes) |
| `BitrateMode` | `"metadata_then_sample"` | Bitrate measurement strategy |
| `SampleSeconds` | 3 | ffmpeg sample duration (seconds) |
- `metadata`: use only ffprobe metadata (stream/format/variant bitrate).
- `sample`: always run an ffmpeg sample to measure bitrate.
- `metadata_then_sample` (default): use metadata if positive; fall back to ffmpeg sample otherwise.
- Run ffprobe with JSON output, requesting stream codec/resolution/framerate and format bitrate.
- Parse the first video stream for `Width`, `Height`, `FPS`, `VideoCodec`. Extract the first audio stream's `AudioCodec`.
- Bitrate priority: `stream.bit_rate` > `format.bit_rate` > `stream.tags.variant_bitrate`.
- If bitrate mode requires sampling, run ffmpeg for `SampleSeconds`, capture output byte count, and compute `bitrate = (bytes * 8) / elapsed`.
scheduler.Service (internal/scheduler/scheduler.go) wraps
github.com/robfig/cron/v3 to provide settings-backed cron scheduling with
hot-reload.
The parser supports the optional-seconds 6-field format plus descriptors:
Second (optional) | Minute | Hour | Dom | Month | Dow | Descriptor
This allows both standard 5-field specs (*/30 * * * *) and 6-field specs
with a leading seconds field (0 */30 * * * *).
Default timezone is America/Chicago. Configurable via setting
jobs.timezone. On invalid timezone, falls back to UTC with a warning.
The timezone affects when cron expressions fire. It can be updated at runtime
via UpdateTimezone(ctx, timezone) which persists the setting and reloads
all schedules.
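The UTC fallback for an invalid timezone can be sketched with the standard library's `time.LoadLocation`. The helper name `resolveLocation` and the log wording are assumptions.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// resolveLocation loads the named timezone and falls back to UTC on failure.
// The boolean reports whether the requested name was honored.
func resolveLocation(name string) (*time.Location, bool) {
	loc, err := time.LoadLocation(name)
	if err != nil {
		log.Printf("invalid timezone %q, falling back to UTC: %v", name, err)
		return time.UTC, false
	}
	return loc, true
}

func main() {
	for _, name := range []string{"America/Chicago", "Not/AZone"} {
		loc, ok := resolveLocation(name)
		fmt.Printf("%s -> %s (honored=%t)\n", name, loc, ok)
	}
}
```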
| Setting Key | Default | Description |
|---|---|---|
| `jobs.timezone` | `America/Chicago` | Scheduler timezone |
| `jobs.playlist_sync.enabled` | `true` | Enable playlist sync cron |
| `jobs.playlist_sync.cron` | `*/30 * * * *` | Playlist sync schedule |
| `jobs.auto_prioritize.enabled` | `false` | Enable auto-prioritize |
| `jobs.auto_prioritize.cron` | `30 3 * * *` | Auto-prioritize schedule |
| `jobs.dvr_lineup_sync.enabled` | `false` | Enable DVR lineup sync |
| `jobs.dvr_lineup_sync.cron` | `*/30 * * * *` | DVR lineup sync schedule |
LoadFromSettings(ctx) rebuilds the entire cron engine:
- Reads timezone and per-job enabled/cron settings from the store.
- Validates cron specs for enabled jobs.
- Creates a new `cron.Cron` engine with the resolved timezone.
- Registers all enabled jobs with their callbacks.
- Stops the old engine (waits for completion), swaps in the new engine.
- If the scheduler was previously started, starts the new engine.
UpdateJobSchedule(ctx, jobName, enabled, cronSpec) provides single-job
hot-update: validates, persists to settings, removes the old cron entry,
and adds the new one — all under the lifecycle mutex.
- `RegisterJob(jobName, callback)`: registers a callback before starting.
- `Start()`: starts the cron engine.
- `Stop()`: cancels the run context, stops the engine, returns a done context.
- `ListSchedules(ctx)`: returns all job schedules with their next-run times.
- `NextRun(jobName)`: returns the next scheduled fire time for one job.
reconcile.Service (internal/reconcile/reconcile.go) synchronizes
published channel source lists against the current catalog state.
For channels without dynamic rules, reconciliation:
- Lists active item keys from the catalog matching the channel's `ChannelKey`.
- Lists existing sources on the channel.
- Adds any catalog items not already present via `AddSource`.
- Skips items that cause `ErrAssociationMismatch` (cross-channel conflicts).
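At its core, the static reconcile step is a set difference between catalog item keys and the channel's existing sources. The sketch below shows only that difference; `missingItemKeys` is a hypothetical helper, and the `AddSource` call and `ErrAssociationMismatch` handling are left out.

```go
package main

import "fmt"

// missingItemKeys returns catalog keys that are not yet attached to the
// channel, preserving catalog order and dropping repeated keys.
func missingItemKeys(catalogKeys, existing []string) []string {
	seen := make(map[string]struct{}, len(existing))
	for _, k := range existing {
		seen[k] = struct{}{}
	}
	var missing []string
	for _, k := range catalogKeys {
		if _, ok := seen[k]; ok {
			continue
		}
		seen[k] = struct{}{} // also guards against duplicate catalog keys
		missing = append(missing, k)
	}
	return missing
}

func main() {
	catalog := []string{"a", "b", "c", "b"}
	existing := []string{"b"}
	fmt.Println(missingItemKeys(catalog, existing))
}
```

Because only missing keys are added, running the step again against the same catalog state yields nothing further to add.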
Channels with DynamicRule.Enabled = true use catalog filter queries:
- Dynamic channel block sync: `SyncDynamicChannelBlocks` materializes block-level channel additions/removals/updates. When a block's `source_ids` filter is set, only catalog items from those playlist sources are matched. Truncation is logged when matches exceed `DynamicGuideBlockMaxLen`.
- Per-channel dynamic source sync: for each reconcilable dynamic channel:
  - Builds a `dynamicCatalogFilterKey` from the rule's group names, search query, regex flag, and optional `source_ids` filter.
  - Paged mode (when enabled and the rule is used by only one channel): delegates to `SyncDynamicSourcesByCatalogFilter`, which iterates the catalog in pages of 512 items.
  - Shared-rule cache mode (when multiple channels share the same dynamic rule): caches the full item key list and reuses it for each channel to avoid repeated full catalog scans.
  - Legacy mode (paged mode disabled): fetches all matching item keys via `ListActiveItemKeysByCatalogFilter`, then calls `SyncDynamicSources`.
- Match limit enforcement: if the matched item count exceeds `dynamicRuleMatchLimit` (default `DynamicGuideBlockMaxLen`), the reconciliation fails with an error for that channel. This prevents overly broad rules from creating excessive source associations.
The Result struct tracks:
| Field | Description |
|---|---|
| `ChannelsTotal` | Total channel count |
| `ChannelsProcessed` | Channels successfully reconciled |
| `ChannelsSkipped` | Non-reconcilable or conflict-skipped |
| `SourcesAdded` | New source associations created |
| `SourcesAlreadySeen` | Existing sources unchanged |
| `DynamicBlocksProcessed` | Dynamic block queries evaluated |
| `DynamicBlocksEnabled` | Enabled dynamic block queries |
| `DynamicChannelsAdded` | Channels added by dynamic blocks |
| `DynamicChannelsUpdated` | Channels updated by dynamic blocks |
| `DynamicChannelsRetained` | Channels unchanged by dynamic blocks |
| `DynamicChannelsRemoved` | Channels removed by dynamic blocks |
| `DynamicChannelsTruncated` | Channels truncated at match cap |
| `DynamicChannelsProcessed` | Dynamic-rule channels source-synced |
| `DynamicSourcesAdded` | Sources added via dynamic rules |
| `DynamicSourcesRemoved` | Sources removed via dynamic rules |
| Setting Key | Type | Default | Description |
|---|---|---|---|
| `analyzer.probe.timeout_ms` | int | 7000 | ffprobe/ffmpeg per-probe timeout |
| `analyzer.probe.analyzeduration_us` | int64 | 1500000 | ffprobe `-analyzeduration` value |
| `analyzer.probe.probesize_bytes` | int64 | 1000000 | ffprobe `-probesize` value |
| `analyzer.bitrate_mode` | string | `metadata_then_sample` | Bitrate strategy (`metadata`, `sample`, `metadata_then_sample`) |
| `analyzer.sample_seconds` | int | 3 | ffmpeg sample duration when sampling bitrate |
| `analyzer.autoprioritize.enabled_only` | bool | true | Only analyze enabled sources |
| `analyzer.autoprioritize.top_n_per_channel` | int | 0 | Limit sources per channel (0 = all, max 100) |
| Option | Default | Description |
|---|---|---|
| `SuccessFreshness` | 24h | How long a successful probe stays cached |
| `ErrorRetry` | 30min | How long before retrying a failed probe |
| `DefaultWorkers` | 4 | Worker count when tuner-unaware |
| `WorkerMode` | `"auto"` | `"auto"` or `"fixed"` |
| `FixedWorkers` | (none) | Required when `WorkerMode` is `"fixed"` |
| `TunerCount` | 0 | Total tuner capacity (0 = tuner-unaware) |
| `ProbeTuneDelay` | 0 | Cooldown after releasing a probe lease |
| `HTTP429Backoff` | 60s | Wait time after an HTTP 429 response |
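How these options might combine into a worker count is sketched below, following the `"fixed"` and `"auto"` rules described for the auto-prioritize job. The `workerOptions` struct and `resolveWorkers` function are hypothetical, and applying the upper bound of 64 to every mode is an assumption.

```go
package main

import "fmt"

// workerOptions mirrors the option names in the table; the struct itself is illustrative.
type workerOptions struct {
	WorkerMode     string
	FixedWorkers   int
	DefaultWorkers int
	TunerCount     int
}

// resolveWorkers returns the number of concurrent probe workers given the
// number of tuner slots currently in use. A result of 0 with pending tasks
// corresponds to the "job fails with a descriptive error" case.
func resolveWorkers(o workerOptions, inUse int) int {
	available := o.TunerCount - inUse
	if available < 0 {
		available = 0
	}
	tunerAware := o.TunerCount > 0

	var n int
	switch o.WorkerMode {
	case "fixed":
		n = o.FixedWorkers
		if tunerAware && n > available {
			n = available // capped by free tuner slots
		}
	default: // "auto"
		if tunerAware {
			n = available
		} else {
			n = o.DefaultWorkers
		}
	}
	if n > 64 {
		n = 64
	}
	return n
}

func main() {
	fixed := workerOptions{WorkerMode: "fixed", FixedWorkers: 8, TunerCount: 4}
	auto := workerOptions{WorkerMode: "auto", DefaultWorkers: 4}
	fmt.Println(resolveWorkers(fixed, 1), resolveWorkers(auto, 0))
}
```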
- Playlist sync: run frequently (every 15–30 minutes) to keep the catalog current with upstream playlist changes. This is the only job enabled by default.
- Auto-prioritize: run during low-usage hours (default: 3:30 AM). Probing streams consumes tuner slots and network bandwidth. Running it overnight minimizes impact on active viewers. Consider enabling only if you have multiple sources per channel and want automatic quality-based ordering.
- DVR lineup sync: run at the same cadence as playlist sync when a DVR integration is configured. Keeps the DVR provider's channel lineup in sync with the published channel list. Note: `dvr_lineup_sync` scheduling is configured via the DVR admin UI and API, not through the Automation page (see DVR-INTEGRATION.md for configuration details).
- Set the timezone to your local timezone so cron expressions behave intuitively.
- For auto-prioritize with tuner-aware mode, ensure `TunerCount` matches your actual tuner configuration so the worker pool respects active stream capacity.
- After changing playlist URL or channel configuration, trigger a manual playlist sync to apply changes immediately rather than waiting for the next scheduled run.
- Monitor job run summaries for elevated error counts: a high `analysis_errors` count in auto-prioritize may indicate network issues or stale stream URLs.
The job runner uses a global overlap lock (globalLock = true by default),
meaning only one job can run at a time across all job types. When a job
start is rejected because another job is already running, the admin API
returns HTTP 409 Conflict.
To diagnose:
- Check `GET /api/admin/jobs?limit=5` for a run with `status: "running"`.
- Wait for the running job to finish before retrying. Scheduled triggers that collide with an in-flight job are now coalesced into one deferred catch-up run with bounded exponential backoff; no unbounded overlap queue is created.
- If a job appears stuck (running for an unexpectedly long time), check server logs for panics or deadlocks. The runner recovers from panics and marks the run as `"error"`, but a blocked network call may hang until its context deadline.
Scheduler contention/freshness observability (Prometheus):
- `job_scheduler_events_total{job_name,event}`: scheduler start/skip/deferred lifecycle counters.
- `job_scheduler_deferred_pending{job_name}`: `1` when a deferred catch-up run is pending.
- `job_scheduler_deferred_backoff_seconds{job_name}`: current deferred retry backoff.
- `job_scheduler_deferred_age_seconds{job_name}`: age of the pending deferred catch-up.
- `job_scheduler_last_success_timestamp_seconds{job_name}`: last successful schedule-triggered run timestamp.
- `job_scheduler_freshness_seconds{job_name}`: computed freshness age since the last successful schedule-triggered run.
Upstream IPTV providers may return HTTP 429 (Too Many Requests) when
probe workers hit rate limits. When a probe receives a 429 response,
the worker retries once after HTTP429Backoff (default 60 seconds).
To reduce 429 pressure:
- Increase `ProbeTuneDelay`: adds a cooldown (in milliseconds) after each probe releases its tuner lease, spacing out requests to the upstream provider. Even a small value (e.g. 500ms–2s) can significantly reduce rate-limit hits.
- Lower the worker count: in `"fixed"` worker mode, reduce `FixedWorkers` so fewer concurrent probes hit the provider. In `"auto"` mode, the worker count is already capped by available tuner slots.
- Schedule during off-peak hours: the default cron (`30 3 * * *`) targets low-usage periods. Avoid overlapping with times when other automation or clients are actively streaming.
When playlist sync and auto-prioritize are both enabled, their schedules should be staggered to avoid contention:
- Playlist sync runs frequently (every 15–30 minutes). Keep this as the higher-frequency job since it is lightweight.
- Auto-prioritize is resource-intensive (probes consume tuner slots and network bandwidth). Schedule it at most once or twice per day, ideally during off-peak hours.
- Because the global lock prevents overlap, a long-running auto-prioritize job will defer playlist sync into catch-up backoff windows. Keep playlist sync frequency aligned to observed run durations and monitor `job_scheduler_freshness_seconds{job_name="playlist_sync"}` to confirm freshness targets are met under load.
- Be aware of provider session caps: if your IPTV provider limits concurrent connections, ensure `TunerCount` reflects this limit so auto-prioritize workers do not exhaust all available sessions and block live viewer streams.
All jobs are safe to re-run at any time. The global lock prevents concurrent
execution, so a duplicate trigger returns ErrAlreadyRunning (HTTP 409)
rather than causing data corruption.
| Job | Idempotent | Semantics |
|---|---|---|
| `playlist_sync` | Yes | Upsert semantics on catalog refresh; reconcile adds missing sources only. Re-running produces the same catalog state for a given upstream playlist. |
| `auto_prioritize` | Yes | Overwrites quality scores and source order based on current probe results. Metrics cache uses upsert; source reorder is a full replacement of the ordering. |
| `dvr_lineup_sync` | Yes | Applies an idempotent patch to the DVR provider lineup. Re-running with no upstream changes produces no mutations. |
The playlist sync pipeline executes staged phases. Individual source failures do not prevent other sources from being refreshed.
| Stage | Operation | On Failure |
|---|---|---|
| 1 | Resolve source list | Job errors immediately if no enabled sources found. |
| 2 | Refresh sources | Each source is refreshed independently. Failed sources are recorded in the summary; their catalogs are not affected. Other sources continue. Source-scoped deactivation ensures one failing source does not mark other sources' items inactive. |
| 3 | Reconcile sources | Skipped when all sources fail. When at least one source succeeds, catalog is updated and channel source mappings are reconciled. |
| 4 | DVR lineup reload | Triggered when at least one source succeeds; skipped on total source failure. Provider-local reload/build failures are aggregated while remaining providers continue. Run summary status reflects reloaded, partial, failed, or skipped based on aggregate outcome. |
The auto-prioritize job has analogous staged behavior: if analysis
completes but reordering fails for a specific channel, the failure is
classified as mutation drift and recorded in skip telemetry rather than
aborting the entire job. Fatal errors (e.g. ErrProbePreempted) cancel
all workers immediately.
The scheduler uses robfig/cron/v3 with cron.WithLocation(location),
so all cron expressions are evaluated in the configured timezone.
Default timezone: America/Chicago (setting key: jobs.timezone).
Change it at runtime via PUT /api/admin/automation (include "timezone": "..." in the
JSON body), which persists the value and hot-reloads all schedules.
Spring-forward example: A cron spec of 0 2 * * * in
America/Chicago during the spring DST transition (2:00 AM jumps to
3:00 AM) causes the job to fire at 3:00 AM CDT — the first wall-clock
instant at or after the skipped target time.
Fall-back example: During the fall-back transition (2:00 AM occurs twice), the cron fires on the first occurrence of the target time (i.e. before clocks are set back). It does not fire a second time.
If the configured timezone is invalid (e.g. typo in the setting value), the scheduler logs a warning and falls back to UTC.