fix(ci): retry apt metadata refreshes during release builds #93

Merged
konard merged 5 commits into main from issue-92-78df2acc5d94 on Jun 5, 2026

konard commented Jun 4, 2026

Member

Summary

Fixes #92 by making apt metadata refreshes in release image builds resilient to transient Ubuntu mirror-sync windows.

Root Cause

The failed main release run after PR 91 was not caused by the Docker Hub manifest guard. The failing job was build-dind-amd64 (python) in run 26983601067, where apt-get update inside the dind image build hit an Ubuntu mirror metadata mismatch:

File has unexpected size (2525257 != 2525258). Mirror sync in progress?
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/noble-updates/main/binary-amd64/Packages.gz

Buildx retried the full build step, but the Dockerfile/install scripts still performed unguarded apt metadata refreshes, so each retry could fail on the same mirror state. The dind script also did not source /tmp/common.sh when copied into Docker builds, so shared helper behavior was not available in that path.

During PR CI validation, two branch-introduced Dockerfile issues were also caught and fixed:

  • Docker RUN defaults to /bin/sh, while common.sh is Bash-only because it uses set -euo pipefail.
  • essentials-box/install.sh cleans /tmp/*, so the ACL layer needs common.sh copied again before sourcing it.
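
One way to make the sourcing robust in both contexts (copied into a Docker build at /tmp, or run from a checkout) is sketched below. This is an illustration only, not the repository's actual code: `resolve_common_sh` is a hypothetical helper name, and the fallback path next to the script directory is an assumption about the layout.

```bash
# Hypothetical helper: print the first candidate path that exists as a file.
resolve_common_sh() {
  local candidate
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "common.sh not found in: $*" >&2
  return 1
}

# Usage inside an install script (both paths are assumptions):
#   script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
#   common="$(resolve_common_sh /tmp/common.sh "$script_dir/../common.sh")"
#   source "$common"
#
# Because common.sh is Bash-only, the calling Docker layer has to run under
# Bash (not the default /bin/sh), and common.sh has to be copied to /tmp again
# if an earlier layer removed /tmp/*.
```

Failing loudly when no candidate exists avoids silently running an install script without the shared helpers, which is the situation the dind path was in before this change.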

Changes

  • Added apt_update_with_retry in ubuntu/24.04/common.sh with apt Acquire::Retries, HTTP/HTTPS timeouts, exponential backoff, and apt-list cleanup between attempts.
  • Replaced direct apt-get update / apt update -y call sites in Ubuntu 24.04 Dockerfiles, install scripts, release measurement workflow, and install helper scripts with the shared retry helper or a local fallback.
  • Fixed Docker build script sourcing so /tmp/common.sh is used when scripts are copied into image build contexts.
  • Ensured Docker layers that source common.sh run with Bash before the source operation.
  • Re-copied common.sh before the essentials ACL layer after the install layer cleanup removes /tmp/*.
  • Added a patch changeset for the CI hardening change.
  • Added the issue 92 case study with archived CI logs, error indexes, PR 89/91 timeline, and template comparison under docs/case-studies/issue-92/.
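
For readers who want a picture of the retry helper, a minimal sketch follows. It is not the code from ubuntu/24.04/common.sh: the attempt count, delays, timeout values, and the `APT_UPDATE_MAX_ATTEMPTS`, `APT_UPDATE_INITIAL_DELAY`, `APT_LISTS_DIR`, and `APT_GET` variables are all assumptions chosen for illustration. Only the `Acquire::Retries`, `Acquire::http::Timeout`, and `Acquire::https::Timeout` option names come from apt itself.

```bash
# Sketch of an apt metadata refresh with retries and exponential backoff.
# All defaults below are assumed values, not the repository's settings.
apt_update_with_retry() {
  local max_attempts="${APT_UPDATE_MAX_ATTEMPTS:-5}"
  local delay="${APT_UPDATE_INITIAL_DELAY:-5}"            # seconds
  local lists_dir="${APT_LISTS_DIR:-/var/lib/apt/lists}"
  local apt_get="${APT_GET:-apt-get}"                     # override hook for testing
  local attempt=1

  while true; do
    if "$apt_get" \
        -o Acquire::Retries=3 \
        -o Acquire::http::Timeout=30 \
        -o Acquire::https::Timeout=30 \
        update; then
      return 0
    fi

    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "apt-get update failed after ${attempt} attempts" >&2
      return 1
    fi

    echo "apt-get update failed (attempt ${attempt}/${max_attempts}); retrying in ${delay}s" >&2
    # Remove downloaded index files so the next attempt fetches fresh metadata
    # instead of reusing files from a half-synced mirror.
    rm -rf "${lists_dir:?}"/*
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

The backoff matters here because a mirror-sync window like the one in the failing run typically clears after a short wait, so spacing out attempts gives the mirror time to become consistent instead of hitting the same state repeatedly.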

Verification

  • bash experiments/test-issue92-apt-retry-policy.sh
  • bash experiments/verify-script-syntax.sh
  • bash experiments/test-issue90-release-workflow-policy.sh
  • bash experiments/test-issue82-dockerhub-login-tolerance.sh
  • bash experiments/test-issue82-pr-parallel-tests.sh
  • bash experiments/test-issue84-apply-changesets-quotes.sh
  • bash -n scripts/ubuntu-24-server-install.sh scripts/measure-disk-space.sh ubuntu/24.04/common.sh ubuntu/24.04/assembly/install.sh ubuntu/24.04/cpp/install.sh ubuntu/24.04/dind/install.sh ubuntu/24.04/dotnet/install.sh ubuntu/24.04/essentials-box/install.sh ubuntu/24.04/full-box/install.sh ubuntu/24.04/js/install.sh ubuntu/24.04/php/install.sh ubuntu/24.04/r/install.sh experiments/verify-script-syntax.sh experiments/test-issue92-apt-retry-policy.sh
  • ruby -ryaml -e 'ARGV.each { |path| YAML.load_file(path); puts "parsed #{path}" }' .github/workflows/release.yml .github/workflows/measure-disk-space.yml
  • GITHUB_BASE_REF=main GITHUB_HEAD_REF=issue-92-78df2acc5d94 bash scripts/release/validate-changeset.sh
  • git diff --check

The Docker CLI was not available in the local workspace, so Docker image builds were validated in GitHub Actions instead.
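
The contents of experiments/test-issue92-apt-retry-policy.sh are not reproduced in this conversation. As a rough idea of what such a policy check could look like, the sketch below lists lines that still call apt's `update` directly; the function name `find_unguarded_apt_updates` and the file patterns are hypothetical.

```bash
# Hypothetical policy check: print file:line:content for every direct
# "apt-get update" or "apt update" call under the given directory.
# Calls to apt_update_with_retry do not match, because "apt" must be
# followed by whitespace (or "-get") rather than an underscore.
find_unguarded_apt_updates() {
  local root="$1"
  grep -rnE --include='*.sh' --include='Dockerfile' \
    '(^|[^[:alnum:]_-])apt(-get)?[[:space:]]+(-[^[:space:]]+[[:space:]]+)*update([^[:alnum:]_-]|$)' \
    "$root" || true   # no matches is a success for this helper
}

# Example use in a test script (directory name is an assumption):
#   offenders="$(find_unguarded_apt_updates ubuntu/24.04)"
#   if [ -n "$offenders" ]; then
#     echo "Unguarded apt metadata refreshes found:" >&2
#     echo "$offenders" >&2
#     exit 1
#   fi
```

A real policy test would also need an allowlist for the helper's own implementation and for any local fallback, since those legitimately invoke `update`.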

CI

Latest run for head SHA 8baa3c9f08d4f1aaba71ed7384730c7ec495e08c passed: https://github.com/link-foundation/box/actions/runs/26988202555

Evidence

See docs/case-studies/issue-92/CASE-STUDY.md and archived raw logs under docs/case-studies/issue-92/raw/.

Adding .gitkeep for PR creation (default mode).
This file will be removed when the task is complete.

Issue: #92

konard self-assigned this Jun 4, 2026
konard changed the title from "[WIP] CI/CD is broken" to "fix(ci): retry apt metadata refreshes during release builds" Jun 5, 2026
konard marked this pull request as ready for review June 5, 2026 00:21

konard commented Jun 5, 2026

Member Author

Working session summary

Implemented and finalized PR 93: #93

What changed:

  • Added retrying apt metadata refreshes with apt retries, timeouts, backoff, and apt list cleanup.
  • Replaced direct apt update call sites across Dockerfiles, install scripts, and the disk measurement workflow.
  • Fixed Docker build sourcing for /tmp/common.sh, including Bash shell ordering and the essentials ACL cleanup case.
  • Added regression coverage in experiments/test-issue92-apt-retry-policy.sh.
  • Added the issue case study and archived CI evidence under docs/case-studies/issue-92/.

Verification:

  • Local lightweight checks passed, including shell syntax, YAML parsing, changeset validation, and issue-specific policy tests.
  • The Docker CLI was unavailable locally, so Docker image builds were validated in GitHub Actions.
  • Latest CI run passed for head SHA 8baa3c9f08d4f1aaba71ed7384730c7ec495e08c: https://github.com/link-foundation/box/actions/runs/26988202555
  • PR is ready, not draft, merge state is clean, working tree is clean, and branch is 0 behind / 5 ahead of origin/main.

Official references used in the case study: GitHub Actions contexts, GitHub Docker publishing docs, docker/login-action, and Ubuntu apt.conf.


This summary was automatically extracted from the AI working session output.

konard commented Jun 5, 2026

Member Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

  • Model: GPT-5.5
  • Provider: OpenAI
  • Public pricing estimate: $31.072050

📊 Context and tokens usage:

  • 858.9K / 1.1M (82%) input tokens, 72.8K / 128K (57%) output tokens

Total: (858.9K + 19.2M cached) input tokens, 72.8K output tokens, $31.072050 cost

🤖 Models used:

  • Tool: OpenAI Codex
  • Requested: gpt-5.5
  • Model: GPT-5.5 (gpt-5.5)

📎 Log file uploaded as Repository (98264KB)


The working session has now ended. Feel free to review and add any feedback on the solution draft.

konard merged commit 44d9d6c into main Jun 5, 2026
50 checks passed

konard commented Jun 5, 2026

Member Author

🎉 Auto-merged

This pull request has been automatically merged by hive-mind.

  • All CI checks have passed

Auto-merged by hive-mind with --auto-merge flag
