CI: run the test suite as a GitHub Action (deferred — prove the product first) #2

Description

@samkeen

Intent

Run the xcodebuild test suite automatically on push / PR so regressions surface without remembering to run it locally.

Status: deferred — holding off until the product is proven; not worth the hosted-macOS minutes cost yet. Today the suite runs locally:

xcodebuild test -project "Relay Notes.xcodeproj" -scheme "Relay Notes" \
  -destination 'platform=iOS Simulator,name=iPhone 17 Pro' 2>&1 | xcbeautify

What CI would (and wouldn't) cover

  • Same subset as a local simulator run — currently 78 tests / 13 suites.
  • MLX/Whisper tests stay #if !targetEnvironment(simulator)-gated and skipped (MLX crashes the simulator GPU). CI can't replace on-device validation or the in-app MLXSmoke button — only the physical iPhone 15 Pro Max does that.
  • No code signing needed (simulator tests don't sign) → no certs/secrets to manage. This is the easy CI case.

Options known today

Option A — GitHub-hosted macos-26 runner (simplest)

  • Verified 2026-06-12: the macos-26 image ships Xcode 26.5 (Build 17F42) — identical to the dev Mac — and iOS 26.5 simulators incl. iPhone 17 Pro, so the local command runs unchanged.
  • Cost: macOS minutes bill at a 10x multiplier, so the Free plan's 2,000 min/month works out to ~200 effective macOS min/month. mlx-swift builds from source on every run, so runs are not quick. A paths-ignore filter keeps doc-only commits from spending minutes.
  • Zero infra to maintain.
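
The cost arithmetic in Option A can be sketched as a small shell helper. The 2,000-minute allowance and the 10x multiplier come from the text above; the 15-minute run duration is a hypothetical placeholder, not a measurement of this project, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Rough budget check for hosted macOS minutes (integer arithmetic only).

# Effective macOS minutes = plan allowance divided by the billing multiplier.
effective_macos_minutes() {
  plan_minutes=$1
  multiplier=$2
  echo $(( plan_minutes / multiplier ))
}

# Whole CI runs that fit in a month, given an assumed per-run duration.
runs_per_month() {
  effective=$(effective_macos_minutes "$1" "$2")
  run_minutes=$3   # hypothetical: measure a real run before trusting this
  echo $(( effective / run_minutes ))
}

effective_macos_minutes 2000 10   # prints 200
runs_per_month 2000 10 15         # prints 13
```

At an assumed 15 minutes per run, that is about 13 runs a month, which is why filtering out doc-only commits matters on the Free plan.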

Option B — Self-hosted runner on the existing Mac

  • Free, identical environment (already has Xcode 26.5 + toolchain).
  • Tradeoffs: only runs when the Mac is on; needs runner setup, plus the usual self-hosted-runner security considerations (a private repo mitigates the main risk).

Draft workflow (for when we pick this up)

.github/workflows/tests.yml:

name: Tests

on:
  push:
    branches: [main]
    paths-ignore: ['**.md', 'planning/**']
  pull_request:
    paths-ignore: ['**.md', 'planning/**']
  workflow_dispatch:

jobs:
  test:
    runs-on: macos-26
    steps:
      - uses: actions/checkout@v4
      - name: Pin Xcode 26.5
        run: sudo xcode-select -s /Applications/Xcode_26.5.0.app
      - run: xcodebuild -version
      - name: Cache SwiftPM checkouts
        uses: actions/cache@v4
        with:
          path: DerivedData/SourcePackages
          key: spm-${{ hashFiles('**/Package.resolved') }}
          restore-keys: spm-
      - name: Run tests
        run: |
          set -o pipefail
          brew list xcbeautify >/dev/null 2>&1 || brew install xcbeautify
          xcodebuild test \
            -project "Relay Notes.xcodeproj" \
            -scheme "Relay Notes" \
            -destination 'platform=iOS Simulator,name=iPhone 17 Pro,OS=26.5' \
            -derivedDataPath DerivedData \
            -resultBundlePath TestResults.xcresult \
            | xcbeautify
      - name: Upload result bundle on failure
        if: ${{ failure() }}
        uses: actions/upload-artifact@v4
        with:
          name: TestResults
          path: TestResults.xcresult

Notes: set -o pipefail is essential. Without it the pipeline's exit status is xcbeautify's, so a failing xcodebuild is masked and the job falsely passes. xcode-select pins 26.5 explicitly because GitHub rotates the default Xcode over time. OS=26.5 matches the deployment target; OS=latest is more resilient but less reproducible.
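
The pipefail point can be checked locally with a minimal sketch. Here `false` stands in for a failing xcodebuild and `cat` stands in for xcbeautify; both are stand-ins for illustration, and `pipeline_status` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report the exit status of a "failing command | formatter" pipeline,
# with pipefail either on or off.
pipeline_status() {
  (
    # Subshell keeps the option change from leaking into the caller.
    if [ "$1" = "on" ]; then
      set -o pipefail
    fi
    status=0
    false | cat || status=$?   # false = failing xcodebuild, cat = xcbeautify
    echo "$status"
  )
}

pipeline_status off   # prints 0: cat's success hides the failure
pipeline_status on    # prints 1: the failure propagates to the pipeline
```

With pipefail off the pipeline reports the last command's status, which is the case that would let a red test run show up as a green job.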

Revisit triggers

  • Product is proven / it's the daily driver and worth protecting.
  • Regressions start slipping through manual local runs.
  • A second contributor or outside collaborator joins.
