Feature: Test suite partitioning #449

@frenchy64

CI platforms like GitHub Actions and CircleCI support matrix builds, which can be used to fan out a number of parallel jobs that execute a test suite.

For this to result in faster builds, the test runner must be able to partition a test suite.

An Actions build might look like this:

jobs:
  test:
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        id: [0,1,2,3,4]
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}

This would cover the entire test suite by running:

./bin/kaocha --partition-index 0 --partitions 5
./bin/kaocha --partition-index 1 --partitions 5
./bin/kaocha --partition-index 2 --partitions 5
./bin/kaocha --partition-index 3 --partitions 5
./bin/kaocha --partition-index 4 --partitions 5

You could imagine different strategies for partitioning:

  • split by test namespace
    • don't need to load tests you don't need
  • load all namespaces, split by deftest
    • share fixtures?
  • use timing results from prior runs to load-balance tests
  • have Kaocha inform CI how many partitions are needed in order to build in a certain timeframe, e.g.,
jobs:
  setup:
    runs-on: ubuntu-22.04
    outputs:
      partitions: ${{steps.partitions.outputs.partitions}}
    steps:
      - uses: actions/cache/restore@v4
        with:
          path: timings.edn
          # `key` is a required input; this value is only a placeholder
          key: kaocha-timings
      - id: partitions
        run: echo "partitions=$(./bin/kaocha --print-partitions --target-time 5m --prior-timings timings.edn | bb -e '(-> *input* range json/encode println)')" >> "$GITHUB_OUTPUT"

  test:
    runs-on: ubuntu-22.04
    needs: setup
    strategy:
      matrix:
        id: ${{ fromJSON(needs.setup.outputs.partitions)}}
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}
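The last two strategies (load-balancing from prior timings, and deriving a partition count from a target time) could be sketched roughly as follows. This is only an illustration in Python, not Kaocha code: the function names `partitions_needed` and `balance` are hypothetical, the timings are assumed to be a map from test name to seconds, and the count returned is a lower bound rather than a guarantee, since tests cannot be split.

```python
import heapq
import math


def partitions_needed(timings, target_seconds):
    """Lower bound on the partitions needed to finish within target_seconds."""
    total = sum(timings.values())
    longest = max(timings.values(), default=0)
    if longest > target_seconds:
        raise ValueError("a single test exceeds the target time")
    return max(1, math.ceil(total / target_seconds))


def balance(timings, total_partitions):
    """Greedy assignment: slowest tests first, each to the least-loaded partition."""
    # Heap entries are (accumulated_seconds, partition_index).
    heap = [(0.0, i) for i in range(total_partitions)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(total_partitions)]
    # Sort by descending duration, breaking ties by name so the result is deterministic.
    for name, secs in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        buckets[i].append(name)
        heapq.heappush(heap, (load + secs, i))
    return buckets


# Hypothetical prior timings, in seconds.
timings = {"a": 120, "b": 90, "c": 60, "d": 30}
count = partitions_needed(timings, 150)   # 300s of tests / 150s target
buckets = balance(timings, count)
```

Because every job would compute the same buckets from the same timings file, job `i` can simply run `buckets[i]` without any coordination between jobs.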

The partitioning algorithm must be deterministic and reproducible, and it must ensure that every test is run. Kaocha should assume that every --partition-index is covered; making sure of that is the user's responsibility (or could be packaged in a reusable Action). The simplest algorithm might be to sort tests by name before partitioning. Test runs could still be randomized by using the current git SHA as the seed, since every job in a build sees the same SHA.
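A minimal sketch of the sort-then-partition idea, with an optional seed for the randomized variant. It is written in Python purely for illustration; `partition` and its parameters are hypothetical names, and nothing here reflects how Kaocha would actually implement it.

```python
import hashlib
import random


def partition(tests, index, total, seed=None):
    """Return the tests assigned to partition `index` out of `total`."""
    if not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    # Sorting first makes the result independent of discovery order.
    ordered = sorted(tests)
    if seed is not None:
        # Derive an integer from the seed string (e.g. a git SHA) so that
        # every job shuffles the list identically.
        digest = hashlib.sha256(seed.encode("utf-8")).digest()
        random.Random(int.from_bytes(digest, "big")).shuffle(ordered)
    # Round-robin: the element at position i goes to partition i % total.
    return ordered[index::total]


# Hypothetical test names, deliberately out of order.
tests = ["b/test-2", "a/test-1", "c/test-3", "a/test-0", "b/test-4"]
parts = [partition(tests, i, 3) for i in range(3)]
seeded = [partition(tests, i, 3, seed="0123abc") for i in range(3)]
```

Since each job slices the same ordered list at a different offset, the partitions are disjoint and together cover the whole suite, provided every index from 0 to `total - 1` is actually run.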
