Evals as part of CI

It can easily happen that someone PRs a core change with deep performance consequences downstream.

We have special tests that profile execution speed and the CPU and memory footprints of certain components (mapper, maybe detector, sometimes full blueprints, startup speed). How and where could we log this data, and how do we fail tests upon significant deterioration?
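A minimal sketch of the "fail on significant deterioration" part, assuming each benchmark run produces a flat mapping of key to number where lower is better, and that a baseline mapping from a previous run is available somewhere (a file committed to the repo, a CI artifact, etc.). The function name `find_regressions`, the key names, and the 10% tolerance are all hypothetical, not anything that exists in the repo today.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return {key: ratio} for metrics that got worse by more than `tolerance`.

    Both inputs map a benchmark key to a number where lower is better.
    `tolerance=0.10` (hypothetical default) allows up to 10% slowdown.
    """
    regressions = {}
    for key, base in baseline.items():
        # Skip metrics missing from the current run or with an unusable baseline.
        if key not in current or base <= 0:
            continue
        ratio = current[key] / base
        if ratio > 1.0 + tolerance:
            regressions[key] = ratio
    return regressions


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    baseline = {"mapper_ms": 100.0, "startup_s": 2.0}
    current = {"mapper_ms": 125.0, "startup_s": 2.1}
    bad = find_regressions(baseline, current)
    for key, ratio in bad.items():
        print(f"{key}: {ratio:.2f}x baseline")
    # A CI step could exit non-zero here to fail the job.
    raise SystemExit(1 if bad else 0)
```

A fixed percentage threshold is the simplest option; noisy CI runners may need a wider tolerance or several repeated runs per metric before the check is trustworthy.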

What is the standard output format for benchmarks like this? It's basically a number per key, where lower is better. How do we format this?
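One possible shape for "a number per key", sketched here as a JSON list of `name` / `unit` / `value` entries. This is an assumption for discussion, not an established standard for this repo; it resembles the custom "smaller is better" JSON input that the github-action-benchmark action describes, but that project's docs should be checked before relying on it. The helper name `format_results` and the example keys are hypothetical.

```python
import json


def format_results(results, units):
    """Serialize {key: number} benchmark results as a JSON list.

    Each entry has a name, a unit (empty string if unknown) and a value.
    Lower values are assumed to be better for every key.
    """
    entries = [
        {"name": key, "unit": units.get(key, ""), "value": value}
        for key, value in sorted(results.items())
    ]
    return json.dumps(entries, indent=2)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    results = {"mapper_ms": 125.0, "startup_s": 2.1}
    units = {"mapper_ms": "ms", "startup_s": "s"}
    print(format_results(results, units))
```

Keeping the unit next to each value makes the file readable on its own and lets a graphing tool label axes without extra configuration.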

It would also be nice to get historical graphs for benchmarks to see the repo health, and even individual users' contributions.

Evals as part of CI #2288