[CI] Make wrapper2/wrapper3 exercise small systolic-array configs #272
Merged
The wrapper2 test job passed vpu_num_lanes and vpu_spad_size_kb_per_lane as docker env vars, but since the unified-config refactor (9db0f2c) the frontend reads these values only from the TOGSim YAML, never from the environment. The env vars were therefore silently ignored and wrapper2 ran with the default 128x128 config, making it an exact duplicate of wrapper1.

Switch the reusable workflow to select a config YAML via TOGSIM_CONFIG so that a single file drives both codegen and the TOGSim cycle model, in line with the unified-config design. vpu_num_lanes also sets the systolic array dimension, so each config exercises a different array size:

- configs: add systolic_ws_32x32_c1_simple_noc_tpuv3.yml (32x32 array, 32 KB/lane SPAD) and systolic_ws_8x8_c1_simple_noc_tpuv3.yml (8x8 array, 32 KB/lane SPAD); both identical to the tpuv3 default otherwise
- pytorchsim_test.yml: replace vector_lane/spad_size inputs with togsim_config (string) and run_accuracy (bool); every job now sets -e TOGSIM_CONFIG instead of the dead vpu_* env vars
- docker-image.yml: wrapper1 -> 128x128 config + run_accuracy true, wrapper2 -> 32x32 config, wrapper3 -> 8x8 config (run_accuracy false)
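The failure mode described in this commit message can be illustrated with a minimal Python sketch. It is not the project's code: `read_vpu_settings` and the plain dicts standing in for parsed YAML are hypothetical, and the only names taken from the PR are `vpu_num_lanes` and `vpu_spad_size_kb_per_lane`. The point is simply that a reader that consults only the parsed YAML never sees values exported in the environment.

```python
import os

# Defaults as described in this PR: 128 lanes (128x128 array), 128 KB/lane SPAD.
DEFAULTS = {"vpu_num_lanes": 128, "vpu_spad_size_kb_per_lane": 128}


def read_vpu_settings(yaml_values):
    """Hypothetical stand-in for a YAML-only lookup.

    Only the parsed YAML mapping is consulted; os.environ is never read,
    so exporting vpu_* variables has no effect on the result.
    """
    return {key: yaml_values.get(key, default) for key, default in DEFAULTS.items()}


# What the old wrapper2 job did: export the small settings as env vars.
os.environ["vpu_num_lanes"] = "32"
os.environ["vpu_spad_size_kb_per_lane"] = "32"

# The default YAML was still the one being parsed, so the env vars were ignored.
ignored = read_vpu_settings(dict(DEFAULTS))

# The fix: point the job at a YAML that actually carries the small settings.
small_yaml = {"vpu_num_lanes": 32, "vpu_spad_size_kb_per_lane": 32}
selected = read_vpu_settings(small_yaml)
```

Under these assumptions, `ignored` keeps the 128-lane default despite the exported variables, while `selected` picks up the 32-lane values from the chosen config.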
Problem
The `test-pytorchsim-wrapper2` CI job injects `vpu_num_lanes=32` and `vpu_spad_size_kb_per_lane=32` as docker env vars to test a small configuration. But since the unified-config refactor (9db0f2ce), `extension_config.py` reads these values only from the TOGSim YAML, never from the environment. So the env vars are silently ignored and wrapper2 falls back to the default 128x128 config, making it an exact duplicate of wrapper1.

TOGSim (the C++ cycle sim) also reads vpu config from the `--config` YAML, not from env, so reviving an env override would only desync codegen from the cycle model and fight the unified-config design.

Fix (single source of truth)
Select a config YAML via `TOGSIM_CONFIG` so one file drives both codegen and the TOGSim cycle model. `vpu_num_lanes` also sets the systolic-array dimension, so each config exercises a different array size.

- configs: add `systolic_ws_32x32_c1_simple_noc_tpuv3.yml` (32x32 array, 32 KB/lane SPAD) and `systolic_ws_8x8_c1_simple_noc_tpuv3.yml` (8x8 array, 32 KB/lane SPAD); both identical to the tpuv3 default otherwise.
- `pytorchsim_test.yml`: replace the `vector_lane`/`spad_size` number inputs with `togsim_config` (string) and `run_accuracy` (bool). Every job now sets `-e TOGSIM_CONFIG` instead of the dead `vpu_*` env vars. The accuracy/speedup job is gated on `run_accuracy` instead of `vector_lane == 128`.
- `docker-image.yml`: wrapper1 -> 128x128 config + `run_accuracy: true`; wrapper2 -> 32x32 config; wrapper3 -> 8x8 config (`run_accuracy: false`).

Verification
With `TOGSIM_CONFIG` pointed at each new config, `extension_config` reads `vpu_num_lanes` = 32 / 8 and SPAD = 32 KB/lane respectively; the default still reads 128/128. As a result, wrapper2 and wrapper3 now genuinely exercise the smaller-array paths, consistently for both codegen and the cycle sim.
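A check along these lines could be scripted roughly as follows. This is a hedged sketch, not the verification that was actually run: `parse_flat_yaml`, `load_selected_config`, and `check_configs` are hypothetical helpers, the config files are assumed to expose `vpu_num_lanes` and `vpu_spad_size_kb_per_lane` as flat top-level keys (the real layout may differ), and the files are written to a temporary directory as stand-ins for the real configs.

```python
import os
import tempfile

# Expected (vpu_num_lanes, SPAD KB per lane) for the two configs added in this PR.
EXPECTED = {
    "systolic_ws_32x32_c1_simple_noc_tpuv3.yml": (32, 32),
    "systolic_ws_8x8_c1_simple_noc_tpuv3.yml": (8, 32),
}


def parse_flat_yaml(text):
    """Parse flat 'key: value' lines; a stand-in for a real YAML parser."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        value = value.strip()
        values[key.strip()] = int(value) if value.isdigit() else value
    return values


def load_selected_config(env):
    """Read the config file named by TOGSIM_CONFIG in the given env mapping."""
    with open(env["TOGSIM_CONFIG"], encoding="utf-8") as handle:
        return parse_flat_yaml(handle.read())


def check_configs(config_dir):
    """Return {config name: True/False} for whether parsed values match EXPECTED."""
    results = {}
    for name, expected in EXPECTED.items():
        cfg = load_selected_config({"TOGSIM_CONFIG": os.path.join(config_dir, name)})
        actual = (cfg["vpu_num_lanes"], cfg["vpu_spad_size_kb_per_lane"])
        results[name] = actual == expected
    return results


# Stand-in files so the sketch runs on its own; point check_configs at the
# repository's config directory to check the real files instead.
with tempfile.TemporaryDirectory() as tmp:
    for name, (lanes, spad) in EXPECTED.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write(f"vpu_num_lanes: {lanes}\nvpu_spad_size_kb_per_lane: {spad}\n")
    RESULTS = check_configs(tmp)
```

Keeping the expected values in one table makes it easy to extend the check if further array sizes are added to CI.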