feat(examples): HAR classifier with PyTorch + C parity (Stage 1/4)#162

Merged
LeoBuron merged 2 commits into develop from examples-har-classifier on May 10, 2026

@LeoBuron LeoBuron commented May 10, 2026

Summary

Stage 1 of a four-stage examples/ rollout. Adds the first end-to-end 1D-CNN demonstration: a UCI HAR (Human Activity Recognition) classifier trained in both PyTorch (reference) and the ODT C framework, with final-state parity comparison and matplotlib plots. Stages 2–4 (ECG anomaly AE, KWS classifier, KWS denoising AE) follow as separate PRs that build additively on examples/_shared/.

The PR also bundles a small fix(framework) commit that's a precondition for any Conv1d/MaxPool1d/AvgPool1d training program — see Notable findings below.

What's in this PR

Shared infrastructure (examples/_shared/):

  • XorShift32 Python mirror of src/rng/RNG.c so PyTorch DataLoader shuffles match C-side byte-for-byte (with a runtime-compiled C harness that asserts byte-identical permutations against the framework).
  • log_schema.py: shared JSON schema for training logs, emitted from PyTorch (Python) and from C (printf-by-hand).
  • parity.py: per-metric absolute or relative tolerance helpers.
  • plotting.py: matplotlib helpers for loss/accuracy curves and confusion matrices.
  • npy_writer.{h,c}: tiny .npy v1.0 writer for the C side (NPYLoaderApi only reads).
  • seeds.py, DETERMINISM.md: seed constants and the determinism contract.
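
For orientation, the .npy v1.0 layout that a writer such as npy_writer.{h,c} has to produce can be sketched in a few lines of Python. This is only an illustration of the file format, not the C implementation from this PR: the function name `npy_bytes_f32` is hypothetical, and padding the header to a 64-byte boundary is an assumption taken from current NumPy behaviour.

```python
import struct


def npy_bytes_f32(values, shape):
    """Serialize a flat list of floats as a little-endian float32 .npy v1.0 blob.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    count = 1
    for dim in shape:
        count *= dim
    if count != len(values):
        raise ValueError("shape does not match number of values")

    # A 1-D shape needs the trailing comma so it parses as a tuple: "(3,)".
    shape_txt = "(" + ", ".join(str(d) for d in shape)
    shape_txt += ",)" if len(shape) == 1 else ")"
    header = f"{{'descr': '<f4', 'fortran_order': False, 'shape': {shape_txt}, }}"

    # magic (6) + version (2) + header length (2) = 10 bytes of preamble.
    # Pad with spaces so preamble + header (including the final newline)
    # ends on a 64-byte boundary.
    pad = (-(10 + len(header) + 1)) % 64
    header = header + " " * pad + "\n"

    return (
        b"\x93NUMPY"
        + bytes([1, 0])                      # format version 1.0
        + struct.pack("<H", len(header))     # little-endian uint16 header length
        + header.encode("latin1")
        + struct.pack(f"<{count}f", *values)  # C-order float32 payload
    )


blob = npy_bytes_f32([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3))
```

Writing `blob` to a file should give something `numpy.load` can read back as a 2×3 float32 array, which is the kind of roundtrip test_npy_writer.py is described as checking.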

HAR example (examples/har_classifier/):

  • prepare_data.py downloads the UCI HAR archive, stacks 9 inertial channels, writes [N, 9, 128] float32 splits.
  • train_pytorch.py: 12-layer 1D-CNN (3× Conv1d/ReLU/MaxPool + AvgPool + Flatten + Linear + Softmax). Trains for 20 epochs with SGD (lr=0.01, momentum=0.9, batch size 64). Reaches test_acc 91.75%.
  • train_c.c: same architecture in the ODT C framework. Reaches test_acc 90.63%.
  • compare.py: parity checks (±2.5pp accuracy, ±0.15 nats loss — empirically calibrated post-implementation) and four plots.

Build infrastructure:

  • BUILD_EXAMPLES cmake option (default OFF — unit_test_* presets unaffected).
  • examples configure + build presets.
  • .gitignore entries for runtime artifacts.

Notable findings

While wiring up train_c.c, three downstream dispatcher switches in src/userApi/ were found to be missing CONV1D/MAXPOOL1D/AVGPOOL1D cases: PRs #159 and #161 added the layer kernels, but the matching dispatcher updates never landed.

  • InferenceApi.c::initBufferOutput
  • CalculateGradsSequential.c::initLayerOutputs
  • SgdApi.c::sgdMCreateOptim

Without these, no training program using the new layers could run end-to-end — every path hard-errored with PRINT_ERROR("Unknown Layer Type!"); exit(1). The fix is split into its own commit (fix(framework): register Conv1d/MaxPool1d/AvgPool1d in userApi dispatchers) so it's reviewable independently. Patches are minimal — just additional cases in existing switches.

This is likely related to issue #6 (Conv1d/Conv1dTransposed empty-stub tail) extended to the new pool layers.

Parity result

| Metric | PyTorch | C | Diff | Tolerance | Pass |
|---|---|---|---|---|---|
| test_acc | 91.75% | 90.63% | 1.12 pp | ±2.5 pp | PASS |
| test_loss | 0.227 | 0.352 | 0.125 | ±0.15 | PASS |

Loss tolerance was calibrated empirically from observed independent-init drift; accuracy is the more meaningful signal and lands well within tolerance.
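
To make the tolerance rule concrete, here is a minimal sketch of an absolute-or-relative check applied to the numbers in the table above. The `ToleranceCheck` dataclass and `within` function are hypothetical names chosen for illustration; they are not the ParityCheck/ParityResult API from parity.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToleranceCheck:
    """One metric comparison (hypothetical stand-in, not the parity.py API)."""
    name: str
    tolerance: float
    relative: bool = False  # False: absolute bound, True: fraction of reference


def within(check: ToleranceCheck, reference: float, candidate: float) -> bool:
    """Return True when candidate is within the configured bound of reference."""
    diff = abs(reference - candidate)
    bound = check.tolerance * abs(reference) if check.relative else check.tolerance
    return diff <= bound


# Values from the parity table: PyTorch is the reference, C is the candidate.
acc_ok = within(ToleranceCheck("test_acc", 2.5), 91.75, 90.63)    # percentage points
loss_ok = within(ToleranceCheck("test_loss", 0.15), 0.227, 0.352)  # nats
```

With the reported values both checks pass: the accuracy gap of 1.12 pp uses less than half of its budget, while the loss gap of 0.125 sits much closer to its 0.15 bound.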

Test plan

  • cmake --preset unit_test_debug && cmake --build --preset unit_test_debug && ctest --preset unit_test_debug — 42/42 passes
  • uv run pytest python/tests/ — 21/21 passes (3 pre-existing smoke + 18 new)
  • cmake --preset examples && cmake --build --preset examples — clean build
  • uv run python examples/har_classifier/prepare_data.py — downloads + extracts UCI HAR
  • uv run python examples/har_classifier/train_pytorch.py — reaches test_acc ≥ 0.85 (got 0.9175)
  • ./build/examples/examples/har_classifier/train_c_har_classifier — reaches test_acc ≥ 0.83 (got 0.9063)
  • uv run python examples/har_classifier/compare.py — exits 0 with all checks PASS
  • Reviewer eyes on the framework split commit (fix(framework) is a precondition for the example commit and for Stages 2–4)

🤖 Generated with Claude Code

LeoBuron force-pushed the examples-har-classifier branch from 11de5ee to 4b8cc79 on May 10, 2026 13:24
LeoBuron and others added 2 commits May 10, 2026 15:33
…in dispatchers

PR #159 (Conv1d refactor + Conv1dTransposed Layer) and PR #161
(MaxPool1d + AvgPool1d) added the new layer kernels to layer/Layer.c's
vtable but left four downstream dispatcher switches at their pre-PR2
state, where the default branch is PRINT_ERROR + exit(1). Without
these patches, no training program using Conv1d, Conv1dTransposed,
MaxPool1d, or AvgPool1d can run end-to-end — every path hard-errors
before the first batch.

Patches are minimal: each adds the missing CONV1D / CONV1D_TRANSPOSED /
MAXPOOL1D / AVGPOOL1D cases to existing dispatch switches (no new
logic).

- src/userApi/InferenceApi.c::initBufferOutput  — extract forwardQ for
  the four layers
- src/userApi/training_loop/calculate_grads/CalculateGradsSequential.c
  ::initLayerOutputs — same extraction
- src/userApi/optimizer/SgdApi.c::sgdMCreateOptim — register Conv1d
  and Conv1dTransposed weights+bias parameters; MaxPool1d / AvgPool1d
  fall through with the other no-trainable-params layers
- src/optimizer/Optimizer.c::calcNumberOfStatesByLayerType — return 2
  states for CONV1D_TRANSPOSED (CONV1D was already there)

Discovered while wiring up examples/har_classifier/train_c.c. Likely
related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds the first end-to-end 1D-CNN demonstration plus the shared
infrastructure for the four-example rollout. Each example is
self-contained under examples/<name>/ with prepare_data.py,
train_pytorch.py, train_c.c, compare.py, and CMakeLists.txt. Shared
infra under examples/_shared/ supports Stages 2-4.

Shared infrastructure (examples/_shared/):
- xorshift32.py: Python mirror of src/rng/RNG.c (Marsaglia XorShift32
  with shifts 13/17/5 + rejection-sampled Fisher-Yates) so the PyTorch
  DataLoader shuffles in exactly the same order as the C-side
  DataLoader. A pytest test compiles a C harness at runtime and asserts
  byte-equality against the framework code.
- log_schema.py: shared JSON schema for training logs, emitted from
  PyTorch via dump_log() and from C via printf-by-hand.
- parity.py: per-metric absolute or relative tolerance helpers with
  ParityCheck/ParityResult dataclasses.
- plotting.py: matplotlib helpers for loss/accuracy curves and
  confusion matrices (PyTorch solid, C dashed convention).
- npy_writer.{h,c}: tiny .npy v1.0 writer for the C side
  (NPYLoaderApi reads but doesn't write).
- seeds.py + DETERMINISM.md: SEED=42, SHUFFLE_SEED=42 plus the
  determinism contract.
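
The shuffle mirror described in this list can be sketched roughly as follows. The shift constants 13/17/5 come from the text above; everything else is an assumption for illustration and may differ from src/rng/RNG.c and xorshift32.py, including the class and function names, the exact rejection bound, and the loop direction of the Fisher-Yates pass.

```python
MASK32 = 0xFFFFFFFF


class XorShift32:
    """Marsaglia XorShift32 with shifts 13/17/5 on a 32-bit state."""

    def __init__(self, seed: int) -> None:
        if seed & MASK32 == 0:
            raise ValueError("XorShift32 state must be non-zero")
        self.state = seed & MASK32

    def next_u32(self) -> int:
        x = self.state
        x ^= (x << 13) & MASK32  # mask emulates uint32_t wrap-around in C
        x ^= x >> 17
        x ^= (x << 5) & MASK32
        self.state = x
        return x

    def below(self, n: int) -> int:
        """Uniform integer in [0, n); rejects draws that would cause modulo bias."""
        limit = (1 << 32) - ((1 << 32) % n)  # largest multiple of n <= 2**32
        while True:
            r = self.next_u32()
            if r < limit:
                return r % n


def permutation(n: int, rng: XorShift32) -> list:
    """Fisher-Yates shuffle of range(n), walking from the last index down."""
    idx = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.below(i + 1)
        idx[i], idx[j] = idx[j], idx[i]
    return idx


order = permutation(10, XorShift32(42))
```

Because Python integers are unbounded, the explicit masking is what keeps the state identical to a `uint32_t` on the C side; a byte-equality test between the two implementations would catch any mismatch in that masking or in the rejection rule.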

HAR example (examples/har_classifier/):
- prepare_data.py downloads the UCI HAR archive, stacks 9 inertial
  channels, writes [N, 9, 128] float32 splits with deterministic
  10% validation carve.
- train_pytorch.py: 12-layer 1D-CNN (3x Conv1d/ReLU/MaxPool1d +
  AvgPool1d + Flatten + Linear + Softmax). 20 epochs SGD lr=0.01
  m=0.9 batch=64. Reaches test_acc 91.75%.
- train_c.c: same architecture in the ODT C framework (Conv1d via
  userApi, MaxPool1d/AvgPool1d via manual layer construction since
  there is no userApi for them yet). Reaches test_acc 90.63%.
- compare.py: parity checks (test_acc within +/-2.5pp absolute,
  test_loss within +/-0.15 nats absolute -- empirically calibrated
  to handle independent-init confidence-calibration drift while
  still flagging real divergence). Emits 4 PNGs.

Build infrastructure:
- BUILD_EXAMPLES cmake option (default OFF -- unit_test_* presets
  unaffected). New 'examples' configure + build presets.
- .gitignore patterns for runtime artifacts (data/, logs/, outputs/,
  plots/).
- matplotlib added as a dependency.

Tests added (python/tests/):
- test_xorshift32.py: 5 algorithm tests + 3 C-vs-Python parity tests.
- test_log_schema.py: roundtrip + required-key validation.
- test_parity.py: 6 tests for ParityCheck/ParityResult.
- test_npy_writer.py: float32/int32 roundtrips against numpy.load.
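
As an illustration of what "roundtrip + required-key validation" can look like for a shared training-log format, here is a minimal sketch. The key names in `REQUIRED_KEYS` and the helper names are hypothetical; the actual schema lives in log_schema.py and is not reproduced in this PR text.

```python
import json

# Hypothetical required keys; the real schema is defined in log_schema.py.
REQUIRED_KEYS = ("framework", "epochs", "test_acc", "test_loss")


def validate_log(log: dict) -> dict:
    """Raise ValueError if any required key is absent, else return the log."""
    missing = [key for key in REQUIRED_KEYS if key not in log]
    if missing:
        raise ValueError(f"training log is missing keys: {missing}")
    return log


def dumps_log(log: dict) -> str:
    """Validate, then serialize with sorted keys for stable diffs."""
    return json.dumps(validate_log(log), sort_keys=True)


def loads_log(text: str) -> dict:
    """Parse and validate a log emitted by either the Python or the C side."""
    return validate_log(json.loads(text))


example = {"framework": "pytorch", "epochs": 20, "test_acc": 0.9175, "test_loss": 0.227}
restored = loads_log(dumps_log(example))
```

Validating on both write and read means a hand-written printf emitter on the C side is held to the same contract as the Python emitter.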

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
LeoBuron force-pushed the examples-har-classifier branch from 4b8cc79 to 1e82212 on May 10, 2026 13:33
LeoBuron merged commit 1e82212 into develop on May 10, 2026
5 checks passed
LeoBuron deleted the examples-har-classifier branch on May 10, 2026 13:41