feat(layer): MaxPool1d + AvgPool1d (PR 3/3) by LeoBuron · Pull Request #161 · es-ude/OnDeviceTraining

LeoBuron · 2026-05-08T12:41:02Z

Summary

PR 3 of 3 in the Conv1d/Pool-family expansion (spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md).

Adds two new Layer-3 ops, both built on PR 1's SlidingWindow1d utility:

MaxPool1d: forward writes per-output argmax indices into a caller-allocated INT32 tensor; backward scatters lossGrad into propLoss[b, c, argmax]. Sentinel -1 handles theoretical-only empty-window case (in-practice unreachable per spec §6.3).
AvgPool1d: forward computes mean over each window; backward scatters lossGrad/kernel_size to all valid input positions. Divisor is kernel_size always (count_include_pad=True / A1 semantics per spec §6.4 / §11).
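The MaxPool1d argmax bookkeeping described above can be sketched for a single (batch, channel) row. This is a minimal illustration, not the repository's code: `poolOutLen`, `maxPool1dForwardRow` and `maxPool1dBackwardRow` are hypothetical names, padding is left out, and the SlidingWindow1d utility is not used. Because there is no padding, the -1 sentinel never occurs here; the backward pass still guards against it to mirror the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of windows that fit into one unpadded row. */
static size_t poolOutLen(size_t inLen, size_t kernel, size_t stride, size_t dilation) {
    assert(kernel > 0 && stride > 0 && dilation > 0);
    size_t span = dilation * (kernel - 1) + 1; /* input positions covered by one window */
    assert(inLen >= span);
    return (inLen - span) / stride + 1;
}

/* Forward over one row: writes each window's max to out and the input
 * position that produced it to the caller-allocated argmax buffer. */
static void maxPool1dForwardRow(const float *in, size_t inLen,
                                size_t kernel, size_t stride, size_t dilation,
                                float *out, int32_t *argmax) {
    size_t outLen = poolOutLen(inLen, kernel, stride, dilation);
    for (size_t o = 0; o < outLen; o++) {
        size_t best = o * stride; /* first position of the window */
        for (size_t k = 1; k < kernel; k++) {
            size_t pos = o * stride + k * dilation;
            if (in[pos] > in[best]) best = pos;
        }
        out[o] = in[best];
        argmax[o] = (int32_t)best;
    }
}

/* Backward: pure scatter. The caller zeroes propLoss beforehand;
 * overlapping windows that share an argmax accumulate. */
static void maxPool1dBackwardRow(const float *lossGrad, const int32_t *argmax,
                                 size_t outLen, float *propLoss) {
    for (size_t o = 0; o < outLen; o++) {
        if (argmax[o] < 0) continue; /* -1 sentinel: empty window, no gradient */
        propLoss[argmax[o]] += lossGrad[o];
    }
}
```

Note that the backward function never reads the forward input, which matches the statement that argmax already carries everything backward needs.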

Layer 3 is alloc-free; argmax/output/propLoss buffers are caller-owned.
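A small sketch of the AvgPool1d semantics with caller-owned buffers, for one row. The function names and the `padLeft` parameterization are assumptions made for illustration and are not taken from the repository. The point it shows is that the divisor stays at kernel_size even when part of the window lies in the padding, and that padded positions receive no gradient.

```c
#include <assert.h>
#include <stddef.h>

/* Forward for one row. All buffers are provided by the caller; nothing is
 * allocated here. padLeft is a hypothetical zero-padding count. */
static void avgPool1dForwardRow(const float *in, ptrdiff_t inLen,
                                ptrdiff_t kernel, ptrdiff_t stride, ptrdiff_t padLeft,
                                float *out, ptrdiff_t outLen) {
    assert(kernel > 0 && stride > 0);
    for (ptrdiff_t o = 0; o < outLen; o++) {
        float sum = 0.0f;
        for (ptrdiff_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = o * stride - padLeft + k;
            if (pos >= 0 && pos < inLen) sum += in[pos]; /* padding contributes zero */
        }
        out[o] = sum / (float)kernel; /* divisor is always kernel_size */
    }
}

/* Backward for one row. The caller zeroes propLoss beforehand. Each valid
 * input position of a window receives lossGrad / kernel_size. */
static void avgPool1dBackwardRow(const float *lossGrad, ptrdiff_t outLen,
                                 ptrdiff_t kernel, ptrdiff_t stride, ptrdiff_t padLeft,
                                 float *propLoss, ptrdiff_t inLen) {
    assert(kernel > 0 && stride > 0);
    for (ptrdiff_t o = 0; o < outLen; o++) {
        float share = lossGrad[o] / (float)kernel;
        for (ptrdiff_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = o * stride - padLeft + k;
            if (pos >= 0 && pos < inLen) propLoss[pos] += share; /* padded slots are skipped */
        }
    }
}
```

For the input {3, 6, 9, 12} with kernel 3, stride 1 and one padded position on each side, the first window covers only 3 and 6, and the sketch yields (3 + 6) / 3 = 3. Under a count_include_pad=False convention the same window would yield 4.5.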

Layer registry extended: MAXPOOL1D and AVGPOOL1D enum entries are slotted between CONV1D_TRANSPOSED and SOFTMAX, and the matching union members and layerFunctions[] entries are registered. Optimizer.c gains case MAXPOOL1D / AVGPOOL1D: return 0; because pool layers have no trainable params.
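The registry change can be pictured roughly as follows. This is a hedged sketch rather than the actual header: only the identifiers MAXPOOL1D, AVGPOOL1D, CONV1D, CONV1D_TRANSPOSED, SOFTMAX and layerType_t appear in this PR, the real enum has more entries, and `sketchNumberOfStates` is a hypothetical stand-in for the Optimizer.c switch. The value 2 for CONV1D follows a later commit message in this thread; the SOFTMAX value and the -1 default are assumptions.

```c
#include <assert.h>

/* Abbreviated, assumed excerpt of the layer type enum. */
typedef enum {
    CONV1D,
    CONV1D_TRANSPOSED,
    MAXPOOL1D, /* new in this PR */
    AVGPOOL1D, /* new in this PR */
    SOFTMAX
} layerType_t;

/* Compile-time check of the slot position described in the PR text. */
static_assert(MAXPOOL1D == CONV1D_TRANSPOSED + 1 && AVGPOOL1D + 1 == SOFTMAX,
              "pool entries sit between CONV1D_TRANSPOSED and SOFTMAX");

/* Hypothetical stand-in for the per-layer state count used by the optimizer. */
static int sketchNumberOfStates(layerType_t type) {
    switch (type) {
    case CONV1D:
        return 2; /* weights + bias */
    case MAXPOOL1D:
    case AVGPOOL1D:
        return 0; /* no trainable parameters */
    case SOFTMAX:
        return 0; /* assumption: no trainable parameters */
    default:
        return -1; /* assumption: unhandled types are reported as an error */
    }
}
```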

Test plan

PyTorch-derived gold-value generators run via uv run; CMake regenerates them on script change.

MaxPool: 9 RUN_TEST entries — basic forward/backward, calcOutputShape, argmaxIndicesContent, multiChannel, multiBatch, withStrideAndDilation, withSamePadding (left+right edge), edgeCases (K=1).
AvgPool: 7 RUN_TEST entries — basic forward/backward, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases (K=L=4).
All withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to keep positional mutations non-vacuous (per codebase_uniform_lossgrad_mutation_vacuity).
All SAME fixtures exercise BOTH left-edge and right-edge truncation in the same window set (PR 1 Errata 4).
Mutation-tested per task; vacuous mutations are explicitly substituted or deferred to fixtures that exercise the missing axis.
Allocation-locality grep clean.
Full CI green on the PR.
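The remark about random lossGrad in the test plan can be illustrated with a toy example. The code below is an assumption-laden sketch in C, not the project's test code (the real fixtures are generated with PyTorch): `avgBackwardRow` is a simplified non-overlapping average-pool backward, and `gradShift = 1` stands for an injected bug that reads the neighbouring lossGrad entry. With a uniform lossGrad the bug changes nothing, so a test using such a fixture could not catch it.

```c
#include <assert.h>
#include <stddef.h>

/* Backward for non-overlapping average pooling (stride == kernel), one row.
 * gradShift = 0 is the intended behaviour; gradShift = 1 is a deliberately
 * injected positional mutation. */
static void avgBackwardRow(const float *lossGrad, size_t outLen, size_t kernel,
                           size_t gradShift, float *propLoss) {
    assert(kernel > 0 && outLen > 0);
    for (size_t o = 0; o < outLen; o++) {
        float share = lossGrad[(o + gradShift) % outLen] / (float)kernel;
        for (size_t k = 0; k < kernel; k++)
            propLoss[o * kernel + k] = share;
    }
}

/* Returns 1 if comparing against the correct result would expose the mutation. */
static int mutationDetected(const float *lossGrad, size_t outLen, size_t kernel) {
    float good[16], bad[16];
    assert(outLen * kernel <= 16);
    avgBackwardRow(lossGrad, outLen, kernel, 0, good);
    avgBackwardRow(lossGrad, outLen, kernel, 1, bad);
    for (size_t i = 0; i < outLen * kernel; i++)
        if (good[i] != bad[i]) return 1;
    return 0;
}
```

Calling `mutationDetected` with a lossGrad of all ones reports no difference, while values such as {0.5, -1.25, 2.0} expose the shifted index.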

Closes #6 (Conv1d/ConvT empty-stub — already addressed by PR 2 for Conv side; this PR completes the Pool side that the spec scope absorbs).

NOTE: PR is stacked on develop. Per feedback_github_auto_close_default_branch, "Closes #NN" only triggers GitHub's auto-close on default-branch (main) merges. After develop→main merge, manual gh issue close 6 is required if not yet closed.

🤖 Generated with Claude Code

MaxPool1d Layer 3 implemented from greenfield: init/forward/backward/calcOutputShape on FLOAT32. Forward scans each window for max value and writes the corresponding input position to a caller-allocated INT32 argmax tensor stored in cfg. Backward is a pure scatter: for each output position, accumulate lossGrad into propLoss[b, c, argmax]. forwardInput is unused in backward (argmax already encodes the necessary info). Sentinel -1 + PRINT_ERROR + zero-output handles the theoretically-possible-but-in-practice-unreachable validCount==0 case (spec §6.3).

AvgPool1d Layer 3 implemented from greenfield: stateless init/forward/backward/calcOutputShape on FLOAT32. Forward computes mean over each window with divisor = kernel_size always (count_include_pad=True / A1 semantics per spec §6.4 / §11). Backward scatters lossGrad/kernel_size into all valid input positions of each window. Padded positions get no gradient. forwardInput is unused (geometry derived from kernel + propLoss shape).

Layer 3 contains zero heap allocations; output, argmax, and propLoss buffers are caller-owned. calcOutputShape validates 3D input and populates orderOfDimensions via setOrderOfDimsForNewTensor (matches Conv1d / Conv1dTransposed convention).

Two PyTorch-derived gold-value generators (generate_expected_max_pool_1d.py, generate_expected_avg_pool_1d.py) emit forward + autograd-derived dL/dx, plus int32 expectedArgmax arrays for MaxPool. The AvgPool generator includes a hand-built autograd-trackable dilation helper (F.avg_pool1d lacks a dilation arg). _format_float_literal helper applies PR 1 Errata 1 (avoid :.9g). withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to keep positional mutations non-vacuous (codebase memory: uniform-lossGrad-mutation-vacuity).

layerType_t extended with MAXPOOL1D and AVGPOOL1D (slotted between CONV1D_TRANSPOSED and SOFTMAX, semantic-grouping with the sliding-window cluster). maxPool1dConfig_t / avgPool1dConfig_t fwd-decls + union members added. Both registered in layerFunctions[] and in Optimizer.c::calcNumberOfStatesByLayerType (return 0; pool layers have no trainable parameters).

UnitTestMaxPool1d covers 9 RUN_TESTs (forwardBasic, calcOutputShape, backwardBasic, argmaxIndicesContent, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases-K=1). UnitTestAvgPool1d covers 7 RUN_TESTs (forwardBasic, backwardBasic, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases-K=L=4). All SAME fixtures exercise both left-edge and right-edge truncation (PR 1 Errata 4).

Spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md §6.3, §6.4, §6.5, §6.6, §8.3, §8.4, §8.5

…chers

PR #159 (Conv1d refactor) and PR #161 (MaxPool1d/AvgPool1d) added the new layer kernels to layer/Layer.c's vtable but left three downstream dispatcher switches in src/userApi/ at their pre-PR2 state, where the default branch is PRINT_ERROR + exit(1). Without these patches, no training program using Conv1d, MaxPool1d, or AvgPool1d can run end-to-end: the inference path, gradient setup path, and SGD parameter registration all hard-error before the first batch.

Patches are minimal: each adds the missing CONV1D/MAXPOOL1D/AVGPOOL1D cases to existing dispatch switches (no new logic).

- InferenceApi.c::initBufferOutput: extract forwardQ for the 3 layers
- CalculateGradsSequential.c::initLayerOutputs: same extraction
- SgdApi.c::sgdMCreateOptim: register Conv1d weights+bias parameters (Conv1d already returns 2 states from calcNumberOfStatesByLayerType in Optimizer.c, so the slot count was correct); MaxPool1d / AvgPool1d fall through with the other no-trainable-params layers

Discovered while wiring up examples/har_classifier/train_c.c. Without this fix, the example would hard-error at the first inference call. Likely related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…in dispatchers

PR #159 (Conv1d refactor + Conv1dTransposed Layer) and PR #161 (MaxPool1d + AvgPool1d) added the new layer kernels to layer/Layer.c's vtable but left four downstream dispatcher switches at their pre-PR2 state, where the default branch is PRINT_ERROR + exit(1). Without these patches, no training program using Conv1d, Conv1dTransposed, MaxPool1d, or AvgPool1d can run end-to-end: every path hard-errors before the first batch.

Patches are minimal: each adds the missing CONV1D / CONV1D_TRANSPOSED / MAXPOOL1D / AVGPOOL1D cases to existing dispatch switches (no new logic).

- src/userApi/InferenceApi.c::initBufferOutput: extract forwardQ for the four layers
- src/userApi/training_loop/calculate_grads/CalculateGradsSequential.c::initLayerOutputs: same extraction
- src/userApi/optimizer/SgdApi.c::sgdMCreateOptim: register Conv1d and Conv1dTransposed weights+bias parameters; MaxPool1d / AvgPool1d fall through with the other no-trainable-params layers
- src/optimizer/Optimizer.c::calcNumberOfStatesByLayerType: return 2 states for CONV1D_TRANSPOSED (CONV1D was already there)

Discovered while wiring up examples/har_classifier/train_c.c. Likely related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

LeoBuron force-pushed the conv1d-pr3-pool branch from 4c58525 to f6cd541 May 10, 2026 10:05

LeoBuron merged commit f6cd541 into develop May 10, 2026
5 checks passed

LeoBuron deleted the conv1d-pr3-pool branch May 10, 2026 11:04

LeoBuron mentioned this pull request May 10, 2026

feat(examples): HAR classifier with PyTorch + C parity (Stage 1/4) #162

Merged

8 tasks

LeoBuron merged 1 commit into develop from conv1d-pr3-pool
