feat(layer): MaxPool1d + AvgPool1d (PR 3/3)#161
Merged
MaxPool1d Layer 3 implemented from greenfield: init/forward/backward/calcOutputShape on FLOAT32. Forward scans each window for the max value and writes the corresponding input position to a caller-allocated INT32 argmax tensor stored in cfg. Backward is a pure scatter: for each output position, accumulate lossGrad into propLoss[b, c, argmax]. forwardInput is unused in backward (argmax already encodes the necessary info). Sentinel -1 + PRINT_ERROR + zero-output handles the theoretically possible but in practice unreachable validCount==0 case (spec §6.3).

AvgPool1d Layer 3 implemented from greenfield: stateless init/forward/backward/calcOutputShape on FLOAT32. Forward computes the mean over each window with divisor = kernel_size always (count_include_pad=True / A1 semantics per spec §6.4 / §11). Backward scatters lossGrad/kernel_size into all valid input positions of each window. Padded positions get no gradient. forwardInput is unused (geometry derived from kernel + propLoss shape).

Layer 3 contains zero heap allocations; output, argmax, and propLoss buffers are caller-owned. calcOutputShape validates 3D input and populates orderOfDimensions via setOrderOfDimsForNewTensor (matches Conv1d / Conv1dTransposed convention).

Two PyTorch-derived gold-value generators (generate_expected_max_pool_1d.py, generate_expected_avg_pool_1d.py) emit forward + autograd-derived dL/dx, plus int32 expectedArgmax arrays for MaxPool. The AvgPool generator includes a hand-built autograd-trackable dilation helper (F.avg_pool1d lacks a dilation arg). _format_float_literal helper applies PR 1 Errata 1 (avoid :.9g). withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to keep positional mutations non-vacuous (codebase memory: uniform-lossGrad-mutation-vacuity).

layerType_t extended with MAXPOOL1D and AVGPOOL1D (slotted between CONV1D_TRANSPOSED and SOFTMAX, semantic-grouping with the sliding-window cluster). maxPool1dConfig_t / avgPool1dConfig_t fwd-decls + union members added. Both registered in layerFunctions[] and in Optimizer.c::calcNumberOfStatesByLayerType (return 0; pool layers have no trainable parameters).

UnitTestMaxPool1d covers 9 RUN_TESTs (forwardBasic, calcOutputShape, backwardBasic, argmaxIndicesContent, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases-K=1). UnitTestAvgPool1d covers 7 RUN_TESTs (forwardBasic, backwardBasic, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases-K=L=4). All SAME fixtures exercise both left-edge and right-edge truncation (PR 1 Errata 4).

Spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md §6.3, §6.4, §6.5, §6.6, §8.3, §8.4, §8.5
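The argmax-based forward/backward pairing described in this commit message can be illustrated with a minimal sketch. It assumes a single batch and channel, no padding and no dilation; the function names (`poolOutLenSketch`, `maxPool1dForwardSketch`, `maxPool1dBackwardSketch`) are hypothetical and are not the repository's API. Skipping a negative argmax entry in backward is likewise an assumption about how the -1 sentinel would be consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: output length without padding or dilation. */
size_t poolOutLenSketch(size_t inLen, size_t kernel, size_t stride) {
    assert(kernel > 0 && stride > 0 && inLen >= kernel);
    return (inLen - kernel) / stride + 1;
}

/* Forward: write each window's max to out[] and the input position
 * of that max to the caller-owned argmax[] buffer. */
void maxPool1dForwardSketch(const float *in, size_t inLen,
                            size_t kernel, size_t stride,
                            float *out, int32_t *argmax) {
    size_t outLen = poolOutLenSketch(inLen, kernel, stride);
    for (size_t o = 0; o < outLen; o++) {
        size_t start = o * stride;
        size_t best = start;
        for (size_t k = 1; k < kernel; k++) {
            if (in[start + k] > in[best]) {
                best = start + k; /* strict '>' keeps the first max on ties */
            }
        }
        out[o] = in[best];
        argmax[o] = (int32_t)best;
    }
}

/* Backward: pure scatter. The forward input is not needed because
 * argmax[] already records where each gradient belongs. */
void maxPool1dBackwardSketch(const float *lossGrad, const int32_t *argmax,
                             size_t outLen, float *propLoss, size_t inLen) {
    for (size_t i = 0; i < inLen; i++) {
        propLoss[i] = 0.0f;
    }
    for (size_t o = 0; o < outLen; o++) {
        if (argmax[o] >= 0) { /* assumption: a -1 sentinel gets no gradient */
            propLoss[argmax[o]] += lossGrad[o];
        }
    }
}
```

With overlapping windows (stride smaller than kernel), one input position can win several windows, which is why the scatter accumulates with `+=` rather than assigning.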
8 tasks
LeoBuron added a commit that referenced this pull request on May 10, 2026
…chers

PR #159 (Conv1d refactor) and PR #161 (MaxPool1d/AvgPool1d) added the new layer kernels to layer/Layer.c's vtable but left three downstream dispatcher switches in src/userApi/ at their pre-PR2 state, where the default branch is PRINT_ERROR + exit(1). Without these patches, no training program using Conv1d, MaxPool1d, or AvgPool1d can run end-to-end: the inference path, gradient setup path, and SGD parameter registration all hard-error before the first batch.

Patches are minimal: each adds the missing CONV1D/MAXPOOL1D/AVGPOOL1D cases to existing dispatch switches (no new logic).

- InferenceApi.c::initBufferOutput: extract forwardQ for the 3 layers
- CalculateGradsSequential.c::initLayerOutputs: same extraction
- SgdApi.c::sgdMCreateOptim: register Conv1d weights+bias parameters (Conv1d already returns 2 states from calcNumberOfStatesByLayerType in Optimizer.c, so the slot count was correct); MaxPool1d / AvgPool1d fall through with the other no-trainable-params layers

Discovered while wiring up examples/har_classifier/train_c.c. Without this fix, the example would hard-error at the first inference call. Likely related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
LeoBuron added a commit that referenced this pull request on May 10, 2026
…in dispatchers

PR #159 (Conv1d refactor + Conv1dTransposed Layer) and PR #161 (MaxPool1d + AvgPool1d) added the new layer kernels to layer/Layer.c's vtable but left four downstream dispatcher switches at their pre-PR2 state, where the default branch is PRINT_ERROR + exit(1). Without these patches, no training program using Conv1d, Conv1dTransposed, MaxPool1d, or AvgPool1d can run end-to-end: every path hard-errors before the first batch.

Patches are minimal: each adds the missing CONV1D / CONV1D_TRANSPOSED / MAXPOOL1D / AVGPOOL1D cases to existing dispatch switches (no new logic).

- src/userApi/InferenceApi.c::initBufferOutput: extract forwardQ for the four layers
- src/userApi/training_loop/calculate_grads/CalculateGradsSequential.c::initLayerOutputs: same extraction
- src/userApi/optimizer/SgdApi.c::sgdMCreateOptim: register Conv1d and Conv1dTransposed weights+bias parameters; MaxPool1d / AvgPool1d fall through with the other no-trainable-params layers
- src/optimizer/Optimizer.c::calcNumberOfStatesByLayerType: return 2 states for CONV1D_TRANSPOSED (CONV1D was already there)

Discovered while wiring up examples/har_classifier/train_c.c. Likely related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
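The failure mode these commits describe, a dispatcher switch whose default branch rejects any layer type it does not list, can be sketched as follows. The enum and function names carry a `Sketch` suffix because they are illustrative stand-ins, not the repository's `layerType_t` or `calcNumberOfStatesByLayerType`. The sketch returns -1 from the default branch so it stays testable, where the commit message says the real code uses PRINT_ERROR + exit(1).

```c
#include <stddef.h>

/* Illustrative stand-in for the layer type enum; only the members named
 * in the commit messages are listed, in the order the PR describes. */
typedef enum {
    CONV1D_SKETCH,
    CONV1D_TRANSPOSED_SKETCH,
    MAXPOOL1D_SKETCH,
    AVGPOOL1D_SKETCH,
    SOFTMAX_SKETCH
} layerTypeSketch_t;

/* Number of optimizer states per layer type.
 * Conv layers carry weights + bias (2 states); pooling and softmax
 * have no trainable parameters (0 states). */
int calcNumberOfStatesSketch(layerTypeSketch_t type) {
    switch (type) {
    case CONV1D_SKETCH:
    case CONV1D_TRANSPOSED_SKETCH:
        return 2;
    case MAXPOOL1D_SKETCH:
    case AVGPOOL1D_SKETCH:
    case SOFTMAX_SKETCH:
        return 0;
    default:
        /* A layer type missing from the cases above ends up here; this is
         * the branch that hard-errors when a new enum entry is added
         * without updating every dispatcher. */
        return -1;
    }
}
```

Adding an enum entry without a matching `case` in each such switch compiles cleanly, so the omission only surfaces at run time, which matches how the commits say it was discovered.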
Summary
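PR 3 of 3 in the Conv1d/Pool-family expansion (spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md).
Adds two new Layer-3 ops, both built on PR 1's SlidingWindow1d utility:
- MaxPool1d: forward writes each window's max plus its input position to a caller-allocated INT32 argmax tensor; backward scatters lossGrad through argmax.
- AvgPool1d: stateless; divisor = kernel_size always (count_include_pad=True / A1 semantics per spec §6.4 / §11).
Layer 3 is alloc-free; argmax/output/propLoss buffers are caller-owned.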
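The count_include_pad=True rule can be made concrete with a minimal sketch. It assumes a single batch and channel, symmetric zero padding of `pad` elements on each side, and no dilation; the function names are hypothetical and do not come from the repository. The point it shows is that the divisor stays `kernel` even when part of a window lies in the padding, while backward gives padded positions no gradient.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: output length with symmetric padding, no dilation. */
size_t avgPoolOutLenSketch(size_t inLen, size_t kernel, size_t stride, size_t pad) {
    assert(kernel > 0 && stride > 0 && inLen + 2 * pad >= kernel);
    return (inLen + 2 * pad - kernel) / stride + 1;
}

/* Forward: sum only the valid positions, but always divide by kernel. */
void avgPool1dForwardSketch(const float *in, size_t inLen, size_t kernel,
                            size_t stride, size_t pad, float *out) {
    size_t outLen = avgPoolOutLenSketch(inLen, kernel, stride, pad);
    for (size_t o = 0; o < outLen; o++) {
        float sum = 0.0f;
        for (size_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = (ptrdiff_t)(o * stride + k) - (ptrdiff_t)pad;
            if (pos >= 0 && (size_t)pos < inLen) {
                sum += in[pos];
            }
        }
        out[o] = sum / (float)kernel; /* divisor ignores how many were valid */
    }
}

/* Backward: each valid input position of a window receives
 * lossGrad / kernel; positions in the padding receive nothing. */
void avgPool1dBackwardSketch(const float *lossGrad, size_t inLen, size_t kernel,
                             size_t stride, size_t pad, float *propLoss) {
    size_t outLen = avgPoolOutLenSketch(inLen, kernel, stride, pad);
    for (size_t i = 0; i < inLen; i++) {
        propLoss[i] = 0.0f;
    }
    for (size_t o = 0; o < outLen; o++) {
        float share = lossGrad[o] / (float)kernel;
        for (size_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = (ptrdiff_t)(o * stride + k) - (ptrdiff_t)pad;
            if (pos >= 0 && (size_t)pos < inLen) {
                propLoss[pos] += share;
            }
        }
    }
}
```

For input {2, 4, 6, 8} with kernel 2, stride 2 and pad 1, the first window covers one padded slot and the value 2, so its output is 2 / 2 = 1 rather than 2 / 1 = 2.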
Layer registry extended: MAXPOOL1D and AVGPOOL1D enum entries (slotted between CONV1D_TRANSPOSED and SOFTMAX); union members and layerFunctions[] entries registered. Optimizer.c case MAXPOOL1D / AVGPOOL1D: return 0; (no trainable params) added.
Test plan
PyTorch-derived gold-value generators run via uv run; CMake regenerates them on script change. withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to keep positional mutations non-vacuous (per codebase_uniform_lossgrad_mutation_vacuity).
Closes #6 (Conv1d/ConvT empty-stub: already addressed by PR 2 for the Conv side; this PR completes the Pool side that the spec scope absorbs).
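One reading of the remark about non-vacuous positional mutations is that a backward pass with an indexing bug still passes its test when every lossGrad entry is identical. The sketch below illustrates that reading under stated assumptions: non-overlapping windows (kernel equals stride), no padding, and a `mutated` flag that stands in for a bug by reading lossGrad from the mirrored output position. All names are hypothetical.

```c
#include <stddef.h>

/* Average-pool backward for non-overlapping windows.
 * When mutated is nonzero, lossGrad is read from the mirrored output
 * position, which models a positional indexing bug. */
void avgBackwardMutationSketch(const float *lossGrad, size_t outLen,
                               size_t kernel, int mutated, float *propLoss) {
    for (size_t o = 0; o < outLen; o++) {
        size_t src = mutated ? (outLen - 1 - o) : o;
        float share = lossGrad[src] / (float)kernel;
        for (size_t k = 0; k < kernel; k++) {
            propLoss[o * kernel + k] = share;
        }
    }
}

/* Returns 1 when both buffers hold identical values, 0 otherwise. */
int sameBuffersSketch(const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

With lossGrad {1, 1, 1} the correct and mutated versions produce the same propLoss, so a gold-value comparison cannot tell them apart; with {1, 2, 3} they differ and the test catches the bug.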
NOTE: PR is stacked on develop. Per feedback_github_auto_close_default_branch, "Closes #NN" only triggers GitHub's auto-close on default-branch (main) merges. After the develop→main merge, a manual gh issue close 6 is required if not yet closed.
🤖 Generated with Claude Code