feat(layer): MaxPool1d + AvgPool1d (PR 3/3)#161

Merged
LeoBuron merged 1 commit into develop from conv1d-pr3-pool
May 10, 2026

LeoBuron commented May 8, 2026

Summary

PR 3 of 3 in the Conv1d/Pool-family expansion (spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md).

Adds two new Layer-3 ops, both built on PR 1's SlidingWindow1d utility:

  • MaxPool1d: forward writes per-output argmax indices into a caller-allocated INT32 tensor; backward scatters lossGrad into propLoss[b, c, argmax]. A sentinel of -1 marks the empty-window case, which is theoretically possible but unreachable in practice (spec §6.3).
  • AvgPool1d: forward computes the mean over each window; backward scatters lossGrad/kernel_size to all valid input positions. The divisor is always kernel_size (count_include_pad=True / A1 semantics per spec §6.4 / §11).

Layer 3 is alloc-free; argmax/output/propLoss buffers are caller-owned.

Layer registry extended: MAXPOOL1D and AVGPOOL1D enum entries added (slotted between CONV1D_TRANSPOSED and SOFTMAX), with union members and layerFunctions[] entries registered. Optimizer.c gains case MAXPOOL1D / AVGPOOL1D: return 0; because pool layers have no trainable params.

Test plan

PyTorch-derived gold-value generators run via uv run; CMake regenerates the gold values whenever a generator script changes.

  • MaxPool: 9 RUN_TEST entries — basic forward/backward, calcOutputShape, argmaxIndicesContent, multiChannel, multiBatch, withStrideAndDilation, withSamePadding (left+right edge), edgeCases (K=1).
  • AvgPool: 7 RUN_TEST entries — basic forward/backward, multiChannel, multiBatch, withStrideAndDilation, withSamePadding, edgeCases (K=L=4).
  • All withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to keep positional mutations non-vacuous (per codebase_uniform_lossgrad_mutation_vacuity).
  • All SAME fixtures exercise BOTH left-edge and right-edge truncation in the same window set (PR 1 Errata 4).
  • Mutation-tested per task; vacuous mutations are explicitly substituted or deferred to fixtures that exercise the missing axis.
  • Allocation-locality grep clean.
  • Full CI green on the PR.
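
One way to see why a uniform lossGrad can make positional mutations vacuous is the toy sketch below. It is not the project's test code: toyPoolBackward, gradientsDiffer, and the mirrored-index mutation are hypothetical names and choices made for illustration, and the pool is simplified to kernel == stride == 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy backward pass for a non-overlapping average pool
 * (kernel == stride == 2). When mutate is true, lossGrad is read at the
 * mirrored output index, imitating a positional indexing bug. */
void toyPoolBackward(const float *lossGrad, size_t outLen, bool mutate,
                     float *propLoss) {
    assert(lossGrad != NULL && propLoss != NULL);
    for (size_t o = 0; o < outLen; o++) {
        size_t src = mutate ? (outLen - 1 - o) : o;
        float share = lossGrad[src] / 2.0f;
        propLoss[2 * o] = share;
        propLoss[2 * o + 1] = share;
    }
}

/* Returns true if the buffers differ anywhere, i.e. a gold-value
 * comparison would notice the mutation. */
bool gradientsDiffer(const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return true;
        }
    }
    return false;
}
```

With lossGrad = {1, 1, 1} the correct and mutated passes produce identical propLoss buffers, so a test built on that fixture cannot detect the mutation. With distinct values per output position the two buffers differ, which is the property the randn_like fixtures are meant to provide.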

Closes #6 (Conv1d/ConvT empty-stub). PR 2 already addressed the Conv side; this PR completes the Pool side, which the spec scope absorbs.

NOTE: PR is stacked on develop. Per feedback_github_auto_close_default_branch, "Closes #NN" only triggers GitHub's auto-close on default-branch (main) merges. After the develop→main merge, a manual gh issue close 6 is required if the issue is still open.

🤖 Generated with Claude Code

MaxPool1d Layer 3 implemented from greenfield: init/forward/backward/
calcOutputShape on FLOAT32. Forward scans each window for max value
and writes the corresponding input position to a caller-allocated
INT32 argmax tensor stored in cfg. Backward is a pure scatter — for
each output position, accumulate lossGrad into propLoss[b, c, argmax].
forwardInput is unused in backward (argmax already encodes the
necessary info). Sentinel -1 + PRINT_ERROR + zero-output handles the
theoretically-possible-but-in-practice-unreachable validCount==0 case
(spec §6.3).
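
The forward/backward pairing described above can be sketched for a single (batch, channel) row as follows. This is a minimal illustration under stated assumptions, not the merged implementation: the function names and the flat parameter list are hypothetical, the row is addressed directly rather than through SlidingWindow1d, and the backward pass zeroes propLoss itself (the real op may expect the caller to do so).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: max-pool one row of length inLen. Window taps that
 * land in the padding region are skipped. */
void maxPool1dRowForward(const float *in, size_t inLen, size_t kernel,
                         size_t stride, size_t dilation, size_t padLeft,
                         float *out, int32_t *argmax, size_t outLen) {
    for (size_t o = 0; o < outLen; o++) {
        int32_t best = -1;    /* sentinel: no valid tap seen yet */
        float bestVal = 0.0f; /* zero output if the window is empty */
        for (size_t k = 0; k < kernel; k++) {
            /* Position in the unpadded input; may fall outside [0, inLen). */
            ptrdiff_t pos = (ptrdiff_t)(o * stride + k * dilation)
                            - (ptrdiff_t)padLeft;
            if (pos < 0 || pos >= (ptrdiff_t)inLen) {
                continue;
            }
            if (best < 0 || in[pos] > bestVal) {
                best = (int32_t)pos;
                bestVal = in[pos];
            }
        }
        out[o] = bestVal;
        argmax[o] = best;
    }
}

/* Hypothetical helper: pure scatter using the stored argmax indices.
 * The forward input is not needed. */
void maxPool1dRowBackward(const float *lossGrad, const int32_t *argmax,
                          size_t outLen, float *propLoss, size_t inLen) {
    for (size_t i = 0; i < inLen; i++) {
        propLoss[i] = 0.0f; /* assumption: this sketch clears the buffer */
    }
    for (size_t o = 0; o < outLen; o++) {
        if (argmax[o] < 0) {
            continue; /* sentinel -1: empty window, nothing to scatter */
        }
        assert((size_t)argmax[o] < inLen);
        propLoss[argmax[o]] += lossGrad[o];
    }
}
```

For in = {1, 3, 2, 5, 4} with kernel 2, stride 1, dilation 1, and no padding, overlapping windows share a maximum, so two output gradients accumulate into the same input position.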

AvgPool1d Layer 3 implemented from greenfield: stateless
init/forward/backward/calcOutputShape on FLOAT32. Forward computes
mean over each window with divisor = kernel_size always
(count_include_pad=True / A1 semantics per spec §6.4 / §11). Backward
scatters lossGrad/kernel_size into all valid input positions of each
window. Padded positions get no gradient. forwardInput is unused
(geometry derived from kernel + propLoss shape).
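
The fixed-divisor behaviour can be sketched for one (batch, channel) row as below. As with any illustration here, the function names and parameter layout are hypothetical and the backward pass zeroes propLoss itself, which the real op may leave to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: average-pool one row. Padded taps contribute zero
 * to the sum, but the divisor stays kernel (count_include_pad=True). */
void avgPool1dRowForward(const float *in, size_t inLen, size_t kernel,
                         size_t stride, size_t dilation, size_t padLeft,
                         float *out, size_t outLen) {
    assert(kernel > 0);
    for (size_t o = 0; o < outLen; o++) {
        float sum = 0.0f;
        for (size_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = (ptrdiff_t)(o * stride + k * dilation)
                            - (ptrdiff_t)padLeft;
            if (pos >= 0 && pos < (ptrdiff_t)inLen) {
                sum += in[pos];
            }
        }
        out[o] = sum / (float)kernel;
    }
}

/* Hypothetical helper: each valid input position of a window receives
 * lossGrad / kernel; padded positions receive nothing. */
void avgPool1dRowBackward(const float *lossGrad, size_t outLen, size_t kernel,
                          size_t stride, size_t dilation, size_t padLeft,
                          float *propLoss, size_t inLen) {
    assert(kernel > 0);
    for (size_t i = 0; i < inLen; i++) {
        propLoss[i] = 0.0f; /* assumption: this sketch clears the buffer */
    }
    for (size_t o = 0; o < outLen; o++) {
        float share = lossGrad[o] / (float)kernel;
        for (size_t k = 0; k < kernel; k++) {
            ptrdiff_t pos = (ptrdiff_t)(o * stride + k * dilation)
                            - (ptrdiff_t)padLeft;
            if (pos >= 0 && pos < (ptrdiff_t)inLen) {
                propLoss[pos] += share;
            }
        }
    }
}
```

With in = {2, 4, 6, 8}, kernel 2, stride 1, and one padded position on the left, the first window covers one padded tap and the value 2. The output is 2 / 2 = 1 rather than 2 / 1 = 2, which is the visible effect of always dividing by kernel_size.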

Layer 3 contains zero heap allocations; output, argmax, and propLoss
buffers are caller-owned. calcOutputShape validates 3D input and
populates orderOfDimensions via setOrderOfDimsForNewTensor (matches
Conv1d / Conv1dTransposed convention).
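
For orientation, the length dimension that a calcOutputShape-style function has to produce can be computed with the standard sliding-window formula, the same one PyTorch documents for its 1D pooling layers. The sketch below is an assumption about the geometry, not a copy of the project's code: pool1dOutputLength is a hypothetical name, padTotal stands for left plus right padding, and how the project derives SAME padding is not shown in this PR text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of output positions along the length axis.
 * Returns 0 when the padded input is shorter than one dilated window. */
size_t pool1dOutputLength(size_t inLen, size_t padTotal, size_t kernel,
                          size_t stride, size_t dilation) {
    assert(kernel > 0 && stride > 0 && dilation > 0);
    size_t span = dilation * (kernel - 1) + 1; /* effective window width */
    size_t padded = inLen + padTotal;
    if (padded < span) {
        return 0;
    }
    return (padded - span) / stride + 1;
}
```

The two edge-case fixtures named in the test lists fall out directly: K=1 with stride 1 preserves the input length, and K=L=4 yields a single output position.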

Two PyTorch-derived gold-value generators
(generate_expected_max_pool_1d.py, generate_expected_avg_pool_1d.py)
emit forward + autograd-derived dL/dx, plus int32 expectedArgmax
arrays for MaxPool. The AvgPool generator includes a hand-built
autograd-trackable dilation helper (F.avg_pool1d lacks a dilation
arg). _format_float_literal helper applies PR 1 Errata 1 (avoid :.9g).
withStrideAndDilation fixtures use torch.randn_like(y) lossGrad to
keep positional mutations non-vacuous (codebase memory:
uniform-lossGrad-mutation-vacuity).

layerType_t extended with MAXPOOL1D and AVGPOOL1D (slotted between
CONV1D_TRANSPOSED and SOFTMAX, semantic-grouping with the
sliding-window cluster). maxPool1dConfig_t / avgPool1dConfig_t
fwd-decls + union members added. Both registered in layerFunctions[]
and in Optimizer.c::calcNumberOfStatesByLayerType (return 0 — pool
layers have no trainable parameters).
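
A reduced sketch of the registry change is shown below. Only the enum names come from this PR; the enum here omits every other layer type, the numeric values are illustrative, and sketchCalcNumberOfStates is a hypothetical stand-in for calcNumberOfStatesByLayerType. The value 2 for the convolution layers reflects what the referencing commits report for weights plus bias.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, illustrative enum: the real layerType_t has more entries. */
typedef enum {
    CONV1D,
    CONV1D_TRANSPOSED,
    MAXPOOL1D, /* new: grouped with the sliding-window cluster */
    AVGPOOL1D, /* new */
    SOFTMAX
} layerType_t;

/* The new entries sit between CONV1D_TRANSPOSED and SOFTMAX. */
static_assert(CONV1D_TRANSPOSED < MAXPOOL1D && MAXPOOL1D < SOFTMAX,
              "MAXPOOL1D must sit between CONV1D_TRANSPOSED and SOFTMAX");
static_assert(CONV1D_TRANSPOSED < AVGPOOL1D && AVGPOOL1D < SOFTMAX,
              "AVGPOOL1D must sit between CONV1D_TRANSPOSED and SOFTMAX");

/* Hypothetical stand-in: number of trainable parameter tensors per layer. */
size_t sketchCalcNumberOfStates(layerType_t type) {
    switch (type) {
    case CONV1D:
    case CONV1D_TRANSPOSED:
        return 2; /* weights + bias */
    case MAXPOOL1D:
    case AVGPOOL1D:
        return 0; /* pool layers have no trainable parameters */
    case SOFTMAX:
        return 0;
    }
    return 0;
}
```

Inserting entries in the middle of an enum shifts the numeric values of everything after them, so this placement is only safe if nothing persists or hard-codes the raw integers.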

UnitTestMaxPool1d covers 9 RUN_TESTs (forwardBasic, calcOutputShape,
backwardBasic, argmaxIndicesContent, multiChannel, multiBatch,
withStrideAndDilation, withSamePadding, edgeCases-K=1).
UnitTestAvgPool1d covers 7 RUN_TESTs (forwardBasic, backwardBasic,
multiChannel, multiBatch, withStrideAndDilation, withSamePadding,
edgeCases-K=L=4). All SAME fixtures exercise both left-edge and
right-edge truncation (PR 1 Errata 4).

Spec: docs/superpowers/specs/2026-05-06-conv1d-and-sliding-window-design.md
§6.3, §6.4, §6.5, §6.6, §8.3, §8.4, §8.5

LeoBuron merged commit f6cd541 into develop on May 10, 2026
5 checks passed
LeoBuron deleted the conv1d-pr3-pool branch on May 10, 2026 11:04
LeoBuron added a commit that referenced this pull request May 10, 2026
…chers

PR #159 (Conv1d refactor) and PR #161 (MaxPool1d/AvgPool1d) added the
new layer kernels to layer/Layer.c's vtable but left three downstream
dispatcher switches in src/userApi/ at their pre-PR2 state, where the
switch's default case is PRINT_ERROR + exit(1). Without these patches,
no training program using Conv1d, MaxPool1d, or AvgPool1d can run
end-to-end: the inference path, gradient setup path, and SGD parameter
registration all hard-error before the first batch.

Patches are minimal: each adds the missing CONV1D/MAXPOOL1D/AVGPOOL1D
cases to existing dispatch switches (no new logic).

- InferenceApi.c::initBufferOutput  — extract forwardQ for the 3 layers
- CalculateGradsSequential.c::initLayerOutputs — same extraction
- SgdApi.c::sgdMCreateOptim — register Conv1d weights+bias parameters
  (Conv1d already returns 2 states from calcNumberOfStatesByLayerType
  in Optimizer.c, so the slot count was correct); MaxPool1d / AvgPool1d
  fall through with the other no-trainable-params layers

Discovered while wiring up examples/har_classifier/train_c.c. Without
this fix, the example would hard-error at the first inference call.
Likely related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
LeoBuron added a commit that referenced this pull request May 10, 2026
…in dispatchers

PR #159 (Conv1d refactor + Conv1dTransposed Layer) and PR #161
(MaxPool1d + AvgPool1d) added the new layer kernels to layer/Layer.c's
vtable but left four downstream dispatcher switches at their pre-PR2
state, where the switch's default case is PRINT_ERROR + exit(1).
Without these patches, no training program using Conv1d,
Conv1dTransposed, MaxPool1d, or AvgPool1d can run end-to-end: every
path hard-errors before the first batch.

Patches are minimal: each adds the missing CONV1D / CONV1D_TRANSPOSED /
MAXPOOL1D / AVGPOOL1D cases to existing dispatch switches (no new
logic).

- src/userApi/InferenceApi.c::initBufferOutput  — extract forwardQ for
  the four layers
- src/userApi/training_loop/calculate_grads/CalculateGradsSequential.c
  ::initLayerOutputs — same extraction
- src/userApi/optimizer/SgdApi.c::sgdMCreateOptim — register Conv1d
  and Conv1dTransposed weights+bias parameters; MaxPool1d / AvgPool1d
  fall through with the other no-trainable-params layers
- src/optimizer/Optimizer.c::calcNumberOfStatesByLayerType — return 2
  states for CONV1D_TRANSPOSED (CONV1D was already there)

Discovered while wiring up examples/har_classifier/train_c.c. Likely
related to issue #6 (Conv1d empty-stub tail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>