feat(tensor): add BOOL bit-packed dtype + SYM Lückenschluss by LeoBuron · Pull Request #173 · es-ude/OnDeviceTraining

LeoBuron · 2026-05-12T17:33:20Z

Summary

- Adds a BOOL value to qtype_t representing 1-bit-per-element bit-packed storage.
- Adds initBoolQuantization (stack-fill) and quantizationInitBool (heap, Two-Tier convention).
- Adds tensorBoolGet / tensorBoolSet bit accessors and the tensorFillFromBoolBuffer userApi helper.
- Wires BOOL into five dispatch switches in src/tensor/Tensor.c (calcBytesPerElement, calcBitsPerElement, calcNumberOfBytesForData, copyQuantization, printTensor).
- Bundles the SYM-Lückenschluss (gap-closing) bugfix as a separate commit: the three calc-function switches had no SYM cases before this PR, so SYM tensors fell through to default: PRINT_ERROR; exit(1).

Architecture

BOOL is bit-packed (1 bit per element, LSB-first within each byte, matching the existing getBitmask / readByte / writeByte convention). Last-byte padding is always zero by construction (init zeroes the buffer, set/get touch only the target bit). qConfig is NULL for BOOL (no scale/zero-point parameters).
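
The LSB-first layout described above can be sketched as follows. This is a minimal illustration over a raw byte buffer, not the project's tensorBoolGet / tensorBoolSet, which operate on tensors; the names packedByteCount, packedBoolGet, and packedBoolSet are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for n 1-bit elements, rounded up. */
static size_t packedByteCount(size_t n) {
    return (n + 7) / 8;
}

/* Hypothetical accessor: element `index` lives in byte index / 8,
 * at bit position index % 8 counted from the least significant bit. */
static bool packedBoolGet(const uint8_t *data, size_t n, size_t index) {
    assert(index < n);
    return ((data[index / 8] >> (index % 8)) & 1u) != 0;
}

/* Hypothetical accessor: sets or clears only the target bit, so
 * neighbouring elements and last-byte padding bits are left untouched. */
static void packedBoolSet(uint8_t *data, size_t n, size_t index, bool value) {
    assert(index < n);
    uint8_t mask = (uint8_t)(1u << (index % 8));
    if (value) {
        data[index / 8] |= mask;
    } else {
        data[index / 8] &= (uint8_t)~mask;
    }
}
```

Because the setter only ever masks a single bit, a buffer that starts zeroed keeps its unused padding bits at zero, which is the invariant the paragraph above relies on.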

This is the prerequisite for the upcoming Dropout layer (#149): the per-element mask kept between the forward and backward passes needs ~32× less memory when bit-packed than when stored as FLOAT32.

Test Plan

- test/unit/tensor/UnitTestTensorBool.c (new): 13 tests covering byte/bit counts, accessors, padding invariants, copyTensor, printTensor.
- test/unit/tensor/UnitTestTensor.c: 4 SYM regression tests for the three calc functions.
- test/unit/tensor/UnitTestTensorApi.c: tensorFillFromBoolBuffer round-trip test.
- Mutation testing per spec §7.3: all 7 mutations caught (table below).
- Full ctest 43/43 green on macOS host.
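
To illustrate what a round-trip test across a byte boundary exercises, here is a rough sketch of packing a caller-owned bool array into a bit-packed buffer and reading it back. It is not the project's tensorFillFromBoolBuffer, whose signature is not shown in this PR description; fillPackedFromBools and packedBitAt are hypothetical names, and the exact-size check is an assumed form of the count validation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a fill helper (not the project's signature).
 * Packs `count` bools LSB-first into `dst`. */
static bool fillPackedFromBools(uint8_t *dst, size_t dstBytes,
                                const bool *src, size_t count) {
    assert(dst != NULL && src != NULL);
    if (dstBytes != (count + 7) / 8) {
        return false; /* assumed count validation: sizes must match */
    }
    memset(dst, 0, dstBytes); /* padding bits start out as zero */
    for (size_t i = 0; i < count; i++) {
        if (src[i]) {
            dst[i / 8] |= (uint8_t)(1u << (i % 8));
        }
    }
    return true;
}

/* Hypothetical reader used to verify the round trip. */
static bool packedBitAt(const uint8_t *data, size_t i) {
    return ((data[i / 8] >> (i % 8)) & 1u) != 0;
}
```

With 12 elements the data spans two bytes, so a byte-index or bit-index off-by-one in the packing loop changes at least one value read back, and the upper four bits of the second byte should stay zero.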

Verified Mutations

| # | Mutation | Test that fired |
|---|----------|-----------------|
| 1 | `tensorBoolSet` bit-index off-by-one | `test_tensorBoolSetGet_RoundTrip_KnownPattern` |
| 2 | `tensorBoolGet` bit-index off-by-one | `test_tensorBoolSetGet_RoundTrip_KnownPattern` |
| 3 | `calcNumberOfBytesForData` BOOL truncated divide | `test_calcNumberOfBytesForData_Bool_N3` |
| 4 | `calcBitsPerElement` SYM case removed | `test_calcBitsPerElement_Sym_qBits3` |
| 5 | `tensorFillFromBoolBuffer` loop no-op | `testTensorFillFromBoolBuffer_RoundTrip_N12` |
| 6 | `copyQuantization` BOOL case removed | `test_copyTensorBool_AllBitsMatch` |
| 7 | `tensorBoolSet` byte-index off-by-one | `testTensorFillFromBoolBuffer_RoundTrip_N12` |

References

Closes #169 (BOOL bit-packed dtype implementation)
Closes #170 (SYM-Lückenschluss in three calc functions)

See also #160 (N=0 latent hazard — behavior inherited unchanged)
See also #171 (copyQuantization missing INT32/SYM/ASYM cases — separate scope, this PR adds only BOOL)
See also #172 (calcBytesPerTensor truncates for sub-byte dtypes — discovered during review, pre-existing)

Note: Closes #NN auto-close only triggers on main-merges; develop-merges need manual gh issue close per repo convention.

🤖 Generated with Claude Code

Adds BOOL as a sibling qtype with 1-bit-per-element bit-packed storage, a prerequisite for the Dropout layer (#149) where the per-element mask between forward and backward needs ~32× less memory than FLOAT32.

Storage layout: LSB-first within each byte, matching the existing getBitmask / readByte / writeByte convention. Last-byte padding bits remain zero by construction. qConfig is NULL (no scale / zero-point).

New API:
- BOOL value in qtype_t enum (at end, preserving existing ordinals)
- initBoolQuantization (stack-fill) + quantizationInitBool (heap, Two-Tier convention; mirrors quantizationInitFloat / Int32 / SymInt32 / Asym)
- tensorBoolGet / tensorBoolSet bit-level accessors (storage-order flat index; non-BOOL tensors trip PRINT_ERROR + exit)
- tensorFillFromBoolBuffer userApi helper (caller-owned bool source, count + type validation, packs via tensorBoolSet)

Wired into five existing dispatch switches in src/tensor/Tensor.c: calcBytesPerElement, calcBitsPerElement, calcNumberOfBytesForData, copyQuantization, printTensor.

Test coverage:
- test/unit/tensor/UnitTestTensorBool.c (new): 13 tests covering byte/bit counts, get/set round-trip, neighbor preservation, last-byte padding invariants, copyTensor, printTensor smoke
- test/unit/tensor/UnitTestTensorApi.c: tensorFillFromBoolBuffer round-trip across byte boundary (N=12)
- test/unit/tensor/UnitTestTensor.c: 4 SYM regression tests (qBits ∈ {3,5}, N ∈ {1,4,10})
- Mutation-tested per spec §7.3: all 7 mutations caught.

Bundled SYM-Lückenschluss bugfix: calcBytesPerElement, calcBitsPerElement, and calcNumberOfBytesForData previously hit default: PRINT_ERROR; exit(1) for SYM tensors. SYM cases added; math mirrors ASYM (ceilf for byte count, qBits for bit count, (N*qBits + 7) / 8 for total bytes).

Closes #169
Closes #170
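
The total-byte formula (N*qBits + 7) / 8 quoted above is integer division rounded up. The sketch below shows it next to a truncating variant of the kind the "truncated divide" mutation introduces. Both function names are hypothetical, and treating BOOL as the qBits = 1 case is an assumption made for illustration, not a statement about how calcNumberOfBytesForData is written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total bytes for n elements of qBits bits each,
 * rounded up to a whole byte. */
static size_t packedBytesCeil(size_t n, size_t qBits) {
    assert(qBits > 0);
    return (n * qBits + 7) / 8;
}

/* Truncating variant, shown only for contrast: it undercounts whenever
 * n * qBits is not a multiple of 8. */
static size_t packedBytesTruncated(size_t n, size_t qBits) {
    assert(qBits > 0);
    return (n * qBits) / 8;
}
```

For three 1-bit elements the rounded-up form yields 1 byte while the truncating form yields 0, which is why a test at N=3 is enough to catch that mutation. For qBits = 3 and N = 10 the rounded-up form yields 4 bytes (30 bits).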

LeoBuron force-pushed the bool-tensor-dtype branch from 3be2396 to 3dc0d71 Compare May 12, 2026 19:19

LeoBuron merged commit 3dc0d71 into develop May 15, 2026
5 checks passed

LeoBuron deleted the bool-tensor-dtype branch May 15, 2026 09:12

LeoBuron merged 1 commit into develop from bool-tensor-dtype
