feat(tensor): add BOOL bit-packed dtype + SYM Lückenschluss #173
Merged
Adds BOOL as a sibling qtype with 1-bit-per-element bit-packed storage. This is a prerequisite for the Dropout layer (#149), where the per-element mask kept between forward and backward needs ~32× less memory than FLOAT32.

Storage layout: LSB-first within each byte, matching the existing getBitmask / readByte / writeByte convention. Last-byte padding bits remain zero by construction. qConfig is NULL (no scale / zero-point).

New API:
- BOOL value in qtype_t enum (at end, preserving existing ordinals)
- initBoolQuantization (stack-fill) + quantizationInitBool (heap, Two-Tier convention; mirrors quantizationInitFloat / Int32 / SymInt32 / Asym)
- tensorBoolGet / tensorBoolSet bit-level accessors (storage-order flat index; non-BOOL tensors trip PRINT_ERROR + exit)
- tensorFillFromBoolBuffer userApi helper (caller-owned bool source, count + type validation, packs via tensorBoolSet)

Wired into five existing dispatch switches in src/tensor/Tensor.c: calcBytesPerElement, calcBitsPerElement, calcNumberOfBytesForData, copyQuantization, printTensor.

Test coverage:
- test/unit/tensor/UnitTestTensorBool.c (new): 13 tests covering byte/bit counts, get/set round-trip, neighbor preservation, last-byte padding invariants, copyTensor, printTensor smoke
- test/unit/tensor/UnitTestTensorApi.c: tensorFillFromBoolBuffer round-trip across byte boundary (N=12)
- test/unit/tensor/UnitTestTensor.c: 4 SYM regression tests (qBits ∈ {3,5}, N ∈ {1,4,10})
- Mutation-tested per spec §7.3: all 7 mutations caught.

Bundled SYM-Lückenschluss (gap-closing) bugfix: calcBytesPerElement, calcBitsPerElement, and calcNumberOfBytesForData previously hit default: PRINT_ERROR; exit(1) for SYM tensors. SYM cases added; math mirrors ASYM (ceilf for byte count, qBits for bit count, (N*qBits + 7) / 8 for total bytes).

Closes #169
Closes #170
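The total-byte arithmetic quoted for the SYM case, (N*qBits + 7) / 8, is a round-up integer division, and the same formula with 1 bit per element gives the BOOL buffer size. A minimal sketch, assuming a free-standing helper: `packedBytes` is a hypothetical name used for illustration only and is not the project's calcNumberOfBytesForData.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the PR): number of bytes needed to store
 * numElements values of bitsPerElement bits each in a bit-packed buffer.
 * Adding 7 before dividing by 8 rounds up, so a partially used last byte
 * is still counted. */
static size_t packedBytes(size_t numElements, size_t bitsPerElement) {
    assert(bitsPerElement >= 1 && bitsPerElement <= 32);
    return (numElements * bitsPerElement + 7) / 8;
}

/* Hypothetical helper: the same computation with a truncating divide.
 * This is the kind of bug the "BOOL truncated divide" mutation models:
 * the partially used last byte is dropped. */
static size_t packedBytesTruncated(size_t numElements, size_t bitsPerElement) {
    return (numElements * bitsPerElement) / 8;
}
```

For example, `packedBytes(10, 3)` yields 4 (30 bits) and `packedBytes(3, 1)` yields 1, whereas the truncating variant yields 0 for three BOOL elements. A 1024-element mask takes `packedBytes(1024, 1)` = 128 bytes, against 4096 bytes as FLOAT32, which is where the ~32× figure comes from.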
Summary
- Adds BOOL value to qtype_t representing 1-bit-per-element bit-packed storage.
- Adds initBoolQuantization (stack-fill) and quantizationInitBool (heap, Two-Tier convention).
- Adds tensorBoolGet / tensorBoolSet bit accessors and tensorFillFromBoolBuffer userApi helper.
- Wires BOOL into five dispatch switches in src/tensor/Tensor.c (calcBytesPerElement, calcBitsPerElement, calcNumberOfBytesForData, copyQuantization, printTensor).
- Bundles the SYM-Lückenschluss fix: calcBytesPerElement, calcBitsPerElement, and calcNumberOfBytesForData previously hit default: PRINT_ERROR; exit(1) for SYM tensors.

Architecture
BOOL is bit-packed (1 bit per element, LSB-first within each byte, matching the existing getBitmask / readByte / writeByte convention). Last-byte padding is always zero by construction (init zeroes the buffer, set/get touch only the target bit). qConfig is NULL for BOOL (no scale/zero-point parameters).

This is the prerequisite for the upcoming Dropout layer (#149): the per-element mask kept between forward and backward needs ~32× less memory when bit-packed than as FLOAT32.
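The LSB-first layout and the padding invariant can be illustrated on a raw byte buffer. This is a sketch under stated assumptions, not the code in this PR: `packedBoolGet`, `packedBoolSet`, and `packedBoolFill` are hypothetical stand-ins for tensorBoolGet, tensorBoolSet, and tensorFillFromBoolBuffer, which operate on tensors and also validate the qtype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for tensorBoolGet: element `index` lives in byte
 * index / 8, at bit position index % 8 counted from the least significant bit. */
static bool packedBoolGet(const uint8_t *data, size_t numElements, size_t index) {
    assert(index < numElements);
    return ((data[index / 8] >> (index % 8)) & 1u) != 0;
}

/* Hypothetical stand-in for tensorBoolSet: only the target bit is modified,
 * so neighbouring elements and last-byte padding bits are left untouched. */
static void packedBoolSet(uint8_t *data, size_t numElements, size_t index, bool value) {
    assert(index < numElements);
    uint8_t mask = (uint8_t)(1u << (index % 8));
    if (value) {
        data[index / 8] |= mask;
    } else {
        data[index / 8] &= (uint8_t)~mask;
    }
}

/* Hypothetical stand-in for tensorFillFromBoolBuffer: packs a caller-owned
 * bool array into the buffer one element at a time via the setter. */
static void packedBoolFill(uint8_t *data, size_t numElements, const bool *source) {
    for (size_t i = 0; i < numElements; i++) {
        packedBoolSet(data, numElements, i, source[i]);
    }
}
```

With a zero-initialised 2-byte buffer and 12 elements, setting elements 0, 2, 3, 8, and 11 gives the bytes 0x0D and 0x09; the upper four bits of the second byte are padding and stay zero because no setter call ever addresses them.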
Test Plan
- test/unit/tensor/UnitTestTensorBool.c (new): 13 tests covering byte/bit counts, accessors, padding invariants, copyTensor, printTensor.
- test/unit/tensor/UnitTestTensor.c: 4 SYM regression tests for the three calc functions.
- test/unit/tensor/UnitTestTensorApi.c: tensorFillFromBoolBuffer round-trip test.
- ctest 43/43 green on macOS host.

Verified Mutations
| Mutation | Caught by |
| --- | --- |
| tensorBoolSet bit-index off-by-one | test_tensorBoolSetGet_RoundTrip_KnownPattern |
| tensorBoolGet bit-index off-by-one | test_tensorBoolSetGet_RoundTrip_KnownPattern |
| calcNumberOfBytesForData BOOL truncated divide | test_calcNumberOfBytesForData_Bool_N3 |
| calcBitsPerElement SYM case removed | test_calcBitsPerElement_Sym_qBits3 |
| tensorFillFromBoolBuffer loop no-op | testTensorFillFromBoolBuffer_RoundTrip_N12 |
| copyQuantization BOOL case removed | test_copyTensorBool_AllBitsMatch |
| tensorBoolSet byte-index off-by-one | testTensorFillFromBoolBuffer_RoundTrip_N12 |

References
Closes #169 (BOOL bit-packed dtype implementation)
Closes #170 (SYM-Lückenschluss in three calc functions)
See also #160 (N=0 latent hazard — behavior inherited unchanged)
See also #171 (copyQuantization missing INT32/SYM/ASYM cases; separate scope, this PR adds only BOOL)
See also #172 (calcBytesPerTensor truncates for sub-byte dtypes; discovered during review, pre-existing)

Note: Closes #NN auto-close only triggers on main-merges; develop-merges need manual gh issue close per repo convention.

🤖 Generated with Claude Code