fix: correct total size in AGI bitfield compression header (closes #659) #663

Closed
botbikamordehai2-sketch wants to merge 1 commit into silx-kit:main from botbikamordehai2-sketch:fix/issue-659-1782558137

@botbikamordehai2-sketch

What

The compress function in agi_bitfield.py computed the data size header before writing the row_start table, causing the stored size to be smaller than the actual compressed data. This led to incorrect shape reconstruction when decompressing, resulting in errors like array shape mismatch (e.g., (1,2) vs (2,)) in tests like test_get_data_edf.

Fix

Moved the writing of the row_start table before capturing the total buffer size. Now data_size includes the row_start table, ensuring the header accurately reflects the full compressed data length.
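
To make the two orderings concrete, here is a minimal sketch. It is not fabio's code: `pack_block` is a hypothetical helper, and the layout (a little-endian uint32 size, the row payloads, then a table of uint32 row offsets) is an assumption made for illustration only.

```python
import struct
from io import BytesIO


def pack_block(rows, size_includes_table):
    """Hypothetical packer: <uint32 size><row payloads><uint32 row offsets>."""
    buffer = BytesIO()
    row_starts = []
    for row in rows:
        row_starts.append(buffer.tell())  # offset of this row in the payload
        buffer.write(row)
    size_before_table = buffer.tell()     # size captured before the table is written
    buffer.write(struct.pack(f"<{len(row_starts)}I", *row_starts))
    size_after_table = buffer.tell()      # size captured after the table is written
    data_size = size_after_table if size_includes_table else size_before_table
    return struct.pack("<I", data_size) + buffer.getvalue()


rows = [b"\x01\x02\x03", b"\x04\x05"]
before = pack_block(rows, size_includes_table=False)
after = pack_block(rows, size_includes_table=True)
size_before = struct.unpack("<I", before[:4])[0]
size_after = struct.unpack("<I", after[:4])[0]
```

The two blocks differ only in the 4-byte header value, so which ordering is right depends entirely on what readers of the format expect that value to cover.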

Closes #659

@kif kif mentioned this pull request Jun 30, 2026
@kif

kif commented Jun 30, 2026

Member

Interesting, indeed. Look at the file reference.esperanto, which uses this compression scheme and was written by Crysalis itself.

In [1]: import fabio
In [3]: f=fabio.open("reference.esperanto")
In [4]: g=fabio.esperantoimage.EsperantoImage()
In [8]: infile = g._open("reference.esperanto")
In [9]: g._readheader(infile)
In [10]: infile.tell()
Out[10]: 6400

In [11]: raw_data = infile.read()
In [12]: raw_data[:4]
Out[12]: b'\xb4u\x00\x00'

In [14]: import struct
In [15]: data_size = struct.unpack("<I", raw_data[:4])

In [16]: data_size
Out[16]: (30132,)

In [17]: len(raw_data)
Out[17]: 31744

In [18]: data_size = struct.unpack("<I", raw_data[:4])[0]

In [20]: from io import BytesIO
In [21]: data_block = BytesIO(raw_data[4:])
In [22]: import numpy
In [24]: output = numpy.zeros(g.shape, dtype=numpy.int32)
In [25]: g.shape
Out[25]: (256, 256)
In [26]: row_count, col_count = g.shape
In [28]: from fabio.compression.agi_bitfield import decompress_row

In [29]: for row_index in range(row_count):
    ...:     output[row_index] = decompress_row(data_block, col_count)
    ...: 

In [31]: data_block.tell()
Out[31]: 30132

As one can see, the value stored in the header of the raw data block (30132) matches the stream position after all rows of the image have been read. The stored size therefore excludes its own 4 bytes, and the current implementation is correct.
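
The check done by hand in the session above can be wrapped in a small function. This is only a sketch: `check_size_header` and `read_two_rows` are hypothetical names, and the synthetic block stands in for the real file, where the row reader would be the `decompress_row` loop.

```python
import struct
from io import BytesIO


def check_size_header(raw_data, read_rows):
    """Return (stored_size, consumed, trailing) for <uint32 size><rows><rest>."""
    stored_size = struct.unpack("<I", raw_data[:4])[0]
    data_block = BytesIO(raw_data[4:])       # skip the 4-byte size field
    read_rows(data_block)                    # consume every row of the image
    consumed = data_block.tell()             # bytes used by the row data
    trailing = len(raw_data) - 4 - consumed  # bytes left after the rows
    return stored_size, consumed, trailing


# Synthetic block (assumption): two 3-byte rows followed by 8 trailing bytes.
payload = b"\x01\x02\x03\x04\x05\x06"
raw = struct.pack("<I", len(payload)) + payload + b"\x00" * 8


def read_two_rows(stream):
    """Stand-in for the decompress_row loop: reads two fixed-size rows."""
    for _ in range(2):
        stream.read(3)


result = check_size_header(raw, read_two_rows)
```

With the numbers from the session, the same arithmetic gives 31744 - 4 - 30132 = 1608 bytes after the row data; the session does not show what those bytes contain.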

**This is the second useless pull request you have opened; a third one and you will be banned from all our projects.**

@kif kif closed this Jun 30, 2026

Development

Successfully merging this pull request may close these issues.

Regression with Fabio 026.06

2 participants