Implement sexed-dtype to avoid byte-swapping by kif · Pull Request #647 · silx-kit/fabio

kif · 2026-06-16T16:03:57Z

Lots of modifications all over the code ... and in addition, we only have access to little-endian computers.

Close #310

Claude-code used

kif · 2026-06-17T16:09:15Z

Hi reviewers,
For this silx dev week I decided to set Claude on a 7-year-old issue, reported at that time by Peter Boesecke.
Claude managed to migrate the whole project from swapping bytes to directly reading/writing data with an "endianness-aware data type"; it was nevertheless a day of work. It took me as long to clean up the code manually. There could be some remaining inconsistencies, so thanks for the review.
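For readers new to the idea, here is a minimal sketch of the two strategies side by side. It is not code from this PR: the sample bytes and the variable names are made up for illustration, and only plain NumPy calls are used.

```python
import numpy as np

# Two big-endian uint16 values (1 and 258), as they would sit in a file.
raw = (1).to_bytes(2, "big") + (258).to_bytes(2, "big")

# Old strategy: read with the native dtype, then swap bytes when the
# host byte order differs from the file byte order.
swapped = np.frombuffer(raw, dtype=np.uint16).copy()
if np.little_endian:
    swapped.byteswap(True)  # in-place swap, only needed on little-endian hosts

# New strategy: describe the file byte order in the dtype itself.
# No copy and no conditional swap are needed.
typed = np.frombuffer(raw, dtype=np.dtype(">u2"))

print(swapped.tolist(), typed.tolist())
```

Both arrays hold the same values on any host; the second form simply moves the byte-order decision from a runtime branch into the dtype.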

woutdenolf · 2026-06-17T19:40:26Z

I see one byteswap call left. Less crucial but good to stay consistent.

fabio/src/fabio/test/codecs/test_brukerimage.py

Lines 60 to 65 in d474564

    
    MYIMAGE = numpy.ones((256, 256), numpy.uint16) * 16
    MYIMAGE[0, 0] = 0
    MYIMAGE[1, 1] = 32
    MYIMAGE[127:129, 127:129] = 65535
    if not numpy.little_endian:
        MYIMAGE.byteswap(True)

    LE_uint16 = np.dtype("uint16").newbyteorder("<")
    MYIMAGE = np.full((256, 256), 16, dtype=LE_uint16)
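A self-contained version of this suggestion could look like the sketch below. It assumes the test only needs the in-memory bytes to be little-endian; the lowercase names are illustrative and not the ones used in `test_brukerimage.py`.

```python
import numpy as np

# Explicit little-endian uint16, independent of the host byte order.
le_uint16 = np.dtype("uint16").newbyteorder("<")

image = np.full((256, 256), 16, dtype=le_uint16)
image[0, 0] = 0
image[1, 1] = 32
image[127:129, 127:129] = 65535

# The serialized bytes are little-endian on every platform,
# so no conditional byteswap is required before writing them out.
raw = image.tobytes()
print(image.dtype.str, len(raw))
```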

woutdenolf · 2026-06-17T19:48:07Z

    :param version: PCK version 1 or 2
    :param normal_start: position of the normal value section (can be auto-guessed)
-    :param swap_needed: set to True when reading data from a foreign endianness (little on big or big on little)
+    :param byteorder: set to ">" for big=endian or "<" for little-endian data decompression


https://numpy.org/doc/2.1/reference/generated/numpy.dtype.newbyteorder.html

There are more. Perhaps reference numpy's newbyteorder instead.

To my knowledge, there are only two types of byte order; all the others fall back on one of the two supported.

This is a Python adapter. We could add support for the others, but:

"|" makes no sense: the data types are all longer than 1 byte

"=" is what we want to prevent, because we would not have to manage it if we could

"S" makes no sense here either, since we have no starting point.
Actually, none of the alternative solutions makes sense in this function, so I would prefer keeping it simple: "<" or ">"

OK for correcting the typo
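To make the argument concrete, here is a small sketch of how NumPy interprets the same two bytes under the different byte-order codes. The sample buffer and variable names are assumptions for illustration only.

```python
import sys
import numpy as np

raw = b"\x01\x02"

# "<" and ">" fully determine how the bytes are interpreted.
little_value = int(np.frombuffer(raw, dtype="<u2")[0])  # 0x0201
big_value = int(np.frombuffer(raw, dtype=">u2")[0])     # 0x0102

# "=" depends on the machine running the code, which is exactly
# what a file decoder cannot rely on.
native_value = int(np.frombuffer(raw, dtype="=u2")[0])

# "|" is what NumPy reports for types where byte order is irrelevant,
# such as single-byte integers.
single_byte_code = np.dtype("u1").byteorder

print(little_value, big_value, native_value, single_byte_code)
```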

woutdenolf · 2026-06-17T19:50:56Z

    :param version: PCK version 1 or 2
    :param normal_start: position of the normal value section (can be auto-guessed)
-    :param swap_needed: set to True when reading data from a foreign endianness (little on big or big on little)
+    :param byteorder: set to ">" to decompress big-endian data else "<" for little-endian


woutdenolf · 2026-06-17T19:55:32Z

        self.numhigh = None
        self.numpixels = None
-        self.swap_needed = None
+        self.byteorder = "="


https://numpy.org/doc/2.1/reference/generated/numpy.dtype.newbyteorder.html

‘S’ - swap dtype from current to opposite endian

{‘<’, ‘little’} - little endian

{‘>’, ‘big’} - big endian

{‘=’, ‘native’} - native order

{‘|’, ‘I’} - ignore (no change to byte order)

To keep the same behavior this would be "|" instead of "=" afaiu.
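The difference between these codes can be checked with a short sketch. It only uses `numpy.dtype.newbyteorder` as documented at the link above; the variable names are illustrative.

```python
import sys
import numpy as np

native_code = "<" if sys.byteorder == "little" else ">"
big = np.dtype(">u2")

# "|" leaves the byte order of the dtype untouched.
ignored = big.newbyteorder("|")

# "=" forces the native byte order of the running machine.
forced_native = big.newbyteorder("=")

# "S" flips whatever byte order the dtype currently has.
swapped = big.newbyteorder("S")

print(ignored.str, forced_native.str, swapped.str)
```

So a default of "=" rewrites any dtype to native order, whereas "|" is the only code that leaves an already-ordered dtype as it is.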

woutdenolf · 2026-06-17T20:03:45Z

-            if self.swap_needed():
-                data.byteswap(True)
+            stype = self.get_stype(self._dtype, self._data_byteorder)
+            print(stype)


Suggested change

print(stype)

woutdenolf · 2026-06-17T20:18:44Z

            raw = f.read(frame.blobsize)
        try:
-            data = numpy.frombuffer(raw, dtype=self.bytecode).copy()
+            file_endianness = "big" if (numpy.little_endian == bool(frame.swap_needed())) else "little"


Ok I'm getting too old for this

    numpy.little_endian = machine_is_little
    swap_needed = machine_endianness != file_endianness
    file_is_big = machine_is_little == swap_needed
    file_is_big = machine_is_little == (machine_endianness != file_endianness)
    file_is_big = file_endianness == BIG
    file_endianness = "big" if file_is_big else "little"

This would be much simpler:

    file_endianness = (
        "little" if frame._data_byteorder is ENDIANNESS.LITTLE else "big"
    )
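The derivation above can be checked exhaustively over the four machine/file combinations. This is a sketch: `Endianness` is a hypothetical stand-in for fabio's `ENDIANNESS`, and the two helper functions are written here only for the comparison.

```python
import enum
import itertools


class Endianness(enum.Enum):
    """Hypothetical stand-in for fabio's ENDIANNESS enum."""
    LITTLE = "<"
    BIG = ">"


def from_swap_flag(machine_is_little, swap_needed):
    # Expression from the diff: go through the swap flag.
    return "big" if (machine_is_little == bool(swap_needed)) else "little"


def from_byteorder(data_byteorder):
    # Simpler expression: read the stored file byte order directly.
    return "little" if data_byteorder is Endianness.LITTLE else "big"


results = []
for machine_is_little, file_order in itertools.product((True, False), Endianness):
    file_is_little = file_order is Endianness.LITTLE
    swap_needed = machine_is_little != file_is_little
    results.append(
        (from_swap_flag(machine_is_little, swap_needed), from_byteorder(file_order))
    )

print(results)
```

Both expressions agree in all four cases, so the simpler one can replace the other without changing behavior, provided the frame's byte order is always LITTLE or BIG.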

woutdenolf · 2026-06-17T20:24:18Z

    datatype="int",
    signed="n",
-    swap="n",
+    byteorder="<",


I think this should be "|" if we want to keep the same behavior.

Without a "reference point" one cannot keep anything!

woutdenolf

I understand why it took you a day. Jeeeezus. I did a pass but I suggest a third person goes through the PR as well.

kif · 2026-06-17T20:44:02Z

Thanks @woutdenolf, you spotted some that slipped through ...

payno · 2026-06-19T07:39:34Z

+    @deprecation.deprecated
+    def swap_needed(self):
+        """
+        Decide if we need to byteswap
+        """
+        if self._data_byteorder is ENDIANNESS.LITTLE and numpy.little_endian:
+            return False
+        elif self._data_byteorder is ENDIANNESS.BIG and numpy.little_endian:
+            return True
+        elif self._data_byteorder is ENDIANNESS.LITTLE and not numpy.little_endian:
+            return True
+        elif self._data_byteorder is ENDIANNESS.BIG and not numpy.little_endian:
+            return False
+        else:
+            logger.warning("Unconsistent endianness !!!")


As byteorder can only be "little" or "big" I think the following would simplify the code:
I think _data_byteorder can be "|" (does not matter) or None. Not sure the "does not matter" case is handled today.

    def swap_needed(self):
        """
        Decide if we need to byteswap
        """
        if self._data_byteorder is ENDIANNESS.LITTLE:
            return not numpy.little_endian
        elif self._data_byteorder is ENDIANNESS.BIG:
            return numpy.little_endian
        else:
            logger.warning("Unconsistent endianness !!!")

Thanks Henri
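As a sanity check on the proposed simplification, the sketch below compares the four-branch and two-branch forms as free functions. `Endianness` is a hypothetical stand-in for fabio's `ENDIANNESS`, the host byte order is passed as an argument so both cases can be exercised on one machine, and the warning is replaced by returning `None`.

```python
import enum


class Endianness(enum.Enum):
    """Hypothetical stand-in for fabio's ENDIANNESS enum."""
    LITTLE = "<"
    BIG = ">"


def swap_needed_long(data_byteorder, host_is_little):
    # Four explicit branches, as in the diff.
    if data_byteorder is Endianness.LITTLE and host_is_little:
        return False
    elif data_byteorder is Endianness.BIG and host_is_little:
        return True
    elif data_byteorder is Endianness.LITTLE and not host_is_little:
        return True
    elif data_byteorder is Endianness.BIG and not host_is_little:
        return False
    return None  # unknown or irrelevant byte order


def swap_needed_short(data_byteorder, host_is_little):
    # Two branches, as in the suggestion.
    if data_byteorder is Endianness.LITTLE:
        return not host_is_little
    if data_byteorder is Endianness.BIG:
        return host_is_little
    return None  # unknown or irrelevant byte order


cases = [
    (order, host)
    for order in (Endianness.LITTLE, Endianness.BIG, None)
    for host in (True, False)
]
agree = all(swap_needed_long(o, h) == swap_needed_short(o, h) for o, h in cases)
print(agree)
```

Note that both forms fall through to `None` when the byte order is neither LITTLE nor BIG, so a caller testing the result for truthiness would treat that case as "no swap".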

kif added 16 commits June 16, 2026 17:55

implement sexed datatypes to avoid swapping endianness

95bc5f2

implement some fileformats

0cebe9f

implement test

10843ca

WIP, this one is broken :(

70186a1

Merge remote-tracking branch 'upstream/main' into 310_sexed_dtype

8b5423e

Merge branch 'main' into 310_sexed_dtype

de5360f

migrate several codecs

6a720b1

Claude-code used

rename sexed_dtype by stype

6c90b2c

fix endianness managerment in agi-bitfield

e6870b6

upgrade bruckerimage

f70343e

big modification made by Claude ... deserves review

fa49d77

review code and clean up after Claude ...

9decd52

review dtreck byteswapping strategy

06acd49

polish edf reading

d02e4a8

review all remaining files

8eb14d1

ruff checking + formating

52cf86c

kif requested review from jonwright, payno, t20100 and woutdenolf June 17, 2026 16:05

kif added the ready to merge label Jun 17, 2026

kif added this to the v2026.6 milestone Jun 17, 2026

woutdenolf reviewed Jun 17, 2026

kif added 2 commits June 18, 2026 10:08

correct mistakes spotted by Wout

10f3aad

deprecate unused public methods

bd3cb57

payno reviewed Jun 19, 2026

t20100 approved these changes Jun 19, 2026

Comment thread src/fabio/dm3image.py Outdated

kif added 4 commits June 19, 2026 10:41

Update edfimage.py

b931768

Thanks Henri

Update byte order for numpy data types in dm3image.py

e7f0fb8

typo

7c5d0d9

Fix indentation for byteorder property

50da3e6

kif merged commit 7c29891 into silx-kit:main Jun 19, 2026
6 checks passed
