Bytes transform.

[plum.bytes] Module Reference

The plum.bytes module provides the BytesX transform, which transfers bytes into and out of bytes buffers, for example when packing or unpacking structures whose members hold raw bytes. This reference page demonstrates creating and using a BytesX transform and provides API details.

The examples shown on this page require the following setup:

>>> from plum.bytes import BytesX
>>> from plum.bigendian import uint8
>>> from plum.structure import Structure, member, sized_member
>>> from plum.utilities import pack, unpack

Fixed Size

The BytesX transform's nbytes argument accepts an int and controls the number of bytes to transfer into or out of a bytes buffer. One use case is a structure in which one member holds raw, uninterpreted bytes. In the following example, with nbytes=4, the transform unpacks the first four bytes and transfers them into the rawbytes structure member (whose format is a fixed-size BytesX transform):

>>> bytes4 = BytesX(nbytes=4)
>>>
>>> class FixedStruct(Structure):
...     rawbytes: bytes = member(fmt=bytes4)
...     bookend: int = member(fmt=uint8)
...
>>> fixed_struct = FixedStruct.unpack(b'\x00\x01\x02\x03\x99')
>>> fixed_struct.dump()
+--------+----------+---------------------+-------------+-------------------------+
| Offset | Access   | Value               | Bytes       | Format                  |
+--------+----------+---------------------+-------------+-------------------------+
|        |          |                     |             | FixedStruct (Structure) |
|        | rawbytes |                     |             | bytes (fixed)           |
| 0      |   [0:4]  | b'\x00\x01\x02\x03' | 00 01 02 03 |                         |
| 4      | bookend  | 153                 | 99          | uint8                   |
+--------+----------+---------------------+-------------+-------------------------+
>>> fixed_struct.rawbytes
b'\x00\x01\x02\x03'

The bytes transform accepts any iterable of integers (e.g. bytes, bytearray, or a list of int) and transfers them into the bytes buffer. For example:

>>> pack(FixedStruct(rawbytes=[0, 1, 2, 3], bookend=0x99))
b'\x00\x01\x02\x03\x99'
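
As a rough mental model of the fixed-size behavior described above, the following plain-Python sketch converts an iterable of integers to bytes and consumes exactly nbytes from a buffer. It does not use plum and is not how the library is implemented; pack_fixed and unpack_fixed are hypothetical helper names, and treating a size mismatch as an error is an assumption.

```python
def pack_fixed(value, nbytes):
    """Model of packing: convert an iterable of ints to exactly nbytes bytes."""
    data = bytes(value)  # accepts bytes, bytearray, list of int, ...
    if len(data) != nbytes:
        # Assumption: with no pad configured, a size mismatch is an error.
        raise ValueError(f"expected {nbytes} bytes, got {len(data)}")
    return data


def unpack_fixed(buffer, nbytes, offset=0):
    """Model of unpacking: take nbytes from buffer, return (value, new offset)."""
    end = offset + nbytes
    if end > len(buffer):
        raise ValueError("insufficient bytes")
    return bytes(buffer[offset:end]), end


buffer = b'\x00\x01\x02\x03\x99'
rawbytes, offset = unpack_fixed(buffer, nbytes=4)  # first four bytes
bookend = buffer[offset]                           # the byte that follows
print(rawbytes, bookend)
```

The returned offset is what lets the next member (here bookend) start reading where the fixed-size member stopped.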

Padding

The BytesX transform's pad argument accepts a bytes object containing a single byte. The pad defaults to empty bytes (no padding), as in the previous section's example. When packing, the transform fills any missing bytes at the end with the pad byte:

>>> padded_bytes = BytesX(nbytes=4, pad=b'\x00')
>>>
>>> class PaddedStruct(Structure):
...     rawbytes: bytes = member(fmt=padded_bytes)
...     bookend: int = member(fmt=uint8)
...
>>> struct = PaddedStruct(rawbytes=[1, 2], bookend=0x99)
>>> struct.dump()
+--------+-----------+-------------+-------+--------------------------+
| Offset | Access    | Value       | Bytes | Format                   |
+--------+-----------+-------------+-------+--------------------------+
|        |           |             |       | PaddedStruct (Structure) |
|        | rawbytes  |             |       | bytes (fixed,padded)     |
| 0      |   [0:2]   | b'\x01\x02' | 01 02 |                          |
| 2      |   --pad-- | b'\x00\x00' | 00 00 |                          |
| 4      | bookend   | 153         | 99    | uint8                    |
+--------+-----------+-------------+-------+--------------------------+

When unpacking, the transform strips off any pad bytes found at the end:

>>> struct = PaddedStruct.unpack(b'\x01\x02\x00\x00\x99')
>>> struct.dump()
+--------+-----------+-------------+-------+--------------------------+
| Offset | Access    | Value       | Bytes | Format                   |
+--------+-----------+-------------+-------+--------------------------+
|        |           |             |       | PaddedStruct (Structure) |
|        | rawbytes  |             |       | bytes (fixed,padded)     |
| 0      |   [0:2]   | b'\x01\x02' | 01 02 |                          |
| 2      |   --pad-- | b'\x00\x00' | 00 00 |                          |
| 4      | bookend   | 153         | 99    | uint8                    |
+--------+-----------+-------------+-------+--------------------------+
>>> struct.rawbytes
b'\x01\x02'
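
The pad-on-pack and strip-on-unpack behavior shown above can be modeled in plain Python. This is only a sketch under stated assumptions, not plum's implementation: pack_padded and unpack_padded are hypothetical helper names, and rejecting oversized values is an assumption.

```python
def pack_padded(value, nbytes, pad=b'\x00'):
    """Model of packing: append the pad byte until the field is nbytes long."""
    data = bytes(value)
    if len(data) > nbytes:
        # Assumption: values longer than the field are rejected.
        raise ValueError("too many bytes")
    return data + pad * (nbytes - len(data))


def unpack_padded(buffer, nbytes, pad=b'\x00'):
    """Model of unpacking: read nbytes, then strip trailing pad bytes."""
    field = bytes(buffer[:nbytes])
    return field.rstrip(pad)


packed = pack_padded([1, 2], nbytes=4)
print(packed)                            # b'\x01\x02\x00\x00'
print(unpack_padded(packed, nbytes=4))   # b'\x01\x02'

# In this model, a value that itself ends in the pad byte does not
# survive the round trip, because stripping cannot tell data from padding.
lossy = unpack_padded(pack_padded(b'\x01\x00', nbytes=4), nbytes=4)
print(lossy)                             # b'\x01'
```

If stripping works as described above, a pad byte is best chosen so that it cannot appear at the end of valid data.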

Greedy Bytes

When the BytesX transform's nbytes argument is left at its default of None, the transform behaves greedily. When packing, the transform transfers the entire bytes sequence into the buffer without checking its size:

>>> greedy_bytes = BytesX()
>>>
>>> class GreedyStruct(Structure):
...     bookend: int = member(fmt=uint8)
...     rawbytes: bytes = member(fmt=greedy_bytes)
...
>>> struct = GreedyStruct(bookend=0x99, rawbytes=[1, 2, 3, 4])
>>> struct.dump()
+--------+----------+---------------------+-------------+--------------------------+
| Offset | Access   | Value               | Bytes       | Format                   |
+--------+----------+---------------------+-------------+--------------------------+
|        |          |                     |             | GreedyStruct (Structure) |
| 0      | bookend  | 153                 | 99          | uint8                    |
|        | rawbytes |                     |             | bytes (greedy)           |
| 1      |   [0:4]  | b'\x01\x02\x03\x04' | 01 02 03 04 |                          |
+--------+----------+---------------------+-------------+--------------------------+
>>> pack(struct)
b'\x99\x01\x02\x03\x04'

When unpacking, the transform transfers the remaining bytes in the buffer. Within a structure, the greedy member must be last. Otherwise, without special protections, any members following it would starve and cause an UnpackError to be raised:

>>> struct = GreedyStruct.unpack(b'\x99\x00\x01\x02\x03')
>>> struct.rawbytes
b'\x00\x01\x02\x03'
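
To see why a greedy member must come last, consider this plain-Python model of sequential unpacking. It is an illustrative sketch only: unpack_members is a hypothetical helper, ValueError stands in for the library's UnpackError, and a size of None marks a greedy member.

```python
def unpack_members(buffer, members):
    """Unpack members in order; members is a list of (name, nbytes) pairs.

    nbytes=None marks a greedy member that consumes the rest of the buffer.
    """
    values, offset = {}, 0
    for name, nbytes in members:
        end = len(buffer) if nbytes is None else offset + nbytes
        if end > len(buffer):
            raise ValueError(f"insufficient bytes for {name!r}")
        values[name] = bytes(buffer[offset:end])
        offset = end
    return values


buffer = b'\x99\x00\x01\x02\x03'

# Greedy member last: the fixed member is served first, greed takes the rest.
good = unpack_members(buffer, [("bookend", 1), ("rawbytes", None)])
print(good)

# Greedy member first: it consumes everything and the next member starves.
try:
    unpack_members(buffer, [("rawbytes", None), ("bookend", 1)])
    starved = None
except ValueError as exc:
    starved = str(exc)
print(starved)
```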

Automatically Sized Bytes

True to life, greed becomes useful when kept in check. The sized_member() function, which defines a Structure member, accepts a greedy transform (or data store class) as the fmt. When unpacking, the member property keeps the greed in check by limiting the buffer bytes available for consumption to the size held by a separate member of the structure (the size argument of sized_member() specifies which member the size comes from):

>>> class SizedStruct(Structure):
...     size: int = member(fmt=uint8, compute=True)
...     rawbytes: bytes = sized_member(fmt=greedy_bytes, size=size)
...     bookend: int = member(fmt=uint8)
...
>>> struct = unpack(SizedStruct, b'\x01\x02\x99')
>>> struct.dump()
+--------+----------+---------+-------+-------------------------+
| Offset | Access   | Value   | Bytes | Format                  |
+--------+----------+---------+-------+-------------------------+
|        |          |         |       | SizedStruct (Structure) |
| 0      | size     | 1       | 01    | uint8                   |
|        | rawbytes |         |       | bytes (greedy)          |
| 1      |   [0:1]  | b'\x02' | 02    |                         |
| 2      | bookend  | 153     | 99    | uint8                   |
+--------+----------+---------+-------+-------------------------+

Passing compute=True when defining the size member allows the size member to be left uninitialized when constructing the structure. When packing, the size member is computed automatically, in this case from the length of the rawbytes provided:

>>> struct = SizedStruct(rawbytes=[0] * 8, bookend=0x99)
>>> struct.dump()
+--------+----------+-------------------------------------+-------------------------+-------------------------+
| Offset | Access   | Value                               | Bytes                   | Format                  |
+--------+----------+-------------------------------------+-------------------------+-------------------------+
|        |          |                                     |                         | SizedStruct (Structure) |
|  0     | size     | 8                                   | 08                      | uint8                   |
|        | rawbytes |                                     |                         | bytes (greedy)          |
|  1     |   [0:8]  | b'\x00\x00\x00\x00\x00\x00\x00\x00' | 00 00 00 00 00 00 00 00 |                         |
|  9     | bookend  | 153                                 | 99                      | uint8                   |
+--------+----------+-------------------------------------+-------------------------+-------------------------+
>>> pack(struct)
b'\x08\x00\x00\x00\x00\x00\x00\x00\x00\x99'
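
The size-prefixed layout used by SizedStruct can be modeled by hand in plain Python. The sketch below mirrors the two examples above without using plum; unpack_sized and pack_sized are hypothetical helper names, and the one-byte size prefix corresponds to the uint8 size member.

```python
def unpack_sized(buffer):
    """Model of unpacking: read the size byte, then limit rawbytes to it."""
    size = buffer[0]
    end = 1 + size
    if end + 1 > len(buffer):
        raise ValueError("insufficient bytes")
    rawbytes = bytes(buffer[1:end])  # greed capped at `size` bytes
    bookend = buffer[end]            # the member after the sized one
    return size, rawbytes, bookend


def pack_sized(rawbytes, bookend):
    """Model of packing: compute the size byte from the data length."""
    data = bytes(rawbytes)
    # bytes([n]) raises ValueError if n does not fit in one byte (n > 255).
    return bytes([len(data)]) + data + bytes([bookend])


print(unpack_sized(b'\x01\x02\x99'))   # (1, b'\x02', 153)
print(pack_sized([0] * 8, 0x99))
```

Because the size is read before the raw bytes, members that follow the sized member (here bookend) no longer starve.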

See the Sized Structure Member tutorial for additional features of the sized_member() function, such as specifying size ratios and offsets.

API Reference

class plum.bytes.BytesX(nbytes: Optional[int] = None, pad: bytes = b'', name: Optional[str] = None)

Bytes transform.

name

Transform format name (for repr and dump “Format” column).

nbytes

Transform format size in bytes.

pad

Pad byte.

pack(value: Any) → bytes

Pack value as formatted bytes.

Raises: PackError if type error, value error, etc.

pack_and_dump(value: Any) → Tuple[bytes, plum.dump.Dump]

Pack value as formatted bytes and produce bytes summary.

Raises: PackError if type error, value error, etc.

unpack(buffer: bytes) → Any

Unpack value from formatted bytes.

Raises: UnpackError if insufficient bytes, excess bytes, or value error.

unpack_and_dump(buffer: bytes) → Tuple[Any, plum.dump.Dump]

Unpack value from bytes and produce packed bytes summary.

Raises: UnpackError if insufficient bytes, excess bytes, or value error.