[plum] Tutorial: Unpacking Bytes

This tutorial demonstrates the various methods for unpacking Python objects from a sequence of bytes.

The tutorial examples use the following setup:

>>> from plum.bigendian import uint8, uint16
>>> from plum.buffer import Buffer
>>> from plum.structure import Structure, member
>>> from plum.utilities import unpack
>>>
>>> class MyStruct(Structure):
...     m1: int = member(fmt=uint16)
...     m2: int = member(fmt=uint8)
...

Unpack Utility Function

The following example shows a simple use of the unpack() utility function for unpacking a Python object from a bytes sequence.

>>> unpack(uint16, b'\x00\x01')
1

The API reference and tutorial pages for unpack() cover its variations in detail, so they are not repeated here.

Transform Unpack Method

All plum transforms support an unpack() method that accepts a bytes sequence (the buffer) and produces a Python object according to the properties of the transform:

>>> # transform instance method
>>> uint16.unpack(b'\x00\x01')
1
>>> # data store transform class method
>>> MyStruct.unpack(b'\x00\x01\x02')
MyStruct(m1=1, m2=2)
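
To see why those three bytes yield m1=1 and m2=2, the decoding can be reproduced by hand with the standard library. This is only a sketch of the byte layout, assuming (as the plum.bigendian module name suggests) that uint16 is a 2-byte big-endian unsigned integer and uint8 a single unsigned byte. It is not how plum performs the unpacking internally.

```python
data = b'\x00\x01\x02'

# First two bytes: big-endian unsigned 16-bit value (the m1 member).
m1 = int.from_bytes(data[0:2], byteorder='big', signed=False)

# Next single byte: unsigned 8-bit value (the m2 member).
m2 = int.from_bytes(data[2:3], byteorder='big', signed=False)

# Every byte is accounted for: 2 bytes for m1 plus 1 byte for m2.
consumed = 2 + 1
leftover = len(data) - consumed
```

Here m1 is 1, m2 is 2, and leftover is 0, which matches the structure shown above and explains why no excess-bytes error occurs.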

Incremental Unpacking

The unpack methods described in the previous sections raise an UnpackError when the sequence contains more bytes than the transform needs to unpack the Python object:

>>> buffer = b'\x00\x01'
>>> uint8.unpack(buffer)
Traceback (most recent call last):
    ...
plum.exceptions.UnpackError:
<BLANKLINE>
+--------+----------------+-------+--------+
| Offset | Value          | Bytes | Format |
+--------+----------------+-------+--------+
| 0      | 0              | 00    | uint8  |
+--------+----------------+-------+--------+
| 1      | <excess bytes> | 01    |        |
+--------+----------------+-------+--------+
<BLANKLINE>
ExcessMemoryError occurred during unpack operation:
<BLANKLINE>
1 unconsumed bytes

The plum.buffer module offers the Buffer class for incrementally unpacking Python objects from a bytes sequence. Start by passing the bytes sequence to the constructor. Then call the unpack() method repeatedly, passing a format as the fmt argument each time:

>>> buffer = Buffer(b'\x02\x01\x02\x03')
>>> array_length = buffer.unpack(uint8)
>>> array_length
2
>>> buffer.unpack([uint8] * array_length)
[1, 2]
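
For comparison, the same length-prefixed read can be written with the standard library struct module, where the caller tracks the position by hand. This is a rough sketch of the bookkeeping that Buffer takes care of, not plum's implementation. The format strings '>B' and '>2B' are struct's notation for big-endian unsigned bytes and are unrelated to plum's transforms.

```python
import struct

data = b'\x02\x01\x02\x03'
offset = 0

# Read the one-byte length prefix, then advance the cursor manually.
(array_length,) = struct.unpack_from('>B', data, offset)
offset += struct.calcsize('>B')

# Read that many one-byte items starting at the current cursor position.
item_format = f'>{array_length}B'
array = list(struct.unpack_from(item_format, data, offset))
offset += struct.calcsize(item_format)

# Bytes left over after both reads.
unconsumed = len(data) - offset
```

After this runs, array_length is 2, array is [1, 2], and one byte (0x03) is still unconsumed, the same state the Buffer example above is in after its two unpack() calls.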

To verify that incremental unpacking consumed every byte, use the buffer instance as a context manager. Leaving the with block while bytes remain raises an ExcessMemoryError:

>>> with Buffer(b'\x02\x01\x02\x03') as buffer:
...     array_length = buffer.unpack(uint8)
...     array = buffer.unpack([uint8] * array_length)
...
Traceback (most recent call last):
    ...
plum.exceptions.ExcessMemoryError: 1 unconsumed bytes

A Buffer instance supports all the behaviors of bytes because Buffer is a subclass of bytes. It also provides an offset attribute that holds the position in the bytes sequence where the next unpacking operation will start:

>>> buffer = Buffer(b'\x00\x01\x02\x03')
>>> buffer.offset
0
>>> buffer.offset = 2
>>> buffer.unpack(uint8)
2
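
The idea of a bytes subclass that carries a read position can be illustrated with a small standalone class. The sketch below is hypothetical: CursorBytes is an invented name, it uses struct format strings rather than plum transforms, and it raises a plain ValueError instead of plum's ExcessMemoryError. It is meant only to show the pattern, not to mirror how Buffer is actually written.

```python
import struct


class CursorBytes(bytes):
    """Hypothetical illustration only; this is not plum's Buffer."""

    def __new__(cls, data):
        self = super().__new__(cls, data)
        self.offset = 0  # position where the next unpack starts
        return self

    def unpack(self, fmt):
        """Unpack a struct format at the current offset and advance."""
        values = struct.unpack_from(fmt, self, self.offset)
        self.offset += struct.calcsize(fmt)
        return values[0] if len(values) == 1 else list(values)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Complain about leftover bytes only if the block itself succeeded.
        unconsumed = len(self) - self.offset
        if exc_type is None and unconsumed:
            raise ValueError(f'{unconsumed} unconsumed bytes')
        return False


buffer = CursorBytes(b'\x00\x01\x02\x03')
buffer.offset = 2              # skip the first two bytes
value = buffer.unpack('>B')    # reads the byte at index 2
remaining = bytes(buffer[buffer.offset:])
```

Because the position lives on the object, moving offset forward or backward is enough to skip or re-read bytes. In this run value is 2 and remaining is b'\x03'.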