[plum.str] String Tutorial: Zero Terminated Strings

This tutorial covers the basics of plum's ready-made zero terminated string types, AsciiZeroTermStr and Utf8ZeroTermStr. These types pack (encode) strings into bytes and unpack (decode) bytes into strings for the two most common string encodings. plum supports other encodings as well; read this page first, then refer to the Create Custom Types tutorial.

Unpacking

When passed as the first argument to the unpack() function, a plum string type decodes the supplied bytes using the encoding pre-configured within the type. As its name suggests, AsciiZeroTermStr decodes bytes as ASCII characters and stops at the first null byte. The following example places the zero terminated string type in a structure ahead of a bookend member. It shows that the string type detects the end of the string at the null byte and leaves the bookend byte unconsumed:

>>> from plum import unpack
>>> from plum.int.little import UInt8
>>> from plum.str import AsciiZeroTermStr
>>> from plum.structure import Member, Structure
>>>
>>> class Struct1(Structure):
...     string: str = Member(cls=AsciiZeroTermStr)
...     bookend: int = Member(cls=UInt8)
...
>>> struct1 = unpack(Struct1, b'Hello World!\x00\x99')
>>> struct1
Struct1(string='Hello World!', bookend=153)
>>>
>>> struct1.dump()
+--------+-------------------+----------------+-------------------------------------+------------------+
| Offset | Access            | Value          | Bytes                               | Type             |
+--------+-------------------+----------------+-------------------------------------+------------------+
|        |                   |                |                                     | Struct1          |
|        | [0] (.string)     |                |                                     | AsciiZeroTermStr |
|  0     |   [0:12]          | 'Hello World!' | 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 |                  |
| 12     |   --termination-- |                | 00                                  |                  |
| 13     | [1] (.bookend)    | 153            | 99                                  | UInt8            |
+--------+-------------------+----------------+-------------------------------------+------------------+
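
For readers who want to see the underlying idea without plum, the following is a minimal plain-Python sketch of zero terminated decoding. It is not plum's implementation; the helper name unpack_zero_terminated is hypothetical and exists only for illustration. It assumes the buffer contains a null byte and raises ValueError otherwise.

```python
from typing import Tuple


def unpack_zero_terminated(buffer: bytes, encoding: str = "ascii") -> Tuple[str, bytes]:
    """Hypothetical helper (not part of plum): decode up to the first null byte.

    Returns the decoded string and the bytes that follow the terminator.
    """
    end = buffer.index(b"\x00")      # raises ValueError if no terminator exists
    text = buffer[:end].decode(encoding)
    remainder = buffer[end + 1:]     # skip the terminator, keep everything after it
    return text, remainder


# Same bytes as the Struct1 example: string, terminator, then one bookend byte.
text, rest = unpack_zero_terminated(b"Hello World!\x00\x99")
bookend = rest[0]                    # 0x99 == 153, left untouched by the string decode
print(text, bookend)
```

The terminator occupies offset 12 and the bookend byte offset 13, which matches the offsets in the dump above.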

Instantiating & Packing

plum string types follow the str API, and the constructor accepts a string. When an instance is packed, the type encodes the string into bytes using the encoding pre-configured within the type and appends the null byte. Passing the type and a plain string directly to the plum pack() function works the same way:

>>> from plum import pack
>>> from plum.str import Utf8ZeroTermStr
>>>
>>> # create instance and pack
>>> s = Utf8ZeroTermStr('Ahoj světe')
>>> s.pack()
bytearray(b'Ahoj sv\xc4\x9bte\x00')
>>>
>>> # pack directly
>>> pack(Utf8ZeroTermStr, 'Ahoj světe')
bytearray(b'Ahoj sv\xc4\x9bte\x00')
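
As a rough illustration of what packing a zero terminated string involves, here is a plain-Python sketch that encodes the text and appends the terminator. The helper name pack_zero_terminated is hypothetical and is not part of plum. The sketch rejects strings containing an embedded null character, an assumption made here because such a string could not be unpacked back to the same value.

```python
def pack_zero_terminated(text: str, encoding: str = "utf-8") -> bytearray:
    """Hypothetical helper (not part of plum): encode text and append a null byte."""
    if "\x00" in text:
        # An embedded null would end the string early when unpacking.
        raise ValueError("text must not contain a null character")
    buffer = bytearray(text.encode(encoding))
    buffer.append(0)                 # the zero terminator
    return buffer


packed = pack_zero_terminated("Ahoj světe")
print(packed)                        # the 'ě' character takes two bytes in UTF-8
```

The ten characters become eleven encoded bytes plus the terminator, the same twelve bytes shown in the plum output above.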