How does I/O work in LiberTEM?

Many algorithms benefit from tiled processing, where the same slice of the signal dimension is processed for several frames in a row. In many cases, algorithms have specific minimum and maximum sizes in the signal dimension, the navigation dimension or the total size at which they operate efficiently. Smaller sizes might increase overheads, while larger sizes might reduce cache efficiency.

At the same time, file formats may only read efficiently within specific size and shape limits. The K2IS raw format is a prime example: data is saved in tiled form and can be processed efficiently with tile sizes and shapes that follow the native layout. Furthermore, some formats, such as FRMS6, require decoding or corrections by the CPU; there, tiles that fit into the L3 cache can speed up subsequent processing steps. Requirements from the I/O method, such as alignment and efficient block sizes, are taken into account as well.

The LiberTEM I/O back-end negotiates a tiling scheme between UDF and dataset that fulfills the requirements of both sides as far as possible. However, it is not guaranteed that the supplied data will always fall within the requested limits.
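On the UDF side, these requirements can be expressed via the get_tiling_preferences() method of the UDF interface. The following minimal sketch assumes that interface (keys and constants as in the 0.6-era UDF API); the size value is purely illustrative:

  import numpy as np
  from libertem.udf import UDF


  class MyTiledUDF(UDF):
      def get_result_buffers(self):
          # one value per frame in the navigation dimensions (illustrative)
          return {'sums': self.buffer(kind='nav', dtype=np.float32)}

      def get_tiling_preferences(self):
          # ask for deep tiles with a total size that roughly fits a CPU cache;
          # keys and constants as assumed from the 0.6-era UDF interface
          return {
              "depth": UDF.TILE_DEPTH_MAX,
              "total_size": 4 * 1024 * 1024,  # bytes, illustrative value
          }

      def process_tile(self, tile):
          # `tile` covers one slice of the signal dimensions for several
          # frames; accumulate the partial per-frame sums
          self.results.sums[:] += np.sum(tile, axis=tuple(range(1, tile.ndim)))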

New in version 0.6.0: This guide is written for version 0.6.0

High-level overview

  • Once per DataSet, the UDFRunner negotiates a TilingScheme using the Negotiator class

  • Data is split into partitions, which are read independently. Usually they split the navigation axis.

  • The TilingScheme is then passed on to Partition.get_tiles, which then yields DataTiles that match the given TilingScheme. Note that the tiles MUST match the slicing in the signal dimensions, but the tile depth may differ from the scheme (for example, if partition depth isn’t evenly divisible by tile depth, or if there are other technical reasons the DataSet can’t return the given depth)

  • Under the hood, the Partition
    • instantiates an IOBackend, which has a reference to a Decoder

    • generates read ranges, which are passed on to the IOBackend

    • delegates get_tiles to the IOBackend

  • The I/O process can be influenced by passing a subclass of FileSet to the Partition and overriding FileSet.get_read_ranges, implementing a Decoder, or even completely overriding the Partition.get_tiles functionality.

  • There are currently three I/O backends implemented: MMapBackend, BufferedBackend, and DirectBackend, which are useful for different storage media and purposes.

  • MMapBackend.get_tiles has two modes of operation: either it returns a reference to the tiles “straight” from the file, without copying or decoding, or it uses the read ranges and copies/decodes the tiles in smaller units.

  • When reading the tiles “straight”, the read ranges are not used, instead only the slice information for each tile is used. That also means that this mode only works for very simple formats, when reading without a roi and when not doing any dtype conversion or decoding.

  • For most formats, a sync_offset can be specified, which can be used to correct for synchronization errors by inserting blank frames, or ignoring one or more frames, at the beginning or at the end of the data set.
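For example, sync_offset can be passed when loading a data set. The sketch below uses a raw data set with a made-up path and shapes; parameter names follow recent LiberTEM releases. A positive value skips frames at the start, a negative value inserts blank frames:

  from libertem.api import Context

  ctx = Context()
  ds = ctx.load(
      "raw",
      path="/data/scan.raw",     # placeholder path
      dtype="uint16",
      nav_shape=(128, 128),      # illustrative shapes
      sig_shape=(256, 256),
      sync_offset=2,             # skip the first two frames of the file
  )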

Read ranges

In FileSet.get_read_ranges, the reading parameters (TilingScheme, roi etc.) are translated into one or more byte ranges (offset, length) for each tile. You can imagine it as translating pixel/element positions into byte offsets.
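For a plain, headerless raw format this translation is simple arithmetic. The following sketch (assuming uint16 data with a 256x256 signal shape) computes the single byte range of a tile that covers whole frames:

  import numpy as np

  sig_shape = (256, 256)                             # illustrative signal shape
  itemsize = np.dtype(np.uint16).itemsize
  frame_bytes = int(np.prod(sig_shape)) * itemsize   # bytes per frame

  # a tile of depth 4 starting at frame 100, covering whole frames, maps to
  # one contiguous byte range in the (single) file:
  start_frame, depth = 100, 4
  start = start_frame * frame_bytes
  stop = (start_frame + depth) * frame_bytes
  read_range = (0, start, stop)  # (file_idx, start, stop), see below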

Each range corresponds to a read operation on a single file, and with multiple read ranges per tile, ranges for a single tile can correspond to reads from multiple files. This is important when reading from a data set with many small files - we can still generate deep tiles for efficient processing.

There are some built-in common parameters in FileSet, like frame_header_bytes, frame_footer_bytes, which can be used to easily implement formats where the reading just needs to skip a few bytes for each frame header/footer.
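As a rough sketch, such a format could construct its FileSet like this; the constructor parameters shown are assumptions and may differ between LiberTEM versions:

  from libertem.io.dataset.base import FileSet

  # `my_files` stands in for the File objects your DataSet builds to describe
  # the files on disk (paths, frame ranges, native dtype, ...)
  my_files = []

  fileset = FileSet(
      my_files,
      frame_header_bytes=40,   # skip a 40 byte header before each frame
      frame_footer_bytes=0,
  )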

If you need more influence over how data is read, you can override FileSet.get_read_ranges and return your own read ranges. You can use the make_get_read_ranges function to re-use much of the tiling logic, or implement this yourself. Using make_get_read_ranges you can either override just the px_to_bytes part, or read_ranges_tile_block for whole tile blocks. This is done by passing in njit-ed functions to make_get_read_ranges. make_get_read_ranges should only be called at module level to enable caching of the numba compilation.

Read ranges are generated as an array with the following shape:

(number_of_tiles, rr_per_tile, rr_num_entries)

rr_per_tile here is the maximum number of read ranges per tile - there can be tiles with fewer read ranges, for example at the end of a partition. rr_num_entries is at least 3 and contains at least the values (file_idx, start, stop). This means: read stop - start bytes, beginning at offset start, from the file file_idx in the corresponding FileSet.

DataSets that override the read range generation are free to add additional fields at the end, for example if the decoding functions need additional information.
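To make the layout concrete, here is a simplified sketch of how such an array could be consumed to read the raw bytes of one tile, ignoring decoding and the actual IOBackend machinery. How unused entries of shorter tiles are represented is an assumption here:

  import numpy as np


  def read_tile_bytes(read_ranges, tile_idx, open_files):
      """
      read_ranges: array of shape (number_of_tiles, rr_per_tile, >= 3)
      open_files: file objects opened in binary mode, indexed by file_idx
      """
      parts = []
      for file_idx, start, stop in read_ranges[tile_idx, :, :3]:
          if stop <= start:
              continue  # assumed padding for tiles with fewer read ranges
          f = open_files[file_idx]
          f.seek(start)
          parts.append(f.read(stop - start))
      # raw bytes of the tile, still to be decoded / reshaped
      return np.frombuffer(b"".join(parts), dtype=np.uint8)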

For examples of when you would generate custom read ranges, have a look at the implementations for MIB, K2IS, and FRMS6 - in these formats the data may not have a direct 1:1 mapping to a numpy dtype, or the pixels may need to be re-ordered after decoding.

Notes for implementing a DataSet

  • Read file header(s) in initialize() - make sure to do the actual I/O in a function dispatched via the JobExecutor that is passed to initialize. See also Platform-dependent code and remote executor regarding platform-dependent code.

  • Implement check_valid() - this will be run on a worker node

  • Implement get_msg_converter() - the MessageConverter class returned is responsible for parsing parameters passed to the Web API and converting them to a Python representation that can be passed to the DataSet constructor.

  • Implement get_cache_key() - the cache key must be different for DataSets that return different data.

  • Implement get_partitions(). You may want to use the helper function get_slices() to generate slices for a specified number of partitions. get_partitions() should yield either BasePartition instances or instances of your own subclass (see below). The same is true for the FileSet that is passed to each partition - you possibly have to implement your own subclass.

  • Implement get_decoder() to return an instance of Decoder. Only needed if the data is saved in a data type that is not directly understood by numpy or numba. See below for details.

  • Implement get_base_shape(). This is only needed if the data format imposes any constraints on how the data can be read in an efficient manner, for example if data is saved in blocks. The tileshape that is negotiated before reading will be a multiple of the base shape in all dimensions.

  • Implement adjust_tileshape(). This is needed if you need to “veto” the generated tileshape, for example if your dataset has constraints that can’t be expressed by the base shape.

  • Subclass BasePartition
    • Override get_tiles() if you need to use completely custom I/O logic.
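Putting these notes together, a new format typically ends up with a skeleton along these lines. This is a sketch only: exact signatures (for example whether get_base_shape and adjust_tileshape take a roi, or how the constructor handles the I/O backend) vary between LiberTEM versions, and _read_header is a hypothetical helper of this example:

  from libertem.io.dataset.base import DataSet


  class MyFormatDataSet(DataSet):
      """
      Skeleton only: bodies are placeholders and signatures are approximate.
      """
      def initialize(self, executor):
          # do the actual header I/O on a worker, dispatched via the executor;
          # `_read_header` is a hypothetical helper of this sketch
          self._meta = executor.run_function(self._read_header)
          return self

      def check_valid(self):
          ...  # runs on a worker node

      @classmethod
      def get_msg_converter(cls):
          ...  # return a MessageConverter subclass for the web API parameters

      def get_cache_key(self):
          ...  # must differ for DataSets that return different data

      def get_partitions(self):
          # yield BasePartition (or your own subclass) instances, for example
          # one per slice generated with the get_slices() helper
          ...

      def get_decoder(self):
          ...  # only needed if numpy/numba can't read the raw data directly

      def get_base_shape(self, roi):
          ...  # only needed if the format imposes constraints, e.g. blocks

      def adjust_tileshape(self, tileshape, roi):
          ...  # veto/adjust the negotiated tileshape if necessary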

Implementing a Decoder

This may be needed if the raw data is not directly supported by numpy or numba. In most cases, your decoder will just return a suitable decode function from get_decode(). You can also return different decode functions depending on the concrete data set you are currently reading, for example if different detector modes generate different data representations. In addition, you can instruct the IOBackend to clear the read buffer before calling decode by returning True from do_clear(). This can be needed if different read ranges contribute to the same part of the output buffer and the decode function accumulates into the buffer instead of slice-assigning.

The decode function will be called for each read range that was generated by the get_read_ranges method described above.
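A minimal decoder could look like the following sketch. The decode function signature mirrors the one used by the built-in decoders, but check the Decoder base class of your LiberTEM version before relying on it:

  import numba
  from libertem.io.dataset.base import Decoder


  @numba.njit(inline='always')
  def _decode_my_format(inp, out, idx, native_dtype, rr, origin, shape, ds_shape):
      # `inp` holds the raw bytes of one read range, `out` is the tile buffer;
      # this example just reinterprets the bytes and copies them over
      out[idx, :] = inp.view(native_dtype)


  class MyDecoder(Decoder):
      def get_decode(self, native_dtype, read_dtype):
          # could return different functions depending on dtype or detector mode
          return _decode_my_format

      def do_clear(self):
          # return True if the decode function accumulates into the buffer
          # instead of slice-assigning
          return False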