How does I/O work in LiberTEM?¶
Many algorithms benefit from Tiled processing where the same subset in the signal dimension is processed for several frames in a row. In many cases, algorithms have specific minimum and maximum sizes in signal dimension, navigation dimension or total size where they operate efficiently. Smaller sizes might increase overheads, while larger sizes might reduce cache efficiency.
At the same time, file formats might operate well within specific size and shape limits. The K2IS raw format is a prime example where data is saved in tiled form and can be processed efficiently in specific tile sizes and shapes that follow the native layout. Furthermore, some formats require decoding or corrections by the CPU, such as FRMS6, where tiles that fit the L3 cache can speed up subsequent processing steps. Requirements from the I/O method such as alignment and efficient block sizes are taken into account as well.
The LiberTEM I/O back-end negotiates a tiling scheme between UDF and dataset that fulfills requirements from both UDF and dataset side as far as possible. However, it is not always guaranteed that the supplied data will fall within the requested limits.
Data is split into partitions which are read from independently. Usually they split the navigation axis.
For each partition, the
TilingSchemeis then passed on to
Partition.get_tiles, which then yields
DataTilesthat match the given
- Under the hood, the
IOBackend, which has a reference to a
generates read ranges, which are passed on to the
- Under the hood, the
The I/O process can be influenced by passing a subclass of
FileSet.get_read_ranges, implementing a
Decoder, or even completely overriding the
IOBackend.get_tileshas two modes of operation: either it reads the tiles “straight” from the file, without copying or decoding, or it uses the read ranges and copies/decodes the tiles in smaller units.
When reading the tiles “straight”, the read ranges are not used, instead only the slice information for each tile is used. That also means that this mode only works for very simple formats, when reading without a
roiand when not doing any
dtypeconversion or decoding.
FileSet.get_read_ranges, the reading parameters (TilingScheme, roi etc.)
are translated into one or more byte ranges (offset, length) for each tile.
You can imagine it as translating pixel/element positions into byte offsets.
Each range corresponds to a read operation on a single file, which means read ranges for a single tile can correspond to reads from multiple files. This is important when reading from a data set with many small files - we can still generate tiles for efficient processing.
There are some built-in common parameters in FileSet, like frame_header_bytes, frame_footer_bytes, which can be used to easily implement formats where the reading just needs to skip a few bytes for each frame header/footer.
If you need more influence over how data is read, you can override FileSet.get_read_ranges and return your own read ranges. You can use the make_get_read_ranges function to re-use much of the tiling logic, or implement this yourself. Using make_get_read_ranges you can either override just the px_to_bytes part, or read_ranges_tile_block for whole tile blocks. This is done by passing in njit-ed functions to make_get_read_ranges. make_get_read_ranges should only be called on module-level to enable caching of the numba compilation.
Read ranges are generated as an array with the following shape:
(number_of_tiles, rr_per_tile, rr_num_entries)
rr_per_tile here is the maximum number of read ranges per tile - there
can be tiles that are smaller than this, for example at the end of a partition.
rr_num_entries is at least 3 and contains at least the values
(file_idx, start, stop). This means to read
stop - start
bytes, beginning at offset
start, from the file
in the corresponding
DataSets are free to add additional fields to the end, for
example if the decoding functions need additional information.
As an example when you would generate custom read ranges, have a look at the
implementations for MIB, K2IS, and FRMS6 - they may not have a direct 1:1 mapping
to a numpy
dtype, or the pixels may need to be re-ordered after decoding.
Read file header(s) in
initialize()- make sure to do the actual I/O in a function dispatched via the
JobExecutorthat is passed to
initialize. See also Platform-dependent code and remote executor regarding platform-dependent code.
check_valid()- this will be run on a worker node
MessageConverterclass returned is responsible for parsing parameters passed to the Web API and converting them to a Python representation that can be passed to the
get_cache_key()- the cache key must be different for
DataSets that return different data.
get_partitions(). You may want to use the helper function
make_slices()to generate slices for a specified number of partitions.
get_partitions()should yield either
BasePartitioninstances or instances of your own subclass (see below). The same is true for the
FileSetthat is passed to each partition - you possibly have to implement your own subclass.
_get_decoder()to return an instance of
Decoder. Only needed if the data is saved in a data type that is not directly understood by numpy or numba. See below for details.
get_base_shape(). This is only needed if the data format imposes any constraints on how the data can be read in an efficient manner, for example if data is saved in blocks. The tileshape that is negotiated before reading will be a multiple of the base shape in all dimensions.
adjust_tileshape(). This is needed if you need to “veto” the generated tileshape, for example if your dataset has constraints that can’t be expressed by the base shape.
get_tiles()if you need to use completely custom I/O logic.
This may be needed if the raw data is not directly supported
by numpy or numba. Mostly your decoder will return a different
decode function in
You can also return different decode functions, depending on the
concrete data set you are currently reading. For example, this may be needed if there
are different data representations generated by different detector modes.
You can also instruct the
IOBackend to clear the read
buffer before calling
decode by returning
do_clear(). This can be needed
if different read ranges contribute to the same part of the output buffer
decode function accumulates into the buffer instead of slice-assigning.
decode function will be called for each read range that was
generated by the
get_read_ranges method described above.