The Python API is a concise API for using LiberTEM from Python code. It is suitable both for interactive scripting, for example from Jupyter notebooks, and for usage from within a Python application or script.
For a full API reference, please see the Python API reference.
The libertem.api.Context object is the entry point for most interaction
and processing with LiberTEM. It is used to load datasets and to specify and run analyses.
The following snippet initializes a Context
with default parameters, backed by a parallel processing engine.
    import libertem.api as lt

    with lt.Context() as ctx:
        ...
The use of a with block in the above code ensures that the Context
correctly releases any resources it holds when it goes out of scope.
It is also possible to use the object as a normal variable, for example:
ctx = lt.Context()
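To make the difference between the two styles concrete, here is a minimal sketch built on a stand-in class. ToyContext is hypothetical and not part of LiberTEM; it only mimics an object that holds a resource and releases it in a close() method. The sketch assumes that, with a plain variable, the release has to be triggered explicitly, which it does in a finally clause.

```python
class ToyContext:
    """Hypothetical stand-in for an object that owns a resource."""

    def __init__(self):
        self.closed = False

    def close(self):
        # Release whatever the object holds on to (safe to call twice).
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions raised in the body


# Style 1: the with block releases the resource automatically,
# even if the body raises an exception.
with ToyContext() as ctx:
    inside_with = ctx.closed

# Style 2: a plain variable needs an explicit release.
ctx2 = ToyContext()
try:
    pass  # ... work with ctx2 ...
finally:
    ctx2.close()
```

The with form is usually the safer default in scripts, while the plain-variable form tends to be more convenient in notebooks where the object lives across many cells.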
This is a basic example to load the API, create a local cluster, load a file and run an analysis.
    import matplotlib.pyplot as plt
    import libertem.api as lt

    if __name__ == '__main__':
        # A path to a Quantum Detectors Merlin header file
        # Adapt to your data and data format
        dataset_path = './path_to_dataset.hdr'

        # Create a Context object to load data and run analyses
        # Here we specify we want to use 4 CPU workers for parallel jobs
        with lt.Context.make_with(cpus=4) as ctx:
            # Next we define a dataset, at this time no data is loaded
            # into memory, we only specify where the files are
            # The key 'mib' tells LiberTEM which format to load
            # it is possible to supply 'auto' and the Context will
            # try to auto-detect the correct dataset format
            ds = ctx.load('mib', path=dataset_path)

            # Create a sum-over-disk analysis, i.e. brightfield image
            # Values for disk centre x/y and radius in pixels
            disk_sum_analysis = ctx.create_disk_analysis(ds, cx=32, cy=32, r=8)
            disk_sum_result = ctx.run(disk_sum_analysis, progress=True)

            # Plot the resulting brightfield image
            plt.imshow(disk_sum_result.intensity.raw_data)
            plt.show()
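As a rough illustration of what a sum-over-disk analysis computes, the following pure-Python sketch sums, for each scan position, the detector pixels that fall inside a disk. It is independent of LiberTEM and rests on several assumptions: the helper names disk_mask and disk_sum are made up for this example, a pixel counts as inside when its centre lies within the radius, and the data is a tiny synthetic list of frames rather than a real dataset.

```python
def disk_mask(height, width, cx, cy, r):
    """Return a 2D boolean mask that is True inside the disk."""
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= r ** 2 for x in range(width)]
        for y in range(height)
    ]


def disk_sum(frame, mask):
    """Sum all pixels of one detector frame that the mask selects."""
    return sum(
        value
        for frame_row, mask_row in zip(frame, mask)
        for value, selected in zip(frame_row, mask_row)
        if selected
    )


# Synthetic data: 3 scan positions, each a 5x5 frame of constant intensity
frames = [[[k] * 5 for _ in range(5)] for k in (1, 2, 3)]

mask = disk_mask(5, 5, cx=2, cy=2, r=1)
pixels_in_disk = sum(sum(row) for row in mask)

# One value per scan position, i.e. one pixel of the brightfield image
brightfield = [disk_sum(frame, mask) for frame in frames]
```

With a radius of one pixel the mask selects the centre pixel and its four direct neighbours, so each frame of constant intensity k contributes 5 * k to the result.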
For complete examples on how to use the Python API, please see the Jupyter notebooks in the example directory.
Custom processing routines
To go beyond the included capabilities of LiberTEM, you can implement your own analyses using User-defined functions (UDFs). UDFs are dataset-agnostic and benefit from the same parallelisation as the built-in tools.
An Executor is the internal engine which the Context uses to
compute user-defined functions or run other tasks. Executors can be serial or parallel,
and can differ substantially in their implementation, but all adhere to a
common interface which the Context understands.
New in version 0.9.0: The executor API is internal. Since choice and parameters of executors are important for integration with Dask and other frameworks, they are now documented. Only the names and creation methods for executors are reasonably stable. The rest of the API is subject to change without notice. For that reason it is documented in the developer section and not in the API reference.
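The idea of a common interface shared by serial and parallel executors can be sketched with the standard library alone. The classes ToyInlineExecutor and ToyThreadedExecutor and the method name run_tasks are hypothetical names chosen for this illustration; they are not LiberTEM's executor API, which is internal and may look quite different.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyInlineExecutor:
    """Runs every task one after another in the calling thread."""

    def run_tasks(self, fn, items):
        return [fn(item) for item in items]


class ToyThreadedExecutor:
    """Runs tasks on a thread pool, returning results in input order."""

    def __init__(self, max_workers=4):
        self._max_workers = max_workers

    def run_tasks(self, fn, items):
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            return list(pool.map(fn, items))


def total_of_squares(executor, values):
    # The caller relies only on run_tasks(), not on how tasks are executed.
    return sum(executor.run_tasks(lambda v: v * v, values))


serial_result = total_of_squares(ToyInlineExecutor(), range(10))
parallel_result = total_of_squares(ToyThreadedExecutor(), range(10))
```

Because the calling code depends only on the shared method, swapping the serial executor for the parallel one changes how the work is scheduled but not the result.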
The default executor is the DaskJobExecutor, which uses the
dask.distributed scheduler. To support all LiberTEM features and
achieve optimal performance, the methods provided by LiberTEM to start a
dask.distributed cluster should be used. However, LiberTEM can also run on a
plain dask.distributed cluster. Please note that dask.distributed
clusters that are not created by LiberTEM might use threading or a mixture of threads
and processes, and therefore might behave or perform differently from a
cluster created by LiberTEM.
The InlineJobExecutor runs all tasks
synchronously in the current thread. This is useful for debugging and for
special applications such as running UDFs that perform their own multithreading
efficiently or for other non-standard use that requires tasks to be executed
sequentially and in order.
See also Threading for more information on multithreading in UDFs.
New in version 0.9.0.
The ConcurrentJobExecutor runs all tasks using
concurrent.futures. Using a
concurrent.futures.ThreadPoolExecutor, which is the default behaviour,
allows sharing large amounts of data as well as other resources between the
main thread and workers efficiently, but is severely slowed down by the
Python global interpreter lock
under many circumstances. Furthermore, it can create thread safety issues.
It is also in principle possible to use a
concurrent.futures.ProcessPoolExecutor
as backing for the ConcurrentJobExecutor,
though this is untested and is likely to lead to worse performance than the
default DaskJobExecutor.
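The trade-off described above for thread-based workers can be seen in a small standard-library sketch. It does not use LiberTEM; the names shared, stats and describe are made up for this example. Worker threads see the very same object as the main thread, so nothing is copied, but any shared mutable state then needs explicit protection, here a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A large object owned by the main thread
shared = list(range(1_000_000))

# Shared mutable state must be guarded to stay thread-safe
stats = {'calls': 0}
stats_lock = threading.Lock()


def describe(_):
    # Worker threads access the same object; nothing is copied or
    # serialized on the way to the worker.
    with stats_lock:
        stats['calls'] += 1
    return id(shared), len(shared)


with ThreadPoolExecutor(max_workers=4) as pool:
    seen = list(pool.map(describe, range(8)))

same_object_everywhere = all(obj_id == id(shared) for obj_id, _ in seen)
```

A process-based pool would instead have to serialize the data for each worker, which is one plausible reason why it may perform worse for large inputs.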
For special applications, the
DelayedJobExecutor can use dask.delayed to delay the processing. This
is experimental; see Dask integration for more details. It might use threading as
well, depending on the Dask scheduler that performs the computation.
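What it means to delay processing can be illustrated without Dask. The Deferred class below is a hypothetical, heavily simplified stand-in and not the dask.delayed API: it only records a function and its arguments, and nothing runs until compute() is called on the final node.

```python
class Deferred:
    """Hypothetical stand-in for a lazily evaluated computation node."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args

    def compute(self):
        # Evaluate dependencies first, then the function itself.
        resolved = [
            arg.compute() if isinstance(arg, Deferred) else arg
            for arg in self.args
        ]
        return self.fn(*resolved)


calls = []  # records the order in which the functions actually run


def load(n):
    calls.append('load')
    return list(range(n))


def total(values):
    calls.append('total')
    return sum(values)


# Building the graph does not execute anything yet
graph = Deferred(total, Deferred(load, 5))
calls_before_compute = list(calls)

# Only now are load() and total() executed, in dependency order
result = graph.compute()
```

In a real task graph framework, the scheduler that walks this graph decides whether nodes run in threads, processes or on remote workers, which is why the threading behaviour depends on the scheduler in use.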
New in version 0.10.0.
For live data processing, a dedicated multiprocessing executor routes the live
data source in a round-robin fashion to worker processes. This is important to
support processing that cannot keep up with the detector speed on a single CPU
core. This executor also works for offline data sets in principle, but is not
optimized for that use case.
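Round-robin routing itself is simple to sketch. The example below is a simplified illustration that uses in-process queues instead of real worker processes; the function names route_round_robin and drain are made up for this example and do not correspond to LiberTEM's implementation.

```python
from itertools import cycle
from queue import Queue


def route_round_robin(items, num_workers):
    """Distribute items over per-worker queues in rotating order."""
    queues = [Queue() for _ in range(num_workers)]
    for item, worker_queue in zip(items, cycle(queues)):
        worker_queue.put(item)
    return queues


def drain(worker_queue):
    """Collect everything that was routed to one worker."""
    received = []
    while not worker_queue.empty():
        received.append(worker_queue.get())
    return received


# Seven incoming chunks of data, three workers
queues = route_round_robin(range(7), num_workers=3)
assignment = [drain(worker_queue) for worker_queue in queues]
```

Each worker receives every third item, so the per-worker load stays balanced as long as the items take a similar amount of time to process.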
Specifying executor type, CPU and GPU workers
New in version 0.9.0.
Changed in version 0.12.0: Added the cpus and
gpus keyword arguments, as well as the
plot_class keyword argument, which is passed to the Context
initializer. These replace the prior
*args, **kwargs form.
    import libertem.api as lt

    # Create a Dask-based Context with 4 cpu workers and 2 gpu workers
    with lt.Context.make_with('dask', cpus=4, gpus=2) as ctx:
        ...
The default behaviour is to create a Dask-based Context, but the same
method can be used to create any executor, as described in the documentation
of the method. It is also a useful shortcut
to quickly create a synchronous executor for debugging.
Not all executor types allow specifying the number of workers, and
not all executor types are GPU-capable. In these cases the
method will raise an exception. See the documentation of the method
for more information.
Connect to an existing cluster
The DaskJobExecutor is capable of connecting
to an existing
dask.distributed scheduler, which may be a centrally managed
installation on a physical cluster, or a local, single-machine scheduler started for
some other purpose (by LiberTEM or directly through Dask). Re-using a cluster can
reduce startup times, as there is no need to spawn new workers each time
a script or notebook is executed.
See Starting a custom cluster for more on how to start a scheduler and workers.
    import libertem.api as lt
    from libertem.executor.dask import DaskJobExecutor

    # Connect to a Dask.Distributed scheduler at 'tcp://localhost:8786'
    with DaskJobExecutor.connect('tcp://localhost:8786') as executor:
        ctx = lt.Context(executor=executor)
        ...