0.5.1 / 2020-08-12


  • Allow installation with latest dask distributed on Python 3.6 and 3.7

0.5.0 / 2020-04-23

New features


  • A large number of usability improvements (#622, #639, #641, #642, #659, #666, #690, #699, #700, #704). Thanks and credit to many new contributors from GSoC!

  • Fix the buggy “enable Direct I/O” checkbox of the RAW dataset and handle unsupported operating systems gracefully. (#696, #659)


  • Added screenshots and description of ROI and stddev features in usage docs (#669)

  • Improved instructions for installing LiberTEM (general: #664; for development: #598)

  • Add information for downloading and generating sample datasets: Sample Datasets. (#650, #670, #707)



  • Clustering analysis
    • Use a connectivity matrix to only cluster neighboring pixels, reducing memory footprint while improving speed and quality (#618).

    • Use faster ApplyMasksUDF to generate feature vector (#618).

  • StdDevUDF
  • LiberTEM works with Python 3.8 for experimental use. A context using a remote Dask.Distributed cluster can lead to lock-ups or errors with Python 3.8. The default local Dask.Distributed context works.

  • Improve performance with large tiles. (#649)

  • SumUDF moved to the libertem.udf folder (#613).

  • Make sure the signal dimension of result buffer slices can be flattened without creating an implicit copy (#738, #739)
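
The last item above depends on NumPy reshaping semantics: a flattened array is a view only if the memory layout permits it, otherwise reshape() silently returns a copy. The following is a minimal sketch with plain NumPy, not LiberTEM code; the array names and shapes are made up for illustration.

```python
import numpy as np

# Hypothetical result buffer: 4 navigation positions, 8x8 signal dimension.
buf = np.zeros((4, 8, 8), dtype=np.float32)

# A slice along the navigation axis keeps the signal dimension contiguous,
# so flattening the signal dimension yields a view on the same memory.
contiguous_slice = buf[1:3]
flat_view = contiguous_slice.reshape(2, -1)
flat_view[:] = 1  # writes through to buf

# A slice that cuts into the signal dimension is no longer contiguous there,
# so reshape() falls back to a copy and writes to it are lost.
cropped_slice = buf[1:3, :, :4]
flat_copy = cropped_slice.reshape(2, -1)
flat_copy[:] = 2  # does NOT modify buf

view_shares = np.shares_memory(flat_view, buf)
copy_shares = np.shares_memory(flat_copy, buf)
```

Checking np.shares_memory() as shown is one way to detect an unintended copy in custom code that writes into flattened buffer slices.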

Many thanks to the contributors to this release: @AnandBaburajan, @twentyse7en, @sayandip18, @bdalevin, @saisunku, @Iamshankhadeep, @abiB27, @sk1p, @uellue

0.4.1 / 2020-02-18

This is a bugfix release, mainly constraining the msgpack dependency, as distributed is not yet compatible with msgpack version 1.0. It also contains important fixes in the HDF5 dataset.


  • Fix HDF5 with automatic tileshape (#608)

  • Fix reading from HDF5 with roi beyond the first partition (#606)

  • Add version constraint on msgpack

0.4.0 / 2020-02-13

The main points of this release are the Job API deprecation and restructuring of our packaging, namely extracting the blobfinder module.

New features

  • dtype support for UDFs (#549, #550)

  • Dismiss error messages via keyboard: allows pressing the escape key to close all currently open error messages (#437)

  • ROI doesn’t have any effect if in pick mode, so we hide the dropdown in that case (#511)

  • Make tileshape parameter of HDF5 DataSet optional (#578)

  • Open browser after starting the server. Enabled by default, can be disabled using --no-browser (#81, #580)

  • Implement libertem.udf.masks.ApplyMasksUDF as a replacement of ApplyMasksJob (#549, #550)

  • Implement libertem.udf.raw.PickUDF as a replacement of PickFrameJob (#549, #550)

Bug fixes

  • Fix FRMS6 in a distributed setting. We now make sure to only do I/O in methods that are running on worker nodes (#531).

  • Fixed loading of nD HDF5 files. Previously the HDF5 DataSet was hardcoded for 4D data. Now, arbitrary dimensions should be supported (#574, #567)

  • Fix DaskJobExecutor.run_each_host. Need to pass pure=False to ensure multiple runs of the function (#528).


  • Remove support for HDFS, because it is currently not tested (and to my knowledge also not used) and the upstream hdfs3 project is not actively maintained. ClusterDataSet or CachedDataSet should be used instead (#38, #534).


  • Depend on distributed>=2.2.0 because of an API change. (#577)

  • All analyses ported from Job to UDF back-end. The Job-related code remains for now for comparison purposes (#549, #550)

Job API deprecation

The original Job API of LiberTEM is superseded by the new User-defined functions API with release 0.4.0. See #549 for a detailed overview of the changes. The UDF API brings the following advantages:

  • Support for regions of interest (ROIs).

  • Easier to implement, extend and re-use UDFs compared to Jobs.

  • Clean separation between back-end implementation details and application-specific code.

  • Facilities to implement non-trivial operations, see User-defined functions: advanced topics.

  • Performance is at least on par.

For that reason, the Job API has become obsolete. The existing public interfaces, namely libertem.api.Context.create_mask_job() and libertem.api.Context.create_pick_job(), will be supported in LiberTEM for two more releases after 0.4.0, i.e. including 0.6.0. Using the Job API will trigger deprecation warnings starting with this release. The new ApplyMasksUDF replaces ApplyMasksJob, and PickUDF replaces PickFrameJob.

The Analysis classes that relied on the Job API as a back-end are already ported to the corresponding UDF back-end. The new back-end may lead to minor differences in behavior, such as a change of returned dtype. The legacy code for using a Job back-end will remain until 0.6.0 and can be activated during the transition period by setting analysis.TYPE = 'JOB' before running.

From ApplyMasksJob to ApplyMasksUDF

Main differences:

  • ApplyMasksUDF returns the result with the first axes being the dataset’s navigation axes. The last dimension is the mask index. ApplyMasksJob used to return transposed data with flattened navigation dimension.

  • Like all UDFs, running an ApplyMasksUDF returns a dictionary. The result data is accessible with key 'intensity' as a BufferWrapper object.

  • ROIs are supported now, like in all UDFs.
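
The change in result layout can be illustrated with plain NumPy. This is a sketch under assumptions, not LiberTEM code: the shapes are hypothetical and the Job-style array is a stand-in built from the description above (mask index first, navigation dimension flattened).

```python
import numpy as np

ny, nx, n_masks = 3, 5, 2  # hypothetical navigation shape and mask count

# Stand-in for an ApplyMasksJob result: mask index first,
# navigation dimension flattened.
job_style = np.arange(n_masks * ny * nx, dtype=np.float64).reshape(n_masks, ny * nx)

# Equivalent ApplyMasksUDF layout: navigation axes first, mask index last.
udf_style = job_style.T.reshape(ny, nx, n_masks)

# The image for the first mask is the same in both layouts.
first_mask_job = job_style[0].reshape(ny, nx)
first_mask_udf = udf_style[..., 0]
```

Code that indexed the Job result by mask first needs to index the last axis of the UDF result instead.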

Previously with ApplyMasksJob:

# Deprecated!
mask_job = ctx.create_mask_job(
  factories=[all_ones, single_pixel],
  dataset=dataset
)
mask_job_result = ctx.run(mask_job)


Now with ApplyMasksUDF:

mask_udf = libertem.udf.masks.ApplyMasksUDF(
  mask_factories=[all_ones, single_pixel]
)
mask_udf_result = ctx.run_udf(dataset=dataset, udf=mask_udf)

plt.imshow(mask_udf_result['intensity'].data[..., 0])

From PickFrameJob to PickUDF

PickFrameJob allowed picking arbitrary contiguous slices in both the navigation and the signal dimension. In practice, however, it was mostly used to extract single complete frames. PickUDF allows picking the complete signal dimension from an arbitrary non-contiguous region of interest in navigation space by specifying a ROI.

If necessary, more complex subsets of a dataset can be extracted by constructing a suitable subset of an identity matrix for the signal dimension and using it with ApplyMasksUDF and the appropriate ROI for the navigation dimension. Alternatively, it is now easily possible to implement a custom UDF for this purpose. Performing the complete processing through a UDF on the worker nodes instead of loading the data to the central node may be a viable alternative as well.
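
The identity matrix approach can be sketched with plain NumPy. The signal shape, the selected pixel indices and the stand-in frame are assumptions for illustration; the dot product below only mimics what applying a mask to a frame computes.

```python
import numpy as np

sig_shape = (4, 4)  # hypothetical signal shape
n_sig = sig_shape[0] * sig_shape[1]

# Flat indices of the signal pixels to extract (illustrative choice).
wanted = [0, 5, 15]

# Rows of the identity matrix: each row is a mask with a single 1.
masks = np.eye(n_sig)[wanted].reshape(len(wanted), *sig_shape)

# Stand-in for one frame of the dataset.
frame = np.arange(n_sig, dtype=np.float64).reshape(sig_shape)

# Applying a mask is a dot product of mask and frame, so each
# single-pixel mask returns exactly the value of its pixel.
extracted = np.tensordot(masks, frame, axes=([1, 2], [0, 1]))
```

With ApplyMasksUDF, each of these masks would presumably be supplied through a factory function in mask_factories, and the ROI would restrict the navigation positions.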

PickUDF now returns data in the native dtype of the dataset. Previously, PickFrameJob converted to floats.
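
Code that relied on float output may need an explicit conversion. The following NumPy sketch mimics picking frames with a non-contiguous ROI and shows the dtype difference. The array is a stand-in for a dataset, and float32 is an arbitrary choice, since the float type used by PickFrameJob is not stated here.

```python
import numpy as np

# Stand-in for a 4D dataset: 2x3 navigation, 4x4 signal, integer dtype.
data = np.arange(2 * 3 * 4 * 4, dtype=np.uint16).reshape(2, 3, 4, 4)

# Non-contiguous region of interest in navigation space.
roi = np.zeros((2, 3), dtype=bool)
roi[0, 0] = True
roi[1, 2] = True

# Picking complete frames keeps the native dtype ...
picked = data[roi]

# ... while an explicit conversion yields float output
# (float32 is an arbitrary choice for this sketch).
picked_float = picked.astype(np.float32)
```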

Using libertem.api.Context.create_pick_analysis() continues to be the recommended convenience function to pick single frames.

Restructuring into sub-packages

We are currently restructuring LiberTEM into packages that can be installed and used independently, see #261. This will be a longer process and changes the import locations.

For a transition period, importing from the previous locations is supported but will trigger a FutureWarning. See Show deprecation warnings on how to activate deprecation warning messages, which is strongly recommended while the restructuring is ongoing.
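
As a general Python illustration, the standard warnings module controls which warning categories are reported. The function below is a hypothetical stand-in for an import from a previous location, not a LiberTEM function.

```python
import warnings


def old_import_location():
    # Hypothetical stand-in for code that has moved to a new package.
    warnings.warn("moved to a new package", FutureWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # Report every warning, including categories hidden by default.
    warnings.simplefilter("always")
    value = old_import_location()

categories = [w.category for w in caught]
```

In a script or notebook, calling warnings.simplefilter("default") early on, or starting Python with -W default, has a similar effect without recording the warnings.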

0.3.0 / 2019-12-12

New features

  • Make OOP based composition and subclassing easier for CorrelationUDF (#466)

  • Introduce plain circular match pattern Circular (#469)

  • Distributed sharded dataset ClusterDataSet (#136, #457)

  • Support for caching data sets CachedDataSet from slower storage (NFS, spinning metal) on fast local storage (#471)

  • Clustering analysis (#401, #408 by @kruzaeva).

  • DM3/DM4 dataset implementation based on ncempy (#497)
    • Adds a new map() executor primitive. Used to concurrently read the metadata for DM3/DM4 files on initialization.

    • Note: no support for the web GUI yet, as the naming patterns for DM file series vary wildly. Needs changes in the file dialog.

  • Speed up of up to 150x for correlation-based peak refinement in libertem.udf.blobfinder.correlation with a Numba-based pipeline (#468)

  • Introduce FullFrameCorrelationUDF which correlates a large number (several hundred) of small peaks (10x10) on small frames (256x256) faster than FastCorrelationUDF and SparseCorrelationUDF (#468)

  • Introduce UDFPreprocessMixin (#464)

  • Implement iterator over AnalysisResultSet (#496)

  • Add hologram simulation libertem.utils.generate.hologram_frame() (#475)

  • Implement Hologram reconstruction UDF libertem.udf.holography.HoloReconstructUDF (#475)

Bug fixes

  • Improved error and validation handling when opening files with GUI (#433, #442)

  • Clean-up and improvements of libertem.analysis.fullmatch.FullMatcher (#463)

  • Ensure that RAW dataset sizes are calculated as int64 to avoid integer overflows (#495, #493)

  • Resolve shape mismatch issue and simplify dominant order calculation in Radial Fourier Analysis (#502)

  • Actually pass the enable_direct parameter from web API to the DataSet
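
The integer overflow addressed in the RAW dataset fix above can be reproduced with plain NumPy. This is an illustrative sketch with a hypothetical dataset shape, not the LiberTEM implementation.

```python
import numpy as np

# Hypothetical RAW dataset: 256x256 scan of 256x256 frames.
shape = (256, 256, 256, 256)

# With 32 bit integers the element count wraps around silently.
count_32 = np.prod(shape, dtype=np.int32)

# 64 bit integers hold the correct value.
count_64 = np.prod(shape, dtype=np.int64)
size_bytes = count_64 * np.dtype(np.float32).itemsize
```

Here count_32 wraps around to 0, while count_64 holds the correct 2**32 elements.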



  • The Job interface is planned to be replaced with an implementation based on UDFs in one of the upcoming releases.


  • Split up the blobfinder code between several files to reduce file size (#468)

0.2.2 / 2019-10-14

Point release to fix a number of minor issues, most notably PR #439 that should have been merged for version 0.2.

Bug fixes

  • Trigger a timeout when guessing parameters for HDF5 takes too long (#440, #449)

  • Slightly improved error and validation handling when opening files with GUI (ec74c13)

  • Recognize BLO file type (#432)

  • Fixed a glitch where negative peak elevations were possible (#446)

  • Update examples to match 0.2 release (#439)

0.2.1 / 2019-10-07

Point release to fix a bug in the Zenodo upload for production releases.

0.2.0 / 2019-10-07

This release constitutes a major update after almost a year of development. Systematic change management starts with this release.

This is the release message:

User-defined functions

LiberTEM 0.2 offers a new API to define a wide range of user-defined reduction functions (UDFs) on distributed data. The interface and implementation offer a number of unique features:

  • Reductions are defined as functions that are executed on subsets of the data. That means they are equally suitable for distributed computing, for interactive display of results from a progressing calculation, and for handling live data¹.

  • Interfaces adapted to both simple and complex use cases: From a simple map() functionality to complex multi-stage reductions.

  • Rich options to define input and output data for the reduction functions, which helps to implement non-trivial operations efficiently within a single pass over the input data.

  • Composition and extension through object oriented programming

  • Interfaces that allow highly efficient processing: locality of reference, cache efficiency, memory handling
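
The idea of defining a reduction on subsets of the data can be sketched with plain NumPy. The function names process_partition and merge are illustrative and not the LiberTEM UDF interface; the data is random stand-in data.

```python
import numpy as np

# Stand-in for a dataset: 6 frames of 4x4 pixels, split into 3 partitions.
rng = np.random.default_rng(0)
frames = rng.random((6, 4, 4))
partitions = np.array_split(frames, 3)


def process_partition(part):
    # Reduction on a subset of the data: sum over the frames of one partition.
    return part.sum(axis=0)


def merge(dest, src):
    # Combine a partial result into the running total.
    dest += src


total = np.zeros((4, 4))
for part in partitions:
    merge(total, process_partition(part))
    # 'total' is a valid intermediate result here and could be displayed.
```

Because each partition is processed independently, the partial results can be computed on different workers and merged in any order, which is what makes the same definition usable for distributed, progressive and live processing.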



A big shoutout to Alex (@sk1p) who developed it! 🏆

¹User-defined functions will work on live data without modification as soon as LiberTEM implements back-end support for live data, expected in 2020.

Support for 4D STEM applications

In parallel to the UDF interface, we have implemented a number of applications that make use of the new facilities:

  • Correlation-based peak finding and refinement for CBED (credit: Karina Ruzaeva @kruzaeva)

  • Strain mapping

  • Clustering

  • Fluctuation EM

  • Radial Fourier Series (advanced Fluctuation EM)


Extended documentation

We have greatly improved the coverage of our documentation.

Fully automated release pipeline

Alex (@sk1p) invested a great deal of effort into fully automating our release process. From now on, we will be able to release more often, including service releases. 🚀

Basic dask.distributed array integration

LiberTEM can generate efficient dask.distributed arrays from all supported dataset types with this release. That means it should be possible to use our high-performance file readers in applications outside of LiberTEM.

File formats

Support for various file formats has improved.

0.1.0 / 2018-11-06

Initial release of a minimum viable product and proof of concept.

Support for applying masks with high throughput on distributed systems with interactive web GUI display and scripting capability.