Changelog
0.15.0.dev0
0.14.1 / 2024-06-19
Support for numpy 2.x. Note that some examples may not yet run, as the
dependencies might not be numpy 2 compatible. For these cases, please downgrade
to numpy<2
(#1656, #1653).
0.14.0 / 2024-05-16
The 0.14.0 release adds a system for UDFs to be aware of which parts of a dataset have already been processed during merging and post-processing, and to mark parts of results as invalid when the processing would otherwise be incorrect. This feature supports the plotting interface, by allowing a UDF to mark outliers in results to be ignored, as well as computations such as discrete derivatives, where the result at the boundaries of the processed data is undefined.
This release also increases the minimum supported Python version to 3.9, in line with a range of other projects. Older versions of LiberTEM remain available in situations where Python cannot be upgraded.
Many thanks to everyone who has contributed to this release!
Features
Add a system for UDFs to be aware of which parts of a dataset have already been processed during merging and post-processing, and to mark parts of results as invalid. See the corresponding documentation for details (#1449, #1593).
Add a new snooze function to libertem-server; the argument --snooze-timeout will free resources after a duration of inactivity, which is especially useful when deploying LiberTEM as a service on a multi-user system (#1570, #1572). See the usage sketch after this list.
In CoMUDF, added field_y and field_x to results, which are slices of their respective components in the field result (#1608).
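A minimal usage sketch for the new snooze option; the concrete timeout value and its unit (assumed here to be seconds) should be checked against the libertem-server documentation:
# free worker resources after a period of inactivity
# (the value is illustrative; check the documentation for the unit)
libertem-server --snooze-timeout 600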
Bugfixes
Fix loading of EMPAD data acquired in “search” mode. In some data sets, the actual raw data acquired corresponds to the mode="search" scan parameters, and not mode="acquire". The correct mode can be determined by comparing the file size and file name (#1617, #1620).
Fix the CoMUDF automatic offset and gradient compensation to yield reasonable results during live processing (#1593).
Manually add a MIME type for JavaScript on Windows; this is needed to fix setups with an invalid registry entry (#1612, #1613).
Documentation
Describe how to install LiberTEM safely on air-gapped systems. (#1627).
Obsolescence
Remove support for Python 3.7 and Python 3.8 following numpy and other projects (#1606, #1607).
The modules libertem.analysis.gridmatching and libertem.analysis.fullmatch are now available from libertem_blobfinder.common.gridmatching and libertem_blobfinder.common.fullmatch since libertem-blobfinder version 0.6. The original import locations trigger a deprecation warning and are slated for removal in LiberTEM 0.16. (#1469, #1600).
0.13.1 / 2023-11-09
The 0.13.1 point-release fixes an issue with loading assets when
connecting to the libertem-server
web interface via a proxy (#1554).
0.13.0 / 2023-11-09
The 0.13.0 release adds a number of improvements to the libertem-server
web interface and its user experience, as well as laying the foundation
for descan error compensation in all UDFs based on integration over masked
regions of a frame.
Many thanks to everyone who has contributed to this release!
Web Interface
A progress bar has been added to the web GUI to give feedback on the state of any in-flight computations. (#1500, #1514)
The cluster launcher in the web interface now allows specification of multiple workers-per-GPU, allowing better utilisation of large GPUs. (#1489, #1499)
Direct opening of datasets from URL parameters. The web interface now responds to arguments passed to it by URL, namely instructions to load a particular dataset, e.g. #action=open&path=/path/to/your/data/ (#1085, #1518).
When launching libertem-server, additional parameters are now accepted to preload a cluster with a given number of CPU and GPU workers, and also preload data. See the libertem-server documentation for more detail (#1419, #1535).
Descan compensation
The backend and interface of ApplyMasksUDF have been extended to allow compensation for descan error in the form of a shifts parameter. The shifts information is used to translate any applied masks relative to the given frame being processed. This lays the groundwork for future support for shifts in all UDFs which rely on the backend of ApplyMasksUDF, including CoMUDF and all virtual imaging analyses. (#1304)
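A hedged sketch of how the shifts parameter could be used; passing the per-frame shifts as a plain array and the (dy, dx)-per-scan-position shape convention are assumptions, check the ApplyMasksUDF documentation for the authoritative interface:
import numpy as np
from libertem import api as lt
from libertem.udf.masks import ApplyMasksUDF

ctx = lt.Context.make_with("inline")
data = np.random.random((8, 8, 64, 64)).astype(np.float32)
ds = ctx.load("memory", data=data, sig_dims=2)

def disk_mask():
    # simple circular virtual detector centred on the frame
    y, x = np.ogrid[:64, :64]
    return ((y - 32) ** 2 + (x - 32) ** 2) < 16 ** 2

# assumed: one (dy, dx) descan shift per scan position, here all zero
shifts = np.zeros(tuple(ds.shape.nav) + (2,), dtype=np.float32)

udf = ApplyMasksUDF(mask_factories=[disk_mask], shifts=shifts)
res = ctx.run_udf(dataset=ds, udf=udf)
print(res['intensity'].data.shape)  # one value per scan position and mask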
UDF Interface
The UDF interface was extended to allow a UDF to declare multiple processing methods (process_tile, process_frame, etc.) and choose at runtime the most appropriate one to use. See the UDF documentation for more detail. (#1508, #1509)
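A sketch of a UDF declaring both a tile-wise and a frame-wise implementation of the same reduction; how the runner (or an overridden selection hook, see the UDF documentation) picks between them at runtime is not shown here:
import numpy as np
from libertem.udf import UDF

class FlexibleSumUDF(UDF):
    """Sum over the signal axes, provided both tile-wise and frame-wise."""

    def get_result_buffers(self):
        return {'intensity': self.buffer(kind='nav', dtype=np.float32)}

    def process_tile(self, tile):
        # tile covers a stack of (partial) frames: accumulate the signal sum
        self.results.intensity[:] += np.sum(tile, axis=(-1, -2))

    def process_frame(self, frame):
        # equivalent single-frame implementation
        self.results.intensity[:] = np.sum(frame)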
Misc
A function libertem.contrib.convert_transposed.convert_dm4_transposed() has been added to efficiently convert Gatan Digital Micrograph STEM datasets stored in (sig, nav) ordering to numpy .npy files in (nav, sig) ordering (#1520).
Several Exception types were moved to libertem.common for MIT license compatibility. They are re-exported to the old import location for backwards compatibility (#1543).
The Shape class is now hashable, allowing it to be used as key in a dict and set. (#1507).
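A small illustration of the now-hashable Shape; the cache dictionary is purely illustrative:
from libertem.common import Shape

a = Shape((128, 128, 256, 256), sig_dims=2)
b = Shape((128, 128, 256, 256), sig_dims=2)

# Shape objects can now be used as dict keys and set members
cache = {a: "some cached value"}
assert b in cache       # equal shapes hash identically
assert len({a, b}) == 1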
0.12.0 / 2023-08-16
The 0.12 release introduces a dedicated Centre-of-Mass analysis UDF with additional features compared to the existing create_com_analysis() API. The dedicated UDF allows more straightforward live processing support, and can be subclassed as in the LiberTEM-iCoM project.
Also included are two usability improvements aimed at simplifying interaction with LiberTEM:
ctx.export_dataset() for conversion of any supported LiberTEM dataset to the Numpy binary format .npy.
lt.Context.make_with(cpus=n, gpus=m) to simplify creation of a LiberTEM Context with a specific number of workers (rather than using all available resources by default).
This release also contains numerous fixes and backend changes across the codebase, improving the robustness of support for both sparse data and live processing in particular.
Many thanks to everyone who has contributed to this release!
The 0.12.0 release is also the first major release for which LiberTEM is available on conda-forge. See the installation documentation for installation instructions using pip or conda.
Features
Adds CoMUDF as an extension of create_com_analysis(), with support for automatic correction of flat field effects (beam offset and descan error) (#1392).
Adds libertem.api.Context.export_dataset() for export of any supported LiberTEM dataset to another format. At this time only exporting to the Numpy binary format .npy is supported, but other formats can later be added according to need (#1379).
The libertem.api.Context.make_with() constructor method has been improved to allow simple specification of the number of CPU and GPU workers to use, as well as the type of executor to create (#1443). See the sketch after this list.
Experimental API for dynamic updates to parameters while UDFs are running, allowing for faster feedback loops during live processing (#1441).
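A hedged sketch combining the new conveniences; the second argument of export_dataset() (the output path) is an assumption about the exact signature, and the in-memory dataset is a placeholder for real data:
import numpy as np
from libertem import api as lt
from libertem.udf.com import CoMUDF

# create a Context with an explicit number of CPU workers
ctx = lt.Context.make_with(cpus=4)

data = np.random.random((8, 8, 32, 32)).astype(np.float32)
ds = ctx.load("memory", data=data, sig_dims=2)

# centre-of-mass with default parameters
res = ctx.run_udf(dataset=ds, udf=CoMUDF())
print(list(res.keys()))

# export the dataset to the NumPy binary format
ctx.export_dataset(ds, "./exported.npy")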
Miscellaneous
The PipelinedExecutor executor now schedules tasks on the worker with the smallest request queue size (#1451).
Update PipelinedExecutor to properly match tasks to workers based on resources requested, including GPU workers (#1453).
The UDF runner internals will now raise UDFRunCancelled if a UDF run is cancelled, allowing the user to handle this case (#1448).
Introduce option for sparse.GCXS in MaskContainer (#1447).
Bugfixes
We now properly forward the logging level to dask.distributed workers during initialization. This prevents a substantial amount of logging information being printed to stderr at cluster startup. (#1438).
When processing raw_csr datasets, we now avoid upcasting to float64 when reading an indptr of type uint64 (#1465).
Deployment
Starting with version 0.12, LiberTEM container images will be available from the GitHub container registry.
0.11.2 / 2023-07-04
This point release works around issue #597 in hdbscan, which is an incompatibility with the latest scikit-learn release 1.3.0.
0.11.1 / 2023-05-11
This point release works around issue #8940 in numba 0.57.0, which can cause crashes under certain circumstances.
0.11.0 / 2023-04-21
This release introduces first-class support for sparse input data and processing using sparse-compatible libraries, primarily oriented towards data from event-based detectors. This support is provided through the new sparseconverter package, which was developed to enable efficient inter-conversion between different sparse formats and dense arrays. Many thanks in particular to Alexander Clausen and Dieter Weber for their extensive work on sparse support. See Sparse arrays for details on how to use it in your UDFs, and Raw binary files in sparse CSR format for a sparse input file format!
This release also marks the end of official support for Python 3.6 (#1369, #1380). At this time the maximum supported Python version remains 3.10 owing to usage of the Numba library within LiberTEM.
Sparse processing
Other features
Single-file Gatan Digital Micrograph (DM3/DM4) STEM datasets are now supported in the API and in the web interface. The files must use the newer C-style ordering available in Digital Micrograph to be read correctly. (#1401)
The information provided by the ctx.run_udf progress bar has been improved with framewise updates, allowing feedback on progress before a partition completes. (#1341)
HDF5 datasets can now perform on-the-fly reshaping of the scan grid and can be adjusted using the sync_offset parameter. (#441, #1364)
Bugfixes
Numerous bugs were fixed in the behaviour of the PipelinedExecutor, to avoid crashes and deadlocks throughout the lifecycle of the object, particularly during live processing. (#1308, #1311, #1316, #1318, #1319, #1342)
The Context object will now attempt to close itself cleanly at process exit (#1343).
The handling of sync_offset, nav_shape, sig_shape and sig_dims in libertem.io.dataset.memory.MemoryDataSet is now consistent with other datasets. (#1207)
Bugs were fixed in libertem.udf.stddev.StdDevUDF for both complex input data and when some partitions are empty. (#1314)
Miscellaneous
Make self.meta.coordinates available in UDF.get_task_data (#1397).
The methods libertem.executor.pipelined.PipelinedExecutor.make_spec() and libertem.executor.dask.cluster_spec() both now accept integers for their cpus and cudas arguments, in addition to the existing iterable forms. (#1294, #1336).
Reduced the overhead of (re-)constructing Slice and Shape objects when slicing tiles, in particular focused on the method BufferWrapper.get_contiguous_view_for_tile(). (#1313, #1321)
Web interface
Datasets which don’t declare their nav shape will be interpreted as square if their number of frames is a square number (#1309, #1338).
Raise if user input for numWorker is a non-positive integer or another error is encountered in DaskJobExecutor creation (#1334).
The dataset loader Reset button now correctly resets the sync_offset field, if present (#1378).
The function detect to automatically determine dataset type will now use the file suffix as a hint to choose its search order. This may lead to faster responses in the web client when configuring a new dataset. (#1377)
Breaking changes
Instead of DataTile objects, a UDF's processing method will receive plain array objects, such as numpy.ndarray, sparse.SparseArray etc. That means the scheme_idx and tile_slice attributes are not available from the tile anymore, but only from the corresponding libertem.udf.base.UDFMeta.tiling_scheme_idx and libertem.udf.base.UDFMeta.slice. This change makes handling different array types such as sparse arrays or CuPy arrays easier. For CuPy arrays this was already the previous behavior.
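A sketch of the adjusted pattern; the tile slice and scheme index now come from self.meta instead of the tile object (a minimal example, assuming a two-dimensional signal):
import numpy as np
from libertem.udf import UDF

class TileSumUDF(UDF):
    def get_result_buffers(self):
        return {'intensity': self.buffer(kind='nav', dtype=np.float32)}

    def process_tile(self, tile):
        # tile is now a plain array (numpy.ndarray, sparse array, ...)
        tile_slice = self.meta.slice               # was tile.tile_slice
        scheme_idx = self.meta.tiling_scheme_idx   # was tile.scheme_idx
        # both are available here for custom logic; the reduction itself:
        self.results.intensity[:] += np.sum(tile, axis=(-1, -2))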
0.10.0 / 2022-07-28
This release features the Pipelined executor for parallel live data processing (#1267). This change greatly improves the processing performance for live data, in particular to support detectors with high data rate. Many thanks to Alexander Clausen and Matthew Bryan for their work! The corresponding capabilities in the LiberTEM-live package will be released soon and announced separately.
Other changes:
New features
Support for Python 3.10.
New NumPy files (NPY) dataset for reading NumPy .npy files (#222, #1249).
Support for updated EMPAD XML format, including series (#1259, #1260).
Integrate tracing using OpenTelemetry, which allows debugging and tracing distributed operation of LiberTEM (#691, #1266).
libertem-server picks a free port if the default is in use and no port was specified (#1184, #1279).
cluster_spec() now accepts the same CUDA device ID multiple times to spawn multiple workers on the same GPU. This can help increase GPU resource utilisation for some workloads (#1270).
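A hedged sketch of spawning two workers on the same GPU; the exact cluster_spec() signature (in particular has_cupy) and the make_local(spec=...) usage are assumptions based on the executor documentation:
from libertem import api as lt
from libertem.executor.dask import DaskJobExecutor, cluster_spec

# four CPU workers plus two workers that share CUDA device 0
spec = cluster_spec(cpus=range(4), cudas=[0, 0], has_cupy=True)
executor = DaskJobExecutor.make_local(spec=spec)
ctx = lt.Context(executor=executor)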
Bugfixes
Correct type determination in AutoUDF (#1298).
Fix non-square plots (#1255).
Disable the Dask profiler due to issues with the DM dataset (#1289).
Fix GUI glitch in center of mass analysis (#1278).
Documentation
Miscellaneous
Include tests in PyPI release to prepare release on conda-forge, and exclude unneeded files. (#1271, #1275, #1276).
Move some code around to make sure that libertem.io and libertem.common only depend on code that is compatible with the MIT license. Moved items are re-imported at the same positions as before to keep backwards compatibility (#1031, #1245).
0.9.2 / 2022-04-28
This is a bugfix release with two small fixes:
0.9.0 / 2022-02-17
We are most happy to announce full Dask array integration with this release! Many thanks to Matthew Bryan who implemented major parts of this non-trivial feature. Most notably, HyperSpy lazy signals and LiberTEM can now be combined seamlessly. See Dask integration for details and an example!
This enables the following applications:
Use HyperSpy file readers and other readers that create Dask arrays for LiberTEM.
Create an ad-hoc file reader for LiberTEM by just building a Dask array. This is often simpler than implementing a native LiberTEM dataset, at the expense of performance.
Use LiberTEM file readers for HyperSpy and other software that works with Dask arrays.
Use the same implementation of an algorithm for live processing with LiberTEM, offline processing with LiberTEM, and offline processing with HyperSpy.
Simplify implementation of complex processing routines on Dask arrays. That includes, for example, routines that are not purely implemented with NumPy array operations and produce complex output or are not compatible with all Dask array chunking schemes. Here, LiberTEM UDFs offer a more powerful and versatile interface than Dask’s native map_blocks() interface.
Chain processing steps together using Dask arrays for intermediate results, including using the output of one UDF as input for another UDF. Dask arrays allow working with large intermediate results efficiently since they can remain on the workers.
Specifically, the Dask integration encompasses the following features:
Create LiberTEM datasets from Dask arrays via the Dask dataset type (#1137). See the sketch after this list.
Create Dask arrays from LiberTEM UDF results using the DelayedJobExecutor. A UDF can define a merge_all() method in addition to the usual merge() to improve performance. See Merge function for Dask array results for details (#1170)!
Create Dask arrays directly from LiberTEM datasets using libertem.contrib.daskadapter.make_dask_array(), which is already possible since release 0.2.
Executor options to improve integration, see Scheduler and Executors (#1170, #1146, #922).
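A hedged sketch of the round trip from a Dask array into LiberTEM; the way the array is passed to ctx.load('dask', ...) and the sig_dims parameter follow the Dask dataset documentation as assumptions:
import dask.array as da
from libertem import api as lt
from libertem.udf.sum import SumUDF

ctx = lt.Context.make_with("inline")

# wrap a (nav, nav, sig, sig) shaped Dask array as a LiberTEM dataset
arr = da.random.random((16, 16, 64, 64), chunks=(4, 16, 64, 64))
ds = ctx.load("dask", arr, sig_dims=2)

# run a UDF on it like on any other dataset
res = ctx.run_udf(dataset=ds, udf=SumUDF())
print(res['intensity'].data.shape)  # (64, 64)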
Please note that these features are still experimental and cover a large space of possible uses and parameters. Expect the unexpected! Tests, feedback and improvements are highly appreciated.
Other changes in this release:
New features
Experimental helper function libertem.analysis.com.guess_corrections() to guess parameters for Center of Mass analysis (#1111).
GUI interface for the COM analysis to call libertem.analysis.com.guess_corrections() and update the GUI parameters from the result (#1172).
Support for some MIB Quad formats. All integer formats should be supported and were tested with 1x1 and 2x2 layouts. Raw formats with 1x1 and 2x2 layouts using 1 bit, 6 bit, and 12 bit counter depth are supported as well. Support for raw MIB data in other layouts and bit depths can be added on demand (#1169, #1135).
New attributes libertem.udf.base.UDFMeta.sig_slice and libertem.udf.base.UDFMeta.tiling_scheme_idx. These attributes can be used for performant access to the current signal slice - mostly important for throughput-limited analysis (#1167, #1166).
New --preload option to libertem-server and libertem-worker. That makes it work as documented in HDF5, following Dask worker preloading (#1151).
Allow selection of I/O backend in GUI and Python API (#753, #896, #1129).
Re-add support for direct I/O. It was previously only supported as a special case for raw files on Linux. Now it is supported for all native dataset formats we support on Linux and Windows. Notable exceptions are the OS X platform or HDF5, MRC, and SER formats (#1129, #753).
Support for reading TVIPS binary files, i.e. *_NNN.tvips files (#1179).
Bugfixes
Allow running CoM analysis on a linescan dataset by only returning divergence and curl if they are defined (#1138, #1139).
make_dask_array now works correctly when a roi is specified (#933).
Correct shape of buffer views in process_tile() when the tile has depth 1 (#1215).
Documentation
Miscellaneous
A Docker image with a LiberTEM installation is available on DockerHub now. See Containers for details (#1144, #484).
Improve performance with large UDF parameters (#1143).
Start using libertem.preload again and import hdf5plugin if present so that users don’t have to specify this common selection of HDF5 filters as preload themselves (#1160).
0.8.0 / 2021-10-04
This release mainly contains improvements of center of mass / first moment analysis and support for starting the web GUI from JupyterHub or JupyterLab.
New features
Support for center of mass with annular masks in create_com_analysis(), COMAnalysis and the GUI (#633, #1089).
Support in the GUI for specifying rotation of scan against detector and flipping the detector y axis (#1087, #31). Previously this was only supported in the Python API.
Tweaks and instructions for JupyterHub and JupyterLab integration in LiberTEM, see Jupyter integration (#1074). New package LiberTEM/LiberTEM-jupyter-proxy for interfacing.
In the web API, support was added to re-run visualization only, without re-running UDFs for an analysis. This allows for almost instant feedback for some operations, like changing CoM parameters.
Added token-based authentication. For now, it is only usable via integrations like Jupyter. It will be extended to local/manual usage later (#1074, #1097). Please comment on #1097 if local/manual use would be beneficial for you so that it is prioritized accordingly.
SEQ dataset: Added support for loading excluded pixels from XML (#805, #1077). See SEQDataSet for more information. Also support both *.seq.seq and *.seq as extension for the main SEQ file to find files with matching base name that contain correction data (#1120, #1121).
Bugfixes
Assert that the files argument to DMDataSet is actually a list or tuple, to prevent iterating over a string path (#1058).
Escape globs to support special characters in file names for multi-file datasets (#1066, #1067).
Make sure multithreading in the main process still works properly after launching a Context (#1053, #1100).
Allow custom plots to return RGB as plot data, for example a color wheel for vector fields (#1052, #1101).
Adjust partition count to match the number of CPU compute workers, not total workers to prevent residual partitions (#1086, #1103).
Fix memory leak: Don’t submit dynamically generated callables directly to the distributed cluster, as they are cached in an unbounded cache (#894, #964, #1119).
Documentation
Note on handling HDF5 files with non-standard compression in H5DataSet (#1059).
Link to two more public datasets: High-resolution 4D STEM dataset of SrTiO3 and Synthetic 4D STEM dataset based on a SrTiO3 supercell (#1073).
Misc
Make sure tasks are scheduled dynamically on available workers if they have uneven run time to benefit more from GPUs (#1107).
Cache loaded libraries to reduce overhead of setting the thread count (#1117, #1118).
Many thanks to our new contributors: Levente Puskás for the excluded pixel loading and Matthew Bryan for figuring out non-standard compression in HDF5 and improving DM input validation. Congratulations to Alex for closing the long-standing CoM issue #31 and for enabling easy and secure access to the web interface on shared IT infrastructure.
0.7.1 / 2021-07-08
This is a bugfix release that ensures compatibility with the upcoming numba 0.54 release.
Our custom numba caching makes some assumptions about numba internals, which have changed in numba 0.54. This fixes compatibility with numba 0.54, and also makes sure we fail gracefully for future changes (#1060, #1061).
0.7.0 / 2021-06-10
This release introduces features that are essential for live data processing, but can be used for offline processing as well: Live plotting, API for bundled execution of several UDFs in one run, iteration over partial UDF results, and asynchronous UDF execution. Features and infrastructure that are specific to live processing are included in the LiberTEM-live package, which will be released soon.
New features
Support for postprocessing of results on the main node after merging partial results. This adds get_results() and the use parameter to buffer(). See Post-processing after merging for details (#994, #1003, #1001).
Obtain partial results from each merge step iteratively as a generator using run_udf_iter(). See Partial results and an example for details (#1011)! A sketch follows after this list.
Run multiple UDFs in one pass over a single DataSet by passing a list of UDFs instead of one UDF in run_udf() and run_udf_iter() (#1011).
Allow usage from an asynchronous context with the new sync=False argument to run_udf() and run_udf_iter(). See Partial results and an example for details (#216, #1011)!
Live plotting using the new plots parameter for run_udf() and run_udf_iter(), as well as live plotting classes documented in Visualization. Pass plots=True for simple usage. See Live Plotting as well as an example for the various possibilities for advanced usage (#980, #1011).
Allow some UDF-internal threading. This is mostly interesting for ad-hoc parallelization on top of the InlineJobExecutor and live processing that currently relies on the InlineJobExecutor for simplicity, but could also be used for hybrid multiprocess/multithreaded workloads. Threads for numba, pyfftw, OMP/MKL are automatically controlled. The executor makes the number of allowed threads available as libertem.udf.base.UDFMeta.threads_per_worker for other threading mechanisms that are not controlled automatically (#993).
K2IS: reshaping, sync offset and time series support. Users can now specify a nav_shape, sig_shape and sync_offset for a K2IS data set, and load time series data (#1019, #911). Many thanks to @AnandBaburajan for implementing this feature!
Support for Python >=3.9.3, use Python 3.9 in AppImage (#914, #1037, #1039).
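A hedged sketch of running two UDFs in one pass and consuming partial results; the .buffers attribute of the partial result object is an assumption based on the linked documentation, and plots=True can be passed for live plotting:
import numpy as np
from libertem import api as lt
from libertem.udf.sum import SumUDF
from libertem.udf.sumsigudf import SumSigUDF

ctx = lt.Context.make_with("inline")
data = np.random.random((16, 16, 32, 32)).astype(np.float32)
ds = ctx.load("memory", data=data, sig_dims=2)

# two UDFs in a single pass; partial results arrive after each merge step
for partial in ctx.run_udf_iter(dataset=ds, udf=[SumUDF(), SumSigUDF()]):
    # assumed: partial.buffers holds one result mapping per UDF
    print(partial.buffers[0]['intensity'].data.max())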
Bugfixes
UDF: Consistently use attribute access in UDF.process_*(), UDF.merge(), UDF.get_results() etc. instead of mixing it with __getitem__() dict-like access. The previous method still works, but triggers a UserWarning (#1000, #1003).
Also allow non-sliced assignment, for example self.results.res += frame (#1000, #1003).
Better choice of kind='nav' buffer fill value outside ROI:
String: was 'n', now ''
bool: was True, now False
integers: was smallest possible value, now 0
objects: was np.nan, now None
(#1011)
Improve performance for chunked HDF5 files, especially compressed HDF5 files which have a chunking in both navigation dimensions. They were causing excessive read amplification (#984).
Fix plot range if only zero and one other value are present in the result, most notably boolean values (#944, #1011).
Fix axes order in COM template: The components in the field are (x, y) while the template had them as (y, x) before (#1023).
Documentation
Obsolescence
Removed deprecated blobfinder and FeatureVecMakerUDF as previously announced. Blobfinder is available as a separate package at https://github.com/liberTEM/LiberTEM-blobfinder. Instead of FeatureVecMakerUDF, you can use a sparse matrix and ApplyMasksUDF (#979).
Remove deprecated Job interface as previously announced. The functionality was ported to the more capable UDF interface (#978).
0.6.0 / 2021-02-16
We are pleased to announce the latest LiberTEM release, with many improvements since 0.5. We would like to highlight the contributions of our GSoC 2020 students @AnandBaburajan (reshaping and sync offset correction) and @twentyse7en (code generation to replicate GUI analyses in Jupyter notebooks), who implemented significant improvements in the areas of I/O and the user interface.
Another highlight of this release is experimental support of NVidia GPUs, both via CuPy and via native libraries. The API is ready to be used, including support in the GUI. Performance optimization is still to be done (#946). GPU support is activated for all mask-based analyses (virtual detector and Radial Fourier) for testing purposes, but will not bring a noticeable improvement of performance yet. GPU-based processing did show significant benefits for computationally heavy applications like the SSB implementation in https://github.com/Ptychography-4-0/ptychography.
A lot of work was done to implement tiled reading, resulting in a new I/O system. This improves performance in many circumstances, especially when dealing with large detector frames. In addition, a correction module was integrated into the new I/O system, which can correct gain, subtract a dark reference, and patch pixel defects on the fly. See below for the full changelog!
New features
I/O overhaul
Implement tiled reading for most file formats (#27, #331, #373, #435).
Allow UDFs that implement process_tile to influence the tile shape by overriding libertem.udf.base.UDF.get_tiling_preferences(), and make information about the tiling scheme available to the UDF through libertem.udf.base.UDFMeta.tiling_scheme. (#554, #247, #635)
Update MemoryDataSet to allow testing with different tile shapes (#634).
Added I/O backend selection (#896), which allows users to select the best-performing backend for their circumstance when loading via the new io_backend parameter of Context.load. This fixes a K2IS performance regression (#814) by disabling any readahead hints by default. Additionally, this fixes a performance regression (#838) on slower media (like HDDs) by adding a buffered reading backend that tries its best to linearize I/O per worker. GUI integration of backend selection is still to be done. See the sketch after this feature list.
For now, direct I/O is no longer supported; please let us know if this is an important use case for you (#716)!
Support for specifying logging level from CLI (#758).
Support for loading stacks of 3D DM files (#877). GUI integration still to be done.
GUI: Filebrowser improvements: users can star directories in the file browser for easy navigation (#772).
Support for running multiple UDFs “at the same time”, not yet exposed in public APIs (#788).
GUI: Users can add or remove scan size dimensions according to the dataset’s shape (#779).
GUI: Shutdown button to stop server, useful for example for JupyterHub integration (#786).
Infrastructure for consistent coordinate transforms is added in libertem.corrections.coordinates and libertem.utils. See also a description of coordinate systems in Concepts.
create_com_analysis() now allows specifying a flipped y axis and a scan rotation angle to deal with MIB files and scan rotation correctly. (#325, #786)
Corrections can now be specified by the user when running a UDF (#778, #831, #939).
Support for loading dark frame and gain map that are sometimes shipped with SEQ data sets.
GPU support: process data on CPUs, CUDA devices or both (#760, CuPy support).
Spinning out holography to a separate package is in progress: https://github.com/LiberTEM/LiberTEM-holo/
Implement CuPy support in HoloReconstructUDF, currently deactivated due to #815 (#760).
GUI: Allows the user to select the GPUs to use when creating a new local cluster (#812).
GUI: Support for downloading a Jupyter notebook corresponding to an analysis made by a user in the GUI (#801).
GUI: Copy the Jupyter notebook cells corresponding to the analysis directly from the GUI, including cluster connection details (#862, #863)
Allow reshaping datasets into a custom shape. The DataSet implementations (currently except HDF5 and K2IS) and GUI now allow specifying nav_shape and sig_shape parameters to set a different shape than the layout in the dataset (#441, #793).
All DataSet implementations handle missing data gracefully (#256, #793).
The DataSet implementations (except HDF5 and K2IS) and GUI now allow specifying a sync_offset to handle synchronization/acquisition problems (#793).
Users can access the coordinates of a tile/partition slice through coordinates (#553, #793).
Cache warmup when opening a data set: precompiles jit-ed functions on a single process per node, in a controlled manner, preventing CPU oversubscription. This is improved further by implementing caching for functions which capture other functions in their closure (#886, #798).
Allow selecting lin and log scaled visualization for sum, stddev, pick and single mask analyses to handle data with large dynamic range. This adds key intensity_lin to SumResultSet, PickResultSet and the result of SDAnalysis. It adds key intensity_log to SingleMaskResultSet. The new keys are chosen to not affect existing keys (#925, #929).
Tuples can be added directly to Shape objects. Right addition adds to the signal dimensions of the Shape object while left addition adds to the navigation dimensions (#749)
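A hedged sketch of explicit I/O backend selection for the raw reader (referenced from the I/O backend item above); the BufferedBackend import location and the placeholder file parameters are assumptions:
from libertem import api as lt
from libertem.io.dataset.base import BufferedBackend

ctx = lt.Context()

# the buffered backend linearizes I/O per worker, which helps on HDDs
ds = ctx.load(
    "raw",
    path="./scan.raw",        # placeholder
    dtype="float32",
    nav_shape=(128, 128),
    sig_shape=(256, 256),
    io_backend=BufferedBackend(),
)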
Bugfixes
Fix an off-by-one error in sync offset for K2IS data (drive-by change in #706).
Missing-directory error isn’t thrown if it’s due to last-recent-directory not being available (#748).
GUI: when cluster connection fails, reopen form with parameters user submitted (#735).
GUI: Fixed the glitch in file opening dialogue by disallowing parallel browsing before loading is concluded (#752).
Handle empty ROI and extra_shape with zero. Empty result buffers of the appropriate shape are returned if the ROI is empty or extra_shape has a zero (#765).
Improve internals of libertem.io.corrections.detector and libertem.io.corrections.corrset to better support correction of many dead pixels. (#890, #889)
Handle single-frame partitions in combination with aux data. Instead of squeezing the aux buffer, reshape to the correct shape (#791, #902).
libertem-server can now be started from Bash on Windows (#731).
Fix reading without a copy from multi-file datasets. The start offset of the file was not taken into account when indexing into the memory maps (#903).
Improve performance and reduce memory consumption of point analysis. Custom right hand side matrix product to reduce memory consumption and improve performance of sparse masks, such as point analysis. See also scipy/13211 (#917, #920).
Fix stability issue with multiple dask clients. dd.as_completed needs to specify the loop to work with multiple dask.distributed clients (#921).
GUI: Snap to pixels in point selection analysis. Consistency between point selection and picking (#926, #927).
Open datasets with autodetection, positional and keyword arguments. Handle keyword and positional arguments to Context.load('auto', ...) correctly (#936, #938).
Documentation
Switched to the readthedocs sphinx theme, improving the overall documentation structure. The developer documentation is now in a separate section from the user documentation.
Misc
Command line options can also be accessed with shorter alternatives (#757).
Depend on Numba >= 0.49.1 to support setting Numba thread count (#783), bumped to 0.51 to support caching improvements (#886).
libertem-server: Ask for confirmation if the user presses Ctrl+C. A second Ctrl+C stops immediately (#781).
Included pytest-benchmark to integrate benchmarks in the test infrastructure. See Benchmarking for details (#819).
The X and Y components for the color wheel visualization in Center of Mass and Radial Fourier Analysis are swapped to match the axis convention in empyre. This just changes the color encoding in the visualization and not the result (#851).
Deprecations
The tileshape parameter of DataSet implementations is deprecated in favor of tileshape negotiation and will be ignored, if given (#754, #777).
Remove color wheel code from libertem.viz and replace with imports from empyre. Note that these functions expect three vector components instead of two (#851).
The new and consistent nav_shape and sig_shape parameters should be used when loading data. The old scan_size and detector_size parameters, where they existed, are still recognized (#793).
0.5.1 / 2020-08-12
Bugfixes
Allow installation with latest dask distributed on Python 3.6 and 3.7
0.5.0 / 2020-04-23
New features
In addition to tuples, Shape objects can now be used as the extra_shape parameter for libertem.udf.base.UDF.buffer() and libertem.udf.base.UDF.aux_data(). (#694)
Progress bar support based on tqdm that can be enabled by passing progress=True to libertem.api.Context.run_udf(), libertem.api.Context.run() and libertem.api.Context.map(): see Running UDFs. (#613, #670, #655) See the sketch after this list.
Include explicit support for Direct Electron’s DE5 format based on HDF5. (#704)
GUI: Downloadable results as HDF5, NPZ, TIFF, and RAW. See Downloading results for details. (#665)
libertem.api.Context.load() now automatically detects file type and parameters if filetype="auto" is passed. (#610, #621, #734)
Relocatable GUI: Allow LiberTEM to run from different URL prefixes, allowing integration into, for example, JupyterLab. (#697)
Run preprocess() also before merge on the main node to allocate or initialize buffers, in addition to running on the workers (#624).
No need to set thread count environment variables anymore, since the thread count for OpenBLAS, OpenMP, Intel MKL and pyFFTW is now set on the workers at run time. Numba support will be added as soon as Numba 0.49 is released. (#685)
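A short sketch of the automatic file-type detection and the progress bar; the file path is a placeholder:
from libertem import api as lt
from libertem.udf.sum import SumUDF

ctx = lt.Context()

# file type and parameters are detected automatically where possible
ds = ctx.load("auto", path="./scan.blo")    # placeholder path

# progress=True shows a tqdm-based progress bar while the UDF runs
res = ctx.run_udf(dataset=ds, udf=SumUDF(), progress=True)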
Bugfixes
Documentation
Obsolescence
Parameters crop_detector_to and detector_size_raw of libertem.io.dataset.raw.RawFileDataSet are deprecated and will be removed after 0.6.0. Please specify detector_size instead or use a specialized DataSet, for example for EMPAD.
libertem.udf.feature_vector_maker.FeatureVecMakerUDF is deprecated and will be removed in 0.6.0. Use ApplyMasksUDF with a sparse stack of single pixel masks or a stack generated by libertem_blobfinder.common.patterns.feature_vector() instead. (#618)
Misc
Clustering analysis
Use a connectivity matrix to only cluster neighboring pixels, reducing memory footprint while improving speed and quality (#618).
Use faster ApplyMasksUDF to generate feature vector (#618).
Rename result buffers of StdDevUDF, run_stddev() and consolidate_result() from 'sum_frame' to 'sum', and 'num_frame' to 'num_frames' (#640).
Resolve ambiguity between variance and sum of variances in result buffer names of StdDevUDF, run_stddev() and consolidate_result(). (#640)
LiberTEM works with Python 3.8 for experimental use. A context using a remote Dask.Distributed cluster can lead to lock-ups or errors with Python 3.8. The default local Dask.Distributed context works.
Improve performance with large tiles. (#649)
Make sure the signal dimension of result buffer slices can be flattened without creating an implicit copy (#738, #739)
Many thanks to the contributors to this release: @AnandBaburajan, @twentyse7en, @sayandip18, @bdalevin, @saisunku, @Iamshankhadeep, @abiB27, @sk1p, @uellue
0.4.1 / 2020-02-18
This is a bugfix release, mainly constraining the msgpack dependency, as distributed is not compatible with version 1.0 yet. It also contains important fixes in the HDF5 dataset.
Bugfixes
0.4.0 / 2020-02-13
The main points of this release are the Job API deprecation and restructuring of our packaging, namely extracting the blobfinder module.
New features
dtype support for UDFs: see Preferred input dtype (#549, #550)
Dismiss error messages via keyboard: allows pressing the escape key to close all currently open error messages (#437)
ROI doesn’t have any effect if in pick mode, so we hide the dropdown in that case (#511)
Make tileshape parameter of HDF5 DataSet optional (#578)
Open browser after starting the server. Enabled by default, can be disabled using --no-browser (#81, #580)
Implement libertem.udf.masks.ApplyMasksUDF as a replacement of ApplyMasksJob (#549, #550)
Implement libertem.udf.raw.PickUDF as a replacement of PickFrameJob (#549, #550)
Bug fixes
Fix FRMS6 in a distributed setting. We now make sure to only do I/O in methods that are running on worker nodes (#531).
Fixed loading of nD HDF5 files. Previously the HDF5 DataSet was hardcoded for 4D data - now, arbitrary dimensions should be supported (#574, #567)
Fix DaskJobExecutor.run_each_host. Need to pass pure=False to ensure multiple runs of the function (#528).
Obsolescence
Misc
Job API deprecation
The original Job API of LiberTEM is superseded by the new User-defined functions (UDFs) API with release 0.4.0. See #549 for a detailed overview of the changes. The UDF API brings the following advantages:
Support for regions of interest (ROIs).
Easier to implement, extend and re-use UDFs compared to Jobs.
Clean separation between back-end implementation details and application-specific code.
Facilities to implement non-trivial operations, see User-defined functions: advanced topics.
Performance is at least on par.
For that reason, the Job API has become obsolete. The existing public interfaces, namely libertem.api.Context.create_mask_job() and libertem.api.Context.create_pick_job(), will be supported in LiberTEM for two more releases after 0.4.0, i.e. including 0.6.0. Using the Job API will trigger deprecation warnings starting with this release. The new ApplyMasksUDF replaces ApplyMasksJob, and PickUDF replaces PickFrameJob.
The Analysis classes that relied on the Job API as a back-end are already ported
to the corresponding UDF back-end. The new back-end may lead to minor
differences in behavior, such as a change of returned dtype. The legacy code for
using a Job back-end will remain until 0.6.0 and can be activated during the
transition period by setting analysis.TYPE = 'JOB'
before running.
From ApplyMasksJob to ApplyMasksUDF
Main differences:
ApplyMasksUDF returns the result with the first axes being the dataset’s navigation axes. The last dimension is the mask index. ApplyMasksJob used to return transposed data with flattened navigation dimension.
Like all UDFs, running an ApplyMasksUDF returns a dictionary. The result data is accessible with key 'intensity' as a BufferWrapper object.
ROIs are supported now, like in all UDFs.
Previously with ApplyMasksJob:
# Deprecated!
mask_job = ctx.create_mask_job(
factories=[all_ones, single_pixel],
dataset=dataset
)
mask_job_result = ctx.run(mask_job)
plt.imshow(mask_job_result[0].reshape(dataset.shape.nav))
Now with ApplyMasksUDF:
mask_udf = libertem.udf.masks.ApplyMasksUDF(
mask_factories=[all_ones, single_pixel]
)
mask_udf_result = ctx.run_udf(dataset=dataset, udf=mask_udf)
plt.imshow(mask_udf_result['intensity'].data[..., 0])
From PickFrameJob to PickUDF
PickFrameJob allowed picking arbitrary contiguous slices in both navigation and signal dimension. In practice, however, it was mostly used to extract single complete frames.
PickUDF allows picking the complete signal dimension from an arbitrary non-contiguous region of interest in navigation space by specifying a ROI.
If necessary, more complex subsets of a dataset can be extracted by constructing a suitable subset of an identity matrix for the signal dimension and using it with ApplyMasksUDF and the appropriate ROI for the navigation dimension. Alternatively, it is now easily possible to implement a custom UDF for this purpose. Performing the complete processing through an UDF on the worker nodes instead of loading the data to the central node may be a viable alternative as well.
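A sketch of the typical single-frame use of PickUDF with a ROI; the in-memory dataset and the scan position are illustrative:
import numpy as np
from libertem import api as lt
from libertem.udf.raw import PickUDF

ctx = lt.Context()
data = np.random.random((16, 16, 32, 32)).astype(np.float32)
ds = ctx.load("memory", data=data, sig_dims=2)

# boolean ROI in navigation space: pick the frame at scan position (7, 3)
roi = np.zeros(tuple(ds.shape.nav), dtype=bool)
roi[7, 3] = True

res = ctx.run_udf(dataset=ds, udf=PickUDF(), roi=roi)
frame = res['intensity'].data.squeeze()   # (32, 32), in the dataset's native dtype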
PickUDF now returns data in the native dtype of the dataset. Previously, PickFrameJob converted to floats.
Using libertem.api.Context.create_pick_analysis()
continues to be the
recommended convenience function to pick single frames.
Restructuring into sub-packages
We are currently restructuring LiberTEM into packages that can be installed and used independently, see #261. This will be a longer process and changes the import locations.
Blobfinder is the first module separated in 0.4.0.
See Package overview for a current overview of sub-packages.
For a transition period, importing from the previous locations is supported but
will trigger a FutureWarning
. See Show deprecation warnings on how to
activate deprecation warning messages, which is strongly recommended while the
restructuring is ongoing.
0.3.0 / 2019-12-12
New features
Make OOP based composition and subclassing easier for CorrelationUDF (#466)
Introduce plain circular match pattern Circular (#469)
Support for caching data sets CachedDataSet from slower storage (NFS, spinning metal) on fast local storage (#471)
Clustering analysis (#401, #408 by @kruzaeva).
libertem.io.dataset.dm.DMDataSet implementation based on ncempy (#497). Adds a new map() executor primitive, used to concurrently read the metadata for DM3/DM4 files on initialization. Note: no support for the web GUI yet, as the naming patterns for DM file series vary wildly. Needs changes in the file dialog.
Speed up of up to 150x for correlation-based peak refinement in libertem.udf.blobfinder.correlation with a Numba-based pipeline (#468)
Introduce FullFrameCorrelationUDF which correlates a large number (several hundred) of small peaks (10x10) on small frames (256x256) faster than FastCorrelationUDF and SparseCorrelationUDF (#468)
Introduce UDFPreprocessMixin (#464)
Implement iterator over AnalysisResultSet (#496)
Add hologram simulation libertem.utils.generate.hologram_frame() (#475)
Implement hologram reconstruction UDF libertem.udf.holography.HoloReconstructUDF (#475)
Bug fixes
Improved error and validation handling when opening files with GUI (#433, #442)
Clean-up and improvements of libertem.analysis.fullmatch.FullMatcher (#463)
Ensure that RAW dataset sizes are calculated as int64 to avoid integer overflows (#495, #493)
Resolve shape mismatch issue and simplify dominant order calculation in Radial Fourier Analysis (#502)
Actually pass the enable_direct parameter from the web API to the DataSet
Documentation
Created Authorship policy (#460, #483)
Documentation for Crystallinity map and Clustering analysis (#408 by @kruzaeva)
Restructure and update the API reference for a number of UDFs and other application-specific code (#503, #507, #508)
Obsolescence
The Job interface is planned to be replaced with an implementation based on UDFs in one of the upcoming releases.
Misc
Split up the blobfinder code between several files to reduce file size (#468)
0.2.2 / 2019-10-14
Point release to fix a number of minor issues, most notably PR #439 that should have been merged for version 0.2.
Bug fixes
Trigger a timeout when guessing parameters for HDF5 takes too long (#440 , #449)
Slightly improved error and validation handling when opening files with GUI (@ec74c13)
Recognize BLO file type (#432)
Fixed a glitch where negative peak elevations were possible (#446)
Update examples to match 0.2 release (#439)
0.2.1 / 2019-10-07
Point release to fix a bug in the Zenodo upload for production releases.
0.2.0 / 2019-10-07
This release constitutes a major update after almost a year of development. Systematic change management starts with this release.
This is the release message:
User-defined functions
LiberTEM 0.2 offers a new API to define a wide range of user-defined reduction functions (UDFs) on distributed data. The interface and implementation offers a number of unique features:
Reductions are defined as functions that are executed on subsets of the data. That means they are equally suitable for distributed computing, for interactive display of results from a progressing calculation, and for handling live data¹.
Interfaces adapted to both simple and complex use cases: From a simple map() functionality to complex multi-stage reductions.
Rich options to define input and output data for the reduction functions, which helps to implement non-trivial operations efficiently within a single pass over the input data.
Composition and extension through object oriented programming
Interfaces that allow highly efficient processing: locality of reference, cache efficiency, memory handling
Introduction: https://libertem.github.io/LiberTEM/udf.html
Advanced features: https://libertem.github.io/LiberTEM/udf/advanced.html
A big shoutout to Alex (@sk1p) who developed it! 🏆
¹User-defined functions will work on live data without modification as soon as LiberTEM implements back-end support for live data, expected in 2020.
Support for 4D STEM applications
In parallel to the UDF interface, we have implemented a number of applications that make use of the new facilities:
Correlation-based peak finding and refinement for CBED (credit: Karina Ruzaeva @kruzaeva)
Strain mapping
Clustering
Fluctuation EM
Radial Fourier Series (advanced Fluctuation EM)
More details and examples: https://libertem.github.io/LiberTEM/applications.html
Extended documentation
We have greatly improved the coverage of our documentation: https://libertem.github.io/LiberTEM/index.html#documentation
Fully automated release pipeline
Alex (@sk1p) invested a great deal of effort into fully automating our release process. From now on, we will be able to release more often, including service releases. 🚀
Basic dask.distributed array integration
LiberTEM can generate efficient dask.distributed arrays from all supported dataset types with this release. That means it should be possible to use our high-performance file readers in applications outside of LiberTEM.
File formats
Support for various file formats has improved. More details: https://libertem.github.io/LiberTEM/formats.html
0.1.0 / 2018-11-06
Initial release of a minimum viable product and proof of concept.
Support for applying masks with high throughput on distributed systems with interactive web GUI display and scripting capability.