Tips and tricks

This is a collection of various helpful tips that don’t fit in elsewhere.

Using SSH forwarding

As there is no built-in authentication yet, LiberTEM should not listen on a network port where untrusted parties have access. You can use ssh port forwarding from localhost instead to access LiberTEM from a different computer.

For example with conda:

$ ssh -L 9000:localhost:9000 <remote-hostname> "conda activate libertem; libertem-server"

Or, with virtualenv:

$ ssh -L 9000:localhost:9000 <remote-hostname> "/path/to/virtualenv/bin/libertem-server"

This makes LiberTEM, which is running on remote-hostname, available on your local host via http://localhost:9000/

Alternatively, you can launch and access LiberTEM on remote systems through Jupyter integration.

Activating ipywidgets in Jupyter

Some examples use ipywidgets in notebooks, most notably the fast BQLive2DPlot. In some cases the corresponding Jupyter Notebook extension has to be activated manually:

$ jupyter nbextension enable --py widgetsnbextension

For BQLive2DPlot to function correctly in JupyterHub and possibly JupyterLab, the packages bqplot and bqplot-image-gl have to be installed in both the notebook environment and in the environment for the notebook server. This might require admin privileges.

Running in a top-level script

Since LiberTEM uses multiprocessing, the script entry point may have to be protected, most notably on Windows:

if __name__ == '__main__':
    # Here goes starting a LiberTEM Context etc
    ...

This is not necessary if LiberTEM is used in a Jupyter notebook or IPython.

Gatan Digital Micrograph and other embedded interpreters

If LiberTEM is run from within an embedded interpreter, the following steps should be taken. This is necessary for Python scripting in Gatan Digital Micrograph (GMS), for example.

The variable sys.argv may not be set in embedded interpreters, but it is expected by the multiprocessing module when spawning new processes. This workaround guarantees that sys.argv is set until this is fixed upstream:

if not hasattr(sys, 'argv'):
    sys.argv  = []

Furthermore, the correct executable for spawning subprocesses has to be set.

multiprocessing.set_executable(
    os.path.join(sys.exec_prefix, 'pythonw.exe'))  # Windows only

In GMS the script may have to run in an additional thread since loading SciPy in a GMS background thread doesn’t work. See https://www.gatan.com/python-faq for more information.

import threading

def main():
    # Here goes the actual script
    ...

if __name__ == '__main__':
    # Start the workload "main()" in a thread and wait for it to finish
    th = threading.Thread(target=main)
    th.start()
    th.join()

See our examples folder for a number of scripts that work in GMS!

Show deprecation warnings

Many warning messages via the warnings built-in module are suppressed by default, including in interactive shells such as IPython and Jupyter. If you’d like to be informed early about upcoming backwards-incompatible changes, you should activate deprecation warnings. This is recommended since LiberTEM is under active development.

import warnings

warnings.filterwarnings("default", category=DeprecationWarning)
warnings.filterwarnings("default", category=PendingDeprecationWarning)

Profiling long-running tests

Since our code base and test coverage is growing continuously, we should make sure that our test suite remains efficient to finish within reasonable time frames.

You can find the five slowest tests in the output of Tox, see Running the tests for details. If you are using pytest directly, you can use the --durations parameter:

(libertem) $ pytest --durations=10 tests/
(...)
================= slowest 10 test durations =============================
31.61s call     tests/udf/test_blobfinder.py::test_run_refine_affinematch
17.08s call     tests/udf/test_blobfinder.py::test_run_refine_sparse
16.89s call     tests/test_analysis_masks.py::test_numerics_fail
12.78s call     tests/server/test_job.py::test_run_job_delete_ds
10.90s call     tests/server/test_cancel.py::test_cancel_udf_job
 8.61s call     tests/test_local_cluster.py::test_start_local
 8.26s call     tests/server/test_job.py::test_run_job_1_sum
 6.76s call     tests/server/test_job.py::test_run_with_all_zeros_roi
 6.50s call     tests/test_analysis_masks.py::test_numerics_succeed
 5.75s call     tests/test_analysis_masks.py::test_avoid_calculating_masks_on_client
= 288 passed, 66 skipped, 6 deselected, 2 xfailed, 7 warnings in 260.65 seconds =

Please note that tests which involve starting a local cluster have long lead times that are hard to avoid.

In order to gain more information on what slows down a particular test, you can install the pytest-profiling extension and use it to profile individual slow tests that you identified before:

(libertem) $ pytest --profile tests/udf/test_blobfinder.py::test_run_refine_affinematch
(...)
749921 function calls (713493 primitive calls) in 5.346 seconds

Ordered by: cumulative time
List reduced from 1031 to 20 due to restriction <20>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    5.346    5.346 runner.py:76(pytest_runtest_protocol)
 44/11    0.000    0.000    5.344    0.486 hooks.py:270(__call__)
 44/11    0.000    0.000    5.344    0.486 manager.py:65(_hookexec)
 44/11    0.000    0.000    5.344    0.486 manager.py:59(<lambda>)
 44/11    0.001    0.000    5.344    0.486 callers.py:157(_multicall)
     1    0.000    0.000    5.331    5.331 runner.py:83(runtestprotocol)
     3    0.000    0.000    5.331    1.777 runner.py:172(call_and_report)
     3    0.000    0.000    5.330    1.777 runner.py:191(call_runtest_hook)
     3    0.000    0.000    5.329    1.776 runner.py:219(from_call)
     3    0.000    0.000    5.329    1.776 runner.py:198(<lambda>)
     1    0.000    0.000    5.138    5.138 runner.py:119(pytest_runtest_call)
     1    0.000    0.000    5.138    5.138 python.py:1355(runtest)
     1    0.000    0.000    5.138    5.138 python.py:155(pytest_pyfunc_call)
     1    0.004    0.004    5.137    5.137 test_blobfinder.py:149(test_run_refine_affinematch)
     5    0.159    0.032    3.150    0.630 generate.py:6(cbed_frame)
   245    0.001    0.000    2.989    0.012 masks.py:98(circular)
   245    0.046    0.000    2.988    0.012 masks.py:8(_make_circular_mask)
   245    0.490    0.002    2.941    0.012 masks.py:280(radial_bins)
   245    0.152    0.001    2.229    0.009 masks.py:212(polar_map)
    25    0.001    0.000    1.968    0.079 blobfinder.py:741(run_refine)

=============================== 1 passed, 1 warnings in 7.81 seconds ============================

Platform-dependent code and remote executor

Platform-dependent code in a lambda function or nested function can lead to incompatibilities when run on an executor with remote workers, such as the DaskJobExecutor. Instead, the function should be defined as part of a module, for example as a stand-alone function or as a method of a class. That way, the correct remote implementation for platform-dependent code is used on the remote worker since only a reference to the function and not the implementation itself is sent over.

Benchmark Numba compilation time

One has to capture the very first execution of a jitted function and compare it with subsequent executions to measure its compilation time. By default, pytest-benchmark performs calibration runs and possibly warmup rounds that don’t report the very first run.

The only way to completely disable this is to use the pedantic mode specifying no warmup rounds and two rounds with one iteration each:

@numba.njit
 def hello():
     return "world"


 @pytest.mark.compilation
 @pytest.mark.benchmark(
     group="compilation"
 )
 def test_numba_compilation(benchmark):
     benchmark.extra_info["mark"] = "compilation"
     benchmark.pedantic(hello, warmup_rounds=0, rounds=2, iterations=1)

That way the maximum is the first run with compilation, and the minimum is the second one without compilation. Tests are marked as compilation tests in the extra info as well to aid later data evaluation. Note that the compilation tests will have poor statistics since it only runs once. If you have an idea on how to collect better statistics, please let us know!

Simulating slow systems with control groups

Under Linux, it is possible to simulate a slow system using control groups:

sudo cgcreate -g cpu:/slow
sudo cgset -r cpu.cfs_period_us=1000000 slow
sudo cgset -r cpu.cfs_quota_us=200000 slow
sudo chown root:<yourgroup> /sys/fs/cgroup/cpu,cpuacct/slow
sudo chmod 664 /sys/fs/cgroup/cpu,cpuacct/slow

Then, as a user, you can use cgexec to run a command in that control group:

cgexec -g cpu:slow pytest tests/

This is useful, for example, to debug test failures that only seem to happen in CI or under heavy load. Note that tools like cgcreate only work with cgroups v1, with newer distributions using cgroups v2 you may have to adapt these instructions.

Jupyter

To use the Python API from within a Jupyter notebook, you can install Jupyter into your LiberTEM virtual environment.

(libertem) $ python -m pip install jupyter

You can then run a local notebook from within the LiberTEM environment, which should open a browser window with Jupyter that uses your LiberTEM environment.

(libertem) $ jupyter notebook

JupyterHub

If you’d like to use the Python API from a LiberTEM virtual environment on a system that manages logins with JupyterHub, you can easily install a custom kernel definition for your LiberTEM environment.

First, you can launch a terminal on JupyterHub from the “New” drop-down menu in the file browser. Alternatively you can execute shell commands by prefixing them with “!” in a Python notebook.

In the terminal you can create and activate virtual environments and perform the LiberTEM installation as described above. Within the activated LiberTEM environment you additionally install ipykernel:

(libertem) $ python -m pip install ipykernel

Now you can create a custom ipython kernel definition for your environment:

(libertem) $ python -m ipykernel install --user --name libertem --display-name "Python (libertem)"

After reloading the file browser window, a new Notebook option “Python (libertem)” should be available in the “New” drop-down menu. You can test it by creating a new notebook and running

In [1]: import libertem

See also Jupyter integration for launching the web GUI from JupyterHub or JupyterLab.