Tips and tricks

Using SSH forwarding

Because LiberTEM has no built-in authentication yet, listening on a host other than localhost is disabled. As a workaround, if you want to access LiberTEM from a different computer, you can use SSH port forwarding. For example, with conda:

$ ssh -L 9000:localhost:9000 <remote-hostname> "source activate libertem; libertem-server"

Or, with virtualenv:

$ ssh -L 9000:localhost:9000 <remote-hostname> "/path/to/virtualenv/bin/libertem-server"

This makes LiberTEM, which is running on <remote-hostname>, available at http://localhost:9000/ on your local machine.
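To confirm that the tunnel is up before opening the browser, you can test whether the forwarded port accepts connections. The following is a minimal sketch using only the standard library; the helper name port_is_open and its defaults are illustrative assumptions and not part of LiberTEM.

```python
import socket


def port_is_open(host="localhost", port=9000, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_is_open() on your local machine while the ssh command is running should return True once libertem-server has started listening on the remote side.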

Running LiberTEM from an embedded interpreter

If LiberTEM is run from within an embedded interpreter, the following steps should be taken. This is necessary for Python scripting in Digital Micrograph, for example.

The variable sys.argv may not be set in embedded interpreters, but it is expected by the multiprocessing module when spawning new processes. This workaround guarantees that sys.argv is set until this is fixed upstream:

import sys

if not hasattr(sys, 'argv'):
    sys.argv = []

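Furthermore, the correct executable for spawning subprocesses has to be set. The call below assumes that multiprocessing, os and sys have been imported:

multiprocessing.set_executable(
    os.path.join(sys.exec_prefix, 'pythonw.exe'))  # Windows only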

Show deprecation warnings

Many warnings issued through Python's built-in warnings module are suppressed by default, including in interactive shells such as IPython and Jupyter. If you’d like to be informed early about upcoming backwards-incompatible changes, you should activate deprecation warnings. This is recommended since LiberTEM is under active development.

import warnings

warnings.filterwarnings("default", category=DeprecationWarning)
warnings.filterwarnings("default", category=PendingDeprecationWarning)
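To see the effect of such a filter, the following sketch records the warnings raised by a deprecated function. The function old_api is hypothetical and only serves as an illustration; catch_warnings is used so that the filter change stays local to the example.

```python
import warnings


def old_api():
    # Hypothetical deprecated function, for illustration only
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# record=True collects warnings in a list instead of printing them
with warnings.catch_warnings(record=True) as caught:
    warnings.filterwarnings("default", category=DeprecationWarning)
    result = old_api()

for warning in caught:
    print(warning.category.__name__, warning.message)
```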

Profiling long-running tests

Since our code base and test coverage are growing continuously, we should make sure that our test suite remains efficient enough to finish within a reasonable time.

You can find the five slowest tests in the output of Tox (see Running the tests for details). If you are using pytest directly, you can use the --durations parameter:

(libertem) $ pytest --durations=10 tests/
================= slowest 10 test durations =============================
31.61s call     tests/udf/
17.08s call     tests/udf/
16.89s call     tests/
12.78s call     tests/server/
10.90s call     tests/server/
 8.61s call     tests/
 8.26s call     tests/server/
 6.76s call     tests/server/
 6.50s call     tests/
 5.75s call     tests/
= 288 passed, 66 skipped, 6 deselected, 2 xfailed, 7 warnings in 260.65 seconds =

Please note that functional tests which involve starting a local cluster have long lead times that are hard to avoid.

In order to gain more information on what slows down a particular test, you can install the pytest-profiling extension and use it to profile individual slow tests that you identified before:

(libertem) $ pytest --profile tests/udf/
749921 function calls (713493 primitive calls) in 5.346 seconds

Ordered by: cumulative time
List reduced from 1031 to 20 due to restriction <20>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    5.346    5.346
 44/11    0.000    0.000    5.344    0.486
 44/11    0.000    0.000    5.344    0.486
 44/11    0.000    0.000    5.344    0.486<lambda>)
 44/11    0.001    0.000    5.344    0.486
     1    0.000    0.000    5.331    5.331
     3    0.000    0.000    5.331    1.777
     3    0.000    0.000    5.330    1.777
     3    0.000    0.000    5.329    1.776
     3    0.000    0.000    5.329    1.776<lambda>)
     1    0.000    0.000    5.138    5.138
     1    0.000    0.000    5.138    5.138
     1    0.000    0.000    5.138    5.138
     1    0.004    0.004    5.137    5.137
     5    0.159    0.032    3.150    0.630
   245    0.001    0.000    2.989    0.012
   245    0.046    0.000    2.988    0.012
   245    0.490    0.002    2.941    0.012
   245    0.152    0.001    2.229    0.009
    25    0.001    0.000    1.968    0.079

=============================== 1 passed, 1 warnings in 7.81 seconds ============================
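The report above follows the layout of Python's pstats module. As a minimal sketch, assuming you want a similar report for a single function outside of pytest, cProfile and pstats from the standard library can be used directly. The function slow_sum is a hypothetical workload standing in for the code under test.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical workload standing in for a slow test body
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 100_000)

# Sort by cumulative time and restrict the listing to 20 entries
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(20)
report = buffer.getvalue()
print(report)
```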

Platform-dependent code and remote executor

Platform-dependent code in a lambda function or nested function can lead to incompatibilities when run on an executor with remote workers, such as the DaskJobExecutor. Instead, the function should be defined as part of a module, for example as a stand-alone function or as a method of a class. That way, the correct remote implementation for platform-dependent code is used on the remote worker since only a reference to the function and not the implementation itself is sent over.
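The difference between sending a reference and sending an implementation can be illustrated with the standard pickle module. This is only a sketch: pickle is used here as a stand-in for the serialization of a distributed executor, which is an assumption made for illustration, and platform.system serves as an example of a module-level function with platform-dependent behavior.

```python
import pickle
import platform

# platform.system is defined at module level, so pickle stores only a
# reference (module name and function name), not the implementation.
payload = pickle.dumps(platform.system)
print(payload)

# Unpickling looks the name up in the receiving interpreter, so a remote
# worker would obtain its own local implementation of the function.
restored = pickle.loads(payload)
print(restored())
```

A lambda or nested function has no importable name to refer to, so it has to be transferred together with its implementation, which is where platform-specific details from the sending side can leak to the worker.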

Benchmark Numba compilation time

One has to capture the very first execution of a jitted function and compare it with subsequent executions to measure its compilation time. By default, pytest-benchmark performs calibration runs and possibly warmup rounds that don’t report the very first run.

The only way to completely disable this is to use the pedantic mode specifying no warmup rounds and two rounds with one iteration each:

import numba

# hello must be jitted so that the first call triggers compilation
@numba.njit
def hello():
    return "world"

def test_numba_compilation(benchmark):
    benchmark.extra_info["mark"] = "compilation"
    benchmark.pedantic(hello, warmup_rounds=0, rounds=2, iterations=1)

That way, the maximum is the first run with compilation and the minimum is the second run without compilation. Tests are marked as compilation tests in the extra info as well to aid later data evaluation. Note that the compilation tests will have poor statistics since each one captures only a single run with compilation. If you have an idea on how to collect better statistics, please let us know!
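For a quick check outside of pytest-benchmark, the same first-versus-second comparison can be sketched with a plain timer. The helper time_first_two_calls and the class FakeJitted are hypothetical: FakeJitted only simulates a one-time compilation cost so that the example runs without Numba, and it would be replaced by a real jitted function in practice.

```python
import time


def time_first_two_calls(func, *args):
    """Return (first, second) wall-clock durations in seconds."""
    durations = []
    for _ in range(2):
        start = time.perf_counter()
        func(*args)
        durations.append(time.perf_counter() - start)
    return tuple(durations)


class FakeJitted:
    """Hypothetical stand-in for a jitted function: the first call pays a
    simulated one-time compilation cost, later calls do not."""

    def __init__(self):
        self.compiled = False

    def __call__(self):
        if not self.compiled:
            time.sleep(0.05)  # simulated compilation
            self.compiled = True
        return "world"


first, second = time_first_two_calls(FakeJitted())
# The difference approximates the one-time cost paid on the first call
compile_estimate = first - second
print(f"first: {first:.4f} s, second: {second:.4f} s")
```

Like the pedantic benchmark, this yields a single sample for the first call, so the estimate is only a rough indication.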