Containers and clusters

New in version 0.9.0: LiberTEM repository on Docker Hub with images for public use.

Changed in version 0.12.0: Container images are now available from the GitHub container registry instead.

Note

By default, the LiberTEM server binds only to localhost unless token-based authentication is enabled or the --insecure flag is provided. When access from an untrusted network is desired, it should run behind a reverse proxy that supplies the token and adds encryption, authentication and authorization. The Jupyter integration can be used for that purpose.

Third-party commands like dask-scheduler may bind to public interfaces by default, which can expose a machine to remote code execution. For that reason the interface should always be specified and limited to trusted networks.

Furthermore, the default command of the LiberTEM Docker container starts a server that binds to all interfaces to facilitate integration. Docker runs containers in an isolated environment and requires the user to expose the port explicitly. Singularity, however, will run the container like a regular user program. That means it exposes an insecure LiberTEM server to all interfaces when running the default container command. For that reason the command to run should always be specified explicitly when using Singularity.
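
The advice to restrict services to trusted interfaces can be checked in a script before a command is launched. The following is a minimal sketch and not part of LiberTEM or Dask: the helper name is_loopback is hypothetical, and it relies only on Python's standard ipaddress module to decide whether a --host value stays on the local machine.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Return True if `host` refers only to the local machine.

    Hypothetical helper for illustration; not a LiberTEM or Dask function.
    """
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat it as
        # not provably local and let the caller decide.
        return False


for candidate in ("localhost", "127.0.0.1", "0.0.0.0"):
    print(candidate, is_loopback(candidate))
```

An address such as 0.0.0.0 is rejected by this check because it means "all interfaces", which is the situation the note warns about.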

Containers

Docker images can be found in the LiberTEM repository on the GitHub container registry. In the Docker image, LiberTEM is installed in a virtual environment in /venv/. Consequently, the executables libertem-server, dask-scheduler and libertem-worker are located in /venv/bin/. By default, the container runs /venv/bin/libertem-server --host 0.0.0.0 --insecure --port 9000.

When using Docker, you can run and expose the LiberTEM server to localhost while accessing local data like this:

$ docker run -p 127.0.0.1:9000:9000 \
  --mount type=bind,source=/path/to/your/data/,dst=/data/,ro ghcr.io/libertem/libertem
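
If the container is started from a Python script rather than a shell, the same call can be assembled as an argument list. This is only a sketch: the function name docker_run_command and its parameters are hypothetical, while the flags mirror the command above.

```python
import shlex


def docker_run_command(data_dir: str, host_port: int = 9000,
                       image: str = "ghcr.io/libertem/libertem") -> list:
    """Assemble the `docker run` call from this section as an argument list.

    Hypothetical helper; the flags mirror the documented command.
    """
    return [
        "docker", "run",
        # Publish container port 9000 on the loopback interface only.
        "-p", f"127.0.0.1:{host_port}:9000",
        # Mount the data directory read-only at /data/ inside the container.
        "--mount", f"type=bind,source={data_dir},dst=/data/,ro",
        image,
    ]


cmd = docker_run_command("/path/to/your/data/")
print(shlex.join(cmd))
# Pass `cmd` to subprocess.run(cmd, check=True) to actually start the container.
```

Keeping the 127.0.0.1 prefix in the -p argument is what limits the published port to the local machine.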

To start libertem-server from the Docker image with Singularity:

$ singularity exec docker://ghcr.io/libertem/libertem /venv/bin/libertem-server

Available versions

The tag “latest” (default) points to the stable release with the highest version number. Version tags for all stable releases are available as well. See the LiberTEM containers on the GitHub container registry for details.

Note

Older versions are also available on Docker Hub.

Updating

You can update to the latest release like this:

$ docker pull ghcr.io/libertem/libertem

or

$ singularity pull docker://ghcr.io/libertem/libertem

Starting a custom cluster

LiberTEM can connect to a running Dask cluster. To start a cluster on localhost, first run a scheduler:

(libertem) $ dask-scheduler --host localhost

GPU support in LiberTEM requires specific resource tags and environment settings on the Dask workers. The easiest way to start workers with the appropriate settings is:

(libertem) $ libertem-worker tcp://localhost:8786

There are a few command line options available:

Usage: libertem-worker [OPTIONS] [SCHEDULER]

Options:
  -k, --kind TEXT             Worker kind. Currently only "dask" is
                              implemented.
  -d, --local-directory TEXT  local directory to manage temporary files
  -c, --n-cpus INTEGER        Number of CPUs to use, defaults to number of CPU
                              cores without hyperthreading.
  -u, --cudas TEXT            List of CUDA device IDs to use, defaults to all
                              detected CUDA devices. Use "" to deactivate
                              CUDA.
  -p, --has-cupy BOOLEAN      Activate CuPy integration, defaults to detection
                              of installed CuPy module.
  -n, --name TEXT             Name of the cluster node, defaults to host name
  -l, --log-level TEXT        set logging level. Default is 'info'. Allowed
                              values are 'critical', 'error', 'warning',
                              'info', 'debug'.
  --preload TEXT              Module, file or code to preload on workers, for
                              example HDF5 plugins. Can be specified multiple
                              times. See also
                              https://docs.dask.org/en/stable/how-
                              to/customize-initialization.html#preload-scripts
                              for the behavior with Dask workers (current
                              default) and https://libertem.github.io/LiberTEM/
                              reference/dataset.html#hdf5 for information on
                              loading HDF5 files that depend on custom
                              filters.
  --help                      Show this message and exit.

New in version 0.6.0.

New in version 0.9.0: --preload was added.
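
When workers are launched from a deployment script, the options listed above can be put together programmatically. The sketch below is illustrative only: worker_command is a hypothetical helper, it emits only flags that appear in the help text, and "hdf5plugin" is merely an example module name for --preload.

```python
from typing import Optional, Sequence


def worker_command(scheduler: str,
                   n_cpus: Optional[int] = None,
                   cudas: Optional[str] = None,
                   name: Optional[str] = None,
                   preload: Sequence[str] = ()) -> list:
    """Build a `libertem-worker` argument list from the documented options.

    Hypothetical helper; only flags from the help text are emitted.
    """
    cmd = ["libertem-worker"]
    if n_cpus is not None:
        cmd += ["--n-cpus", str(n_cpus)]
    if cudas is not None:
        # An empty string deactivates CUDA, as described in the help text.
        cmd += ["--cudas", cudas]
    if name is not None:
        cmd += ["--name", name]
    for item in preload:
        # --preload can be specified multiple times.
        cmd += ["--preload", item]
    cmd.append(scheduler)
    return cmd


# CPU-only worker with four cores; "hdf5plugin" is an illustrative module name.
print(worker_command("tcp://localhost:8786", n_cpus=4, cudas="",
                     preload=["hdf5plugin"]))
```

Options left as None are omitted, so libertem-worker falls back to its own defaults for them.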

For a cluster setup, you can run the scheduler on the appropriate network interface and run workers on all cluster nodes to connect to the scheduler.
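
In a multi-node setup it can help to confirm that the scheduler port is reachable from a node before starting workers there. The following is a minimal sketch using only the Python standard library; scheduler_reachable is a hypothetical helper, and port 8786 is taken from the scheduler address used in the examples of this section.

```python
import socket


def scheduler_reachable(host: str, port: int = 8786, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper; it only tests the TCP port and does not
    verify that a Dask scheduler is actually answering.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


print(scheduler_reachable("localhost"))
```

A negative result usually points to the scheduler being bound to a different interface or to a firewall between the nodes.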

You can then connect to the cluster’s scheduler URL in the LiberTEM web GUI.
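
A running cluster can also be used from a Python script or notebook. The sketch below assumes that LiberTEM is installed, that a scheduler is listening at the given URL, and that DaskJobExecutor.connect and Context(executor=...) behave as described in the LiberTEM API documentation; the wrapper function connect_context itself is hypothetical.

```python
def connect_context(scheduler_url: str = "tcp://localhost:8786"):
    """Create a LiberTEM Context that runs on an existing Dask cluster.

    Assumes LiberTEM is installed and a scheduler is listening at the URL.
    """
    # Imported inside the function so the sketch can be loaded
    # on machines where LiberTEM is not installed.
    from libertem.api import Context
    from libertem.executor.dask import DaskJobExecutor

    executor = DaskJobExecutor.connect(scheduler_url)
    return Context(executor=executor)


# Example call, requires a running scheduler and workers:
# ctx = connect_context("tcp://localhost:8786")
```

The returned context can then be used like a locally created one, with the computation running on the connected workers.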

For easier deployment in container-based environments, you can also use the Docker image.

Example: Start a scheduler and workers in an isolated environment with Docker.

$ docker run --mount type=bind,source=/path/to/your/data/,dst=/data/,ro \
  ghcr.io/libertem/libertem /venv/bin/dask-scheduler
$ docker run --mount type=bind,source=/path/to/your/data/,dst=/data/,ro \
  ghcr.io/libertem/libertem /venv/bin/libertem-worker tcp://<scheduler-addr>:8786

Example: Start a scheduler and workers in the context of the local user with Singularity.

$ singularity exec docker://ghcr.io/libertem/libertem /venv/bin/dask-scheduler --host localhost
$ singularity exec docker://ghcr.io/libertem/libertem /venv/bin/libertem-worker tcp://localhost:8786