Contributing

LiberTEM is intended and designed as a collaboratively developed platform for data analysis. That means all our development is coordinated openly, mostly on our GitHub repository where our code is hosted. Any suggestions, Issues, bug reports, discussions and code contributions are highly appreciated! Please let us know if you think we can improve on something, be it code, communication or other aspects.

Development principles

We have a rather extensive and growing list of things to work on and therefore have to focus our limited resources on the items with the largest benefit for our user base and project. Supporting users who contribute code is most important to us. Please contact us for help! Furthermore, we prioritize features that create direct benefits for many current users or open up significant new applications. Generally, we follow user demand with our developments.

For the design of new features we roughly follow the lead user method, which means that we develop new features closely along a non-trivial real-world application in order to make sure the developments are appropriate and easy to use in practice. The interface for user-defined functions (UDFs), as an example, follows the requirements of implementing and running complex algorithms, such as strain mapping, on distributed systems.

Furthermore, we value a systematic approach to development, with requirements analysis and evaluation of design options, as well as iterative design with fast test and review cycles.

Code contributions

We are using pull requests to accept contributions. Each pull request should focus on a single issue, to keep the number of changes small and reviewable. To keep your changes organized and to prevent unrelated changes from disturbing your pull request, create a new branch for each pull request.

All pull requests should come from a user’s personal fork since we don’t push to the main repository for development. See the GitHub documentation on how to create and manage forks for details.

Before creating a pull request, please make sure all tests still pass. See Running the Tests for more information. You should also update the test suite and add test cases for your contribution. See the section Code coverage below on how to check if your new code is covered by tests.

To make sure our code base stays readable and consistent, we follow a Code Style.

Please update packaging/creators.json with your author information when you contribute to LiberTEM for the first time. This helps us to keep track of all contributors and give credit where credit is due! Please let us know if you would prefer not to be credited. Our Authorship policy describes in more detail how we manage authorship of LiberTEM and related material.

If you are changing parts of LiberTEM that are currently not covered by tests, please consider writing new tests! When changing example code, which is not run as part of the tests, make sure the example still runs.

When adding or changing a feature, you should also update the corresponding documentation, or add a new section for your feature. Follow the current documentation structure, or ask the maintainers where your new documentation should end up. When introducing a feature, it is okay to start with draft documentation in the first PR, provided it is completed later. Changes to APIs should be reflected in the corresponding docstrings.

Please include version information if you add or change a feature in order to track and document changes. We use a rolling documentation that documents previous behavior as well, for example "This feature was added in Version 0.3.0.dev0" or "This describes the behavior from Version 0.3.0.dev0 and onwards. The previous behavior was this and that." If applicable, use versionadded and related directives.
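
For illustration, such a directive in a docstring might look like the following minimal sketch; the function name, parameter and version numbers are made up:

def center_of_mass(dataset, mask=None):
    """
    Compute a center of mass analysis for a data set.

    .. versionadded:: 0.3.0.dev0

    .. versionchanged:: 0.4.0.dev0
        The ``mask`` parameter was added.
    """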

The changelog for the development branch is maintained as a collection of files in the docs/source/changelog/*/ folder structure. Each change should get a separate file to avoid merge conflicts. The files are merged into the master changelog when creating a release.

The following items might require an update upon introducing or changing a feature:

  • Changelog snippet in docs/source/changelog/*/

  • Docstrings

  • Examples

  • Main Documentation

When you have submitted your pull request, someone from the LiberTEM organization will review it, and may add comments or ask questions.

In case your PR touches I/O code, an organization member may run the I/O tests with access to test data sets on a separate Azure Agent, using the following comment on the PR:

/azp run libertem.libertem-data

If everything is good to go, your changes will be merged and you can delete the branch you created for the pull request.

Running the tests

Our tests are written using pytest. For running them in a repeatable manner, we are using tox. Tox automatically manages virtualenvs and allows testing on different Python versions and interpreter implementations.

This makes sure that you can run the tests locally the same way as they are run in continuous integration.

After installing tox, you can run the tests on all Python versions by simply running tox:

$ tox

Or select a specific environment to run:

$ tox -e py39

For faster iteration, you can also run only a part of the test suite, without using tox. To make this work, first install the test requirements into your virtualenv:

(libertem) $ python -m pip install -r test_requirements.txt

Now you can run pytest on a subset of tests, for example:

(libertem) $ pytest tests/test_analysis_masks.py

Or you can run tests in parallel, which may make sense if you have a beefy machine with many cores and a lot of RAM:

(libertem) $ pytest -n auto tests/

See the pytest documentation for details on how to select which tests to run. Before submitting a pull request, you should always run the whole test suite.

Some tests are marked with custom markers; for example, we have some tests that take many seconds to complete. To select tests to run by these marks, you can use the -m switch. For example, to only run the slow tests:

$ tox -- -m slow

By default, these slow tests are not run. If you want to run both the slow tests and all other tests, you can use a boolean expression like this:

$ tox -- -m "slow or not slow"

Another example, to exclude both slow and dist tests:

$ tox -- -m "not dist and not slow"

In these examples, -- separates the arguments of tox (left of --) from the arguments for pytest on the right. List of marks used in our test suite:

  • slow: tests that take much longer than 1 second to run

  • dist: tests that require a distributed Dask cluster setup
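
As a minimal sketch, a test declares such a mark with standard pytest syntax; the test function below is made up for illustration:

import pytest

@pytest.mark.slow
def test_large_reconstruction():
    # Marked as slow: with the project defaults this test is deselected
    # unless a matching -m expression (e.g. "slow or not slow") includes it.
    ...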

Example notebooks

The example notebooks are also run as part of our test suite using nbval. The output saved in the notebooks is compared to the output of re-running the notebook. When writing an example notebook, this comparison sometimes requires extra work, because certain outputs change from run to run. Please check the nbval documentation to understand how to ignore such values. See also the nbval_sanitize.cfg file for our currently ignored patterns.

If your notebook outputs floating point values, you may get spurious failures from precision issues. You can set the precision using the %precision IPython magic, which should be used after importing numpy.
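
A minimal sketch of such a notebook cell, assuming eight digits are sufficient for the values in question:

import numpy as np

# Fix the displayed float precision so that re-running the notebook
# under nbval produces stable, comparable output.
%precision 8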

You can run the notebook tests as follows:

$ TESTDATA_BASE_PATH=/path/to/the/test/data tox -e notebooks

You will need access to certain sample data sets; as most of them are not published yet, please contact us to get access!

CUDA

To run tests that require CuPy using tox, you can specify the CUDA version with the test environment:

$ tox -e py39-cuda101

Code coverage

After running the tests, you can inspect the test coverage by opening htmlcov/index.html in a web browser. When creating a pull request, the change in coverage is also reported by the codecov bot. Ideally, the test coverage should go up with each pull request; at the very least, it should stay the same.

Benchmarking

LiberTEM uses pytest-benchmark to benchmark certain performance-critical parts of the code. You can run the benchmarks ad-hoc using

$ pytest benchmarks/

The benchmarks for Numba compilation time are disabled by default since Numba caches compilation results, i.e. one has to make sure that the benchmarked functions have not been run previously in the same interpreter. To run them, you can use

$ pytest -m compilation benchmarks/
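
For orientation, a benchmark test with pytest-benchmark looks roughly like the following sketch; the benchmarked function is a made-up stand-in, not an actual LiberTEM routine:

import numpy as np

def _make_masks(shape):
    # Hypothetical stand-in for a performance-critical routine.
    return np.ones(shape, dtype=np.float32)

def test_make_masks(benchmark):
    # pytest-benchmark provides the "benchmark" fixture, which calls the
    # function repeatedly and records timing statistics.
    result = benchmark(_make_masks, (256, 256))
    assert result.shape == (256, 256)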

In order to record a complete benchmark run for later comparison, you can use

Note

This requires tox>=3.15 since we are using generative section names in tox.ini.

$ tox -e benchmark
$ # alternatively
$ tox -e benchmark-cuda101
$ tox -e benchmark-cuda102

This saves the benchmark data as a JSON file in a subfolder of benchmark_results. A process to commit such results and report them in a convenient fashion is still to be developed. See #198; feedback is welcome!

New in version 0.6.0: First benchmarks included to help resolve #814; benchmark coverage will grow over time.

Running tests for the client

To run the test suite for the client, first install the JavaScript/TypeScript dependencies:

$ cd client/
$ npm install

Then, in the same directory, run the tests with:

$ npm run coverage

This will run all tests and report code coverage. If you want to run the tests while developing the client, you can run them in watch mode, which is the default:

$ cd client/
$ npm test

Code style

We try to keep our code PEP8-compliant, with the line length relaxed to 100 characters, and some rules ignored. See the file .flake8 in the repository root for the current settings. As a general rule, try to keep your changes in a similar style as the surrounding code.

Before submitting a pull request, please check the code style by running:

$ pre-commit run

You may need to install pre-commit into your Python environment first. You can use the following to automatically run the pre-commit hooks before each commit:

$ pre-commit install --install-hooks

We recommend using an editor that can check code style on the fly, such as Visual Studio Code.

Mypy type annotations

We are starting to introduce type annotations and checking them in CI with mypy. Adding type annotations improves developer experience, especially by improving auto completion and type information on hover in IDEs. Type checking is currently quite lax and opt-in. See the file .mypy-checked for the list of Python files that currently opt in. When adding new code, please consider adding new modules to this list.

The checks are run with pre-commit on changed files that opt in. You can run mypy locally on all files that opt in with:

$ pre-commit run --all-files mypy

Please note that in many cases the type of a class is specified with a string instead of the class itself. This allows importing classes for typing purposes only when type checking is performed. See the section on forward references in PEP 484 for more information.
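
A minimal sketch of this pattern; the import path and function are made up for illustration:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by mypy and IDEs, not at runtime, which avoids
    # import cycles and unnecessary runtime imports.
    from libertem.io.dataset.base import DataSet

def open_dataset(path: str) -> "DataSet":
    # The string annotation is resolved lazily, so the class does not
    # have to be imported at runtime.
    ...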

For general information on type annotations in Python, including best practices, please also see Static Typing with Python.

Docstrings

The NumPy docstring guide is our guideline for formatting docstrings. We are testing docstring code examples in Continuous Integration using doctest. You can test files by hand by running pytest --doctest-modules <pathspec>.
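
For orientation, a sketch of a NumPy-style docstring with a doctest example; the function is made up:

import numpy as np

def normalize(frame):
    """
    Scale a frame so that its maximum value is 1.

    Parameters
    ----------
    frame : numpy.ndarray
        The input frame.

    Returns
    -------
    numpy.ndarray
        The normalized frame.

    Examples
    --------
    >>> normalize(np.array([1.0, 2.0, 4.0]))
    array([0.25, 0.5 , 1.  ])
    """
    return frame / frame.max()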

Building the documentation

Documentation building is also done with tox; see above for the basics. It requires manual installation of pandoc on the build system, since pandoc can’t be installed reliably using pip. To start the live build process:

$ tox -e docs

You can then view the documentation at http://localhost:8008, which will be rebuilt every time a change to the documentation source code is detected. Note that changes to the Python source code don’t trigger a rebuild, so if you are working on docstrings, you may have to manually trigger a rebuild, for example by saving one of the .rst files.

You can include code samples with the doctest sphinx extension and test them with

$ tox -e docs-check

Building the GUI (client)

The LiberTEM GUI (also called the client) is written in TypeScript, using a combination of React/Redux/Redux-Saga. The client communicates with the Python API server using both HTTP and websockets. Because browsers can’t directly execute TypeScript, there is a build step involved, which translates the TypeScript code into JavaScript that the browser can then execute. This build step is needed both during development and for building the production version.

If you would like to contribute to the client, you first need to set up the development environment. For this, first install Node.js. On Linux, we recommend installing it via the package manager; on Windows, the installer should be fine. Choose the current LTS version.

Once you have Node.js installed, the npm command should be available on your path. You can then install the needed build tools and dependencies by changing to the client directory and running the install command:

$ cd client/
$ npm install

Note

It is always a good idea to start development by installing the current dependencies with the above command. Having old versions of dependencies installed may cause the build to fail or lead to unpredictable failures. In some cases, it makes sense to completely remove the node_modules folder before installing the Node.js dependencies.

Once this command has finished without errors, you can start a development server (also from the client directory):

$ npm run start

You can then press the o key to open the LiberTEM GUI in your default browser.

This server watches all source files for changes and automatically starts the build process. The development server will only be able to serve requests for JavaScript and other static files. For handling HTTP API requests you still need to run the Python libertem-server process on the default port (9000) alongside the development server:

$ libertem-server --no-browser

This allows proxying the HTTP API requests from the front-end server to the API server without opening an additional browser window that could interfere with the development server.

You can control where the web API requests are sent using environment variables, for example if the API server is running on a non-standard port or on another host:

$ API_HOST=127.0.0.1 API_PORT=9001 npm run start

To learn more about the build process, please see also the README in the client directory.

You can then use any editor you like to edit the client source files in the client/src directory. We recommend Visual Studio Code for its excellent TypeScript support.

To simplify development and installing from a git checkout, we currently always ship a production build of the client in the git repository. Please always open your pull request for the client as WIP and include a rebuilt production build after the PR is approved and ready to merge. You can create it using a tox shortcut:

$ tox -e build_client

This will build an optimized production version of the client and copy it into src/libertem/web/client. This version will then be used when you start a libertem-server without the client development proxy in front.

Advanced

See more: