
Machine learning applications and frameworks

CSCS supports a wide range of machine learning (ML) applications and frameworks on its systems. Most ML workloads are containerized to ensure portability, reproducibility, and ease of use across environments.

Users can choose between running containers, using provided uenv software stacks, or building custom Python environments tailored to their needs.

Running machine learning applications with containers

Containerization is the recommended approach for ML workloads on Alps, as it simplifies software management and maximizes compatibility with other systems.

  • Users are encouraged to build their own containers, starting from popular sources such as the Nvidia NGC Catalog, which offers a variety of pre-built images optimized for HPC and ML workloads, for example the PyTorch and TensorFlow containers.
  • For frequently changing dependencies, consider creating a virtual environment (venv) and mounting it into the container; a quick way to verify that the venv is picked up is sketched after this list.
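
A bind-mounted venv only helps if it actually shadows the image's own packages, so it is worth inspecting the interpreter from inside the container. The following is a minimal sketch, not a definitive recipe: it assumes the venv was created with the container's Python (for example with --system-site-packages so the image's libraries remain visible), and the package names are placeholders.

    # venv_check.py -- minimal sketch; assumes a venv created with the
    # container's Python (e.g. `python -m venv --system-site-packages`)
    # and bind-mounted into the container at run time.
    import importlib.util
    import sys

    print("interpreter :", sys.executable)   # should point into the mounted venv
    print("venv prefix :", sys.prefix)
    print("base prefix :", sys.base_prefix)  # the container image's Python

    # Packages installed into the venv resolve from the venv first;
    # everything else falls back to the image's site-packages.
    for name in ("torch", "numpy"):           # example package names
        spec = importlib.util.find_spec(name)
        print(f"{name:6s} ->", spec.origin if spec else "not installed")

With this layout, fast-moving dependencies live in the venv and can be updated without rebuilding the image, while the heavy, performance-critical libraries keep coming from the container.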

Helpful references:

Using provided uenv software stacks

Alternatively, CSCS provides pre-configured software stacks (uenvs) that can serve as a starting point for machine learning projects. These environments provide optimized compilers, libraries, and selected ML frameworks.

Available ML-related uenvs:

To extend these environments with additional Python packages, it is recommended to create a Python Virtual Environment (venv). See this PyTorch venv example for details.

Note

While many Python packages provide pre-built binaries for common architectures, some may require building from source.
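
After activating a venv on top of a uenv, a short sanity check helps confirm that the GPU-enabled PyTorch build is being used rather than a CPU-only wheel pulled from PyPI. This is a minimal sketch and assumes PyTorch with CUDA and NCCL support is already available in the environment:

    # check_torch.py -- minimal sanity check; assumes PyTorch with CUDA
    # and NCCL support is provided by the environment.
    import sys
    import torch

    print("python :", sys.executable)              # should point into the venv
    print("torch  :", torch.__version__, "from", torch.__file__)
    print("CUDA   :", torch.version.cuda, "available:", torch.cuda.is_available())
    print("cuDNN  :", torch.backends.cudnn.version())
    if torch.cuda.is_available():
        print("device :", torch.cuda.get_device_name(0))
        print("NCCL   :", torch.cuda.nccl.version())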

Building custom Python environments

Users may also choose to build entirely custom software stacks using Python package managers such as uv or conda. Most ML libraries are available via the Python Package Index (PyPI).

To ensure optimal performance on CSCS systems, we recommend starting from an environment that already includes:

  • CUDA, cuDNN
  • MPI, NCCL
  • C/C++ compilers

This can be achieved either by:

  • starting from a container image that ships these libraries, or
  • starting from one of the provided uenv software stacks,

and extending it with a virtual environment.
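
Whichever route is chosen, it is worth verifying that the communication libraries are wired up correctly before launching long training runs. The sketch below is illustrative only: it assumes a CUDA-enabled PyTorch and a launcher such as torchrun (or an equivalent that sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR and MASTER_PORT), and it performs a single NCCL all-reduce as a smoke test.

    # nccl_smoke_test.py -- minimal sketch; assumes CUDA-enabled PyTorch and a
    # launcher (e.g. torchrun) that sets the standard rendezvous variables.
    import os
    import torch
    import torch.distributed as dist

    def main():
        local_rank = int(os.environ.get("LOCAL_RANK", 0))
        torch.cuda.set_device(local_rank)
        dist.init_process_group(backend="nccl")   # reads RANK/WORLD_SIZE from env

        # All-reduce a single tensor as a smoke test of GPU-to-GPU communication.
        t = torch.ones(1, device="cuda")
        dist.all_reduce(t)
        if dist.get_rank() == 0:
            print(f"all_reduce result: {t.item()} (expected {dist.get_world_size()})")

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()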