5 changes: 5 additions & 0 deletions docs/blog/.authors.yml
@@ -119,3 +119,8 @@ authors:
    description: European Molecular Biology Laboratory, Germany
    avatar: https://avatars.githubusercontent.com/u/44709261?v=4
    slug: https://github.com/stefanomarangoni495
  admccartney:
    name: Adam McCartney
    description: Austrian Scientific Computing (ASC), TU Wien
    avatar: https://avatars.githubusercontent.com/u/35410331?v=4
    slug: https://github.com/adammccartney
241 changes: 241 additions & 0 deletions docs/blog/posts/2026/03/eessi-musica.md
@@ -0,0 +1,241 @@
---
authors: [admccartney]
date: 2026-03-27
slug: eessi-musica
---

# Choosing EESSI as a base for MUSICA

<figure markdown="span">
![MUSICA](MUSICA-v2-32-Matthias_Heisler.jpg){width=75%}
<figcaption>(c) Matthias Heisler 2026</figcaption>
</figure>

MUSICA (Multi-Site Computer Austria) is the latest addition to Austria's
national supercomputing infrastructure. The system's compute resources
are distributed across three locations in Austria: Vienna, Innsbruck,
and Linz. We describe the process that led to the adoption of EESSI
as a base for the software stack on the MUSICA system at the Austrian
Scientific Computing (ASC) research center.

<!-- more -->

The background section aims to provide a brief history of how cluster
computing at ASC has evolved, with a particular focus on the various
incarnations of the software stack. We outline our motivations for
redesigning a system that delivers the software stack, for initial
use on the MUSICA HPC system. We describe the timeline of events that
led to the experiments with EESSI and EasyBuild, and offer details of
the two complementary approaches to building a software stack that we
compared. Finally, we offer a critical reflection on our experiments
and outline our ultimate reason for choosing to use EESSI as a base and
blueprint for the software stack.


## Background

The ASC (formerly VSC) is a national center for high-performance
computing and research powered by scientific software. The flagship
cluster VSC-1 was in service from 2009 to 2015 and was succeeded by a
series of clusters (2-5)[^1]. VSC 4 and 5 are the two clusters that
remain in service as of 2025; they will be joined at the end of the
year by a new cluster, MUSICA (Multi-Site Computer Austria). MUSICA is
a GPU-centric cluster run on OpenStack and has so far been the main
testing ground for our initial experiments with EasyBuild and EESSI.

The management of the software stack at ASC evolved along the following
lines:

+ VSC 1, 2: Initially catered to small groups of expert users; all
software was installed manually.

+ VSC 3, 4: Still partially managed by hand, with a set of scripting
tools for structuring software directory trees. These tools were
initially copied from Innsbruck and adapted to work on the VSC. Tcl
modules were also adopted at this time.

+ VSC 4, 5: Spack was introduced, which reduced the need for custom
install scripts, made it possible to install lots of software quickly,
and pulled in dependencies automatically.

## Motivation

Internal discussions led to a comprehensive understanding of where
the current software stack was lacking and where it would ideally be.
During the discussions, members of the user support team were able
to clearly articulate the various use cases generated by users. This
led to setting a number of high-level goals from which requirements
were derived. At a very high level, some of the more important goals
can be summarized as:

- Improved reproducibility and redeployment.
- Establishment of clear release cycles.
- Creation of a more organized and user-friendly representation of the
stack for cluster users.

We articulated what an ideal software stack should look like, and we
identified a number of issues with the way the software stack was
currently managed.

### Tooling & Presentation

The way that we had been using Spack and Tcl modules had led to a
fairly unmanageable situation on our clusters. To meet user requests
for software, we adopted a pragmatic approach. This led to a situation
in which a myriad of software variants were installed into the shared
file system hosting the systems' software, which quickly made the
presentation of available modules overwhelming for users. Another
major issue was deduplication. We don't know the root cause; it may
simply have been a misconfigured Spack. In any case, we ended up in an
untenable situation where certain dependencies would get installed many
times over. For example, there were multiple installs of the same
OpenMPI version on the system, all built slightly differently and most
untested on the systems. This meant that there was no way to indicate
to the user which version of a particular software was the one that
worked.

### Build procedure hard to reproduce

During the last operating system upgrade, the need for a more automated
build process was painfully felt. Because most software was built ad
hoc in response to user requests, sometimes the only record of the
build procedure was the build artefacts themselves. This meant manually going over a very large
software repository and rebuilding everything more or less by hand for the new
operating system.

### Poor bus factor

This refers to the well-known metric from software engineering that
measures the degree of shared knowledge of a specialized domain within
a team: how many people would have to be hit by a bus before the team
could no longer carry out its work? In this particular case, knowledge
about the software stack was concentrated in one or two individuals.

## Searching

As outlined above, the numerous issues with the current stack
established the frame in which to search for a set of tools and methods
to ease the realisation of the high-level goals for the software
stack. To reiterate, manageability and user-friendliness were at the
top of the list.


### Timeline

We formed the Software And Modules (SAM) working group in Q4 2024.
SAM consists of five people who dedicate the majority of their time
to exploring possible alternative ways of building, managing and
presenting the software stack to users. The members draw on expertise
from different areas, notably from their work on the user-support,
sysadmin and platform teams. The goal for the new software stack was to
have it up and running on the new MUSICA system towards the end of 2025.

+ Summer 2024
Initial meetings highlighted the need to reform the management of
software so that it could be easy to use, transparent and logical,
as well as tested and performant. This was the first mention of
EESSI/EasyBuild as possible alternatives to Spack, and of Lmod as an
alternative to Tcl modules.

+ Autumn 2024
The working group was established and a broad set of tools and
approaches was compared: Guix/Nix, Spack, EasyBuild, EESSI, Lmod, and
ReFrame. These tools were installed on a number of existing systems
and briefly tried out against a set of high-level user requirements
that we had agreed upon. The outcome was to focus on EasyBuild and
EESSI.

+ Winter 2024 - Spring 2025
We made the strategic decision to have EESSI installed on the MUSICA
system, and decided to run a small experiment whereby a small software
stack would be built and installed, in order to compare and contrast
two approaches: "EESSI on the side" vs. "EESSI as a base".


In June 2025, the system entered a closed test phase, with core software
provided by EESSI. The custom stack will be extended during the course
of the test phase.

## Experiments

### Test stack

The following programs were agreed upon as a way to come into contact
with specific workflows, such as writing easyconfig files, writing
custom EasyBuild hooks, installing commercial software, and installing
GPU-specific application software.

+ AOCC 5.0.0
+ Intel Compilers
+ VASP 6.5.0
+ One commercial software package (STAR-CCM+, Mathematica)
+ NVHPC
+ VASP 6.5.0 (GPU)
+ Containers (Singularity, Docker, NVIDIA)

### EESSI on the side

This approach represents, in a sense, the traditional way to build a
software stack: building everything directly on the host (Rocky 9) and
relying on system libraries. It used scripts and wrappers from the sse2
toolkit from the National Supercomputer Centre at Linköping University
to manage and structure the modules and software installations. The
software builds were a mixture of EasyBuild scripts and makefiles.
EESSI was offered as a module in its pure form, and users were
generally discouraged from using EESSI-extend, or did so at their own
risk.
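
Offering EESSI "in its pure form" essentially means exposing the stack
as distributed via CVMFS. As a rough sketch (the version directory
shown is an assumption; check what is actually mounted on your system),
a user or a wrapper modulefile would do something like:

```shell
# Hypothetical sketch: initialize the public EESSI stack from CVMFS.
# The version directory (2023.06) is an assumption, not a guarantee.
source /cvmfs/software.eessi.io/versions/2023.06/init/bash

# After initialization, EESSI's module tree for the detected CPU
# architecture is available via Lmod:
module avail            # list EESSI-provided modules
module load GROMACS     # example: load an application from the stack
```

In this mode the site adds nothing on top; everything the user loads
comes straight from the EESSI repository.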

### EESSI as a base

With this approach, we leveraged EESSI-extend extensively and aimed to
build the whole stack with the compatibility layer from EESSI as a
base. The learning curve for building software moved through three
distinct phases, each leveraging a different setting of the
EESSI-extend module:

+ Phase 0 -> EESSI_USER_INSTALL
+ Phase 1 -> EESSI_SITE_INSTALL
+ Phase 2 -> EESSI_PROJECT_INSTALL=/cvmfs/software.asc.ac.at
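
A minimal sketch of how these phases map onto EESSI-extend usage
(variable semantics follow the EESSI-extend documentation; the paths
and the easyconfig name are hypothetical placeholders):

```shell
# Phase 0: per-user installs, e.g. under $HOME (path is an assumption)
export EESSI_USER_INSTALL=$HOME/eessi
module load EESSI-extend
eb --robot SomeApp-1.0.eb        # hypothetical easyconfig

# Phase 1: site installs, placed in EESSI's host_injections area
unset EESSI_USER_INSTALL
export EESSI_SITE_INSTALL=1
module load EESSI-extend
eb --robot SomeApp-1.0.eb

# Phase 2: project installs into the site's own CVMFS repository
unset EESSI_SITE_INSTALL
export EESSI_PROJECT_INSTALL=/cvmfs/software.asc.ac.at
module load EESSI-extend
eb --robot SomeApp-1.0.eb
```

The relevant environment variable must be set before loading
EESSI-extend, since the module inspects it at load time to decide
where EasyBuild installations end up.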


## Reflections

### EESSI on the side

By comparison, it was much quicker and easier to build all the software
in the list using this approach. It also offers a lot of control to the
sysadmin who builds the software; things like tweaking or modifying
module files in place were possible. The downsides were reproducibility
and portability: there would be obvious work involved in building the
stack again upon the next OS upgrade. That said, everything worked much
more smoothly than with EESSI-extend; it was possible to build all the
software that was listed and run basic tests with Slurm. We had some
open questions around interoperability between custom modules and
EESSI, and whether modules from the two independent stacks could be
mixed without running into issues (probably not, given the different
libc versions).


### EESSI as a base

By the end of the closed test phase of MUSICA, the engineering team
chose EESSI as the foundation for the software stack. While this approach
introduced complexity into our build and installation workflows, it
enabled us to meet certain key requirements for the MUSICA software
infrastructure.

Specifically, we leveraged CVMFS to distribute the software stack
across the three sites: Vienna, Linz, and Innsbruck. EESSI offers
access to approximately 1960 modules that are ready to load on the
target architecture. Setting up EESSI was quite straightforward, and
although team members found the many installation options of the
EESSI-extend module complex, adopting this method aligned with modern
practices for managing HPC software. EESSI is open source, well
documented, and maintained by colleagues within Europe's HPC ecosystem.
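
For context, the client-side setup on an RPM-based system such as
Rocky 9 follows the pattern described in the EESSI documentation;
treat the following as a sketch rather than a verified recipe, and
check the current package URLs before use:

```shell
# Install the CVMFS client and the EESSI CVMFS configuration package
# (URLs as published in the CVMFS and EESSI docs; verify before use).
sudo dnf install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
sudo dnf install -y cvmfs
sudo dnf install -y https://github.com/EESSI/filesystem-layer/releases/download/latest/cvmfs-config-eessi-latest.noarch.rpm

# Minimal single-client configuration (quota in MB is an example value)
echo 'CVMFS_CLIENT_PROFILE="single"' | sudo tee    /etc/cvmfs/default.local
echo 'CVMFS_QUOTA_LIMIT=10000'       | sudo tee -a /etc/cvmfs/default.local
sudo cvmfs_config setup

# Verify that the EESSI repository is reachable
cvmfs_config probe software.eessi.io
```

At larger scale, sites typically add local Squid proxies or a private
Stratum 1 replica in front of the clients, which is also how a site
repository such as software.asc.ac.at can be distributed alongside EESSI.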

Engaging with EESSI's documentation, source code, and community proved
valuable. We identified a reusable blueprint that we could adapt to fit
our specific needs. Despite the initial learning curve, this approach
provided long-term benefits in terms of maintainability and scalability.


## Footnotes

[^1]: <https://docs.vsc.ac.at/systems/>