The next era for Apocrita is here

For much of the year we have been working on a major project to upgrade Apocrita to a new operating system, Rocky Linux 9 (hereafter Rocky 9). As part of the project, we have deployed a new package building tool to help us recompile all of the research applications for the new system. We are now draining the cluster in batches to upgrade to Rocky 9, so this is a good time to move across if you haven't already: queueing times on CentOS 7 will grow as we reduce the number of remaining CentOS 7 nodes.

Frequently asked questions when migrating from CentOS 7

Please see the FAQs section for commonly asked questions relating to the migration from CentOS 7 to Rocky 9.

Notice

This blog post will receive further updates during the migration phase.

What is the new system?

The existing Apocrita cluster runs the CentOS 7 operating system, which is based on Red Hat Enterprise Linux (RHEL) 7. CentOS 7 has reached end-of-life and we are replacing it with Rocky 9, which is based on RHEL 9. Beyond the security and compliance reasons, this brings the benefit of much newer libraries and applications.

The end-of-life platform can no longer support newer software releases: various packages no longer work, as our community of R users has discovered recently. The new platform offers newer versions of research applications, so it will be of particular interest to anyone who wants the latest versions of software.

Additionally, due to recent decommissioning of older nodes, we have been able to compile the new applications against a more modern CPU architecture, providing better compatibility and performance for all applications.

So far we have provisioned a new login node, an ondemand service and various types of compute and GPU nodes on Rocky 9. These include our brand new vhm high memory servers, which can accommodate jobs using up to 3TB of RAM.

How can I use the new system?

Although jobs can be submitted from the CentOS 7 frontend nodes, you will not be able to view the list of newly-built applications from a CentOS 7 node. To access the Rocky 9 login node, SSH into login-test.hpc.qmul.ac.uk using your existing Apocrita username, password and SSH key. To check what applications are available, you can run module avail from this server.

We have provided a new ondemand server, ondemand-dev.hpc.qmul.ac.uk, which brings new functionality and newer versions of applications (for example, a newer RStudio).

We will soon be moving the default login and ondemand servers to Rocky 9. We have a range of compute nodes running Rocky 9, which can be checked with nodestatus -t rocky from the login server.

To run on the new Rocky 9 nodes, you must:

  • request -l rocky to ensure your job is allocated to a Rocky 9 node
  • ensure your job script is loading a module provided under Rocky 9
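The two requirements above can be combined in a job script along the following lines. This is a minimal sketch, not an official template: it assumes a Grid Engine-style scheduler in which lines starting with #$ carry submission options (the post itself only specifies the -l rocky request), and the is_rocky9 helper is a hypothetical name introduced here to confirm from /etc/os-release that the job really landed on a Rocky 9 node.

```shell
#!/bin/bash
#$ -l rocky    # from the post: request a Rocky 9 node (the "#$" prefix is an assumption)

# Hypothetical helper: succeed only if the given os-release file
# (default: /etc/os-release) describes Rocky Linux 9.x.
is_rocky9() {
    local file="${1:-/etc/os-release}"
    # Source the file in a subshell so ID and VERSION_ID do not leak into the job.
    ( . "$file" 2>/dev/null && [ "$ID" = "rocky" ] && [ "${VERSION_ID%%.*}" = "9" ] )
}

# Warn early if the job is running somewhere other than a Rocky 9 node.
is_rocky9 || echo "warning: this does not look like a Rocky 9 node" >&2

# Load a module that exists under Rocky 9 before running your program.
# The name is deliberately left as a placeholder: pick one from "module avail".
# module load <module-name>
```

Keep the remaining resource requests from your existing job script unchanged; only the -l rocky line and the module versions need to differ.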

Who should use the new system?

All users of Apocrita should now be ensuring that their workloads run correctly on Rocky 9. We have provisioned serial, parallel and GPU nodes running Rocky 9. Any issues encountered should be reported via the Slack channel or a support ticket.

Changes for conda users

Due to some recent Anaconda licence changes, we now only provide the miniforge module for using conda environments. This works in the same way as Anaconda and Miniconda but does not use certain problematic Anaconda repositories. Miniforge is the officially supported installation method for the faster Mamba solver and comes with the popular conda-forge channel pre-configured as its only channel, where most of the common conda packages are found.

We strongly advise against using, on Rocky 9, any conda environments that you created previously on CentOS 7. Instead, you should start with fresh environments. If you want to "migrate" an old environment, export it as a YAML file on a CentOS 7 node and then use that same file to re-create the environment on a Rocky 9 node.
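The export and re-create steps might look like the sketch below. The function names export_conda_env and recreate_conda_env, and the environment name myenv used afterwards, are illustrative only; the conda subcommands themselves (conda env export and conda env create) are standard.

```shell
# Run on a CentOS 7 node: write the environment definition to a YAML file.
# Arguments: environment name, output file.
export_conda_env() {
    conda env export --name "$1" > "$2"
}

# Run on a Rocky 9 node, after loading the miniforge module:
# build a fresh environment from the exported file.
# Argument: the YAML file written by export_conda_env.
recreate_conda_env() {
    conda env create --file "$1"
}
```

For example, export_conda_env myenv ~/myenv.yml on a CentOS 7 node, followed by recreate_conda_env ~/myenv.yml on a Rocky 9 node. Because home directories are shared between the two systems, the exported file does not need to be copied anywhere.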

More detailed information can be found in this separate blog post.

Migration from CentOS 7 FAQs

Do I need to copy/move my data across from the CentOS 7 cluster?

No. All current storage systems (home directories, scratch and lab shares) are available on Rocky 9 using the same path as the CentOS 7 cluster, for example /data/home/$USER for home directories and /data/scratch/$USER for scratch. You will not need to copy or move your data when migrating to Rocky 9.

How can I use applications on the Rocky 9 cluster?

The modules built on Rocky 9 are similar to those in CentOS 7, providing a familiar environment for users transitioning between the two clusters. To view a list of all available modules, run module avail.

Note that we have only built newer versions of software on the Rocky 9 cluster, so you should check that the versions available are suitable for your research.

See the Using Modules page on our docs site for more information.

How can I resolve version `GLIBC_2.17' not found errors in Rocky 9?

GLIBC is a core library of the operating system and cannot be swapped for a different version. Software requiring version 2.17 of GLIBC (CentOS 7) will need to be rebuilt using the newer version of GLIBC present in Rocky 9.

You may also need to remove any local files written in your home directory before rebuilding the software. This includes, but is not limited to: ~/.cache, ~/.jupyter and ~/.local.
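If you would rather not delete these directories outright, one cautious variation is to move them aside first, so they can be restored if something turns out to be needed. The sketch below does that; the function name stash_centos7_state and the .centos7-backup suffix are our own illustrative choices, not a site convention.

```shell
# Move per-user state directories aside instead of deleting them.
# Argument: home directory to clean (default: $HOME).
stash_centos7_state() {
    local home="${1:-$HOME}" suffix=".centos7-backup" dir
    for dir in .cache .jupyter .local; do
        # Skip directories that do not exist, and never overwrite an earlier backup.
        if [ -e "$home/$dir" ] && [ ! -e "$home/$dir$suffix" ]; then
            mv -- "$home/$dir" "$home/$dir$suffix"
            echo "moved $home/$dir -> $home/$dir$suffix"
        fi
    done
}
```

Running stash_centos7_state with no arguments acts on your own home directory. Once the rebuilt software works on Rocky 9, the backup directories can be deleted.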

Do I need to kill my running CentOS 7 jobs?

No. Your currently running jobs can run to completion. Once they have completed, we recommend submitting only Rocky 9 jobs, using the extra -l rocky parameter.

Please raise a support ticket if you require any assistance with submitting Rocky 9 jobs.

Do I need to request a new HPC account on the Rocky 9 cluster?

No. Your existing HPC username and password will continue to work on the Rocky 9 cluster.

Do I need to generate a new SSH key for the Rocky 9 cluster?

No. Although we now recommend ED25519 keys, we still support existing RSA keys of 4096 bits or more.

See our SSH keys page on our docs site for more information.

How can I fix a REMOTE HOST IDENTIFICATION HAS CHANGED! login error?

This is shown because phase 2 of the cluster migration includes changing the login.hpc.qmul.ac.uk address to point to the Rocky 9 login servers, which have a different host identification than the CentOS 7 frontend nodes.

To fix this, you will need to remove the cached entry for login.hpc.qmul.ac.uk in your local ~/.ssh/known_hosts file by either manually removing it using a text editor, or by running the ssh-keygen -R login.hpc.qmul.ac.uk command. This is a one-off operation caused by the move from CentOS 7 to Rocky 9.

After removing the cached entry, simply log into the cluster again and accept the new host identification if prompted.
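The ssh-keygen approach can be wrapped in a small helper that first checks whether a cached entry exists. This is only a sketch to run on your local machine: forget_host is a hypothetical name, and it relies on the standard ssh-keygen options -F (look up a host), -R (remove a host) and -f (choose the known_hosts file).

```shell
# Remove any cached host key for a host, if one is present.
# Arguments: hostname, optional known_hosts path (default: ~/.ssh/known_hosts).
forget_host() {
    local host="$1" file="${2:-$HOME/.ssh/known_hosts}"
    # -F looks the host up and exits non-zero when there is no entry.
    if ssh-keygen -F "$host" -f "$file" >/dev/null 2>&1; then
        # -R deletes the matching entries; ssh-keygen keeps the old file as <file>.old.
        ssh-keygen -R "$host" -f "$file"
    else
        echo "no cached entry for $host in $file"
    fi
}
```

For this migration the call would be forget_host login.hpc.qmul.ac.uk, after which the next login prompts you to accept the new host identification.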


Photo by Chris Liverani on Unsplash.