We have recently procured 120TB of NVMe-based SSD storage from E8 Storage for the Apocrita HPC Cluster. The plan is to deploy this to replace our oldest and slowest provision of scratch storage. We have been performing extensive testing on this new storage, as we expect it to offer new possibilities and advantages within the cluster.
In addition to the primary queue, there is a queue designed to minimise waiting times for short jobs and interactive sessions, added in response to users who wanted to obtain qlogin sessions quickly for tests and debugging. This short queue runs on a wider selection of nodes and is selected automatically if your runtime request is 1 hour or less.
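For example, a job script requesting one hour or less of runtime will be routed to the short queue without any extra options. This is a minimal sketch using the scheduler's `h_rt` resource request; the script body and the `my_quick_test` command are placeholders.

```shell
#!/bin/bash
#$ -cwd             # run the job from the current working directory
#$ -l h_rt=1:0:0    # request 1 hour or less: eligible for the short queue
./my_quick_test     # hypothetical short-running command
```

An interactive session with the same runtime request (e.g. `qlogin -l h_rt=1:0:0`) benefits from the same shorter waiting times.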
We removed some problematic module files. Please check your job scripts for use of these modules:
- Python: due to a number of issues with the module installs of Python, versions older than 3.6.3 are being removed from Apocrita.
- Java: version `java/1.8.0_121-oracle` causes problems with mass thread spawning on the cluster and will be removed. `java/1.8.0_152-oracle` will remain the default version loaded.
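If you are unsure whether any of your job scripts still load a removed module, a quick grep can flag them. This is a sketch only: the `/tmp/job_check` directory is illustrative, and the version pattern is an example that does not catch every Python version below 3.6.3.

```shell
# Create an example job script to demonstrate the check (illustrative only)
mkdir -p /tmp/job_check
cat > /tmp/job_check/old_job.sh <<'EOF'
#!/bin/bash
module load java/1.8.0_121-oracle
EOF

# List any scripts that reference the removed Java module or an old Python module
grep -rlE 'module load (python/(2|3\.[0-5])|java/1\.8\.0_121-oracle)' /tmp/job_check
```

Run the same `grep` against the directory holding your own job scripts and update any matches to load a supported module version.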
During the summer, home directories were migrated to the new storage platform. This means that quotas have grown slightly as the underlying block size has increased.
The `qmquota` command will tell you how much space you are using. Note that quotas are applied to both the total size and the number of files.
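To check where you stand against your quota, run the command from a login node; we do not reproduce the output format here, as your own figures will differ.

```shell
# Show your current storage usage and quota limits on Apocrita
qmquota
```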
Each research group gets 1TB of storage space on the cluster free of charge; if your group does not have this yet, please contact us and we can organise it for you.
We identified a problem with the `openmpi/2.0.2-gcc` module and have removed it, as the correct interface was not being used for the MPI communication between nodes. This resulted in potentially much slower communication and consequently jobs taking longer to run.
Programs should be rebuilt against the other available openmpi modules, which correctly select the InfiniBand interconnect as the default for communication. Recent users of this module have been contacted directly.
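A typical rebuild might look like the following sketch. The module and file names are examples; run `module avail openmpi` to see which OpenMPI modules are actually installed, and substitute your own source files and build system.

```shell
# Swap the removed module for a supported OpenMPI module (names are examples)
module unload openmpi/2.0.2-gcc
module load openmpi

# Recompile the application against the newly loaded MPI (hypothetical source file)
mpicc -O2 -o my_app my_app.c
```

If your code is built with a Makefile or CMake, a full clean rebuild is the safest way to ensure no object files remain linked against the removed module.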