Here is a round-up of recent QMUL HPC cluster news from the ITS Research team, including information about new compute nodes.
New sdx nodes are now available
Last time, we mentioned that we were in the process of purchasing some new compute nodes for the cluster. After a period of testing, these 52 nodes have now been installed and have been added to the default public queues. With 36 Intel Cascade Lake cores and 384GB RAM per node, these will capably handle all kinds of workloads, and provide a welcome speed boost. More detail is available here.
Over the next few weeks we will complete the removal of dn0-dn80. These are legacy nodes with 12 slower cores, and are the last remaining machines with 1Gb network cards, which we have been keen to remove for overall filesystem efficiency.
The GPU nodes are proving popular, and we are pleased to announce that an additional four high-end Volta V100 GPUs will be added in the next couple of weeks. Read more about our GPU nodes here.
A big month for Research on Apocrita
The popularity of Apocrita continues to grow, and despite being the shortest month, February was the “biggest” month ever on Apocrita in terms of CPU-seconds of work performed - see here. It’s also worth bearing in mind that the amount of work done by a CPU in 1 second has effectively doubled over the last few years, as faster machines have been added to the cluster.
Additionally, if you are performing research within EPSRC remit and frequently run multi-node parallel jobs, then the Tier2 HPC systems provide plenty of resource which may reduce waiting times for the Apocrita sdv parallel nodes, and provide greater scope for running larger jobs. More detail is here.
Please don’t forget to cite Apocrita in your published research if you have used it for storage/computations. Knowledge of the various research projects that benefit from the service assists the business case to secure ongoing funding.
You may be aware that filesystem snapshots allow you to easily retrieve copies of your own files from the previous few days (e.g. in the case of accidental deletion or overwriting), without needing a restore from tape backup. We’ve made some improvements to the system: you can now access a .snapshots folder from anywhere in the filesystem, and you will be shown snapshotted files relative to your current position. You can read more here.
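As a rough illustration of how the .snapshots folder might be used to look for earlier copies of a file, here is a minimal Python sketch. It assumes a layout of `.snapshots/<snapshot-name>/<filename>` alongside the file in question; the snapshot naming and exact layout on Apocrita may differ, and the function name `find_snapshot_copies` is purely hypothetical, so please check the documentation before relying on it.

```python
from pathlib import Path


def find_snapshot_copies(file_path, snapshot_dir_name=".snapshots"):
    """List snapshot copies of a file.

    ASSUMPTION: snapshots appear as
    <parent>/.snapshots/<snapshot-name>/<filename>.
    The real layout on the cluster may differ.
    """
    target = Path(file_path)
    snapshot_root = target.parent / snapshot_dir_name
    if not snapshot_root.is_dir():
        # No snapshot folder visible from this directory.
        return []
    copies = []
    # Sort by snapshot name so the order is deterministic.
    for snapshot in sorted(snapshot_root.iterdir()):
        candidate = snapshot / target.name
        if candidate.is_file():
            copies.append(candidate)
    return copies
```

The file itself does not need to exist any more, so this also works after an accidental deletion. Any path returned could then be copied back into place, for example with `shutil.copy2`.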
Research Software Engineers
A reminder that our Research Software Engineering team can help with code optimisation, parallelisation, and code review/advice, among other services. In particular, we have noticed quite a few Fortran users who may benefit from a chat with a member of the team who specialises in that area.
We’ve been providing hands-on HPC workshops recently (Maths, and the BCI in the last month or so), to provide introductory and more advanced cluster skills for more effective research computing. We will be covering the Medical School over the next couple of months, and afterwards we will be looking to see where else there is demand. If you are interested, we can also run one for your department. The workshops tend to work well with groups of around 10-20 attendees, and we customise some sections based on the audience and typical use cases.