tutorial¶

October 27, 2022
in tutorial
4 min read

Speeding Up Grep Searches

Sometimes you may find yourself needing to filter a large amount of output using the grep command. However, grep can sometimes struggle when you try to filter files with an incredibly large number of lines, as it loads each line into RAM line-by-line. This can mean you can quickly exhaust even large amounts of requested RAM. There are a few ways around this.

June 15, 2022
in tutorial, featured
6 min read

Using job statistics to increase job performance and reduce queueing time

You may wonder why some jobs start immediately but some wait in the queue for hours or days, even if your job is quite simple. If you notice your job has been queueing for a while, you may want to consider adjusting the requested resources to reduce queueing time and reduce any potential resource wastage as the job runs. Below, we outline two useful tools for you to check the resource usage of previous jobs.

March 7, 2022
in tutorial
3 min read

Visualising HEALPix results with Jupyter Notebook

In this tutorial we'll be showing you how to visualise HEALPix results using Jupyter Notebook in our OnDemand appliance on the Apocrita HPC cluster. We'll start with installing the required Python packages before demonstrating how to run the Healpy tutorial.

Information about running other components of HEALPix not covered in this tutorial can be found on our docs site.

January 10, 2022
in tutorial
5 min read

File Permissions

An understanding of file permissions is important to the success of computational jobs, and the security of your files.

The default settings are suitable for some, but not every use-case: without sufficient awareness, your files may be visible to people who should not be able to access them, and vice-versa.

October 14, 2021
in tutorial, featured
11 min read

Running Machine Learning workloads on Apocrita

This blog post is outdated

The information in this article is now outdated, please see docs.hpc.qmul.ac.uk/apps/ml/tensorflow/ for recent documentation regarding TensorFlow installation and usage.

In this tutorial we'll be showing you how to run a TensorFlow job using the GPU nodes on the Apocrita HPC cluster. We will expand upon the essentials provided on the QMUL HPC docs site, and provide more explanation of the process. We'll start with software installation before demonstrating a simple task and a more complex real-world example that you can adapt for your own jobs, along with tips on how to check if the GPU is being used.

June 24, 2021
in tutorial
5 min read

A guide to using Apocrita's scratch storage

The Apocrita scratch storage is a high performance storage system designed for short-term file storage, such as working data. We recently replaced the hardware that provides this service, and expanded the capacity from 250TB to around 450TB. This article will look at the recent changes, and suggest some best practices when using the scratch system.

June 12, 2020
in tutorial
8 min read

SSH authentication and accessing Apocrita

In response to a coordinated security attack on HPC sites world-wide back in 2020, it was necessary to implement some changes to enforce a higher level of authentication security. In this article, we begin with providing some useful information to understand key-based authentication, and document the process for accessing the cluster.

October 23, 2019
in tutorial, featured
8 min read

Productivity tips for Apocrita cluster users

This article presents a selection of useful tips for running successful and well-performing jobs on the QMUL Apocrita cluster.

In the ITS Research team, we spend quite a bit of time monitoring the Apocrita cluster and checking jobs are running correctly, to ensure that this valuable resource is being used effectively. If we notice a problem with your job, and think we can help, we might send you an email with some recommendations on how your job can run more effectively. If you receive such an email, please don't be offended! We realise there are users with a range of experience, and the purpose of this post is to point out some ways to ensure you get your results as quickly and correctly as possible, and to ease the learning curve a little bit.

January 28, 2019
in tutorial, featured
8 min read

Sizing your Apocrita jobs for quicker results

At any one time, a typical HPC cluster is usually full. This is not such a bad thing, since it means the substantial investment is working hard for the money, rather than sitting idle. A less ideal situation is having to wait too long to get your research results. However, jobs are constantly starting and finishing, and many new jobs get run shortly after being added to the queue. If your resource requirements are rather niche, or very large, then you will be competing with other researchers for a more scarce resource. In any case, whatever sort of jobs you run, it is important to choose resources optimally, in order to get the best results. Using fewer cores, although increasing the eventual run time, may result in a much shorter queuing time.