ARCHER White Papers

White Papers produced by the ARCHER Service.

Analysis of parallel I/O use on the UK national supercomputing service, ARCHER using Cray's LASSi and EPCC SAFE

Version 1.0, May 21, 2019

Andrew Turner, EPCC, The University of Edinburgh
Dominic Sloan-Murphy, EPCC, The University of Edinburgh
Karthee Sivalingam, Cray European Research Lab
Harvey Richardson, Cray European Research Lab
Julian Kunkel, Department of Computer Science, University of Reading

In this paper, we describe how we have used a combination of the LASSi tool (developed by Cray) and the SAFE software (developed by EPCC) to collect and analyse Lustre I/O performance data for all jobs running on the UK national supercomputing service, ARCHER; and to provide reports on I/O usage for users in our standard reporting framework. We also present results from analysis of parallel I/O use on ARCHER and analysis on the potential impact of different applications on file system performance using metrics we have derived from the LASSi data. We show that the performance data from LASSi reveals how the same application can stress different components of the file system depending on how it is run, and how the LASSi risk metrics allow us to identify use cases that could potentially cause issues for global I/O performance and work with users to improve their I/O use. We use the IO-500 benchmark to help us understand how LASSi risk metrics correspond to observed performance on the ARCHER file systems. We also use LASSi data imported into SAFE to identify I/O use patterns associated with different research areas, understand how the research workflow gives rise to the observed patterns and project how this will affect I/O requirements in the future. Finally, we provide an overview of likely future directions for the continuation of this work.

Full white paper (PDF)

Performance of HPC Application Benchmarks across UK National HPC services: single node performance

Version 1: March 29, 2019

DOI: 10.5281/zenodo.2616549

Andrew Turner, EPCC, The University of Edinburgh

In this report we compare the performance of different processor architectures for different application benchmarks. To reduce the complexity of the comparisons, we restrict the results in this report to single node only. This allows us to compare the performance of the different compute node architectures without the additional complexity of also comparing different interconnect technologies and topologies. Multi-node comparisons will be the subject of a future report. Architectures compared in this report cover three generations of Intel Xeon CPUs, Marvell Arm ThunderX2 CPUs and NVidia GPUs.

Full report (on GitHub)

Benefits of the ARCHER eCSE Programme

Version 1.0, July 30, 2018

Lorna Smith, Alan Simpson, Chris Johnson, Xu Guo, Neelofer Banglawala, EPCC, The University of Edinburgh

The eCSE programme has allocated funding to the UK computational science community through a series of funding calls over a period of 5 years. The goal throughout has been to deliver a funding programme that is fair, transparent, objective and consistent. The projects funded through this programme were selected to contribute to the following broad aims:

  • To enhance the quality, quantity and range of science produced on the ARCHER service through improved software;
  • To develop the computational science skills base, and provide expert assistance embedded within research communities, across the UK;
  • To provide an enhanced and sustainable set of HPC software for UK science.

The eCSE programme is a significant source of funding for the Research Software Engineering community and all UK Higher Education Institutions are able to apply for funding. This document provides more detail on the programme, looking at how the funding has been spent and examining the various benefits realised from the programme.

Full white paper (PDF)

A survey of application memory usage on a national supercomputer: an analysis of memory requirements on ARCHER

Version 1.0, October 3, 2017

Andy Turner, EPCC, The University of Edinburgh
Simon McIntosh-Smith, Department of Computer Science, University of Bristol

In this short paper we set out to provide a set of modern data on the actual memory per core and memory per node requirements of the most heavily used applications on a contemporary, national-scale supercomputer. This report is based on data from the UK national supercomputing service, ARCHER, a 118,000 core Cray XC30, in the 1 year period from 1 July 2016 to 30 June 2017 inclusive. Our analysis shows that 80% of all usage on ARCHER has a maximum memory use of 1 GiB/core or less (24 GiB/node or less) and that there is a trend to larger memory use as job size increases. Analysis of memory use by software application type reveals differences in memory use between periodic electronic structure, atomistic N-body, grid-based climate modelling, and grid-based CFD applications. We present an analysis of these differences, and suggest further analysis and work in this area. Finally, we discuss the implications of these results for the design of future HPC systems, in particular the applicability of high bandwidth memory type technologies.
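The per-core and per-node figures quoted above are related through the number of cores per node. The minimal Python sketch below shows that conversion. It assumes 24 cores per node, the value implied by the 1 GiB/core and 24 GiB/node figures, and the function name is illustrative rather than taken from the paper.

```python
# Assumption: 24 cores per node, as implied by the abstract's
# "1 GiB/core or less (24 GiB/node or less)".
CORES_PER_NODE = 24


def memory_per_node_gib(memory_per_core_gib, cores_per_node=CORES_PER_NODE):
    """Convert a per-core memory figure (GiB) to a per-node figure (GiB),
    assuming every core on the node is in use."""
    return memory_per_core_gib * cores_per_node


if __name__ == "__main__":
    # The 1 GiB/core threshold from the abstract, expressed per node.
    print(memory_per_node_gib(1.0))
```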

Full white paper (PDF)

Source data (CSV format):

Parallel I/O Performance Benchmarking and Investigation on Multiple HPC Architectures

Version 1.4, June 29, 2017

Andy Turner, Xu Guo, Dominic Sloan-Murphy, Juan Rodriguez Herrera, EPCC, The University of Edinburgh
Chris Maynard, Met Office, United Kingdom
Bryan Lawrence, The University of Reading

Solving the bottleneck of I/O is a key consideration when optimising application performance, and an essential step in the move towards exascale computing. Users must be informed of the I/O performance of existing HPC resources in order to make best use of the systems and to be able to make decisions about the direction of future software development effort for their application. This paper therefore presents benchmarks for the write capabilities for ARCHER, comparing them with those of the Cirrus, COSMA, COSMA6, UK-RDF DAC, and JASMIN systems, using MPI-IO and, in selected cases, the HDF5 and NetCDF parallel libraries.

Full white paper (PDF)

Source code (GitHub)

Using Dakota on ARCHER

Version 1.0, March 22, 2017

Gordon Gibb, EPCC, The University of Edinburgh

Dakota[1] is a toolkit that automates running a series of simulations whose input parameters can be varied in order to determine their effects on the simulation results. In particular, Dakota can be used to determine optimal parameter values, or quantify a model’s sensitivity to varying parameters.

This white paper describes how to use Dakota on ARCHER.

Full white paper (PDF)

UK National HPC Benchmarks

Version 1.0, March 16, 2017

Andy Turner, EPCC, The University of Edinburgh

This paper proposes an updated set of benchmarks for the UK National HPC Service based on historical use patterns and consultations with users.

Full white paper (PDF)

Implementation of Dual Resolution Simulation Methodology in LAMMPS

Version 1.2, October 19, 2016

Iain Bethune, EPCC, The University of Edinburgh
Sophia Wheeler, Sam Genheden and Jonathan Essex, The University of Southampton

This white paper describes the implementation in LAMMPS of the Dual Resolution force-field ELBA. In particular, symplectic and time-reversible integrators for coarse-grained beads are provided for NVE, NVT and NPT molecular dynamics simulations and a new weighted load balancing scheme allows for improved parallel scaling when multiple timestepping (r-RESPA) is used. The new integrators are available in the 30th July 2016 release of LAMMPS and the load balancer in the lammps-icms branch. A version of LAMMPS with all of this functionality has been installed on ARCHER as the module 'lammps/elba' and is available to all users.

Full white paper (PDF)

Invertastic: Large-scale Dense Matrix Inversion

Version 1.0, June 15, 2016

Alan Gray, EPCC, The University of Edinburgh

This white paper introduces Invertastic, a relatively simple application designed to invert an arbitrarily large dense symmetric positive definite matrix using multiple processors in parallel. This application may be used directly (e.g. for genomic studies where the matrix represents the genetic relationships between multiple individuals), or instead as a reference or template for those wishing to implement large-scale linear algebra solutions using parallel libraries such as MPI, BLACS, PBLAS, ScaLAPACK and MPI-IO. The software is freely available on GitHub and as a centrally available package on ARCHER (at /work/y07/y07/itastic).
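To illustrate the underlying problem on a small scale, the serial Python sketch below inverts a symmetric positive definite matrix through a Cholesky factorisation. This is a standard textbook approach chosen for illustration; it is not necessarily the method Invertastic uses and it does not use any of the parallel libraries named above. All function names are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with A = L * L^T for an SPD matrix A."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def invert_lower(l):
    """Invert a lower-triangular matrix by forward substitution."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inv[j][j] = 1.0 / l[j][j]
        for i in range(j + 1, n):
            s = sum(l[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -s / l[i][i]
    return inv


def spd_inverse(a):
    """Invert an SPD matrix using A^-1 = (L^-1)^T * (L^-1)."""
    n = len(a)
    li = invert_lower(cholesky(a))
    return [[sum(li[k][i] * li[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Small SPD example matrix (illustrative data only).
    for row in spd_inverse([[4.0, 2.0], [2.0, 3.0]]):
        print(row)
```

For matrices too large for a single node, the same factorise-then-invert structure would need a distributed implementation, which is the kind of workflow the white paper addresses.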

Full white paper (PDF)

VOX-FE: New functionality for new communities

Version 1.1, June 10, 2016

Neelofer Banglawala and Iain Bethune, EPCC, The University of Edinburgh
Michael Fagan and Richard Holbrey, The University of Hull

This white paper describes new functionality implemented in the VOX-FE finite element bone modelling package funded by ARCHER eCSE project 04-11. In particular, we describe new features in the GUI for setting up realistic muscle-wrapping boundary conditions, improvements to the performance of the solver by using ParMETIS to generate optimal partitioning of the model, and better automation for the dynamic remodelling process. The VOX-FE code is freely available under a BSD licence.

Full white paper (PDF)

Using NETCDF with Fortran on ARCHER

Version 1.1, January 29, 2016

Toni Collis, EPCC, The University of Edinburgh

This paper explains one particular approach to parallel IO based on the work completed in an ARCHER funded eCSE on the TPLS software package [2]: using NetCDF. There are multiple resources available online for using NetCDF, but the majority focus on software written in C. This guide aims to help users of ARCHER who have software written in modern Fortran (90/95 onwards) to take advantage of NetCDF using parallel file reads and writes.

Full white paper (PDF)

Voxel-based finite element modelling with VOX-FE2

Version 1.0, 20 May 2015

Neelofer Banglawala and Iain Bethune, EPCC, The University of Edinburgh
Michael Fagan and Richard Holbrey, The University of Hull

This white paper summarises the work of an ARCHER eCSE project to redevelop the VOX-FE voxel finite element modelling package to improve its capabilities, performance and usability. We have developed a new GUI, implemented as a Paraview plugin, a new solver which uses PETSc and demonstrated how iterative remodelling simulations can be run on ARCHER.

Full white paper (PDF)

Parallel Software usage on UK National HPC Facilities

Version 1.0, 23 Apr 2015

Andy Turner, EPCC, The University of Edinburgh

Data and analysis of parallel applications on UK National HPC facilities HECToR and ARCHER including:

  • Trends in application usage over time: which applications have declined in use and which have become more important to particular research communities; and why might this be?
  • Trends in the sizes of jobs: which applications have been able to increase their scaling properties in line with architecture changes and which have not? Can we identify why this is the case?
  • Changes in research areas on the systems: which areas have appeared/increased and which have declined?

Full white paper (PDF)

Supplementary Data

Using RSIP Networking with Parallel Applications on ARCHER Phase 2

Version 1.0, 7 Apr 2015

Iain Bethune, EPCC, The University of Edinburgh

Instructions on how to use RSIP to enable TCP/IP communications between parallel jobs on the compute nodes and the login nodes (and beyond). Two case study applications are shown: Parallel visualisation using ParaView, and Path Integral Molecular Dynamics with i-PI and CP2K.

Full white paper (PDF)

Monitoring the Cray XC30 Power Management Hardware Counters

Version 1.3, 19 Dec 2014

Michael Bareford, EPCC, The University of Edinburgh

Monitoring code power usage using the Cray XC30 hardware counters. Impact of compiler and parallel programming model on power consumption on ARCHER.

Full white paper (PDF)

Performance of Parallel IO on ARCHER

Version 1.1, 15 Jun 2015

David Henty, Adrian Jackson, EPCC, The University of Edinburgh
Charles Moulinec, Vendel Szeremi, STFC, Daresbury

Performance benchmarks and advice for parallel IO on ARCHER. This is work in progress and will be continually updated.

Full white paper (PDF)

Source code

What's with all this Python import scaling, anyhow?

Version 1.0, 17 Dec 2014

Nick Johnson, EPCC, The University of Edinburgh

Addressing the issue of poor scaling performance on Python import. This is work in progress and will be continually updated.

Full white paper (PDF)