ARCHER White Papers
White Papers produced by the ARCHER Service.
A survey of application memory usage on a national supercomputer: an analysis of memory requirements on ARCHER
Version 1.0, October 3, 2017
Andy Turner, EPCC, The University of Edinburgh
Simon McIntosh-Smith, Department of Computer Science, University of Bristol
In this short paper we set out to provide a set of modern data on the actual memory-per-core and memory-per-node requirements of the most heavily used applications on a contemporary, national-scale supercomputer. This report is based on data from the UK national supercomputing service, ARCHER, a 118,000-core Cray XC30, over the one-year period from 1 July 2016 to 30 June 2017 inclusive. Our analysis shows that 80% of all usage on ARCHER has a maximum memory use of 1 GiB/core or less (24 GiB/node or less) and that there is a trend towards larger memory use as job size increases. Analysis by software application type reveals differences in memory use between periodic electronic structure, atomistic N-body, grid-based climate modelling, and grid-based CFD applications. We present an analysis of these differences, and suggest further analysis and work in this area. Finally, we discuss the implications of these results for the design of future HPC systems, in particular the applicability of high-bandwidth memory technologies.
Source data (CSV format):
- Overall Memory Usage Statistics
- VASP Memory Usage Statistics
- CASTEP Memory Usage Statistics
- CP2K Memory Usage Statistics
- GROMACS Memory Usage Statistics
- LAMMPS Memory Usage Statistics
- NAMD Memory Usage Statistics
- Met Office UM Memory Usage Statistics
- MITgcm Memory Usage Statistics
- SBLI Memory Usage Statistics
- OpenFOAM Memory Usage Statistics
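The headline figure above (80% of usage at 1 GiB/core or less) is the kind of usage-weighted statistic that can be derived from the overall usage data. A minimal sketch of one way to compute it with pandas is shown below; the filename and column names are hypothetical and would need to be adapted to the layout of the published "Overall Memory Usage Statistics" CSV.

```python
# Minimal sketch (pandas): estimate the fraction of core-hour usage whose
# maximum memory use is at or below 1 GiB/core. The filename and the columns
# 'max_mem_per_core_gib' and 'core_hours' are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("overall_memory_usage.csv")  # hypothetical filename

usage_below_1gib = df.loc[df["max_mem_per_core_gib"] <= 1.0, "core_hours"].sum()
fraction = usage_below_1gib / df["core_hours"].sum()

print(f"Fraction of usage at <= 1 GiB/core: {fraction:.1%}")
```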
Parallel I/O Performance Benchmarking and Investigation on Multiple HPC Architectures
Version 1.4, June 29, 2017
Andy Turner, Xu Guo, Dominic Sloan-Murphy, Juan Rodriguez Herrera, EPCC, The University of Edinburgh
Chris Maynard, Met Office, United Kingdom
Bryan Lawrence, The University of Reading
Solving the bottleneck of I/O is a key consideration when optimising application performance, and an essential step in the move towards exascale computing. Users must be informed of the I/O performance of existing HPC resources in order to make best use of the systems and to be able to make decisions about the direction of future software development effort for their application. This paper therefore presents benchmarks of the write capabilities of ARCHER, comparing them with those of the Cirrus, COSMA, COSMA6, UK-RDF DAC, and JASMIN systems, using MPI-IO and, in selected cases, the HDF5 and NetCDF parallel libraries.
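For orientation, the simplest of the write patterns benchmarked in work of this kind is a collective MPI-IO write of contiguous per-rank blocks to a single shared file. A minimal sketch using mpi4py is given below; the file name and block size are illustrative only and are not taken from the paper.

```python
# Minimal sketch (mpi4py): each rank writes a contiguous block of doubles to a
# shared file using collective MPI-IO. File name and block size are illustrative.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

local_n = 1_000_000                          # doubles written per rank
data = np.full(local_n, rank, dtype=np.float64)

fh = MPI.File.Open(comm, "benchmark.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
offset = rank * local_n * data.itemsize      # byte offset of this rank's block
fh.Write_at_all(offset, data)                # collective write
fh.Close()
```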
Using Dakota on ARCHER
Version 1.0, March 22, 2017
Gordon Gibb, EPCC, The University of Edinburgh
Dakota[1] is a toolkit that automates running a series of simulations whose input parameters can be varied in order to determine their effects on the simulation results. In particular, Dakota can be used to determine optimal parameter values, or quantify a model's sensitivity to varying parameters.
This white paper describes how to use Dakota on ARCHER.
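In a typical Dakota workflow the user supplies an "analysis driver" script that Dakota invokes for each parameter set, passing it a parameters file and the path of a results file to write. The sketch below is a hedged illustration of such a driver in Python, assuming Dakota's default (non-aprepro) parameters-file layout and a single response; on ARCHER the placeholder "simulation" would instead launch the real application (for example via aprun). Consult the white paper for the actual recommended setup.

```python
#!/usr/bin/env python3
# Minimal sketch of a Dakota analysis driver, called by Dakota's fork
# interface as:  driver.py <params_file> <results_file>
# Assumes the default parameters-file layout and a single response value.
import sys

params_file, results_file = sys.argv[1], sys.argv[2]

# Read "value descriptor" pairs; the first line gives the variable count.
variables = {}
with open(params_file) as f:
    n_vars = int(f.readline().split()[0])
    for _ in range(n_vars):
        value, name = f.readline().split()[:2]
        variables[name] = float(value)

# Placeholder "simulation": a real driver would run the application here
# and extract the quantity of interest from its output.
result = sum(v * v for v in variables.values())

with open(results_file, "w") as f:
    f.write(f"{result} response_1\n")
```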
UK National HPC Benchmarks
Version 1.0, March 16, 2017
Andy Turner, EPCC, The University of Edinburgh
This paper proposes an updated set of benchmarks for the UK National HPC Service based on historical use patterns and consultations with users.
Implementation of Dual Resolution Simulation Methodology in LAMMPS
Version 1.2, October 19, 2016
Iain Bethune, EPCC, The University of Edinburgh
Sophia Wheeler, Sam Genheden and Jonathan Essex, The University of Southampton
This white paper describes the implementation in LAMMPS of the Dual Resolution force-field ELBA. In particular, symplectic and time-reversible integrators for coarse-grained beads are provided for NVE, NVT and NPT molecular dynamics simulations and a new weighted load balancing scheme allows for improved parallel scaling when multiple timestepping (r-RESPA) is used. The new integrators are available in the 30th July 2016 release of LAMMPS and the load balancer in the lammps-icms branch. A version of LAMMPS with all of this functionality has been installed on ARCHER as the module 'lammps/elba' and is available to all users.
Invertastic: Large-scale Dense Matrix Inversion
Version 1.0, June 15, 2016
Alan Gray, EPCC, The University of Edinburgh
This white paper introduces Invertastic, a relatively simple application designed to invert an arbitrarily large dense symmetric positive definite matrix using multiple processors in parallel. This application may be used directly (e.g. for genomic studies where the matrix represents the genetic relationships between multiple individuals), or instead as a reference or template for those wishing to implement large-scale linear algebra solutions using parallel libraries such as MPI, BLACS, PBLAS, ScaLAPACK and MPI-IO. The software is freely available on GitHub and as a centrally available package on ARCHER (at /work/y07/y07/itastic).
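The underlying mathematical operation is inversion of a symmetric positive definite matrix via Cholesky factorisation. The sketch below is a small serial illustration of that operation in NumPy/SciPy only; it says nothing about the distributed, block-cyclic ScaLAPACK implementation that Invertastic actually uses.

```python
# Minimal serial sketch of SPD matrix inversion via Cholesky factorisation,
# the operation Invertastic performs in parallel with ScaLAPACK.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

n = 1000
a = np.random.rand(n, n)
spd = a @ a.T + n * np.eye(n)                # construct an SPD test matrix

c, lower = cho_factor(spd)                   # Cholesky factorisation
inverse = cho_solve((c, lower), np.eye(n))   # solve against the identity

print("max |A @ inv(A) - I| =", np.abs(spd @ inverse - np.eye(n)).max())
```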
VOX-FE: New functionality for new communities
Version 1.1, June 10, 2016
Neelofer Banglawala and Iain Bethune, EPCC, The University of Edinburgh
Michael Fagan and Richard Holbrey, The University of Hull
This white paper describes new functionality implemented in the VOX-FE finite element bone modelling package funded by ARCHER eCSE project 04-11. In particular, we describe new features in the GUI for setting up realistic muscle-wrapping boundary conditions, improvements to the performance of the solver by using ParMETIS to generate optimal partitioning of the model, and better automation for the dynamic remodelling process. The VOX-FE code is freely available from https://sourceforge.net/projects/vox-fe/ under a BSD licence.
Using NETCDF with Fortran on ARCHER
Version 1.1, January 29, 2016
Toni Collis, EPCC, The University of Edinburgh
This paper explains one particular approach to parallel I/O, using NetCDF, based on work completed in an ARCHER-funded eCSE project on the TPLS software package [2]. There are multiple resources available online for using NetCDF, but the majority focus on software written in C. This guide aims to help users of ARCHER who have software written in modern Fortran (90/95 onwards) to take advantage of NetCDF using parallel file reads and writes.
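The paper itself targets Fortran; purely for orientation, the sketch below shows the same parallel-write pattern through the netCDF4 Python bindings, where every rank writes its own slice of a shared variable in a single file. It assumes netCDF4-python built against parallel HDF5, and the dimension sizes and names are illustrative.

```python
# Minimal sketch of a collective parallel NetCDF write (Python analogue of the
# Fortran approach described in the paper). Requires parallel HDF5 support.
from mpi4py import MPI
from netCDF4 import Dataset
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local_n = 100
nc = Dataset("fields.nc", "w", parallel=True, comm=comm, info=MPI.Info())
nc.createDimension("x", local_n * size)
var = nc.createVariable("phi", "f8", ("x",))
var.set_collective(True)                     # collective I/O, as with MPI-IO

# Each rank writes its contiguous slice of the global array.
var[rank * local_n:(rank + 1) * local_n] = np.full(local_n, rank, dtype=np.float64)
nc.close()
```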
Voxel-based finite element modelling with VOX-FE2
Version 1.0, 20 May 2015
Neelofer Banglawala and Iain Bethune, EPCC, The University of Edinburgh
Michael Fagan and Richard Holbrey, The University of Hull
This white paper summarises the work of an ARCHER eCSE project to redevelop the VOX-FE voxel finite element modelling package to improve its capabilities, performance and usability. We have developed a new GUI, implemented as a ParaView plugin, and a new solver based on PETSc, and demonstrated how iterative remodelling simulations can be run on ARCHER.
Parallel Software usage on UK National HPC Facilities
Version 1.0, 23 Apr 2015
Andy Turner, EPCC, The University of Edinburgh
Data and analysis of parallel application usage on the UK National HPC facilities HECToR and ARCHER, including:
- Trends in application usage over time: which applications have declined in use and which have become more important to particular research communities; and why might this be?
- Trends in the sizes of jobs: which applications have been able to increase their scaling properties in line with architecture changes and which have not? Can we identify why this is the case?
- Changes in research areas on the systems: which areas have appeared/increased and which have declined?
Supplementary Data
- HECToR Phase 2a Usage Data (txt)
- HECToR Phase 2b Usage Data (txt)
- HECToR Phase 3 Usage Data (txt)
- ARCHER Usage Data (txt)
Using RSIP Networking with Parallel Applications on ARCHER Phase 2
Version 1.0, 7 Apr 2015
Iain Bethune, EPCC, The University of Edinburgh
Instructions on how to use RSIP to enable TCP/IP communication between parallel jobs on the compute nodes and the login nodes (and beyond). Two case study applications are shown: parallel visualisation using ParaView, and path integral molecular dynamics with i-PI and CP2K.
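As a hedged illustration of what RSIP makes possible, the sketch below opens a plain TCP connection from a compute node back to a login node, the basic operation that both the ParaView and i-PI/CP2K case studies rely on. The hostname and port are placeholders (11111 is ParaView's default server port); the white paper covers the real configuration.

```python
# Minimal sketch: TCP connectivity check from a compute node, of the kind RSIP
# enables. Hostname and port are placeholders only.
import socket

LOGIN_NODE = "login.archer.ac.uk"   # placeholder hostname
PORT = 11111                        # e.g. ParaView's default server port

with socket.create_connection((LOGIN_NODE, PORT), timeout=10) as sock:
    sock.sendall(b"hello from a compute node\n")
    print("Connected to", sock.getpeername())
```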
Monitoring the Cray XC30 Power Management Hardware Counters
Version 1.3, 19 Dec 2014
Michael Bareford, EPCC, The University of Edinburgh
This paper describes how to monitor code power usage via the Cray XC30 power management hardware counters, and examines the impact of compiler choice and parallel programming model on power consumption on ARCHER.
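On Cray XC compute nodes the power management counters are exposed as files under /sys/cray/pm_counters. A minimal sketch of reading the node energy counter around a region of interest is shown below; counter availability, file format and units should be checked on the target system, and the timed region here is only a stand-in.

```python
# Minimal sketch: read the node-level energy counter before and after a region
# of interest on a Cray XC compute node. Assumes the counter file contains a
# value followed by its unit, e.g. "123456 J".
import time

def read_energy_joules(path="/sys/cray/pm_counters/energy"):
    with open(path) as f:
        return int(f.read().split()[0])

start = read_energy_joules()
time.sleep(5)                        # stand-in for the instrumented kernel
used = read_energy_joules() - start
print(f"Energy used over region: {used} J")
```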
Performance of Parallel IO on ARCHER
Version 1.1, 15 Jun 2015
David Henty, Adrian Jackson, EPCC, The University of Edinburgh
Charles Moulinec, Vendel Szeremi, STFC, Daresbury
Performance benchmarks and advice for parallel I/O on ARCHER. This is work in progress and will be continually updated.
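One of the common write patterns in such benchmarking is a single shared HDF5 file written collectively by all ranks. The sketch below shows that pattern via h5py's MPI-IO driver; the dataset name and sizes are illustrative, h5py must be built with parallel HDF5 support, and this is not taken from the paper's own benchmark codes.

```python
# Minimal sketch (h5py, MPI-IO driver): all ranks write disjoint slices of one
# dataset in a single shared HDF5 file.
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local_n = 1_000_000
with h5py.File("output.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("field", shape=(size * local_n,), dtype="f8")
    dset[rank * local_n:(rank + 1) * local_n] = np.full(local_n, rank,
                                                        dtype=np.float64)
```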
What's with all this Python import scaling, anyhow?
Version 1.0, 17 Dec 2014
Nick Johnson, EPCC, The University of Edinburgh
Addressing the issue of the poor scaling performance of Python module imports at large process counts. This is work in progress and will be continually updated.
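The effect is straightforward to observe: every MPI rank hits the shared filesystem for the same modules at start-up, so import time grows with the number of ranks. A minimal sketch of measuring this with mpi4py is given below; the module being timed is illustrative.

```python
# Minimal sketch: time a module import on every rank and report the slowest,
# exposing the import-scaling problem the paper investigates.
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

t0 = time.time()
import numpy                      # the import whose cost we want to measure
elapsed = time.time() - t0

worst = comm.reduce(elapsed, op=MPI.MAX, root=0)
if rank == 0:
    print(f"Slowest of {comm.Get_size()} ranks took {worst:.2f} s "
          f"to import numpy")
```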