An Exascale Multi-Scale Framework for Solid Mechanics
eCSE05-005

Key Personnel
PI/Co-Is: Dr. Anton Shterenlikht - Bristol University, Dr. Lee Margetts - University of Manchester, Prof. David Emerson - STFC Daresbury
Technical: Luis Cebamanos - EPCC, University of Edinburgh, Dr. David Arregui-Mena - University of Manchester
Relevant Documents
eCSE Technical Report: Exascale Multi-Scale Framework
Project summary
The solid mechanics community suffers from a lack of scalable codes suitable for complex solid mechanics problems. Many researchers are locked into proprietary FE software that is limited to workstations, or that charges per-core licence fees. This makes analysis at scale prohibitively expensive, even with academic licences. There is a clear need among solid mechanics researchers for scalable, flexible, verified and feature-rich simulation codes.
This project has delivered a scalable tool that can satisfy this need, initially for fracture problems. A multi-scale engineering modelling framework for deformation and fracture in heterogeneous materials (polycrystals, porous graphite, concrete, metal matrix or carbon fibre composites) has been developed. The framework has good potential for use on HPC systems, such as the Tier-1 system ARCHER, Tier-2 Hartree Centre systems, and smaller university and departmental systems.
The majority of the ARCHER community uses Fortran for their projects, due to many factors, including performance, the availability of optimising compilers and international standards, and a vast codebase that includes very high-quality libraries. Coarrays are a standard Fortran feature. However, their uptake has been limited so far, due to poor compiler support and a lack of successful examples of coarray programs at scale. This project has delivered a scalable MPI/coarray modelling framework, which will serve as one such example (a minimal sketch of the hybrid pattern is given below). This, together with dissemination via the Cray User Group (CUG), could encourage existing ARCHER users to adopt coarrays in their programs to improve scaling.
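As an illustration of the hybrid pattern, the sketch below shows how an MPI-based part (as in ParaFEM) and a coarray-based part (as in CGPACK) might coexist in one executable. It is a minimal sketch, not code from the framework: the variable names are invented, and the MPI_Initialized guard reflects the assumption that some coarray runtimes (e.g. OpenCoarrays) initialise MPI themselves before the program starts.

    program hybrid_sketch
      ! Minimal hybrid MPI/coarray sketch: an MPI part and a coarray part
      ! in one executable. Illustrative only.
      use mpi
      implicit none
      integer :: ierr, rank, nranks
      integer :: ca_cells[*]             ! illustrative scalar coarray
      logical :: mpi_up, we_started_mpi

      ! Some coarray runtimes are built on MPI and initialise it themselves,
      ! so only call MPI_Init if it has not already been done.
      call MPI_Initialized(mpi_up, ierr)
      we_started_mpi = .not. mpi_up
      if (we_started_mpi) call MPI_Init(ierr)

      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

      ! MPI side: continuum finite element work would go here.
      ! Coarray side: cellular automata work would go here.
      ca_cells = this_image()
      sync all

      if (this_image() == 1) &
        print *, 'images:', num_images(), ' MPI ranks:', nranks

      if (we_started_mpi) call MPI_Finalize(ierr)
    end program hybrid_sketch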
In addition to ARCHER, the multi-scale framework has been ported to Intel and GCC/OpenCoarrays platforms, which dramatically increases the number of potential users. The added portability enables many new optimisation possibilities, because coarray implementations differ substantially across vendors. This also increases vendor competition, with resultant benefits to users.
Extensive profiling work with the TAU toolkit resulted in substantial improvements to TAU's support for coarrays. As a result of this work, TAU is now the most powerful tool for profiling coarray/MPI hybrid programs on non-Cray systems. The scientific impact of the project is an improved understanding of the mechanical behaviour of polycrystals on scales from the sub-grain (micro- or nano-) scale, through multiple grains (meso-scale), to the engineering (macro) scale.
The CAFE model can be used to answer diverse questions, e.g. what is the expected statistical distribution of structural integrity predictions, or what is the scatter in the ductile-to-brittle transition behaviour of a material. The generality of the CAFE framework design means it can be applied to other fields of study, e.g. bone fracture or delamination failure of composites.
Achievement of objectives
The project objectives were:
- Sustainable, future-proof progress of an open-source multi-scale modelling framework.
This objective has been fully achieved. The framework is hosted on sf.net, with the CGPACK library achieving 25 downloads/month (up from 10 at the beginning of the project). The ParaFEM library has achieved 8000+ downloads from the ParaFEM website (up from around 4000) and 50 downloads/month from sf.net. According to Google Analytics, the most significant increase in traffic to the ParaFEM website has come from Russia. Increasing usage of the multi-scale framework helped the team secure a grant from the Software Sustainability Institute (SSI) to develop an autotools build framework for CGPACK, to increase portability and make the build/install process more user friendly.
- Portability, standard compliance and fault tolerance.
This objective has been fully achieved with regard to the planned portability and standard-compliance work. The framework has been ported to the GCC/OpenCoarrays and Intel compilers using only Fortran 2008 standard features. The fault-tolerance objective has not been achieved, due to poor compiler support: only a single feature, FAILED_IMAGES, became available towards the end of this project, and it is currently supported only by GCC/OpenCoarrays (a minimal sketch of its intended use is given below). The Fortran 2015 text was finalised in June 2016, and the fault-tolerance features are expected to become available gradually over the next 1-2 years in at least three compilers: Cray, Intel and GCC/OpenCoarrays. We will continue monitoring compiler support for these features and will implement them in the multi-scale framework as they become available.
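For reference, the sketch below shows how the FAILED_IMAGES feature is intended to be used once compiler support matures. It is a minimal sketch written against the standard text, not against the framework, and the recovery step is only indicated by a comment.

    program failed_images_sketch
      ! Minimal sketch of the fault-tolerance feature: detect failed images
      ! after a synchronisation and recover the list of their indices.
      use, intrinsic :: iso_fortran_env, only: STAT_FAILED_IMAGE
      implicit none
      integer :: st
      integer, allocatable :: failed(:)

      ! ... coarray work ...

      sync all (stat=st)              ! do not abort if an image has failed
      if (st == STAT_FAILED_IMAGE) then
        failed = failed_images()      ! indices of images known to have failed
        ! A fault-tolerant application would redistribute the work of the
        ! failed images among the surviving images here.
      end if
    end program failed_images_sketch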
- Benchmarking data for publication.
This objective has been partly achieved. Due to slow progress towards improved scaling, elasto-plastic benchmarks at or near full ARCHER capacity were deemed not feasible. Good benchmarking data was collected for other problems; this data has already been used in a conference presentation, and a publication in the Int. J. of HPC Applications that will also use it is in preparation.
- Scaling.
This objective has been achieved. Substantial effort was dedicated to profiling and tracing analysis, which resulted in replacing the all-to-all communication pattern with nearest-neighbour algorithms (sketched below). The scaling limit was increased from 2k cores at the start of the project to 7k cores at the end.
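The sketch below illustrates the nearest-neighbour pattern that replaced the all-to-all exchange. It is a minimal one-dimensional sketch with invented names and sizes, not the CGPACK halo-exchange code: each image reads boundary cells only from its two neighbours between synchronisations.

    program halo_sketch
      ! Minimal 1D nearest-neighbour halo exchange with coarrays.
      implicit none
      integer, parameter :: n = 10      ! local cells per image (illustrative)
      integer :: grid(0:n+1)[*]         ! local chunk plus two halo cells
      integer :: me, left, right

      me    = this_image()
      left  = me - 1
      right = me + 1

      grid(1:n) = me                    ! fill the interior
      sync all                          ! interiors ready on all images

      ! Each image communicates with at most two neighbours, instead of
      ! exchanging data with every other image (all-to-all).
      if (left  >= 1)            grid(0)   = grid(n)[left]
      if (right <= num_images()) grid(n+1) = grid(1)[right]
      sync all                          ! halos ready before further use
    end program halo_sketch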
- Implementing MPI/IO.
This objective has been achieved for CGPACK. NetCDF and HDF5 filters have been written for the cellular automata data structures. However, the performance of the NetCDF and HDF5 writers has been disappointing compared with direct use of MPI/IO; this issue is currently being investigated. For ParaFEM, a decision was made to implement a standard engineering binary format supported by the visualisation tool ParaView, using MPI/IO directly, instead of NetCDF and HDF5. There were two main reasons for this decision: (i) uptake is hindered by usability, and our end users will be more comfortable with a standard engineering format than with the additional barrier of learning NetCDF and HDF5; and (ii) the performance improvement from direct MPI/IO looks more promising than from these libraries. We had an issue getting ParaView to read binary files written by ParaFEM. After a significant debugging effort, we found that ParaView would not read the binary files written by Fortran, only those written by C or C++. Our workaround was to use C data types in ParaFEM (provided by the ISO C Binding in Fortran 2003 and later) and write C binary files from the Fortran program (a minimal sketch is given below). So for ParaFEM, the eCSE has provided a big push in the right direction, but we ran out of resource before completion. We will complete this objective ourselves.
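The sketch below illustrates the workaround: writing a plain byte stream from Fortran using C-interoperable kinds and stream access, so the file carries no Fortran record markers and looks like a file written by a C program. It is a minimal sketch with invented file and variable names, not the ParaFEM output routine, and it does not reproduce the actual ParaView file layout.

    program c_binary_write_sketch
      ! Minimal sketch: write C-style binary from Fortran via ISO C binding
      ! kinds and stream access (no Fortran record markers in the file).
      use, intrinsic :: iso_c_binding, only: c_int, c_float
      implicit none
      integer(c_int) :: nn                    ! illustrative node count
      real(c_float), allocatable :: disp(:)   ! illustrative nodal data
      integer :: u

      nn = 8_c_int
      allocate(disp(3*nn))
      disp = 0.0_c_float

      open(newunit=u, file='results.bin', access='stream', &
           form='unformatted', status='replace')
      write(u) nn                             ! raw bytes, as C fwrite would produce
      write(u) disp
      close(u)
    end program c_binary_write_sketch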
Summary of the software
All the software used in this work is freely available from sf.net under the BSD licence (except for a few EPCC-contributed routines distributed under the Apache licence), allowing usage, modification and redistribution both in academia and for profit. This includes CGPACK and ParaFEM, both hosted on SourceForge.
All sf.net features are used for CGPACK and ParaFEM development: support tickets, discussion forum, svn read-only access for users and write access for all developers, production of regular versioned distributions, links to documentation, etc.
ParaView has been used extensively in this project for visualisation, together with HDF5, XDMF and NetCDF.