Improving the ease-of-use and portability of the COSA 3D solver for open rotor unsteady aerodynamics
eCSE03-002
Key Personnel
PI: Dr Sergio Campobasso - Lancaster University
Technical: Adrian Jackson - EPCC, University of Edinburgh
Relevant documents
eCSE Technical Report: COSA 3D Technical report
Project summary
In this project, we aimed to significantly improve both the performance and usability of COSA, a Navier-Stokes (NS) Computational Fluid Dynamics (CFD) code designed for internal and external flow applications. COSA also uses a harmonic balance frequency-domain algorithm to reduce the computational cost of simulating periodic flows such as those encountered in wind and tidal turbines, and helicopter rotors.
We have made a number of changes to COSA that improve the performance and usability of the code. A large part of our optimisation work focused on the I/O performance of COSA at scale: we have reduced the cost of I/O at large core counts by as much as 70%, significantly reducing parallel overheads and saving simulation time. New multi-frequency periodic boundary conditions (MFPBCs) enable harmonic balance simulations of rotors to model a single blade sector rather than the whole rotor, so that the performance savings of the method can be fully realised. To complement this optimisation work, we have implemented tools to convert COSA data to and from standard CFD data formats (CGNS and Tecplot), enabling data generated by other packages to be read by COSA and data produced by COSA to be read by other packages. This will simplify the simulation workflow of scientists using COSA and improve productivity.
We have implemented load-balancing functionality that has demonstrated performance improvements of up to 3x, and reductions in resource utilisation of up to 4x, compared with the existing code. This will greatly reduce the time and effort required to generate input grids or meshes for COSA, enabling the output of standard mesh generation tools to be used directly in COSA. It also enables the efficient use of full compute nodes, rather than relying on a number of MPI processes that evenly divides the number of blocks in the simulation.
Finally, we have made the code easier to use through dynamic memory allocation, and enabled future communication optimisation by splitting sending/receiving and waiting for message completion. Altogether these performance improvements and functionality upgrades significantly increase the potential for COSA to be a highly useful, usable, and scalable simulation package.
Achievement of objectives
This project aimed to substantially improve two aspects of the COSA harmonic balance (HB) NS solver, the key tool of the COSA finite volume compressible CFD Fortran code:
1. The computational performance (reducing runtimes while improving parallel scalability from small to very high core counts)
2. The ease-of-use and portability (adopting portable I/O standards)
We had a number of specific milestones in the project:
Milestone: The HB runtime analysis of rotor flows will be reduced by a factor equal to the number of blades. This will be verified by comparing the runtime of the HB analysis of the provided test case using the complete rotor grid without MFPBCs and that using a single grid sector with MFPBCs.
MFPBCs have been implemented in COSA and benchmarked against the same simulations run without such boundary conditions (where the full rotor domain has to be simulated rather than a single blade sector).
Milestone: the standardised parallel I/O functionality of COSA will have a parallel scalability comparable to that of the computing part of the code. This will be assessed by repeating all I/O scalability tests, and verifying that the mean deviation of the new I/O-only scalability curve from the ideal speed-up curve is of the same order of magnitude as that of the computation-only curve.
We have optimised and restructured the parallel I/O functionality in COSA, speeding up I/O performance by up to 70% on large core counts, and up to 50% on smaller core counts. We have also implemented tools to convert COSA inputs and outputs into standardised I/O formats often used in CFD applications (CGNS and Tecplot).
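The scalability criterion in this milestone (mean deviation of a measured speed-up curve from the ideal one) can be made concrete with a short calculation. The sketch below is only an illustration: the definition of "mean deviation" as a mean relative shortfall is an assumption, the function names are hypothetical, and the timings are invented for the example rather than measured COSA data.

```python
def speedups(runtimes):
    """Speed-up of each run relative to the first (smallest core count) run."""
    base_time = runtimes[0]
    return [base_time / t for t in runtimes]


def mean_deviation_from_ideal(cores, runtimes):
    """Mean relative deviation of the measured speed-up from the ideal speed-up.

    Ideal speed-up is taken as cores[i] / cores[0] (assumed definition).
    """
    measured = speedups(runtimes)
    ideal = [c / cores[0] for c in cores]
    deviations = [abs(i - m) / i for i, m in zip(ideal, measured)]
    return sum(deviations) / len(deviations)


# Hypothetical timings in seconds, for illustration only (not COSA results).
cores = [64, 128, 256, 512]
compute_times = [800.0, 410.0, 215.0, 120.0]
io_times = [100.0, 60.0, 40.0, 30.0]

compute_dev = mean_deviation_from_ideal(cores, compute_times)
io_dev = mean_deviation_from_ideal(cores, io_times)
print(f"compute-only mean deviation: {compute_dev:.3f}")
print(f"I/O-only mean deviation:     {io_dev:.3f}")
```

Under this definition, the milestone would be met when the two printed values are of the same order of magnitude.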
Milestone: Improved parallel communication performance on COSA, ideally removing the MPI communication costs altogether (10-20% of the runtime of the code). This will be evaluated using profiling and benchmarking as with the other work-packages.
We have fully split the sending/receiving of data from the checking of message completion (MPI_Wait), enabling communication to be overlapped with calculation by starting communication, moving on to work on calculations, and then checking for message completion after calculations have finished. However, this has not significantly improved performance as the structure of the code currently does not allow for large amounts of overlapping with calculations.
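A rough cost model helps show why splitting communication from completion checks gives little benefit unless enough independent work is available to overlap with it. The sketch below is an idealised model under stated assumptions, not a description of COSA's implementation: the function names are hypothetical, the overlap is assumed to be perfect, and the 85/15 split is an illustrative value chosen within the 10-20% communication cost quoted in the milestone.

```python
def step_time_blocking(t_compute, t_comm):
    """Blocking exchange: communication and computation run back to back."""
    return t_compute + t_comm


def step_time_overlapped(t_compute, t_comm, overlappable_fraction):
    """Non-blocking exchange: start messages, compute, then wait for completion.

    Only the share of computation that does not need the incoming data
    (overlappable_fraction) can run while the messages are in flight.
    """
    t_independent = overlappable_fraction * t_compute
    t_dependent = t_compute - t_independent
    # The wait returns once both the independent work and the messages are done.
    return max(t_independent, t_comm) + t_dependent


# Illustrative values only: 85 units of computation, 15 units of communication.
t_compute, t_comm = 85.0, 15.0
baseline = step_time_blocking(t_compute, t_comm)
for fraction in (0.0, 0.05, 0.5):
    overlapped = step_time_overlapped(t_compute, t_comm, fraction)
    saving = 100.0 * (baseline - overlapped) / baseline
    print(f"overlappable fraction {fraction:.2f}: saving {saving:.2f}%")
```

In this model the communication cost is hidden completely only when the independent work takes at least as long as the message exchange; with a small overlappable fraction, most of the communication time remains exposed.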
Milestone: the new dynamic load balancing (DLB) capability will yield runtime reductions of about one order of magnitude for simulations using production multi-block grids generated with state-of-the-art grid generators.
We have implemented DLB of the domain decomposition in COSA, improving performance by around 3x and improving resource utilisation by around 4x for unbalanced grids. This will allow a significant reduction in the time and effort required for constructing simulation grids, and also allows for more efficient use of computational resources.
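To illustrate the kind of problem load balancing addresses, the sketch below distributes grid blocks of unequal size across MPI ranks with a generic greedy heuristic (largest block first, onto the least-loaded rank). This is a textbook heuristic used here for illustration, not necessarily the algorithm implemented in COSA; the block sizes, the round-robin baseline, and the function names are all hypothetical.

```python
import heapq


def assign_blocks(block_sizes, n_ranks):
    """Greedy largest-first assignment of grid blocks to ranks."""
    heap = [(0, rank) for rank in range(n_ranks)]  # (current load, rank)
    heapq.heapify(heap)
    assignment = {rank: [] for rank in range(n_ranks)}
    # Place the largest blocks first, always on the currently least-loaded rank.
    ordered = sorted(enumerate(block_sizes), key=lambda p: p[1], reverse=True)
    for block, size in ordered:
        load, rank = heapq.heappop(heap)
        assignment[rank].append(block)
        heapq.heappush(heap, (load + size, rank))
    return assignment


def max_load(block_sizes, assignment):
    """Load of the busiest rank, which sets the time per iteration."""
    return max(sum(block_sizes[b] for b in blocks) for blocks in assignment.values())


# Hypothetical cell counts per block of an unbalanced multi-block grid.
block_sizes = [900, 100, 100, 100, 400, 400, 200, 200]
n_ranks = 4

# Assumed baseline: the same number of blocks per rank, ignoring block size.
round_robin = {
    r: [b for b in range(len(block_sizes)) if b % n_ranks == r]
    for r in range(n_ranks)
}
balanced = assign_blocks(block_sizes, n_ranks)

print("busiest rank, equal block counts:", max_load(block_sizes, round_robin))
print("busiest rank, greedy balancing:  ", max_load(block_sizes, balanced))
```

The same function works when the number of ranks does not evenly divide the number of blocks, which is the situation that makes it possible to fill whole compute nodes.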
Milestone: the core computational kernels will vectorise using the Intel and Cray compilers on ARCHER, which is expected to greatly reduce runtimes of all COSA simulation types.
Optimisation of the computational kernels in COSA has enabled us to achieve a 3-5% performance improvement for the code.
Summary of the software
COSA is a finite volume compressible Navier-Stokes code, which contains a harmonic balance computational fluid dynamics solver, usable for accurate unsteady aerodynamic analysis of fluid flows and fluid/structure interaction problems (e.g. flow-induced structural vibrations) in renewable energy, mechanical and aeronautical engineering.
COSA is being developed for a wide class of low-, high- and multi-speed flows, with strong emphasis on open rotor unsteady aerodynamics. The HB method is a nonlinear frequency-domain technique that reduces the runtime for calculating periodic solutions of ordinary differential equations compared with the conventional time-marching approach.
The reduction occurs because the HB method, unlike the conventional time-domain approach, directly determines the periodic solution of interest, bypassing lengthy transient effects. In aerodynamic performance, structural integrity and aero-acoustic assessments, using the HB NS technology rather than the conventional time-domain (TD) NS method to accurately determine periodic flows of turbomachinery blade rows, vibrating aircraft wings, and helicopter rotors has been shown to reduce runtimes by one to two orders of magnitude. In the case of bladed rotors, the HB speed-up is particularly high because multi-frequency periodicity boundary conditions allow the flow past a single blade to be modelled rather than the whole rotor. The HB NS COSA solver is pioneering the development and exploitation of this technology in wind turbine engineering worldwide.
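The idea of solving directly for the periodic state can be illustrated on a scalar model problem. The sketch below applies a harmonic balance discretisation to the linear ODE du/dt + u = cos(ωt), which is an assumption chosen for illustration and is far simpler than the NS equations solved by COSA. The solution is represented by 2N+1 equally spaced snapshots over one period, the time derivative is replaced by a spectral operator coupling those snapshots, and the resulting system is solved in one step with no transient to march through. All function names are hypothetical.

```python
import math


def hb_derivative_matrix(n_harmonics, omega):
    """Spectral d/dt operator on 2*n_harmonics + 1 equally spaced snapshots."""
    m = 2 * n_harmonics + 1
    d = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j:
                sign = -1.0 if (i - j) % 2 else 1.0
                d[i][j] = omega * 0.5 * sign / math.sin(math.pi * (i - j) / m)
    return d


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x


def hb_solve(n_harmonics=1, omega=1.0):
    """Periodic solution of du/dt + u = cos(omega*t) at the HB snapshots."""
    m = 2 * n_harmonics + 1
    times = [2.0 * math.pi * k / (omega * m) for k in range(m)]
    d = hb_derivative_matrix(n_harmonics, omega)
    # HB system (D + I) u = f couples all snapshots of one period.
    a = [[d[i][j] + (1.0 if i == j else 0.0) for j in range(m)] for i in range(m)]
    f = [math.cos(omega * t) for t in times]
    return times, solve_linear(a, f)


times, u_hb = hb_solve()
for t, u in zip(times, u_hb):
    exact = (math.cos(t) + math.sin(t)) / 2.0  # periodic solution for omega = 1
    print(f"t = {t:.4f}  HB = {u:.6f}  exact = {exact:.6f}")
```

A time-marching solver for the same equation would have to integrate until the initial transient has decayed before the periodic state could be read off, which is the cost the HB approach avoids. For a nonlinear problem the HB system would be solved iteratively rather than with a single linear solve.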
COSA is available as a package on ARCHER.