How to run CESM 1.0.6 on ARCHER

CESM 1.0.6 User Guide

The User Guide can be found at http://www.cesm.ucar.edu/models/cesm1.0/cesm/

NB: The PDF version is useful for searching the entire guide; however, it gives no clue as to which text is presented as links in the HTML version, so it is recommended to use both versions together.

Installing CESM 1.0.6 on ARCHER

Download CESM 1.0.6 into your /work directory using svn. You'll need to run

module load svn

and then follow the instructions on http://www.cesm.ucar.edu/models/cesm1.0/tags/index.html#CESM1_0_6.
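As a sketch, the checkout typically looks like the following. The repository URL here is an assumption; use the exact address given on the release page above.

```shell
# Assumed release-repository URL -- confirm it against the CESM1_0_6
# release page before use.
module load svn
svn co https://svn-ccsm-release.cgd.ucar.edu/model_versions/cesm1_0_6 cesm1_0_6
```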


Once downloaded, add the following 3 files into the scripts/ccsm_utils/Machines directory.

env_machopts.archer

Macros.archer

106_mkbatch.archer

NB then rename 106_mkbatch.archer to mkbatch.archer (i.e. remove the first 4 characters).

In the same directory, namely scripts/ccsm_utils/Machines, replace the existing config_machines.xml file with the following version (identical to the release version, except that an entry for 'archer' has been added):

config_machines.xml

NB at present, these files produce a debug version for ARCHER; they have passed the 11 test cases described in step 1 of http://www.cesm.ucar.edu/models/cesm1.0/cesm/cesm_doc_1_0_4/x2333.html

All users must edit the config_machines.xml file to reflect their local installation. At a minimum, every line containing the string 'gavin2' must be changed.

 

Before building CESM

Note, before building CESM, the input and output directories must exist. (These directories are sometimes referred to as the "temporary archives" in the CESM User Guides.) An input directory already exists on ARCHER, in a space which any ARCHER user can read but not write to; it contains large and popular input data files. NB both input and output directories must be created by hand by each user in their own work directory, e.g.

mkdir /work/ecse0116/ecse0116/gavin2/CESM1.0/output

These directories are then referenced in config_machines.xml, e.g.

DIN_LOC_ROOT_CSMDATA="/work/ecse0116/shared/CESM1.0/inputdata"
DIN_LOC_ROOT_CLMQIAN="/work/ecse0116/shared/CESM1.0/inputdata/atm/datm7/atm_forcing.datm7.Qian.T62.c080727"
DOUT_S_ROOT="/work/ecse0116/ecse0116/gavin2/CESM1.0/output/$CASE"

Building the cprnc tool

Finally, one must build, by hand, the cprnc tool.

To make the cprnc tool, first upload the following Makefile.archer (20 Aug 2014) to your cprnc directory, whose path will resemble:

/work/ecse0116/ecse0116/gavin2/CESM1.0/models/atm/cam/tools/cprnc

Once uploaded, run the following commands to make the cprnc tool:

module switch PrgEnv-cray PrgEnv-intel
module load netcdf
make clean -f Makefile.archer
make -f Makefile.archer

Once built, the config_machines.xml file must be updated with the full path and name of your new executable, e.g.

CCSM_CPRNC="/work/ecse0116/ecse0116/gavin/CESM1.0/models/atm/cam/tools/cprnc/cprnc"

NB ensure your current directory is included in your INCLUDE path
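Once built, cprnc can be run directly to compare two NetCDF history files; it prints a field-by-field comparison. The file names below are placeholders only.

```shell
# Hypothetical file names -- substitute two history files you wish to compare.
# cprnc reports, per field, whether the two files differ.
./cprnc case1.cam2.h0.nc case2.cam2.h0.nc
```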

Building CESM

Firstly, change directory to the scripts directory in the 'work' installation of CESM, e.g.

cd /work/ecse0116/ecse0116/gavin/CESM1.0/scripts

Then issue the following three commands:

module load svn
module switch PrgEnv-cray PrgEnv-intel
module load netcdf

The first enables any missing input/restart files to be downloaded during the build process.

The second switches to the Intel programming environment.

Finally, the third loads the netCDF library. Because the modules are loaded in this specific order, it is the Intel netCDF library that is loaded.

Building tests

The process of building the CESM tests is slightly different from building simulations.

For the test ERS.f19_g16.X, say, issue the following commands in the 'scripts' directory.

./create_test -testname ERS.f19_g16.X.archer -testid t01

cd ERS.f19_g16.X.archer.t01

./ERS.f19_g16.X.archer.t01.build

At present, the output of these commands contains multiple instances of the following message, which can safely be ignored:

ModuleCmd_Switch.c(172):ERROR:152: Module 'PrgEnv-cray' is currently not loaded

Running the test

To run the test, run the following command:

qsub ERS.f19_g16.X.archer.t01.run
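When the batch job completes, the outcome is summarised in the TestStatus file inside the test directory, one PASS/FAIL line per test phase:

```shell
# Inspect the result once the batch job has finished
cat ERS.f19_g16.X.archer.t01/TestStatus
```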

Building your simulation

The code is built from scratch for each simulation the user wants to run.

Consider, say, the following atmospheric model at 2-degree resolution: 1.9x2.5_1.9x2.5 (f19_f19) with component set F_AMIP_CAM5.

This is configured, built, and submitted for a case called, say, my_first_sim, using the following commands:

./create_newcase -case my_first_sim -res f19_f19 -compset F_AMIP_CAM5 -mach archer
cd my_first_sim
./configure -case
./my_first_sim.build

Consider the create_newcase command: the -case flag assigns a local name for the case (here 'my_first_sim'); the -res flag sets the mesh resolution; the -compset flag selects the component set of codes to employ; and the -mach flag names the target platform, in this case 'archer'.

Consider the build command: if the input/restart files are not present, the build command downloads them, so it can take over an hour to complete. Furthermore, if the build fails with an error referencing the /tmp directory, simply run the build command again: it is likely that the system was very busy and the build temporarily ran out of memory.

Before running the simulation, users should check both the case's *.run file and the env_run.xml file, as the default values produce only a short run.

Running your simulation

To run the simulation, run the following command:

qsub my_first_sim.run
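After submission, the job's progress can be checked with the standard PBS query command:

```shell
# List the state (queued, running, etc.) of your jobs on ARCHER
qstat -u $USER
```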

How to change from the default settings

Before building

Changing the number of cores

Changing the number of cores to 128, say

cd $CASE
NTASKS=128
./xmlchange -file env_mach_pes.xml -id NTASKS_ATM -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_LND -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_ICE -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_OCN -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_CPL -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_GLC -val $NTASKS
./xmlchange -file env_mach_pes.xml -id TOTALPES -val $NTASKS
./configure -cleanmach
./configure -case
./$CASE.$MACH.build
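Since every id receives the same value, the repeated xmlchange calls above can equivalently be written as a loop (a sketch using the same ids from env_mach_pes.xml):

```shell
# Set all task counts, plus TOTALPES, to the same value in one pass
NTASKS=128
for id in NTASKS_ATM NTASKS_LND NTASKS_ICE NTASKS_OCN NTASKS_CPL NTASKS_GLC TOTALPES; do
    ./xmlchange -file env_mach_pes.xml -id $id -val $NTASKS
done
```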

Changing simulation units

cd $CASE

# change given STOP_OPTION value to nyears

./xmlchange -file env_run.xml -id STOP_OPTION -val nyears

# change given STOP_N to 20

./xmlchange -file env_run.xml -id STOP_N -val 20

# don't produce restart files at end

./xmlchange -file env_run.xml -id REST_OPTION -val never

# or *do* produce restart files at end

#./xmlchange -file env_run.xml -id REST_OPTION -val $STOP_N

./configure -cleanmach

./configure -case

./$CASE.$MACH.build

Parallel netcdf library

Both the parallel and serial versions of the netCDF library are available within the default build on ARCHER.

The default setting is to employ the serial netcdf libraries.

To employ the parallel netCDF libraries, change directory to $CASE and run

./xmlchange -file env_run.xml -id PIO_TYPENAME -val pnetcdf

which changes the value of PIO_TYPENAME from netcdf to pnetcdf, before building. (This is contrary to the User Guide, which states that the value is changed after building.)

The number of IO tasks is set by PIO_NUMTASKS; its default value of -1 instructs the library to select a suitable value itself.
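If you wish to set the number of IO tasks explicitly rather than rely on the default, the same xmlchange mechanism applies; the value 24 below is illustrative only.

```shell
# Illustrative value: spread IO over 24 tasks instead of the -1 default
./xmlchange -file env_run.xml -id PIO_NUMTASKS -val 24
```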

Changing the batch script

Changing the budget account

 

In the file mkbatch.archer, change the line

set account_name = "ecse0116"

to

set account_name = "<budget>"

where the string <budget> is replaced by the name of your budget on ARCHER.
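The edit can also be scripted; in the sketch below, 'z01' is a placeholder for your own budget code.

```shell
# 'z01' is a placeholder budget code; GNU sed's -i flag edits the
# file in place, so keep a backup of mkbatch.archer first.
sed -i 's/set account_name = "ecse0116"/set account_name = "z01"/' mkbatch.archer
```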

Requesting high memory nodes

ARCHER has two types of compute node: 2632 nodes with 64 GB of shared memory and 376 nodes with 128 GB. Both types have 24 cores which share this memory.

During the validation process, it was found that some of the tests required the larger memory nodes. To use them, update the batch script, namely the *.run file. Specifically, to run on 4 large memory nodes, set

#PBS -l select=4:bigmem=true

or, to run on 4 standard memory nodes, set

#PBS -l select=4:bigmem=false

or, if you don't mind which node you run on, set

#PBS -l select=4

Requesting longer wall times

Users are limited to requesting a maximum walltime of 24 hours, e.g.

#PBS -l walltime=24:00:00

however, if a job requires more time, users can increase this limit to 48 hours by submitting to the PBS 'long' queue, e.g.

#PBS -q long

#PBS -l walltime=48:00:00
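Putting the pieces together, the header of a 48-hour run on 4 high-memory nodes would contain directives along these lines (a sketch; combine with the other PBS settings already in your *.run file):

```shell
#PBS -q long
#PBS -l walltime=48:00:00
#PBS -l select=4:bigmem=true
```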