3. File and Resource Management

This section covers some of the tools and technical knowledge that are key to making the most of the ARCHER system, such as the online administration tool SAFE and how to check the CPU-time available.

The default file permissions are then outlined, along with a description of how to change these permissions to the desired settings. This leads on to the sharing of data between users and systems, often vital for project groups and collaboration.

Finally, we cover some guidelines for I/O and data archiving on ARCHER.

3.1 The ARCHER Administration Web Site (SAFE)

All users have a login and password on the ARCHER Administration Web Site (also known as the 'SAFE'): www.archer.ac.uk/safe/. Once logged into this web site, users can find out much about their usage of the ARCHER system, including:

  • Account details - password reset, change contact details
  • Project details - project code, start and end dates
  • kAU balance - how much time is left in each project you are a member of
  • Filesystem details - current usage and quotas
  • Reports - generate reports on your usage over a specified period, including individual job records
  • Helpdesk - raise queries and track progress of open queries

3.2 Checking your CPU-time (kAU) balance

You can view these details by logging into the SAFE (www.archer.ac.uk/safe/). Scroll down to the panel marked "You are a member of the following project groups" and you will see how much time is left in a particular project.

You can also generate reports on your usage over a particular period and examine the details of how many kAUs individual jobs on the system cost. Scroll down to the bottom of your SAFE main page and click on the "Go to the report generator" button. Further information is available in the SAFE User Guide.

You can also get the current kAU quota (how much CPU time you have left) for your budgets on ARCHER login nodes by using the budgets command. This will list all the budgets you are a member of and their current values. For example:

user@system:~> budgets
=========================================
Budget       Remaining kAUs
-----------------------------------------
e05       No resources left
e05-surfin-smw      346.788
z01                2796.859
=========================================

3.3 ARCHER Filesystems

There are three filesystems available to ARCHER users:

  • /home filesystem - backed-up, NFS filesystem. Available on esLogin and service nodes.
  • /work filesystem - not backed-up, high-performance, Lustre filesystem. Available on esLogin, service and compute nodes.
  • RDF filesystems - resilient, long-term, GPFS data storage. Available on esLogin nodes.

We provide more details on each of these filesystems below.

3.3.1 /home filesystem

Note: /home is not mounted on the compute nodes so all files required for your calculations must be available on the /work filesystem.

The home directory for each user is located at:

/home/[project code]/[group code]/[username]

where:

[project code]
is the code for your project (e.g., x01);
[group code]
is the code for your project group, if your project has groups, (e.g. x01-a) or the same as the project code, if not;
[username]
is your login name.

Each project is allocated a portion of the total storage available, and the project PI will be able to sub-divide this quota among the groups and users within the project. As is standard practice on UNIX and Linux systems, the environment variable $HOME is automatically set to point to your home directory.

The /home filesystem is backed up for disaster recovery purposes only. Data recovery for accidental deletion is not supported. This is the filesystem to use for critical files, such as source code, makefiles or other build scripts, binary libraries and executables obtained from third parties, and small permanent datasets.

It should be noted that the /home filesystem is not designed, and does not have the capacity, to act as a long term archive for large sets of results. Users should use the RDF facility (see below); transfer such data to systems at their own institutions; or use suitable facilities elsewhere.

3.3.2 /work filesystem

/work is the only filesystem mounted on the compute nodes. All parallel calculations must be run from directories on the /work filesystem and all files required by the calculation (apart from the executable) must reside on /work.

/work is a collection of high-performance, parallel Lustre filesystems. Each project will be assigned space on a particular Lustre partition with the assignments chosen to balance the load across the available infrastructure. /work should be used for reading and writing during simulations.

In a similar way to the /home filesystem, the work directory for each user is located at:

/work/[project code]/[group code]/[username]

As for the /home filesystem, the project PI (or Project Manager) will be able to sub-divide this quota among the groups and users within the project using the SAFE web interface.

/work is a network of high-performance Lustre filesystems, and hence includes some redundancy in case of hardware failure. There is no separate backup of data on either of the work filesystems, which means that in the event of a major hardware failure, or if a user accidentally deletes essential data, it will not be possible to recover the lost files.

Note that /work is not a scratch space - files created on work will remain after the completion of the batch job which created them. Users who desire temporary scratch files only for the duration of the batch job must manage this explicitly within their programs or batch scripts.

Links from the /home filesystem to directories or files on /work are strongly discouraged. If links are used, executables and data files on /work to be used by applications on the compute nodes (i.e. those executed via the aprun command) should be referenced directly on /work.

3.3.3 RDF filesystems

The Research Data Facility (RDF) consists of 7.8 PB disk, with an additional 19.5 PB of backup tape capacity for disaster recovery purposes. Data recovery for accidental deletion is not supported. The RDF is external to the national services, and is designed as long term data storage. The RDF has 3 filesystems:

/general
/epsrc
/nerc

The disk storage is based on four DDN 10K storage arrays populated with near-line SAS 3TB 7200rpm HDDs. Metadata storage is based on two IBM DS3524s populated with SAS 300GB 10krpm HDDs. The backup capability is managed via Tivoli Storage Manager, based on an IBM TS3500 tape library with 12 drives.

For more information on the RDF, please see the UK Research Data Facility Guide. For instructions on transferring data to and from the RDF, please see the Data Management Guide.

3.4 Disk quotas

Disk quotas on ARCHER are managed via SAFE.

3.4.1 Checking disk quotas

On the "work" filesystems, the quota option to the Lustre lfs command can be used to get more detailed quota information than is available on the SAFE.

To check the quota for your project group:

lfs quota -g [project code] /work/[project code]

Information on the disk usage for an individual user can be checked with:

lfs quota -u [username] /work/[project code]

3.4.2 Allocating quotas to specific groups and users

Allocating quotas to specific groups and users is done via SAFE.

3.5 File permissions and security

By default, each user is a member of the group with the same name as [group code] in the /home and /work directory paths, e.g. x01-a. This allows the user to share files with only members of that group by setting the appropriate group file access permissions. As on other UNIX or Linux systems, a user may also be a member of other groups. The list of groups that a user is part of can be determined by running the groups command.

It is, of course, possible to make files accessible to all users on ARCHER, by setting the "other" file access permissions appropriately. You should not allow "other" write access to any of your directories, and for security reasons it is recommended that you do not allow anyone else read or write access to important account information (e.g. $HOME/.ssh).

Default Unix file permissions can be specified by the umask command. The default umask value on ARCHER is 22, which provides "group" and "other" read permissions for all files created, and "group" and "other" read and execute permissions for all directories created. This is highly undesirable, as it allows everyone else on the system to access (but at least not modify or delete) every file you create. Thus it is strongly recommended that users change this default umask behaviour, by adding the command umask 077 to their $HOME/.profile file. This umask setting only allows the user access to any file or directory created. The user can then selectively enable "group" and/or "other" access to particular files or directories if required.
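The effect of the recommended setting can be checked directly in the shell. The following is a minimal sketch, run in a throwaway scratch directory; the file and directory names are illustrative:

```shell
#!/bin/sh
# Sketch: files and directories created under umask 077 are private.
cd "$(mktemp -d)"        # throwaway scratch directory
umask 077                # remove all "group" and "other" permissions
touch private_file
mkdir private_dir
ls -ld private_file private_dir
# private_file is created as -rw------- and private_dir as drwx------
```

Running umask with no arguments prints the current setting, so you can confirm that the line added to $HOME/.profile has taken effect.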

3.6 Sharing data with other ARCHER users

How you share data with other ARCHER users depends on whether they belong to the same project as you or not. Each project has two levels of shared directories that can be used for sharing data, as explained below. If you request support from the ARCHER CSE team you may be asked to copy data to a shared directory. This is because the ARCHER CSE team does not have access to user directories.

3.6.1 Sharing data with users in your project

Each project has a directory called:

/work/[project code]/[project code]/shared

that has read/write permissions for all project members. You can place any data you wish to share with other project members in this directory.

For example, if your project code is x01 the shared project directory would be located at:

/work/x01/x01/shared
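Copying a file into the shared directory and confirming that the group read bit is set might look like the following sketch. On ARCHER the destination would be the shared directory above; here a temporary directory stands in for it so the commands can be tried anywhere, and results.dat is an illustrative filename:

```shell
#!/bin/sh
# Sketch: share a results file with other members of your project.
SHARED=$(mktemp -d)                  # stand-in for /work/x01/x01/shared
echo "sample results" > results.dat  # illustrative data file
cp results.dat "$SHARED/"
chmod g+r "$SHARED/results.dat"      # ensure project members can read it
ls -l "$SHARED/results.dat"
```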

3.6.2 Sharing data with all users

Each project also has a higher level directory called:

/work/[project code]/shared

that is writable by all project members and readable by any user on the system. You can place any data you wish to share with other ARCHER users who are not members of your project in this directory.

For example, if your project code is x01 the sharing directory would be located at:

/work/x01/shared

3.7 Sharing data between systems

Many users will be generating data on ARCHER and then analysing this data on other systems, e.g. at their host institutions. Users may also be using data files generated elsewhere as input for runs on ARCHER. Therefore it is important to consider the compatibility of data files between ARCHER and other systems. It is also important to consider compatibility issues for results files that are archived for future reference.

3.7.1 ASCII (or formatted) files

These are the most portable, but can be extremely inefficient to read and write. There is also the problem that if the formatting is not done correctly, the data may not be output to full precision (or to the subsequently required precision), resulting in inaccurate results when the data is used. Another common problem with formatted files is FORMAT statements that fail to provide an adequate range to accommodate future requirements, e.g. if we wish to output the total number of processors, NPROC, used by the application, the statement:

WRITE (*,'(I3)') NPROC

will not work correctly if NPROC is greater than 999.

3.7.2 Binary (or unformatted) files

These are much faster to read and write, especially if an entire array is read or written with a single READ or WRITE statement. However the files produced may not be readable on other systems. Tools and compiler options are available in many Fortran compilers to assist with accessing big-endian files on the Cray. These include:

GNU compiler -fconvert=swap compiler option.
This compiler option often needs to be used together with a second option -frecord-marker, which specifies the length of record marker (extra bytes inserted before or after the actual data in the binary file) for unformatted files generated on a particular system.
To read a binary file generated by a big-endian system on ARCHER, use
-fconvert=swap -frecord-marker=4.
Please note that, for the same record-marker reason, unformatted files generated by the GNU compiler are not compatible with those generated by other compilers on ARCHER. In fact, the same WRITE statements will produce slightly larger files with the GNU compiler. It is therefore recommended to use the same compiler for your simulations and for any related pre- and post-processing jobs.

Other options for file formats include:

Direct access files
Fortran unformatted files with specified record lengths. These may be more portable between different systems than ordinary (i.e. sequential IO) unformatted files, with significantly better performance than formatted (or ASCII) files. The "endian" issue will, however, still be a potential problem.
Portable data formats
These machine-independent formats for representing scientific data are specifically designed to enable the same data files to be used on a wide variety of different hardware and operating systems. The most common formats are netCDF and HDF. It is important to note that these portable data formats are evolving standards, so make sure you are aware of which version of the standard/software you are using, and keep up-to-date with any backward-compatibility implications of each new release.

3.8 File IO Performance Guidelines

Here are some general guidelines:

  • Whichever data formats you choose, it is vital that you test that you can access your data correctly on all the different systems where it is required. This testing should be done as early as possible in the software development or porting process (i.e. before you generate lots of data from expensive production runs), and should be repeated with every major software upgrade.
  • Document the file formats and metadata of your important data files very carefully. The best documentation will include a copy of the relevant I/O subroutines from your code. Of course, this documentation must be kept up-to-date with any code modifications.
  • Use binary (or unformatted) format for files that will only be used on the Cray system, e.g. for checkpointing files. This will give the best performance. Binary files may also be suitable for larger output data files, if they can be read correctly on other systems.
  • Most codes will produce some human-readable (i.e. ASCII) files to provide some information on the progress and correctness of the calculation. Plan ahead when choosing format statements to allow for future code usage, e.g. larger problem sizes and processor counts.
  • If the data you generate is widely shared within a large community, or if it must be archived for future reference, invest the time and effort to standardise on a suitable portable data format, such as netCDF or HDF.
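For the first guideline above, a simple way to confirm that a data file has survived a transfer between systems intact is to compare checksums at both ends. A minimal sketch, with an illustrative filename:

```shell
#!/bin/sh
# Sketch: record a checksum before transfer, verify it after transfer.
cd "$(mktemp -d)"                       # scratch directory for the example
echo "example data" > output.dat        # stand-in for a real results file
sha256sum output.dat > output.dat.sha256
# After copying both files to the other system, verify the data there:
sha256sum -c output.dat.sha256          # prints "output.dat: OK" on success
```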

3.9 Data archiving

The data archiving requirements may vary considerably between different ARCHER user communities, but some common points to consider are:

  • Clear records of what the data represents are vital. Careful choice of filenames and including metadata within the data files can assist in this process.
  • Using a tar file to combine multiple data files associated with a single calculation into a single file may not only improve the performance of downloads from ARCHER, but will also preserve the userid and timestamp information of when the data files were created.
  • Do not separate media from device. For example, having your data archived on a tape (particularly if this is done by someone else on your behalf) is not much use if in say five years time, when you need access to the data, you cannot get access to a compatible tape drive. Also be aware of the shelf life of media such as tapes, DVDs, etc.
  • Be aware of the limits on the redundancy and fault tolerance of your storage system. For example, many RAID storage systems include error correcting functionality, where single-bit errors in the data files can be fixed automatically. However, this may only apply when files are read or written - if files are not accessed for some time, a number of random single-bit errors may accumulate, resulting in corrupted or unreadable data.
  • When changing the formats or file structure of any I/O in your software (both the main production codes and any pre- or post-processing or visualisation codes), consider the impact of these changes on your ability to access old archived data files.
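The tar suggestion above can be sketched as follows; the directory and file names are illustrative:

```shell
#!/bin/sh
# Sketch: bundle the files from one calculation into a single archive,
# preserving ownership and timestamp information.
cd "$(mktemp -d)"                    # scratch directory for the example
mkdir run42                          # stand-in for a calculation directory
echo "input parameters" > run42/input.dat
echo "results" > run42/output.dat
tar -cf run42.tar run42              # create the archive
tar -tvf run42.tar                   # -v lists owner, size and timestamps
```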

3.10 Use of /tmp

The /tmp directory is not available to applications on the compute nodes of the ARCHER machine. The /tmp directory on compute nodes is a memory resident file system; its use to store temporary files could, therefore, seriously affect application performance on those nodes. Temporary files created as part of an application run should be written to somewhere under your own directory on the "work" filesystem.

Note that some Fortran codes include file OPEN statements specifying STATUS='SCRATCH'. Such codes, if compiled with the GNU Fortran compiler, will attempt to use /tmp and fail. The solution is to set an environment variable to a directory in your "work" space. For GNU it is GFORTRAN_TMPDIR. In your batch script you should have e.g.:

export GFORTRAN_TMPDIR=/work/[project]/[group]/[username]/tmp

(replacing [project], [group] and [username] with your project, group and username). You should include the relevant line in your submission script and make sure that the specified directory exists before running the job.
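A batch-script fragment covering both steps might look like the following sketch; WORKDIR is an assumed stand-in for your own /work directory:

```shell
#!/bin/sh
# Sketch: create a scratch directory and point GFORTRAN_TMPDIR at it.
# WORKDIR is an assumed stand-in; on ARCHER use your own /work directory.
WORKDIR=${WORKDIR:-$HOME}
mkdir -p "$WORKDIR/tmp"                # directory must exist before the run
export GFORTRAN_TMPDIR="$WORKDIR/tmp"
echo "Scratch files will go to $GFORTRAN_TMPDIR"
```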

Similarly, some compilers use /tmp during compilation, which can cause problems. This can be resolved by setting TMPDIR before compiling. For example, setting

 export TMPDIR=$HOME/tmp

after first ensuring that the specified directory exists.

On ARCHER login nodes, where full SUSE Linux is installed, /tmp is a regular temporary filespace used, for example, by compilers and tools. It should be noted that /tmp is regularly purged and should not be used by users for the storing of any data.

3.11 Backup policies

To be added.
