Reproducing ECCO Version 4 Release 5 (Forward Simulation)#

ECCO Version 4 Release 5 (V4r5) is ECCO’s next central estimate after V4r4. Reproducing ECCO V4r5 is essentially the same as reproducing ECCO V4r4, as described in Reproducing ECCO Version 4 Release 4. Like ECCO V4r4, ECCO V4r5 is a forward simulation with optimized controls that have been adjusted through an iterative adjoint-based optimization process to minimize the model–data misfit.

Compared to V4r4, ECCO V4r5 extends the model integration period from 1992–2017 to 1992–2019. It also includes ice sheets around Antarctica. Meltwater from these ice sheets is an important component of both regional and global sea level change. Including this melt introduces an important physical process that was missing in V4r4.

In this tutorial, we provide instructions on how to reproduce the ECCO V4r5 estimate on the P-Cluster.

Log in to P-Cluster#

Users first connect to the P-Cluster and then change to their own directory on /efs_ecco, as described in the P-Cluster introduction tutorial:

ssh -i /path/to/privatekey -X USERNAME@34.210.1.198

The directory /efs_ecco/USERNAME/ (replace USERNAME with the user’s actual username) is where the run should be conducted. Users can change to that directory with the following command:

cd /efs_ecco/USERNAME/

Modules#

Modules on Linux allow users to easily configure their environment for specific software, such as compilers (e.g., GCC, Intel) and MPI libraries (e.g., OpenMPI, MPICH). Users can switch between versions without manually setting environment variables. Running ECCO on different machines and platforms often involves a different set of modules tailored to the system’s architecture and operating system. The modules used on the P-Cluster are loaded in the example .bashrc, which should have been downloaded and renamed to /home/USERNAME/.bashrc as described in the P-Cluster introduction tutorial, so that the required modules are loaded automatically. Specifically, the modules loaded in the example .bashrc are as follows:


module load intel-oneapi-compilers-2021.2.0-gcc-11.1.0-adt4bgf

module load intel-oneapi-mpi-2021.2.0-gcc-11.1.0-ibxno3u

module load netcdf-c-4.8.1-gcc-11.1.0-6so76nc

module load netcdf-fortran-4.5.3-gcc-11.1.0-d35hzyr

module load hdf5-1.10.7-gcc-9.4.0-vif4ht3
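
After logging in, you can confirm that these modules were picked up from the example .bashrc with the standard module command:

module list

The output should include the five modules listed above.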

Code, Namelists, and Input Files#

To save time, the MITgcm code and the V4r5-specific code have already been downloaded to the P-Cluster at /efs_ecco/ECCO/V4/r5/. Copy the code to your own directory at /efs_ecco/USERNAME/r5/ using the rsync command below. Replace USERNAME with your actual username and keep the directory structure as specified here so that you can follow the tutorial more easily.

rsync -av /efs_ecco/ECCO/V4/r5/WORKINGDIR /efs_ecco/USERNAME/r5/

Everyone has a directory at /efs_ecco/USERNAME/. There is no need to manually create the subdirectory /efs_ecco/USERNAME/r5/; the rsync command above will create it automatically.
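
As a quick check, listing your new directory should show the copied WORKINGDIR:

ls /efs_ecco/USERNAME/r5/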

Outside the P-Cluster, obtain the MITgcm code (checkpoint68g) and the V4r5-specific code using the following commands:

mkdir WORKINGDIR
cd WORKINGDIR
git clone https://github.com/MITgcm/MITgcm.git -b checkpoint68g
git clone https://github.com/ECCO-GROUP/ECCO-v4-Configurations.git
mkdir -p ECCOV4/release5
cp -r "ECCO-v4-Configurations/ECCOv4 Release 5/code/" ECCOV4/release5/code
cp -r "ECCO-v4-Configurations/ECCOv4 Release 5/namelist/" ECCOV4/release5/namelist

The input files, such as atmospheric forcing and initial conditions, are one terabyte (1 TB) in size. These input files have also been downloaded and stored on the P-Cluster in /efs_ecco/ECCO/V4/r5/input/. Do not copy them to your own directory. Instead, create a symbolic link in your own directory pointing to the input directory using the following commands:

cd /efs_ecco/USERNAME/r5/
ln -s /efs_ecco/ECCO/V4/r5/input .

The symbolic link will be used to access the input files in the example run script described below.
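
As a sanity check, listing the link should show it resolving to the shared input directory:

ls -l /efs_ecco/USERNAME/r5/input   # should show: input -> /efs_ecco/ECCO/V4/r5/input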

The directory structure under /efs_ecco/USERNAME/r5/ now looks like the following:

/efs_ecco/USERNAME/r5/
├── WORKINGDIR
│   ├── ECCO-v4-Configurations
│   ├── ECCOV4
│   │   └── release5
│   │       ├── code
│   │       └── namelist
│   └── MITgcm
└── input

Compile#

The steps for compiling the code are as follows:

cd WORKINGDIR/ECCOV4/release5
mkdir build
cd build
export ROOTDIR=../../../MITgcm
../../../MITgcm/tools/genmake2 -mods=../code -optfile=../code/linux_ifort_impi_aws_sysmodule -mpi
make depend
make all
cd ..

The optfile linux_ifort_impi_aws_sysmodule has been specifically customized for the P-Cluster. If successful, the executable mitgcmuv will be generated in the build directory.
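
A quick way to confirm that the build finished is to check for the executable (run from the release5 directory, where the compile steps above end):

ls -l build/mitgcmuv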

Run the Model#

After successfully compiling the code and generating the executable in the build directory (WORKINGDIR/ECCOV4/release5/build/mitgcmuv), you can proceed with running the model. For this purpose, we provide an example V4r5 run script that integrates the model over a three-month period. (See below for instructions on how to modify the run script to run the full integration period from 1992 to 2019.)
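
The script sets the SLURM batch directives at its top. As a purely illustrative sketch (not the verbatim contents of the provided script), a header consistent with the 4-node job shown later in the squeue output might look like:

#!/bin/bash
#SBATCH --job-name=ECCOv4r5        # job name reported by squeue
#SBATCH --nodes=4                  # number of compute nodes
#SBATCH --output=ECCOv4r5-%j-out   # batch log file; %j expands to the job ID

The actual directives (partition, tasks per node, wall-clock limit, etc.) are already set in run_script_slurm_v4r5.bash, so no changes are needed for the 3-month test run.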

The script is also available on the P-Cluster at /efs_ecco/ECCO/V4/r5/scripts/run_script_slurm_v4r5.bash. Copy it to your working directory at /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5 (replace USERNAME with your actual username, but keep the directory structure the same). Then submit the script using sbatch with the following commands:

cd /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5
cp /efs_ecco/ECCO/V4/r5/scripts/run_script_slurm_v4r5.bash .
sbatch run_script_slurm_v4r5.bash

Once the job is submitted, SLURM will generate a job ID and show the following message:

Submitted batch job 194

Users can then check the status of the job by using the following command:

squeue

Usually, SLURM takes several minutes to configure a job, with the status (ST) showing CF (for configuring):

             JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
               194 sealevel- ECCOv4r5 USERNAME  CF       0:53      4 sealevel-c5n18xl-demand-dy-c5n18xlarge-[1-4]

After a while, squeue will show the status changing to R (for running), as shown in the following:

             JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
               194 sealevel- ECCOv4r5 USERNAME  R        3:30      4 sealevel-c5n18xl-demand-dy-c5n18xlarge-[1-4]
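
If the queue is busy, you can restrict the listing to your own jobs with the standard SLURM user filter:

squeue -u USERNAME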

The run directory is /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5/run/. The 3-month integration takes about 45 minutes to complete. NORMAL END inside the batch log file /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5/ECCOv4r5-194-out (replace 194 with the actual job ID) indicates a successfully completed run. Another way to check whether the run ended normally is to examine the last line of the file STDOUT.0000 in the run directory. If that line is PROGRAM MAIN: Execution ended Normally, then the run completed successfully.
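
For example, assuming the run and log file locations described above and replacing 194 with your actual job ID, the following commands check both indicators:

cd /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5
grep "NORMAL END" ECCOv4r5-194-out   # should return a match for a successful run
tail -n 1 run/STDOUT.0000            # should print: PROGRAM MAIN: Execution ended Normally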

The run will output monthly-mean core variables, SSH (sea surface height), OBP (ocean bottom pressure), and UVTS (horizontal velocity, temperature, and salinity), in the subdirectory diags/ of the run directory. These fields can be analyzed using the Jupyter Notebooks presented in some of the ECCO Summer School tutorials.

Note

If you want to output other variables, modify data.diagnostics following the format of WORKINGDIR/ECCOV4/release5/namelist/data.diagnostics.monthly.inddiag (for monthly output) or data.diagnostics.daily.inddiag (for daily output). To help manage disk usage and I/O performance, please consider outputting only the variables that are essential for your analysis.
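
For example, to review the two templates before editing (the paths follow the directory layout used in this tutorial):

less /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5/namelist/data.diagnostics.monthly.inddiag   # monthly-output template
less /efs_ecco/USERNAME/r5/WORKINGDIR/ECCOV4/release5/namelist/data.diagnostics.daily.inddiag     # daily-output template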

To run the entire 28-year V4r5 model integration period (1992–2019), comment out the following three lines in the script:

unlink data
cp -p ../namelist/data .
sed -i '/#nTimeSteps=2160,/ s/^#//; /nTimeSteps=245423,/ s/^/#/' data
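
After commenting them out, these lines in the run script would read:

#unlink data
#cp -p ../namelist/data .
#sed -i '/#nTimeSteps=2160,/ s/^#//; /nTimeSteps=245423,/ s/^/#/' data

With these lines disabled, the data namelist keeps its original setting of nTimeSteps=245423, which corresponds to the full 1992–2019 integration instead of the 3-month test run.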