GROMACS on WestGrid Systems

Introduction

GROMACS software

GROMACS is a molecular dynamics program (along with attendant utilities) designed for simulations of large molecules, such as proteins. This WestGrid GROMACS web page includes instructions on how to submit GROMACS jobs, but it is not a tutorial on the GROMACS software suite itself. Visit www.gromacs.org for a detailed description of program features and instructions on such things as input file structure and command line options.

Although many researchers choose to maintain their own versions of GROMACS, the software has also been installed in publicly accessible directories on several WestGrid systems. Both serial and parallel versions of the main GROMACS executable, mdrun, are available, and on some systems both single- and double-precision builds are provided. The most common usage appears to be single-precision parallel runs, so the example scripts here illustrate that case.

Steps in a GROMACS analysis

After preparation of the various input files (conf.gro for coordinates, topol.top for the topology and grompp.mdp for the main run parameters), the basic steps in using GROMACS are to run the preprocessor, grompp, and then the main simulation program, mdrun, which carries out the energy minimization or molecular dynamics specified in the parameter file. The simulation may be extended with subsequent mdrun steps using checkpoint files.
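
For example, a minimal sequence might look like the following (a sketch only, using the default file names mentioned above; on some WestGrid systems the binaries carry suffixes or are installed in non-default locations, as described in later sections):

# Preprocess the coordinate, topology and parameter files into a run input (.tpr) file.
grompp -f grompp.mdp -p topol.top -c conf.gro -o run.tpr

# Run the energy minimization or molecular dynamics specified in grompp.mdp.
mdrun -v -s run.tpr -o run.trr -c confout.gro -g run.log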

Other programs in the GROMACS suite are used for analyzing the results, but these are not discussed here. Additional visualization programs may also be needed. If you would like additional software to be installed, please contact WestGrid support at support@westgrid.ca.

Parallel performance issues

The number of processors that can be used effectively depends on which WestGrid system is being used and on the GROMACS options chosen. For example, performance with the PME (particle mesh Ewald) treatment of long-range electrostatic interactions may not scale as well with the number of processors as other options. If the PME method is appropriate for the molecular system you are studying, go ahead and use it, but adjust the number of processors requested to maintain a reasonable level of parallel efficiency. What constitutes a "reasonable" efficiency is a matter of judgment.

Parallelization has improved with GROMACS 4, which adds the option of dedicating processors to the PME calculations. GROMACS also includes a tuning utility, g_tune_pme, that can be used to optimize parallel performance on a job-by-job basis.

Long simulations

Long simulations will have to be broken up into several jobs, so users will have to be familiar with restarting runs using the files written by a previous run.
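
As a sketch of a typical continuation (GROMACS 4.x commands and default file names are assumed; adjust the binary names, file names and extension time for your own runs):

# Extend the finished run by a further 1000 ps.
tpbconv -s run.tpr -extend 1000 -o run_ext.tpr

# Continue from the checkpoint file written by the previous job,
# appending to the existing output files.
mdrun -v -s run_ext.tpr -cpi state.cpt -append -o run.trr -c confout.gro -g run.log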

Batch job submission

Like other jobs on WestGrid systems, anything more than a small debugging run should be submitted for batch scheduling. This is done by embedding the GROMACS commands in a script that is submitted using the qsub command. Details of scheduling and job management are explained on the Running Jobs page, but examples of job submission for some WestGrid systems are shown in the following sections.
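
For example, if the GROMACS commands have been collected in a script called gromacs_job.pbs (the file name is arbitrary), the job would be submitted and its status checked with:

qsub gromacs_job.pbs
qstat -u $USER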

Running GROMACS on WestGrid Systems

GROMACS is run in basically the same way on each WestGrid system on which it has been installed, with the main mdrun program run as an MPI parallel code. (See the Running Jobs page and the QuickStart guides for more information.) However, due to differences in installation location, available versions, walltime limits, and processor and network performance, small changes to job scripts and job submission parameters will likely be needed from one system to another. One difference to take into account is whether the environment is initialized with a module command or by sourcing a GMXRC file. The location of the binaries and the naming conventions used to distinguish single- from double-precision and serial from MPI versions are other issues to consider.

Here are the locations of the GROMACS installations on the various WestGrid systems.

Breezy
    (Not available, as GROMACS typically does not require a large amount of memory per process.)

Bugaboo
    The default version (4.5.4) is already in the default PATH (/usr/local/bin).
    Use module load gromacs/<version> for the other installed versions: 4.6.1, 4.5.1, 4.0.7 and 4.0.5.

Checkers
    For version 4.5.4, use module load gromacs/4.5.4 (or simply module load gromacs).
    Other versions available through modules are 4.0.5, 4.0.7, 4.5.1 and 4.5.3.

Glacier
    /global/software/gromacs-3.2/intel-fftw-2.1/i686-pc-linux-gnu/bin
    /global/software/gromacs-3.2/gcc-fftw-2.1-single_p/i686-pc-linux-gnu/bin
    /global/software/gromacs-3.2/single_p_pgi-5.2-fftw-2.1/i686-pc-linux-gnu/bin
    /global/software/gromacs-3.3/gcc-fftw-3.1-single
    /global/software/gromacs-3.3/gcc-fftw-3.1-double

Grex
    (Not available)

Hermes/Nestor
    For version 4.5.5, use module load gromacs/4.5.5 (or simply module load gromacs).
    Use module avail to see older versions, then module load <module_file>.

Hungabee
    module load application/gromacs/4.5.5

Jasper
    module load application/gromacs/4.6.1
    module load application/gromacs/4.5.5

Lattice/Parallel
    For version 4.6, look in /global/software/gromacs/gromacs46/bin.
    For version 4.5.4, look in /global/software/gromacs/gromacs454_intel1202_fftw_mkl.
    Various other versions are installed under /global/software/gromacs.

Orcinus
    See /global/software/gromacs, or use module avail to see the available versions (including 4.5.5), then module load gromacs/<version>.

Please note that on some of the systems, GROMACS has been built with the option to use a suffix to distinguish serial from MPI versions and single from double precision versions. Look in the directory corresponding to the version you want to use to see what convention has been used on the particular system you are using. On several systems mdrun and mdrun_mpi are used for the single precision serial and MPI versions, respectively.  The double precision serial and MPI versions on most systems are mdrun_d and mdrun_mpi_d, respectively.
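
For instance, on Lattice the convention used by the 4.5.4 build could be checked with a command along these lines (substitute the directory of the version you intend to use):

ls /global/software/gromacs/gromacs454_intel1202_fftw_mkl/bin | grep -E 'mdrun|grompp'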

The tuning aid g_tune_pme can be used to optimize GROMACS parameters for improved parallel performance, in particular to help judge how many processors to dedicate to the PME (particle mesh Ewald) long-range electrostatics calculation when that feature is being used. See http://www.mpibpc.mpg.de/home/grubmueller/projects/MethodAdvancements/Gromacs/Download/PosterHuenfeld2009.pdf for a description of how it is used.
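
A sketch of how it might be invoked for a 64-processor job is shown below (the exact launch mechanism varies between systems; on some installations the MPIRUN and MDRUN environment variables may need to point at the MPI launcher and the parallel mdrun binary, and the names shown here are only assumptions):

# Tell g_tune_pme which launcher and mdrun binary to use for the benchmark runs.
export MPIRUN=mpiexec
export MDRUN=mdrun_mpi

# Benchmark short runs with different numbers of dedicated PME processors
# and report the fastest setting.
g_tune_pme -np 64 -s run.tpr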

Running GROMACS on Checkers

The sample batch job script below illustrates some of the key features necessary for running GROMACS on Checkers.

#!/bin/bash
#PBS -S /bin/bash
#PBS -l procs=2
#PBS -l walltime=12:00:00

# Run from the directory in which the job was submitted.
cd $PBS_O_WORKDIR

# Set up the environment for the default GROMACS version.
module load gromacs

# Run the single-precision MPI mdrun on the processors assigned by TORQUE.
mpiexec mdrun_mpi -v -s em0a -o em1 -c em1 -g emlog

In particular, note the use of the module command to set up the GROMACS environment. Use module show gromacs to see the changes the module makes to your environment and which version of GROMACS is being configured. For more information, see the WestGrid documentation on the module command.
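
For example (4.5.4 is one of the versions listed in the table above):

module avail gromacs       # list the GROMACS modules installed on the system
module show gromacs        # show what the default GROMACS module sets up
module load gromacs/4.5.4  # load a specific version instead of the default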

On Checkers the mpiexec command does not require an argument telling it how many processors to use. It will determine this information automatically from the TORQUE (PBS) environment in which it is running.

Note the _mpi suffix on mdrun, indicating that an MPI version is being run. By default this is the single-precision mdrun; the double-precision serial and MPI versions are mdrun_d and mdrun_mpi_d, respectively.

Running GROMACS on Glacier

Some details about running GROMACS on Glacier are given on a separate page. However, Glacier has only the old 3.2 and 3.3 versions of GROMACS, so it is not likely to be of much current interest to GROMACS users.

Running GROMACS on Lattice

Since GROMACS typically doesn't require very much memory per process and Lattice has less memory per core than most of the other WestGrid systems, Lattice is one of the most appropriate WestGrid systems on which to run GROMACS. However, accounts are not set up on Lattice automatically.  See the Lattice QuickStart Guide for (simple) instructions for applying for an account.

Here is a sample batch job script for running a single-precision parallel calculation on Lattice.  A complete example using this script is available in the directory /global/software/gromacs/examples/dppc_pme on Lattice. 

#!/bin/bash
#PBS -S /bin/bash

# Lattice version of GROMACS test - no npme specified
# DSP 2008-12-15, 2009-01-07, 2010-01-20, 2010-09-14, 2012-07-04.

cd $PBS_O_WORKDIR
echo "Current working directory is `pwd`"

echo "Node file: $PBS_NODEFILE :"
echo "---------------------"
cat $PBS_NODEFILE
echo "---------------------"
NUM_PROCS=`/bin/awk 'END {print NR}' $PBS_NODEFILE`
echo "Running on $NUM_PROCS processors."
echo ""

echo "Starting run at: `date`"

BINDIR=/global/software/gromacs/gromacs454_intel1202_fftw_mkl/bin
. ${BINDIR}/GMXRC
PRECISION=_s
PARALLEL=_mpi
${BINDIR}/grompp${PRECISION} -f grompp.mdp \
    -p topol.top -c conf.gro -o run.tpr
time mpiexec -n ${NUM_PROCS} ${BINDIR}/mdrun${PARALLEL}${PRECISION} \
    -s run.tpr -o run.trr -c conf.gro -v -g run.log

echo "Job finished at: `date`"

As indicated in the Lattice QuickStart Guide, the Lattice cluster is intended for multi-node parallel jobs that should be submitted with a resource request of the form -l nodes=nn:ppn=8, where nn is the number of nodes required. To avoid problems with the batch system terminating jobs due to exceeding the default memory allocation, use -l mem=mmm, where mmm is 10gb per node requested.  So, for example, to run a GROMACS job on 128 cores, use -l nodes=16:ppn=8,mem=160gb on the qsub command line or in the equivalent #PBS directive in your batch job script.
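
The corresponding directives in a job script might look like the following (the walltime shown is only an illustration; choose one appropriate to your simulation):

#PBS -S /bin/bash
#PBS -l nodes=16:ppn=8
#PBS -l mem=160gb
#PBS -l walltime=24:00:00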

Running GROMACS on Parallel

Parallel is intended for large-scale parallel computations (hundreds of cores). It shares file systems and software with Lattice, so the same sample job script shown above for Lattice should also work on Parallel. As on Lattice, accounts are not set up automatically; see the Parallel QuickStart Guide for instructions on applying for an account. Consider requesting a Parallel account if your GROMACS calculations still perform efficiently when run on more than a hundred cores.

As indicated in the Parallel QuickStart Guide, jobs on the Parallel cluster should request full nodes by using a resource request of the form -l nodes=nn:ppn=12, where nn is the number of nodes required. To avoid problems with the batch system terminating jobs due to exceeding the default memory allocation, use -l mem=mmm, where mmm is 22gb per node requested.  So, for example, to run a GROMACS job on 240 cores, use -l nodes=20:ppn=12,mem=440gb on the qsub command line or in the equivalent #PBS directive in your batch job script.
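
Equivalently, on the qsub command line (gromacs_parallel.pbs is a hypothetical script name and the walltime only an illustration):

qsub -l nodes=20:ppn=12,mem=440gb -l walltime=24:00:00 gromacs_parallel.pbs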

 


Updated 2013-04-08.