Tantalus QuickStart Guide

About this QuickStart Guide

This QuickStart guide gives a brief overview of the WestGrid Tantalus facility, highlighting some of the features that distinguish it from other WestGrid resources. It is intended to be read by new WestGrid account holders and by current users considering whether to move to the Tantalus system.

Introduction

Tantalus is a Cray XD1 Linux system that provides two capabilities that distinguish it from other WestGrid systems. It has programmable hardware, in the form of on-board field programmable gate arrays (FPGAs). The other special feature is its extremely fast interconnect, with very low MPI latency.

Unlike other WestGrid systems, Tantalus is accessible only by special request. It is intended for researchers interested in exploring high-bandwidth FPGA solutions. It was purchased with funds reserved for the evaluation of advanced technologies. Please email support@westgrid.ca if you are interested in getting access to Tantalus.

Hardware

Processors

The Tantalus system consists of 6 nodes. Each node has two single-core Opteron CPUs (2.2GHz), one Xilinx Virtex II Pro FPGA (XC2VP50 device, FF1152 package), and 4GB of memory.

Interconnect

The nodes are connected by the extremely fast RapidArray interconnect (1.6-1.7µs MPI latency).

Storage

The home directories on Tantalus are shared with the WestGrid storage facility (Gridstore, Blackhole) and the Robson cluster. The Gridstore data and vault file systems are available directly on the Tantalus login node (t6-355), but not on the other nodes. Therefore, compute jobs should not write to the data and vault directories directly.

Software

System software

Tantalus runs a 64-bit Linux operating system.
TORQUE (2.0.0.p8) and Moab (4.5.0) are used to handle batch job submissions.
For parallel MPI jobs, a version of MPICH 1.2.6 optimized for the RapidArray interconnect is used (supporting the Portland Group compilers).
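
As a brief illustration, the state of the batch system can be checked from the login node with the standard TORQUE and Moab commands (assuming both are on your default path):

qstat -a     # list all jobs known to the TORQUE server
showq        # Moab's view of running, idle and blocked jobs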

Compilers

Compilers available are:

Portland Group compilers: pgcc, pgCC, pgf90, pgf77 (version 6.1-3)

GNU compilers: gcc, g++, g77 (version 3.3.3)

GNU version 4 compilers: gcc4, g++4, gfortran (version 4.0.2)

Parallel MPI programs should be compiled with mpicc, mpicxx, mpif90 or mpif77.
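
For example, a C or Fortran MPI program could be compiled with the wrapper scripts as follows (myprog_mpi.c and myprog_mpi.f90 are just illustrative file names):

mpicc -O2 -o myprog_mpi myprog_mpi.c
mpif90 -O2 -o myprog_mpi myprog_mpi.f90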

Mathematical libraries

Mathematical libraries available include:

ACML (AMD Core Math Library - a full suite of the BLAS and LAPACK linear algebra routines, FFT routines and others)

Link with -lacml.
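
For example, a Fortran program calling ACML routines might be linked as follows (solver.f90 is a hypothetical file name):

pgf90 -o solver solver.f90 -lacml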

BLAS

One can use the ACML library, linking with -lacml as above, or alternatively use the ATLAS BLAS library by linking with -lblas.

LAPACK

One can use the ACML library, linking with -lacml as above, or alternatively use the standard LAPACK library together with the ATLAS BLAS by linking with -llapack -lblas.
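
As an illustration, either of the following link lines could be used for a C program calling LAPACK routines (eigen.c is a hypothetical file name):

pgcc -o eigen eigen.c -lacml              # LAPACK routines from ACML
pgcc -o eigen eigen.c -llapack -lblas     # LAPACK with the ATLAS BLAS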

FFTW (version 2.1.5)

Link using -lrfftw_mpi -lfftw_mpi -lrfftw -lfftw (or subset) for double precision or -lsfftw, etc. for single precision.
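
For example, a double-precision MPI FFTW program might be built as follows (fft_test.c is an illustrative file name):

mpicc -o fft_test fft_test.c -lrfftw_mpi -lfftw_mpi -lrfftw -lfftw -lm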

Application software

A limited selection of application software is available. For example, the NAMD molecular dynamics package has been tried. Write to support@westgrid.ca to discuss your Tantalus software requirements.

Using Tantalus

To log in to Tantalus, connect to tantalus.westgrid.ca using an ssh (secure shell) client. For general information about connecting and setting up your environment, see the QuickStart Guide for New Users.
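
For example, from a machine with a command-line ssh client (replace username with your WestGrid user name):

ssh username@tantalus.westgrid.ca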

Parallel MPI programs are compiled with the scripts mpicc, mpicxx, mpif90 or mpif77 and run on Tantalus using an mpirun command line of the form:

mpirun -machinefile <hostsfile> -n <nprocs> <executable> <arguments>

as shown in the sample batch job script below.  Batch jobs are submitted with qsub, as described for other WestGrid systems on the Running Jobs page.

#!/bin/bash
#
# job name (default is name of pbs script file)
#PBS -N mpi
#
# path/filename for standard output
#PBS -o parallel.out
#
# path/filename for standard error
#PBS -e parallel.err
#
# requested walltime
#PBS -l walltime=36:00:00
#
# send me mail when job ends or aborts with an error
#PBS -m ae
#
# send mail to address :
#PBS -M poweruser@nowhere.ca
#
# do not rerun this job if it fails
#PBS -r n
#
# specify number of nodes and (optionally) cpus per node
#PBS -l nodes=12
#
###
cd $PBS_O_WORKDIR
umask 027
nodes=`cat $PBS_NODEFILE | tr -s '\n' ' '`
NUM_NODES=`cat $PBS_NODEFILE | sort -u | wc -l`
JOBINFO=mpi.np${PBS_NCPUS}_n${NUM_NODES}.${PBS_JOBID}
date > $JOBINFO
echo "Current working directory is `pwd`" >> $JOBINFO
echo "nodes: $nodes" >> $JOBINFO
echo "Running on $PBS_NCPUS processors on $NUM_NODES nodes." >> $JOBINFO
mpirun -machinefile $PBS_NODEFILE -n $PBS_NCPUS ./myprog_mpi < mympiinput >> $JOBINFO 2>&1
echo "Job finished at `date`" >> $JOBINFO

The maximum number of processors that can be requested is 12 (see the "#PBS -l nodes=12" line in the example). Alternatively, the form "#PBS -l nodes=n:ppn=2" can be used, which requests 2n processors.
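
If the script above were saved in a file named mpi.pbs (an arbitrary name used here for illustration), it would be submitted with:

qsub mpi.pbs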

The priority of a job is determined by the user's past usage. Waiting time or ranking in the queue has no effect on priority.


Updated 2009-10-01.