QuickStart Guide to Orcinus

About this QuickStart Guide

This QuickStart guide gives a brief overview of the WestGrid Orcinus facility, highlighting some of the features that distinguish it from other WestGrid resources. It is intended to be read by new WestGrid account holders and by current users considering whether to move to the Orcinus system. For more detailed information about the Orcinus hardware and performance characteristics, available software, usage policies and how to log in and run jobs, follow the links given below.

Introduction

Orcinus is a cluster of HP ProLiant blade servers with 9600 cores and an InfiniBand interconnect, running Red Hat 5 on all infrastructure nodes and CentOS 5 on all compute nodes. It is intended for both serial and parallel applications. It was purchased in two phases: the Phase One hardware was installed in 2009 and Phase Two in 2011.

Hardware

Processors

Phase One of the Orcinus cluster consists of 12 c7000 chassis, each containing 16 dual-density BL2x220 Generation 5 blades. There are 2 compute servers per blade (an A node and a B node). Every node has 2 sockets, each containing an Intel Xeon E5450 quad-core processor running at 3.0GHz. In total there are 3072 Phase One cores (12 chassis × 16 blades × 2 nodes × 8 cores). The 8 cores in a single Phase One node share 16GB of RAM.

Phase Two consists of 17 c7000 chassis, each containing 16 dual-density BL2x220 Generation 6 blades. Again, there are 2 compute servers per blade (an A node and a B node). Every node has 2 sockets, each containing an Intel Xeon X5650 six-core processor running at 2.66GHz. In total there are 6528 Phase Two cores (17 chassis × 16 blades × 2 nodes × 12 cores). The 12 cores in a single Phase Two node share 24GB of RAM.

The total number of cores available is 9600.

Interconnect

All Orcinus nodes are connected via an InfiniBand network fabric with a 2:1 blocking factor (1:1 among all A nodes or all B nodes within a single chassis). The Phase One hardware uses a 288-port Voltaire DDR (20Gb/s) Grid Director 2012 switch and Phase Two uses a 324-port Voltaire QDR (40Gb/s) Grid Director 4700 switch. To maintain a single fabric across the entire cluster (and consequently to mount shared file systems everywhere), the two switches are linked via a 14-port trunk. However, to ensure that parallel jobs do not run in a mixed InfiniBand environment, the cluster is sectioned into DDR and QDR partitions. (See the "Batch jobs" section below.)

Storage

Orcinus uses Lustre, a massively parallel distributed file system, for /home (the location of all users' home directories) and /global/scratch (temporary space usually reserved for running jobs). The file system characteristics are as follows:

 

/home:

  • Size: 86TB
  • Quota (per user): 250GB
  • Quota (# of files per user): 200,000
  • Backup Policy: twice per week

/global/scratch:

  • Size: 342TB
  • Quota (per user): 500GB
  • Quota (# of files per user): 250,000
  • Backup Policy: NO BACKUP

 

To check your current usage, consult the daily reports located here, which list the top 150 users on each file system:

 

/global/system/info/DU_home_info

/global/system/info/DU_scratch_info
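
To find your own entry quickly, you can search these reports for your username; this assumes the reports are plain text files that include each username on its own line:

grep "$USER" /global/system/info/DU_home_info
grep "$USER" /global/system/info/DU_scratch_info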

 

With regard to local storage, the Phase One blades have 98GB of /scratch disk space, and the Phase Two blades have 218GB.

Software

See the WestGrid software page for a list of software installed on Orcinus.

Please write to support@westgrid.ca if there is additional software that you would like installed.

Using Orcinus

To log in to Orcinus, connect to orcinus.westgrid.ca using an ssh (secure shell) client. For more information about connecting and setting up your environment, see the QuickStart Guide for New Users.
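
For example, from a terminal on your own machine (replace username with your WestGrid username):

ssh username@orcinus.westgrid.ca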

The general WestGrid Programming and Running Jobs pages also apply to the Orcinus cluster.  Orcinus-specific pages may be developed as time permits.

Some guidelines for running and monitoring batch jobs on Orcinus are given here.

Batch jobs

On most WestGrid systems, the TORQUE qsub command is used to submit jobs, as explained on the Running Jobs page.  Although TORQUE is also used on Orcinus, the Moab scheduler's msub command can also be used for submitting batch jobs.
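
As an illustration, a minimal serial batch job script might look like the following sketch; the script name myjob.pbs, the program name ./myprog and the resource values are placeholders, not Orcinus requirements:

#!/bin/bash
# Request 3 hours of walltime and the default per-process memory
#PBS -l walltime=03:00:00
#PBS -l pmem=756mb
# Give the job a name
#PBS -N myjob

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR
./myprog

The script can then be submitted with either command:

qsub myjob.pbs
msub myjob.pbs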

The Moab showq command can be used to monitor jobs. By default, showq shows information about all the jobs on the system.  For a large cluster such as Orcinus, this can be a long list, so you may prefer to use showq in the form

showq -u username

to limit the output to information about the jobs belonging to the given username.  As on the Glacier cluster, an alternative is to use the qsort utility, which defaults to showing just your own jobs.

The default queue limits are:

  • pmem = 756mb
  • walltime = 03:00:00  (that is, 3 hours)

These are the values used for these resource limits if your batch job submission script or qsub/msub command-line arguments do not specify a memory or elapsed time (walltime) limit.

The maximum walltime limit is 240:00:00 (10 days).

The maximum number of jobs that a user may have queued to run is 1000.
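
If the defaults do not suit your job, both limits can be requested explicitly in the submission script or on the command line; the values below are illustrative only:

#PBS -l pmem=2gb
#PBS -l walltime=24:00:00

or, equivalently, on the command line:

qsub -l pmem=2gb,walltime=24:00:00 myjob.pbs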

Selecting processor type

Although Orcinus is composed of two different types of nodes, jobs are never assigned to a mixture of node types.  The partition resource parameter can be specified on the qsub command line or in a #PBS directive in the job script to explicitly select either the 8-core or 12-core nodes.

To choose the older 8-core 3 GHz nodes, use:

#PBS -l partition=DDR

To select the newer 12-core 2.66 GHz nodes, use:

#PBS -l partition=QDR
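
As a sketch, an MPI job intended for the QDR partition could request whole 12-core Phase Two nodes as follows; the node count, walltime, program name and the mpiexec launcher are assumptions to adapt to your own application:

#!/bin/bash
# Request 2 whole Phase Two nodes (12 cores each) on the QDR partition
#PBS -l partition=QDR
#PBS -l nodes=2:ppn=12
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR
# Launch one MPI process per requested core
mpiexec ./my_mpi_program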

 

Working interactively

Some interactive work is allowed on the Orcinus login machines for editing files, compiling programs, limited debugging, etc.

In addition, there are two 8-core compute nodes reserved for short debugging jobs.  To access these nodes, use the -l qos=debug resource request on the qsub/msub command line or in directives in your batch job submission script:

#PBS -l qos=debug
#PBS -l walltime=mm:ss

Note: for a serial job (using a single core), the maximum walltime is 45:00 (45 minutes). Jobs using up to 2 compute nodes (a total of 16 cores) are allowed through the debug quality-of-service request, but there is a limit of 14400 processor-seconds in total: the number of processors times the walltime limit, in seconds, must not exceed 14400. So, for example, for a 16-core job the maximum allowed walltime is 15 minutes.
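
For example, a 16-core debug job at the processor-seconds limit could be requested with directives like these (16 cores × 900 seconds = 14400):

#PBS -l qos=debug
#PBS -l nodes=2:ppn=8
#PBS -l walltime=00:15:00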

Another possibility for interactive work is an interactive batch job, in which the batch job system is used to reserve processors for interactive use.  At the moment this interactive job feature is enabled only on the Seawolf1 login node.

To start an interactive batch job session, use a qsub command of the form:

qsub -I -l walltime=mm:ss,qos=debug

where, as above, the processors*walltime limit must not exceed 14400 processor-seconds.
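
For example, a single-core interactive debug session of 30 minutes (well within both the 45-minute serial limit and the processor-seconds limit) could be started with:

qsub -I -l walltime=30:00,qos=debug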

Storage Policies

Please see the limits on disk storage mentioned in the Storage section of the Hardware description above.


Updated 2011-10-31.