Checkers QuickStart Guide

About this QuickStart Guide

This QuickStart guide gives a brief overview of the WestGrid Checkers facility, highlighting some of the features that distinguish it from other WestGrid resources. It is intended to be read by new WestGrid account holders and by current users considering whether to move to the Checkers system. For more detailed information about the Checkers hardware and performance characteristics, available software, usage policies and how to log in and run jobs, follow the links given below.

Introduction

Checkers is an SGI Altix XE320-based cluster with 1280 cores connected by InfiniBand, running the Scientific Linux operating system.

Hardware

Processors

The Checkers facility consists of a login node, a storage server and 160 compute nodes providing 1280 processor cores for computations. Each 8-core compute node has 2 sockets, each containing an Intel Xeon L5420 quad-core processor running at 2.5 GHz. Each compute node has 16 GB of memory (DDR2 RAM) that is shared among the 8 cores on that node.
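As a rough illustration of what these figures mean when sizing a job, the following Python sketch works out the total core count and the memory available to each process on a node. The function names are hypothetical, and the per-process figure is an upper bound that assumes memory is divided evenly and ignores whatever the operating system itself uses.

```python
# Figures taken from the hardware description above.
COMPUTE_NODES = 160
CORES_PER_NODE = 8
MEMORY_PER_NODE_GB = 16


def total_cores(nodes=COMPUTE_NODES, cores_per_node=CORES_PER_NODE):
    """Total number of cores available for computation."""
    return nodes * cores_per_node


def memory_per_process_gb(processes_per_node,
                          memory_per_node_gb=MEMORY_PER_NODE_GB):
    """Upper bound on memory per process if a node's RAM is split evenly.

    Assumption: no allowance is made for operating system overhead.
    """
    if not 1 <= processes_per_node <= CORES_PER_NODE:
        raise ValueError("processes_per_node must be between 1 and 8")
    return memory_per_node_gb / processes_per_node


print(total_cores())             # 1280
print(memory_per_process_gb(8))  # 2.0
print(memory_per_process_gb(4))  # 4.0
```

A job that needs more than about 2 GB per process would therefore have to run fewer than 8 processes per node.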

Graphics Processing Units

There are four NVIDIA Quadro Plex 2200 S4 units connected to eight Checkers nodes, numbered cl1n151 to cl1n158. Each Quadro Plex 2200 S4 contains four Quadro FX 5800 graphics processing units (GPUs), each with 4 GB of memory, for 16 GPUs in total. Thus, each of these eight nodes has access to two GPUs, available for interactive visualization or GPU computation.

Interconnect

The compute nodes are connected with double data rate (DDR) InfiniBand (providing an aggregate 20 Gb/s) with 2:1 blocking, using a 288-port Voltaire Grid Director ISR 2012 switch.

Storage

Storage is provided by an SGI IS10k storage array, consisting of 100 one-TB SATA drives configured as RAID 6 (8+2, dual parity). This gives about 80 TB of usable space. The storage is managed by an 8-core SGI Altix 450 NFS server.
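The 80 TB figure follows from the RAID 6 layout: in each group of 10 drives, 8 hold data and 2 hold parity. A minimal Python sketch of that arithmetic is given here; the function name is hypothetical, and it assumes that all 100 drives are organized into complete 8+2 groups with no hot spares, which the description above does not state explicitly.

```python
def raid6_usable_tb(drives, drive_tb, data_per_group=8, parity_per_group=2):
    """Usable capacity of drives arranged in RAID 6 groups.

    Assumption: every drive belongs to a complete group (no hot spares),
    and filesystem overhead is ignored.
    """
    group_size = data_per_group + parity_per_group
    full_groups = drives // group_size
    return full_groups * data_per_group * drive_tb


print(raid6_usable_tb(drives=100, drive_tb=1))  # 80
```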

There are also two 250 GB SATA 7200 RPM drives per node for local storage. Just under 400 GB is available to users as /scratch on each compute node.

Filesystems

File System       Total Size   Type                         Maximum per user without RAC allocation
/home             7.3 TB       Global filesystem            100 GB
/global/scratch   22 TB        Global filesystem            1000 GB
/scratch          250 GB       Temporary local filesystem   Whole filesystem is available, but it is deleted when your job ends

 

Software

See the main WestGrid software web page for tables showing the installed software on Checkers and other WestGrid systems, including information about the operating system and compilers.

Software on Checkers is often loaded and run through the modules software environment.

 

Using Checkers

To log in to Checkers, connect to checkers.westgrid.ca using an ssh (secure shell) client. For more information about connecting and setting up your environment, see the QuickStart Guide for New Users.

For information on compilers, parallel programming and debugging, please see the Checkers programming page.

For information on using the Checkers GPUs for remote visualization, please refer to the remote visualization page.

For information on using the Checkers GPUs for general-purpose GPU programming (GPU computation), please refer to the GPU computation page.

The following limits are currently enforced on the system:

  • Default walltime is 3 hours;
  • Maximum walltime is 7 days;
  • Maximum number of running jobs per user is 64;
  • Maximum number of queued jobs gaining priority over time is 5 per user;
  • There is a maximum of 30720 processor-hours per job. This value is the number of processors multiplied by the walltime, as requested in the job.
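To see how the walltime and processor-hour limits interact, the following Python sketch checks a job request against both and computes the largest processor count allowed for a given walltime. The function names are hypothetical and this is only an illustration of the arithmetic, not the scheduler's actual logic; it also assumes a job cannot request more than the 1280 cores in the cluster.

```python
# Limits taken from the list above.
MAX_WALLTIME_HOURS = 7 * 24      # 7 days
MAX_PROCESSOR_HOURS = 30720      # processors x walltime, per job
TOTAL_CORES = 1280               # assumption: a job cannot exceed the cluster size


def job_within_limits(processors, walltime_hours):
    """Return True if the request respects both per-job limits."""
    if processors < 1 or walltime_hours <= 0:
        raise ValueError("processors and walltime_hours must be positive")
    return (processors <= TOTAL_CORES
            and walltime_hours <= MAX_WALLTIME_HOURS
            and processors * walltime_hours <= MAX_PROCESSOR_HOURS)


def max_processors_for(walltime_hours):
    """Largest processor count allowed for the requested walltime."""
    if not 0 < walltime_hours <= MAX_WALLTIME_HOURS:
        raise ValueError("walltime must be between 0 and 168 hours")
    return min(TOTAL_CORES, int(MAX_PROCESSOR_HOURS // walltime_hours))


print(max_processors_for(24))       # 1280
print(max_processors_for(168))      # 182
print(job_within_limits(256, 120))  # True  (exactly 30720 processor-hours)
print(job_within_limits(256, 121))  # False
```

In other words, a job using every core can request at most 24 hours, while a job requesting the full 7 days is limited to 182 processors.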

 


Updated 2011-05-31.