Gridstore QuickStart Guide

Gridstore is no longer available to users

The gridstore storage facility no longer exists. Its storage role has been taken over by silo.

Gridstore still exists, but only as a backup server for bugaboo and other WestGrid servers at SFU. Users no longer have access to it, with the exception that some home directories are still there and those are accessed through some other system.


About this QuickStart Guide

This QuickStart guide gives a brief overview of the WestGrid Gridstore facility, highlighting some of its key features. It is intended for new WestGrid account holders and for current users who are considering using the Gridstore system.

For more detailed information about the Gridstore hardware and performance characteristics, available software, usage policies and how to log in and transfer data, follow the links given below.

Introduction

Gridstore has been phased out as a storage facility at WestGrid and can no longer be used for storage.

NOTE: gridstore hardware is now off-maintenance, so gridstore is being re-purposed.  Please use silo instead of gridstore for new data storage.  Existing data from gridstore was moved to silo.

Hardware

Servers

Gridstore is an IBM p570 server with 8 Power5 processors and 16 GB of RAM, running the IBM AIX 5L 5.3 operating system.

As of April 19, 2010 the Blackhole server is no longer available.

Storage

Disk subsystem (attached to the Gridstore server)

About 100 TB of disk storage is available in a RAID 5 configuration through dual FAStT 900 disk controllers. Of the total disk space, about 16 TB (mostly for the /home file system) is provided by one hundred and forty 136 GB fibre channel drives and the remaining 93 TB (mostly for the /data, /vault and /projects file systems) is on one hundred and sixty 700 GB SATA drives.

Tape subsystem

Gridstore currently provides two IBM tape libraries with a total capacity of about 600 TB.

Hierarchical Storage Management (HSM) is used to manage the data under control of Tivoli Storage Manager (TSM) software, which means that the system automatically migrates data to tape if it has not been used for a certain period of time. If data residing on tape is later reused, it is automatically moved back to the disk subsystem. Moving data between the fast disks and the slower tapes takes time, so users may experience delays when accessing data that has been migrated to tape.

Some technical details regarding the tape libraries:

IBM LTO2 3584 tape library:

  • 1000 x 200 GB tapes (without compression): 200 TB
  • 15 x IBM LTOGen2 drives

IBM LTO3 3584 tape library:

  • 1000 x 400 GB tapes (without compression): 400 TB
  • 15 x IBM LTOGen3 drives
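
As a quick check of the capacity figures above, the uncompressed totals can be recomputed from the tape counts and per-tape sizes. This is only a worked sketch; the function name library_tb is made up for illustration and decimal units (1 TB = 1000 GB) are assumed.

```shell
#!/bin/sh
# library_tb TAPES GB_PER_TAPE: uncompressed capacity in TB.
# Assumption: decimal units, 1 TB = 1000 GB.
library_tb() {
    echo $(( $1 * $2 / 1000 ))
}

lto2=$(library_tb 1000 200)   # LTO2 library: 1000 tapes of 200 GB
lto3=$(library_tb 1000 400)   # LTO3 library: 1000 tapes of 400 GB

echo "LTO2: ${lto2} TB, LTO3: ${lto3} TB, total: $(( lto2 + lto3 )) TB"
# prints: LTO2: 200 TB, LTO3: 400 TB, total: 600 TB
```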

 

Using Gridstore

Roles of the Servers

Since the gridstore facility is being phased out, the server should only be used to move files off the storage facility, not to store new data.

gridstore.westgrid.ca

This is the file server itself. For large file transfers, accessing the storage facility via Gridstore will give you the best performance. To transfer files to or from the file server, you can use scp, or grid-based tools, such as gcp.
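
A minimal sketch of copying a file off the file server with scp follows. The helper only prints the command so that it can be inspected before it is run. The user name jdoe, the file name results.tar and the function name fetch_cmd are hypothetical; the -p option asks scp to preserve modification times and modes.

```shell
#!/bin/sh
# Hypothetical helper: build (and print, not run) an scp command that
# copies one file from Gridstore to a local directory.
GRIDSTORE_HOST="gridstore.westgrid.ca"

# fetch_cmd USER REMOTE_PATH LOCAL_DIR
fetch_cmd() {
    printf 'scp -p %s@%s:%s %s\n' "$1" "$GRIDSTORE_HOST" "$2" "$3"
}

# Example: a file in the "data" directory, relative to the remote home.
fetch_cmd jdoe data/results.tar .
# prints: scp -p jdoe@gridstore.westgrid.ca:data/results.tar .
```

Remote paths without a leading slash are taken relative to the home directory on the server, which is why data/results.tar refers to the "data" space.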

Although you may connect to Gridstore with ssh, the shell on Gridstore is restricted; it can only be used for managing and downloading files. You cannot run your own programs or scripts on Gridstore. To see a list of commands that are available under the restricted shell, type:

ls -1 /usr/local/rbin

Gridstore is accessible via ssh at gridstore.westgrid.ca. Using Gridstore requires special considerations in order to obtain good performance. It is important that researchers understand the various file systems on Gridstore in order to use it appropriately.

File Systems

Three file systems with different characteristics are available to each user on the storage facility:

/home/username

Characteristics: high-performance; backed up; no HSM.

This file system is reserved exclusively for jobs running on snowpatch, robson and tantalus. No other usage is allowed.

This is your home directory on Gridstore and Blackhole, that is, the directory where you end up after you log in. The term "no HSM" means that this file system is not managed by the Hierarchical Storage Management software. As a consequence, none of the files stored in your home directory will be moved to tape to make room in the file system. This means that disk space is limited in the home file system, but file access is fast. Your home directory is for files that you need to access frequently and that must never be moved to tape. The home file system is not meant for archiving large amounts of data! Before you transfer large amounts of data into your home directory, make sure that you will not fill up the file system. The command:

df -k /home

will show you under the "Free" column how much disk space is available (in kB).
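
The check described above can be scripted. The sketch below is not a WestGrid tool; the function names free_kb and fits are made up. It assumes that df accepts the POSIX -P option, which prints one line per file system with the free space in the fourth column (labelled "Available" rather than "Free" in that format), and that du -sk reports a directory's size in kB.

```shell
#!/bin/sh
# free_kb PATH: print the free space, in kB, of the file system holding PATH.
# Assumption: "df -Pk" is available (POSIX output format).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# fits SRC DEST: succeed if the size of SRC is smaller than the
# free space on the file system holding DEST.
fits() {
    need=$(du -sk "$1" | awk '{ print $1 }')
    free=$(free_kb "$2")
    [ "$need" -lt "$free" ]
}

# Usage (hypothetical directory name):
#   if fits ./results /home; then echo "enough room"; else echo "too large"; fi
```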

The home directory is dedicated to supporting jobs running on the compute clusters located at SFU (robson, snowpatch, etc.). It is not meant for long-term storage. Please move any input or output files from jobs to the data (or vault, if required) file systems as soon as they are no longer needed by running jobs.

Any user who is planning to store more than 250 GB of data in the home directory is requested to contact support@westgrid.ca before doing so.

/data/username

Characteristics: high-performance when accessing files from disk, lesser performance when accessing files from tape; no backup; HSM.

Move all files off the /data file system.

You can reach your "data" space by typing cd data from your home directory, for example, directly after login. This directory should be used for files that you could, in an emergency, regenerate by other means, as files in this directory are not backed up. Thus, in the rare case of damage to the disk system and/or the tape system, these files cannot be recovered from the storage facility.

Files under the data directory are HSM managed. This means that, if the data file system on disk gets filled to the limit, files get moved to tape automatically (beginning with files that have not been accessed recently). This is done completely transparently to the user. That is, if you issue an ls command in your data directory you will see all your files even if some of them have actually been moved to tape. The only effect that you will notice when accessing a file that was actually moved to tape is that it will take longer to access the data (due to the time needed to load the tape and write the data back to disk).

/vault/username

Characteristics: high-performance when accessing files from disk, lesser performance when accessing files from tape; backed up; HSM.

Move all files off the /vault file system.

You can reach your "vault" space by typing cd vault from your home directory, for example, directly after login. This file system is very similar to the data file system, the only difference being that files under the "vault" directory tree are additionally backed up. Thus, even if the original version gets destroyed (usually due to user error, for example if you accidentally delete a file), that file can be retrieved from backups on the storage facility, as with the home file system.

The vault file system is HSM managed, that is, the system will move files to tape if the file system is in danger of filling up. There is a catch, however: before a file can be moved to tape, it must be backed up. Backups run overnight. Therefore, you must not transfer so many files into the vault file system within a single day that you fill up the file system, because the HSM system cannot move your files to tape! Hence, check using the

df -k /vault

command whether there is enough free space available before starting a large file transfer. If there is not enough space available, you must split your transfer into several parts and wait a day before you transfer the next part.
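
To estimate how many daily parts a transfer needs, divide the transfer size by the free space and round up. The sketch below is a rough planning aid only. The function name parts_needed is made up, and it assumes that about the same amount of space is free again each day once the overnight backup and migration have run. In practice some headroom should be left as well.

```shell
#!/bin/sh
# parts_needed TOTAL_KB FREE_KB: number of daily parts for a transfer
# of TOTAL_KB when FREE_KB is free per day (ceiling division).
# Assumption: the same FREE_KB is available again each day.
parts_needed() {
    total=$1
    free=$2
    if [ "$free" -le 0 ]; then
        echo "no free space available" >&2
        return 1
    fi
    echo $(( (total + free - 1) / free ))
}

# Example: 2.5 TB to transfer, 1 TB free (both in kB, as reported by df -k).
parts_needed 2500000000 1000000000
# prints: 3
```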

File space in the vault file system is rather expensive because of the additional backups. Before storing data in vault, ask yourself these questions:

  • What are you intending to do with these data?
  • Why do they need to be backed up?
  • Can you regenerate these data?

Any user who is planning to store more than 500 GB of data in the vault file system is requested to contact support@westgrid.ca before doing so.

Summary and additional important points

  • For best performance when copying large amounts of data to the storage facility, use gcp from Gridstore, not scp.
  • Storing and accessing files in HSM managed file systems (data, vault) requires additional considerations:
    Accessing files that are migrated to tape will cause the file to be written back to disk. Thus, before accessing files that were stored in data and/or vault issue the command dsmls <filename> on Gridstore. If this shows an "m" in the File State column, the file has been migrated to tape. In that case you can recall the file by issuing a dsmrecall <filename> command.
  • Do not store many small files in data and/or vault. Create one tar archive instead.
  • Do not compress files, in particular not in vault: this will not save space because files are compressed anyway when they are migrated to tape. Compressing them yourself will actually create a new file and the system will keep backup copies of the old and the new file, effectively increasing the required space.
  • As a general rule: if you are not sure how to deal with files in HSM managed file systems (data and vault), please contact support@westgrid.ca first.
  • Information on grid tools such as gcp, globus-url-copy and grid-proxy-init can be found on the Grid Tools page.
  • Disk and tape capacity is limited. Therefore it is expected that soon there will be limits or quotas in place, and you will have to apply to the Resource Allocation Committee for more than a basic allotment.
  • As a courtesy to other users, and to avoid the system filling up, please delete files you no longer need.
  • If you plan to transfer a large amount of data to the storage facility, please contact support@westgrid.ca in advance to make sure there is room for it.
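
Two of the points above, bundling small files and recalling migrated files before use, can be sketched as shell helpers. The function names bundle and migrated_files are made up. The parsing of dsmls output is an assumption: it takes the File State to be the fourth whitespace-separated column and the file name the fifth, which should be checked against real dsmls output before relying on it.

```shell
#!/bin/sh
# bundle DIR: pack DIR into DIR.tar in the current directory.
# The archive is deliberately left uncompressed, since files are
# compressed anyway when they are migrated to tape.
bundle() {
    dir=${1%/}
    tar -cf "$dir.tar" "$dir"
}

# migrated_files: read dsmls output on standard input and print the
# names of files whose File State is "m" (migrated to tape).
# Assumption: File State is column 4 and the file name is column 5.
migrated_files() {
    awk '$4 == "m" { print $5 }'
}

# Usage on Gridstore (not run here):
#   dsmls * | migrated_files | while read -r f; do dsmrecall "$f"; done
```

File names that contain spaces would need different handling, since the awk filter splits on whitespace.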

Updated 2010-12-08.