ICHEC

TECHNICAL REPORT

Getting Started :: First steps to using ICHEC resources

Getting Started, an introduction to ICHEC systems

Dr. Michael Browne, ICHEC Computational Scientist

Contents

Introduction
Who can use the systems
Available hardware
Project classes
Connecting to ICHEC systems
Available software
Batch job submission
Where can I find more help?

Introduction

ICHEC operates a number of High-Performance Computing (HPC) systems for use by the Irish research community. This is a brief step-by-step guide to getting access to these systems and starting to run HPC tasks. It is intended to address common questions and is not an exhaustive guide. If you still have questions or feel you have unusual requirements, by all means use the contact details below to get in touch.

Who can use the systems?

Access to ICHEC systems is available free of charge to members of the Irish 3rd level research community, subject to an application process. The application requirements depend on the level of resources requested.

Access is centred around projects. Staff of 3rd level bodies apply to act as Principal Investigator (PI) of so-called Class A, B or C projects. If their project application is successful they can then allow others to join their project and share the resources granted to them on ICHEC systems. In addition to being granted membership of Class A, B and C projects, Ph.D. students may also apply for their own Class C projects. This approach means a student can quickly get access to significant HPC hardware and software without burdening their supervisors with administrative overhead. Those outside Ireland can also use ICHEC systems if they are in active collaboration with Irish-based researchers.

While Class C projects are reviewed by ICHEC staff, the level of resources involved in a Class A or B application means that a peer review process is merited. This process is conducted by a Scientific Council which operates independently of ICHEC.

ICHEC also collaborates with commercial and state organisations in the field of HPC such as SGI and Met Éireann. Such collaborations are dealt with on a case-by-case basis and are not the focus of this guide.

If you would like to speak in person with ICHEC staff please feel free to contact us via [email protected]. Further details can be found on the following page: http://www.ichec.ie/contact_us

Available Hardware

ICHEC operates four primary systems. All use variants of the Linux operating system. The most powerful and widely used system is called Stokes, an SGI Altix ICE 8200EX cluster with 3840 cores and 2 GB of RAM per core. Stoney is a Bull Novascale R422-E2 cluster with 512 cores and 6 GB of RAM per core. In many ways these are similar to conventional PCs running a Unix-like operating system. As a result these systems are relatively easy to use and an ideal place to start using HPC. When submitting a project application, time can be requested on either or both of the Stokes and Stoney systems. Unless the RAM requirement per node is greater than 24 GB, the time should be requested for Stokes.

There are also two IBM Blue Gene systems, Schrödinger and Lanczos, which are ideal for highly scalable tasks. However, they are generally not suitable for those new to HPC, and they are not available to users of Class C projects. With this in mind, this document largely deals with the use of the Stokes and Stoney systems. Refer to ICHEC’s website or contact us for equivalent Blue Gene details.



More technical details of hardware on offer can be found here: http://www.ichec.ie/infrastructure
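
The rule of thumb in this section for choosing between Stokes and Stoney (request Stokes unless more than 24 GB of RAM per node is needed) can be sketched as a small shell function. This is only an illustration: the function name suggest_system and its single argument (the RAM needed per node, in GB) are assumptions made here and are not part of any ICHEC tool.

```shell
#!/bin/bash
# Suggest which system to request time on, following the guide's rule:
# Stokes unless the job needs more than 24 GB of RAM per node.
# suggest_system is a hypothetical helper, named for illustration only.
suggest_system() {
    local ram_per_node_gb="$1"
    # awk performs the comparison so that values such as 24.5 also work
    if awk -v r="$ram_per_node_gb" 'BEGIN { exit !(r > 24) }'; then
        echo "Stoney"
    else
        echo "Stokes"
    fi
}

suggest_system 16   # a job needing 16 GB per node
suggest_system 32   # a job needing 32 GB per node
```

A job needing exactly 24 GB per node still falls on the Stokes side of the rule, since the text only redirects jobs whose requirement is greater than 24 GB.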

Project Classes

There are three different classes of project available:

•Class A - projects are intended for consortia concerned with "Grand Challenge" problems. These groups will require resources representing a substantial fraction of ICHEC’s resources over a long period of time. Successful applications are expected to yield high-impact scientific publications. Class A project applicants are expected to have a good knowledge of the characteristics of the code(s) which they intend to use - such as scalability properties - before writing their proposal. For this reason, applicants who are not in such a position are advised to first apply for an exploratory Class C project in order to undertake a basic scalability and performance study.

•Class B - projects are intended for the needs of the bulk of the research community. Typically applicants will be small research groups or individual researchers. Successful applications are expected to lead to refereed publications.

•Class C - projects are intended to provide fast access to modest resources with less review overhead. They have multiple possible uses including: introductory access for inexperienced HPC users; exploratory access for researchers who need to develop, port, optimise or benchmark codes; easier access for users planning small-scale runs with very modest requirements; immediate access for researchers awaiting Class A or Class B application approval.

ICHEC is happy to engage with applicants and endeavours to accept as many Class C applications as possible. The acceptance rate for peer-reviewed Class B applications which follow our proposed guidelines is in excess of 90%.

Application process details can be found here: http://www.ichec.ie/services/full_national_service
The application form can be found here: http://www.ichec.ie/project/apply

The following table summarises the properties of the three classes.

                        Class A      Class B      Class C
Max CPU core hours*     4,500,000    600,000      25,000
Max Storage             1,500 GB     500 GB       50 GB
Max Project Duration    36 months    24 months    12 months
Max Review Duration     12 weeks     6 weeks      2 weeks

* The times listed apply to Stokes; for Stoney these times should be divided by 1.39.
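
To make the footnote concrete, the Stoney equivalents of the core-hour limits can be worked out by dividing each Stokes figure by 1.39. The following is a minimal sketch of that arithmetic. The function name stokes_to_stoney is hypothetical, and rounding to the nearest whole core hour is an assumption made here, since the guide does not say how fractions are treated.

```shell
#!/bin/bash
# Convert a Stokes core-hour figure to its Stoney equivalent using the
# divisor 1.39 from the table footnote.
# stokes_to_stoney is a hypothetical helper; rounding to the nearest
# whole hour is an assumption, not something the guide specifies.
stokes_to_stoney() {
    awk -v hours="$1" 'BEGIN { printf "%.0f\n", hours / 1.39 }'
}

# Class A, B and C maxima from the table, expressed in Stokes core hours
for stokes_hours in 4500000 600000 25000; do
    echo "Stokes: ${stokes_hours}  Stoney: $(stokes_to_stoney "$stokes_hours")"
done
```

Under these assumptions the Class C maximum of 25,000 Stokes core hours corresponds to roughly 17,986 core hours on Stoney.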

Connecting to ICHEC Systems

Once you have been assigned a username and initial password you can connect to the systems using the Secure Shell protocol (SSH). SSH is the most common method of remotely connecting to Unix-like systems for interactive access. If the machine from which you are connecting is itself a Unix-like system, e.g. an Apple Mac or a Linux system, then it is very likely SSH is already installed. In that case you can log in from a terminal as follows:

    $ ssh [email protected]

If you wish to connect to an ICHEC machine other than Stokes simply substitute the machine name. SSH encrypts all data which passes over the connection, which greatly benefits security. By default SSH uses network port 22. You will need to have outbound access on this port in order to connect successfully. Your local network security policy may mean that this is not possible by default. For instance, you may first have to connect to another machine locally and then connect to ICHEC. These settings are beyond ICHEC’s control and you should contact your local network administrator for more information. An indication that this may be the case is a "Connection timed out" error. If you believe the network port is open as discussed, please submit an issue to the Helpdesk (see "Where can I find more help?").

ICHEC accepts network connections only from the networks of its partner institutions. This means that if you wish to connect from home or abroad, for example, it is the responsibility of your home institution to provide remote access to their network, from which you may access ICHEC systems.

It is of course also possible to connect to ICHEC systems from MS Windows based systems. However, in order to do so you will have to install an SSH client. The most commonly used clients are PuTTY and Cygwin. Once SSH is configured you can also use it for copying files to and from ICHEC’s systems using the scp or sftp commands, which most SSH packages also provide. If you prefer a graphical means of exchanging files there are GUI packages such as FileZilla which support SSH connections.

Running X Window based graphical applications on ICHEC’s systems is also possible. Again, if you are using a Unix-like system you probably already have an X server installed. For security reasons it is necessary to pass, or forward, the X connection’s data over the SSH connection. In the simplest case this can be accomplished by adding -X to the command line as shown.

    $ ssh -X [email protected]

If you are connecting from an MS Windows machine you will need to have Xming, Hummingbird Exceed or similar installed and running on your workstation. You also need to ensure that X11 forwarding is enabled.

Please note that while ICHEC is happy to assist with connection issues, which can sometimes be tricky to configure in the first instance, the installation of any required software on your system is ultimately your responsibility or that of your local IT support organisation. However, our use of standard, well-proven and secure methods minimises problems in the long run.

Available software

ICHEC centrally installs a large number of software packages that are of common interest. Other packages are normally built on demand. You can get a list of the available packages using the following command.

    $ module avail

The module system is used to easily configure your environment for the packages of interest to you. For example, prior to using the Intel compiler you should type the following command.

    $ module load intel-cc

Loading a module generally sets environment variables such as your PATH. You can list the modules you have loaded at any point by using the module list command.

    $ module list

It is important to include module load commands in your job submission scripts too. More details on using modules can be found here: http://www.ichec.ie/support/documentation/#env

Batch job submission

To try to utilise compute resources in a fair and efficient manner, all compute jobs must be run through the batch queueing system. The system supports three main classes of jobs:

•Production jobs are normally day-to-day HPC jobs that can run for long periods of time across a large number of cores.

•Development jobs are short jobs […] typically run on fewer cores, as used during code development and testing.

•Interactive development jobs share the characteristics of normal development jobs but allow you to run commands interactively on the compute nodes as you might if there was no queuing system in place.

While sometimes cumbersome, batch processing allows you to exploit systems as large-scale resources. You do not have to wait for nodes to be available to run a job; you simply submit it and your job will be queued until the required resources are free, at which point it will be run automatically. Batch processing allows the system to mix and match jobs of different sizes and durations to maximise the utilisation of the system. This makes it much more economical to operate relative to an entirely interactive scenario.

The job submission process works by the user producing a small script that details the resources required by the job, any environment settings it may need and the command required to actually run the job. This script is then submitted to the scheduler, which reads it, checks some basic parameters and enqueues the job for execution as soon as possible. The following is a sample job submission script for the Stoney or Stokes systems, which gives an impression of the structure of such scripts.

    #!/bin/bash
    #PBS -l nodes=4:ppn=12
    #PBS -l walltime=1:00:00
    #PBS -N my_job_name
    #PBS -A project_name
    #PBS -r n
    #PBS -j oe
    #PBS -m bea
    #PBS -M me@my_email.ie
    #PBS -V

    cd $PBS_O_WORKDIR
    mpiexec ./my_program my_arguments

Note, the scheduler used on both Stokes and Stoney is called PBS, hence these scripts are often referred to as “PBS scripts”. The # symbol is required at the start of each PBS directive. The line #PBS -l nodes=4:ppn=12 requests 48 cores in this case, i.e. 4 nodes each of which has 12 cores.
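
The guide advises including module load commands in job scripts, and explains that the core count follows from the nodes and ppn values. The sketch below combines the two: it prints a job script that loads a module before running the program, and computes the total cores requested. The names make_job_script, NODES, PPN and TOTAL_CORES are hypothetical and chosen for illustration. The directive values, the intel-cc module and the program name are copied from the examples in this guide, and whether a given job needs that particular module is an assumption.

```shell
#!/bin/bash
# Sketch: print a PBS job script that loads a module before running.
# NODES, PPN and make_job_script are hypothetical names for illustration.
NODES=4
PPN=12

make_job_script() {
    # Unquoted heredoc: ${NODES} and ${PPN} are expanded now, while
    # \$PBS_O_WORKDIR is escaped so it is expanded when the job runs.
    cat <<EOF
#!/bin/bash
#PBS -l nodes=${NODES}:ppn=${PPN}
#PBS -l walltime=1:00:00
#PBS -N my_job_name
#PBS -A project_name

# Load required modules inside the job, as the guide advises
module load intel-cc

cd \$PBS_O_WORKDIR
mpiexec ./my_program my_arguments
EOF
}

# Total cores requested = nodes x processors per node
TOTAL_CORES=$((NODES * PPN))

make_job_script
echo "Total cores requested: ${TOTAL_CORES}"
```

Redirecting the output of make_job_script to a file gives a script that can then be handed to the scheduler. On PBS systems this is typically done with the qsub command, although the exact submission procedure on Stokes and Stoney should be checked against ICHEC’s documentation.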