BRC Linux Cluster Overview

Author: Brooke Wheeler

• Initial NIHR Capital Award: Hardware £300,000; Hosting & Support £75,000.
• BRC Nucleus funded the addition of 60TB of archive storage last week.
• A further NIHR capital award will provide up to 250TB of storage, an additional blade enclosure and blades, and the required infrastructure upgrades.
• Hosted in the SLaM datacentre at the Royal Bethlem Hospital (although a move to the new Maudsley datacentre is in the pipeline).

Applying to use the cluster: http://core.brc.iop.kcl.ac.uk/contact-us/cluster-application
Mailing List: [email protected]
Wiki: https://compbio.brc.iop.kcl.ac.uk:8443/biowiki
Website: http://core.brc.iop.kcl.ac.uk/contact-us/brc-cluster-support-request/
BRC Systems Administrator: Caroline Johnston, x0924, [email protected], [email protected], Skype: cass_j

Hardware
[Diagram: SLaM Firewall and SLaM Gateway; head nodes head1 and head2; compute nodes node01 to node30 in two blade enclosures (enclosure1, enclosure2); bignode; 120TB Panasas Storage (/home, /project, /scratch); 60TB Archive Storage (/archive); Tape Backup (not /scratch).]

Grid Engine
[Diagram: SLaM Gateway; bignode; queues short.q, long.q and bignode.q; label "Jobs < 3 days".]

[Photos: HP BL460c Compute Nodes; HP DL360 (bignode); Panasas Cluster Storage.]

Access from outside of SLaM is through the ssh gateway and requires a cryptocard two-factor authentication token. To apply for a token, fill in the form at: http://core.brc.iop.kcl.ac.uk/contact-us/cluster-application/
For this course, you don't need a token: we will be using accounts on the SGDP cluster for Linux practice.

Linux and the Command Line

History
1970 - First release of the UNIX operating system (Ken Thompson and Dennis Ritchie @ AT&T Bell Labs).
1983 - Richard Stallman starts the GNU (GNU's Not Unix) project to develop a free UNIX clone. He develops the concept of copyleft and the GNU General Public License (GPL): a license that grants permission to reproduce, adapt or distribute code but requires any resulting copies or adaptations to be bound by the same licensing agreement. GNU includes many successful software projects, but the OS kernel project (Hurd) never gathers momentum.
1987 - Andrew S. Tanenbaum releases Minix, a 16-bit OS designed for teaching. Source code is available but licensing is fairly restrictive.
1991 - Linus Torvalds, working at the University of Helsinki, announces a project on the Minix newsgroup to develop a free OS for 386 processors.
1992 - Linux kernel released under the GPL. Projects begin to combine Linux with elements of the GNU project. Early projects tried to use the name GNU / Linux, but these days most people just use Linux to refer to the combination of kernel and other software that makes up an OS.

Today there are many GNU / Linux distributions tailored to suit everything from tablets to desktops to enterprise servers to embedded systems. Although the code is GPL'd, commercially successful Linux companies make their money by providing training and support.

The BRC Cluster runs CentOS 5. Ubuntu or Mint are popular choices for desktops.

The Command Line
• Interact with a computer via commands typed at a prompt, rather than by dragging icons and clicking links.
• Much quicker for most bioinformatics tasks.
• System resources that would have been busy drawing user interfaces are free for processing.

Connecting...
• OpenSSH allows you to establish a secure connection to a remote computer
• Windows users can use the PuTTY application: http://www.chiark.greenend.org.uk/~sgtatham/putty/
• Linux & Mac OS X users can use the Terminal application
• To connect you'll need a minimum of 2 things:
  - IP address / hostname of the remote computer
  - Login & password

Using a terminal
  # log in to remotehost as username
  ssh username@remotehost
  # same as above but in a different style
  ssh -l username remotehost
  # same as above, but draw X windows on the local machine
  ssh -X username@remotehost
  # same as above, but use port 51515 (default is 22)
  ssh -p 51515 username@remotehost

Using PuTTY

Connection > SSH > Tunnels > Enable X11 Forwarding Requires a local X server, eg Xming or XWin-32

TASK: Connect to the SGDP Cluster

We are using the SGDP Cluster for today, as some people don't have BRC cluster tokens yet.
• You should have been assigned a username, like tusrX.
• Your password is the same as your username.
• The remote machine you need to connect to is mumak.iop.kcl.ac.uk

You can log out again using the command exit.
If you get this working, try connecting with X forwarding. Run the command gedit to see if it worked.

The program that provides the command prompt is called a shell (an outer layer of user interface around the kernel)
• There are many shells: sh, ksh, csh, zsh, bash. We'll use bash.
• Not just an interface to the computer, also a scripting language: it allows automation of tasks
• You enter your commands at the command prompt, shown as $
• Options normally start with dashes (-a, --all)

Help?!
• Command: man (format and display the on-line manual pages)
  • man cp (Read the man page for the cp command)
  • man -k string (Search man pages for string)
• Command: info (Read the info documents)
  • info cp (Read the info document for the cp command)
• Help associated with the command itself
  • --help or -h parameter may display usage (only for some commands)

Task: Getting Help
• What does the ls command do?
• Use the man page to find out how to get a long listing of a directory with sizes in human-readable format.
• Search the man pages to find a command that removes files.

The Linux Filesystem: A single hierarchical tree. “Everything is a file” - even mounted external drives, printers etc are represented as files under the root (/) filesystem.

The File System
• The hierarchy is separated by the forward slash character / (unlike Windows, which uses the backslash \)
• Current directory is a single period: .
• Directory above is two periods together: ..
• Previous directory is a hyphen: - (understood by cd)
• Home directory is a tilde: ~
• Using the TAB key will auto-complete the file/directory name

Globbing
• Use * to match any run of characters, and so to represent multiple files
• * on its own is all files/directories
• *.doc represents all files ending in .doc
• Use [ and ] to specify groups: [A-Z]*.jpg represents any .jpg file starting with a capital letter
• Use ? as a single-character wildcard: ?.jpg matches JPEG files with 1-character names (e.g. a.jpg, 1.jpg)
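
The patterns above can be tried safely in a throwaway directory. This is only a sketch: the file names are made up for the example, and LC_ALL=C is set because in some locales a range like [A-Z] can also match lowercase letters.

```shell
# Range patterns such as [A-Z] depend on the locale; C gives plain ASCII order.
LC_ALL=C
export LC_ALL

# Work in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical, empty files created only to have something to match.
touch a.jpg 1.jpg Holiday.jpg beach.jpg notes.doc report.doc

all_docs=$(echo *.doc)        # every name ending in .doc
capitals=$(echo [A-Z]*.jpg)   # .jpg names starting with a capital letter
one_char=$(echo ?.jpg)        # .jpg names with exactly one character before .jpg

echo "*.doc      matched: $all_docs"
echo "[A-Z]*.jpg matched: $capitals"
echo "?.jpg      matched: $one_char"

# Clean up the temporary directory.
cd / && rm -rf "$demo_dir"
```

The shell expands the pattern before the command runs, so echo is a harmless way to preview what a glob will match before handing it to something like rm.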

Navigating The File System
• Command: cd (Change directory)
  • cd /var/tmp (Change to a specific directory)
  • cd .. (Change to the directory above)
  • cd ~/docs (Change to the docs directory in my home)
  • cd - (Change to the previous directory)
  • cd www (Change to the www sub-directory of the current directory)
  • cd (Return to your home directory)
• Command: ls (list directory contents)
  • ls (show contents of the current working directory)
  • ls -la (as above, but show long list, and hidden files)
  • ls -lart (as above, but sorted by time, oldest first)
  • ls /usr/bin (show contents of a specific directory)
  • ls -R (recursively list the current working directory)
• Command: pwd (Print working directory): displays the path of where you are.
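
A short walk-through of cd, pwd and ls, as a sketch only. The docs/www directory layout is hypothetical and is created in a temporary directory for the purpose.

```shell
# Build a small, hypothetical directory tree to move around in.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/docs/www"

cd "$demo_dir/docs" || exit 1   # absolute path
cd www                          # relative path: a sub-directory of where we are
in_www=$(basename "$(pwd)")

cd ..                           # up one level
in_docs=$(basename "$(pwd)")

cd - > /dev/null                # back to the previous directory (cd - prints its path)
back_in=$(basename "$(pwd)")

echo "visited: $in_www, then $in_docs, then back in $back_in"
ls -la "$demo_dir/docs"         # long listing including the hidden . and .. entries

# Clean up.
cd / && rm -rf "$demo_dir"
```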

Files & Directories
• Just like on any other operating system you have normal files and directories
• Command: cp (Copy files)
  • cp myfile.doc myfile2.doc (Make a copy of myfile.doc)
  • cp -p myfile.doc myfile2.doc (Same as above but preserve permissions, date stamps etc.)
  • cp -r research research2 (Recursively copy the directory research)
• Command: rmdir (Remove directory)
  • rmdir mydir (remove the directory mydir; only works if mydir is empty)
• Command: rm (remove file/directory entries)
  • rm myfile2.doc (remove the file myfile2.doc)
  • Careful with the next command!
  • rm -rf research2 (remove the directory research2 and all its files & sub-directories, without asking for confirmation)
• Command: mkdir (Make directory)
  • mkdir assignments (make a directory called assignments)
  • mkdir -p mydir/dir1 (make a new directory called mydir/dir1, even if mydir does not exist)
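
The copy, remove and mkdir commands above, put together as a minimal sketch. Everything happens inside a temporary directory, and the file and directory names are invented for the example.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

mkdir -p research/data                    # -p creates research and research/data in one go
echo "draft" > research/notes.txt

cp -p research/notes.txt notes_copy.txt   # copy one file, keeping permissions and timestamps
cp -r research research2                  # copy the whole directory tree
copied=$(cat research2/notes.txt)

mkdir empty_dir
rmdir empty_dir                           # works only because empty_dir is empty
rm notes_copy.txt                         # remove a single file
rm -rf research2                          # removes the tree without asking: check the path first

remaining=$(ls)
echo "copied text: $copied"
echo "left in demo directory: $remaining"

# Clean up.
cd / && rm -rf "$demo_dir"
```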

Task: Filesystem, Files and Directories
• Create a directory called “test” under your home directory.
• Copy the file /tmp/test.txt into your new test directory.
• List the contents of your new directory. Make a note of the size of the test.txt file.
• Copy the entire test directory to a directory called test2.
• Rename test2 to deleteme.
• Delete the directory deleteme and all of its contents.

Symbolic Links
• Symbolic links point to real files or directories, allowing you to jump to other files or locations on the file system
• Command: ln (Make link between files)
  • ln -s myfile_v3.doc myfile_latest.doc (creates a symbolic link called myfile_latest.doc that points to myfile_v3.doc)
  • ln -s /home/demo/dir1/dir2/dir3 /home/demo/jump2dir (creates a symbolic link called jump2dir that points to a deep directory, allowing quicker access)
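
A small sketch of the first ln -s example, run in a temporary directory. It uses readlink, which is not covered above, to print where a link points.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "version 3" > myfile_v3.doc
ln -s myfile_v3.doc myfile_latest.doc     # target first, link name second

target=$(readlink myfile_latest.doc)      # the path stored in the link
content=$(cat myfile_latest.doc)          # reading the link reads the real file
ls -l myfile_latest.doc                   # long listing shows link -> target

rm myfile_latest.doc                      # removes only the link, not the target
if [ -f myfile_v3.doc ]; then original_kept=yes; else original_kept=no; fi

echo "link pointed to: $target"
echo "original still present: $original_kept"

# Clean up.
cd / && rm -rf "$demo_dir"
```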

Task: Symbolic Links
• Create a subdirectory in your test directory called 'test2'.
• Create a symbolic link in your home directory called test2 that points to the test2 directory you just created.
• To check it worked, do a long listing of your home directory and try to cd ~/test2

Linux has “Traditional Unix Permissions”. It also has Access Control Lists, but we won't cover those today. Unix permissions are based around Users and Groups:

• Represented as a 10-character code: 1 character for the file type, followed by 9 characters describing the permissions

  drwxrwx--- 2 tusr1 tusr1 4096 Jun 13 2012 test

  d     Type (d = directory)
  rwx   Owner
  rwx   Group
  ---   Other

Octal notation for permissions (r = 4, w = 2, x = 1; add the values together)
0: --- no permission
1: --x execute
2: -w- write
3: -wx write and execute
4: r-- read
5: r-x read and execute
6: rw- read and write
7: rwx read, write and execute
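
The octal digits are just sums of 4 (read), 2 (write) and 1 (execute), taken separately for owner, group and other. The arithmetic can be sketched as a small shell function; perm_to_octal is a hypothetical helper written for this example, not a standard command.

```shell
# Convert a 9-character permission string (for example rwxr-x---) to octal.
# perm_to_octal is a hypothetical helper for illustration only.
perm_to_octal() {
    perms=$1
    result=""
    # Characters 1-3 are owner, 4-6 group, 7-9 other.
    for start in 1 4 7; do
        triplet=$(printf '%s' "$perms" | cut -c"$start"-"$((start + 2))")
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read    = 4
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write   = 2
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

perm_to_octal rwxrwx---   # the test directory in the listing above
perm_to_octal rwxr-xr-x
perm_to_octal rw-r-----
```

For the listing shown earlier, drwxrwx--- drops the leading d (the type) and the remaining nine characters work out as 7, 7 and 0.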

File Permissions
• Command: chown (Change ownership)
  • Only the root user (administrator) can use this command
• Command: chgrp (Change group ownership)
  • chgrp staff myfile.doc (Change the group owner to staff)
  • chgrp -R staff docs/ (Recursively change the group of all files and sub-directories under the directory docs)
• Command: chmod (change file access permissions)
  • chmod g+rw myfile.doc (Give myfile.doc group read/write permissions)
  • chmod -R g+rw mydir (Recursively give mydir group read/write permissions)
  • chmod 755 mydir (Give mydir owner read/write/execute, group read/execute and other read/execute permissions)
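
A sketch of the chmod examples in both octal and symbolic form, using a temporary directory. The first ten characters of the ls -l output are captured so the effect of each change is visible; the starting mode of 600 is chosen only to make the example predictable.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

mkdir mydir
touch myfile.doc

chmod 755 mydir                    # owner rwx, group r-x, other r-x
dir_perms=$(ls -ld mydir | cut -c1-10)

chmod 600 myfile.doc               # known starting point: owner read/write only
chmod g+rw myfile.doc              # add group read/write, leave everything else alone
file_perms=$(ls -l myfile.doc | cut -c1-10)

chmod g-w myfile.doc               # take group write away again
final_perms=$(ls -l myfile.doc | cut -c1-10)

echo "mydir after 755:       $dir_perms"
echo "myfile.doc after g+rw: $file_perms"
echo "myfile.doc after g-w:  $final_perms"

# Clean up.
cd / && rm -rf "$demo_dir"
```

Octal mode sets all nine permission bits at once, while the symbolic form (g+rw, g-w) changes only the bits it names.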

Task: File permissions
All of the training users have been added to the group “training”. You have a folder in /storage/cdata/BRC_Training for that group.
• Have a look at that directory's file permissions (use ls -ld).
• Create a directory inside that directory named with your username. Check its permissions.
• With your neighbour, experiment with changing the group permissions. What happens if you remove the execute permission? Can you allow read, but not write permissions?

IO Redirection and Pipes
• stdin - Standard In (keyboard entry to the command line)
• Two kinds of screen output:
  • stdout - Standard Out (normal screen output)
  • stderr - Standard Error (error output, also to the screen, unbuffered)
• Pipes send stdout to another command
  • use the | character
  • e.g. cat myfile.txt | grep string

Redirection of Input/Output
• > - Redirect stdout to a file
• 2> - Redirect stderr to a file
• 2>&1 - Redirect stderr to stdout
• < - Use the contents of a file as stdin (in place of keyboard input)
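
Each redirection operator can be seen in action with a command that writes to both stdout and stderr. In this sketch ls is given one file that exists and one that does not; the file names are made up, and || true is added only so the deliberate ls failure does not stop a script that exits on errors.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

printf 'apple\nbanana\ncherry\n' > fruit.txt     # > sends stdout to a file

# ls reports fruit.txt on stdout and complains about the missing file on stderr.
ls fruit.txt no_such_file > out.log 2> err.log || true
# 2>&1 placed after > sends both streams into the same file.
ls fruit.txt no_such_file > both.log 2>&1 || true

matches=$(cat fruit.txt | grep an)               # pipe: stdout of cat becomes stdin of grep
line_count=$(wc -l < fruit.txt | tr -d ' ')      # < feeds the file to wc as stdin

out_text=$(cat out.log)
err_lines=$(wc -l < err.log | tr -d ' ')
both_lines=$(wc -l < both.log | tr -d ' ')

echo "stdout log: $out_text"
echo "lines in stderr log: $err_lines, lines in combined log: $both_lines"
echo "grep matched: $matches (out of $line_count lines)"

# Clean up.
cd / && rm -rf "$demo_dir"
```

Order matters for the combined form: > both.log 2>&1 captures both streams, whereas 2>&1 > both.log would leave stderr on the screen.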

Task: Pipes and IO Redirection
• List the contents of the /storage/cdata/BRC_Training directory and redirect the results to a file in your home directory.

Differences Between Windows and UNIX Text Files
• Windows and UNIX (Linux, Mac OS X & others) have a different way of representing newlines
• Windows uses CR+LF (Carriage Return + Line Feed), Unix uses just LF
• A Unix-formatted text file in Windows will appear all as 1 line
• A Windows-formatted text file in Unix will have a ^M at the end of each line
• Use the commands dos2unix and unix2dos to convert between the two formats.
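
The difference is a single extra byte (the carriage return) per line, which a byte count makes visible. This sketch uses tr -d '\r' rather than dos2unix, on the assumption that dos2unix may not be installed everywhere; for a simple file like this one the result should be the same.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

printf 'line one\r\nline two\r\n' > windows.txt   # CR+LF line endings
printf 'line one\nline two\n' > unix.txt          # LF only

win_bytes=$(wc -c < windows.txt | tr -d ' ')
unix_bytes=$(wc -c < unix.txt | tr -d ' ')

# Delete every carriage return to get Unix line endings.
tr -d '\r' < windows.txt > converted.txt

# cmp -s is silent and succeeds only if the two files are byte-for-byte identical.
if cmp -s converted.txt unix.txt; then same=yes; else same=no; fi

echo "windows.txt: $win_bytes bytes, unix.txt: $unix_bytes bytes"
echo "converted file identical to unix.txt: $same"

# Clean up.
cd / && rm -rf "$demo_dir"
```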

Viewing Text Files
• Command: cat (concatenate files and print on the standard output)
  • This command will print any number of text files to the screen
  • cat myfile1.txt (print the contents of myfile1.txt to the screen)
  • cat myfile1.txt myfile2.txt (print the contents of myfile1.txt and myfile2.txt to the screen)
  • cat myfile1.txt myfile2.txt > newfile.txt (creates newfile.txt, which has the concatenated contents of myfile1.txt and myfile2.txt)
• Command: more (file perusal filter for crt viewing)
  • more myfile1.txt (print the contents of myfile1.txt but allow the user to scroll down: hit space to scroll, q to quit)
• Command: less (like more, but allows reverse scrolling)
  • less myfile1.txt (print the contents of myfile1.txt but allow the user to scroll up and down: hit space to scroll down a page, q to quit, arrow up/down to scroll lines)
• Command: head (output the first part of a file)
  • head myfile.txt (print the first 10 lines of myfile.txt)
  • head -100 myfile.txt (print the first 100 lines of myfile.txt)
• Command: tail (output the last part of a file)
  • tail myfile.txt (print the last 10 lines of myfile.txt)
  • tail -100 myfile.txt (print the last 100 lines of myfile.txt)
  • tail -f myfile.txt (follow the file myfile.txt: this prints new lines as the file is updated, which is great for watching log files)
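
A sketch of cat, head and tail on a generated file of 50 numbered lines. It uses the -n form of the line-count option, which is equivalent to the shorter -100 style shown above; the file names are invented for the example.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

seq 1 50 > numbers.txt                                   # one number per line, 1 to 50

default_lines=$(head numbers.txt | wc -l | tr -d ' ')    # head prints 10 lines by default
third=$(head -n 3 numbers.txt | tail -n 1)               # last of the first 3 lines
last=$(tail -n 1 numbers.txt)                            # final line of the file

cat numbers.txt numbers.txt > doubled.txt                # concatenate two files into a third
doubled_lines=$(wc -l < doubled.txt | tr -d ' ')

echo "head shows $default_lines lines by default"
echo "line 3 is $third, last line is $last"
echo "doubled.txt has $doubled_lines lines"

# Clean up.
cd / && rm -rf "$demo_dir"
```

Chaining head and tail through a pipe, as in the third line above, is a common way to pull a single line or a range out of the middle of a file.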

Text Editors
• gedit: graphical, like Notepad. Requires X11 forwarding.
• pico, nano: easy to use but not overly functional
• emacs, including jove: relatively easy to use and more feature rich
• vi, including vim: complicated but very powerful

vi & vim
• Command: vim. These are the very basics:
  • vi myfile.txt (open an existing file, or create a new file, called myfile.txt)
  • You can move your cursor around the screen using the arrow keys
  • To insert text press i, and type your text. Press Esc when finished.
  • To delete a line, type dd (d twice).
  • To save (write) the file, type :w then [Enter]
  • To quit, type :q then [Enter]

Task: Text Editors
• Start gedit (if you use & at the end of the command it will run the process in the background and give you back your command prompt).
• Write some text and save the file in your home directory.
• Try opening the file with vim.
• Edit the file in vim and save it again.

Compressing and Archiving Files
Command: unzip (unzip a Windows zip archive)
  # unzip file into current directory
  unzip archive.zip
  # unzip file into specified directory
  unzip -d /path/to/dir archive.zip
Command: zip (package and compress files into a zip archive)
  # add files to archive.zip
  zip archive.zip file1 file2 file3

Linux splits up the packaging and compression steps.

Compression:
Command: gzip / gunzip (compress / decompress a file using Lempel-Ziv coding, LZ77)
  # compress file.txt (creates file.txt.gz)
  gzip file.txt
  # decompress file.txt.gz
  gunzip file.txt.gz

Command: bzip2 / bunzip2 (like gzip but using Burrows-Wheeler compression and Huffman coding)
  # compress file.txt (creates file.txt.bz2)
  bzip2 file.txt
  # decompress file.txt.bz2
  bunzip2 file.txt.bz2

Packaging:
Command: tar
  # add (create, verbose, filename) files to an archive
  tar -cvf archive.tar file1 file2
  # unpack (extract, verbose, filename) archive
  tar -xvf archive.tar
Tar has gzip and bzip2 filters, so you can archive and compress in a single command:
  # archive and gzip files
  tar -cvzf archive.tgz file1 file2 file3
  # decompress tar / gzip archive
  tar -xvzf archive.tgz
  # archive and bzip2 files
  tar -cvjf archive.tar.bz2 file1 file2 file3
  # decompress tar / bzip2 archive
  tar -xvjf archive.tar.bz2
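
A round trip through tar and gzip, sketched in a temporary directory with two small made-up files. Two tar options not listed above are used: -t lists an archive without unpacking it, and -C extracts into a chosen directory.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "first" > file1
echo "second" > file2

tar -czf archive.tgz file1 file2                  # c = create, z = gzip filter, f = archive name
listing=$(tar -tzf archive.tgz | sort | xargs)    # t = list contents without unpacking

mkdir unpacked
tar -xzf archive.tgz -C unpacked                  # x = extract; -C picks the destination directory
restored=$(cat unpacked/file2)

gzip file1                                        # replaces file1 with file1.gz
if [ -f file1.gz ] && [ ! -f file1 ]; then replaced=yes; else replaced=no; fi
gunzip file1.gz                                   # and back to file1
roundtrip=$(cat file1)

echo "archive contains: $listing"
echo "restored file2: $restored"
echo "gzip replaced the original: $replaced; after gunzip: $roundtrip"

# Clean up.
cd / && rm -rf "$demo_dir"
```

Listing an archive with -t before extracting is a cheap way to check whether it will unpack into its own directory or scatter files into the current one.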

Transferring Files
You can't use ftp on the cluster: it is not secure.

scp
• Command: scp - secure copy (remote file copy program)
• Copy a local file to a remote server
  $ scp /path/on/localhost username@targetserver:/path/on/targetserver
• Copy a remote file to the local machine
  $ scp username@targetserver:/path/on/targetserver /path/on/localhost

sftp
  # connect to remote server for file transfer
  $ sftp username@remotehost
  # download remotefile to local working dir
  sftp> get remotefile
  # upload localfile to remote working dir
  sftp> put localfile
  sftp> quit

Processes
• Every program/command you run is a process
• You can view and stop any of your own processes
• Command: top (Display Linux tasks)
• Command: ps (report a snapshot of the current processes)
  • ps (show your running processes, short listing)
  • ps -f (show your running processes, full listing)
  • ps -ef (show all processes, full listing)
  • ps -fu username (show the processes username is running)
• Sometimes you want to stop a process which isn't responding
• You first need to know the process number (PID): use the ps command
• Command: kill (terminate a process)
  • kill 12345 (Sends the TERM signal to process 12345)
  • kill -9 12345 (Sends the KILL signal to process 12345. Use only if the above fails, as it doesn't let the process exit gracefully)
  • kill -9 -1 (Kills everything you are running, including your login session. Be careful!)
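
Starting, inspecting and stopping a process can be sketched in a script. Two things here go beyond the commands listed above: $! (the PID of the most recent background job) and kill -0 (which sends no signal and only checks that the process exists). The 300-second sleep is an arbitrary stand-in for a long-running job.

```shell
sleep 300 &                        # start a long-running process in the background
pid=$!                             # $! holds the PID of the last background job

ps -f -p "$pid"                    # full listing for just that process

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$pid" 2>/dev/null; then before=running; else before=gone; fi

kill "$pid"                        # send the TERM signal

# wait collects the exit status; a process ended by a signal reports 128 + signal number.
status=0
wait "$pid" 2>/dev/null || status=$?

if kill -0 "$pid" 2>/dev/null; then after=running; else after=gone; fi

echo "before kill: $before, after kill: $after, exit status: $status"
```

TERM is signal 15, so the status reported for the terminated sleep is 143. A status of 137 (128 + 9) would indicate the process was stopped with kill -9 instead.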

Task: Processes
• Start a long-running process: $ sleep 2h
• Use ps to determine its PID.
• Kill the process.