HPC Getting Started: Transferring Data

Within the cluster

Move or copy files between areas on the login node with the mv and cp commands, as on any Unix system. These are standard Unix commands; more information is available from man mv and man cp.

Jobs running on compute nodes have direct access to the home and scratch directories, so you don't have to do anything special to let them access your files. That's the biggest advantage of a global clustered filesystem.

The format is mv source destination. When the files are in the same directory, this is simply a rename.

$ ls
pgm1.c
$ mv pgm1.c pgm2.c
$ ls
pgm2.c

Otherwise the file is moved to the new location.

$ ls
pgm2.c
$ mv pgm2.c ~/scratch
$ ls ~/scratch
pgm2.c

Copying with cp works the same way, except that the original file still exists afterward. In either case, any existing destination file will be replaced.
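
Because an existing destination file is replaced, it can help to see copy, move, and the standard -i (ask before overwriting) option side by side. The following is a minimal sketch that runs in a throwaway directory; the scratch subdirectory is only a stand-in for ~/scratch, and the file names are taken from the examples above.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"
mkdir scratch                        # stand-in for ~/scratch (assumption)
echo 'int main(void){return 0;}' > pgm1.c

cp pgm1.c pgm2.c                     # copy: pgm1.c still exists afterward
mv pgm2.c scratch/                   # move: pgm2.c now exists only in scratch/

# Create a file that a careless copy would replace
echo 'old version' > scratch/pgm1.c

# -i asks before overwriting; answering "n" keeps the existing file.
# Some cp versions return a nonzero status when a file is skipped,
# so the status is ignored here.
echo n | cp -i pgm1.c scratch/ || true

ls . scratch
```

Adding -i to mv works the same way. Without -i, the last cp would have replaced scratch/pgm1.c.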

Other machines

You will normally transfer files to and from computers outside of the cluster with the Secure File Transfer Protocol (SFTP). The format is similar to the ssh command: either sftp aaa.bbb.edu or sftp userid@aaa.bbb.edu. Include the userid when it differs from your userid on the machine you are connecting from.

When you are logged in to the DLX cluster and use SFTP to connect to another machine, the DLX is your local machine and the other machine is the remote one. The put command copies a file from the local machine to the remote machine. The get command does the reverse. Of course, you must have a userid and password on the remote machine.

$ sftp hpc.uky.edu
Connecting to hpc.uky.edu...
Password:
sftp> lls
a.f90 tst1.sh
sftp> put a.f90
Uploading a.f90 to /home/userid/a.f90
a.f90       [progress bar]
sftp> get a.f90 b.f90
Fetching /home/userid/a.f90 to b.f90
/home/userid/a.f90       [progress bar]
sftp> lls
a.f90 b.f90 tst1.sh
sftp> quit
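
The same put and get commands can also be run unattended with the -b (batch file) option of sftp. The sketch below only builds and prints a batch file; the transfer itself runs only when the hypothetical RUN_TRANSFER variable is set to 1. The host and file names are the ones from the session above. Batch mode cannot prompt for a password, so it assumes a non-interactive login method such as SSH keys has been set up, which this guide does not cover.

```shell
# Build a batch file containing the same commands typed interactively above
batch=$(mktemp)
cat > "$batch" <<'EOF'
put a.f90
get a.f90 b.f90
quit
EOF

# Show what would be sent to the remote machine
cat "$batch"

# RUN_TRANSFER is a hypothetical switch for this sketch, off by default
if [ "${RUN_TRANSFER:-0}" = 1 ]; then
    sftp -b "$batch" hpc.uky.edu
fi
```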

More information is available from man sftp or the built-in help command.

sftp> help
Available commands:
bye                                Quit sftp
cd path                            Change remote directory to 'path'
chgrp grp path                     Change group of file 'path' to 'grp'
chmod mode path                    Change permissions of file 'path' to 'mode'
chown own path                     Change owner of file 'path' to 'own'
df [-hi] [path]                    Display statistics for current directory or filesystem containing 'path'
exit                               Quit sftp
get [-P] remote-path [local-path]  Download file
help                               Display this help text
lcd path                           Change local directory to 'path'
lls [ls-options [path]]            Display local directory listing
lmkdir path                        Create local directory
ln oldpath newpath                 Symlink remote file
lpwd                               Print local working directory
ls [-1aflnrSt] [path]              Display remote directory listing
lumask umask                       Set local umask to 'umask'
mkdir path                         Create remote directory
progress                           Toggle display of progress meter
put [-P] local-path [remote-path]  Upload file
pwd                                Display remote working directory
quit                               Quit sftp
rename oldpath newpath             Rename remote file
rm path                            Delete remote file
rmdir path                         Remove remote directory
symlink oldpath newpath            Symlink remote file
version                            Show SFTP version
!command                           Execute 'command' in local shell
!                                  Escape to local shell
?                                  Synonym for help

Linux or MacOS Workstations

Just use SFTP as described above. Log on to your workstation and SFTP to the cluster (sftp userid@dlx.uky.edu). Your workstation will be the local machine and the DLX will be the remote one.

Windows Workstations

If you are using PuTTY to connect to the cluster, then use the PuTTY SFTP command (psftp.exe), which you can get from the PuTTY web page. Follow the directions in the PuTTY manual. The commercial SFTP packages have similar capabilities.

HSM

If you want to store files on the UKIT Hierarchical Storage Manager, then you will need to apply for an HSM account. See the Wiki Pages - HSM for more information. Note that your HSM userid will be the same as your link blue userid, but the HSM password will not be the same.

When files are transferred to HSM, they are written to a large disk cache, then the system copies them to tape. When the cache gets too full, transfers will be suspended until the system catches up. Before you start transferring a large file, check the cache status at HPC Stats.
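
To judge whether a file is large relative to the cache figures, it helps to know its size in comparable units. The following is a minimal sketch: size_gb is a hypothetical helper, and it assumes 1 GB means 10^9 bytes, which may not match the definition used by the HSM status display.

```shell
# Hypothetical helper: print a file's size in GB (assumed 10^9 bytes)
size_gb() {
    # Reading from stdin makes wc print only the byte count
    bytes=$(wc -c < "$1")
    awk -v b="$bytes" 'BEGIN { printf "%.4f\n", b / 1000000000 }'
}

# Demonstrate on a 1,500,000-byte temporary file
demo=$(mktemp)
head -c 1500000 /dev/zero > "$demo"
size_gb "$demo"
```

For a real transfer, call size_gb on the file you plan to upload and compare the result with the cache usage reported by the system.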

Check the status of your HSM storage or change your password by connecting to HSM with SSH.

$ ssh userid@hsm.uky.edu
Hostname: tsm-hpc Ram: 18432MB cpus:10 OS: AIX 7100-01-05-1228
Model: IBM,8233-E8B Serial: 62E5B Lpar: 4 BootDevice: hdisk23
******************************************************************
University of Kentucky: FTP (hsm.uky.edu) User Interface
______________________________________________________________

1 Change your password
2 Change your email address
3 Show date
4 Show disk cache usage in GB
5 Space used by this id (can be _slow_ response)

Enter the number of the command you wish to issue or
f or RETURN - forward 1 screen b - back 1 screen
r - redraw screen q - quit
Choice:

You will transfer files to and from the Hierarchical Storage Manager (HSM) using SFTP as described above.

$ sftp userid@hsm.uky.edu
******************************************************************
Hostname: tsm-hpc Ram: 18432MB cpus:10 OS: AIX 7100-01-05-1228
Model: IBM,8233-E8B Serial: 62E5B Lpar: 4 BootDevice: hdisk23
******************************************************************
userid@hsm.uky.edu's password:
Connected to hsm.uky.edu.
sftp> put bigdata.txt
Uploading bigdata.txt to /users2/userid/bigdata.txt
bigdata.txt
sftp> rm bigdata.txt
Removing /users2/userid/bigdata.txt
sftp> exit

859-218-HELP (859-218-4357) 218help@uky.edu