Scp files from cluster to NYBlue
This sets up ssh keypair authentication so that you can scp files between cluster.bnl.gov and fen.bluegene.bnl.gov. Note that the Blue Gene machine is behind an additional firewall so that NYBlue can connect to other machines at BNL, but other BNL machines cannot connect to NYBlue. Therefore all scp commands must be issued from NYBlue and not from cluster. Follow the steps below.
Generate an SSH key pair for cluster
Log into fen.bluegene.bnl.gov. Generate an SSH key pair to authenticate cluster.
ssh-keygen -q -b 2048 -t rsa -f ~/.ssh/cluster
When prompted for a passphrase, press Return to leave it empty. This creates a 2048-bit RSA key pair in your '.ssh' directory: a public key (named 'cluster.pub') and a private key (named 'cluster').
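If you would rather script this step, the prompt can be skipped by passing an empty passphrase with ssh-keygen's -N option. The following is only a sketch: the helper name make_key is made up for illustration, and it assumes a key without a passphrase is acceptable at your site.

```shell
# make_key KEYFILE: generate a passphrase-less 2048-bit RSA key pair at KEYFILE.
# (make_key is a hypothetical helper name, not part of OpenSSH.)
make_key() {
    keyfile="$1"
    # Never clobber an existing private key.
    if [ -e "$keyfile" ]; then
        echo "refusing to overwrite existing key: $keyfile" >&2
        return 1
    fi
    # -N "" sets an empty passphrase, so no prompt appears.
    ssh-keygen -q -b 2048 -t rsa -N "" -f "$keyfile"
}

# Usage on fen:
#   make_key ~/.ssh/cluster
```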
Install public key on cluster
Copy your public key to cluster.bnl.gov
scp ~/.ssh/cluster.pub cluster.bnl.gov:~/.ssh
Now log into cluster.bnl.gov as usual with your Active Directory password. Append the public key to your authorized_keys file.
ssh cluster.bnl.gov
cd .ssh
cat cluster.pub >> authorized_keys
Note: Make sure that your authorized_keys file does not already contain a public key from cluster. If so, delete that line in the file and add the new public key instead.
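To avoid appending the same key twice, the append can be guarded with a check. This is a minimal sketch: add_key is a hypothetical helper name, and the chmod 600 is a conservative assumption about what the ssh server on cluster will accept. It only skips exact duplicate lines; an older, different key still has to be removed by hand as described in the note above.

```shell
# add_key PUBFILE AUTHFILE: append the key in PUBFILE to AUTHFILE
# only if that exact line is not already present.
# (add_key is a hypothetical helper name, not part of OpenSSH.)
add_key() {
    pubfile="$1"
    authfile="$2"
    touch "$authfile"
    # Assumption: owner-only permissions are a safe choice for this file.
    chmod 600 "$authfile"
    # -x: whole-line match, -F: fixed strings, -f: read patterns from PUBFILE
    if ! grep -qxFf "$pubfile" "$authfile"; then
        cat "$pubfile" >> "$authfile"
    fi
}

# Usage, run on cluster.bnl.gov:
#   add_key ~/.ssh/cluster.pub ~/.ssh/authorized_keys
```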
Logging in to cluster with your key
Log out of cluster back to fen.bluegene. Now try logging back into cluster using the following command:
exit
ssh -i ~/.ssh/cluster cluster.bnl.gov
exit
After the '-i' option you provide the path to your private key file. This command should log you into cluster without asking for a password, which confirms that passwordless login is set up.
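The login test can also be run non-interactively, which is handy before relying on the key in scripts. A small sketch, assuming the key and host names used on this page; can_login is a hypothetical helper name. BatchMode=yes makes ssh fail rather than fall back to a password prompt.

```shell
# can_login KEYFILE HOST: succeed only if key-based login to HOST works.
# (can_login is a hypothetical helper name.)
can_login() {
    # BatchMode=yes disables password prompts, so a missing or
    # rejected key produces a non-zero exit status instead of a prompt.
    ssh -i "$1" -o BatchMode=yes "$2" true
}

# Usage on fen:
#   can_login ~/.ssh/cluster cluster.bnl.gov && echo "key login OK"
```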
Create an ssh config file on fen
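Create a file called "config" in your .ssh folder.
cd ~/.ssh
vi config
Add the following entry to the file. The IdentityFile line points ssh at the key generated above, since 'cluster' is not one of the default key file names that ssh tries on its own.
Host cluster.bnl.gov cluster
    User username
    Hostname cluster.bnl.gov
    IdentityFile ~/.ssh/cluster
    Protocol 2
    StrictHostKeyChecking no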
Replace username with your own username. Then change the permissions of the "config" file to -rw-r--r--:
chmod 644 config
Copying files
You can now log in to cluster from fen with just
ssh cluster
You can now copy files from fen to cluster as
scp file.mol2 cluster:/path/in/cluster
You can also copy files from cluster using
scp cluster:/path/in/cluster/file.txt /path/in/fen
To copy multiple files with wildcards, escape the * with a \ so that your local shell on fen does not try to expand it:
scp cluster:/path1/file.\* cluster:/path2/file2 /path/in/fen
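The effect of the backslash can be seen locally without touching the network. The sketch below uses a hypothetical helper, show_args, that prints exactly what the local shell hands to a command; with the escape, the pattern is passed through unchanged and is left for the remote side to expand.

```shell
# show_args prints each argument it receives on its own line, which
# reveals what the local shell would pass to a command such as scp.
# (show_args is a hypothetical helper name.)
show_args() {
    printf '%s\n' "$@"
}

# Usage, in a directory that contains file.a and file.b:
#   show_args file.\*    prints the literal pattern file.*
#   show_args file.*     prints file.a and file.b (expanded locally)
```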
If you copy a file without specifying a destination path, it will be saved in your home directory on cluster. Every time you copy a file, the cluster login notice will print. Note that you can use CTRL-D to autocomplete pathnames on cluster even when using scp on fen. However, this is slow and will also cause the cluster login notice to print every time.
Using rsync
rsync allows you to sync the contents of two folders. Only the differences between the two folders are transferred, making this much faster for updating your backups. If you are backing up a large folder of scripts and data files to ringo, you should use rsync rather than tar+scp. -a is archive mode, which preserves file permissions among other attributes, -v is verbose, and -e ssh tells rsync to use ssh as the remote shell.
rsync -ave ssh /some/folder1 ringo.ams.sunysb.edu:/media/backup
rsync -ae ssh large_file.tar.bz2 ringo.ams.sunysb.edu:/media/backup
md5sum large_file.tar.bz2
The first command will cause /media/backup/folder1 to be created on ringo. md5sum can be used to verify the contents of the transferred file: run it on both machines and compare the hashes. I have tested this for a large 76GB backup archive from cluster to ringo, and the hashes matched up. If you set up keypair authentication, you can put rsync in a script for automated offsite backups.
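The hash comparison described above could be scripted roughly as follows. This is a sketch, not a tested backup procedure: same_md5 is a hypothetical helper name, and the usage lines assume md5sum is installed on both machines and that the archive landed at the path shown.

```shell
# same_md5 FILE EXPECTED_HASH: succeed if FILE's MD5 hash equals EXPECTED_HASH.
# (same_md5 is a hypothetical helper name.)
same_md5() {
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

# Usage sketch:
#   remote=$(ssh ringo.ams.sunysb.edu md5sum /media/backup/large_file.tar.bz2 | cut -d ' ' -f 1)
#   same_md5 large_file.tar.bz2 "$remote" && echo "hashes match"
```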
rsync -av --delete reorg_testset ringo.ams.sunysb.edu:/media/sdb1
Note that there should not be a terminating / after the source folder name; otherwise rsync will copy the contents of the folder directly into the target folder instead of creating a new folder there. This command also uses the --delete option, which removes files at the destination that do not exist in the source folder. -v is the verbose option, so the files copied are listed. The -n option performs a dry run, showing the list of files that would be copied and deleted without actually changing anything.
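Because --delete removes files at the destination, it is worth previewing a sync before running it for real. A minimal sketch, assuming rsync is installed; preview_sync is a hypothetical helper name that simply adds -n to the options used above.

```shell
# preview_sync SRC DEST: list what "rsync -av --delete" would copy or
# delete, without changing anything (-n is rsync's dry-run option).
# (preview_sync is a hypothetical helper name.)
preview_sync() {
    rsync -avn --delete "$1" "$2"
}

# Usage:
#   preview_sync reorg_testset ringo.ams.sunysb.edu:/media/sdb1
# Running it once with and once without a trailing / on the source
# shows the difference in where the files would end up.
```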