From Rizzo_Lab
Revision as of 13:54, 11 November 2009 by Sudipto (talk | contribs)

rsync syncs the contents of two folders. Only the differences between the two folders are transferred, which makes it much faster than a full copy when updating your backups. If you are backing up a large folder of scripts and data files to ringo, you should use rsync rather than tar+scp. -a (archive mode) preserves file permissions, -v is verbose, and -e ssh is required to run the transfer over ssh.

rsync -ave ssh /some/folder1
rsync -ae ssh large_file.tar.bz2
md5sum large_file.tar.bz2

This will cause /media/backup/folder1 to be created on ringo. md5sum can be used to verify the contents of the transferred file. I have tested this with a large 76GB backup archive copied from the cluster to ringo, and the hashes matched. If you set up keypair authentication, you can put rsync in a script for automated offsite backups.
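
The hash comparison described above can be sketched as a small shell helper. This is only an illustration: same_md5 is a hypothetical function name, and the scratch files stand in for the real archive, which would normally be hashed once on each machine.

```shell
#!/bin/sh
# Sketch: compare the MD5 hashes of two copies of a file.
# same_md5 is a hypothetical helper, not part of rsync or md5sum.
same_md5() {
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    hash_a=$(md5sum "$1" | cut -d ' ' -f 1)
    hash_b=$(md5sum "$2" | cut -d ' ' -f 1)
    [ "$hash_a" = "$hash_b" ]
}

# Local demonstration with throwaway files standing in for the archive.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'backup data\n' > "$tmpdir/original"
cp "$tmpdir/original" "$tmpdir/copy"
printf 'corrupted\n' > "$tmpdir/damaged"

if same_md5 "$tmpdir/original" "$tmpdir/copy"; then
    echo "copy: hashes match"
fi
if ! same_md5 "$tmpdir/original" "$tmpdir/damaged"; then
    echo "damaged: hashes differ"
fi
```

For a real transfer, run md5sum on the source machine and again on the destination, then compare the two hash strings.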

rsync -av --delete --bwlimit=1000 reorg_testset

Note that there should not be a trailing / after the source folder name; otherwise rsync will dump the contents of the folder into the target folder instead of creating a new folder there. This command also uses the --delete option, which removes files at the destination that do not exist in the source folder. -v is the verbose option, so the files copied are listed. Adding the -n option performs a dry run, showing a list of the files that would be copied and deleted without actually changing anything. --bwlimit=1000 limits bandwidth usage to 1000 KB/s to keep from saturating the network connection.

rsync -av --delete --bwlimit=1000 --delete-excluded --exclude-from=rsync.testset.excludes reorg_testset

This version of the backup command also skips the files matched by the exclude list text file (rsync.testset.excludes). The --delete-excluded option deletes any excluded files that are already present in the target location. A sample exclude list: