How to transfer data
Revision as of 21:36, 22 April 2020
General
Linux and MacOS
While graphical point-and-click transfer programs are available for MacOS and Linux, both operating systems usually come with pre-installed command line transfer tools: scp, rsync, and sftp. These are powerful and convenient tools that can handle any practical data transfer to and from our compute clusters.
scp -- secure copy
- Manual page on-line: http://man7.org/linux/man-pages/man1/scp.1.html
scp copies files between hosts on a network.
It uses ssh for data transfer, and uses the same authentication and provides the same security as ssh.
In practical terms, scp is a minimal but sufficient tool
for securely copying files between network-connected Unix-based computers.
The general format for the command is:
$ scp [options] source destination
The source and destination fields can each be a local or a remote file or directory.
The local location is a normal Unix path, absolute or relative, and
the remote location has the format username@remote.system.name:file/path.
A relative remote path is interpreted relative to the home directory of username on the remote system.
Examples
The commands below are issued on your local computer.
Upload one file data.dat from the current directory on your workstation to your home directory on ARC:
$ scp data.dat username@arc.ucalgary.ca:
Upload several files matching a wildcard from the current directory on your workstation to your home directory on ARC:
$ scp *.dat username@arc.ucalgary.ca:
Upload a directory my_data from the current directory on your workstation into the projects/project2 directory inside your home directory on ARC:
$ scp -r my_data username@arc.ucalgary.ca:projects/project2/
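The scp examples above all upload data. A download uses the same syntax with the source and destination swapped. The sketch below illustrates this; the run helper and the DRY_RUN variable are hypothetical conveniences for this example only (they are not part of scp), and the remote paths are borrowed from the rsync examples on this page. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Hypothetical helper for this example: with DRY_RUN=1 (the default) commands
# are printed instead of executed. Set DRY_RUN=0 to perform the transfers.
DRY_RUN="${DRY_RUN:-1}"
REMOTE="username@arc.ucalgary.ca"   # replace username with your own user name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Download one file from ARC into the current directory (note the final ".").
run scp "$REMOTE:projects/project1/output.dat" .

# Download a whole directory; -r copies it recursively.
run scp -r "$REMOTE:projects/project1/outputs" .
```

Running the script with DRY_RUN=0 performs the actual transfers and prompts for your password unless ssh keys are set up.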
rsync -- Remote SYNCronizer
- Manual page on-line: http://man7.org/linux/man-pages/man1/rsync.1.html
Rsync is a fast and extraordinarily versatile file copying tool.
It can copy locally, or to and from another host over any remote shell.
It is famous for its delta-transfer algorithm, which reduces the amount of data sent over
the network by sending only the differences between the source files and the existing files in the destination.
Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
In practice, rsync is scp on steroids.
It is designed to synchronize two locations, that is, to make them the same.
If a transfer stops for some reason and you restart it, rsync checks the destination
and transfers only what is still needed.
This way, you can conveniently restart a transfer at any moment without losing progress.
scp cannot do this: it copies every file again each time it runs.
The general format for the command is similar to scp:
$ rsync [options] source destination
The source and destination fields can each be a local or a remote file or directory.
The local location is a normal Unix path, absolute or relative, and
the remote location has the format username@remote.system.name:file/path.
A relative remote path is interpreted relative to the home directory of username on the remote system.
Examples
The commands below are issued on your local computer.
Upload one file data.dat from the current directory on your workstation to your home directory on ARC:
$ rsync -v data.dat username@arc.ucalgary.ca:
Upload several files matching a wildcard from the current directory on your workstation to your home directory on ARC:
$ rsync -v *.dat username@arc.ucalgary.ca:
Upload a directory my_data from the current directory on your workstation into the projects/project2 directory inside your home directory on ARC:
$ rsync -axv my_data username@arc.ucalgary.ca:projects/project2/
Upload several directories from the current directory on your workstation into the projects/project2 directory inside your home directory on ARC:
$ rsync -axv my_data1 my_data2 my_data3 username@arc.ucalgary.ca:projects/project2/
Download one file output.dat from ARC to the current directory on your workstation:
$ rsync -v username@arc.ucalgary.ca:projects/project1/output.dat .
Note the "." at the end of the command; it means the current directory.
Download one directory outputs from ARC to the current directory on your workstation:
$ rsync -axv username@arc.ucalgary.ca:projects/project1/outputs .
sftp -- secure file transfer protocol
- Manual page on-line: http://man7.org/linux/man-pages/man1/sftp.1.html
sftp is a file transfer program, similar to ftp,
which performs all operations over an encrypted ssh transport.
It may also use many features of ssh, such as public key authentication and compression.
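sftp is normally used interactively: after connecting with sftp username@arc.ucalgary.ca, commands such as cd, ls, get, put and bye are typed at its prompt. The same commands can also be placed in a batch file and passed to sftp with the -b option. The sketch below prepares such a file; the file names and the remote directory are illustrative assumptions taken from the other examples on this page, and the script only prints the final sftp command rather than connecting.

```shell
#!/bin/bash
# Write the sftp commands to a temporary batch file.
batch="$(mktemp)"
cat > "$batch" <<'EOF'
cd projects/project1
get output.dat
put data.dat
bye
EOF

# Show the batch file and the command that would execute it.
cat "$batch"
echo "sftp -b $batch username@arc.ucalgary.ca"
```

Batch mode cannot ask for a password, so it is intended for use with non-interactive authentication such as ssh public keys.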
Windows
MobaXterm is the recommended tool for remote access and data transfer on Windows.
MobaXterm
- Website: https://mobaxterm.mobatek.net/
MobaXterm is a one-stop solution for most remote access work on a compute cluster or a Unix / Linux server. Along with an SSH client for remote access and an X11 graphics server, it provides a graphical interface for data transfer operations.