Transferring your Files
There are several transfer mechanism for data to Frontera, some of which depend on where and how the data are to be stored. Please review the following transfer mechanisms.
Windows Users
TACC staff recommends the open-source Cyberduck utility for both Windows and Mac users that do not already have a preferred tool.
Click on the "Open Connection" button in the top right corner of the Cyberduck window to open a connection configuration window (as shown below) transfer mechanism, and type in the server name "frontera.tacc.utexas.edu
". Add your username and password in the spaces provided, and if the "more options" area is not shown click the small triangle or button to expand the window; this will allow you to enter the path to your project area so that when Cyberduck opens the connection you will immediately see your data. Then click the "Connect" button to open your connection.
Once connected, you can navigate through your remote file hierarchy using familiar graphical navigation techniques. You may also drag-and-drop files into and out of the Cyberduck window to transfer files to and from Frontera.
Grid Community Toolkit
The Grid Community Toolkit (GCT) is an open-source fork of the Globus Toolkit and was created in response to the end-of-support of the Globus Toolkit in January 2018.
Frontera has one Grid Community Toolkit endpoint. All users may authenticate using the CILogon myproxy authentication. See Using Grid Community Toolkit at TACC for detailed information.
SSH Utilities: scp
& rsync
The scp
and rsync
commands are standard UNIX data transfer mechanisms used to transfer moderate size files and data collections between systems. These applications use a single thread to transfer each file one at a time. The scp
and rsync
utilities are typically the best methods when transferring Gigabytes of data. For larger data transfers, parallel data transfer mechanisms, e.g., Grid Community Toolkit, can often improve total throughput and reliability.
You can transfer files between Frontera and Linux-based systems using either scp
or rsync
. Both scp
and rsync
are available in the Mac Terminal app. Windows SSH clients, such as Cyberduck and Filezilla, typically include scp
-based file transfer capabilities.
Transferring Files with scp
Data transfer from any Linux system can be accomplished using the scp
utility to copy data to and from the login node. A file can be copied from your local system to the remote server by using the command:
localhost% scp filename \ TACC-username@frontera.tacc.utexas.edu:/path/to/project/directory
Consult the scp
man pages for more information:
login1$ man scp
The Linux scp
(secure copy) utility is a component of the OpenSSH suite. Assuming your Frontera username is bjones
, a simple scp
transfer that pushes a file named myfile
from your local Linux system to Frontera $HOME
would look like this:
localhost$ scp ./myfile bjones@frontera.tacc.utexas.edu: # note colon after net address
You can use wildcards, but you need to be careful about when and where you want wildcard expansion to occur. For example, to push all files ending in .txt
from the current directory on your local machine to /work/01234/bjones/scripts
on Frontera:
localhost$ scp *.txt bjones@frontera.tacc.utexas.edu:/work/01234/bjones/frontera
To delay wildcard expansion until reaching Frontera, use a backslash (\
) as an escape character before the wildcard. For example, to pull all files ending in .txt
from /work/01234/bjones/scripts
on Frontera to the current directory on your local system:
localhost$ scp bjones@frontera.tacc.utexas.edu:/work/01234/bjones/frontera/\*.txt .
You can of course use shell or environment variables in your calls to scp
. For example:
localhost$ destdir="/work/01234/bjones/frontera/data" localhost$ scp ./myfile bjones@frontera.tacc.utexas.edu:$destdir
You can also issue scp
commands on your local client that use Frontera environment variables like $HOME
, $WORK
, and $SCRATCH
. To do so, use a backslash (\
) as an escape character before the $
; this ensures that expansion occurs after establishing the connection to Frontera:
localhost$ scp ./myfile bjones@frontera.tacc.utexas.edu:\$WORK/data # Note backslash
Avoid using scp
for recursive transfers of directories that contain nested directories of many small files:
localhost$scp -r ./mydata bjones@frontera.tacc.utexas.edu:\$WORK# DON'T DO THIS
Instead, use tar
to create an archive of the directory, then transfer the directory as a single file:
localhost$ tar cvf ./mydata.tar mydata # create archive localhost$ scp ./mydata.tar bjones@frontera.tacc.utexas.edu:\$WORK # transfer archive
Transferring Files with rsync
The rsync
(remote synchronization) utility is a great way to synchronize files that you maintain on more than one system: when you transfer files using rsync
, the utility copies only the changed portions of individual files. As a result, rsync
is especially efficient when you only need to update a small fraction of a large dataset. The basic syntax is similar to scp
:
localhost$ rsync mybigfile bjones@frontera.tacc.utexas.edu:\$WORK/data localhost$ rsync -avtr mybigdir bjones@frontera.tacc.utexas.edu:\$WORK/data
The options on the second transfer are typical and appropriate when synching a directory: this is a recursive update (-r
) with verbose (-v
) feedback; the synchronization preserves time stamps (-t
) as well as symbolic links and other meta-data (-a
). Because rsync
only transfers changes, recursive updates with rsync
may be less demanding than an equivalent recursive transfer with scp
.
See Good Conduct for additional important advice about striping the receiving directory when transferring large files; watching your quota on $HOME
and $WORK
; and limiting the number of simultaneous transfers. Remember also that $STOCKYARD
(and your $WORK
directory on each TACC resource) is available from several other TACC systems: there's no need for scp
when both the source and destination involve subdirectories of $STOCKYARD
.
The rsync
command is another way to keep your data up to date. In contrast to scp
, rsync
transfers only the actual changed parts of a file (instead of transferring an entire file). Hence, this selective method of data transfer can be much more efficient than scp. The following example demonstrates usage of the rsync
command for transferring a file named "myfile.c
" from its current location on Stampede to Frontera's $DATA
directory.
login1$ rsync myfile.c \ TACC-username@frontera.tacc.utexas.edu:/data/01698/TACC-username/data
An entire directory can be transferred from source to destination by using rsync
as well. For directory transfers the options "-avtr
" will transfer the files recursively ("-r
" option) along with the modification times ("-t
" option) and in the archive mode ("-a
" option) to preserve symbolic links, devices, attributes, permissions, ownerships, etc. The "-v
" option (verbose) increases the amount of information displayed during any transfer. The following example demonstrates the usage of the "-avtr
" options for transferring a directory named "gauss
" from the present working directory on Stampede to a directory named "data
" in the $WORK file system on Frontera.
login1$ rsync -avtr ./gauss \ TACC-username@frontera.tacc.utexas.edu:/data/01698/TACC-username/data
For more rsync
options and command details, run the command "rsync -h
" or:
login1$ man rsync
When executing multiple instantiations of scp
or rsync
, please limit your transfers to no more than 2-3 processes at a time.
Sharing Files with Collaborators
If you wish to share files and data with collaborators in your project, see Sharing Project Files on TACC Systems for step-by-step instructions. Project managers or delegates can use Unix group permissions and commands to create read-only or read-write shared workspaces that function as data repositories and provide a common work area to all project members.