Transferring Data from CHGI

From RCSWiki
Revision as of 21:06, 28 October 2020 by Lleung
Figure: ARC-DTN and CHGI NFS mounts

Due to the large datasets that are currently stored at the Center for Health Genomics and Informatics (CHGI), we have set up a dedicated 10Gbit fibre connection between the ARC DTN and CHGI to help users quickly transfer files as part of the CHGI Transition.

Users needing to migrate their data from CHGI to ARC can do so through the read-only NFS mounts that have been set up on the ARC DTN node. The NFS mounts automatically take advantage of the dedicated 10Gbit fibre connection to help maximize your transfer speed. Access to the NFS filesystems is restricted to authorized CHGI group members. Please contact support@hpc.ucalgary.ca if you have difficulty accessing your files.

Most filesystems at CHGI have been made available under the /external mount point on ARC DTN. Please refer to the following table to help determine your filesystem path on ARC-DTN.

ARC DTN path                      CHGI path
/external/chgihome                /gpfs/home
/external/gpfs/achri_data         /gpfs/achri_data
/external/gpfs/achri_galaxy       /gpfs/achri_galaxy
/external/gpfs/cbousman           /gpfs/cbousman
/external/gpfs/charb_data         /gpfs/charb_data
/external/gpfs/common             /gpfs/common
/external/gpfs/ebg_data           /gpfs/ebg_data
/external/gpfs/ebg_gmb            /gpfs/ebg_gmb
/external/gpfs/ebg_projects       /gpfs/ebg_projects
/external/gpfs/ebg_web            /gpfs/ebg_web
/external/gpfs/ebg_work           /gpfs/ebg_work
/external/gpfs/gallo              /gpfs/gallo
/external/gpfs/qlong              /gpfs/qlong
/external/gpfs/snyder_irida       /gpfs/snyder_irida
/external/gpfs/snyder_work        /gpfs/snyder_work
/external/gpfs/vetmed_data        /gpfs/vetmed_data
/external/gpfs/vetmed_stage       /gpfs/vetmed_stage
/external/tiered/achri_data       /tiered/achri_data
/external/tiered/chgi_data        /tiered/chgi_data
/external/tiered/ebg_mic          /tiered/ebg_mic
/external/tiered/ewang_scratch    /tiered/ewang_scratch
/external/tiered/ewang            /tiered/ewang
/external/tiered/jwasmuth         /tiered/jwasmuth
/external/tiered/kkurek           /tiered/kkurek
/external/tiered/morph            /tiered/morph
/external/tiered/mtgraovac        /tiered/mtgraovac
/external/tiered/parnold          /tiered/parnold
/external/tiered/robbins          /tiered/robbins
/external/tiered/smorrissy        /tiered/smorrissy
/external/tiered/snyder_data      /tiered/snyder_data
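
The table follows a simple pattern: each CHGI path is prefixed with /external, except /gpfs/home, which appears as /external/chgihome. As a rough sketch, that pattern can be captured in a small shell function. The name chgi_to_dtn is hypothetical and not part of any ARC tooling, and because only most CHGI filesystems are exported, a translated path is not guaranteed to exist on ARC DTN.

```shell
# Hypothetical helper (not an ARC-provided tool): translate a CHGI path
# into the corresponding ARC DTN path, following the table above.
chgi_to_dtn() {
    case "$1" in
        /gpfs/home|/gpfs/home/*)
            # CHGI home directories are mounted under /external/chgihome
            printf '%s\n' "/external/chgihome${1#/gpfs/home}"
            ;;
        /gpfs/*|/tiered/*)
            # Everything else keeps its path, prefixed with /external
            printf '%s\n' "/external$1"
            ;;
        *)
            printf 'Not a recognized CHGI path: %s\n' "$1" >&2
            return 1
            ;;
    esac
}
```

For example, chgi_to_dtn /gpfs/common prints /external/gpfs/common.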

To get started with your data migration:

  1. Log in to the ARC DTN via SSH at arc-dtn.ucalgary.ca.
  2. Ensure that you have read permission for the files you are interested in. Verify this by navigating to the filesystem, based on the mapping shown above, and listing or reading your files.
    • If you do not have read permission or have any difficulties, please contact us at support@hpc.ucalgary.ca.
    • If you have read permission, you may follow the instructions at How to transfer data or read the sections below for help on initiating a file transfer.
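
The permission check in step 2 could be scripted roughly as follows, once you are logged in to arc-dtn.ucalgary.ca. This is only a sketch: the function name check_read_access is hypothetical, and it tests a single directory rather than every file beneath it.

```shell
# Hypothetical helper: confirm that a directory on the NFS mount can be
# entered and listed by the current user before starting a transfer.
check_read_access() {
    dir=$1
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ]; then
        # Show the owner, group and permission bits of the directory
        ls -ld "$dir"
    else
        echo "No read access to $dir; contact support@hpc.ucalgary.ca" >&2
        return 1
    fi
}
```

For example, check_read_access /external/gpfs/common prints the directory's permissions if it is readable, and returns a non-zero status otherwise.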

When transferring your files from CHGI to ARC, please remember that:

  • Your ARC home directory has a 500GB quota. If you are transferring files from your CHGI home directory to your ARC home directory, ensure you have sufficient disk space.
  • File access via NFS relies on the CHGI group permissions. Your files at CHGI must be group readable in order for them to be accessible via the NFS mount.
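
One way to spot files that would be blocked by the group-readable requirement is to search for regular files lacking the group read bit. The sketch below assumes a find that accepts symbolic modes with -perm (standard on Linux); the function name list_not_group_readable is hypothetical. Directories also need group read and execute permission to be traversed, which this sketch does not check, and any fix would need to be made on the CHGI side, since the mounts on ARC DTN are read-only.

```shell
# Hypothetical helper: list regular files under a directory that do not
# have the group read permission bit set.
list_not_group_readable() {
    # -perm -g=r matches files whose group-read bit is set; '!' negates it
    find "$1" -type f ! -perm -g=r
}
```

To compare against the 500GB home quota, du -sh followed by a directory name reports the total size of that directory.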

Starting a File Transfer

For large data transfers, we recommend using rsync within a screen or tmux session. This ensures that even when you disconnect from ARC DTN, your transfers will still continue to run.

Transferring files with rsync within screen

rsync is a file transfer utility that copies files from one location to another. Since processes spawned by your session typically terminate when you log out, a terminal multiplexer (such as screen or tmux) is needed to keep them running after you log out. This section shows how to run rsync within a screen session so that your file transfers continue uninterrupted when you log out.

Log in to ARC DTN and start a new screen session:

# Start a new screen session named 'transfer'
$ screen -S transfer

Once within the screen session, you may initiate a file transfer with rsync. If your data at CHGI was originally located at /gpfs/directoryA and you wish to migrate the data to your new work directory at /work/my-group, run the following rsync command:

$ rsync -axv /external/gpfs/directoryA /work/my-group/
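
Because the source path above has no trailing slash, rsync copies the directory itself, so the data ends up in /work/my-group/directoryA. Before committing to a large transfer, it may help to preview what rsync would do. The sketch below adds the -n (--dry-run) flag to the same options; the wrapper name preview_transfer is hypothetical.

```shell
# Hypothetical wrapper: show what rsync would copy without writing anything.
preview_transfer() {
    # -a archive mode, -x stay on one filesystem, -v verbose, -n dry run
    rsync -axvn "$1" "$2"
}
```

For example, preview_transfer /external/gpfs/directoryA /work/my-group/ lists the files that the real command would transfer.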

While the transfer is running, you may detach from the screen session by pressing Ctrl-a followed by d. You may later reattach to this screen session by running:

# List all available screen sessions
$ screen -ls

# Reconnect to the screen session named 'transfer'
$ screen -r transfer

Once your file transfer is complete, you may quit the screen session by running exit within the screen session.
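
To double-check that nothing was missed, one option is to rerun rsync as a dry run with itemized output and confirm that it reports no pending changes. This is a sketch rather than an official procedure: the function name transfer_is_complete is hypothetical, and rsync's default comparison uses file size and modification time, not file contents.

```shell
# Hypothetical helper: succeed only if rsync finds nothing left to copy.
transfer_is_complete() {
    # -n: dry run; -i: itemize each pending change. Empty output means
    # source and destination already match by size and modification time.
    pending=$(rsync -axni "$1" "$2") || return 2
    [ -z "$pending" ]
}
```

For example, transfer_is_complete /external/gpfs/directoryA /work/my-group/ returns success once the destination is up to date.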