Group Storage Allocation FAQ

From RCSWiki

Revision as of 19:08, 14 October 2021

This page provides common questions and answers about the use of /work and /bulk storage on ARC.

General Information

Work and Bulk storage mostly work like any other directories that you have access to on ARC (e.g. your home directory). You can use the standard Linux file system commands within them (ls, cd, cp, mv, rm). You can also refer to them directly by their full path from any node in the cluster. As long as the permissions are set correctly, you can treat these spaces the same way you treat your home directory. Most of the complexity of using Work and Bulk storage on ARC comes from the handling of Linux permissions, which are mostly inconsequential in your home directory. For the examples in the rest of this document we will assume that your group allocation is named "somepi_lab".

Frequently Asked Questions

I can't access my advisor's (or other colleague's) work or bulk directory. Why not?

To access any work or bulk directory on ARC you must belong to the unix group associated with it. The owner of the allocation (or their delegate) can request this for you by emailing support@hpc.ucalgary.ca and asking that you be added to the unix group (the email should include the group name). This can be done at the same time that your ARC account is requested. Once you have been added to the unix group for the group allocation, you may still not be able to access all subdirectories in it, as some groups allow members to keep some data private from other members of the group. The permissions mechanism for this is explained in Linux Permissions.
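
Before emailing support, it can help to confirm which unix groups your account already belongs to. The following is a minimal sketch using the standard id command. The helper name in_group is hypothetical, and somepi_lab is the placeholder allocation name used throughout this page.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the current user belongs to the named unix group.
in_group() {
    local group="$1"
    # "id -nG" prints the user's group names separated by spaces;
    # put one name per line and look for an exact, literal match.
    id -nG | tr ' ' '\n' | grep -qxF -- "$group"
}

# Example usage (somepi_lab is a placeholder group name):
#   if in_group somepi_lab; then echo "member"; else echo "not a member yet"; fi
```

If the check fails for a group you expect to be in, that is the group name to include in the request to support.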

How do I access my work or bulk directory?

Accessing your work or bulk directory is much like accessing your home directory. First, you need to connect to ARC. From there, you will need to navigate to your allocation. If it is a work allocation, this would look something like:

[username@arc ~]$ cd /work/somepi_lab
[username@arc somepi_lab]$ ls -lh
total 4.7G
-rw-r--r-- 1 username somepi_lab 2.4G Feb  3 10:13 A.csv
-rw-r--r-- 1 otheruser somepi_lab 2.4G Feb  3 10:16 B.csv
[username@arc somepi_lab]$ cp B.csv ~/myData

Here we have changed our current working directory to the work directory, examined its contents, and copied a file (created by another user in the group) back to our home directory. This is only possible if we belong to the group somepi_lab. We don't have to copy files back to a home directory to work on them. More likely, we will have a subdirectory as a personal workspace under /work/somepi_lab.
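
A personal workspace of this kind could be created as sketched below. This is only an illustration: the helper name make_workspace is hypothetical, and the allocation/username layout is an assumption rather than a layout ARC enforces. Permissions on the new directory are not handled here.

```bash
#!/bin/bash
# Hypothetical helper: create a per-user workspace under a group allocation
# and print its path. The <allocation>/<username> layout is an assumption.
make_workspace() {
    local allocation="$1"            # e.g. /work/somepi_lab
    local user="${2:-$(id -un)}"     # defaults to the current user name
    local dir="$allocation/$user"
    # -p makes the call safe to repeat if the directory already exists.
    mkdir -p -- "$dir" && printf '%s\n' "$dir"
}

# Example usage:
#   cd "$(make_workspace /work/somepi_lab)"
```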

How do I reference a work or bulk directory from a job running on ARC?

A work or bulk directory can be referenced from a job script just like your home directory. The work and bulk directories are accessible from every compute node and nothing special needs to be done to write to them (beyond managing permissions). A job script could be something like:

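#!/bin/bash
# Resource requests: one task with 2 CPUs and 1000 MB of memory
# on one node of the "single" partition, for up to 2 hours.
#SBATCH --partition=single
#SBATCH --time=2:0:0
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=1000

# Put the group's shared Anaconda installation first on the PATH (absolute path).
export PATH=/work/somepi_lab/software/anaconda3/bin:$PATH
# Work from a personal subdirectory of the group allocation.
cd /work/somepi_lab/username/examples/1

# Relative path, resolved against the directory set by cd above.
python ./scripts/matmul_test.py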

Here both absolute and relative path references to the work directory are used without issue from within a job.
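
Because writing to a group allocation depends on permissions, a job can check its working directory before starting any real work. The guard below is a sketch under that assumption. The function name require_writable_dir is hypothetical and is not part of Slurm or of ARC's tooling.

```bash
#!/bin/bash
# Hypothetical guard: fail early if a directory is missing or not writable
# by the user running the job.
require_writable_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "error: $dir does not exist or is not a directory" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "error: $dir is not writable by $(id -un)" >&2
        return 1
    fi
}

# Example usage near the top of a job script:
#   require_writable_dir /work/somepi_lab/username/examples/1 || exit 1
#   cd /work/somepi_lab/username/examples/1
```

Failing at this point gives a clear message in the job output instead of an error partway through the computation.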

How do I transfer data to a work or bulk directory directly from a personal workstation?

Data transfers to your group allocation can be done in the same manner as data transfers to your home directory. Please review the article on data transfers: How to transfer data. The only difference is that you need to give the full path to the work or bulk directory explicitly; you can't rely on the ~ shorthand or on a relative path that is resolved from your home directory. For example, a transfer to a data directory in your home directory might be:

desktop$ rsync -axv my_data1 my_data2 my_data3 username@arc-dtn.ucalgary.ca:"~/data"

Whereas in your work directory it would look like:

desktop$ rsync -axv my_data1 my_data2 my_data3 username@arc-dtn.ucalgary.ca:"/work/somepi_lab/data"
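
Since the destination must be an absolute path, a small wrapper on the workstation can catch a relative path before any data moves. This is a sketch only: remote_target is a hypothetical helper, and the user name, host, and path in the usage comment are the placeholders from the examples above.

```bash
#!/bin/bash
# Hypothetical helper: build an rsync destination for a group allocation,
# refusing any path that is not absolute.
remote_target() {
    local user="$1" host="$2" path="$3"
    case "$path" in
        /*) printf '%s@%s:%s\n' "$user" "$host" "$path" ;;
        *)  echo "error: '$path' is not an absolute path" >&2
            return 1 ;;
    esac
}

# Example usage: -n (--dry-run) lists what would be transferred without copying.
#   dest="$(remote_target username arc-dtn.ucalgary.ca /work/somepi_lab/data)" &&
#       rsync -axvn my_data1 my_data2 my_data3 "$dest"
```

Dropping the n from the options performs the actual transfer once the dry-run listing looks right.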

How do I transfer data to a work or bulk directory from my home directory?

How do Linux permissions work for sharing data in a work or bulk directory?

My colleague has opened up a directory for me to access. Why can't I use ls to look inside it?

I have access to two group storage allocations. How do I move data between them?

None of my colleagues can read files that I create in my work or bulk directory. What is going on?

How do I share data with another colleague on ARC without adding them to the unix group for my allocation?