Group Storage Allocation FAQ
This page answers common questions about the use of /work and /bulk storage on ARC.
Work and Bulk storage mostly work like any other directory that you have access to on ARC (e.g. your home directory). You can use the standard Linux file system commands, such as rm, within them. You can also refer to them directly by their full path from any node in the cluster. As long as the permissions are set correctly, you can treat these spaces the same way you treat your home directory. Most of the complexity of using Work and Bulk storage on ARC comes from handling Linux permissions, which rarely matter in your home directory. The examples in the rest of this document assume that your group allocation is named "somepi_lab".
Frequently Asked Questions
I can't access my advisor's (or other colleague's) work or bulk directory. Why not?
To access any work or bulk directory on ARC, you must belong to the unix group associated with it. The owner (or their delegate) can request this for you by emailing email@example.com and asking that you be added to the unix group; the request should include the group name. This can be done at the same time that your ARC account is requested. Once you have been added to the unix group for the group allocation, you may still not be able to access every subdirectory in it, as some groups allow members to keep some data private from other members of the group. The permissions mechanism for this is explained in Linux Permissions.
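If you are unsure whether you have already been added, you can check your group membership from a login session. The following is a minimal sketch, not an official ARC tool: the helper name in_group is hypothetical, and "somepi_lab" is the example group name used on this page. Note that on most Linux systems a newly added group only shows up in sessions started after the change, so you may need to log out and back in.

```shell
# in_group GROUP [USER]: succeed if USER belongs to GROUP.
# With no USER, the groups of the current session are checked.
# (Hypothetical helper name; not an ARC-provided command.)
in_group() {
    group="$1"
    if [ -n "${2:-}" ]; then
        names=$(id -nG "$2")   # groups recorded for the named user
    else
        names=$(id -nG)        # groups active in this session
    fi
    # id -nG prints group names separated by spaces; match one whole name exactly
    printf '%s\n' "$names" | tr ' ' '\n' | grep -qxF -- "$group"
}

# Example usage (commented out so the snippet is safe to paste anywhere):
# if in_group somepi_lab; then
#     echo "you are a member of somepi_lab"
# else
#     echo "not a member yet; ask the allocation owner to request access"
# fi
```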
How do I access my work or bulk directory?
Accessing your work or bulk directory is much like accessing your home directory. First, connect to ARC, then navigate to your allocation. For a work allocation, this would look something like:
[username@arc ~]$ cd /work/somepi_lab
[username@arc somepi_lab]$ ls -lh
total 4.7G
-rw-r--r-- 1 username somepi_lab 2.4G Feb 3 10:13 A.csv
-rw-r--r-- 1 otheruser somepi_lab 2.4G Feb 3 10:16 B.csv
[username@arc somepi_lab]$ cp B.csv ~/myData
Here, we have changed our current working directory to the Work directory, examined its contents, and copied a file (created by another user in the group) back to our home directory. This is only possible if we belong to the group somepi_lab. We don't have to copy files back to a home directory to work on them. More likely, we will have a subdirectory as a personal workspace under
How do I reference a work or bulk directory from a job running on ARC?
A work or bulk directory can be referenced from a job script just like your home directory. The work and bulk directories are accessible from every compute node, and nothing special needs to be done to write to them (beyond managing permissions). A job script could look something like:
#!/bin/bash
#SBATCH --partition=single
#SBATCH --time=2:0:0
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=1000
export PATH=/work/somepi_lab/software/anaconda3/bin:$PATH
cd /work/somepi_lab/username/examples/1
python ./scripts/matmul_test.py
Here, both absolute and relative path references to the work directory are used without issue from within a job.
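Because permissions are the most common source of trouble, it can help to have a job fail early, with a clear message, if the target directory is missing or not writable. The following is a minimal sketch of such a guard, assuming a POSIX-compatible shell; the function name require_writable_dir is hypothetical, and the path in the usage comment is the example path from the job script above.

```shell
# require_writable_dir DIR: succeed only if DIR exists and files can be created in it.
# (Hypothetical helper name; not part of Slurm or ARC.)
require_writable_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "error: $dir does not exist or is not a directory" >&2
        return 1
    fi
    # Creating files needs both write and search (execute) permission on the directory
    if [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "error: no write access to $dir (check group membership and permissions)" >&2
        return 1
    fi
    return 0
}

# Example usage near the top of a job script (commented out here):
# require_writable_dir /work/somepi_lab/username/examples/1 || exit 1
# cd /work/somepi_lab/username/examples/1
```

Placing the check before the first cd means a permissions problem is reported at the start of the job rather than after the computation has run.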
How do I transfer data to a work or bulk directory directly from a personal workstation?
Data transfers to your group allocation can be done in the same manner as data transfers to your home directory. Please review the article on data transfers: How to transfer data. The only difference is that you need to give the full path to the work or bulk directory explicitly; you can't rely on a shortcut like ~ or on a relative path being resolved from ~. For example, a transfer to a data directory in your home directory might be:
desktop$ rsync -axv my_data1 my_data2 my_data3 firstname.lastname@example.org:"~/data"
Whereas in your work directory it would look like:
desktop$ rsync -axv my_data1 my_data2 my_data3 email@example.com:"/work/somepi_lab/data"
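If you script your transfers, a small check can catch a destination that accidentally relies on ~ or a relative path. This is only a sketch under stated assumptions: the helper name remote_dest is hypothetical, and the login argument stands for whatever user and host you normally use to connect to ARC.

```shell
# remote_dest LOGIN PATH: print "LOGIN:PATH" for use as an rsync destination,
# but refuse any PATH that is not absolute (for example "~/data" or "data").
# (Hypothetical helper name; LOGIN is your usual user@host for ARC.)
remote_dest() {
    login="$1"
    path="$2"
    case "$path" in
        /*)
            printf '%s:%s\n' "$login" "$path"
            ;;
        *)
            echo "error: destination must be an absolute path such as /work/somepi_lab/data" >&2
            return 1
            ;;
    esac
}

# Example usage on your workstation (commented out; set login to your own user@host):
# dest=$(remote_dest "$login" /work/somepi_lab/data) || exit 1
# rsync -axv --dry-run my_data1 my_data2 my_data3 "$dest"   # preview only
# rsync -axv my_data1 my_data2 my_data3 "$dest"             # actual transfer
```

The --dry-run option makes rsync list what it would transfer without copying anything, which is a convenient way to confirm the destination path before moving a large dataset.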