How to request an interactive GPU on ARC

From RCSWiki
Jump to navigation Jump to search

GPUs on ARC

Currently, there are three partitions on ARC that have usable CUDA GPUs, gpu-v100, gpu-a100, and bigmem.

The arc.hardware tool can be used to see that.

$ arc.hardware

Node specifications per partition:
      ================================================================================
           Partition |   Node    CPUs    Memory        GPUs  Node list
                     |  count   /node      (MB)       /node  
      --------------------------------------------------------------------------------
              bigmem |      2      80   3000000              fm[1-2]
                     |      1      40   4127000      a100:4  mmg1
                     |      1      40   8256000      a100:2  mmg2
      .........
            gpu-v100 |     13      40    753000      v100:2  fg[1-13]
      .........

For more information about finding hardware on ARC, see How to find available partitions on ARC.

Example 1

1 GPU and 4 CPUs on the gpu-v100 partition for 1 hour (16 GB of RAM):

salloc -N1 -n1 -c4 --mem=16GB --gres=gpu:1 -p gpu-v100 -t 1:00:00 

Example 2

1 GPU and 4 CPUs on the bigmem partition for 1 hour (256 GB of RAM):

$ salloc -N1 -n1 -c4 --mem=256gb --gres=gpu:1 -p bigmem -t 1:00:00

Checking the allocated GPU(s)

Use the nvidia-smi command to check the GPU:

$ nvidia-smi
Fri Jun  3 11:35:14 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:17:00.0 Off |                    0 |
| N/A   39C    P0    42W / 250W |      0MiB / 40536MiB |     32%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Links

How-Tos