How to request an interactive GPU on ARC

Latest revision as of 22:45, 20 September 2023

GPUs on ARC

Currently, three partitions on ARC have usable CUDA GPUs: gpu-v100, gpu-a100, and bigmem.

The arc.hardware tool lists the node specifications for each partition, including the number and model of GPUs per node:

$ arc.hardware

Node specifications per partition:
      ================================================================================
           Partition |   Node    CPUs    Memory        GPUs  Node list
                     |  count   /node      (MB)       /node  
      --------------------------------------------------------------------------------
              bigmem |      2      80   3000000              fm[1-2]
                     |      1      40   4127000      a100:4  mmg1
                     |      1      40   8256000      a100:2  mmg2
      .........
            gpu-v100 |     13      40    753000      v100:2  fg[1-13]
      .........

For more information about finding hardware on ARC, see How to find available partitions on ARC.
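
To narrow the listing down to the nodes that carry GPUs, the output can be filtered on the GPUs/node column. The following is a minimal sketch, assuming the table keeps the layout shown above (a `|` after the partition name and GPU entries of the form `model:count`, such as `v100:2`). The function name `gpu_nodes` is hypothetical and is not part of arc.hardware.

```shell
# Hypothetical helper: keep only the arc.hardware rows that list GPUs.
# Prints "<partition> <gpu model:count> <node list>" for each matching row.
gpu_nodes() {
    awk -F'|' '
        NF == 2 {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name != "") part = name      # continuation rows reuse the last partition name
            n = split($2, f, " ")            # split the right-hand side on whitespace
            for (i = 1; i <= n; i++)
                if (f[i] ~ /^[a-zA-Z0-9]+:[0-9]+$/)
                    print part, f[i], f[n]   # f[n] is the node list column
        }'
}

# Usage on ARC (not run here):
#   arc.hardware | gpu_nodes
```

Because the partition name is carried forward, the rows for mmg1 and mmg2 are reported under bigmem even though their own rows leave the partition column empty.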

Example 1

1 GPU and 4 CPUs on the gpu-v100 partition for 1 hour (16 GB of RAM):

$ salloc -N1 -n1 -c4 --mem=16GB --gres=gpu:1 -p gpu-v100 -t 1:00:00

Example 2

1 GPU and 4 CPUs on the bigmem partition for 1 hour (256 GB of RAM):

$ salloc -N1 -n1 -c4 --mem=256gb --gres=gpu:1 -p bigmem -t 1:00:00
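
Both examples use the same set of salloc options and differ only in the partition and the memory request. As a sketch of that pattern, the hypothetical helper below (`gpu_salloc_cmd` is an illustrative name, not an ARC tool) prints the command for a given partition, GPU count, CPU count, memory, and time limit without running it. The defaults are simply the values from Example 1.

```shell
# Hypothetical helper: print (do not run) an salloc command for an interactive GPU job.
# Arguments: partition, GPU count, CPU count, memory, time limit.
# The body runs in a subshell so the variables do not leak into the caller.
gpu_salloc_cmd() (
    partition=${1:-gpu-v100}
    gpus=${2:-1}
    cpus=${3:-4}
    mem=${4:-16GB}
    walltime=${5:-1:00:00}
    printf 'salloc -N1 -n1 -c%s --mem=%s --gres=gpu:%s -p %s -t %s\n' \
        "$cpus" "$mem" "$gpus" "$partition" "$walltime"
)

gpu_salloc_cmd                      # prints the command from Example 1
gpu_salloc_cmd bigmem 1 4 256gb     # prints the command from Example 2
```

Passing gpu-a100 as the first argument would target the third partition. Its per-node limits are not shown in the listing above, so check arc.hardware before choosing the CPU and memory values.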

Checking the allocated GPU(s)

Once the allocation has started, use the nvidia-smi command to check which GPU was assigned:

$ nvidia-smi
Fri Jun  3 11:35:14 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:17:00.0 Off |                    0 |
| N/A   39C    P0    42W / 250W |      0MiB / 40536MiB |     32%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
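
For a more compact check than the full table, nvidia-smi can print selected fields as CSV through its `--query-gpu` and `--format` options. The sketch below wraps that call in a hypothetical function, `show_allocated_gpus`, and also prints `CUDA_VISIBLE_DEVICES`, which Slurm typically sets for jobs that request GPUs with `--gres`. Whether that variable is set on ARC is an assumption here, so the function only reports it.

```shell
# Hypothetical helper: print one CSV line per GPU visible to the current job.
show_allocated_gpus() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: is this shell running on a GPU node?" >&2
        return 1
    fi
    # Typically set by Slurm for GPU jobs; shown as <unset> otherwise.
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
    # Fields: GPU index, model name, total memory, current utilization.
    nvidia-smi --query-gpu=index,name,memory.total,utilization.gpu --format=csv,noheader
}

# Usage inside an allocation (not run here):
#   show_allocated_gpus
```

If the function prints one line per requested GPU, the allocation matches the `--gres=gpu:N` value passed to salloc.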

Links

How-Tos