How to check GPU utilization
Latest revision as of 20:28, 21 September 2023
General
For Running Jobs
Using SLURM
If you have a job running on a GPU node that is expected to use a GPU on that node, you can check how much of the GPU your code is using by running the following command on ARC's login node:
$ srun -s --jobid 12345678 --pty nvidia-smi
Replace 12345678 with the job ID of your running job.
The output should look similar to the following:
Mon Aug 22 09:27:38 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   33C    P0    36W / 250W |    848MiB / 16160MiB |     30%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   2232533      C   .../Programs/OpenDBA/openDBA      338MiB |
+-----------------------------------------------------------------------------+
In this case, one GPU was allocated and its utilization was 30%.
The openDBA process also uses 338 MiB of the GPU memory.
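If you only need the utilization and memory figures, nvidia-smi can print them as CSV through its --query-gpu and --format options. The sketch below is one possible way to turn that CSV into a one-line summary per GPU. The function name summarize_gpu_csv is hypothetical, and the commented srun line assumes the same job ID placeholder as above, with --pty dropped because the output is piped. The last line feeds in the values from the sample output above so the function can be tried without a GPU.

```shell
#!/bin/bash
# Hypothetical helper: summarize CSV lines of the form
#   index, utilization.gpu, memory.used, memory.total
# as produced (one line per GPU) by:
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
summarize_gpu_csv() {
    # Fields are separated by a comma followed by optional spaces.
    awk -F', *' '{ printf "GPU %s: %s%% utilization, %s / %s MiB memory\n", $1, $2, $3, $4 }'
}

# Assumed usage on the login node (12345678 is a placeholder job ID):
#   srun -s --jobid 12345678 nvidia-smi \
#       --query-gpu=index,utilization.gpu,memory.used,memory.total \
#       --format=csv,noheader,nounits | summarize_gpu_csv

# Offline demonstration using the numbers from the sample output:
echo "0, 30, 848, 16160" | summarize_gpu_csv
```

A single reading is only a snapshot, so it may be worth repeating the query a few times while the job runs before drawing conclusions about GPU use.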
For Past Jobs
Using Sampled Metrics
Log in to the Open OnDemand (OOD) portal at https://ood-arc.rcs.ucalgary.ca using your UofC credentials (it may require your full UofC email address instead of just your user name).
Once you are logged in:
- Select Help --> View my job metrics. This opens an interface to your past jobs that are available in the database.
- Find the job you want to check. Its entry may include useful graphs of GPU usage.