How to find detailed information about past jobs on ARC
Background
Using Sampled Metrics
On the Open OnDemand (OOD) portal, https://ood-arc.rcs.ucalgary.ca, log in using your UofC credentials (you may need to enter your full UofC email address instead of just your user name).
Once you are logged in:
- Select Help --> View my job metrics.
- This opens an interface to those of your past jobs that are available in the database.
- Find the job you want to check; its entry may include useful graphs of CPU, memory, and (if applicable) GPU usage.
How to use the database
If you see a job you want to examine in more detail:
- Click its Job ID link. This will take you to the job's metrics record.
- The top of the page shows a summary of the requested and used resources.
- Below the summary, graphs of resource usage over the run time are shown under the "Job Resource Utilization by Node" title: CPU usage, memory usage, and, if GPUs were requested, GPU usage.
- Below the graphs, more details are available, split by job step, under the "Job information by steps" title.
- Click the batch on xxx tab to see more details of the batch job step. This is normally the most important part of the run.
- The job step details show CPU usage, memory usage, and GPU utilization for that job step.
- Click a specific time slice on the utilization graphs to see, in the tables below the graphs, more details for that moment:
  - Process list;
  - Opened files;
  - Information about GPU utilization.
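The requested and used figures from the summary at the top of the job's metrics record can be turned into a rough efficiency estimate when deciding what to request for the next run. The following is a minimal sketch, assuming the numbers are copied by hand from the page; the function name, the variable names, and the sample values are hypothetical and are not something the portal provides.

```python
def efficiency(used: float, requested: float) -> float:
    """Return used/requested as a percentage, or 0.0 if nothing was requested."""
    if requested <= 0:
        return 0.0
    return 100.0 * used / requested


# Hypothetical values copied by hand from a job's summary (assumption:
# the summary reports something comparable to these four numbers).
requested_cpus = 8
average_cpus_used = 2.0
requested_mem_gb = 32.0
peak_mem_gb = 6.5

cpu_eff = efficiency(average_cpus_used, requested_cpus)
mem_eff = efficiency(peak_mem_gb, requested_mem_gb)

print(f"CPU efficiency:    {cpu_eff:.1f}%")
print(f"Memory efficiency: {mem_eff:.1f}%")
```

Low percentages suggest that a similar job could probably request fewer CPUs or less memory, although it is usually wise to leave some headroom above the observed peak memory.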