MARC Cluster Status: Difference between revisions

From RCSWiki
(Created page with "{{MARC Cluster Status}} == System Messages == {{Message of the day item | title = ⚠️ January System Updates | date = 2023/01/01 | message = Beginning January 23, 2023, the ARC cluster will undergo operating system updates. We shall do our utmost to minimize disruption and allow ongoing jobs to be completed. New jobs may be temporarily held from scheduling. The ARC login node will reboot on the morning of January 16. Please save your work and log out if possible. T...")
 
No edit summary
 
(3 intermediate revisions by 2 users not shown)
Line 6: Line 6:
| date = 2023/01/01
| date = 2023/01/01
| message =
| message =
Beginning January 23, 2023, the ARC cluster will undergo operating system updates. We shall do our utmost to minimize disruption and allow ongoing jobs to be completed. New jobs may be temporarily held from scheduling.
Beginning January 23, 2023, the MARC cluster will undergo operating system updates. We shall do our utmost to minimize disruption and allow ongoing jobs to be completed. New jobs may be temporarily held from scheduling.


The ARC login node will reboot on the morning of January 16. Please save your work and log out if possible.
The MARC login node will reboot on the morning of January 23. Please save your work and log out if possible.


The upgrade is planned to be fully complete by January 20.
The upgrade is planned to be fully complete by January 27.


If you encounter any system issues, do not hesitate to let us know.
If you encounter any system issues, do not hesitate to let us know.
Line 30: Line 30:


Thank you for your cooperation.
Thank you for your cooperation.
}}
{{Message of the day item
| title = Apptainer (Singularity) on MARC Login Node
| date = 2023/06/23
| message =
Apptainer (Singularity) containers may experience an error when
running on the MARC login node. If apptainer complains that a system
administrator needs to enable user namespaces, simply run your
containers inside a job.
This is a temporary measure due to a security vulnerability that will be
patched soon.
}}
{{Message of the day item
| title = Storage Upgrade MARC/ARC cluster
| date = 2023/10/23
| message =
We will be performing storage upgrades on the MARC/ARC cluster on
November 16 and 17, 2023. To facilitate this, we will be throttling
down the number of jobs on both clusters while the upgrades are
performed.
}}
{{Message of the day item
| title = OS Upgrade MARC/ARC cluster
| date = 2024/09/11
| message =
MARC will be going down for OS upgrades on 2024/Sep/16. The cluster
will be unavailable temporarily to complete this work. Please contact
support@hpc.ucalgary.ca if you have any questions or concerns.
}}
}}

Latest revision as of 16:50, 11 September 2024

MARC status: Cluster operational


No upgrades planned. Please contact us if you experience system issues.

See the MARC Cluster Status page for system notices.

System Messages

⚠️ January System Updates - 2023/01/01

Beginning January 23, 2023, the MARC cluster will undergo operating system updates. We shall do our utmost to minimize disruption and allow ongoing jobs to be completed. New jobs may be temporarily held from scheduling.

The MARC login node will reboot on the morning of January 23. Please save your work and log out if possible.

The upgrade is planned to be fully complete by January 27.

If you encounter any system issues, do not hesitate to let us know.

Thank you for your cooperation.

System Updates Completed - 2023/01/24

The upgrade has been completed. The following changes were made:
  • OS updated to Rocky Linux 8.7
  • Slurm updated to 22.05.7
  • Apptainer replaces Singularity
  • Each job now has its own /tmp, /dev/shm, and /run/user/$uid mounted

If you encounter any system issues, do not hesitate to let us know.

Thank you for your cooperation.


Apptainer (Singularity) on MARC Login Node - 2023/06/23

Apptainer (Singularity) containers may fail with an error when run on the MARC login node. If Apptainer reports that a system administrator needs to enable user namespaces, run your containers inside a job instead.

This is a temporary measure due to a security vulnerability that will be patched soon.
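
The advice above (run containers inside a job rather than on the login node) could be sketched as follows. This is a minimal illustration, not an official MARC recipe: the helper name `render_apptainer_job`, the image name `my_image.sif`, and the resource values are hypothetical. The `#SBATCH` options and `apptainer exec` are standard Slurm and Apptainer usage, but any partition or account settings required on MARC are not shown and would need to be added.

```python
import shlex


def render_apptainer_job(image, command, time_limit="01:00:00", mem="4G", cpus=1):
    """Return the text of a Slurm batch script that runs `command` inside `image`.

    All default values are illustrative assumptions, not MARC-specific settings.
    """
    # Quote each argument so paths or arguments containing spaces survive the shell.
    quoted_command = " ".join(shlex.quote(part) for part in command)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --cpus-per-task={cpus}",
        "",
        "# Per the notice above, the container is started from inside a job.",
        f"apptainer exec {shlex.quote(image)} {quoted_command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image and command, used only to show the generated script.
    print(render_apptainer_job("my_image.sif", ["python3", "--version"]), end="")
```

Saving the printed script to a file (for example `job.slurm`, a hypothetical name) and submitting it with `sbatch job.slurm` would run the container inside a job instead of on the login node.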

Storage Upgrade MARC/ARC cluster - 2023/10/23

We will be performing storage upgrades on the MARC and ARC clusters on November 16 and 17, 2023. To facilitate this, we will be throttling down the number of jobs on both clusters while the upgrades are performed.

OS Upgrade MARC/ARC cluster - 2024/09/11

MARC will be going down for OS upgrades on September 16, 2024. The cluster will be temporarily unavailable while this work is completed. Please contact support@hpc.ucalgary.ca if you have any questions or concerns.