RCS Software Management Group
Latest revision as of 20:54, 14 October 2021
General
Topics to discuss
Date: May 27th 2021
- Singularity container management. Installation standard.
Present:
Dmitri Rozmanov
Tannistha Nandi
Topics Discussed
- Test phase: build Singularity containers for research software on ARC.
- Make them available to users as a central module.
- The Singularity container for a package will be kept in the same directory as the software under /global/software.
- Evaluate the performance of familiar software (e.g. GROMACS), container vs. non-container version.
- Need to come up with a standard format for modulefiles.
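As a rough illustration of what a centrally provided container might look like to users, the sketch below wraps `singularity exec` in a shell function. The function name `run_in_container` and the `SINGULARITY` override variable are hypothetical and not part of any agreed standard; the image path in the usage note is also only a placeholder.

```shell
#!/bin/sh
# Sketch only: run a command inside a Singularity image.
# SINGULARITY can be overridden (e.g. with "echo") for a dry run; this
# variable and the function name are assumptions made for illustration.
SINGULARITY="${SINGULARITY:-singularity}"

run_in_container() {
    # $1 = path to the image file, remaining arguments = command to run inside it
    if [ "$#" -lt 2 ]; then
        echo "usage: run_in_container IMAGE COMMAND [ARGS...]" >&2
        return 1
    fi
    image="$1"
    shift
    "$SINGULARITY" exec "$image" "$@"
}
```

A module could then expose a thin wrapper such as `run_in_container /global/software/<package>/<version>/<image>.sif gmx --version`, where the image location is a placeholder to be fixed by the installation standard.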
RCS Standards
Installation tree standard
- Naming conventions
- /global/software/openmpi/gnu-8.3.1/3.1.2-opa/
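The example path above suggests a layout of package name, then compiler tag, then version tag. A minimal sketch of composing such a prefix, assuming that three-level layout holds in general (the function name `install_prefix` and the argument order are hypothetical):

```shell
#!/bin/sh
# Sketch only: build an install prefix that follows the example path above.
# The three-level layout is inferred from a single example, not from a
# finalized standard.
SOFTWARE_ROOT=/global/software

install_prefix() {
    # $1 = package name, $2 = compiler tag (e.g. gnu-8.3.1),
    # $3 = version tag (e.g. 3.1.2-opa)
    if [ "$#" -ne 3 ]; then
        echo "usage: install_prefix NAME COMPILER VERSION" >&2
        return 1
    fi
    printf '%s/%s/%s/%s/\n' "$SOFTWARE_ROOT" "$1" "$2" "$3"
}
```

For instance, `install_prefix openmpi gnu-8.3.1 3.1.2-opa` reproduces the example path.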
Module file standard
- What environment variables to set.
- How to provide dynamic library functionality.
How to build software
- How to point to external libraries at build time.
- Building static executables.
Minutes
Meeting, 2021-10-14
- Present: Ian, Dmitri
- Conda install processes on the login node take too many resources and run for too long.
- New OpenMPI modules. Obsolete the old modules. OPA module auto-detection.
One way to find an Omni-Path card in a node:
[drozmano@fc1 ~]$ lspci -nn | grep Omni
5e:00.0 Fabric controller [0208]: Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] [8086:24f0] (rev 11)
The lsmod approach also seems to work:
[drozmano@fc1 ~]$ /usr/sbin/lsmod | grep opa
opa_vnic 32768 0
ib_core 385024 14 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,ib_srpt,iw_cm,ib_iser,ib_umad,ib_isert,hfi1,rdma_ucm,ib_uverbs,ib_cm,rdmavt
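The two checks above could be combined into one helper for OPA auto-detection. The following is only a sketch: the function names are hypothetical, and the match patterns simply mirror the `grep` commands shown above rather than any tested detection rule.

```shell
#!/bin/sh
# Sketch only: report whether a node appears to have an Omni-Path adapter.
# has_opa_pci and has_opa_module read command output on stdin, so they can
# also be tried against saved output from another node.

has_opa_pci() {
    # Same idea as `lspci -nn | grep Omni` above.
    grep -q 'Omni-Path'
}

has_opa_module() {
    # Same idea as `lsmod | grep opa` above.
    grep -q 'opa'
}

detect_opa() {
    # Try lspci first, then fall back to lsmod; print "yes" or "no".
    if command -v lspci >/dev/null 2>&1 && lspci -nn 2>/dev/null | has_opa_pci; then
        echo yes
    elif [ -x /usr/sbin/lsmod ] && /usr/sbin/lsmod 2>/dev/null | has_opa_module; then
        echo yes
    else
        echo no
    fi
}
```

A modulefile or wrapper script could call `detect_opa` once and choose between an OPA and a non-OPA OpenMPI build based on the answer; how that choice is wired into the modules is still open.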
Meeting, 2021-05-27
- Present:
Meeting, 2021-04-01
Present:
- Dmitri
- Ian
Topics:
- Ansys Fluent licenses on CC (limit on the number of licenses?).
- Organizational issues and ideas.
Ansys Licenses
- The limited number of Ansys licenses makes using them on CC a potential issue.
- We used to disallow this on CC and on the parallel nodes, but we have become lax about it.
- Going forward we will allow it, even though it is not efficient.
- We are against intervening in license disputes; just let people run lmutils.
- If we made the RCS-purchased licenses strictly TRES we could track them in SLURM, but this could introduce blocking/fragmentation issues.
- For now, just note the potential license issues in the wiki, along with the use of lmutils to identify who is using the licenses.
Tracking module usage to enable removal of clutter:
- Add code to the module files that records the module name and the ID of the job it was loaded in, using touch on a file in the /global/software/var directory.
- Create one file per job, and have a cron job roll them all up into a single DB every hour.
- Need to update the whatis text for the modules. What do we want it to be used for?
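The tracking idea could be prototyped in shell before it is moved into the module files. This is a sketch under several assumptions: one `<jobid>.modules` file per job with one module name per line, a flat text file standing in for the "DB", and hypothetical function names. `SLURM_JOB_ID` is the variable Slurm sets inside a job.

```shell
#!/bin/sh
# Sketch only: record module loads per job and roll them up periodically.
# The file naming scheme and the roll-up format are assumptions.
USAGE_DIR="${USAGE_DIR:-/global/software/var}"

record_module_use() {
    # $1 = module name, e.g. openmpi/3.1.2-opa
    # Loads outside a job are filed under "nojob".
    jobid="${SLURM_JOB_ID:-nojob}"
    # Errors are ignored on purpose: tracking must never break a module load.
    { printf '%s\n' "$1" >> "${USAGE_DIR}/${jobid}.modules"; } 2>/dev/null || true
}

rollup_module_use() {
    # $1 = roll-up file; appends "jobid module" lines, then removes the
    # per-job files that were processed.
    out="$1"
    for f in "$USAGE_DIR"/*.modules; do
        [ -e "$f" ] || continue
        jobid="${f##*/}"
        jobid="${jobid%.modules}"
        awk -v j="$jobid" '{ print j, $0 }' "$f" >> "$out"
        rm -f -- "$f"
    done
}
```

An hourly cron entry would then only need to call `rollup_module_use` with the path of the roll-up file. A load that happens between the `awk` and `rm` steps would be lost, so a real version would need to move the file aside before reading it.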
Meeting, 2021-03-18
Participants:
- David Schulz
- Dmitri Rozmanov
- Ian Percel
- Tannistha Nandi
Minutes of the meeting:
Things to prepare before the next meeting in two weeks' time:
- Disable OpenMPI 1.6 (Dmitri)
- Install a new version of OpenMPI (Dmitri?)
- Install GCC 10.2 (Tannistha)
- Look into modifying an existing build to use RPATH. Compare RPATH vs. LD_LIBRARY_PATH (Ian)
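For the RPATH comparison, the main difference is that RPATH records library directories in the binary at link time, while LD_LIBRARY_PATH has to be set in the environment at run time. A small sketch of helpers that could support the experiment (the function names are hypothetical; `-Wl,-rpath,<dir>` is the GCC way of passing an rpath to the linker, and `readelf -d` lists the dynamic section of an ELF binary):

```shell
#!/bin/sh
# Sketch only: helpers for trying out an RPATH-based build.

rpath_flags() {
    # Print "-L<dir> -Wl,-rpath,<dir>" for every library directory given.
    flags=""
    for dir in "$@"; do
        flags="${flags:+$flags }-L${dir} -Wl,-rpath,${dir}"
    done
    printf '%s\n' "$flags"
}

show_runpath() {
    # List the RPATH/RUNPATH entries recorded in a binary (needs readelf).
    readelf -d "$1" | grep -E 'RPATH|RUNPATH'
}
```

A test build could then be linked with something like `gcc main.c $(rpath_flags "$PREFIX/lib") -o main`, where `$PREFIX/lib` is an assumed library location, and checked with `show_runpath ./main` to confirm it no longer depends on LD_LIBRARY_PATH.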
Topics Discussed
- Prepare a list of the obsolete / non-functional modules and disable them.
- Test out a standard environment with gcc compiler. Multiple standard environments to be considered in the future.
- Switch to Singularity containers completely? Investigate the pros and cons.
Completed tasks
- Dmitri:
  - Disabled the openmpi/1.6.5-gnu, python/anaconda-2.7-5.1.0, python/anaconda-3.6-5.10 modules on ARC.
  - These have to be removed some time in April.
- Tannistha:
  - Installed GCC 10.x.x on ARC.