RCS Software Management Group
Topics to discuss
Date: May 27th 2021
- Singularity container management. Installation standard.
- Test phase: build Singularity containers for research software on ARC
- Make them available to users as central modules
- The Singularity container for a software package will be kept in the same directory as that software under /global/software
- Evaluate the performance of familiar software (e.g. GROMACS): container vs. non-container version
- Need to come up with a standard format for modulefiles
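The container vs. non-container comparison above could start from a small timing helper like the one below. This is only a sketch under assumptions: `time_runs` is a hypothetical function name, the image path and GROMACS arguments in the usage comments are placeholders, and the helper relies on GNU `date` for nanosecond timestamps.

```shell
#!/usr/bin/env bash
# Sketch: mean wall-clock time of a command over several runs.
# time_runs is a hypothetical helper; it needs GNU date (for %N).
time_runs() {
    local runs="$1"; shift
    # Refuse zero or negative run counts to avoid dividing by zero below.
    [ "$runs" -gt 0 ] || return 1
    local i start end total=0
    for ((i = 0; i < runs; i++)); do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s%N)
        total=$((total + end - start))
    done
    # Convert the nanosecond total to mean seconds per run.
    awk -v t="$total" -v n="$runs" 'BEGIN { printf "%.3f\n", t / n / 1e9 }'
}

# Usage (image path and input file are placeholders, not real ARC paths):
#   time_runs 3 gmx mdrun -s bench.tpr
#   time_runs 3 singularity exec /global/software/gromacs/gromacs.sif gmx mdrun -s bench.tpr
```

Both variants should be run inside a batch job on the same node type so that the numbers are comparable.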
Installation tree standard
- Naming conventions
Module file standard
- What environment variables to set.
- How to make dynamic libraries available at run time.
How to build software
- How to point to external libraries at build time.
- Building static executables.
- Present: Ian, Dmitri
- Conda install processes on the login node use too many resources and run for too long.
- New OpenMPI modules. Obsolete old modules. OPA module auto detection.
A way to find an Omni-Path card in a node:
[drozmano@fc1 ~]$ lspci -nn | grep Omni
5e:00.0 Fabric controller : Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] [8086:24f0] (rev 11)
lsmod approach also seems to work:
[drozmano@fc1 ~]$ /usr/sbin/lsmod | grep opa
opa_vnic 32768 0
ib_core 385024 14 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,ib_srpt,iw_cm,ib_iser,ib_umad,ib_isert,hfi1,rdma_ucm,ib_uverbs,ib_cm,rdmavt
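For module auto-detection, the two checks above could be wrapped into small functions along these lines. This is a sketch only: the function names are hypothetical, and matching kernel modules whose name starts with `opa` simply mirrors the `grep opa` used above rather than any confirmed detection rule.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a node has an Omni-Path card, using text in the
# format printed by `lspci -nn` or `lsmod`. Function names are hypothetical.

# Reads `lspci -nn` output on stdin; succeeds if an Omni-Path device is listed.
has_opa_pci() {
    grep -q 'Omni-Path'
}

# Reads `lsmod` output on stdin; succeeds if a loaded module name (first
# column) starts with "opa", e.g. opa_vnic.
has_opa_module() {
    awk '$1 ~ /^opa/ { found = 1 } END { exit !found }'
}

# Prints "opa" if either check succeeds on this node, otherwise "none".
detect_fabric() {
    if lspci -nn 2>/dev/null | has_opa_pci; then
        echo opa
    elif /usr/sbin/lsmod 2>/dev/null | has_opa_module; then
        echo opa
    else
        echo none
    fi
}

# Usage:
#   fabric=$(detect_fabric)
```

Keeping the parsing separate from the commands makes it possible to try the checks against saved output from different node types.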
- Ansys Fluent licenses on CC (limit on the number of licenses?).
- Organizational issues and ideas.
The limited number of Ansys licenses makes using them on CC a potential issue.
We used to not allow this on CC or parallel nodes, but we have become lax about it.
Going forward, we will allow this even though it isn't efficient.
We are against intervening in license disputes; just let people run lmutils.
If we take RCS-purchased licenses and make them strictly TRES, we could track them in SLURM, but this would potentially introduce blocking/fragmentation issues.
We will just note potential license issues in the wiki, along with the use of lmutils to identify who is using the licenses.
Tracking module usage to enable removal of clutter:
Add code to the module files that records the job ID they were loaded in, together with the module name, by touching a file in the /global/software/var directory.
Create one file per job, and have a cron job roll them all up into a single DB every hour.
Need to update the whatis for the modules. What do we want this to be used for?
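The usage-tracking idea above (touch a marker file, roll up from cron) might look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the `job__module` file naming, the `nojob` placeholder, and the tab-separated output file standing in for the "single DB". It also assumes one marker per job and module pair, and that module names do not themselves contain `__`.

```shell
#!/usr/bin/env bash
# Sketch of per-job module usage tracking. All names are hypothetical.

# Directory named in the notes; overridable so the sketch can be tried elsewhere.
USAGE_DIR="${USAGE_DIR:-/global/software/var}"

# Record that module $1 was loaded in the current job by touching a marker
# file. SLURM_JOB_ID is set by Slurm inside jobs; "nojob" is a placeholder
# for loads that happen outside a job.
record_module_use() {
    local module="$1"
    local job="${SLURM_JOB_ID:-nojob}"
    # Module names contain "/", so encode it as "__" to get a flat file name.
    # Failures are ignored so that loading a module never breaks.
    touch "${USAGE_DIR}/${job}__${module//\//__}" 2>/dev/null || true
}

# Roll all marker files up into one tab-separated file (job<TAB>module)
# and remove them. Meant to be called from an hourly cron job.
rollup_module_use() {
    local out="$1" f base job module
    for f in "${USAGE_DIR}"/*__*; do
        [ -e "$f" ] || continue
        base="${f##*/}"
        job="${base%%__*}"
        module="${base#*__}"
        printf '%s\t%s\n' "$job" "${module//__//}" >> "$out"
        rm -f -- "$f"
    done
}

# Usage (output path is a placeholder):
#   record_module_use gcc/10.2.0
#   rollup_module_use /some/path/module_usage.tsv
```

Touching an empty file keeps the cost inside the module load minimal; all parsing is deferred to the roll-up step.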
- David Schulz
- Dmitri Rozmanov
- Ian Percel
- Tannistha Nandi
Minutes of the meeting:
Things to prepare before the next meeting in two weeks' time:
- Disable Open MPI 1.6 (Dmitri)
- Install a new version of Open MPI (Dmitri?)
- Install gcc 10.2 (Tannistha)
- Look into modifying an existing build to use RPATH. Compare RPATH vs. LD_LIBRARY_PATH (Ian)
- Prepare a list of the obsolete / non-functional modules and disable them.
- Test out a standard environment with gcc compiler. Multiple standard environments to be considered in the future.
- Switch to Singularity containers completely? Investigate the pros and cons.
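For the RPATH vs. LD_LIBRARY_PATH item in the list above, the build-time side could be sketched as follows. `rpath_flags` is a hypothetical helper, and the library path and file names in the usage comments are placeholders; only the `-L` and `-Wl,-rpath,` flags themselves are standard GCC/GNU ld options.

```shell
#!/usr/bin/env bash
# Sketch: build linker flags that embed library directories as RPATH entries.
# rpath_flags is a hypothetical helper, not part of any existing build setup.
rpath_flags() {
    local dir flags=""
    for dir in "$@"; do
        # -L lets the linker find the library at build time;
        # -Wl,-rpath stores the directory in the binary for run time.
        flags+="${flags:+ }-L${dir} -Wl,-rpath,${dir}"
    done
    printf '%s\n' "$flags"
}

# Usage (paths and file names are placeholders):
#   gcc -o app app.c $(rpath_flags /global/software/libfoo/1.0/lib) -lfoo
#   readelf -d app | grep -E 'RPATH|RUNPATH'   # shows the embedded path
#   ldd app                                     # check how libraries resolve
```

With the path embedded in the binary, the module file does not need to prepend to LD_LIBRARY_PATH for that library, whereas the LD_LIBRARY_PATH approach affects every program started from the same environment.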
- Disabled the openmpi/1.6.5-gnu, python/anaconda-2.7-5.1.0, python/anaconda-3.6-5.10 modules on ARC.
- These have to be removed some time in April.
- Installed GCC 10.x.x on ARC.