RCS Software Management Group

From RCSWiki
 
Latest revision as of 20:54, 14 October 2021

General

Topics to discuss

Date: May 27th 2021

  • Singularity container management. Installation standard.


Present:
Dmitri Rozmanov
Tannistha Nandi

Topics Discussed
  1. Test phase: build Singularity containers for research software on ARC.
  2. Make them available to users as a central module.
  3. The Singularity container for a software package will be kept in the same directory as the software, under /global/software.
  4. Evaluate the performance of familiar software (e.g. GROMACS): container vs. non-container version.
  5. Need to come up with a standard format for modulefiles

RCS Standards

Installation tree standard

  • Naming conventions
  • /global/software/openmpi/gnu-8.3.1/3.1.2-opa/
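
The example path above suggests a layout of the form /global/software/<package>/<compiler>-<compiler version>/<version>-<variant>. That reading is an assumption drawn from this single example, not an agreed standard. A minimal Python sketch, using a hypothetical helper named `install_prefix`, of how such a prefix could be assembled:

```python
from pathlib import PurePosixPath

# Root of the central software tree, taken from the example path above.
SOFTWARE_ROOT = PurePosixPath("/global/software")


def install_prefix(package, compiler, compiler_version, version, variant=None):
    """Build an install prefix following the layout assumed from the example.

    Assumed layout (hypothetical, inferred from one example path):
        /global/software/<package>/<compiler>-<compiler_version>/<version>[-<variant>]
    """
    toolchain = f"{compiler}-{compiler_version}"
    release = f"{version}-{variant}" if variant else version
    return SOFTWARE_ROOT / package / toolchain / release


if __name__ == "__main__":
    # Reproduces the example path from the list above (without the trailing slash).
    print(install_prefix("openmpi", "gnu", "8.3.1", "3.1.2", "opa"))
```

Keeping the variant optional lets the same helper cover builds that have no interconnect-specific suffix.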

Module file standard

  • What env variables to set;
  • How to provide dynamic library functionality.

How to build software

  • How to point to external libraries at build time.
  • Building static executables.

Minutes

Meeting, 2021-10-14

  • Present: Ian, Dmitri


  • Conda install processes on the login node take too many resources and run for too long.
  • New OpenMPI modules. Obsolete old modules. OPA module auto detection.

One way to find an Omni-Path card in a node:

[drozmano@fc1 ~]$ lspci -nn | grep Omni
5e:00.0 Fabric controller [0208]: Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] [8086:24f0] (rev 11)

The lsmod approach also seems to work:

[drozmano@fc1 ~]$ /usr/sbin/lsmod | grep opa
opa_vnic               32768  0
ib_core               385024  14 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,ib_srpt,iw_cm,ib_iser,ib_umad,ib_isert,hfi1,rdma_ucm,ib_uverbs,ib_cm,rdmavt
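
The two checks above could be combined into a small detection script. The following is a minimal Python sketch, not the group's actual auto-detection logic: the function names are hypothetical, and it assumes that a PCI device line containing "Omni-Path" or a loaded kernel module whose name starts with "opa" is a sufficient indicator. Unlike a plain `grep opa`, the module check looks only at the module name in the first column, so it does not match lines where `opa_vnic` appears merely as a dependency.

```python
import subprocess


def has_omnipath_pci(lspci_output):
    """Return True if any line of lspci output mentions an Omni-Path device."""
    return any("Omni-Path" in line for line in lspci_output.splitlines())


def has_opa_module(lsmod_output):
    """Return True if a loaded module name (first column of lsmod) starts with 'opa'."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("opa"):
            return True
    return False


def run(cmd):
    """Run a command and return its stdout, or '' if it is missing or fails."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return ""


if __name__ == "__main__":
    pci = has_omnipath_pci(run(["lspci", "-nn"]))
    mod = has_opa_module(run(["/usr/sbin/lsmod"]))
    print(f"Omni-Path PCI device: {pci}, opa kernel module: {mod}")
```

Separating the parsing functions from the command execution makes it possible to check the logic against saved output from a node without running on that node.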

Meeting, 2021-05-27

  • Present:


Meeting, 2021-04-01

Present:

  • Dmitri
  • Ian

Topics:

  • Ansys Fluent licenses on CC (limit on the number of licenses?).
  • Organizational issues and ideas.


Ansys Licenses

The limited number of Ansys licenses makes using them on CC a potential issue.

We used to not allow this on CC or parallel nodes, but we have become lax about it.

Going forward we will allow this, even though it isn't efficient.

We are against intervening in license disputes; just let people run lmutils.

If we take RCS-purchased licenses and make them strictly TRES, we could track them in SLURM, but this could introduce blocking/fragmentation issues.

We are just going to note potential license issues in the wiki, along with the use of lmutils to identify who is using the licenses.


Tracking module usage to enable removal of clutter:

Add code to the module files that records the job ID they were called in, together with the module name, by touching a file in the /global/software/var directory.

Create one file per job, and have a cron job roll them all up into a single DB every hour.
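
The hourly roll-up step could look roughly like the following Python sketch. The record format is an assumption, since the minutes do not fix one: each record file is assumed to be named after the job ID and to hold one module name per line. The function name `roll_up`, the table name `module_usage`, and the choice of SQLite as the single DB are likewise hypothetical.

```python
import sqlite3
from pathlib import Path


def roll_up(record_dir, db_path):
    """Load per-job record files into one SQLite table, then delete them.

    Assumed (hypothetical) record format: a file named after the job ID,
    holding one module name per line. The database file should live
    outside record_dir so it is not mistaken for a record.
    Returns the number of distinct (job, module) pairs read from the files.
    """
    conn = sqlite3.connect(db_path)
    pairs = 0
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS module_usage ("
            "job_id TEXT NOT NULL, module TEXT NOT NULL, "
            "PRIMARY KEY (job_id, module))"
        )
        for record in sorted(Path(record_dir).iterdir()):
            if not record.is_file():
                continue
            lines = record.read_text().splitlines()
            modules = {line.strip() for line in lines if line.strip()}
            conn.executemany(
                "INSERT OR IGNORE INTO module_usage (job_id, module) VALUES (?, ?)",
                [(record.name, module) for module in sorted(modules)],
            )
            conn.commit()    # make the rows durable before removing the record
            record.unlink()
            pairs += len(modules)
    finally:
        conn.close()
    return pairs
```

Committing before deleting each record means an interrupted run leaves unprocessed files in place for the next cron invocation, and `INSERT OR IGNORE` keeps a re-run from duplicating rows.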

Need to update the whatis for the modules. What do we want this to be used for?

Meeting, 2021-03-18

Participants:

  • David Schulz
  • Dmitri Rozmanov
  • Ian Percel
  • Tannistha Nandi

Minutes of the meeting:

Things to prepare before the next meeting in two weeks' time:

  1. Disable OpenMPI 1.6 (Dmitri)
  2. Install a new version of OpenMPI (Dmitri?)
  3. Install GCC 10.2 (Tannistha)
  4. Look into modifying an existing build to use RPATH. Compare RPATH vs. LD_LIBRARY_PATH (Ian)

Topics Discussed

  1. Prepare a list of the obsolete / non-functional modules and disable them.
  2. Test out a standard environment with gcc compiler. Multiple standard environments to be considered in the future.
  3. Switch to Singularity containers completely? Investigate the pros and cons.

Completed tasks

  • Dmitri:
Disabled the openmpi/1.6.5-gnu, python/anaconda-2.7-5.1.0, python/anaconda-3.6-5.10 modules on ARC.
These have to be removed some time in April.
  • Tannistha:
Installed GCC 10.x.x on ARC.