RCS Summer School 2024

Research Computing Services' 3rd annual summer school will run from Monday, June 10 to Wednesday, June 12, 2024, from 9 AM to 5 PM. The summer school consists of sessions and workshops spread across these 3 days and is completely free for all University of Calgary members.

Our goal for this year's summer school is to Empower our researchers: Inspiring what is possible on HPC infrastructure.

RCS Summer School 2024 Poster

Registration

Registration is required to attend the RCS Summer School sessions. Registration is free to all members of the University of Calgary.

Register now

There will be a limit of approximately 100 seats. If you are unable to attend after registering, please cancel/modify your registration or notify us via email.

Topics

  • Introduction to RCS services and HPC resources
  • Introduction to Linux & Bash command line
  • Using Linux utilities for large datasets
  • Hands on with Linux & Slurm: Workshop
  • Using Open OnDemand on ARC
  • Develop a research data management plan
  • Reproducible data management with Datalad
  • Digital File Management
  • Using containers in HPC with Apptainer
  • Managing scientific software with Conda
  • Research workflow development with Prefect
  • AWS: ML in the Cloud, a walkthrough followed by a workshop
  • NVIDIA: Workflow optimization using NVIDIA GPUs
  • Dell & AMD: Machine learning with Dell and AMD

Schedule

The summer school sessions will be held in ICT 102 and ICT 114. Refreshments will be available in ICT 114 on all 3 days.

Time June 10 June 11 June 12
Track 1 Track 2 Track 1 Track 2 Track 1 Track 2
8:30 AM Registration & check-in
ICT 102
Registration & check-in
ICT 102
Registration & check-in
ICT 102
9:00 AM Introduction to RCS
ICT 102, 9:00 AM - 9:20 AM
Jill Kowalchuk
Refreshments
ICT 114
The Alliance: Introduction
ICT 102
Brock Kahanyshyn
Refreshments
ICT 114
TBD

ICT 102

Refreshments
ICT 114
9:30 AM Introduction to Linux, Bash,
and the command line

ICT 102, 9:30 AM - 10:30 AM
Robert Fridman
Data in Motion: Navigating Storage Solutions for Active Research Data
ICT 114, 9:30 AM - 11:20 AM
Ian Percel
Introduction to HPC resources
ICT 102, 9:30 AM - 10:20 AM
Robert Fridman, Dave Schulz
Reproducible Data Management with Datalad: Part II
ICT 114, 9:30 AM - 10:20 AM
David Deepwell, Pedro Martinez
NVIDIA: Workflow Optimization with NVIDIA GPUs
ICT 102, 9:30 AM - 12:00 PM
Jonathan Dursi
10:00 AM Refreshments
ICT 114
10:30 AM Workshop: Hands on with Linux & Slurm
ICT 102, 10:30 AM - 11:50 AM
Robert Fridman
Linux tools & utilities for working with large data sets
ICT 102, 10:30 AM - 11:20 AM
Leo Leung, Dave Schulz
11:00 AM
11:30 AM Reproducible Data Management with Datalad: Part I
ICT 114, 11:30 AM - 12:20 PM
David Deepwell, Pedro Martinez
RCS Q&A period: Ask RCS anything
ICT 102, 11:30 AM - 12:00 PM
RCS Team
12:00 PM Open OnDemand on ARC
ICT 102, 12:00 PM - 12:20 PM
Leo Leung
Lunch break
12:00 PM - 1:00 PM
Lunch break
12:00 PM - 1:00 PM
12:30 PM Lunch break
12:30 PM - 1:30 PM
1:00 PM Research Data Management and Data File Management
ICT 102, 1:00 PM - 2:20 PM
Jennifer Abel, Alex Thistlewood, Ingrid Reiche
Refreshments
ICT 114
Dell & AMD: Machine learning with Dell & AMD
ICT 102, 1:00 PM - 1:50 PM

Rob Lucas

1:30 PM AWS: Inspiring the art of the possible
ICT 102, 1:30 PM - 1:50 PM

AWS

Refreshments
ICT 114
2:00 PM AWS: How AWS works with Researchers
ICT 102, 2:00 PM - 2:20 PM

AWS

2:30 PM AWS: Machine Learning with low-code workshop
ICT 102, 2:30 PM - 4:50 PM
AWS
Introduction to containers with Apptainer
ICT 102, 2:30 PM - 3:20 PM
Tannistha Nandi
Prefect for Research Workflow Development
ICT 102, 2:30 PM - 3:50 PM
David Deepwell, Pedro Martinez
3:00 PM
3:30 PM Managing scientific software with Conda
ICT 102, 3:30 PM - 4:20 PM
Dmitri Rozmanov
4:00 PM End of day: 4:00 PM
4:30 PM End of day: 4:30 PM
5:00 PM End of day: 5:00 PM

Sessions

Session Time and Location Synopsis

Introduction to RCS

June 10, 9:00AM - 9:20AM

ICT 102

We will begin the RCS summer school with a quick introduction by Jill Kowalchuk, the Interim Director of Research Computing Services. We will introduce the RCS team, provide a high-level overview of our services, and explain how to get help and support from our analysts.
  • Speaker: Jill Kowalchuk
  • Format: Lecture
  • Level: Introductory
  • Prerequisites: None

Introduction to Linux, Bash, and the command line

June 10, 9:30AM - 10:30AM

ICT 102

This course provides you with essential skills to effectively use the Linux command line. Starting from the ground up, we will go over how to log in to and interact with our HPC cluster, traverse the filesystem, execute programs, and manage files.

This beginner-friendly session requires no prior experience with Linux. We recommend bringing your own device to follow along. By the end of the course, you should be familiar with what is possible with the Linux command line.

  • Speaker: Robert Fridman
  • Format: Lecture + Follow along
  • Level: Introductory
  • Prerequisites: None

Workshop: Hands on with Linux & Slurm

June 10, 10:30AM - 11:50 AM

ICT 102

This follow-up workshop comes immediately after the Introduction to Linux session. We will build on what we learned in the previous session and go into details on how to use the HPC cluster using the Slurm scheduler.

This workshop will provide you with the skills necessary to write a simple Slurm batch script, submit jobs to Slurm, and view and manage your jobs. By the end of the course, you will be familiar with what Slurm is, how it fits into an HPC environment, and how to start using Slurm on our HPC clusters for your research. This is a beginner-friendly workshop. You should be familiar with the Linux command line. We recommend bringing your own device to follow along.

  • Speaker: Robert Fridman
  • Format: Workshop + Hands on
  • Level: Introductory
  • Prerequisites: None
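
For a sense of what this workshop builds toward, here is a minimal sketch of a Slurm batch script. The job name, resource requests, output file name, and the `describe_job` helper are illustrative assumptions, not ARC-specific recommendations; the workshop itself may use different settings.

```shell
#!/bin/bash
# Lines starting with #SBATCH are read by Slurm as job options and are
# treated as ordinary comments by the shell. All values are placeholders.
#SBATCH --job-name=hello-slurm
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
#SBATCH --output=hello-slurm-%j.out

# Hypothetical helper: report which job is running and where.
# SLURM_JOB_ID is set by Slurm inside a job; the fallback value lets the
# script also run outside the scheduler for a quick check.
describe_job() {
  echo "Job ${SLURM_JOB_ID:-not-under-slurm} running on ${HOSTNAME:-unknown-host}"
}

describe_job
```

Assuming the script is saved as `hello-slurm.sh` (a hypothetical file name), it would typically be submitted with `sbatch hello-slurm.sh`, monitored with `squeue -u $USER`, and cancelled with `scancel` followed by the job ID.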

Open OnDemand on ARC

June 10, 12:00 PM - 12:20 PM

ICT 102

Did you know you can run a Linux desktop and graphical tools on ARC? This session will cover what ARC Open OnDemand is and how it may help with your research. We will show you how to:
  • Connect to Open OnDemand through your browser
  • Start a graphical desktop environment in our ARC HPC cluster environment
  • View and manage files in your home directory via Open OnDemand
  • Connect to ARC through your web browser
  • View the status of your submitted jobs

By the end of this session, you will be familiar with the options available on Open OnDemand and be able to start graphical sessions through this service. This is a beginner friendly workshop and no prior experience is necessary. We recommend bringing your own device to follow along.

  • Speaker: Leo Leung
  • Format: Lecture + Follow along
  • Level: Introductory
  • Prerequisites: None

Data in Motion: Navigating Storage Solutions for Active Research Data

June 10, 9:30AM - 11:20AM

ICT 114, Track 2

Planning for and requesting specialized storage for large research projects can be a daunting proposition. The variety of storage options and the expected justifications for allocations locally to UCalgary, at national supercomputing sites, and in the public cloud can quickly become overwhelming. This talk aims to provide an introduction to the cost/benefit tradeoff in using different storage systems, when to reach out to different support services around the university for help in making critical decisions, and basic techniques for providing a quantitative justification for a storage request.

By the end of the session, you will be familiar with the types of storage related questions that should be answered when tackling large research projects and the different types of solutions that the University offers our researchers.

  • Speaker: Ian Percel
  • Format: Lecture
  • Level: Introductory
  • Prerequisites: None

Reproducible Data Management with Datalad

June 10, 11:30AM - 12:20PM

June 11, 9:30AM - 10:20AM

ICT 114, Track 2

Managing research data well is critical to research. This two-part workshop introduces you to DataLad, a digital data management system based on the Git version control system.

Content to be covered in the two-part session includes:

  • dataset basics,
  • capturing data-provenance, and
  • collaborative data analysis.

Background content will be covered before conducting the primary hands-on training where attendees will create a small demonstrative research project containing data provenance.

No Git knowledge is required, but familiarity with Git is strongly advised. Command line experience is required.

  • Speaker: David Deepwell and Pedro Martinez
  • Level: Introductory
  • Prerequisites: Command line experience

Introduction to HPC resources

June 11, 9:30AM - 10:20AM

ICT 102

An introduction to the high performance computing resources offered by RCS. We will go over how our infrastructure ties into your research and how to make the most out of Slurm. We will also cover how to download data and transfer it to and from other institutions.
  • Speaker: Robert Fridman, Dave Schulz
  • Level: Introductory
  • Prerequisites: None

Linux tools & utilities for working with large data sets

June 11, 10:30AM - 11:20AM

ICT 102

As researchers use larger and larger datasets, it is imperative to handle and manage these datasets effectively. In this session, we will go through some common methods for working with datasets using standard Linux tools and utilities. We will cover common use cases such as downloading large datasets from the Internet and parsing text-based data with tools such as sed, awk, and grep, and will then tie everything together with pipes.
  • Speaker: Robert Fridman, Dave Schulz
  • Level: Introductory
  • Prerequisites: Command line experience
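
As a small taste of the approach described in this session, the sketch below chains grep, sed, and awk with pipes. The sample data and the function names `make_sample_data` and `mean_ok_value` are made up for illustration and are not taken from the session material.

```shell
#!/bin/bash
# Hypothetical sample data: a comment header followed by one record per line
# in the form sample_id,status,value.
make_sample_data() {
  cat <<'EOF'
# sample_id,status,value
s01,ok,4.0
s02,failed,0.0
s03,ok,6.0
s04,ok,5.0
EOF
}

# grep drops the comment line, sed turns commas into spaces so awk can split
# on whitespace, and awk averages the third field of rows whose status is "ok".
mean_ok_value() {
  make_sample_data \
    | grep -v '^#' \
    | sed 's/,/ /g' \
    | awk '$2 == "ok" { sum += $3; n++ } END { if (n > 0) printf "%.2f\n", sum / n }'
}

mean_ok_value   # prints 5.00
```

Each tool does one small job and hands its output to the next, so the same pattern can scale to much larger files without loading them fully into memory.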

RCS Q&A period: Ask RCS anything

June 11, 11:30AM - 12:00PM

ICT 102

A general question-and-answer period where you can ask us anything related to RCS and HPC.
  • Speaker: The RCS team
  • Level: Introductory
  • Prerequisites: None

Research Data Management and Data File Management

June 11, 1:00PM - 2:20PM

ICT 102

Managing your digital files and research materials is critical for keeping yourself organized, collaborating, and communicating with colleagues. In this session, we will cover Research Data Management (RDM) and Data Management Plans (DMPs). We will also go over best practices in digital file management depending on your individual and organizational needs, including versioning and how to document and share your file and folder conventions using a README file.
  • Speaker: Jennifer Abel, Alex Thistlewood, and Ingrid Reiche (from The University of Calgary Libraries and Cultural Resources)
  • Level: Introductory
  • Prerequisites: None

Introduction to containers with Apptainer

June 11, 2:30PM - 3:20PM

ICT 102

Make your research workflows reproducible through the power of containers. We will go through, in detail, how to run containers on ARC using Apptainer.
  • Speaker: Tannistha Nandi
  • Level: Introductory
  • Prerequisites: None

Managing scientific software with Conda

June 11, 3:30PM - 4:20PM

ICT 102

Running customized scientific software in a shared HPC environment can be challenging. In this session, we will go over how to set up customized software environments using Conda.
  • Speaker: Dmitri Rozmanov
  • Level: Introductory
  • Prerequisites: None

Prefect for Research Workflow Development

June 12, 2:30PM - 3:50PM

ICT 102

Modernize your research workflows using Prefect, an open source workflow orchestration tool. In this session we will cover some of the fundamentals of building workflows with Prefect, with examples on how to deploy Prefect on local and distributed computing infrastructure.
  • Speaker: David Deepwell and Pedro Martinez
  • Level: Introductory
  • Prerequisites: None

AWS: Inspiring the art of the possible

June 11, 1:30PM - 1:50PM

ICT 102

Learn what is possible on AWS Cloud for research.
  • Speaker: AWS
  • Level: Introductory
  • Prerequisites: None

AWS: How AWS works with Researchers

June 11, 2:00PM - 2:20PM

ICT 102

AWS has many programs to support researchers, such as credits, letters of support, immersion days, and work on proofs of concept. In this session, we will cover how we engage with researchers and what programs are out there to help accelerate your research with the AWS Cloud.
  • Speaker: AWS
  • Level: Introductory
  • Prerequisites: None

AWS: Machine learning with low-code workshop

June 11, 1:30 PM - 4:45 PM

ICT 102

The Machine Learning (ML) journey requires continuous experimentation and rapid prototyping to be successful. To create highly accurate and performant models, data scientists first have to experiment with feature engineering, model selection, and optimization techniques. These processes are traditionally time-consuming and expensive.

In this workshop attendees will learn the following:

  • How the low-code ML capabilities found in Amazon SageMaker Data Wrangler, Autopilot, and JumpStart make it easier to experiment faster and bring highly accurate models to production more quickly and efficiently
  • How to simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow
  • How to automatically build, train, and tune the best machine learning models based on your data, while maintaining full control and visibility
  • How to get started with ML easily and quickly using pre-built solutions for common financial use cases and open source models from popular model zoos

  • Speaker: AWS
  • Level: Introductory
  • Prerequisites: None

Workflow Optimization with NVIDIA GPUs

June 12, 9:30AM - 12:20PM

ICT 102

We will discuss how to optimize workflows with NVIDIA GPUs to help accelerate your research.
  • Speaker: Jonathan Dursi from NVIDIA
  • Level: Introductory
  • Prerequisites: None

Dell Presentation: TBD

June 12, 1:00 PM - 1:50 PM

ICT 102

TBD
  • Speaker: Rob Lucas from Dell
  • Level: Introductory
  • Prerequisites: None