How to convert a Docker container to an Apptainer container

From RCSWiki
Revision as of 21:44, 21 September 2023 by Lleung (talk | contribs) (Formatting and editing)

Container images are a popular way to bundle and distribute software because they contain all the dependencies the software needs to run properly. The most common container system is Docker, and many types of software are bundled and distributed as Docker container images.

Unfortunately, Docker is not compatible with a shared HPC environment. Fortunately, other container runtimes, including Apptainer, do work in a shared HPC environment. This page goes over the steps needed to convert an existing Docker container image into an Apptainer container image.

Conversion guide

The apptainer command can convert a Docker image hosted on Docker Hub into an Apptainer .sif image file.

Finding the image on Docker Hub

The first step is to find the appropriate Docker image by searching Docker Hub at https://hub.docker.com/.

Converting the image to the Apptainer format

Use apptainer build to pull and convert the Docker image. Be aware that this process may require a large amount of storage, depending on the size of the image, and may also require a large amount of memory.

$ apptainer build output_image.sif docker://broadinstitute/gatk
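Because the build can need a lot of temporary storage, it may help to point Apptainer's cache and temporary directories at a location with enough free space before building. The following is a minimal sketch, not a site-specific recipe: APPTAINER_CACHEDIR and APPTAINER_TMPDIR are Apptainer's environment variables for these locations, while SCRATCH_DIR and its default path are assumptions made for illustration and should be replaced with a suitable directory on your system.

```shell
#!/usr/bin/env bash
# SCRATCH_DIR is a hypothetical location; replace it with a directory
# that has enough free space on your system.
SCRATCH_DIR="${SCRATCH_DIR:-${TMPDIR:-/tmp}/apptainer_build}"

mkdir -p "$SCRATCH_DIR/cache" "$SCRATCH_DIR/tmp"

# Tell Apptainer where to keep downloaded layers and temporary build files.
export APPTAINER_CACHEDIR="$SCRATCH_DIR/cache"
export APPTAINER_TMPDIR="$SCRATCH_DIR/tmp"

# Show how much free space is available where temporary files will go.
df -h "$APPTAINER_TMPDIR"

# Build only if apptainer is actually installed on this machine.
if command -v apptainer >/dev/null 2>&1; then
    apptainer build output_image.sif docker://broadinstitute/gatk
else
    echo "apptainer not found in PATH; skipping build" >&2
fi
```

Once the .sif file has been built, the contents of the scratch directory can be deleted to reclaim space.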

Run and test the container

Use apptainer run output_image.sif <command> to run the container. Be aware that the container runs as your normal user rather than as root, so containers that expect root access may not work.
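One way to confirm that the container runs as your own user is to compare the numeric user ID on the host with the one reported inside the container. This is only a sketch: it uses apptainer exec, which runs a given command directly inside the image, and it assumes that the image file is named output_image.sif and that the id utility exists inside the container.

```shell
#!/usr/bin/env bash
# Assumed image name, matching the build command used on this page.
IMAGE="output_image.sif"

# Numeric user ID on the host.
host_uid="$(id -u)"

if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # Run "id -u" inside the container (assumes the image provides "id").
    container_uid="$(apptainer exec "$IMAGE" id -u)"
    echo "host uid: $host_uid, container uid: $container_uid"
else
    echo "apptainer or $IMAGE not available; nothing to test" >&2
fi
```

If the two IDs match, the software in the container has the same file permissions as you do on the host, which is why tools that expect to write to root-owned locations may fail.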

Conversion example

Converting a GATK Docker image

GATK is a popular bioinformatics toolbox.

To find a container for the latest version of GATK, we go to its home page at https://www.broadinstitute.org/gatk/, navigate to the Download latest version of GATK page, and look at the most recent releases at https://github.com/broadinstitute/gatk/releases.

We can see that a Docker image is hosted on Docker Hub at: https://hub.docker.com/r/broadinstitute/gatk/ with the container image called broadinstitute/gatk.
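The relationship between the Docker Hub page and the name passed to apptainer can be sketched as a small helper. The function name hub_url_to_uri is hypothetical and not part of Apptainer. It assumes a repository URL of the form https://hub.docker.com/r/<namespace>/<name> and defaults to the latest tag when none is given; pages for official images, which use a different URL layout, are not handled.

```shell
#!/usr/bin/env bash
# hub_url_to_uri URL [TAG]
# Hypothetical helper: turn a Docker Hub repository page URL into the
# docker:// URI that "apptainer build" accepts.
hub_url_to_uri() {
    local url="${1%/}"                      # drop a trailing slash, if any
    local tag="${2:-latest}"                # default tag when none is given
    local repo="${url#*hub.docker.com/r/}"  # keep "<namespace>/<name>"
    printf 'docker://%s:%s\n' "$repo" "$tag"
}

hub_url_to_uri "https://hub.docker.com/r/broadinstitute/gatk/"
```

Passing a specific tag as the second argument pins the build to one release of the image, which makes the conversion easier to reproduce later.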

To pull and convert this Docker image:

  1. Create a new directory to house this image and change into it. Ensure that this location has enough space to hold the image and the temporary files.
$ mkdir -p my_containers
$ cd my_containers
  2. Check that apptainer is available:
$ apptainer --version

apptainer version 1.1.4-2.el8
  3. Convert the Docker image broadinstitute/gatk to a new Apptainer image gatk.sif:
$ apptainer build gatk.sif docker://broadinstitute/gatk

INFO:    Starting build...
Getting image source signatures
Copying blob 53d8c492d3e6 done  
Copying config 17075ce1d9 done  
Writing manifest to image destination
Storing signatures
2023/02/13 14:22:04  info unpack layer: sha256:53d8c492d3e6c2d88e40ff11d3c606a6ede7e61b2c11541e3b721c52b8410026
INFO:    Creating SIF file...
INFO:    Build complete: gatk.sif

$ ls -lh

-rwxr-xr-x 1 username username 1.7G Feb 13 14:49 gatk.sif
  4. Test the image:
$ apptainer run gatk.sif gatk --help

INFO:    underlay of /etc/localtime required more than 50 (94) bind mounts

 Usage template for all tools (uses --spark-runner LOCAL when used with a Spark tool)
    gatk AnyTool toolArgs

 Usage template for Spark tools (will NOT work on non-Spark tools)
    gatk SparkTool toolArgs  [ -- --spark-runner <LOCAL | SPARK | GCS> sparkArgs ]
....
....

Please note that software can be set up differently by the developers who created the original container. The way software is run from within a container may therefore differ significantly from one image to another. Read the documentation for the container to find out how to use it properly.
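Besides the container's documentation, the image itself can give hints about how it is meant to be run. The sketch below uses apptainer inspect to print the runscript, labels and environment stored in the image. It assumes that the image is the gatk.sif file built above; what is printed depends entirely on how the original container was set up.

```shell
#!/usr/bin/env bash
# Assumed image name, matching the example on this page.
IMAGE="gatk.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # The script that "apptainer run" executes by default.
    apptainer inspect --runscript "$IMAGE"
    # Metadata labels recorded in the image.
    apptainer inspect --labels "$IMAGE"
    # Environment variables the image sets at start-up.
    apptainer inspect --environment "$IMAGE"
else
    echo "apptainer or $IMAGE not available; nothing to inspect" >&2
fi
```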

At this point:

  • The container can be moved to a different location for convenience.
  • It can be shared with other people.
  • It can be moved to a different Linux system.
  • It is expected to work the same way on most Linux systems.
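Because the image is a single large file, it can be worth checking that a copy is intact after moving or sharing it. The helper below is a sketch under stated assumptions: verify_copy is a hypothetical name, the destination must be a full file path rather than a directory, and sha256sum from GNU coreutils is assumed to be installed.

```shell
#!/usr/bin/env bash
# verify_copy SRC DEST
# Hypothetical helper: copy a file and confirm that the copy has the
# same SHA-256 checksum as the original. DEST must be a file path.
verify_copy() {
    local src="$1" dest="$2"
    local src_sum dest_sum

    cp -- "$src" "$dest" || return 1

    # sha256sum prints "<checksum>  <filename>"; keep only the checksum.
    src_sum="$(sha256sum "$src" | cut -d' ' -f1)"
    dest_sum="$(sha256sum "$dest" | cut -d' ' -f1)"

    [ -n "$src_sum" ] && [ "$src_sum" = "$dest_sum" ]
}
```

For example, verify_copy gatk.sif /some/shared/path/gatk.sif && echo "copy verified", where the destination path is a placeholder. When the image is transferred to another system, the same idea applies: run sha256sum on both sides and compare the results.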