Managing tool repositories

In Bioinfopipe, all tools and environments are packaged as Docker images and stored in AWS ECR (Elastic Container Registry). The tools managed by Bioinfopipe are sourced from either our public ECR registry 'b1n7j4p9' or other public ECR registries like 'biocontainers', which is a community-driven project with over 9,000 bioinformatics containers available. One of the benefits of using ECR images is their quick launch capability through the AWS Batch service, ensuring reliable and consistent execution of analysis pipelines.

In this section, we will explore the process of creating a tool repository object in Bioinfopipe and generating its container image objects. Additionally, we will learn how to create or import an ECR image and test it.


1. Browsing tool repositories

You can open the tool repository browser page by clicking the 'Tool repositories' button in the tool browser, or by going to menu bar -> Tools -> Tool repositories. By default, all tool repositories are listed in a table with the following 5 columns:

Name : The name of the tool repository.

Type : The tool repository type, which is one of:

  • Tools : Executable tools
  • R : R environment
  • Python : Python environment
  • Perl : Perl environment
  • Java : Java environment
  • Ruby : Ruby environment
  • Others : Other environment

Registry : The name of ECR registry.

Created at : The time when the repository was created.

#Images : The number of images contained in the repository.

You can view the repository details by clicking the 'View details' icon button. Apart from the properties above, the remaining properties are described as follows:

ID : A pseudo ID assigned to the repository for a user or team.

Description : Brief description of the tool repository.

Updated at : The time when the tool repository was last updated.

Owner : The owner of the tool repository object.

Public : Indicates whether the tool repository is public.

Container images : A table listing the images in this repository, with columns 'Tag', 'Status' and 'Size'; you can click on an image tag for quick access to that image.

Each page in the repository view displays 25 items by default, and you can navigate through the pages using the icon buttons labeled 'First', 'Previous', 'Next', and 'Last'. By default, only the public items maintained by Bioinfopipe are shown. To view your own tools or tools shared with you, you can select the 'Owner <id>' or 'Shared with me' links from the dropdown menu located in the upper right corner.

To filter the repositories, you can enter the repository name in the search input box labeled 'Search...'. Additionally, you can apply filters based on repository type and registry. Only one type filter and one registry filter can be selected at a time. If you select both a type filter and a registry filter, the table displays the intersection, that is, only the repositories that satisfy both filters.
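The combined filter behaviour can be illustrated with a small sketch. The table rows below are hypothetical sample data, and `filter_repos` is an illustrative helper, not part of Bioinfopipe; an empty argument stands for "no filter of that kind".

```shell
# Hypothetical sample of the repository table: name, type, registry (tab-separated).
repos=$(printf '%s\t%s\t%s\n' \
  bowtie2 Tools biocontainers \
  samtools Tools b1n7j4p9 \
  r-limma R b1n7j4p9)

# Print the names of repositories matching an optional type ($1)
# and an optional registry ($2). An empty argument disables that filter.
filter_repos() {
  printf '%s\n' "$repos" | awk -F '\t' -v t="$1" -v r="$2" \
    '(t == "" || $2 == t) && (r == "" || $3 == r) { print $1 }'
}

# Both filters selected: only rows satisfying both are kept.
filter_repos Tools b1n7j4p9
```

With both filters set, only `samtools` is printed, because it is the only sample row that is of type 'Tools' and also in registry 'b1n7j4p9'.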


2. Handling tool repositories

In this section, you will learn how to create a tool repository object and its container image objects, as well as how to create an ECR image and test it. There are two methods for creating container images with their corresponding ECR images:

1. Creating from scratch in the application: This method involves creating a container image and its associated ECR image directly within the application. You can define the specifications and configurations of the container image and generate the corresponding ECR image.

2. Importing container images from public or private ECR repositories: Alternatively, you can import existing container images from public or private ECR repositories. This allows you to leverage pre-existing container images and incorporate them into your tool repository.


2.1. Creating a tool repository

To create a repository object, click the 'Create repository' button in the repository browser. A form titled 'Create a repository' pops up with the following 4 fields:

Name : Specify a repository name under 30 characters containing only letters, underscores and hyphens. This is usually the tool name.

Type : Select a repository type to classify it as a tool container or a language environment.

Description : Enter a brief description of the tool or environment.

Public : Check this box if you want the repository to be publicly visible.

After you click the 'Create' button, a new repository is created and its repository page opens. From there you can create container images or synchronise the repository with your ECR registry.

You can edit the repository details by clicking the 'Edit repository' button, and delete the repository by clicking the 'Delete repository' button.


2.2. Creating a container image

You can choose to create a new container image object and its ECR image from scratch. Click the 'Create container image' button to open a form titled 'Create a container image', which has the following 5 fields:

Tag : Specify an image tag name under 50 characters containing only letters, numbers, underscores, hyphens and dots. You can put your tool version here; it is 'latest' by default.

Description : Enter a description of this image, which is often a specific version of a tool.

Set this container image as imported image : Leave this unchecked if you need to create the ECR image.

Size : The ECR image size in Bytes; leave this empty if you need to create the ECR image.

Public : Check this box if you want the image to be publicly visible.
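The naming rules for the repository 'Name' field (section 2.1) and the image 'Tag' field above can be checked before filling in the forms. The following is a minimal sketch that implements the rules exactly as worded in this guide; the function names are illustrative, "under N characters" is read as a length of at most N-1, and the application may enforce further rules not listed here.

```shell
# Repository name: under 30 characters, only letters, underscores and hyphens.
is_valid_repo_name() {
  case "$1" in
    '' | *[!A-Za-z_-]*) return 1 ;;   # empty, or contains a disallowed character
  esac
  [ "${#1}" -lt 30 ]
}

# Image tag: under 50 characters, only letters, numbers, underscores, hyphens and dots.
is_valid_tag() {
  case "$1" in
    '' | *[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  [ "${#1}" -lt 50 ]
}

is_valid_repo_name "mytool_repository" && echo "name ok"
is_valid_tag "2.5.1" && echo "tag ok"
is_valid_tag "v1 beta" || echo "tag rejected"
```

Note that, under the rule as stated, a repository name containing digits is rejected, while digits are allowed in tags.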

After you click the 'Create' button, a new container image object is created and its container image page opens. Here, you can view the container image details, which include the following properties:

ID : A pseudo ID assigned to the object for a user or team.

Repository name : The repository name.

Tag :  The created image tag.

Description : A brief description about this image.

Registry : The AWS registry endpoint; it is 'public.ecr.aws/b1n7j4p9' for the Bioinfopipe registry.

Created at : The time when the container image was created.

Owner : The owner of the container image object.

ECR status : Indicates whether the ECR image has been created, imported, or is yet to be created. There are 6 statuses:

  • ToCommit : the ECR image needs to be created but has not been created yet.
  • Imported : the ECR image was imported from a public or private ECR repository.
  • InProgress : ECR image creation is in progress.
  • Created : the ECR image has been created successfully.
  • Failed : ECR image creation failed.
  • Cancelled : ECR image creation was cancelled.

ECR image size : The ECR image size.

ECR image digest : The ECR image digest, a unique, immutable identifier for the container image.

ECR start time : The time when the ECR image build started.

ECR time spent : The time spent creating the ECR image.

To create the ECR image, you need to write a Dockerfile script. You can start editing the Dockerfile script by clicking the 'Edit' icon button. Two example scripts are shown below.

Example 1, a tool image with Bowtie 2 installed on Ubuntu:

FROM ubuntu:latest
RUN apt-get update && apt-get install -y bowtie2

Example 2, an R environment with a set of Bioconductor packages:

FROM rocker/r-ubuntu:20.04
RUN Rscript -e "install.packages('BiocManager')"
RUN Rscript -e "BiocManager::install('arrayQualityMetrics')"
RUN Rscript -e "BiocManager::install('affy')"
RUN Rscript -e "BiocManager::install('affycoretools')"
RUN Rscript -e "BiocManager::install('sva')"
RUN Rscript -e "BiocManager::install('limma')"
RUN Rscript -e "BiocManager::install('annotate')"
RUN Rscript -e "BiocManager::install('gplots')"

Note: do not use or build an executable image as a tool image, because executable containers cannot be used with container-native platforms such as Kubernetes and AWS Batch.

You can also click the 'AI' button to automatically generate a Dockerfile script based on your image description using AI LLMs. Make sure to clearly specify the base system and the packages that should be included in the image.

When you click the 'Submit ECR image creation' button, the ECR image creation process starts and the ECR status changes to 'InProgress'. Meanwhile, a 'Stop ECR image creation' button appears so that you can stop the process. If the ECR image is created successfully, the ECR status changes to 'Created' and a 'Log record of container image creation' appears below the 'Dockerfile script'. If the ECR image creation fails, the ECR status changes to 'Failed' and the 'Log record of container image creation' appears, which you can use to work out which part of the process failed.

You can edit the container image object by clicking the 'Edit container image' button. To delete the container image object, click 'Delete container image'. Note that this only deletes the container image object in the application, not the related ECR image, which can be deleted by clicking the 'Delete ECR image' button.


2.3. Testing an ECR image

After an ECR image has been created successfully, it is a good idea to run a quick test to check that the image can be pulled and that the tool's executable can be called by AWS Batch.

To start a test, click the 'Test ECR image' button on the container image page. A form pops up with a 'Command-line' field where you can enter a simple command line to test, e.g. mytool -h or mytool --help. Click the 'Submit test' button to submit the test, which takes about 1 to 5 minutes to finish. Once the test has finished, a window titled 'Results of test ECR image' pops up showing the command-line output, e.g. the tool's help text.
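If Docker is available on your machine, the same command line can first be tried locally with `docker run`. This is a local check only, not the AWS Batch test that the application performs. In the sketch below, `test_image` and the `DOCKER_BIN` variable are illustrative names, and the image reference and `mytool` are placeholders to replace with your own.

```shell
# Run a command line inside an image and remove the container afterwards.
# $1: image reference, remaining arguments: the command line to test.
# DOCKER_BIN (hypothetical variable) can be overridden for a dry run.
test_image() {
  image="$1"; shift
  "${DOCKER_BIN:-docker}" run --rm "$image" "$@"
}

# Dry run: print the arguments that would be passed to docker.
DOCKER_BIN=echo test_image my_private_ecr_endpoint/mytool_repository:latest mytool --help

# Real run (requires Docker and access to the image):
# test_image my_private_ecr_endpoint/mytool_repository:latest mytool --help
```

The command line is passed after the image reference, which works for images that are not built as executable images, in line with the note in section 2.2.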


2.4. Importing a tool repository

You can directly import a repository and its images from the ECR public gallery. First, click the link button 'AWS ECR public gallery' to open the 'Amazon ECR Public Gallery' page, where you can search for the tool containers you are looking for. Then click the 'Import repository' button in the repository browser. A form titled 'Import a repository' pops up with the following 3 fields:

Registry id : Specify a public AWS ECR registry ID, e.g. biocontainers.

Repository name : Enter the repository name you found in the public registry.

Repository type : Select a repository type to classify it as a tool container or a language environment.

When you click 'Import', a new tool repository object is created and its repository page opens. Here you can import a container image by clicking the 'Create container image' button, which opens the form with 5 fields described in section 2.2 'Creating a container image'. Note that the tag must be the same as the one in the public repository. Then check the field 'Set this container image as imported image' and specify the ECR image size in Bytes. After you click the 'Create' button, a new container image object is created with its status marked as 'Imported'. Finally, you can test it to make sure it works.

Note that the 'Image tags' tab of the public repository shows the image size in MB, which can easily be converted to Bytes using an online tool.
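The conversion can also be done on the command line. The sketch below assumes decimal megabytes (1 MB = 1,000,000 Bytes) by default; whether the gallery reports decimal or binary units is not stated here, so a second argument lets you switch to 1,048,576 Bytes per unit. `mb_to_bytes` is an illustrative helper name.

```shell
# Convert a size in MB to Bytes.
# $1: size in MB (may be fractional), $2: Bytes per MB (default 1000000).
mb_to_bytes() {
  awk -v mb="$1" -v unit="${2:-1000000}" 'BEGIN { printf "%.0f\n", mb * unit }'
}

mb_to_bytes 12.5           # prints 12500000
mb_to_bytes 12.5 1048576   # prints 13107200
```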

You can also import container images from your private ECR. After you have created a repository object as described in section 2.1 'Creating a tool repository', go to its repository page and click the 'Sync repository' button. If a repository with the same name already exists in your private ECR, all of its ECR images are synchronised: a container image object is created for every image in the private ECR repository that does not yet exist in the current repository object, and its status is marked as 'Imported'. In this way, you can create and test your container images locally, push them to your private ECR through the AWS CLI, and then import them into the application. This can be more efficient when handling a batch of container images.
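Conceptually, the sync step imports the tags that are present in the private ECR repository but missing from the repository object, which is a set difference. The sketch below only illustrates that idea with hypothetical tag lists; it is not the application's implementation, and `tags_to_import` is an illustrative name.

```shell
# Print the tags that are in the private ECR repository ($1)
# but not yet in the repository object ($2). Both are one tag per line.
# grep -F: fixed strings, -x: whole-line match, -v: keep non-matching lines.
tags_to_import() {
  printf '%s\n' "$1" | grep -Fxv -e "$2"
}

# Hypothetical example data.
ecr_tags='1.0.0
1.1.0
latest'
object_tags='1.0.0'

tags_to_import "$ecr_tags" "$object_tags"
```

For the sample data, the tags `1.1.0` and `latest` are printed, since only `1.0.0` already exists in the repository object.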

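Here is an example of building an image locally and pushing it to a private ECR from the command line:

# Build the image from the Dockerfile in the current directory
DOCKER_BUILDKIT=1 docker build --no-cache -f Dockerfile -t mytool_repository .
# Tag the local image with the private ECR endpoint
docker tag mytool_repository:latest my_private_ecr_endpoint/mytool_repository:latest
# Log Docker in to the private ECR registry
aws ecr get-login-password --region eu-west-2 | docker login --username AWS --password-stdin my_private_ecr_endpoint
# Create the ECR repository in the same region (only needed the first time)
aws ecr create-repository --repository-name mytool_repository --region eu-west-2
# Push the image to the private ECR
docker push my_private_ecr_endpoint/mytool_repository:latest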
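When the same local image needs to be pushed under several tags, the tag and push steps above can be wrapped in a loop. This is a minimal sketch: `push_tags` and `DOCKER_BIN` are illustrative names, the endpoint and tags are placeholders, and it assumes you have already built the image as `<name>:latest`, logged in, and created the ECR repository.

```shell
# Tag and push one local image under several tags.
# $1: local image name, $2: registry endpoint, remaining arguments: tags.
# DOCKER_BIN (hypothetical variable) can be overridden for a dry run.
push_tags() {
  image="$1"; endpoint="$2"; shift 2
  for tag in "$@"; do
    "${DOCKER_BIN:-docker}" tag "$image:latest" "$endpoint/$image:$tag" || return 1
    "${DOCKER_BIN:-docker}" push "$endpoint/$image:$tag" || return 1
  done
}

# Dry run: print the docker arguments instead of executing them.
DOCKER_BIN=echo push_tags mytool_repository my_private_ecr_endpoint 1.0.0 latest
```

After pushing, clicking 'Sync repository' on the repository page imports the new tags as described above.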