
Containers and Workflow Pipelines for reproducible and automated data analysis

16/04/2020 18/04/2020

CRG Training Center
Dates: 16th & 17th April 2020 and hackathon (optional) on the 28th of April 2020
Time: 09:30-17:00h
Trainers: CRG Bioinformatics core facility
Location: CRG Training Center (PRBB Patio)
Maximum nº of attendees: 19
Registration deadline: 23rd March 2020

 

Description of the course

The first day is dedicated to Linux containers (Docker & Singularity), which are well suited to code portability and reproducible analysis. You will learn how to build a container from scratch, share it with others, and re-use and modify existing containers.
On the second day, you will learn how to use Nextflow to build scalable and reproducible bioinformatics pipelines and run them on a personal computer, a cluster and the cloud.
After the two course days, there will be a 10-day hackathon period during which teams build real pipelines on topics of interest, followed by a final hackathon day for follow-up and troubleshooting.

Objectives

Containers

  • Learn the concepts behind Docker and Singularity containers and the differences between them
  • Write a Docker recipe (Dockerfile), build a Docker image and run containers from it
  • Push and pull Docker images to / from Docker Hub
  • Dockerfiles and image layers; Docker caching
  • Working with volumes
  • Pull Docker images as Singularity images

Pipelines

  • Understand Nextflow's basic concepts: processes, channels, ...
  • Write and run a Nextflow pipeline (using Singularity containers)

Programme:

Day 1: Docker containers

09:00 - 09:30 Introduction to containers

  • History of containers, what are containers and why should we use them? 
  • Containers vs. virtual machines

13:00 - 17:00 Singularity Containers

  • Differences between Singularity and Docker: why and when to use one or the other. Pros and cons.
  • Singularity recipes
  • Building a basic Singularity image
  • Pull an image from Docker Hub and run it with Singularity
  • Volumes in Singularity
  • Use a Singularity image interactively

Day 2:

09:00 - 17:00 Nextflow pipelines

  • Run a simple Nextflow pipeline and obtain a thorough understanding of config and pipeline files
  • Modify a pipeline and rerun processes
  • Theoretical approach to processes, channels and operators; the basics of Nextflow
  • Write and run a simple Nextflow pipeline (e.g. print text, process a simple calculation)
  • Including Singularity containers in Nextflow pipelines

Day 3:

09:00 - 17:00 Hackathon

  • Team presentations of developed pipelines and troubleshooting

Attendees: Researchers who use the Linux command line on a regular basis and have little or no knowledge of containers or workflow pipelines.