Complete Cluster Course

12/11/2025 to 17/11/2025

CRG Training Centre (Bioinformatics room)

This course introduces and consolidates the material presented at all levels of the cluster course and expands on the concepts to be aware of when optimising use of the cluster.
If you have never used the CRG cluster, this course is mandatory to obtain your user license. If you already use the CRG cluster and need an update, you will also need to take this course.

The main message of the course is to embrace the parallelism available within the cluster: pipelines should be built from many small, independent pieces spread across the cluster rather than large, monolithic, long-running jobs on a single node. The course will show why this should be done and how to achieve it.
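
As a rough illustration of why many small independent pieces can finish sooner than one monolithic job, here is a minimal sketch. It is not course material: the function name wall_time_hours, the task counts, the run times and the number of job slots are all hypothetical, and the model ignores queue waiting time and scheduling overhead.

```python
import math


def wall_time_hours(n_tasks: int, hours_per_task: float, slots: int) -> float:
    """Idealised wall-clock time for n_tasks independent tasks sharing `slots` job slots.

    Assumption: every task takes the same time and tasks run in full "waves",
    with no queue wait or scheduler overhead.
    """
    waves = math.ceil(n_tasks / slots)  # how many rounds are needed to run all tasks
    return waves * hours_per_task


# Hypothetical workload: 100 hours of work in total.
monolithic = wall_time_hours(n_tasks=1, hours_per_task=100.0, slots=1)
split = wall_time_hours(n_tasks=100, hours_per_task=1.0, slots=25)

print(f"one monolithic job: {monolithic} h")
print(f"100 small jobs on 25 slots: {split} h")
```

The total compute is the same in both cases; only the wall-clock time changes, and only because the small pieces are independent of each other.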

Topics that are going to be addressed:  

  • Video tour of the data centre 
  • What is a cluster 
  • Logging in 
  • Queuing / the scheduler 
  • What resources are available at the CRG cluster 
  • Simple batch scripts - directives 
  • Troubleshooting - what happened to my jobs? 
  • Interactive sessions 
  • Supercomputers, Beowulf clusters, horizontal vs vertical scaling 
  • Hardware considerations 
  • Multithreaded jobs, parallelism, Amdahl's Law 
  • Job arrays 
  • Job dependencies 
  • Building a pipeline 
  • Storage issues, treemap 
  • Job stats, resource estimation 
  • Scaling analysis 

What NOT to expect:
Specific bioinformatics methods, pipeline builders (Nextflow, Snakemake, etc.)

Pre-requisite: Linux Terminal for beginners course (or Linux experience)

Target audience: CRG staff
Instructors and teachers: Emyr James (Head of SIT) and other SIT members
Dates: 12th, 13th, 14th and 17th of November 2025
Time: 10:00-13:00h 
Level: Intermediate-advanced 
Location: Bioinformatics room, CRG Training Centre 
Maximum number of participants: 18
Registration deadline: 7th November 2pm

Registration HERE

For any information, please send an email to CRG Training and Academic office (TAO): training@crg.eu

Feedback from previous editions:
Around the world, many institutions now have clusters in order to help perform complex calculations (or just resource-intensive computations) and advance sciences. However, we are currently running into a shortage in computing power with an exponential increase in the number of users around the world. And the way these institutions are responding to this issue is by increasing the compute units. However, from an environmental point of view, this is a crisis as the increase in the computational units is leading to an increased consumption of other essential natural resources and an increased contribution to global warming. However, in reality, most of the time these compute-related bottlenecks can be solved very simply by optimising the use and thereby giving everyone a better chance at using the available units rather than increasing the units. Hence, this type of course is of utmost importance as it makes the users aware of the consequences of their actions while running the compute units recklessly.
This gives us a structured understanding of the cluster use and, more importantly, how the storage and the jobs are managed, and how important it is to respect the fact that many people share this cluster. We must be efficient with our code so that the cluster use is optimised and everyone gets to benefit from it equally.
It is a mandatory course if you want to understand how to work in the cluster and use the available resources properly.
I would recommend this course because it provides all the essential foundational knowledge needed to work in a cluster environment. It clearly explains what a cluster system is and guides you step by step on how to work effectively within it. Overall, it offers a solid groundwork for anyone who needs to operate in real cluster environments.


Training funded by grant CEX2020-001049-S, funded by MCIN/AEI/10.13039/501100011033