Novoa Lab

Gene Regulation, Stem Cells and Cancer

Epitranscriptomics and RNA Dynamics
Group leader

2007 - B.Sc. in Biochemistry, University of Barcelona (UB), Barcelona (Spain)
2009 - M.Sc. in Bioinformatics, University Pompeu Fabra (UPF), Barcelona (Spain)
2012 - Ph.D. in Biomedicine, Institute for Research in Biomedicine (IRB), Barcelona (Spain)
2013 - EMBO Postdoctoral Fellow, Massachusetts Institute of Technology (MIT) and Broad Institute of MIT and Harvard, Boston (USA)
2014 - HFSP Postdoctoral Fellow, Massachusetts Institute of Technology (MIT) and Broad Institute of MIT and Harvard, Boston (USA)
2016 - Group Leader/Senior Postdoctoral Fellow at Garvan Institute of Medical Research, Sydney (Australia)
2018 - Group Leader at the Centre for Genomic Regulation (CRG), Barcelona (Spain)

CoVID - public processed data from coronavirus direct RNA nanopore sequencing runs

Our lab is uniformly analyzing all publicly available coronavirus direct RNA nanopore sequencing data using our MasterOfPores Nextflow pipeline (https://doi.org/10.3389/fgene.2020.00211).

This effort will allow anyone to easily access uniformly processed direct RNA sequencing datasets, including many different types of processed data that can be used for downstream analyses.

For each sequencing run, we currently provide the following data: base-called data (FAST5 and FASTQ), mapped data (BAM), and processed datasets (TXT files) including predictions of RNA modifications, per-gene counts, and polyA tail length estimations. We will keep extending this resource with new datasets as soon as they become available, processing the data uniformly and uploading all processed data to publicly available resources.
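As a rough illustration of how the base-called FASTQ files could be inspected before any downstream analysis, here is a minimal Python sketch. It assumes standard four-line FASTQ records; the function names (`read_fastq`, `summarize`) and the toy reads are hypothetical and are not part of MasterOfPores.

```python
from io import StringIO
from statistics import mean
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or a blank line) ends parsing
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        # Keep only the read identifier (text before the first whitespace).
        yield header[1:].split()[0], sequence, quality


def summarize(handle: TextIO) -> dict:
    """Return the number of reads and their mean length."""
    lengths = [len(seq) for _, seq, _ in read_fastq(handle)]
    return {
        "reads": len(lengths),
        "mean_length": mean(lengths) if lengths else 0.0,
    }


# Hypothetical toy records, not real coronavirus reads.
toy = StringIO("@read1\nACGU\n+\nIIII\n@read2 extra\nACGUAC\n+\nIIIIII\n")
summary = summarize(toy)
print(summary)
```

For a real run, the `StringIO` object would be replaced by an opened FASTQ file (for gzip-compressed files, `gzip.open(path, "rt")` from the standard library).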

In addition, together with the CRG Bioinformatics facility, we are currently improving our MasterOfPores workflow with new modules that will allow for accurate isoform quantification and consensus basecalling. We will keep updating this resource with additional processed data as soon as new modules are finished. You can find all the information here: https://covid.crg.eu/

Please note - this is a collaborative effort! If you have feedback or suggestions for new modules, or have sequenced any coronavirus direct RNA sequencing dataset(s), please let us know: we will gladly process the data and include them in this resource.

Summary

A major current challenge in biology is to understand how gene expression is regulated with surgical precision across tissues, space and time. Historically, genome-wide studies of gene expression have typically measured mRNA abundance rather than protein synthesis, in large part because such data are much easier to obtain. However, the correlation between mRNA levels and protein abundance is as low as r = 0.35-0.40, suggesting that transcriptional regulation alone is not sufficient to explain the complex orchestration of gene expression. In the last few decades, the scientific community has started to acknowledge the pivotal role that post-transcriptional regulatory mechanisms play in gene expression. However, we are still far from understanding how gene expression is finely tuned and regulated across tissues and conditions, suggesting that we are missing variables in the equation.

In our lab, we combine experimental techniques (RNA-Seq, polysome profiling, mouse/cell knockouts, Oxford Nanopore direct RNA sequencing) with computational techniques (NGS data analysis, algorithm development, machine learning) to unveil the secrets of three post-transcriptional regulatory layers: the epitranscriptome, RNA structure and ribosome specialization.

(Illustrations adapted from: Novoa et al., Nat Rev Mol Cell Biol 2017; Imanishi et al., Chem Commun 2017; Li et al., Nat Methods 2017; Hauenschild et al., Nucleic Acids Res 2015; Stoecklin and Diederichs, EMBO J 2014)