Software

ADAN
A database for the prediction of protein-protein interactions of modular domains
Serrano's Group (EMBL-CRG Partnership)
[Type: Database / Category: Protein Function Analysis]
 
The structure and function of most globular domains in the proteome are still unknown. To gain insight into the biological role of these domains, it is crucial to develop a methodology for modelling, predicting and localizing putative partners, since such a methodology is generally applicable to any domain involved in protein-protein interactions.
 
ADAN is a database of putative ligands for the best-known modular protein-protein interaction domains, such as SH3 and PDZ. Many of these domains have a large number of homologues, of which only a small fraction have been crystallized and for which only a limited number of ligands are known. Based on the known structures, the ADAN project creates full atomic models of unknown protein-ligand structures using the FoldX algorithm and predicts new putative binders. A querying function allows the user to input a polypeptide sequence and retrieve the putative binding segments for a particular domain.
 
ADAN results from the collaboration between the group of Luis Serrano, currently at the CRG in Barcelona, and Gregorio Fernández, a researcher at the Cellular and Molecular Biology Institute of the Miguel Hernández University in Elche, Alicante.
 
For more information about this software, please click here
 
AGADIR
An algorithm for the prediction of the helical content of peptides
Serrano's Group (EMBL-CRG Partnership)
[Type: Software / Category: Protein Structure Analysis]

Agadir is a prediction algorithm based on the helix/coil transition theory. It predicts the helical behaviour of monomeric peptides and considers only short-range interactions. Conditions such as pH, temperature and ionic strength are used in the calculation. Modifications of the termini are also allowed.

Agadir was originally developed by Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.

For more information about this software, please click here.

BriX
A structural classification of protein fragments
Serrano's Group (EMBL-CRG Partnership)
[Type: Database / Category: Protein Structure Analysis]

BriX is a structural classification of protein fragments. The library comprises fragments ranging from 4 to 14 amino acids that are clustered against 6 different distance thresholds. This has led to an alphabet of around 2000 frequently observed letters, or structural classes, per chain length. These classes are accessible through a search and a browse interface.
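
The fragment lengths mentioned above can be illustrated with a minimal sketch. It only enumerates sequence windows of 4 to 14 residues; it does not perform any structural clustering, and the function name `fragments` is hypothetical rather than part of BriX.

```python
def fragments(sequence, min_len=4, max_len=14):
    """Yield (start, fragment) for every window of min_len..max_len residues.

    Illustrative only: BriX classifies fragments by structure, whereas this
    sketch just lists the sequence windows of the stated lengths.
    """
    for length in range(min_len, max_len + 1):
        # No windows are produced when the sequence is shorter than `length`.
        for start in range(len(sequence) - length + 1):
            yield start, sequence[start:start + length]


if __name__ == "__main__":
    for start, frag in fragments("ACDEFGHIK"):
        print(start, frag)
```

A real fragment library would pair each window with backbone coordinates before clustering; the sketch stops at the enumeration step.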

BriX is being developed by Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium, in collaboration with Luis Serrano and his team at the CRG in Barcelona, Spain.

For more information about this software, please click here.

FoldX
An automatic protein design algorithm that can be used to rationally modify protein stability, change protein specificity and affinity and predict metal binding sites. It can also be used to design protein-DNA interactions.
Serrano's Group (EMBL-CRG Partnership)
[Type: Software / Category: Protein Structure Analysis]

FoldX is a computer algorithm developed to provide a fast and quantitative estimation of the importance of the interactions contributing to the stability of proteins and protein complexes. The predictive power of the algorithm has been tested on a very large set of point mutants spanning most of the structural environments found in proteins, as well as on protein complexes and protein-DNA complexes of medical and biotechnological relevance. FoldX uses a full atomic description of the structure of the proteins. The different energy terms taken into account in FoldX have been weighted using empirical data obtained from protein engineering experiments.

FoldX is being developed by the group of Luis Serrano at the CRG in Barcelona, Spain, in collaboration with Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium.

For more information about this software, please click here.

 
SmartCell
A cell network simulation program supporting localisation and diffusion by using a mesoscopic stochastic reaction model
Serrano's Group (EMBL-CRG Partnership)
[Type: Software / Category: Systems Biology]

SmartCell is a program developed to simulate how a network evolves over time within a whole, single cell. Because it is based on stochastic algorithms, SmartCell needs multiple runs to obtain mean results. To help the user, SmartCell is distributed with a graphical user interface for creating models, as well as for analysing and processing the results after the runs.
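
Why a stochastic simulator needs several runs can be shown with a small, generic sketch. It uses a plain Gillespie-style simulation of a single decay reaction, which is an assumption made for illustration and not SmartCell's own model or code; all function names and parameters are hypothetical.

```python
import random


def simulate_decay(n0, rate, t_end, rng):
    """One stochastic run of A -> (nothing); returns molecules left at t_end."""
    t, n = 0.0, n0
    while n > 0:
        # The waiting time to the next event is exponential with total
        # propensity rate * n.
        t += rng.expovariate(rate * n)
        if t > t_end:
            break
        n -= 1
    return n


def mean_over_runs(runs, n0=100, rate=1.0, t_end=1.0, seed=0):
    """Average the final molecule count over several independent runs."""
    rng = random.Random(seed)
    results = [simulate_decay(n0, rate, t_end, rng) for _ in range(runs)]
    return sum(results) / runs


if __name__ == "__main__":
    print("single run:", simulate_decay(100, 1.0, 1.0, random.Random(1)))
    print("mean of 500 runs:", mean_over_runs(500))
```

Each single run gives a different count, while the mean over many runs approaches the deterministic expectation of n0 * exp(-rate * t_end).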

SmartCell is being developed by Luis Serrano and his team at the CRG in Barcelona.

For more information about this software, please click here.

SNPeffect
A database for the molecular phenotyping of human SNPs and disease mutations
Serrano's Group (EMBL-CRG Partnership)
[Type: Database / Category: Protein Function Analysis]

Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease, and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease causing mutations and polymorphisms is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments.

SNPeffect is a database of non-synonymous SNPs and their predicted effect on the functional and physico-chemical properties of the affected proteins. More precisely, SNPeffect analyses the effect of coding, non-synonymous SNPs on 3 categories of properties: protein structure and dynamics [stability, aggregation, dynamics, etc.], integrity of functional sites, and cellular processing.

SNPeffect was originally developed by Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium, in collaboration with Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.

For more information about this software, please click here.

TANGO
An algorithm for the prediction of aggregating regions in unfolded polypeptide chains
Serrano's Group (EMBL-CRG Partnership)
[Type: Software / Category: Protein Structure Analysis]

TANGO is a statistical mechanics computer algorithm developed for the prediction of aggregation nucleating regions in proteins, as well as of the effect of mutations and environmental conditions on the aggregation propensity of these regions.

TANGO is based on the physico-chemical principles of β-sheet formation, extended by the assumption that the core regions of an aggregate are fully buried. TANGO was benchmarked against 175 peptides from over 20 proteins and was able to predict the sequences experimentally observed to contribute to the aggregation of these proteins. TANGO also correctly predicts the aggregation propensities of several disease-related mutations in the Alzheimer's β-peptide, human lysozyme and transthyretin, and discriminates between β-sheet tendency and aggregation.

The success of TANGO confirms the model of intermolecular β-sheet formation as a widespread underlying mechanism of protein aggregation and opens the possibility of screening large databases for potential disease-related aggregation motifs, as well as optimizing recombinant protein yields by rationally out-designing protein aggregation.

TANGO was originally developed by Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.

For more information about this software, please click here

Waltz
An algorithm for the prediction of amyloidogenic regions in protein sequences
Serrano's Group (EMBL-CRG Partnership)
[Type: Software / Category: Protein Structure Analysis]

We have developed the WALTZ algorithm for the identification of amyloid-forming hexapeptides in amino acid sequences. WALTZ combines terms from amino acid sequence scoring in the learning set, physical property analysis and homology modelling. The method shows ~84% sensitivity at ~92% specificity on the AmylHex dataset, and correctly identifies mutations in human proteins known to be associated with amyloid deposition. The combination of the aggregation-predicting algorithm TANGO with WALTZ provides complete coverage of aggregation and amyloid tendency in protein sequences.
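
For readers unfamiliar with the two figures quoted above, the sketch below shows how sensitivity and specificity are computed from predicted and experimentally observed labels. The labels in the example are invented toy values, not AmylHex data, and the function name is hypothetical.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length boolean lists.

    sensitivity = TP / (TP + FN): fraction of true amyloid formers found.
    specificity = TN / (TN + FP): fraction of non-formers correctly rejected.
    Assumes both classes occur at least once in `actual`.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels for five hypothetical hexapeptides (illustration only).
    predicted = [True, True, False, False, True]
    actual = [True, False, False, False, True]
    print(sensitivity_specificity(predicted, actual))
```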

WALTZ is being developed by Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium, in collaboration with Luis Serrano and his team at the CRG in Barcelona, Spain.

For more information about this software, please click here

AStalavista
AStalavista, the Alternative Splicing Transcriptional Landscape Visualization Tool.
Guigó's Group
Category: n/a

AStalavista, the Alternative Splicing Transcriptional Landscape Visualization Tool and more, retrieves all alternative splicing events from generic transcript annotations.

For more information about this software, please click here

geneid
geneid is a program to predict genes along a DNA sequence in a large set of organisms
Guigó's Group
Category: Gene Analysis

geneid is a program developed at the CRG (Roderic Guigó's group) in collaboration with the Fundació Institut Mar d’Investigacions Mèdiques (IMIM) that predicts genes along a DNA sequence in a large set of organisms. While its accuracy compares favorably with that of other existing tools, geneid is more efficient in terms of speed and memory usage, and it offers some rudimentary support for integrating predictions from multiple sources.

For more information about this software, please click here

 

gff2aplot
Visualizing pair-wise alignments with annotated axes from GFF files
Guigó's Group
Category: n/a

gff2aplot is a program developed at the Fundació Institut Mar d’Investigacions Mèdiques (IMIM) in collaboration with the CRG (Roderic Guigó's group) that visualizes pair-wise alignments with annotated axes from GFF files.

For more information about this software, please click here

gff2ps
gff2ps is a program for visualizing annotations of genomic sequences
Guigó's Group
Category: Sequence Analysis

gff2ps is a program developed at the Fundació Institut Mar d’Investigacions Mèdiques (IMIM) in collaboration with the CRG (Roderic Guigó's group) to visualize annotations of genomic sequences. The program takes as input the annotated features on a genomic sequence in GFF format and produces a visual output in PostScript. It can be used in a very simple way, because it assumes that the GFF file itself carries enough formatting information, but it also allows a great degree of customization through a number of options and/or a configuration file.

For more information about this software, please click here

meta
meta is a program to produce and to align the TF-maps of two gene promoter regions.
Guigó's Group
Category: Gene Analysis

meta is a program developed at the Fundació Institut Mar d’Investigacions Mèdiques (IMIM) in collaboration with the CRG (Roderic Guigó's group) to produce and to align the TF-maps of two gene promoter regions. meta is very useful for characterizing promoter regions from orthologous genes, or from co-regulated genes in microarrays, as it improves the signal-to-noise ratio very significantly while still detecting the real functional sites.

For more information about this software, please click here

mmeta
mmeta is a program to produce and to align the TF-maps of multiple promoter regions.
Guigó's Group
Category: Gene Analysis

mmeta is a program developed at the Fundació Institut Mar d’Investigacions Mèdiques (IMIM) in collaboration with the CRG (Roderic Guigó's group) to produce and to align the TF-maps of multiple promoter regions. mmeta is very powerful for characterizing promoter regions from multiple orthologous genes, or from co-regulated genes in microarrays, as it improves the signal-to-noise ratio very significantly while still detecting the real functional sites.

For more information about this software, please click here

overlap
overlap is a program that computes the overlap between two sets of genomic features.
Guigó's Group
Category: Gene Analysis

overlap is a program that computes the overlap between two sets of genomic features. More precisely, it takes two GFF files of genomic features as input and, for each feature of the first set, reports whether it is overlapped by a feature of the second set. This is the basic mode; more precise information can also be retrieved.
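
The basic mode described above can be sketched in a few lines. This is an independent illustration, not the code of the overlap program: it assumes tab-separated GFF lines with the sequence name in column 1 and 1-based inclusive start and end in columns 4 and 5, ignores strand, and uses hypothetical function names.

```python
def parse_gff(lines):
    """Return (seqname, start, end) tuples from tab-separated GFF lines."""
    features = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        features.append((cols[0], int(cols[3]), int(cols[4])))
    return features


def overlapped(first, second):
    """For each feature of `first`, say whether any feature of `second` overlaps it.

    Two closed intervals on the same sequence overlap when each one starts
    no later than the other ends. Strand is ignored in this sketch.
    """
    return [
        any(s_seq == seq and s_start <= end and start <= s_end
            for s_seq, s_start, s_end in second)
        for seq, start, end in first
    ]


if __name__ == "__main__":
    set_a = parse_gff([
        "chr1\tsrc\texon\t100\t200\t.\t+\t.\tid=a1",
        "chr1\tsrc\texon\t300\t400\t.\t+\t.\tid=a2",
        "chr2\tsrc\texon\t100\t200\t.\t+\t.\tid=a3",
    ])
    set_b = parse_gff(["chr1\tsrc\texon\t150\t250\t.\t-\t.\tid=b1"])
    print(overlapped(set_a, set_b))
```

The pairwise comparison is quadratic in the number of features; for genome-scale files a sorted sweep or an interval index would be the usual choice.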

For more information about this software, please click here

Patronus
PATRONUS is a program designed to compute in a very fast way the exact probability of observing a given number of occurrences of a simple motif in a sequence.
Guigó's Group
Category: Sequence Analysis

PATRONUS (from "PATtern Recognition by Optimized Numerical Universal Scoring") is a program designed to compute in a very fast way the exact probability of observing a given number of occurrences of a simple motif (that is, a continuous word without gaps) in a sequence. Its intended scope is the analysis of very long biological sequences, like chromosomes or whole genomes of complex organisms. The probability is computed on the basis of the Markovian statistics of order m for the sequence, that is, the recorded number of occurrences of all submotifs of length m + 1 in the sequence. Contrary to what many people believe, computing such a probability for a generic motif is a computationally demanding task, mainly because motifs can overlap in non-trivial ways.
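
The two ingredients named above, overlapping motif occurrences and the counts of all words of length m + 1, can be illustrated with a short sketch. It only gathers these counts; it does not compute the exact probability that PATRONUS reports, and the function names are hypothetical.

```python
from collections import Counter


def count_occurrences(sequence, motif):
    """Count occurrences of `motif` in `sequence`, overlapping ones included."""
    return sum(
        sequence.startswith(motif, i)
        for i in range(len(sequence) - len(motif) + 1)
    )


def markov_counts(sequence, m):
    """Count every word of length m + 1, the raw statistics of an order-m model."""
    k = m + 1
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    # Overlaps matter: "AA" occurs three times in "AAAA", whereas
    # str.count only reports non-overlapping matches.
    print(count_occurrences("AAAA", "AA"), "AAAA".count("AA"))
    print(markov_counts("ACGACG", 1))
```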

For more information about this software, please click here

Project
project is a program that projects genomic features onto their sequences.
Guigó's Group
Category: Sequence Analysis

project is a program that projects genomic features onto their sequences.

For more information about this software, please click here

SECISaln
SECISaln will predict a SECIS element in the query sequence, split it into its constituent parts and align these against a precompiled database of eukaryotic SECIS elements.
Guigó's Group
Category: Sequence Analysis

SECISaln will predict a SECIS element in the query sequence, split it into its constituent parts and align these against a precompiled database of eukaryotic SECIS elements. The user can choose whether the database sequences are sorted by protein family or by species, thereby offering the possibility of comparing the submitted sequence to other, known SECISes. In addition, SECISaln returns a graphical image of the predicted structure of the user-submitted sequence as well as a multiple structural alignment of all SECIS elements of that type already present in the database.

For more information about this software, please click here

Selenoprofiles package
Selenoprofiles is a homology-based gene finding tool which is suitable for selenoprotein prediction in large nucleotide databases, like genomes.
Guigó's Group
Category: Gene Analysis

Selenoprofiles is a homology-based gene finding tool which is suitable for selenoprotein prediction in large nucleotide databases, like genomes. Selenoproteins are a group of proteins that contain selenocysteine (Sec), a rare amino acid inserted co-translationally into the protein chain. The Sec codon is UGA, which is normally a stop codon. In selenoproteins, UGA is recoded to Sec in the presence of specific signals on selenoprotein gene transcripts. Due to the dual role of the UGA codon, selenoprotein prediction and annotation are difficult tasks and are left mostly to manual analysis, since there are no reliable “gold standard” programs for this purpose. Here we present a homology-based in silico tool to scan genomes for members of the known selenoprotein families: selenoprofiles. This pipeline has features that make it suitable for selenoprotein prediction, and is shown to correctly predict selenoproteins that are badly annotated in Ensembl. Selenoprofiles is a Python-built pipeline that internally runs psitblastn, exonerate, genewise and SECISearch.
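
The dual role of UGA can be made concrete with a minimal sketch that lists in-frame UGA codons in a transcript. It is not part of Selenoprofiles and cannot decide whether a given UGA encodes Sec or stop; that decision needs the homology and SECIS evidence described above. The function name and the example sequence are hypothetical.

```python
def in_frame_uga_positions(mrna, frame=0):
    """Return 0-based start positions of in-frame UGA codons.

    DNA input is accepted too: T is converted to U before scanning.
    Every hit is only a candidate: it may be a stop codon or a Sec codon.
    """
    mrna = mrna.upper().replace("T", "U")
    return [
        i for i in range(frame, len(mrna) - 2, 3)
        if mrna[i:i + 3] == "UGA"
    ]


if __name__ == "__main__":
    # Codons in frame 0: AUG UGA GCU UGA
    print(in_frame_uga_positions("AUGUGAGCUUGA"))
```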

For more information about this software, please click here

SymCurv
SymCurv is a computational ab initio method for nucleosome positioning prediction.
Guigó's Group
Category: n/a

SymCurv is a computational ab initio method for nucleosome positioning prediction. It is based on a structural property of natural nucleosome-forming sequences: they are symmetrically curved around a local minimum of curvature. The method takes the primary DNA sequence as input and calculates the expected curvature, from which it deduces possible centers of nucleosomal sequences by imposing symmetry constraints. SymCurv's performance is comparable to that of existing tools, but it offers the additional advantage of predicting nucleosome positions under two assumed states (stationary and dynamic), providing insight into the remodelling potential of nucleosomes of possible regulatory function.

For more information about this software, please click here

The Flux Capacitor
The Flux Capacitor predicts abundances for transcript molecules and alternative splicing events from RNAseq experiments.
Guigó's Group
Category: n/a

The Flux Capacitor predicts abundances for transcript molecules and alternative splicing events from RNAseq experiments. Additionally, there is a simulation pipeline that is capable of simulating whole transcriptome sequencing experiments.

For more information about this software, please click here

The GEM (GEnome Multi-tool) Library
The GEM (GEnome Multi-tool) Library is a set of highly optimized tools for indexing/querying huge genomes/files.
Guigó's Group
Category: Gene Analysis

The GEM (GEnome Multi-tool) Library is a set of highly optimized tools for indexing/querying huge genomes/files. Provided so far are a very fast exhaustive mapper (the GEM mapper), an unconstrained split mapper (the GEM split mapper), and a very fast program to compute genome mappability (the GEM mappability).

For more information about this software, please click here

U12DB: The U12 Intron Database
The resource described, the U12 Intron Database (U12DB), aims to catalog the U12 introns of completely sequenced eukaryotic genomes and associate orthologous introns with each other.
Guigó's Group
Category: Gene Analysis

U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. Investigations into the evolution and mechanism of U12-dependent splicing would be facilitated by access to a catalog of such introns. However, due to their relatively recent discovery and a systematic bias against recognition of non-canonical splice sites in general, the introns defined by U12-type splice sites are under-represented in genome annotations. Such under-representation compounds the already difficult problem of determining gene structures. It also impedes attempts to study these introns genome-wide or phylum-wide. The resource described here, the U12 Intron Database (U12DB), aims to catalog the U12 introns of completely sequenced eukaryotic genomes and associate orthologous introns with each other.

For more information about this software, please click here

GOToolBox
This site provides a series of programs for the functional investigation of groups of genes, based on the Gene Ontology resource.
Guigó's Group
Category: Gene Analysis

This site provides a series of programs for the functional investigation of groups of genes, based on the Gene Ontology resource.

For more information about this software, please click here

Encode transcriptome "dashboard"
The transcriptome project aims to sequence various cell lines, and within those cell lines, different compartments, and RNA fractions, using different technologies.
Guigó's Group
Category: Gene Analysis

The transcriptome project aims to sequence various cell lines, and within those cell lines, different compartments, and RNA fractions, using different technologies. The main dashboard page allows you to filter experiments by the parameters given (i.e. cell_line, rna_type, technology, compartment). The experiments link to a list of file downloads for the experiments, as well as some statistical data on the files that have been produced.

For more information about this software, please click here

PhylomeDB
PhylomeDB is a public database for complete collections of gene phylogenies (phylomes).
Gabaldón's Group
Category: Gene Analysis

PhylomeDB is a public database for complete collections of gene phylogenies (phylomes). It allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments. Moreover, PhylomeDB provides genome-wide orthology and paralogy predictions based on the analysis of the phylogenetic trees.

For more information about this software, please click here

DeathBase
DeathBase is a database of proteins involved in cell death.
Gabaldón's Group
Category: Protein Analysis

DeathBase is a database of proteins involved in cell death. It compiles relevant data on the function, structure and evolution of proteins involved in apoptosis and other forms of cell death in several organisms. Information contained in this database is subject to manual curation. You can help maintain DeathBase by editing the wiki page for any protein.

For more information about this software, please click here

trimAl
trimAl is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. It also includes readAl, a format converter between most alignment formats.
Gabaldón's Group
Category: Sequence Analysis

trimAl is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. It also includes readAl, a format converter between most alignment formats.
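
What removing poorly aligned regions can mean in practice is sketched below with the simplest possible criterion, a gap-fraction threshold per column. This is a generic illustration and an assumption made here, not trimAl's actual method or command-line interface; the function name and the threshold value are hypothetical.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds `max_gap_fraction`.

    `alignment` is a list of equal-length aligned sequences using '-' for gaps.
    """
    n_seqs = len(alignment)
    keep = [
        col for col in range(len(alignment[0]))
        if sum(seq[col] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


if __name__ == "__main__":
    aligned = ["AC-GT", "A--GT", "ACTG-"]
    # The third column is a gap in two of three sequences and is removed.
    print(trim_gappy_columns(aligned))
```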

For more information about this software, please click here

ETE
ETE (Environment for Tree Exploration) is a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees.
Gabaldón's Group
Category: n/a

ETE (Environment for Tree Exploration) is a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. Besides a broad set of tree handling options, ETE’s current version provides specific methods to analyze phylogenetic and clustering trees. It also supports large tree data structures, node annotation, independent editing and analysis of tree partitions, and the association of trees with external data such as multiple sequence alignments or numerical matrices. The first version of ETE was developed in collaboration with Dr. Joaquín Dopazo's lab at the Centro de Investigación Príncipe Felipe (CIPF).

For more information about this software, please click here

MetaPhOrs
MetaPhOrs is a public repository of phylogeny-based orthology and paralogy predictions for most species with fully-sequenced genomes.
Gabaldón's Group
Category: Gene Analysis

MetaPhOrs is a public repository of phylogeny-based orthology and paralogy predictions for most species with fully-sequenced genomes.

For more information about this software, please click here