model Selection in Phylogenetics based on algebraic INvariants
sQTLseekeR is an R package to detect splicing QTLs (sQTLs), which are variants associated with changes in the splicing pattern of a gene. Here, splicing patterns are modeled by the relative expression of the transcripts of a gene.
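To make "relative expression of the transcripts of a gene" concrete, here is a minimal Python sketch that turns per-transcript expression values into relative abundances. It is an illustration only, not sQTLseekeR code (the package itself is written in R); the function name `splicing_ratios` and the sample values are hypothetical.

```python
def splicing_ratios(transcript_expr):
    """Convert per-transcript expression of one gene into relative abundances.

    transcript_expr maps a transcript ID to its expression in one sample.
    The returned values sum to 1 unless the gene is not expressed at all.
    """
    total = sum(transcript_expr.values())
    if total == 0:
        # Gene not expressed in this sample: no splicing pattern to describe.
        return {tx: 0.0 for tx in transcript_expr}
    return {tx: value / total for tx, value in transcript_expr.items()}


# Made-up expression values for a gene with three transcripts.
sample = {"tx1": 30.0, "tx2": 10.0, "tx3": 0.0}
ratios = splicing_ratios(sample)
print(ratios)
```

A change in these ratios between genotype groups, rather than a change in total gene expression, is what a splicing QTL would be expected to show.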
Starcode is DNA sequence clustering software.
Starcode is DNA sequence clustering software. Sequences are clustered by finding all pairs that lie within a given Levenshtein distance of each other. Typically, a file containing a set of related DNA sequences is passed as input, together with a parameter specifying the desired cluster distance. Starcode aligns all sequence pairs, computes the distance between them, and prints one line per cluster containing the canonical DNA sequence, the sequence count, and the list of sequences that belong to the cluster.
Starcode has many applications in the field of biology, such as DNA/RNA motif recovery, barcode clustering, sequencing error recovery, etc.
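The idea of clustering by Levenshtein distance can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not Starcode's actual algorithm or output format: it compares each sequence against existing cluster centers one by one, treats the most abundant sequence as the canonical one, and uses hypothetical names (`levenshtein`, `cluster`) throughout.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cluster(sequences, max_distance):
    """Greedy clustering: each sequence joins the first cluster whose
    canonical sequence is within max_distance, else starts a new cluster."""
    counts = Counter(sequences)
    # Most abundant sequences first; ties broken alphabetically.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    clusters = []
    for seq in ordered:
        for c in clusters:
            if levenshtein(seq, c["canonical"]) <= max_distance:
                c["count"] += counts[seq]
                c["members"].append(seq)
                break
        else:
            clusters.append({"canonical": seq,
                             "count": counts[seq],
                             "members": [seq]})
    return clusters


# Made-up reads: two barcodes, each with one sequencing error variant.
reads = ["ACGT"] * 3 + ["ACGA"] + ["TTTT"] * 2 + ["TTTA"]
result = cluster(reads, max_distance=1)
for c in result:
    # One line per cluster: canonical sequence, count, member list.
    print(c["canonical"], c["count"], ",".join(c["members"]), sep="\t")
```

The all-against-centers comparison is quadratic in the worst case, so this sketch is only suitable for small inputs.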
SymCurv is a computational ab initio method for nucleosome positioning prediction.
SymCurv is a computational ab initio method for nucleosome positioning prediction. It is based on a structural property of natural nucleosome-forming sequences: they tend to be symmetrically curved around a local minimum of curvature.
T-Coffee is a multiple sequence alignment package.
T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle, etc.) into one unique alignment (M-coffee). T-Coffee can align Protein, DNA and RNA sequences. It is also able to combine sequence information with protein structural information (Expresso), profile information (PSI-Coffee) or RNA secondary structures (R-Coffee).
An algorithm for the prediction of aggregating regions in unfolded polypeptide chains
TANGO is a statistical mechanics computer algorithm developed for the prediction of aggregation nucleating regions in proteins, as well as of the effect of mutations and environmental conditions on the aggregation propensity of these regions.
TANGO is based on the physico-chemical principles of β-sheet formation, extended by the assumption that the core regions of an aggregate are fully buried. TANGO was benchmarked against 175 peptides from over 20 proteins and was able to predict the sequences experimentally observed to contribute to the aggregation of these proteins. TANGO also correctly predicts the aggregation propensities of several disease-related mutations in the Alzheimer's β-peptide, human lysozyme and transthyretin, and discriminates between β-sheet tendency and aggregation.
The success of TANGO confirms the model of intermolecular β-sheet formation as a widespread underlying mechanism of protein aggregation and opens the possibility of screening large databases for potential disease-related aggregation motifs, as well as optimizing recombinant protein yields by rationally out-designing protein aggregation.
TANGO was originally developed by Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.
The Flux Capacitor predicts abundances for transcript molecules and alternative splicing events from RNAseq experiments.
The Flux Capacitor predicts abundances for transcript molecules and alternative splicing events from RNAseq experiments. Additionally, there is a simulation pipeline capable of simulating whole-transcriptome sequencing experiments.
The GEM (GEnome Multi-tool) Library is a set of highly optimized tools for indexing and querying huge genomes and files.
A set of highly optimized tools for indexing and querying huge genomes and files. Provided so far: a very fast exact mapper and an unconstrained split-mapper.
trimAl is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. It also includes readAl, a format converter between most alignment formats.
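As a rough illustration of what removing poorly aligned regions can mean, the Python sketch below drops alignment columns whose gap fraction exceeds a threshold. This is a deliberately simple stand-in, not trimAl's own method or interface; the function name `trim_gappy_columns`, the `max_gap_fraction` parameter and the example alignment are all hypothetical.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove columns in which more than max_gap_fraction of rows are gaps.

    alignment is a list of equal-length strings using '-' for gaps.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_rows = len(alignment)
    # Indices of the columns that are kept.
    keep = [col for col in range(length)
            if sum(row[col] == "-" for row in alignment) / n_rows
            <= max_gap_fraction]
    return ["".join(row[col] for col in keep) for row in alignment]


# Made-up alignment: the third column is a gap in two of three sequences.
aln = ["AC-GT",
       "A--GT",
       "ACTG-"]
trimmed = trim_gappy_columns(aln, max_gap_fraction=0.5)
print("\n".join(trimmed))
```

Real trimming tools typically combine several criteria, such as gap content and residue similarity, rather than a single fixed cutoff.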
The resource described, the U12 Intron Database (U12DB), aims to catalog the U12 introns of completely sequenced eukaryotic genomes and associate orthologous introns with each other.
U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. Investigations into the evolution and mechanism of U12-dependent splicing would be facilitated by access to a catalog of such introns. However, due to their relatively recent discovery and a systematic bias against recognition of non-canonical splice sites in general, the introns defined by U12-type splice sites are under-represented in genome annotations. Such under-representation compounds the already difficult problem of determining gene structures. It also impedes attempts to study these introns genome-wide or phylum-wide. The resource described here, the U12 Intron Database (U12DB), aims to catalog the U12 introns of completely sequenced eukaryotic genomes and associate orthologous introns with each other.
An algorithm for the prediction of amyloidogenic regions in protein sequences
We have developed the WALTZ algorithm for the identification of amyloid-forming hexapeptides in amino acid sequences. WALTZ combines terms from amino acid sequence scoring in the learning set, physical property analysis and homology modelling. The method shows ~84% sensitivity at ~92% specificity on the AmylHex dataset, and correctly identifies mutations in human proteins known to be associated with amyloid deposition. Combining the aggregation-predicting algorithm TANGO with WALTZ provides complete coverage of aggregation and amyloid tendency in protein sequences.
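For readers unfamiliar with the two metrics quoted above, the Python sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are invented purely for illustration and are not taken from the AmylHex benchmark; the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real positives (e.g. amyloid-forming peptides) detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of real negatives correctly predicted as negative."""
    return true_neg / (true_neg + false_pos)


# Invented counts: 10 amyloid-forming and 10 non-forming hexapeptides.
sens = sensitivity(true_pos=8, false_neg=2)
spec = specificity(true_neg=9, false_pos=1)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting sensitivity at a stated specificity, as done for WALTZ, fixes the false-positive rate first and then asks how many true positives are still recovered.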
WALTZ is being developed by Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium, in collaboration with Luis Serrano and his team at the CRG in Barcelona, Spain.
Zerone discretizes several ChIP-seq replicates simultaneously and resolves conflicts between them. After the job is done, Zerone checks the results and tells you whether they pass quality control.