biotools
tools for biological data analysis
Curated by bitslife

baySeq : Empirical Bayesian analysis of patterns of differential expression in count data

This package identifies differential expression in high-throughput 'count' data, such as that derived from next-generation sequencing machines, calculating estimated posterior likelihoods of differential expression (or more complex hypotheses) via empirical Bayesian methods.

DEGseq : Identify Differentially Expressed Genes from RNA-seq data

DEGseq is an R package to identify differentially expressed genes from RNA-Seq data.

MEME Suite - Motif-based sequence analysis tools

The MEME Suite web server is a unified portal for discovery and analysis of sequence motifs (e.g. DNA binding sites, protein interaction domains). Motifs with/without gaps may be discovered, searched for in DNA and protein databases, compared to other motifs and analyzed for putative function by association with gene ontology terms. MEME is a tool designed for discovering and searching for DNA motifs such as transcription factor binding sites (TFBS) or protein domains.

CummeRbund - An R package for persistent storage, analysis, and visualization of RNA-Seq from cufflinks output

High-throughput sequencing of RNA fragments is a powerful technique with many applications, including but not limited to transcript assembly, quantitation, and differential expression analysis. The results of these analyses are often very large data sets with a high degree of relations between various data types, and they can be somewhat overwhelming. CummeRbund was designed to help simplify the analysis and exploration of RNA-Seq data derived from the output of a differential expression analysis using cuffdiff, with the goal of providing fast and intuitive access to your results.

CummeRbund takes the various output files from a cuffdiff run and creates a SQLite database of the results describing appropriate relationships between genes, transcripts, transcription start sites, and CDS regions. Once stored and indexed, data for these features, even across multiple samples or conditions, can be retrieved very efficiently, allowing the user to explore subfeatures of individual genes, or gene sets, as the analysis requires. We have also implemented numerous plotting functions for commonly used visualizations. Check back often as we are constantly updating features.
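
The idea of storing results in an indexed relational database can be sketched in a few lines of Python with the standard `sqlite3` module. This is only an illustration: the table names, column names, and helper functions below are hypothetical and are not CummeRbund's actual schema or API.

```python
import sqlite3

def build_db(rows):
    """Load (gene_id, isoform_id, sample, fpkm) tuples into an in-memory
    SQLite database. The schema is hypothetical, chosen for illustration."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE genes (gene_id TEXT PRIMARY KEY);
        CREATE TABLE isoforms (
            isoform_id TEXT PRIMARY KEY,
            gene_id    TEXT NOT NULL REFERENCES genes(gene_id)
        );
        CREATE TABLE isoform_fpkm (
            isoform_id TEXT NOT NULL REFERENCES isoforms(isoform_id),
            sample     TEXT NOT NULL,
            fpkm       REAL NOT NULL
        );
        -- Indexes make per-gene lookups cheap even for large result sets.
        CREATE INDEX idx_isoforms_gene ON isoforms(gene_id);
        CREATE INDEX idx_fpkm_isoform ON isoform_fpkm(isoform_id);
    """)
    for gene_id, isoform_id, sample, fpkm in rows:
        con.execute("INSERT OR IGNORE INTO genes VALUES (?)", (gene_id,))
        con.execute("INSERT OR IGNORE INTO isoforms VALUES (?, ?)",
                    (isoform_id, gene_id))
        con.execute("INSERT INTO isoform_fpkm VALUES (?, ?, ?)",
                    (isoform_id, sample, fpkm))
    con.commit()
    return con

def isoform_fpkm_for_gene(con, gene_id):
    """Return (isoform_id, sample, fpkm) for every isoform of one gene."""
    cur = con.execute(
        """SELECT i.isoform_id, f.sample, f.fpkm
           FROM isoforms AS i
           JOIN isoform_fpkm AS f ON i.isoform_id = f.isoform_id
           WHERE i.gene_id = ?
           ORDER BY i.isoform_id, f.sample""",
        (gene_id,),
    )
    return cur.fetchall()
```

Calling `isoform_fpkm_for_gene(con, "geneA")` then returns the expression values for all isoforms of that (made-up) gene across samples, which is the kind of subfeature lookup the description refers to.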

CAP3 and PCAP Assembly Programs


CAP3 and PCAP are available for use at companies under a licensing agreement from Michigan Tech. For information on a licensing agreement, please contact Robin Kolehmainen by email at rakolehm@mtu.edu or by phone at 906-487-2228. Michigan Tech handles licensing agreements on PCAP for Iowa State.

Burrows-Wheeler Aligner

Introduction

Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates. Please see the BWA manual page for more information.

Bowtie: An ultrafast, memory-efficient short read aligner

Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
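
To make the idea of a Burrows-Wheeler index concrete, here is a toy Python sketch of exact-match lookup by backward search. It is not Bowtie's implementation: it sorts suffixes naively and stores a full suffix array and full occurrence tables, whereas a real aligner compresses or samples these structures to stay within a few gigabytes. The function names `bwt_index` and `find` are made up for this example.

```python
from collections import Counter

def bwt_index(text):
    """Build a toy Burrows-Wheeler index of `text` (naive suffix sorting)."""
    text += "$"  # sentinel that sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    counts = Counter(text)
    # first[c] = number of characters in text that sort strictly before c.
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, first, occ

def find(pattern, index):
    """Return sorted start positions of exact matches of `pattern`."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for c in reversed(pattern):  # extend the match one character leftwards
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))
```

Each pattern character narrows the candidate range with two table lookups, so the search cost depends on the read length rather than on the genome length.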

DESeq : analyse count data from high-throughput sequencing assays

DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression.

edgeR : Empirical analysis of digital gene expression data in R

Differential expression analysis of RNA-seq and digital gene expression profiles with biological replication. Uses empirical Bayes estimation and exact tests based on the negative binomial distribution. Also useful for differential signal analysis with other types of genome-scale count data.
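
As a small illustration of why the negative binomial distribution suits count data with biological replication, the Python sketch below evaluates its probability mass function in the mean/dispersion parameterisation, where the variance is mu + phi * mu^2 and therefore exceeds the Poisson variance mu whenever phi > 0. This only illustrates the distributional assumption; it is not edgeR's exact test or its dispersion estimation, and the function names are invented for the example.

```python
import math

def nb_logpmf(k, mu, phi):
    """log P(K = k) for a negative binomial with mean `mu` and
    dispersion `phi`, i.e. variance mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu))
            + k * math.log(mu / (r + mu)))

def nb_moments(mu, phi, kmax=2000):
    """Numerically sum the pmf over 0..kmax-1 and return
    (total probability, mean, variance) as a sanity check."""
    probs = [math.exp(nb_logpmf(k, mu, phi)) for k in range(kmax)]
    total = sum(probs)
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return total, mean, var
```

For example, with a mean of 20 reads and a dispersion of 0.1, the variance is 20 + 0.1 * 400 = 60, three times what a Poisson model would allow for the same mean.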

Contrail: Assembly of Large Genomes using Cloud Computing

The first step towards analyzing a previously unsequenced organism is to assemble the reads by merging similar reads into progressively longer sequences. New assemblers such as Velvet and Euler attempt to solve the assembly problem by constructing, simplifying, and traversing the de Bruijn graph of the read sequences. Nodes in the graph represent substrings of the reads, and directed edges connect consecutive substrings. Genome assembly is then modeled as finding an Eulerian tour through the graph, although repeats may lead to multiple possible tours. As such, assemblers primarily focus on correcting errors, reconstructing unambiguous regions, and resolving short repeats. These assemblers have successfully assembled small genomes from short reads, but have had limited success scaling to larger mammalian-sized genomes, in part because they require constructing and manipulating graphs far larger than can fit into memory.
Addressing this limitation, we have developed a new assembly program, Contrail, that uses Hadoop for de novo assembly of large genomes from short sequencing reads. Similar to other leading short read assemblers, Contrail relies on the graph-theoretic framework of de Bruijn graphs. However, unlike these programs, which require large RAM resources, Contrail relies on Hadoop to iteratively transform an on-disk representation of the assembly graph, allowing an in-depth analysis even for large genomes. Preliminary results show that Contrail's contigs are of similar size and quality to those generated by Velvet when applied to small (bacterial) genomes, but that Contrail provides vastly superior scaling capabilities when applied to large genomes. We are also developing extensions to Contrail to efficiently compute a traditional overlap-graph based assembly of large genomes within Hadoop, a strategy that will be especially valuable as read lengths increase beyond 100bp.
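
The de Bruijn formulation described above can be sketched in Python on a toy example. This is a single-machine, in-memory illustration only, not Contrail's Hadoop-based implementation: it assumes error-free reads, collapses duplicate k-mers, and assumes that an Eulerian path exists. The function names and the example reads are made up.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Build a de Bruijn graph: each distinct k-mer is an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # collapse k-mers shared by overlapping reads
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph

def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph has an Eulerian path."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next((u for u in graph if len(graph[u]) - in_deg[u] == 1),
                 next(iter(graph)))
    edges = {u: list(vs) for u, vs in graph.items()}  # consumable copy
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if edges.get(u):
            stack.append(edges[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())      # dead end: emit node
    return path[::-1]

def spell(path):
    """Turn a node path back into a sequence."""
    return path[0] + "".join(node[-1] for node in path[1:])

reads = ["ATGGCG", "GGCGTA", "CGTACA"]  # toy, error-free, overlapping reads
contig = spell(eulerian_path(de_bruijn(reads, 4)))  # "ATGGCGTACA"
```

In this toy input every 3-mer node is unique, so the tour is unambiguous. With repeated (k-1)-mers, several valid tours can exist, which is the repeat problem the description mentions.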

Contrail enables de novo assembly of large genomes from short reads by bridging research in computational biology with research in high-performance computation. This combination is essential in light of the large data sets involved, and has the potential to unlock discoveries of critical magnitude. Whereas the published analysis of the African and Asian human individuals used read mapping to discover conserved regions and regions with small polymorphisms, de novo assembly has the unique potential to also discover large-scale polymorphisms between these individuals and the reference human genome. Mapping the large-scale differences is an important step towards a better understanding of our own biology, and may reveal previously unknown characteristics of the human genome related to health or disease. Furthermore, a short read assembler for large genomes is also essential for sequencing the vast numbers of complex organisms that have never been sequenced before, and will directly contribute to new biological knowledge.

 

Gaurav Kaul's curator insight, May 22, 2013 7:35 AM

Thanks, this is helpful. It would be nice to have more info on how to run Contrail on some kind of cloud infrastructure.

TopHat : A spliced read mapper for RNA-Seq

TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

 

TopHat is a collaborative effort between the Institute of Genetic Medicine at Johns Hopkins University, the Departments of Mathematics and Molecular and Cell Biology at the University of California, Berkeley and the Department of Stem Cell and Regenerative Biology at Harvard University.

Cufflinks - Transcript assembly, differential expression, and differential regulation for RNA-Seq

Open source tools for transcript assembly, differential expression, and differential regulation for RNA-Seq...

Myrna: Cloud-scale differential gene expression for RNA-seq

Myrna is a cloud computing tool for calculating differential gene expression in large RNA-seq datasets. Myrna uses Bowtie for short read alignment and R/Bioconductor for interval calculations, normalization, and statistical testing. These tools are combined in an automatic, parallel pipeline that runs in the cloud (Elastic MapReduce in this case), on a local Hadoop cluster, or on a single computer, exploiting multiple computers and CPUs wherever possible.
