Scooped by Biswapriya Biswavas Misra
onto Databases & Softwares

Dragon TIS Spotter - Search for Translation Initiation Sites (TISs) in Arabidopsis Genomic Sequences

:: DESCRIPTION

Dragon TIS Spotter searches for Translation Initiation Sites (TISs) in Arabidopsis genomic sequences provided in FASTA format. The program analyzes the content of sliding windows of 300 bp of DNA sequence, assuming the TIS is located at positions 150-152 of the window, counted from the 5′ end.
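
The windowing described above can be sketched in Python as follows. This is only an illustration of the window geometry, not the program's own code: the function name `candidate_windows` is hypothetical, and treating ATG as the candidate start codon is an assumption, since the description does not say how candidates are chosen or how a window is scored.

```python
from typing import Iterator, Tuple

WINDOW = 300      # window length in bp, as stated in the description
TIS_START = 150   # 1-based position of the first base of the candidate TIS


def candidate_windows(seq: str) -> Iterator[Tuple[int, str]]:
    """Yield (1-based ATG position in seq, 300-bp window) for each
    ATG that can be placed at positions 150-152 of a full window.

    Assumption: ATG is used as the candidate start codon. Scoring of
    the window (the actual prediction step) is not implemented here.
    """
    seq = seq.upper()
    offset = TIS_START - 1  # 0-based offset of the codon inside the window
    for start in range(0, len(seq) - WINDOW + 1):
        window = seq[start:start + WINDOW]
        if window[offset:offset + 3] == "ATG":
            yield start + TIS_START, window


if __name__ == "__main__":
    demo = "C" * 200 + "ATG" + "C" * 200
    for position, window in candidate_windows(demo):
        print(position, len(window))
```

An ATG that lies closer than 149 bp to the 5′ end or 148 bp to the 3′ end of the input cannot be framed by a full window, so this sketch skips it.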


Databases & Softwares
Genomic, Proteomic, Transcriptomic, Metabolomic Softwares and Databases
Scooped by Biswapriya Biswavas Misra

DupChecker: a bioconductor package for checking high-throughput genomic data redundancy in meta-analysis

Abstract
Background

Meta-analysis has become a popular approach for high-throughput genomic data analysis because it can often significantly increase the power to detect biological signals or patterns in datasets. However, when publicly available databases are used for meta-analysis, duplication of samples is a frequently encountered problem, especially for gene expression data. Failing to remove duplicates could lead to false-positive findings, misleading clustering patterns, or model over-fitting in the subsequent data analysis.
Results

We developed a Bioconductor package, DupChecker, that efficiently identifies duplicated samples by generating MD5 fingerprints for raw data. A real data example demonstrates the usage and output of the package.
Conclusions

Researchers may not pay enough attention to checking and removing duplicated samples, and then data contamination could make the results or conclusions from meta-analysis questionable. We suggest applying DupChecker to examine all gene expression data sets before any data analysis step.
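
The fingerprinting idea in the abstract can be sketched in a few lines of Python. This is not DupChecker's R interface: the function names and sample names below are hypothetical, and the raw data is passed in as bytes purely to keep the example self-contained.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def md5_fingerprint(raw: bytes) -> str:
    """Return the MD5 hex digest of a sample's raw data."""
    return hashlib.md5(raw).hexdigest()


def find_duplicates(samples: Dict[str, bytes]) -> List[List[str]]:
    """Group sample names whose raw data share the same MD5 fingerprint.

    Only groups with more than one member (i.e. duplicates) are returned.
    """
    by_digest = defaultdict(list)
    for name, raw in samples.items():
        by_digest[md5_fingerprint(raw)].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    # Hypothetical sample names and contents, for illustration only.
    samples = {
        "study_A/sample_1": b"raw intensities 1",
        "study_A/sample_2": b"raw intensities 2",
        "study_B/sample_7": b"raw intensities 1",  # same bytes as sample_1
    }
    print(find_duplicates(samples))
```

Byte-identical raw files always share a fingerprint, so this catches the same file deposited under different sample names; it does not catch a sample that was re-processed or re-encoded before deposition.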

Scooped by Biswapriya Biswavas Misra

Welcome to Tomato Genomic Resources Database: Home Page

Tomato Genomic Resources Database: An Integrated Repository of Useful Tomato Genomic Information for Basic and Applied Research.

Scooped by Biswapriya Biswavas Misra

Eggplant Genome DataBase

Eggplant Genome Database at Kazusa DNA Research Institute.
Biswapriya Biswavas Misra's insight:
The eggplant (Solanum melongena L.) is one of the most important vegetable crop species in Japan as well as in other Asian, Middle and Near Eastern, Mediterranean and African countries. Eggplant belongs to the Solanaceae family, which includes tomato, potato and pepper, but unlike these relatives, it is endemic to the Old World. To bring this unique solanaceous member onto the genomics stage and give it a central role in molecular genetic and physiological studies, whole-genome sequencing of eggplant has been carried out to construct a draft genome dataset.

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

Fiona: a parallel and automatic strategy for read error correction

RT @druvus: Fiona: a parallel and automatic strategy for read error correction http://t.co/voU7wksk48

Via Mel Melendrez-Vallard

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

PRISE2: Software for designing sequence-selective PCR primers and probes

Background:
PRISE2 is a new software tool for designing sequence-selective PCR primers and probes.

Via Mel Melendrez-Vallard

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

Sushi.R: flexible, quantitative and integrative genomic visualizations for publication-quality multi-panel figures

Sushi.R looks pretty sweet. http://t.co/aXHXaejzjr

Via Mel Melendrez-Vallard

Scooped by Biswapriya Biswavas Misra

RPPanalyzer Toolbox: An improved R package for analysis of reverse phase protein array data

Analysis of large-scale proteomic data sets requires specialized software tools, tailored toward the requirements of individual approaches. Here we introduce an extension of an open-source software solution for analyzing reverse phase protein array (RPPA) data. The R package RPPanalyzer was designed for data preprocessing followed by basic statistical analyses and proteomic data visualization. In this update, we merged relevant data preprocessing steps into a single user-friendly function and included a new method for background noise correction as well as new methods for noise estimation and averaging of replicates to transform data in such a way that they can be used as input for a new time course plotting function. We demonstrate the robustness of our enhanced RPPanalyzer platform by analyzing longitudinal RPPA data of MET receptor signaling upon stimulation with different hepatocyte growth factor concentrations.

Scooped by Biswapriya Biswavas Misra

InvFEST, a database integrating information of polymorphic inversions in the human genome. - Abstract - Europe PubMed Central

Abstract: The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data...

Scooped by Biswapriya Biswavas Misra

ExaBayes: Massively Parallel Bayesian Tree Inference for the Whole-Genome Era

RT @RamiroHojas: Nice, a software to run Bayesian phylogenetic analyses for genomic datasets! http://t.co/IhAgHZn5pc

Scooped by Biswapriya Biswavas Misra

PeptideManager: a peptide selection tool for targeted proteomic studies involving mixed samples from different species. - ncbi.nlm.nih.gov


Scooped by Biswapriya Biswavas Misra

ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes.

Abstract: ARG-ANNOT (Antibiotic Resistance Gene-ANNOTation) is a new bioinformatic tool that was created to detect existing and putative new antibiotic...

Scooped by Biswapriya Biswavas Misra

RAMONA: a web application for gene set analysis on multilevel omics data

Summary: Decreasing costs of modern high-throughput experiments allow for the simultaneous analysis of altered gene activity on various molecular levels. However, these multi-omics approaches lead to a large amount of data which is hard to interpret for a non-bioinformatician. Here, we present the remotely accessible multilevel ontology analysis (RAMONA). It offers an easy-to-use interface for the simultaneous gene set analysis of combined omics datasets and is an extension of the previously introduced MONA approach. RAMONA is based on a Bayesian enrichment method for the inference of overrepresented biological processes among given gene sets. Overrepresentation is quantified by interpretable term probabilities. It is able to handle data from various molecular levels, while in parallel coping with redundancies arising from gene set overlaps and related multiple testing problems. The comprehensive output of RAMONA is easy to interpret and thus allows for functional insight into the affected biological processes. With RAMONA, we provide an efficient implementation of the Bayesian inference problem such that ontologies consisting of thousands of terms can be processed in the order of seconds.

Scooped by Biswapriya Biswavas Misra

Investigation of therapeutic effectiveness of active components in Sini decoction by a comprehensive GC/LC-MS based metabolomics and network pharmacology approaches

As a classical formula, Sini decoction (SND) has been proven clinically effective in treating doxorubicin (DOX)-induced cardiomyopathy. Current chemomics and pharmacology studies indicate that the total alkaloids (TA), total gingerols (TG), and total flavones and total saponins (TFS) are the major active ingredients of Aconitum carmichaeli, Zingiber officinale and Glycyrrhiza uralensis in SND, respectively. Our animal experiments in this study demonstrated that the above active ingredients (TAGFS) were more effective than formulas formed by any one or two of the three individual components and nearly as effective as SND. However, very little is known about the action mechanisms of TAGFS. Thus, this study aimed to use, for the first time, a combination of GC/LC-MS-based metabolomics and network pharmacology to address this problem. Metabolomics showed that TAGFS worked by regulating six primary pathways. Network pharmacology was then applied to search for specific targets. Seventeen potential cardiovascular-related targets were found through molecular docking, 11 of which were identified by references, which demonstrated the therapeutic effectiveness of TAGFS by network pharmacology. Among these targets, four (phosphoinositide 3-kinase gamma, insulin receptor, ornithine aminotransferase and glucokinase) were involved in the pathways that TAGFS regulated. Moreover, phosphoinositide 3-kinase gamma, insulin receptor and glucokinase were shown to be targets of active components in SND. In addition, our data indicated TA as the principal ingredient in the SND formula, whereas TG and TFS served as adjuvant ingredients. We therefore suggest that dissecting the mode of action of clinically effective formulae with the combined use of metabolomics and network pharmacology may be a good strategy for exploring the action mechanisms of Traditional Chinese Medicine.

Scooped by Biswapriya Biswavas Misra

Multicriteria global optimization for biocircuit design

Abstract
Background

One of the challenges in Synthetic Biology is to design circuits with increasing levels of complexity. While circuits in Biology are complex and subject to natural tradeoffs, most synthetic circuits are simple in terms of the number of regulatory regions, and have been designed to meet a single design criterion.
Results

In this contribution we introduce a multiobjective formulation for the design of biocircuits. We set up the basis for an advanced optimization tool for the modular and systematic design of biocircuits capable of handling high levels of complexity and multiple design criteria. Our methodology combines the efficiency of global Mixed Integer Nonlinear Programming solvers with multiobjective optimization techniques. Through a number of examples we show the capability of the method to generate non-intuitive designs with a desired functionality, setting the desired level of complexity a priori.
Conclusions

The methodology presented here can be used for biocircuit design and also to explore and identify different design principles for synthetic gene circuits. The presence of more than one competing objective provides a realistic design setting where every solution represents an optimal trade-off between different criteria.
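
The notion of an "optimal trade-off" between competing objectives can be illustrated with a small Pareto filter in Python. This is a sketch of the general concept only and not the Mixed Integer Nonlinear Programming approach of the paper; the function names are hypothetical, and all objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (cost, error) pairs for five candidate designs.
    designs = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
    print(pareto_front(designs))
```

Each design that survives the filter can only be improved in one objective by accepting a worse value in another, which is the sense in which every remaining solution is a trade-off.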

Scooped by Biswapriya Biswavas Misra

GOLD | Home

GOLD (Genomes Online Database) is a World Wide Web resource for comprehensive access to information regarding genome and metagenome sequencing projects, and their associated metadata, around the world.

Scooped by Biswapriya Biswavas Misra

MAKER-P: a tool-kit for the rapid creation, management, and quality control of plant genome annotations

We have optimized and extended the widely used annotation engine MAKER in order to better support plant genome annotation efforts. New features include better parallelization for large repeat-rich plant genomes, ncRNA annotation capabilities, and support for pseudogene identification. We have benchmarked the resulting software toolkit, MAKER-P, using the A. thaliana and Z. mays genomes. Here we demonstrate the ability of the MAKER-P toolkit to automatically update, extend, and revise the A. thaliana annotations in light of newly available data, and to annotate pseudogenes and ncRNAs absent from the TAIR10 build. Our results demonstrate that MAKER-P can be used to manage and improve the annotations of even A. thaliana, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P on the Texas Advanced Computing Center (TACC). We show that this public resource can de novo annotate the entire Arabidopsis and Zea mays genomes in less than three hours, and produce annotations of comparable quality to those of the current TAIR10 and Z. mays V2 annotation builds.

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme

High-throughput transcriptome sequencing (RNA-seq) technology promises to discover novel protein-coding and non-coding transcripts, particularly the identification of long non-coding RNAs (lncRNAs) from de novo sequencing data.

Via Mel Melendrez-Vallard

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision

SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision http://t.co/1UfXL3r4Ud

Via Mel Melendrez-Vallard

Rescooped by Biswapriya Biswavas Misra from Bioinformatics Software: Sequence Analysis

[1409.7208] MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph

MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly
http://t.co/hZE5V294CR (ht @homolog_us)

Via Mel Melendrez-Vallard

Scooped by Biswapriya Biswavas Misra

TRIPATH: A Biological Genetic and Genomic Database of Three Economically Important Fungal Pathogen of Wheat - Rust: Smut: Bunt.

Biswapriya Biswavas Misra's insight:

Wheat, the major source of vegetable protein in the human diet, provides staple food globally for a large proportion of the human population. With a higher protein content than other major cereals, wheat has great socioeconomic importance. Nonetheless, three important fungal pathogens of wheat (rust, smut and bunt) are a major cause of significant yield losses throughout the world. Researchers are putting up a strong fight against devastating wheat pathogens, and have made progress in tracking and controlling disease outbreaks from East Africa to South Asia. The aim of the present work was therefore to develop a fungal pathogens database dedicated to wheat, gathering information about different pathogen species and linking them to their biological classification, distribution and control. Towards this end, we developed an open access database, Tripath: A biological, genetic and genomic database of economically important wheat fungal pathogens - rust: smut: bunt. Data collected from peer-reviewed publications and fungal pathogens were added to the customizable database through an extended relational design. The strength of this resource is in providing rapid retrieval of information from large volumes of text at a high degree of accuracy. Database TRIPATH is freely accessible.


Scooped by Biswapriya Biswavas Misra

cddApp 1.1 – Integration between Cytoscape and the NCBI Conserved Domain Database

cddApp 1.1 :: DESCRIPTION cddApp is a Cytoscape3 extension that supports the annotation of protein networks with information about domains and specific functional sites (features) from the National Center for Biotech (cddApp 1.1 – Integration...

Scooped by Biswapriya Biswavas Misra

BambooGDB: a bamboo genome database with functional annotation and an analysis platform.

Abstract: Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of...

Scooped by Biswapriya Biswavas Misra

Tomato genomic resources database: an integrated repository of useful tomato genomic... - Abstract - Europe PubMed Central

Abstract: Tomato Genomic Resources Database (TGRD) allows interactive browsing of tomato genes, micro RNAs, simple sequence repeats (SSRs), important...

Scooped by Biswapriya Biswavas Misra

XSAnno: a framework for building ortholog models in cross-species transcriptome comparisons

Abstract
Background

The accurate characterization of RNA transcripts and expression levels across species is critical for understanding transcriptome evolution. As available RNA-seq data accumulate rapidly, there is a great demand for tools that build gene annotations for cross-species RNA-seq analysis. However, prevailing methods of ortholog annotation for RNA-seq analysis between closely-related species do not take inter-species variation in mappability into consideration.
Results

Here we present XSAnno, a computational framework that integrates previous approaches with multiple filters to improve the accuracy of inter-species transcriptome comparisons. The implementation of this approach in comparing RNA-seq data of human, chimpanzee, and rhesus macaque brain transcriptomes has reduced the false discovery of differentially expressed genes, while maintaining a low false negative rate.
Conclusion

The present study demonstrates the utility of the XSAnno pipeline in building ortholog annotations and improving the accuracy of cross-species transcriptome comparisons.