Databases & Softwares
Genomic, Proteomic, Transcriptomic, Metabolomic Softwares and Databases
Scooped by Biswapriya Biswavas Misra

Islander: a database of precisely mapped genomic islands in tRNA and tmRNA genes

Biswapriya Biswavas Misra's insight:

Genomic islands are mobile DNAs that are major agents of bacterial and archaeal evolution. Integration into prokaryotic chromosomes usually occurs site-specifically at tRNA or tmRNA gene (together, tDNA) targets, catalyzed by tyrosine integrases. This splits the target gene, yet sequences within the island restore the disrupted gene; the regenerated target and its displaced fragment precisely mark the endpoints of the island. We applied this principle to search for islands in genomic DNA sequences. Our algorithm identifies tDNAs, finds fragments of those tDNAs in the same replicon and removes unlikely candidate islands through a series of filters. A search for islands in 2168 whole prokaryotic genomes produced 3919 candidates. The website Islander (recently moved to http://bioinformatics.sandia.gov/islander/) presents these precisely mapped candidate islands, the gene content and the island sequence. The algorithm further insists that each island encode an integrase, and attachment site sequence identity is carefully noted; therefore, the database also serves in the study of integrase site-specificity and its evolution.
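The endpoint principle described above can be illustrated with a toy sketch. It assumes the tDNA coordinates are already known and looks only for exact, same-strand copies of the tDNA's 3' fragment downstream in the same replicon. The function name, the fragment length and the size cutoff are hypothetical; the abstract does not describe how Islander searches for fragments or which filters it applies, and the integrase requirement is not modeled here.

```python
def candidate_islands(replicon, tdna_end, frag_len=8, max_island=200_000):
    """Return (start, end) spans running from the end of an intact tDNA to the
    end of a displaced copy of that tDNA's 3' fragment (same strand only)."""
    # The last frag_len bases of the tDNA stand in for the displaced fragment.
    fragment = replicon[tdna_end - frag_len:tdna_end]
    spans = []
    pos = replicon.find(fragment, tdna_end)
    while pos != -1:
        island_end = pos + frag_len
        # Hypothetical size filter: discard implausibly long candidates.
        if island_end - tdna_end <= max_island:
            spans.append((tdna_end, island_end))
        pos = replicon.find(fragment, pos + 1)
    return spans


# Made-up 20 bp stand-in for a tDNA, followed by filler and a fragment copy.
tdna = "GCGGATTTAGCTCAGTTGGG"
replicon = "A" * 10 + tdna + "C" * 50 + tdna[-8:] + "T" * 10
print(candidate_islands(replicon, tdna_end=30))
```

A realistic search would also need to tolerate mismatches, scan both strands, and check the gene content of each candidate span.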

Rescooped by Biswapriya Biswavas Misra from Plant Genomics

Updates in Metabolomics Tools and Resources: 2014–2015 - Misra - ELECTROPHORESIS


Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platform (mass spectrometry [MS] or nuclear magnetic resonance spectroscopy [NMR]-based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex data sets, which create the need for more and better processing and analysis software and in-silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available, open-source tools, algorithms and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS- and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialized tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

Via Biswapriya B Misra, Biswapriya Biswavas Misra
Biswapriya B Misra's curator insight, October 25, 2015 1:40 PM
Keywords:
Annotation, Databases, Data analysis, Data processing, Data visualization, Mass spectrometry, Metabolites, Metabolomics, NMR, Statistics, Software tools


Scooped by Biswapriya Biswavas Misra

CircNet: a database of circular RNAs derived from transcriptome sequencing data.

Nucleic Acids Res. 2015 Oct 7. pii: gkv940. [Epub ahead of print]
Biswapriya Biswavas Misra's insight:

Circular RNAs (circRNAs) represent a new type of regulatory noncoding RNA that has only recently been identified and cataloged. Emerging evidence indicates that circRNAs exert a new layer of post-transcriptional regulation of gene expression. In this study, we utilized transcriptome sequencing datasets to systematically identify the expression of circRNAs (both known ones and those newly identified by our pipeline) in 464 RNA-seq samples, and then constructed the CircNet database (http://circnet.mbc.nctu.edu.tw/), which provides the following resources: (i) novel circRNAs, (ii) integrated miRNA-target networks, (iii) expression profiles of circRNA isoforms, (iv) genomic annotations of circRNA isoforms (e.g. 282 948 exon positions), and (v) sequences of circRNA isoforms. The CircNet database is, to our knowledge, the first public database that provides tissue-specific circRNA expression profiles and circRNA-miRNA-gene regulatory networks. It not only extends the most up-to-date catalog of circRNAs but also provides a thorough expression analysis of both previously reported and novel circRNAs. Furthermore, it generates an integrated regulatory network that illustrates the regulation between circRNAs, miRNAs and genes.

Scooped by Biswapriya Biswavas Misra

FlavonQ: An Automated Data Processing Tool for Profiling Flavone and Flavonol Glycosides with Ultra-High-Performance Liquid Chromatography–Diode Array Detection–High Resolution Accurate Mass–Mass S...

Profiling flavonoids in natural products poses a great challenge due to the diversity of flavonoids, the lack of commercially available standards, and the complexity of plant matrices. The increasingly popular use of ultra-high-performance liquid chromatography–diode array detection–high resolution accurate mass–mass spectrometry (UHPLC-HRAM-MS) for the analysis of flavonoids has provided more definitive information but also vastly increased amounts of data. Thus, mining of the UHPLC-HRAM-MS data is a daunting, labor-intensive, and expertise-dependent process. An automated data processing tool, FlavonQ, was developed that can transfer field-acquired expertise into data analysis and facilitate flavonoid research. FlavonQ is an “expert system” designed for automated data analysis of flavone and flavonol glycosides, two important subclasses of flavonoids. FlavonQ is capable of data format conversion, peak detection, flavone and flavonol glycoside peak extraction, flavone and flavonol glycoside identification, and production of quantitative results. The expert system was applied to the determination of flavone and flavonol glycosides in nine different plants with an average execution time of less than 1 min. The results obtained by FlavonQ were in good agreement with those determined conventionally by a flavonoid expert.
Scooped by Biswapriya Biswavas Misra

Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways

Escher is a web application for visualizing data on biological pathways. Three key features make Escher a uniquely effective tool for pathway visualization. First, users can rapidly design new pathway maps. Escher provides pathway suggestions based on user data and genome-scale models, so users can draw pathways in a semi-automated way. Second, users can visualize data related to genes or proteins on the associated reactions and pathways, using rules that define which enzymes catalyze each reaction. Thus, users can identify trends in common genomic data types (e.g. RNA-Seq, proteomics, ChIP) in conjunction with metabolite- and reaction-oriented data types (e.g. metabolomics, fluxomics). Third, Escher harnesses the strengths of web technologies (SVG, D3, developer tools) so that visualizations can be rapidly adapted, extended, shared, and embedded. This paper provides examples of each of these features and explains how the development approach used for Escher can guide the development of future visualization tools.
Scooped by Biswapriya Biswavas Misra

Analysis of metabolites in single cells—what is the best micro-platform?

Biswapriya Biswavas Misra's insight:

This review covers innovations and developments in the field of single-cell level analysis of metabolites, involving the role of microfluidic and microarray platforms in manipulating and handling the cells prior to their detection. Microfluidic and microarray platforms have shown great promise. The latest developments demonstrate their potential to identify a particular cell or even an ensemble of cells (sharing a common property or phenotype) that co-exist in a much larger cell population. The reason for this is the capability of these platforms to perform several complex analytical processes, such as cleanup, sorting, derivatization, separation, and detection, with great robustness, speed, and reduced sample/reagent consumption. Here, we present several examples that illustrate the rapid strides that have been made toward the routine analysis of metabolites by coupling different microfluidic and microarray devices to a wide range of analytical detectors (e.g. fluorescence microscopy, electrochemical detection, and mass spectrometry). We also present selected examples detailing the use of microfluidics and microarrays in the visualization of the naturally occurring cell-to-cell heterogeneity in isogenic populations, in particular during the response to external cues. The ability to accurately monitor cell-to-cell heterogeneity based on different levels of key metabolites is of clinical relevance, since cell-to-cell heterogeneity can influence, for example, the outcome of a drug treatment.

Scooped by Biswapriya Biswavas Misra

RulNet: A Web-Oriented Platform for Regulatory Network Inference, Application to Wheat –Omics Data

With the increasing amount of –omics data available, a particular effort has to be made to provide suitable analysis tools. A major challenge is that of unraveling the molecular regulatory networks from massive and heterogeneous datasets. Here we describe RulNet, a web-oriented platform dedicated to the inference and analysis of regulatory networks from qualitative and quantitative –omics data by means of rule discovery. Queries for rule discovery can be written in an extended form of the RQL query language, which has a syntax similar to SQL. RulNet also offers users interactive features that progressively adjust and refine the inferred networks. In this paper, we present a functional characterization of RulNet and compare inferred networks with correlation-based approaches. The performance of RulNet has been evaluated using the three benchmark datasets used for the transcriptional network inference challenge DREAM5. Overall, RulNet performed as well as the best methods that participated in this challenge and it was shown to behave more consistently when compared across the three datasets. Finally, we assessed the suitability of RulNet to analyze experimental –omics data and to infer regulatory networks involved in the response to nitrogen and sulfur supply in wheat (Triticum aestivum L.) grains. The results highlight putative actors governing the response to nitrogen and sulfur supply in wheat grains. We evaluate the main characteristics and features of RulNet as an all-in-one solution for RN inference, visualization and editing. Using simple yet powerful RulNet queries allowed RNs involved in the adaptation of wheat grain to N and S supply to be discovered. We demonstrate the effectiveness and suitability of RulNet as a platform for the analysis of RNs involving different types of –omics data. The results are promising since they are consistent with what was previously established by the scientific community.
Scooped by Biswapriya Biswavas Misra

FuncTree: Functional Analysis and Visualization for Large-Scale Omics Data

Exponential growth of high-throughput data and the increasing complexity of omics information have made processing and interpreting biological data an extremely difficult and daunting task. Here we developed FuncTree (http://bioviz.tokyo/functree), a web-based application for analyzing and visualizing large-scale omics data, including but not limited to genomic, metagenomic, and transcriptomic data. FuncTree allows users to map their omics data onto the “Functional Tree map”, a predefin
Scooped by Biswapriya Biswavas Misra

NaviCell Web Service for network-based data visualization

Biswapriya Biswavas Misra's insight:

Data visualization is an essential element of biological research, required for obtaining insights and formulating new hypotheses on mechanisms of health and disease. NaviCell Web Service is a tool for network-based visualization of ‘omics’ data which implements several data visual representation methods and utilities for combining them. NaviCell Web Service uses Google Maps and semantic zooming to browse large biological network maps, represented in various formats, together with different types of molecular data mapped on top of them. To achieve this, the tool provides standard heatmaps, barplots and glyphs as well as the novel map staining technique for grasping large-scale trends in numerical values (such as the whole transcriptome) projected onto a pathway map. The web service provides a server mode, which allows automating visualization tasks and retrieving data from maps via RESTful (standard HTTP) calls. Bindings to different programming languages are provided (Python and R). We illustrate the purpose of the tool with several case studies using pathway maps created by different research groups, in which data visualization provides new insights into molecular mechanisms involved in systemic diseases such as cancer and neurodegenerative diseases.

Scooped by Biswapriya Biswavas Misra

The ReproGenomics Viewer: an integrative cross-species toolbox for the reproductive science community

Biswapriya Biswavas Misra's insight:

We report the development of the ReproGenomics Viewer (RGV), a multi- and cross-species working environment for the visualization, mining and comparison of published omics data sets for the reproductive science community. The system currently embeds 15 published data sets related to gametogenesis from nine model organisms. Data sets have been curated and conveniently organized into broad categories including biological topics, technologies, species and publications. RGV's modular design for both organisms and genomic tools enables users to upload and compare their data with that from the data sets embedded in the system in a cross-species manner. The RGV is freely available at http://rgv.genouest.org.

Scooped by Biswapriya Biswavas Misra

GRACOMICS: software for graphical comparison of multiple results with omics data

Biswapriya Biswavas Misra's insight:
Abstract
Background

Analysis of large-scale omics data has become more and more challenging due to high dimensionality. More complex analysis methods and tools are required to handle such data. While many methods already exist, they often produce different results. To help users obtain more appropriate results (i.e. candidate genes), we propose a tool, GRACOMICS, that compares numerous analysis results visually in a more systematic way; this enables users to interpret the results more easily.

Results

GRACOMICS has the ability to visualize multiple analysis results interactively. We developed GRACOMICS to provide instantaneous results (plots and tables) corresponding to user-defined threshold values, since no other up-to-date omics data visualization tool yet provides such features. In our analysis, we successfully employed two types of omics data: transcriptomic data (microarray and RNA-seq data) and genomic data (SNP chip and NGS data).

Conclusions

GRACOMICS is a graphical user interface (GUI)-based program written in Java for cross-platform computing environments, and can be applied to compare analysis results for any type of large-scale omics data. This tool can help biologists identify genes commonly found by intersecting the results of several statistical methods, for further experimental validation.
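The idea of intersecting results from several statistical methods at a user-defined threshold can be sketched in a few lines. This is an illustration only: GRACOMICS itself is a Java GUI program, and the function name, the method names, the gene names and the p-values below are all made up.

```python
def genes_common_to_methods(results, threshold=0.05):
    """Return the genes whose p-value is at or below the threshold in every method.

    results maps a method name to a {gene: p_value} dictionary.
    """
    significant = [
        {gene for gene, p in pvals.items() if p <= threshold}
        for pvals in results.values()
    ]
    return set.intersection(*significant) if significant else set()


# Hypothetical p-values from two hypothetical methods.
results = {
    "t_test": {"g1": 0.01, "g2": 0.20, "g3": 0.03},
    "wilcoxon": {"g1": 0.04, "g3": 0.06},
}
print(sorted(genes_common_to_methods(results, threshold=0.05)))
print(sorted(genes_common_to_methods(results, threshold=0.10)))
```

Raising the threshold from 0.05 to 0.10 brings a second gene into the intersection, which is the kind of change an interactive threshold control would make visible at once.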

Scooped by Biswapriya Biswavas Misra

CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002

Biswapriya Biswavas Misra's insight:

Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, to increase CyanOmics’s usefulness, Blast is included for sequence-based similarity searching, and Cluster 3.0 as well as the R hclust function are provided for cluster analyses. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further the understanding of the transcriptional patterns and proteomic profiles of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects.

Scooped by Biswapriya Biswavas Misra

Tools for visualization and analysis of molecular networks, pathways, and -omics data

Biological pathways have become the standard way to represent the coordinated reactions and actions of a series of molecules in a cell. A series of interconnected pathways is referred to as a biological network, which denotes a more holistic view of the entanglement of cellular reactions. Biological pathways and networks are not only an appropriate approach to visualizing molecular reactions; they have also become a leading method in -omics data analysis and visualization. Here, we review a set of pathway and network visualization and analysis methods and take a look at potential future developments in the field.
Scooped by Biswapriya Biswavas Misra

Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads

Liliana Florea
Biswapriya Biswavas Misra's insight:
Abstract
Background

Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing.

Findings

We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read.
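To make the contrast between a global and a local threshold concrete, here is a simplified sketch. It is not Rcorrector's implementation: it uses a plain counter instead of a De Bruijn graph, and the local rule (compare each k-mer's count with the best-supported k-mer sharing its first k-1 bases) and the `alpha` parameter are assumptions chosen for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def untrusted_kmer_starts(read, counts, k, alpha=0.5):
    """Return start positions of k-mers that fall below a local threshold.

    The threshold at each position is alpha times the highest count among the
    four k-mers that share the same (k-1)-base prefix, so it adapts to the
    local coverage instead of using one cutoff for the whole data set.
    """
    flagged = []
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        prefix = kmer[:-1]
        local_max = max(counts[prefix + base] for base in "ACGT")
        if counts[kmer] < alpha * local_max:
            flagged.append(i)
    return flagged


# Nine copies of a toy read plus one copy carrying a single substitution.
reads = ["ACGTAC"] * 9 + ["ACGTTC"]
counts = count_kmers(reads, k=4)
print(untrusted_kmer_starts("ACGTTC", counts, k=4))
```

Because the cutoff scales with nearby counts, a k-mer from a weakly expressed transcript is not flagged merely for being rare, which is the motivation the abstract gives for avoiding a global threshold on RNA-seq data.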

Conclusions

Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.

Keywords:

Next-generation sequencing; RNA-seq; Error correction; k-mers

 
Scooped by Biswapriya Biswavas Misra

Virtual pathway explorer (viPEr) and pathway enrichment analysis tool (PEANuT): creating and analyzing focus networks to identify cross-talk between molecules and pathways

Biswapriya Biswavas Misra's insight:
Abstract
Background

Interpreting large-scale studies from microarrays or next-generation sequencing for further experimental testing remains one of the major challenges in quantitative biology. Combining expression with physical or genetic interaction data has already been successfully applied to enhance knowledge from all types of high-throughput studies. Yet, toolboxes for navigating and understanding even small gene or protein networks are poorly developed.

Results

We introduce two Cytoscape plug-ins, which support the generation and interpretation of experiment-based interaction networks. The virtual pathway explorer viPEr creates so-called focus networks by joining a list of experimentally determined genes with the interactome of a specific organism. viPEr calculates all paths between two or more user-selected nodes, or explores the neighborhood of a single selected node. Numerical values from expression studies assigned to the nodes serve to score identified paths. The pathway enrichment analysis tool PEANuT annotates networks with pathway information from various sources and calculates enriched pathways between a focus and a background network. Using time series expression data of atorvastatin treated primary hepatocytes from six patients, we demonstrate the handling and applicability of viPEr and PEANuT. Based on our investigations using viPEr and PEANuT, we suggest a role of the FoxA1/A2/A3 transcriptional network in the cellular response to atorvastatin treatment. Moreover, we find an enrichment of metabolic and cancer pathways in the Fox transcriptional network and demonstrate a patient-specific reaction to the drug.
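The path search that viPEr is described as performing can be sketched as an enumeration of simple paths between two nodes, followed by a score computed from node values. This is a minimal illustration, not the plug-in's code: the length limit, the scoring rule (mean absolute value along the path), the function names and the node labels are all assumptions.

```python
def simple_paths(graph, source, target, max_edges):
    """Enumerate paths without repeated nodes from source to target.

    graph maps each node to a list of its neighbours; max_edges bounds the
    path length so the search stays tractable on a large interactome.
    """
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_edges:
            continue
        for neighbour in graph.get(node, ()):
            if neighbour not in path:
                stack.append((neighbour, path + [neighbour]))
    return paths


def score_path(path, values):
    """Hypothetical score: mean absolute expression change along the path."""
    return sum(abs(values.get(node, 0.0)) for node in path) / len(path)


# Toy undirected network with placeholder node names and made-up values.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
values = {"A": 1.0, "B": -2.0}
for path in sorted(simple_paths(graph, "A", "D", max_edges=2)):
    print(path, score_path(path, values))
```

Exploring the neighbourhood of a single selected node, the other mode mentioned above, would amount to collecting every node reachable within a chosen number of edges.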

Conclusions

The Cytoscape plug-in viPEr integrates –omics data with interactome data. It supports the interpretation and navigation of large-scale datasets by creating focus networks, facilitating mechanistic predictions from –omics studies. PEANuT provides an up-front method to identify underlying biological principles by calculating enriched pathways in focus networks.
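Calculating enriched pathways in a focus network relative to a background network is commonly framed as a hypergeometric test. The abstract does not say which statistic PEANuT uses, so the sketch below is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, pathway_in_background,
                       focus_size, pathway_in_focus):
    """Probability of seeing at least pathway_in_focus pathway members when
    focus_size nodes are drawn at random from the background network."""
    total = comb(background_size, focus_size)
    upper = min(pathway_in_background, focus_size)
    tail = sum(
        comb(pathway_in_background, k)
        * comb(background_size - pathway_in_background, focus_size - k)
        for k in range(pathway_in_focus, upper + 1)
    )
    return tail / total


# Toy numbers: a 10-node background with 3 pathway members,
# and a 3-node focus network made up entirely of pathway members.
print(enrichment_p_value(10, 3, 3, 3))
```

When many pathways are tested at once, the resulting p-values would normally need a multiple-testing correction before being interpreted.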

 
Scooped by Biswapriya Biswavas Misra

MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Biswapriya Biswavas Misra's insight:
Abstract
Background

Identification of marker genes associated with a specific tissue/cell type is a fundamental challenge in genetic and cell research. Marker genes are of great importance for determining cell identity, and for understanding tissue specific gene function and the molecular mechanisms underlying complex diseases.

Results

We have developed a new bioinformatics tool called MGFM (Marker Gene Finder in Microarray data) to predict marker genes from microarray gene expression data. Marker genes are identified through the grouping of samples of the same type with similar marker gene expression levels. We verified our approach using two microarray data sets from the NCBI’s Gene Expression Omnibus public repository encompassing samples for similar sets of five human tissues (brain, heart, kidney, liver, and lung). Comparison with another tool for tissue-specific gene identification, together with validation against established, literature-derived tissue markers, confirmed the functionality, accuracy and simplicity of our tool. Furthermore, top-ranked marker genes were experimentally validated by reverse transcriptase-polymerase chain reaction (RT-PCR). The sets of predicted marker genes associated with the five selected tissues comprised well-known genes of particular importance in these tissues. The tool is freely available from the Bioconductor web site, and it is also provided as an online application integrated into the CellFinder platform (http://cellfinder.org/analysis/marker).

Conclusions

MGFM is a useful tool to predict tissue/cell type marker genes using microarray gene expression data. The implementation of the tool as an R-package as well as an application within CellFinder facilitates its use.
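
One way to read "grouping of samples of the same type with similar marker gene expression levels" is sketched below. It is an illustration under stated assumptions, not the MGFM implementation: a gene is called a candidate marker when its top-ranked samples are exactly the replicates of one tissue, and the score (the ratio across the gap) is a hypothetical choice. Expression values are assumed to be positive.

```python
def marker_score(expr, tissues):
    """Return (tissue, score) if the highest-expressed samples are exactly the
    replicates of one tissue, otherwise None.

    score = highest value outside the tissue / lowest value inside it;
    a smaller score means a sharper separation (hypothetical scoring rule).
    """
    # Sample indices sorted by decreasing expression.
    order = sorted(range(len(expr)), key=lambda i: expr[i], reverse=True)
    top_tissue = tissues[order[0]]
    n_top = tissues.count(top_tissue)
    if n_top == len(tissues):
        return None  # only one tissue present, nothing to compare against
    head, tail = order[:n_top], order[n_top:]
    if any(tissues[i] != top_tissue for i in head):
        return None  # replicates of the top tissue are not grouped together
    return top_tissue, expr[tail[0]] / expr[head[-1]]


# Made-up log intensities for six samples (illustrative only).
tissues = ["liver", "liver", "brain", "brain", "heart", "heart"]
liver_like = [12.1, 11.8, 3.0, 3.2, 2.9, 3.1]
housekeeping = [8.0, 8.1, 7.9, 8.2, 8.0, 8.1]

result = marker_score(liver_like, tissues)   # candidate liver marker
no_marker = marker_score(housekeeping, tissues)  # no tissue stands out
```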

Scooped by Biswapriya Biswavas Misra

Genome sequencing reveals a new lineage associated with lablab bean and genetic exchange between Xanthomonas axonopodis pv. phaseoli and Xanthomonas fuscans subsp. fuscans

Biswapriya Biswavas Misra's insight:

Common bacterial blight is a devastating seed-borne disease of common beans that also occurs on other legume species including lablab and Lima beans. We sequenced and analyzed the genomes of 26 strains of Xanthomonas axonopodis pv. phaseoli and X. fuscans subsp. fuscans, the causative agents of this disease, collected over four decades and six continents. This revealed considerable genetic variation within both taxa, encompassing both single-nucleotide variants and differences in gene content, that could be exploited for tracking pathogen spread. The bacterial strain from Lima bean fell within the previously described Genetic Lineage 1, along with the pathovar type strain (NCPPB 3035). The strains from lablab represent a new, previously unknown genetic lineage closely related to strains of X. axonopodis pv. glycines. Finally, we identified more than 100 genes that appear to have been recently acquired by Xanthomonas axonopodis pv. phaseoli from X. fuscans subsp. fuscans.

Scooped by Biswapriya Biswavas Misra

SIMAT: GC-SIM-MS data analysis tool

Biswapriya Biswavas Misra's insight:
Abstract
Background

Gas chromatography coupled with mass spectrometry (GC-MS) is one of the technologies widely used for qualitative and quantitative analysis of small molecules. In particular, GC coupled to single quadrupole MS can be utilized for targeted analysis by selected ion monitoring (SIM). However, to our knowledge, there are no software tools specifically designed for analysis of GC-SIM-MS data. In this paper, we introduce a new R/Bioconductor package called SIMAT for quantitative analysis of the levels of targeted analytes. SIMAT provides guidance in choosing fragments for a list of targets. This is accomplished through an optimization algorithm that has the capability to select the most appropriate fragments from overlapping chromatographic peaks based on a pre-specified library of background analytes. The tool also allows visualization of the total ion chromatograms (TIC) of runs and extracted ion chromatograms (EIC) of analytes of interest. Moreover, retention index (RI) calibration can be performed and raw GC-SIM-MS data can be imported in netCDF or NIST mass spectral library (MSL) formats.

Results

We evaluated the performance of SIMAT using two GC-SIM-MS datasets obtained by targeted analysis of: (1) plasma samples from 86 patients in a targeted metabolomic experiment; and (2) mixtures of internal standards spiked in plasma samples at varying concentrations in a method development study. Our results demonstrate that SIMAT offers alternative solutions to AMDIS and MetaboliteDetector to achieve accurate detection of targets and estimation of their relative intensities by analysis of GC-SIM-MS data.

Conclusions

We introduce a new R package called SIMAT that allows the selection of the optimal set of fragments and retention time windows for target analytes in GC-SIM-MS based analysis. Also, various functions and algorithms are implemented in the tool to: (1) read and import raw data and spectral libraries; (2) perform GC-SIM-MS data preprocessing; and (3) plot and visualize EICs and TICs.
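
The retention index (RI) calibration mentioned above can be illustrated with the linear retention index commonly used for temperature-programmed GC, which interpolates between the retention times of bracketing n-alkane standards. Whether SIMAT uses exactly this formula is an assumption; the function name and the calibration values below are hypothetical.

```python
from bisect import bisect_right


def retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at retention time rt.

    alkane_rts maps the carbon number of each n-alkane standard to its
    retention time; retention times must increase with carbon number.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the calibrated range")
    # Index of the alkane eluting at or just before rt (clamped to the last interval).
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Made-up calibration: C10, C12 and C14 alkanes at 5, 8 and 11 minutes.
alkanes = {10: 5.0, 12: 8.0, 14: 11.0}
ri = retention_index(6.5, alkanes)  # halfway between C10 and C12
```

Because the index is expressed relative to the alkane ladder, it stays comparable across runs whose absolute retention times drift.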

Scooped by Biswapriya Biswavas Misra

DynaMet: A Fully Automated Pipeline for Dynamic LC–MS Data

Biswapriya Biswavas Misra's insight:

Dynamic isotope labeling data provide crucial information about the operation of metabolic pathways and are commonly generated via liquid chromatography–mass spectrometry (LC–MS). Metabolome-wide analysis is challenging as it requires grouping of metabolite features over different samples. We developed DynaMet for fully automated investigations of isotope labeling experiments from LC-high-resolution MS raw data. DynaMet enables untargeted extraction of metabolite labeling profiles and provides integrated tools for expressive data visualization. To validate DynaMet, we first used time-course labeling data of the model strain Bacillus methanolicus from 13C methanol, resulting in complex spectra in multicarbon compounds. Analysis of two biological replicates revealed high robustness and reproducibility of the pipeline. In total, DynaMet extracted 386 features showing dynamic labeling within 10 min. Of these features, 357 could be fitted by implemented kinetic models. Feature identification against the KEGG database resulted in 215 matches covering multiple pathways of core metabolism and major biosynthetic routes. Moreover, we performed a time-course labeling experiment with Escherichia coli on uniformly labeled 13C glucose, resulting in a comparable number of detected features with labeling profiles of high quality. The distinct labeling patterns of common central metabolites generated from both model bacteria can readily be explained by one- versus multicarbon compound metabolism. DynaMet is freely available as an extension package for the Python-based eMZed2, an open source framework built for rapid development of LC–MS data analysis workflows.
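
A labeling profile of the kind described here can be summarized as the mean 13C fraction of a metabolite at each time point and then compared with a kinetic curve. The sketch below is not DynaMet code: the function names, the first-order model, the fixed plateau and the grid search are assumptions for illustration, and natural-abundance correction is deliberately left out.

```python
import math


def labeled_fraction(intensities):
    """Mean 13C fraction; intensities[i] is the signal of isotopologue M+i
    for a metabolite with len(intensities) - 1 carbon atoms."""
    n_carbons = len(intensities) - 1
    total = sum(intensities)
    return sum(i * x for i, x in enumerate(intensities)) / (n_carbons * total)


def first_order(t, plateau, k):
    """Assumed kinetic model: label rises exponentially towards a plateau."""
    return plateau * (1.0 - math.exp(-k * t))


def fit_rate(times, fractions, plateau, k_grid):
    """Pick the rate constant from k_grid with the smallest squared error."""
    def sse(k):
        return sum((first_order(t, plateau, k) - f) ** 2
                   for t, f in zip(times, fractions))
    return min(k_grid, key=sse)


# Three-carbon metabolite, half unlabeled and half fully labeled (made-up values).
fraction = labeled_fraction([50.0, 0.0, 0.0, 50.0])

# Synthetic time course generated from the model itself, then refitted.
times = [0.0, 1.0, 2.0, 5.0, 10.0]
observed = [first_order(t, 0.9, 0.5) for t in times]
k_best = fit_rate(times, observed, plateau=0.9, k_grid=[i / 10 for i in range(1, 11)])
```

A real workflow would fit the plateau as well and use a proper optimizer; the grid search only keeps the sketch dependency-free.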

Scooped by Biswapriya Biswavas Misra

GraP: platform for functional genomics analysis of Gossypium raimondii

Biswapriya Biswavas Misra's insight:

Cotton (Gossypium spp.) is one of the most important natural fiber and oil crops worldwide. Improvement of fiber yield and quality under changing environments attracts much attention from cotton researchers; however, a functional analysis platform integrating omics data is still missing. The success of cotton genome sequencing and the large amount of available transcriptome data provide the opportunity to establish a comprehensive analysis platform for integrating these data and related information. A comprehensive database, Platform of Functional Genomics Analysis in Gossypium raimondii (GraP), was constructed to provide multi-dimensional analysis, integration and visualization tools. GraP includes updated functional annotation, gene family classifications, protein–protein interaction networks, co-expression networks and microRNA–target pairs. Moreover, gene set enrichment analysis and cis-element significance analysis tools are also provided for gene batch analysis of high-throughput data sets. Based on these effective services, GraP may offer further information for subsequent studies of functional genes and in-depth analysis of high-throughput data. GraP is publicly accessible at http://structuralbiology.cau.edu.cn/GraP/, with all data available for downloading.
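
The abstract does not say which statistic GraP uses for gene set enrichment. As a hedged illustration of the general idea, the sketch below computes a hypergeometric over-representation p-value, a common choice for batch analysis of gene lists; the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p(n_background, n_in_set, n_query, n_overlap):
    """Probability of seeing at least n_overlap gene-set members in a random
    query list of n_query genes drawn from n_background genes, of which
    n_in_set belong to the gene set (hypergeometric upper tail)."""
    upper = min(n_in_set, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 genes in the genome, 3 in the gene set,
# 3 genes in the query list, all 3 of them in the gene set.
p_value = enrichment_p(10, 3, 3, 3)
```

In practice the p-values for many gene sets would also need a multiple-testing correction, which is omitted here.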

Scooped by Biswapriya Biswavas Misra

GOplot: an R package for visually combining expression data with functional analysis

Biswapriya Biswavas Misra's insight:

Summary: Despite the plethora of methods available for the functional analysis of omics data, obtaining a comprehensive yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings.

Availability: The R package GOplot is available via CRAN-The Comprehensive R Archive Network: http://cran.r-project.org/web/packages/GOplot. The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/
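
Combining expression data with enrichment results usually means attaching a per-category summary of the expression changes to each enriched term. The sketch below shows one simple directional summary in Python; it is an assumption made for illustration, not a statement about GOplot's internals or its R interface.

```python
from math import sqrt


def category_z_score(log_fcs):
    """Directional summary of a category: (genes up - genes down) / sqrt(genes).

    log_fcs holds the log fold changes of the genes assigned to the category;
    the list must not be empty.
    """
    up = sum(1 for x in log_fcs if x > 0)
    down = sum(1 for x in log_fcs if x < 0)
    return (up - down) / sqrt(len(log_fcs))


# Made-up category with three up-regulated genes and one down-regulated gene.
z = category_z_score([1.2, 0.8, -0.5, 2.0])
```

A positive value suggests the category is mostly up-regulated, a negative value mostly down-regulated; it is a plotting aid rather than a significance test.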

Scooped by Biswapriya Biswavas Misra

PANDA: pathway and annotation explorer for visualizing and interpreting gene-centric data

Biswapriya Biswavas Misra's insight:

Objective. Bringing together genomics, transcriptomics, proteomics, and other -omics technologies is an important step towards developing highly personalized medicine. However, instrumentation has advanced far beyond expectations, and we are now able to generate data faster than it can be interpreted.

Materials and Methods. We have developed PANDA (Pathway AND Annotation) Explorer, a visualization tool that integrates gene-level annotation in the context of biological pathways to help interpret complex data from disparate sources. PANDA is a web-based application that displays data in the context of well-studied pathways like KEGG, BioCarta, and PharmGKB. PANDA represents data/annotations as icons in the graph while maintaining the other data elements (i.e., other columns for the table of annotations). Custom pathways from underrepresented diseases can be imported when existing data sources are inadequate. PANDA also allows sharing annotations among collaborators.

Results. In our first use case, we show how easy it is to view supplemental data from a manuscript in the context of a user’s own data. Another use-case is provided describing how PANDA was leveraged to design a treatment strategy from the somatic variants found in the tumor of a patient with metastatic sarcomatoid renal cell carcinoma.

Conclusion. PANDA facilitates the interpretation of gene-centric annotations by visually integrating this information with the context of biological pathways. The application can be downloaded or used directly from our website: http://bioinformaticstools.mayo.edu/research/panda-viewer/.

Scooped by Biswapriya Biswavas Misra

Recherche uO Research: Advancing Lipidomic Bioinformatics: Visualization and phosphoLipid IDentification (VaLID)

Biswapriya Biswavas Misra's insight:

Lipidomics is a relatively new field under the heading of systems biology. Due to its infancy, the field suffers from significant ‘growing pains’, one of which is the lack of bioinformatic analytic resources that other “-omics” fields enjoy. Here, I describe the creation and validation of the glycerophospholipid identification program VaLID. Using an in silico approach, we generated a comprehensive database containing all of the glycerophospholipids within multiple sub-classes: those containing chains of 0 to 30 carbons with up to 6 unsaturations and various linkages. Using Java, I created a web-based computer interface with a search engine and a visualization tool to access this database. In comparing results to current programs, I found that VaLID consistently contained more identity predictions than did the current gold standard LipidMAPS. Results from several tests with real datasets confirm that VaLID is more than capable as a phospholipid identification tool for use in lipidomics.
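
The in silico enumeration described above can be sketched for a single sub-class. The example is not VaLID code (VaLID is written in Java and covers several sub-classes and linkages): it handles only diacyl phosphatidylcholines with ester linkages, works with neutral monoisotopic masses rather than adduct m/z values, and the plausibility rule limiting double bonds on short chains is a hypothetical filter.

```python
# Monoisotopic atomic masses.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}
# Glycerophosphocholine backbone, C8H20NO6P, with both acyl positions free.
GPC = {"C": 8, "H": 20, "N": 1, "O": 6, "P": 1}


def acyl_delta(carbons, double_bonds):
    """Net atoms added when a fatty acid Cn H(2n-2d) O2 is esterified (minus H2O)."""
    if carbons == 0:
        return {}  # empty position, as in a lyso species
    return {"C": carbons, "H": 2 * carbons - 2 * double_bonds - 2, "O": 1}


def pc_mass(chain1, chain2):
    """Neutral monoisotopic mass of a PC with two (carbons, double_bonds) chains."""
    formula = dict(GPC)
    for chain in (chain1, chain2):
        for atom, count in acyl_delta(*chain).items():
            formula[atom] += count
    return sum(MASS[atom] * count for atom, count in formula.items())


def chains(max_carbons=30, max_double_bonds=6):
    """Enumerate chains of 0 to max_carbons carbons with up to max_double_bonds."""
    yield (0, 0)
    for n in range(1, max_carbons + 1):
        for d in range(max_double_bonds + 1):
            if 2 * d < n:  # hypothetical rule: no more double bonds than the chain can hold
                yield (n, d)


def candidates(target_mass, tol=0.01):
    """All chain pairs whose PC mass lies within tol of target_mass."""
    all_chains = list(chains())
    return [
        (c1, c2)
        for i, c1 in enumerate(all_chains)
        for c2 in all_chains[i:]
        if abs(pc_mass(c1, c2) - target_mass) <= tol
    ]


hits = candidates(759.5778)  # mass of PC(16:0/18:1), C42H82NO8P
```

The search returns several isobaric chain combinations for one mass, which is why an exhaustive database tends to list more identity predictions per query than a curated one.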

Scooped by Biswapriya Biswavas Misra

SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthro...

Biswapriya Biswavas Misra's insight:
Abstract
Background

Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria.

Descriptions

A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides a range of information, from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th.

Conclusions

SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.

Scooped by Biswapriya Biswavas Misra

CARMO: a comprehensive annotation platform for functional exploration of rice multi-omics data

Biswapriya Biswavas Misra's insight:

High-throughput technology is gradually becoming a powerful tool for routine research in rice. Interpretation of biological significance from the huge amount of data is a critical but non-trivial task, especially for rice, for which gene annotations rely heavily on sequence similarity rather than direct experimental evidence. Here we describe the annotation platform for comprehensive annotation of rice multi-omics data (CARMO), which provides multiple web-based analysis tools for in-depth data mining and visualization. The central idea involves systematic integration of 1819 samples from omics studies and diverse sources of functional evidence (15 401 terms), which are further organized into gene sets and higher-level gene modules. In this way, the high-throughput data may easily be compared across studies and platforms, and integration of multiple types of evidence allows biological interpretation from the level of gene functional modules with high confidence. In addition, the functions and pathways for thousands of genes lacking description or validation may be deduced based on concerted expression of genes within the constructed co-expression networks or gene modules. Overall, CARMO provides comprehensive annotations for transcriptomic datasets, epi-genomic modification sites, single nucleotide polymorphisms identified from genome re-sequencing, and the large gene lists derived from these omics studies. Well-organized results, as well as multiple tools for interactive visualization, are available through a user-friendly web interface. Finally, we illustrate how CARMO enables biological insights using four examples, demonstrating that CARMO is a highly useful resource for intensive data mining and hypothesis generation based on rice multi-omics data. CARMO is freely available online (http://bioinfo.sibs.ac.cn/carmo).
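
Deducing function from "concerted expression of genes within the constructed co-expression networks" rests on a similarity measure between expression profiles. The abstract does not name the measure CARMO uses; the sketch below assumes Pearson correlation with a fixed threshold, and the gene names and values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Gene pairs whose absolute correlation reaches the threshold."""
    genes = sorted(profiles)
    return [
        (g1, g2)
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold
    ]


# Made-up expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

An unannotated gene that ends up connected to well-characterized genes in such a network inherits a functional hypothesis from its neighbors, which is the "guilt by association" reasoning the abstract alludes to.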
