Scooped by Biswapriya Biswavas Misra
onto Databases & Softwares

cRacker 1.492 - Analysis tool for Proteomic Datasets (LC-MS/MS)


:: DESCRIPTION

cRacker has been programmed to automate standard normalization strategies for quantitative proteomic experiments. It simplifies the analysis of quantified peptide lists produced by various proteomic quantitation software tools. The applied normalization strategies are optimized for label-free analyses of peptide intensities, but experiments using labeling approaches can also be analyzed.
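
The description does not say which normalization strategies cRacker applies, so the following is only a minimal sketch of one common label-free approach: expressing each peptide intensity as a fraction of the total intensity measured in its sample. The function name, run names and peptide sequences are hypothetical and are not taken from cRacker.

```python
def normalize_to_sample_total(intensities):
    """Scale each peptide intensity by the summed intensity of its sample.

    `intensities` maps sample name -> {peptide: raw intensity}.
    This is an illustrative strategy, not cRacker's documented method.
    """
    normalized = {}
    for sample, peptides in intensities.items():
        total = sum(peptides.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has zero total intensity")
        normalized[sample] = {pep: value / total for pep, value in peptides.items()}
    return normalized


# Hypothetical runs: run_B was loaded with a quarter of the material of run_A.
raw = {
    "run_A": {"PEPTIDEK": 200.0, "SAMPLER": 600.0, "ANOTHERK": 200.0},
    "run_B": {"PEPTIDEK": 50.0, "SAMPLER": 150.0, "ANOTHERK": 50.0},
}
normalized = normalize_to_sample_total(raw)
print(normalized["run_A"]["SAMPLER"], normalized["run_B"]["SAMPLER"])
```

After scaling, the two runs report the same relative abundance for each peptide even though their raw intensities differ, which is the effect a loading normalization aims for.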

Databases & Softwares
Genomic, Proteomic, Transcriptomic, Metabolomic Softwares and Databases

dbSUPER: an integrated database of super-enhancers in mouse and human genome

Biswapriya Biswavas Misra's insight:

A super-enhancer is a newly proposed concept that refers to a cluster of enhancers that can drive cell-type-specific gene expression and is crucial to cell identity. Many disease-associated sequence variations are enriched in the super-enhancer regions of disease-relevant cell types. Thus, super-enhancers can be used as potential biomarkers for disease diagnosis and therapeutics. Current studies have identified super-enhancers for more than 100 cell types in human and mouse. However, no centralized resource that integrates all these findings is available yet. We developed dbSUPER (http://bioinfo.au.tsinghua.edu.cn/dbsuper/), the first integrated and interactive database of super-enhancers, with the primary goal of providing a resource for further study of transcriptional control of cell identity and disease by archiving computationally produced data. These data can easily be sent to the Galaxy, GREAT and Cistrome web servers for further downstream analysis. dbSUPER provides a responsive and user-friendly web interface to facilitate efficient and comprehensive searching and browsing. Its data can be downloaded and exported in a variety of formats and visualized in the UCSC Genome Browser, where custom tracks are added automatically. Further, dbSUPER lists genes associated with super-enhancers and links to various databases, including GeneCards, UniProt and Entrez. The database also provides an overlap analysis tool to check the overlap of user-defined regions with the current database. We believe dbSUPER is a valuable resource for the bioinformatics and genetics research community.
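
The overlap analysis mentioned above can be pictured as an interval comparison. The sketch below is not dbSUPER's implementation; it assumes regions are (chromosome, start, end) tuples in zero-based, half-open coordinates (the BED convention), and all coordinates shown are made up for illustration.

```python
def overlaps(query, region):
    """Return True if two (chrom, start, end) half-open intervals share a base."""
    q_chrom, q_start, q_end = query
    r_chrom, r_start, r_end = region
    return q_chrom == r_chrom and q_start < r_end and r_start < q_end


def find_overlaps(queries, database):
    """Map each query region to the database regions it overlaps."""
    return {q: [r for r in database if overlaps(q, r)] for q in queries}


# Hypothetical super-enhancer regions and user-defined query regions.
super_enhancers = [("chr1", 1000, 5000), ("chr1", 9000, 12000), ("chr2", 1000, 5000)]
user_regions = [("chr1", 4500, 9500), ("chr2", 6000, 7000)]

hits = find_overlaps(user_regions, super_enhancers)
for region, matched in hits.items():
    print(region, "->", matched)
```

A linear scan like this is fine for a handful of regions; for genome-scale inputs an interval tree or a sorted sweep would be the usual choice.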

 

DISSECT

DISSECT was designed to perform common genomic analyses on large supercomputers, allowing very large datasets to be analyzed. Its capabilities include analysis using mixed linear models, principal component analysis, and genome-wide association analysis (testing markers individually or together in large groups), among others. It is designed to be as easy to use as other common software tools such as PLINK or REACTA/GCTA. In addition, although it is built to run on supercomputers, it can also be used on a single computer without problems.

DoGSD: the dog and wolf genome SNP database

The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, the grey wolf, which has simultaneously broadened our understanding of domestication and of diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data easier to use and more readily available, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database that focuses on whole-genome SNP data from domesticated dogs and grey wolves. DoGSD is a web-based, open-access resource comprising ∼19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and SNPs collected from the three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were analyzed together using the population genetic statistic Fst. In addition, DoGSD integrates closely related information, including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog- and wolf-related studies.
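
The abstract names Fst but does not state which estimator DoGSD uses. As a rough illustration only, the sketch below computes the textbook heterozygosity-based form, Fst = (H_T - H_S) / H_T, for a single biallelic SNP in two populations, assuming equal population weights. The allele frequencies are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Heterozygosity-based Fst for one biallelic SNP and two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)                  # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation to measure
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies for one SNP in dogs and wolves.
fst_divergent = fst_two_populations(0.8, 0.2)
fst_identical = fst_two_populations(0.5, 0.5)
print(round(fst_divergent, 2), round(fst_identical, 2))
```

Values near 0 indicate that the two populations share similar allele frequencies at the SNP, while values approaching 1 indicate strong differentiation.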


SeqFindr 0.34.0 : Python Package Index

SeqFindr - easily create informative genomic feature plots. It’s a bioinfomagician’s tool to detect the presence or absence of genomic features, given a database describing these features and a set of draft and/or complete genomes. We work with bacterial genomes, and as such SeqFindr has only been tested with bacterial genomes.
Rescooped by Biswapriya Biswavas Misra from Plant-Microbe Symbioses

MTGD: The Medicago truncatula Genome Database

Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. The J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year and currently hosts the latest version of the genome (Mt4.0), associated data and legacy project information, presented to users via a rich set of open-source tools. A JBrowse-based genome browser interface exposes tracks for visualization. Mutant gene symbols originally assembled and curated by the Frugoli lab are now hosted at JCVI and tie into our community annotation interface, Medicago EuCAP (to be integrated soon with our implementation of WebApollo). Literature pertinent to M. truncatula is indexed and made searchable via the Textpresso search engine. The site also implements MedicMine, an instance of InterMine that offers interconnectivity with other plant ‘mines’ such as ThaleMine and PhytoMine, and with other model organism databases (MODs). In addition to these new features, we continue to provide keyword- and locus identifier-based searches served via a Chado-backed Tripal instance, a BLAST search interface and bulk downloads of data sets from the iPlant Data Store (iDS). Finally, we maintain an e-mail helpdesk, facilitated by a JIRA issue tracking system, where we receive and respond to questions about the website and requests for specific data sets from the community.

Via Jean-Michel Ané

Legacies of inspiring leadership: Prof. Maqs


Biswapriya Biswavas Misra's insight:

Professor Maqsudul Alam was an internationally renowned microbiologist trained at the famous Max Planck Institute, Germany. He specialised in genomic research, more commonly called “synthetic biology”. After much discussion at various levels, including with the then Malaysian Prime Minister Tun Abdullah Ahmad Badawi, Maqs (as he was fondly called) agreed to set up the country’s first Centre for Chemical Biology at USM (CCB@USM), dedicated to natural rubber genomic research, especially on Hevea brasiliensis.

Heading one out of seven of the world’s first APEX projects to ensure that USM continued to escalate intellectually, he led the charge to decode the genomic constituent of natural rubber within a period of no more than three years, working with a specially assembled talented team of graduate and post-doctoral students, as well as staff.

The mission saw a race between at least five countries worldwide, some major producers of natural rubber, others major producers of natural rubber products globally.

A new entrant, Malaysia faced a particularly challenging task, given its late start from scratch. Notwithstanding that, under Maqs’ no-nonsense stewardship and mentoring, not only were world-class facilities completed in record time, but the rubber genomic sequences of some two billion bases were also decoded by CCB@USM. All this happened in less than 20 months, putting Malaysia and USM on the world map of pioneering genomic research in natural rubber.

It was a highly instructive time for those who believed that Malaysia could race from behind to become a world leader in creating new knowledge, as called for in Challenge number six of Wawasan 2020, provided there is courage “to challenge the status quo”. That phrase was the mantra of the then Chief Secretary, Tan Sri Mohd Sidek Hassan, and it has rubbed off onto the USM culture of transforming the university, beginning with several of the APEX world’s first initiatives.

By October 2009, Malaysia had gained recognition as the first country to decode the genome of natural rubber. As a result, CCB@USM received invitations to assist a South-South collaborative effort in agrogenomic work on jute, fungi, dates and more from different parts of the developing world. The Bangladeshi government commissioned a national project, under the auspices of Prime Minister Sheikh Hasina, to enhance the use of jute through genomic research under Maqs’ leadership. The genome sequence of the Tosha jute plant was uncovered less than a year later, in June 2010.

Unfortunately, despite Maqs’ research contributions being a boon to Malaysia’s socio-economic well-being, especially the trillion ringgit natural rubber industry, including that of the smallholders (who are facing an uncertain future), the country is unable to further harness his unique talent and rare expertise.

Maqs passed away on Dec 20 at Queen’s Medical Center, Honolulu, Hawaii, within a day of the passing of Ani Arope. Both luminaries will be dearly missed and fondly remembered for their courageous leadership in challenging the status quo in the pursuit of truth and knowledge. May they rest in peace. Al-Fatihah.


CDvist: a webserver for identification and visualization of conserved domains in protein sequences

Biswapriya Biswavas Misra's insight:

Summary: Identification of domains in protein sequences allows them to be assigned to biological functions. Several webservers exist for identification of protein domains using similarity searches against various databases of protein domain models. However, none of them provides comprehensive domain coverage while allowing bulk querying, and their visualization schemes can be improved. To address these issues we developed CDvist (a comprehensive domain visualization tool), which combines the best available search algorithms and databases into a user-friendly framework. First, a given protein sequence is matched to domain models using high-specificity tools; only then are unmatched segments subjected to more sensitive algorithms, resulting in the best possible comprehensive coverage. Bulk querying and rich visualization and download options provide improved functionality for domain architecture analysis.
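
The two-stage strategy described above (a high-specificity search first, then a more sensitive search restricted to whatever remains unmatched) can be sketched as follows. This is not CDvist's code: `specific_search` and `sensitive_search` are hypothetical callables standing in for real search tools, each assumed to return (start, end, domain_name) tuples with 1-based inclusive coordinates, and `min_len` is an assumed cutoff for segments worth re-searching.

```python
def unmatched_segments(seq_len, matches, min_len=1):
    """Return 1-based inclusive (start, end) gaps not covered by any match."""
    gaps = []
    cursor = 1  # first position not yet covered
    for start, end in sorted(matches):
        if start - cursor >= min_len:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if seq_len - cursor + 1 >= min_len:
        gaps.append((cursor, seq_len))
    return gaps


def two_stage_search(sequence, specific_search, sensitive_search, min_len=30):
    """Run the specific search, then the sensitive search on uncovered segments only."""
    hits = list(specific_search(sequence))
    covered = [(start, end) for start, end, _ in hits]
    for start, end in unmatched_segments(len(sequence), covered, min_len):
        segment = sequence[start - 1:end]
        for s, e, name in sensitive_search(segment):
            # Shift segment-relative coordinates back onto the full sequence.
            hits.append((s + start - 1, e + start - 1, name))
    return sorted(hits)


# Stub searches for demonstration only.
def fake_specific(seq):
    return [(10, 40, "DomainA")]

def fake_sensitive(segment):
    return [(1, 5, "WeakHit")] if len(segment) >= 5 else []

result = two_stage_search("A" * 100, fake_specific, fake_sensitive, min_len=30)
print(result)
```

Restricting the second pass to uncovered segments keeps the slower, more sensitive search from overriding confident assignments made in the first pass.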


DGIdb 1.66 – Rails Frontend to The Genome Institute’s Drug Gene Interaction Database

:: DESCRIPTION

The DGIdb (Drug-Gene Interaction database) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development.

The eSNV-detect: a computational system to identify expressed single nucleotide variants from transcriptome sequencing data

Biswapriya Biswavas Misra's insight:

Rapid development of next-generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. A number of software pipelines are available for calling single nucleotide variants from genomic DNA, but there are no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system that uses data from multiple aligners to call variants, even at low read depths, and to rank them from RNA-Seq. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole-exome coding data showed 90.6–96.8% precision and 91.6–95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/.
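
For readers unfamiliar with how the precision and sensitivity figures above are obtained, the sketch below compares a set of called variants against a reference set. It is a generic illustration, not part of eSNV-Detect; the variant tuples are hypothetical. The last line simply restates the Sanger result from the abstract (29 of 31 candidates validated) as a fraction.

```python
def precision_and_sensitivity(called, truth):
    """Compare called variants with a reference set of (chrom, pos, alt) tuples."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the reference
    fp = len(called - truth)   # called but absent from the reference
    fn = len(truth - called)   # in the reference but missed
    precision = tp / (tp + fp) if called else 0.0
    sensitivity = tp / (tp + fn) if truth else 0.0
    return precision, sensitivity


# Hypothetical call set and reference set.
called = [("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G"), ("chr3", 10, "C")]
truth = [("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G"),
         ("chr2", 75, "C"), ("chr5", 5, "T")]

precision, sensitivity = precision_and_sensitivity(called, truth)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")

# Validation rate reported in the abstract: 29 of 31 candidates confirmed.
sanger_validation_rate = 29 / 31
print(f"Sanger validation rate: {sanger_validation_rate:.1%}")
```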


MOROKOSHI: Transcriptome Database in Sorghum bicolor

In transcriptome analysis, accurate annotation of each transcriptional unit and its expression profile is essential. A full-length cDNA (FL-cDNA) collection facilitates the refinement of transcriptional annotation and accurate transcription start sites help to unravel transcriptional regulation. We constructed a normalized FL-cDNA library from eight growth stages of aerial tissues in Sorghum bicolor and isolated 37,607 clones. These clones were Sanger sequenced from both the 5' and 3' ends and in total 38,981 high-quality expressed sequence tags (ESTs) were obtained. About one-third of the transcripts of known genes were captured as FL-cDNA clone resources. In addition to these we also annotated 272 novel genes, 323 antisense transcripts and 1672 candidate isoforms. These clones are available from the RIKEN Bioresource Center.

After obtaining accurate annotation of transcriptional units, we performed expression profile analysis. We carried out spikelet-, seed- and stem-specific RNA-Seq analysis and confirmed the expression of 70.6% of the newly identified genes. We also downloaded 23 sorghum RNA-Seq samples that are publicly available and these are shown on a genome browser together with our original FL-cDNA and RNA-Seq data. Using our original and publicly available data, we made an expression profile of each gene and identified the top 20 genes with the most similar expression. In addition, we visualized their relationships in gene co-expression networks.

Users can access and compare various transcriptome data from Sorghum bicolor at http://sorghum.riken.jp.
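
The abstract says the top 20 genes with the most similar expression were identified for each gene, but it does not name the similarity measure. The sketch below assumes Pearson correlation across samples, which is a common choice for co-expression analysis; the gene names and expression values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def top_similar(target, profiles, k=20):
    """Return the k genes whose profiles correlate best with the target gene."""
    scores = [
        (pearson(profiles[target], profile), gene)
        for gene, profile in profiles.items()
        if gene != target
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [(gene, r) for r, gene in scores[:k]]


# Hypothetical expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
neighbours = top_similar("geneA", profiles, k=2)
print(neighbours)
```

Connecting each gene to its top-k neighbours in this way yields the kind of graph that a co-expression network view would display.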

Developmental Self-Construction and -Configuration of Functional Neocortical Neuronal Networks

PLOS Computational Biology is an open-access
Biswapriya Biswavas Misra's insight:

The prenatal development of neural circuits must provide sufficient configuration to support at least a set of core postnatal behaviors. Although knowledge of various genetic and cellular aspects of development is accumulating rapidly, there is less systematic understanding of how these various processes play together in order to construct such functional networks. Here we make some steps toward such understanding by demonstrating through detailed simulations how a competitive co-operative (‘winner-take-all’, WTA) network architecture can arise by development from a single precursor cell. This precursor is granted a simplified gene regulatory network that directs cell mitosis, differentiation, migration, neurite outgrowth and synaptogenesis. Once initial axonal connection patterns are established, their synaptic weights undergo homeostatic unsupervised learning that is shaped by wave-like input patterns. We demonstrate how this autonomous genetically directed developmental sequence can give rise to self-calibrated WTA networks, and compare our simulation results with biological data.


ECOD: An Evolutionary Classification of Protein Domains

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

DGIdb - Mining the Druggable Genome

Search Interactions: search for drug-gene interactions by gene name.

The COG database: a tool for genome-scale analysis of protein functions and evolution.

Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56-83% of the gene products from each of the complete bacterial and archaeal genomes and approximately 35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program, which is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.
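
The "consistency of genome-specific best hits" criterion can be illustrated with a deliberately simplified sketch: look for triangles of proteins from three genomes in which every pair is a reciprocal best hit. The published COG construction is more elaborate than this, so the code below should be read as an assumption-laden illustration rather than the COG procedure. The genome labels, protein identifiers and the `best_hit` table are all hypothetical.

```python
from itertools import combinations


def reciprocal(best_hit, a, genome_a, b, genome_b):
    """True if a and b are each other's best hit in the other's genome."""
    return best_hit.get((a, genome_b)) == b and best_hit.get((b, genome_a)) == a


def find_triangles(proteins_by_genome, best_hit):
    """Find protein triples from three genomes linked by consistent best hits."""
    triangles = []
    for g1, g2, g3 in combinations(sorted(proteins_by_genome), 3):
        for a in proteins_by_genome[g1]:
            b = best_hit.get((a, g2))
            c = best_hit.get((a, g3))
            if b is None or c is None:
                continue
            if (reciprocal(best_hit, a, g1, b, g2)
                    and reciprocal(best_hit, a, g1, c, g3)
                    and reciprocal(best_hit, b, g2, c, g3)):
                triangles.append((a, b, c))
    return triangles


# Hypothetical data: best_hit[(protein, target_genome)] = best-scoring protein there.
proteins_by_genome = {"Eco": ["eco1", "eco2"], "Bsu": ["bsu1"], "Sce": ["sce1"]}
best_hit = {
    ("eco1", "Bsu"): "bsu1", ("eco1", "Sce"): "sce1",
    ("bsu1", "Eco"): "eco1", ("bsu1", "Sce"): "sce1",
    ("sce1", "Eco"): "eco1", ("sce1", "Bsu"): "bsu1",
    ("eco2", "Bsu"): "bsu1", ("eco2", "Sce"): "sce1",  # not reciprocated
}
triangles = find_triangles(proteins_by_genome, best_hit)
print(triangles)
```

In this toy input, eco2 is left out because its best hits are not returned, which is the kind of inconsistency the criterion is meant to filter.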

South Green

South Green is a bioinformatics platform applied to the genomic resource analysis of southern and Mediterranean plants.

DoCM - Database of Curated Mutations

DoCM, the Database of Curated Mutations, is a highly curated database of known, disease-causing mutations that provides easily explorable variant lists with direct links to source citations for easy verification.

MP3: A Software Tool for the Prediction of Pathogenic Proteins in Genomic and Metagenomic Data

Biswapriya Biswavas Misra's insight:

The identification of virulent proteins in any de novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenomes of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both tasks is the identification of virulent proteins, since a significant proportion of genomic and metagenomic proteins are novel and as yet unannotated. The currently available tools for identifying virulent proteins provide limited accuracy and cannot be used on large datasets. Therefore, we have developed MP3, a standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 is developed using an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out fast, sensitive and accurate prediction of pathogenic proteins. It displayed sensitivity, specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on a blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51–100 amino acids and Blind B: 30–50 amino acids), it displayed sensitivity, specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets. To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100–150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php.
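
The four figures quoted above (sensitivity, specificity, MCC and accuracy) all derive from a confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical: the abstract does not give the composition of the blind dataset, so a balanced set of 100 pathogenic and 100 non-pathogenic proteins is assumed purely to show that the four reported values for complete proteins are mutually consistent under that assumption.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and Matthews correlation coefficient."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed balanced blind set (not stated in the abstract): 100 positives, 100 negatives.
metrics = classification_metrics(tp=92, fn=8, tn=100, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

MCC is reported alongside accuracy because it stays informative when the two classes are of unequal size, which is typical for metagenomic data.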

  
Scooped by Biswapriya Biswavas Misra

Metabolomics Society Webinar on Thursday 29 January 2015 (7:30 AM - 8:30 AM EST) by Dr. Oscar Yanes

Biswapriya Biswavas Misra's insight:
Dear Metabolomics Community,

The Early-career Members Network (EMN), on behalf of the Metabolomics Society, is planning to establish a series of online webinars from January 2015 onwards.

We would like to formally invite you to the first session of our series, coming to you live on Thursday 29 January 2015 (7:30 AM - 8:30 AM EST). Session 1 of the EMN webinar series will feature our expert speaker Dr. Oscar Yanes (http://www.yaneslab.com), who will provide a cutting-edge 20-minute presentation regarding the complex and multidisciplinary nature of metabolomics. The experiences and research conducted in Dr. Yanes' laboratory will provide an invaluable insight into the challenges faced in modern metabolomics practice. In addition, there will be an opportunity to pose key questions to Dr. Yanes at the end of the session.

Please register using the following link: https://attendee.gotowebinar.com/register/2409752719256762369

The first webinar is freely available for everyone courtesy of the Metabolomics Society and will be uploaded to the society's website. All subsequent sessions from our series will be available for members of the Metabolomics Society only, with the opportunity to revisit live recorded sessions at your own convenience.

We look forward to having you join us!

Sincerely,
The EMN
Scooped by Biswapriya Biswavas Misra

CBFA: phenotype prediction integrating metabolic models with constraints derived from experimental data

Biswapriya Biswavas Misra's insight:
Background

Flux analysis methods lie at the core of Metabolic Engineering (ME), providing methods for phenotype simulation that allow the determination of flux distributions under different conditions. Although many constraint-based modeling software tools have been developed and published, none provides a free user-friendly application that makes available the full portfolio of flux analysis methods.

Results

This work presents Constraint-based Flux Analysis (CBFA), an open-source software application for flux analysis in metabolic models that implements several methods for phenotype prediction, allowing users to define constraints associated with measured fluxes and/or flux ratios, together with environmental conditions (e.g. media) and reaction/gene knockouts. CBFA identifies the set of applicable methods based on the constraints defined from user inputs, encompassing algebraic and constraint-based simulation methods. The integration of CBFA within the OptFlux framework for ME enables the utilization of different model formats and standards and the integration with complementary methods for phenotype simulation and visualization of results.

Conclusions

A general-purpose and flexible application is proposed that is independent of the origin of the constraints defined for a given simulation. The aim is to provide a simple to use software tool focused on the application of several flux prediction methods.

Scooped by Biswapriya Biswavas Misra

solGS: a web-based tool for genomic selection

Biswapriya Biswavas Misra's insight:
Background

Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and its statistical complexity, however, pose a serious challenge in data management, analysis, and sharing. A bioinformatics infrastructure for data storage and access, and a user-friendly web-based tool for analysis and sharing of output, are needed to make GS more practical for breeders.

Results

We have developed a web-based tool, called solGS, for predicting genomic estimated breeding values (GEBVs) of individuals using a Ridge-Regression Best Linear Unbiased Predictor (RR-BLUP) model. It has an intuitive web interface for selecting a training population for modeling and for estimating the GEBVs of selection candidates. It also estimates the phenotypic correlation and heritability of traits and the selection indices of individuals. Raw data are stored in a generic database schema, Chado Natural Diversity, co-developed by multiple database groups. Analysis output is graphically visualized and can be interactively explored online or downloaded in text format. An instance of its implementation can be accessed at the NEXTGEN Cassava breeding database, http://cassavabase.org/solgs.

Conclusions

solGS enables breeders to store raw data and estimate GEBVs of individuals online, in an intuitive and interactive workflow. It can be adapted to any breeding program.
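To make the RR-BLUP idea concrete, here is a minimal NumPy sketch of ridge-regression marker effects and the resulting GEBVs. It is an illustration under stated assumptions, not the solGS implementation: the shrinkage parameter `lam` is fixed by hand (a full RR-BLUP analysis would typically estimate it from variance components), the marker matrix and phenotypes are simulated, and the names `fit_rr_blup` and `predict_gebv` are hypothetical.

```python
import numpy as np

def fit_rr_blup(Z, y, lam):
    """Ridge estimate of marker effects: solve (Zc'Zc + lam*I) u = Zc'(y - mean(y))."""
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                      # centre each marker column
    mu = y.mean()                           # overall mean as the only fixed effect
    n_markers = Z.shape[1]
    lhs = Zc.T @ Zc + lam * np.eye(n_markers)
    u_hat = np.linalg.solve(lhs, Zc.T @ (y - mu))
    return mu, col_means, u_hat

def predict_gebv(Z_new, col_means, u_hat):
    """GEBV of each candidate: centred marker scores times estimated effects."""
    return (Z_new - col_means) @ u_hat

# Simulated data (assumption): 60 training lines, 20 candidates, 200 markers coded -1/0/1.
rng = np.random.default_rng(0)
Z_train = rng.integers(-1, 2, size=(60, 200)).astype(float)
Z_cand = rng.integers(-1, 2, size=(20, 200)).astype(float)
u_true = rng.normal(scale=0.1, size=200)
y_train = 5.0 + Z_train @ u_true + rng.normal(scale=0.5, size=60)

mu, col_means, u_hat = fit_rr_blup(Z_train, y_train, lam=50.0)
gebv = predict_gebv(Z_cand, col_means, u_hat)
print(gebv[:5].round(3))
```

The training population supplies both phenotypes and genotypes, while selection candidates need genotypes only. Larger values of `lam` shrink all marker effects more strongly towards zero.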

Scooped by Biswapriya Biswavas Misra

Statistical significance of variables driving systematic variation in high-dimensional data

Biswapriya Biswavas Misra's insight:

Motivation: There are a number of well-established methods such as principal component analysis (PCA) for automatically capturing systematic variation due to latent variables in large-scale genomic data. PCA and related methods may directly provide a quantitative characterization of a complex biological variable that is otherwise difficult to precisely define or model. An unsolved problem in this context is how to systematically identify the genomic variables that are drivers of systematic variation captured by PCA. Principal components (PCs) (and other estimates of systematic variation) are directly constructed from the genomic variables themselves, making measures of statistical significance artificially inflated when using conventional methods due to over-fitting.

Results: We introduce a new approach called the jackstraw that allows one to accurately identify genomic variables that are statistically significantly associated with any subset or linear combination of PCs. The proposed method can greatly simplify complex significance testing problems encountered in genomics and can be used to identify the genomic variables significantly associated with latent variables. Using simulation, we demonstrate that our method attains accurate measures of statistical significance over a range of relevant scenarios. We consider yeast cell-cycle gene expression data, and show that the proposed method can be used to straightforwardly identify genes that are cell-cycle regulated with an accurate measure of statistical significance. We also analyze gene expression data from post-trauma patients, allowing the gene expression data to provide a molecularly driven phenotype. Using our method, we find a greater enrichment for inflammatory-related gene sets compared to the original analysis that uses a clinically defined, although likely imprecise, phenotype. The proposed method provides a useful bridge between large-scale quantifications of systematic variation and gene-level significance analyses.

Availability and implementation: An R software package, called jackstraw, is available in CRAN.
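The permutation idea behind the jackstraw can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the interface of the R package. It tests association with the first PC only and uses NumPy's SVD for the PCA. The names `top_pc`, `f_stat` and `jackstraw_pvalues`, and the defaults `s=10` and `B=100`, are hypothetical choices for this example. In each round a small number `s` of variables is permuted, the PC is re-estimated from the modified data, and the statistics of the permuted variables form the null distribution.

```python
import numpy as np

def top_pc(Y):
    """First principal component over samples of a (variables x samples) matrix."""
    Yc = Y - Y.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(Yc, full_matrices=False)
    return vt[0]

def f_stat(rows, pc):
    """F-statistic of the simple linear regression of each row on the PC."""
    rc = rows - rows.mean(axis=1, keepdims=True)
    pcc = pc - pc.mean()
    r2 = (rc @ pcc) ** 2 / ((rc ** 2).sum(axis=1) * (pcc ** 2).sum())
    return (rows.shape[1] - 2) * r2 / (1.0 - r2)

def jackstraw_pvalues(Y, s=10, B=100, seed=0):
    """Empirical p-values for association of each variable with the first PC."""
    rng = np.random.default_rng(seed)
    m = Y.shape[0]
    observed = f_stat(Y, top_pc(Y))
    null = []
    for _ in range(B):
        idx = rng.choice(m, size=s, replace=False)
        Y_perm = Y.copy()
        for i in idx:
            Y_perm[i] = rng.permutation(Y[i])   # break any link to the latent variable
        # Re-estimate the PC so the null statistics carry the same over-fitting
        null.append(f_stat(Y_perm[idx], top_pc(Y_perm)))
    null = np.sort(np.concatenate(null))
    # Number of null statistics at least as large as each observed statistic
    n_ge = null.size - np.searchsorted(null, observed, side="left")
    return (n_ge + 1) / (null.size + 1)

# Simulated data (assumption): 300 variables, 30 samples, first 30 variables driven
# by one latent variable.
rng = np.random.default_rng(1)
latent = rng.normal(size=30)
Y = rng.normal(size=(300, 30))
Y[:30] += 3.0 * latent
p = jackstraw_pvalues(Y)
print(p[:30].max(), p[30:].mean())
```

Because the PC is recomputed with the permuted rows in place, the null statistics are inflated by the same circularity that affects the observed ones. That shared inflation is what the abstract refers to when it says conventional methods over-fit.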

Scooped by Biswapriya Biswavas Misra

goldilocks 0.0.52 : Python Package Index

Biswapriya Biswavas Misra's insight:

Goldilocks is a Python package providing functionality for locating 'interesting' genomic regions for some definition of 'interesting'. You can import it to your scripts, pass it sequence data and search for subsequences that match some criteria across one or more samples.

Goldilocks was developed to support our work in the investigation of quality control for genetic sequencing. It was used to quickly locate regions on the human genome that expressed a desired level of variability, which were "just right" for later variant calling and comparison.

The package has since been made more flexible and can be used to find regions of interest based on other criteria, such as GC-content, density of target k-mers, defined confidence metrics and missing nucleotides.
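The kind of search described above can be illustrated with a plain-Python sliding window that scores regions by GC-content and ranks them by closeness to a target value. This sketch deliberately does not use the Goldilocks API; the function names `gc_content` and `rank_windows`, the toy sequence and the target of 0.7 are all assumptions made for the example.

```python
def gc_content(seq):
    """Fraction of G/C bases in a sequence (N and other symbols count as non-GC)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def rank_windows(seq, length, stride, target):
    """Score windows of `length` bases, stepping by `stride`; closest to `target` first."""
    windows = []
    for start in range(0, len(seq) - length + 1, stride):
        score = gc_content(seq[start:start + length])
        windows.append((start, start + length, score))
    # Stable sort: windows with equal distance keep their left-to-right order
    return sorted(windows, key=lambda w: abs(w[2] - target))

sequence = "ATATATGCGCGCGCATATATATGCATAT"   # toy sequence (assumption)
best = rank_windows(sequence, length=8, stride=4, target=0.7)[0]
print(best)  # prints: (4, 12, 0.75)
```

Swapping `gc_content` for another scoring function (for example a k-mer count or a count of missing nucleotides) gives the other criteria mentioned in the description.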

Scooped by Biswapriya Biswavas Misra

SteatoNet: The First Integrated Human Metabolic Model with Multi-layered Regulation to Investigate Liver-Associated Pathologies

PLOS Computational Biology is an open-access
Biswapriya Biswavas Misra's insight:

Current state-of-the-art mathematical models for investigating complex biological processes, in particular liver-associated pathologies, have limited expansiveness, flexibility and representation of integrated regulation, and rely on the availability of detailed kinetic data. We generated SteatoNet, a multi-pathway, multi-tissue model and in silico platform to investigate hepatic metabolism and its associated deregulations. SteatoNet is based on object-oriented modelling, an approach most commonly applied in the automotive and process industries, whereby individual objects correspond to functional entities. Objects were compiled to feature two novel hepatic modelling aspects: the interaction of hepatic metabolic pathways with extra-hepatic tissues and the inclusion of transcriptional and post-transcriptional regulation. SteatoNet identification at normalised steady state circumvents the need for constraining kinetic parameters. Validation and identification of flux disturbances that have been proven experimentally in liver patients and animal models highlight the ability of SteatoNet to effectively describe biological behaviour. SteatoNet identifies crucial pathway branches (transport of glucose, lipids and ketone bodies) where changes in flux distribution drive the healthy liver towards hepatic steatosis, the primary stage of non-alcoholic fatty liver disease. Cholesterol metabolism and its transcription regulators are highlighted as novel steatosis factors. SteatoNet thus serves as an intuitive in silico platform to identify systemic changes associated with complex hepatic metabolic disorders.

Scooped by Biswapriya Biswavas Misra

A Systematic Computational Analysis of Biosynthetic Gene Cluster Evolution: Lessons for Engineering Biosynthesis

Biswapriya Biswavas Misra's insight:

Bacterial secondary metabolites are widely used as antibiotics, anticancer drugs, insecticides and food additives. Attempts to engineer their biosynthetic gene clusters (BGCs) to produce unnatural metabolites with improved properties are often frustrated by the unpredictability and complexity of the enzymes that synthesize these molecules, suggesting that genetic changes within BGCs are limited by specific constraints. Here, by performing a systematic computational analysis of BGC evolution, we derive evidence for three findings that shed light on the ways in which, despite these constraints, nature successfully invents new molecules: 1) BGCs for complex molecules often evolve through the successive merger of smaller sub-clusters, which function as independent evolutionary entities. 2) An important subset of polyketide synthases and nonribosomal peptide synthetases evolve by concerted evolution, which generates sets of sequence-homogenized domains that may hold promise for engineering efforts since they exhibit a high degree of functional interoperability. 3) Individual BGC families evolve in distinct ways, suggesting that design strategies should take into account family-specific functional constraints. These findings suggest novel strategies for using synthetic biology to rationally engineer biosynthetic pathways.

Ken D'Amato's curator insight, December 17, 2014 8:57 PM

This fit well with the keynote at AU!