This genomic tRNA database contains tRNA gene predictions made by the program tRNAscan-SE (Lowe & Eddy, Nucl Acids Res 25: 955-964, 1997) on complete or nearly complete genomes. Unless otherwise noted, all annotation is automated, and has not been inspected for agreement with published literature.
Inevitably with automated sequence analysis, we find exceptions to general identification rules, isoacceptor type predictions (esp. due to variable post-transcriptional anticodon modification), and questionable tRNA identifications (due to pseudogenes, SINEs, or other tRNA-derived elements). We attempt to document all cases we come across, and welcome feedback (lowe@soe.ucsc.edu) on new or unrecognized discrepancies. For a more detailed description of information in tables and the tRNA search algorithm, see the [Legend].
The most recent Greengenes database and taxonomy updates are now found at greengenes.secondgenome.com. Taxonomic information on this site is deprecated and should be used with caution.
The greengenes web application provides access to the 2011 version of the greengenes 16S rRNA gene sequence alignment for browsing, blasting, probing, and downloading. The data and tools presented by greengenes can assist the researcher in choosing phylogenetically specific probes, interpreting microarray results, and aligning/annotating novel sequences. If you are an ARB user, you can use greengenes to keep your own local database current.
Similarity searching against the 2011 greengenes sequences is now possible using the new Simrank tool (developed by Niels Larsen).
eSLDB is a database of protein subcellular localization annotation for eukaryotic organisms. It contains experimental annotations derived from primary protein databases, homology based annotations and computational predictions.
GOBASE is a taxonomically broad organelle genome database that organizes and integrates diverse data related to mitochondria and chloroplasts. GOBASE is currently expanding to include information on representative bacteria that are thought to be specifically related to the bacterial ancestors of mitochondria and chloroplasts.
The current version of GOBASE, release 25, is based on GenBank release 175 and was released in June 2010. This release includes 177,000 new mitochondrial sequences and 41,000 new chloroplast sequences.
Unfortunately, funding for GOBASE has expired, so this will be the last update. Maintenance of GOBASE will cease at the end of August 2010 and the contact address firstname.lastname@example.org will also become defunct at this time.
Users are requested to cite GOBASE in any publication making use of GOBASE data. The preferred reference is:
GOBASE - an organelle genome database
O'Brien, Emmet A., Zhang, Yue, Wang, Eric, Marie, Veronique, Badejoko, Wole, Lang, B. Franz and Burger, Gertraud. Nucleic Acids Res. 37:D946-950 (2009)
GermSAGE is a collection of male germ cell transcriptome information derived from Serial Analysis of Gene Expression (SAGE). It covers three key germ cell stages in spermatogenesis: mouse type A spermatogonia (Spga), pachytene spermatocytes (Spcy), and round spermatids (Sptd). A total of 452,095 SAGE tags are represented across all the libraries, making it by far the most comprehensive resource available.
Using more than one approach to characterizing functions of unknown proteins, we now present in GenProtEC (http://genprotec.mbl.edu/) some level of function information for 87% of Escherichia coli K-12 proteins. A new approach that has yielded new information entails assigning content of structural domains and their functions to E. coli proteins. In addition, some earlier methods have been further refined to provide more meaningful data. The process of identifying and separating multimodular or fused proteins into component modules has been improved. As a result, groups of sequence-similar (paralogous) proteins have been refined. Experimental information from recent literature on previously unknown genes has been incorporated. We now use a rich system of characterizing cell roles which accents the fact that many proteins play more than one cellular role and therefore carry more than one designation from our detailed catalog of roles, MultiFun.
Gene/Protein Query Search by Gene Name, B-number, ECK Number, Swiss-Prot Accession Number and ID, Enzyme Nomenclature (E.C. number), Protein Name, Gene Type, or Physiological Role.
The Human Ageing Genomic Resources (HAGR) is a collection of databases and tools designed to help researchers study the genetics of human ageing using modern approaches such as functional genomics, network analyses, systems biology and evolutionary analyses.
The Horizontal Gene Transfer DataBase (HGT-DB) is a genomic database that includes statistical parameters such as G+C content, codon and amino-acid usage, as well as information about which genes deviate in these parameters for prokaryotic complete genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviated genes may have been acquired by horizontal gene transfer. The current version of the database contains 88 bacterial and archaeal complete genomes, including multiple chromosomes and strains. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage and amino-acid content. It also provides information about correspondence analyses of the codon usage, plus lists of extraneous groups of genes in terms of G+C content and lists of putatively acquired genes. With this information, researchers can explore the G+C content and codon usage of a gene when they find incongruities in sequence-based phylogenetic trees. A search engine that allows searches for gene names or keywords for a specific organism is also available. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT
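The idea of flagging genes whose G+C content deviates from the genome average can be illustrated with a minimal Python sketch. This is not the HGT-DB implementation: the function names, the use of the population standard deviation, and the default cutoff of 1.5 standard deviations are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_gc_outliers(genes: dict[str, str], z_cutoff: float = 1.5) -> list[str]:
    """Return the names of genes whose G+C content lies more than
    z_cutoff standard deviations from the mean over all genes.

    The cutoff is a hypothetical default, not a value taken from HGT-DB.
    """
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    values = list(gc.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All genes have identical G+C content, so nothing deviates.
        return []
    return [name for name, value in gc.items() if abs(value - mu) / sigma > z_cutoff]


if __name__ == "__main__":
    # Toy genome: nine genes at 50% G+C and one G+C-rich gene.
    toy = {f"gene{i}": "ATGCATGC" for i in range(9)}
    toy["candidate"] = "GGGGCCCC"
    print(flag_gc_outliers(toy))
```

A deviating G+C content alone is only a hint; as the description notes, the database combines it with codon usage and amino-acid content before listing a gene as putatively acquired.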
GenomeRNAi features 168 RNAi screens in human and 181 screens in Drosophila, as well as a new functionality: overlaying genes that share the same phenotype onto networks in StringDB. GenomeRNAi is a recommended data repository for Nature’s new “Scientific Data” publication format. We look forward to your RNAi data submissions! Europe PMC provides links to GenomeRNAi - check out the current list.
GreenPhylDB is a web resource designed for comparative and functional genomics in plants. The database contains a catalogue of gene families based on gene predictions of genomes, covering a broad taxonomy of green plants.
The results of our automatic clustering are manually annotated and analyzed by a phylogeny-based approach to predict homologous relationships. This supports evolutionary and functional studies to identify candidate genes affecting agronomic traits in crops.
Grimes GR, Moodie S, Beattie JS, Craigon M, Dickinson P, Forster T, Livingston AD, Mewissen M, Robertson KA, Ross AJ, Sing G and Ghazal P. (2005) GPX-Macrophage Expression Atlas: a database for expression profiles of macrophages challenged with a variety of pro-inflammatory, anti-inflammatory, benign and pathogen insults. BMC Genomics. Dec 12;6:178.
Macrophages: A resource for distribution of information about macrophage and osteoclast biology
BioLayout Express 3D: An application designed for the visualization, clustering and analysis of large network graphs in two and three dimensional space derived primarily, but not exclusively, from biological systems
InnateImmunity-SystemsBiology: A collaboration to assemble a compendium, or parts list, of the innate immune system with an initial focus on the response of macrophages to Toll-like Receptor (TLR) agonists
RAD is a contig-oriented database for high-quality manual annotation of RGP, which can present non-redundant contig analyses by merging the accumulated PAC/BAC clones.
As of October 2004, the database contains a total of 215 Mb of sequence with relevant annotation results (30,000 predicted genes). The database can provide the latest information on manual annotation as well as a comprehensive structural analysis of various features of the rice genome.
Note that the annotation data for chromosomes 1, 3, 4 and 10 were restored from flat files of the public database. These contain the predicted gene information but not the supporting evidence.
The GOLD.db (Genomics of Lipid-Associated Disorders Database) was developed to address the need for integrating disparate information on the function and properties of genes and their products that are particularly relevant to the biology, diagnosis, management, treatment, and prevention of lipid-associated disorders.
This is a relational database of information about hemoglobin variants and mutations that cause thalassemia. The initial data came from Syllabi authored by Prof. Titus H.J. Huisman, Mrs. Marianne F.H. Carver, Dr. Erol Baysal, and Prof. Georgi D. Efremov. This information was converted to a database, and now new entries are added and old entries are corrected by our curators, Dr. Henri Wajcman, Dr. George Patrinos, Dr. Kamran Moradkhani, Joseph Borg, and Philippe Joly. HbVar results from a collaboration among several investigators at Penn State University (USA), INSERM Creteil (France), and Boston University Medical Center (USA). Visit our query page or summary page to see the types of information available.
To query on the database, click here.
To access summaries of the categories of the mutations, click here. The summaries of mutation categories give counts of the results for common queries and buttons to link to them.
Welcome to the topoSNP database. This site produces an interactive visualization of disease and non-disease associated non-synonymous single nucleotide polymorphisms (nsSNPs) and displays geometric and relative entropy calculations.
*** Please note you will need MDL's Chime Plugin to view this page correctly. It can be freely downloaded at http://www.mdlchime.com.***
To start, please select one of the datasets to the right, or enter a protein sequence to BLAST search our database.
Genetics Home Reference (GHR) provides free access to consumer-friendly information on medical genetics to patients and their family members, health care professionals, and the public. The site offers summaries for a broad range of inherited conditions or disorders caused by gene alterations. Each condition summary includes the causes, available genetic testing, and links to summaries on related genes. Descriptions of genes include gene names and their synonyms, normal function, chromosome location, and an explanation of any disorder-causing mutations. Additionally, GHR provides a genetics glossary, help on understanding genetics, and links to many useful consumer, support, and research organizations. GHR is a service of the U.S. National Library of Medicine at the National Institutes of Health.
Also available for download are multiple alignments and orthology assignments used in Heger, A. and Ponting, C.P. (2007). Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes. Genome Res 17:1837-1849.
Mammals: Gene predictions in Opossum (Monodelphis domestica) and Platypus (Ornithorhynchus anatinus).
Codonbias: Supplementary material to: Heger & Ponting (2007) Variable strength of translational selection among 12 Drosophila species. Genetics. Nov;177(3):1337-48
Oryza sativa (ssp. japonica), as one of the most important food crops, was among the first to be sequenced, greatly facilitating genetic and physiological research in agriculture and plant biology. However, annotation of genes in the short-length range has proved inadequate in many plant genomes, especially for small secreted peptides, which are involved in diverse physiological processes, e.g. stress response, flowering, and hormone signaling. Studies have shown that the numbers of small secreted proteins are underestimated. Because rice (Oryza sativa ssp. japonica) is both an economic crop and a model plant, addressing this issue in rice is a top priority for us.
We made an effort to provide plant biologists with a comprehensive comparative platform: OrysPSSP. It provides data on small secreted proteins, 25-250 aa, on a genome scale, integrated with a variety of search tools, validation functions and comparative resources. The current official release (v0530) contains a whole set of 101,048 candidates. About two-thirds of them, 67,559, are located in un-annotated genome regions, while the rest, 33,489, are included in known genes. For each candidate, users are provided with chromosomal location, peptide sequence and domain(s), organelle location, gene annotation and neighboring genes. In validation against different data sets, 33,350 proteins were supported by tiling array data, 9,431 by RNAseq data, and 18,353 by mass spectrometry results. When comparing across the phylogeny of 25 green plants, we found that the number of conserved SSPs between rice and other plants was, in general, inversely proportional to their evolutionary distance.
On top of the curated data for small secreted proteins from rice, we developed a number of tools to help rice scientists and plant biologists obtain (sub)datasets that are relevant and valuable to their fields of study. Users can view the distribution of small secreted proteins on rice chromosomes, and browse the data by chromosome. Alternatively, they can search for small secreted protein genes and retrieve data by applying one or more filter parameters, e.g. gene keyword, domain name, chromosome location, or annotation status. A "BLAST" tool is also provided to find small secreted proteins that map to users' query sequences. Query sequences can be chosen from three different types: genomic sequence (DNA), mRNA sequence (mRNA), or protein sequence (Protein). In our testing releases, users found the validation tool supported by our database to be the most important function. Currently we offer three separate data types, tilingArray, transcriptomics and proteomics (all from publicly available data sources), to validate and filter small secreted protein candidates. A comparative genomics tool for a comprehensive analysis of the conservation of SSPs in 26 green plants was also built. We integrated the genome information from 25 plant species besides Oryza sativa ssp. japonica. Comparison across the phylogeny should yield insight into the occurrence and evolution of SSPs in green plants.
The highly expressed genes database (HEG-DB) is a genomic database that includes the prediction of which genes are highly expressed in prokaryotic complete genomes under strong translational selection. The current version of the database contains general features for almost 200 genomes under translational selection, including the correspondence analysis of the relative synonymous codon usage for all genes, and the analysis of their highly expressed genes. For each genome, the database contains functional and positional information about the predicted group of highly expressed genes. This information can also be accessed using a search engine. Among other statistical parameters, the database also provides the Codon Adaptation Index (CAI) for all of the genes using the codon usage of the highly expressed genes as a reference set. The 'Pathway Tools Omics Viewer' from the BioCyc database enables the metabolic capabilities of each genome to be explored, particularly those related to the group of highly expressed genes. The HEG-DB is freely available at http://genomes.urv.cat/HEG-DB.
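To make the Codon Adaptation Index concrete, the following Python sketch computes it in the commonly used form: each codon gets a relative adaptiveness weight (its count in the reference set divided by the count of the most frequent synonymous codon), and the CAI of a gene is the geometric mean of the weights of its codons. This is not the HEG-DB code. The function names, the pseudocount of 0.5 for codons absent from the reference set, and the choice to skip stop codons and single-codon amino acids (Met, Trp) are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete codons, ignoring a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes: list[str], pseudocount: float = 0.5) -> dict[str, float]:
    """Compute a weight per codon from a reference set of highly expressed genes.

    Stop codons and amino acids with a single codon are left out.
    The pseudocount for unseen codons is an assumed value.
    """
    counts = Counter(c for gene in reference_genes for c in codons(gene) if c in CODON_TABLE)
    synonymous = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        synonymous[aa].append(codon)
    weights = {}
    for aa, family in synonymous.items():
        if aa == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] or pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(gene: str, weights: dict[str, float]) -> float:
    """Return the geometric mean of the weights of the gene's scored codons."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no codons that can be scored")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Toy reference set in which GCT is the preferred alanine codon.
    w = relative_adaptiveness(["GCTGCTGCTGCC"])
    print(cai("GCTGCTGCT", w), cai("GCTGCCGCC", w))
```

In this sketch, an amino acid that never occurs in the reference set gives all of its codons a weight of 1, so a realistic reference set should be large enough to cover every amino acid.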