Two international teams have independently produced the first drafts of the human proteome. These curated catalogs of the proteins expressed in most non-diseased human tissues and organs can be used as a baseline to better understand changes that occur in disease states. Their findings were published today (May 29) in Nature.
Both teams uncovered new complexities of the human genome, identifying novel proteins from regions of the genome previously thought to be non-coding.
“While other large proteomic data sets have been collected that cataloged up to 10,000 proteins, the real breakthrough with these two projects is the comprehensive coverage of more than 80 percent of the expected human proteome which has not been achieved previously,” said Hanno Steen, director of proteomics at Boston Children's Hospital, who was not involved in the work. “These efforts clearly show that to get to this deep level of proteome coverage, many different tissue types must be probed.”
Analyzing 30 different tissue types, Akhilesh Pandey, a proteomics researcher at Johns Hopkins University in Baltimore, Maryland, and his colleagues at the Institute of Bioinformatics in Bangalore, India, and elsewhere cataloged proteins encoded by about 84 percent of all human genes predicted to code for proteins. The researchers published the results of their Human Proteome Map online, and the data will also soon be accessible through the National Center for Biotechnology Information database, said Pandey.
Meanwhile, proteomics researcher Bernhard Küster of the Technische Universität München in Germany and his colleagues created ProteomicsDB, a searchable, public database that catalogs 92 percent of the estimated 19,629 human proteins.
Both teams analyzed human tissue samples using mass spectrometry. Pandey’s team generated all new data, analyzing a variety of healthy human tissues, including seven types of fetal tissues and six types of hematopoietic cells. The Küster group took a slightly different approach, compiling already available raw mass spec data from databases and colleagues’ contributions, which currently makes up about 60 percent of ProteomicsDB. To fill in the data gaps, the Küster lab generated its own mass spec data, analyzing 60 human tissues, 13 body fluids, and 147 cancer cell lines. According to Küster, the team selected only high-resolution public data, which was computationally processed for strict quality control.
“These two papers are very complementary,” said Anne-Claude Gingras, a proteomics researcher at the Lunenfeld-Tanenbaum Research Institute in Toronto, Canada, who was not involved in either study. “The Hopkins group really addressed what was missing in proteomics, providing a survey of human proteins from a single source, which allows for easy comparisons within their data.” In contrast, the ProteomicsDB effort connected new information with existing data from the proteomics community. The goal, said Küster, is to continue to grow and refine the database, further engaging the community and pooling more resources.
- M. Wilhelm et al., “Mass-spectrometry-based draft of the human proteome,” Nature, doi:10.1038/nature13319, 2014.
- M.S. Kim et al., “A draft map of the human proteome,” Nature, doi:10.1038/nature13302, 2014.