By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help understand the genetic contribution to disease. The genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing, were analyzed. By developing methods to integrate information across several algorithms and diverse data sources, the project provides a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. Individuals from different populations carry different profiles of rare and common variants, and low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. Evolutionary conservation and coding consequence were found to be key determinants of the strength of purifying selection, rare-variant load varies substantially across biological pathways, and each individual carries hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This extensive resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables detailed analysis of common and low-frequency variants in individuals from diverse backgrounds.
Via Dr. Stefan Gruenwald