We may have sequenced the genomes of humans and other species, but we are still nowhere near predicting how a genome gives rise to a living, breathing organism. Here’s why:
In 2001, the Human Genome Project gave us an almost complete draft of the 3 billion letters in our DNA. We joined an elite club of species that have had their genome sequenced, one that is growing with every passing month. As our technologies and understanding advance, will we eventually be able to look at a pile of raw DNA sequence and glean all the workings of the organism it belongs to? Just as physicists can use the laws of mechanics to predict the motion of an object, can biologists use fundamental ideas in genetics and molecular biology to predict the traits and flaws of a body based solely on its genes? Could we pop a genome into a black box, and print out the image of a human? Or a fly? Or a mouse?
Not easily. In complex organisms, some traits can be traced back to specific genes. If, for instance, you’re looking at a specific variant of the MC1R gene, chances are you’ve got a mammal in front of you, and it has red hair. Indeed, people have predicted that some Neanderthals were red-heads for precisely this reason. But beyond isolated traits like these, we still can’t tell from a genome alone whether we are looking at a mouse, a whale or an armadillo.
Bernhard Palsson from the University of California, San Diego agrees. “Sequencing a woolly mammoth will not predict its properties,” he says. “But you might be able to do a lot better with bacteria.” Their simpler and smaller genomes should in theory make it easier to predict the basic features of their metabolism, or whether they grow using oxygen or not. But even though we can sequence a bacterial genome in under a day, and for just $80, we would still struggle to determine important traits, like how good a disease-causing microbe is at infecting its host.
Finding all the genes even in a small genome is hard. Earlier this year, scientists discovered a new gene in a flu virus, even though its genome consists of just 14,000 letters (small enough to fit into 100 tweets) and had been sequenced again and again. So it should be unsurprising that our own genome, with 3 billion letters, is full of errors and gaps, despite ostensibly being “complete”. In May, another group showed that the reference human genome is missing a gene that may have shaped the evolution of our large brains. “There’s no genome that is completely understood even in terms of the genes within it,” says Markus Covert from Stanford University. “Typically, no function is known for a fourth to a fifth of the genes, even in smaller genomes.”