Genetics: Crunching the Data Within
Technological advances are making it ever easier and cheaper to gather information about the human genetic makeup—and ever more challenging to organize and interpret that flood of knowledge.
Dan Saelinger for Proto
The human genetic makeup can seem almost unfathomably complex. Yet, for at least 150 years, since Charles Darwin detailed his theory of natural selection, scientists have been peeling back layer upon layer of the mystery. From Darwin and Gregor Mendel in the 19th century to landmark discoveries in the 1950s involving deoxyribonucleic acid (DNA), genes and chromosomes, and the completion of the Human Genome Project in 2003—a watershed accomplishment that determined the sequence of the 3 billion chemical base pairs on DNA’s double helix—knowledge has expanded geometrically. But with each advance, making sense of what we’ve learned has become ever more daunting. For several decades, Massachusetts General Hospital researchers have been deciphering crucial parts of this science, and the hospital’s newly formed Analytic and Translational Genetics Unit (ATGU) will play a leading role in managing and interpreting this widening sea of genetic information.
During the early 1980s, a team led by James Gusella, now director of the Center for Human Genetic Research at MGH, pioneered a twist on a traditional genetic strategy called linkage analysis, the study of traits such as hair color or blood type that tend to be inherited together because the genes behind them lie near one another on the same chromosome. By modifying the technique to look at inherited variations in DNA, they mapped the position of the gene responsible for Huntington’s disease, a devastating adult-onset neurodegenerative disorder that causes loss of motor control, altered personality, and psychiatric and cognitive symptoms. Gusella’s finding opened the route to what became the widely used strategy of positional cloning, which involves cloning genes based on their location in the genome.
“At the time I came into the field, there were only about 30 traditional genetic markers,” Gusella says, “and none of them was associated with Huntington’s, so we knew the gene wasn’t located near any of them.” Instead, he and his team looked for variations in the DNA sequences themselves (on average, about 1 in 1,000 base pairs of DNA differs between any two individuals). “By tracking the inheritance of sequence variation within the families we studied, who carried the Huntington’s disease mutation, we could actually use the sequence differences as the markers to find the chromosome,” he says. “It was just a matter of finding enough sequence variations and testing them until we hit one that traveled with Huntington’s disease.”
By 1983, Gusella and his team had narrowed down the location of the gene for Huntington’s disease to chromosome 4, and 10 years later they found the gene itself. During the following decade, this same basic strategy, refined as scientists made more detailed maps of the human genome, was repeated in family linkage studies conducted for many single-gene disorders, including cystic fibrosis.
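For readers who want to see the arithmetic behind "a marker that travels with the disease," here is a minimal Python sketch of a two-point LOD score, the standard log-odds measure used in family linkage studies. It is an illustration only, not a reconstruction of the Gusella team's analysis. It assumes the simplest textbook case, in which every meiosis can be scored unambiguously as recombinant or not. The marker names and counts are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Log10 odds of linkage at recombination fraction theta versus
    no linkage (theta = 0.5), assuming fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked

def best_lod(recombinants, meioses):
    """Scan theta from 0.01 to 0.50 and return (max LOD, theta at max)."""
    thetas = [i / 100 for i in range(1, 51)]
    return max((lod_score(recombinants, meioses, t), t) for t in thetas)

# Hypothetical counts: (recombinants between marker and disease, meioses scored)
family_counts = {
    "marker_A": (10, 20),  # disease and marker separate half the time: unlinked
    "marker_B": (1, 20),   # marker almost always travels with the disease
    "marker_C": (8, 20),
}

results = {name: best_lod(r, n) for name, (r, n) in family_counts.items()}
top_marker = max(results, key=lambda name: results[name][0])
top_lod, top_theta = results[top_marker]

for name, (lod, theta) in results.items():
    print(f"{name}: max LOD {lod:.2f} at theta {theta:.2f}")
print(f"Best candidate: {top_marker}")
```

By long-standing convention, a LOD score of 3 or more (odds of about 1,000 to 1 in favor of linkage) is taken as evidence that a marker sits near the disease gene. Real pedigrees require more elaborate likelihood calculations than this sketch attempts.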
“The strategy was very successful,” says David Altshuler, a faculty member of the Center for Human Genetic Research at MGH and director of the Program in Medical and Population Genetics at the Broad Institute of Harvard and MIT. “But for the more common diseases that are caused by many genes, it didn’t work.” Such disorders include type 2 diabetes, heart disease and psychiatric conditions, including anxiety, autism and schizophrenia. So, by the early 2000s, researchers had moved on to a strategy called genome-wide association: comparing research subjects who have a particular disease with those who do not to see which sequence variations in the genome are associated with it. The search was aided by rapid developments in sequencing technology. In the wake of that shift, Altshuler and Mark Daly, chief of ATGU, have helped lead three major genomic mapping efforts: the International HapMap Project, the SNP Consortium and the 1000 Genomes Project. MGH has been an active collaborator in all three, in partnership with the Whitehead Institute and now the Broad Institute, and researchers at the hospital have used genome-wide association studies to investigate type 2 diabetes, Crohn’s disease, blood lipids, anxiety and more.
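The core comparison in a genome-wide association study can be illustrated for a single variant. The Python sketch below assumes a simple allelic chi-square test on a 2x2 table of allele counts in cases and controls. The function name and the counts are hypothetical, and real studies add quality control, correction for ancestry and other steps not shown here.

```python
import math

def allelic_association(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.
    Returns (chi-square statistic, p-value, odds ratio)."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one allele")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio

# Hypothetical variant: 100 cases and 100 controls, two alleles per person
chi2, p_value, odds_ratio = allelic_association(
    case_risk=60, case_other=140, control_risk=40, control_other=160
)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}, odds ratio = {odds_ratio:.2f}")
```

Because a genome-wide study repeats a test like this at hundreds of thousands or millions of variants, a much stricter significance threshold than the usual 0.05 is applied. A value near 5 x 10^-8 is commonly cited. That is one reason such studies need very large numbers of participants.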
According to Altshuler, that research has led to the discovery of roughly 1,000 genetic contributors to common diseases (only about 10 were known a decade ago). “Now, we’re embarking on an era in which the continued acceleration of sequencing technology—which makes it possible to sequence the entire genomes of many patients in research studies or even in the clinic—is generating the highest-resolution map of our genes that you could possibly imagine,” Altshuler says. “You can know every letter of DNA in a research subject.”
The cost and time required to do whole-genome sequencing have decreased dramatically—leading to a proliferation of genetic information that ATGU will attempt to organize and interpret. “When we run sequences with the new technologies, we find millions and millions of places in the genome that differ between two individuals,” Daly says. “We only understand the function of the genetic variations for a very small handful of those differences.” Now Daly wants to elucidate the function of those gene variations by conducting large-scale studies.
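Daly's "millions and millions" is consistent with two figures given earlier in the article: a genome of about 3 billion base pairs, and roughly 1 difference per 1,000 base pairs between any two people. A back-of-the-envelope check in Python, using only those two rounded figures (the variable names are illustrative):

```python
GENOME_BASE_PAIRS = 3_000_000_000   # approximate size of the human genome
BASE_PAIRS_PER_DIFFERENCE = 1_000   # about 1 in 1,000 base pairs differs

# Integer division keeps the estimate exact for these round numbers
expected_differences = GENOME_BASE_PAIRS // BASE_PAIRS_PER_DIFFERENCE
print(f"Expected differences between two genomes: about {expected_differences:,}")
```

This is an order-of-magnitude estimate only, since the true rate varies across the genome and between pairs of individuals.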
“It’s hard to explain to people that, just because we have this machine that can sequence your genome, doesn’t mean we can actually tell you all of these exciting things about your medical past and future,” he says. “We need to go through a period of using the technology in a research setting to learn where in the genome the most interesting bits of information are, as well as what the genes are telling us about the causes of disease and how we can reverse that. That’s now our focus.”