A small team of scientists has dramatically improved "gene chip" technology, for the first time making it a practical method for rapidly determining the sequence of genetic building blocks. The advance, likely to speed the search for disease-related genetic changes, is reported in the November issue of the journal Genome Research.
The gene chips, or microarrays, are dotted with a microscopic grid of hundreds of thousands of tiny segments of DNA whose sequences were determined by the Human Genome Project. These segments, fixed in known spots on the chip, find their matches in a sample of DNA. The better the match, the more likely the known DNA sequence reflects the unknown sample sequence, explains the team from the Johns Hopkins School of Medicine and a biotechnology company, Affymetrix, Inc.
A new analysis method lets the team identify and focus on the chips' more reliable information. "Until now, ways to analyze the chips were unable to distinguish highly accurate data from less reliable information," says lead author David Cutler, a research associate in the university's McKusick-Nathans Institute of Genetic Medicine.
"We need the chips to be very accurate because variation in the human genome is relatively rare," adds Michael Zwick, John Wasmuth postdoctoral fellow in the McKusick-Nathans Institute. "We've taken advantage of the fact that individual features, individual points, can be very reliable. Our analysis technique identifies them."
Previously, the only way to figure out the order of building blocks, or bases, in genetic material was to use machines known as DNA sequencers, which remain the only way to obtain the first genome of a species. Microarrays had been useful for studying gene activity but not gene sequence, Cutler says.
"We took the most logical, straightforward approach we could to help us determine which of the microarray sequences to pay attention to and which ones to ignore," Cutler adds. "Our system actually evaluates and scores the reliability of each individual building block."
Using gene chips and their new analysis, the scientists were able to accurately determine the order of 2 million bases in each of 40 individuals' genomes in just a year, a fraction of the time required by traditional technology. The parts of the human genome carrying instructions for proteins consist of about 45 million bases, while the whole genome is around 3 billion bases.
There's so little natural variability in human DNA that an analysis that is 99.9 percent accurate isn't good enough, Zwick says. Such an analysis would give 10 errors out of every 10,000 points in the sequence, but the natural variability in the human genome is only eight in 10,000. "To be useful, the data has to have fewer than eight errors in 10,000," he explains.
The new technique identified more than 80 percent of the determined sequences as 99.9999 percent accurate, making it possible to search those regions for changes that might be linked to diseases such as high blood pressure and schizophrenia, say the researchers, who will make their analysis tool available to others.
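The accuracy figures quoted above can be checked with a little arithmetic. The short Python sketch below is only an illustration and is not the researchers' analysis tool: the function names and the simple "fewer errors than natural variants" test are assumptions made here to mirror Zwick's rule of thumb.

```python
# Illustrative arithmetic only; names and the threshold test are assumptions,
# not part of the published analysis method.

VARIATION_PER_10K = 8.0  # natural variability quoted in the article


def errors_per_10k(accuracy_percent: float) -> float:
    """Expected number of wrong calls per 10,000 bases at a given accuracy."""
    return (100.0 - accuracy_percent) / 100.0 * 10_000


def is_useful(accuracy_percent: float) -> bool:
    """True when expected errors fall below the natural variation rate."""
    return errors_per_10k(accuracy_percent) < VARIATION_PER_10K


if __name__ == "__main__":
    for accuracy in (99.9, 99.9999):
        print(
            f"{accuracy}% accurate: about {errors_per_10k(accuracy):.2f} "
            f"errors per 10,000 bases, useful: {is_useful(accuracy)}"
        )
```

At 99.9 percent accuracy the expected errors (about 10 per 10,000) outnumber the real variants (about eight per 10,000), while at 99.9999 percent the expected errors drop to about one per million bases, far below the natural variation rate.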
"Technology has advanced to where we can get lots of data about both gene expression and gene sequences," says Aravinda Chakravarti, director of the McKusick-Nathans Institute. "We can use these techniques to examine diseases that seem functionally similar but on a genetic level are not."
The study was funded by the National Institutes of Health. The other authors are Christopher Yohn, Katherine Tobin and Carl Kashuk of the McKusick-Nathans Institute of Genetic Medicine at Johns Hopkins; Minerva Carrasquillo, Debra Mathews and Evan Eichler of Case Western Reserve University; and Nila Shah and Janet Warrington of Affymetrix, Inc., Santa Clara, Calif.