Johns Hopkins Magazine

The Weird, Weird World of DNA

Amazement reigns and surprises abound for scientists intent on better fathoming the stuff of life.

By Michael Purdy
Photos by Bill Denison

Two of the most revolutionary, far-reaching, and out-and-out weird breakthroughs of 20th-century science came together in a volume published in 1944. The two weird breakthroughs were quantum mechanics and DNA, and they "met" in a book called What Is Life? by Erwin Schrödinger, a famous Austrian physicist who was one of the founders of quantum mechanics. Schrödinger summed up the book's objective this way: "How can the events in space and time which take place within the spatial boundary of a living organism be accounted for by physics and chemistry?"

The molecules and atoms of life, Schrödinger reasoned, should be no different from those studied in physics, and should obey the same laws. So he tried to apply those laws and principles to solving biology's most important problem at the time: determining how genetic information was stored in living organisms.

At that time, DNA still wasn't very high on the list of suspects for carrying genetic information. Researchers were mostly chasing proteins. Though a few obscure experiments had begun to point to DNA, full clarification of its structure was nearly a decade away.

Schrödinger found it challenging to reconcile the biological and physical evidence available to him. Biological studies, he explained, suggested that the "admirable regularity and orderliness" of living organisms were coming from a "well-ordered group of atoms." But modern physicists, he asserted, were finding randomness and uncertainty to be dominant themes as they studied ever smaller groups of atoms.

In the end, Schrödinger turned metaphysical, resorting to a concept that has less to do with biology than it does with philosophy.
The physicist attributed life to the self--the "I" at the center of everyone's consciousness--which, he said, was undying and continuous and in charge of the action of the atoms in the self's physical body.

If it seems surprising that a scientist would be driven to philosophize while trying to understand DNA, that may be due in part to our modern conceptions. The general public tends nowadays to think of DNA in terms of tameness and predictability--as a "library" or a "blueprint." The Human Genome Project and a private company, Celera, announced this year that they had sketched out the entire sequence of the human genetic code, seemingly planting the flag on the last great summit of DNA research.

Scientists who work with DNA know better. They know DNA and the systems it interacts with are incredibly complex, frequently amazing, and prone to handing out unpredictable surprises. In a word, DNA is weird. Three biology faculty members at Hopkins's Krieger School of Arts and Sciences can testify to that on the basis of surprising and amazing discoveries they've made in their career-long quests to better understand the stuff of life.

DNA Structure, or What do you get if you cross a ladder and a corkscrew?

The most basic component of DNA, known as a nucleotide, consists of three parts: a molecule known as a phosphate linked to a sugar linked to one of four kinds of bases. Each nucleotide can hook up to another nucleotide above and below it, forming a chain with the bases protruding. Hook two of these chains together at the bases, and you have a ladder-like structure, with the bases forming the rungs. Scientists refer to each rung as a "base pair," and use it as their yardstick for measuring distances in DNA. To finish the DNA, twist one end of the ladder while holding the other end still. The whole thing will develop a corkscrew-like shape known as a double helix, with a turn about once every 10 base pairs.
The DNA in human cells is organized into two sets of 23 chromosomes, which are made up of DNA and several chromosomal proteins. When the DNA and these proteins are bound together, they're collectively called chromatin. The pairs of chromosomes provide you with two copies of each of your genes. These "copies" are frequently different from each other, so scientists call them alleles.

Evangelos Moudrianakis (PhD '64), Hopkins professor of biology and biophysics, is fond of pointing out a mind-numbing fact: Since there are billions of base pairs in human DNA, if the DNA were uncoiled and laid out straight, the average human genome, or complete genetic code including both copies of every gene, would be about two meters long. That's two meters of genetic material crammed into every single cell in your body. Actually, it's not just crammed into the cell--it's crammed into a pocket inside the cell known as the nucleus, just a few microns (millionths of a meter) wide.

Moudrianakis likes to point out that amazing bit of trivia because in the 1960s and early 1970s he was among the first to figure out how DNA is compacted. His work and that of others in the early 1970s triggered a fundamental change in understanding how DNA is packed into the nucleus.

Ever seen one of those new space-saving library shelving units, in which the shelves are compacted together until you push a button that spreads them apart so that you can reach a book? The books are like the genes in the DNA and the shelves are like the chromosomal proteins. Collectively, this entire assembly is known as chromatin. The structure of chromatin can change, altering the accessibility of certain genes through compaction and expansion.

A cell doesn't use all its genes at once--it uses some often, others rarely, and still others not at all. This allows cells to become specialized types, like blood cells or nerve cells, or to respond to changes in their environment.
Understanding factors that affect a cell's usage of a gene, referred to by scientists as the gene's "expression," is key to understanding genes' contributions to healthy and abnormal functions.
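The two-meter figure quoted above for uncoiled human DNA can be checked with back-of-the-envelope arithmetic. The sketch below is only an illustration: it assumes roughly 3.2 billion base pairs per copy of the genome and a spacing of about 0.34 nanometers between neighboring base pairs. Both are commonly cited approximate values, not numbers given in this article, and the function name is made up for the example.

```python
# Rough check of the "about two meters" estimate for uncoiled human DNA.
# All three constants are assumptions (approximate textbook values),
# not figures taken from the article.
BASE_PAIRS_PER_COPY = 3.2e9        # assumed base pairs in one copy of the genome
RISE_PER_BASE_PAIR_M = 0.34e-9     # assumed spacing between base pairs, in meters
COPIES_PER_CELL = 2                # both copies of every gene, as in the article


def uncoiled_length_m(base_pairs=BASE_PAIRS_PER_COPY,
                      copies=COPIES_PER_CELL,
                      rise_m=RISE_PER_BASE_PAIR_M):
    """Length in meters of all the DNA in one cell if laid out straight."""
    return base_pairs * copies * rise_m


length = uncoiled_length_m()
print(f"Approximate uncoiled length: {length:.1f} m")
```

Under those assumptions the result lands close to the two meters Moudrianakis cites, all of it packed into a nucleus only a few millionths of a meter wide.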
[Photo: Moudrianakis focuses on the intricacies of chromatin.]

"Saying chromatin structure is important is a bit like saying water is important," observes Moudrianakis, whose fondness for provocative statements is often masked by his soft-spoken, grandfatherly demeanor and his melodious Greek accent. "Obviously chromatin structure is important! What's really important is that chromatin structure is dynamic, and changes in that structure regulate gene activity."

A quietly effervescent raconteur, Moudrianakis has been at Hopkins for four decades. He began his graduate training in the early 1960s as a biophysicist under Michael Beer, now an emeritus professor of biophysics. It was during this time that the duo made the world's first published attempt to determine the sequence of bases in a DNA molecule. Their technique, quite different from the one developed years later, involved finding "tags" for each of the four bases in DNA, and then using an electron microscope to look for these tags, like tiny flags sticking out on the strand of DNA. Moudrianakis had hoped to find a computerized method for recognizing the flags--there are far too many bases in DNA for a person to inspect them all visually. Ultimately, he discovered that the technology wasn't available.

A fondness for using the electron microscope, which has lasted to this day, took root in him at that time. When he joined the Hopkins faculty in 1964, he continued to use electron microscopes to study DNA's relationship to chromosomal proteins. At that time, scientists thought chromosomal proteins lay across multiple parallel strands of DNA in a grid-like structure. In one popular depiction, the proteins provided the horizontals of the grid and DNA the verticals. The proteins were thought to rest in the grooves formed by the twists of the double helix, like pins across the grooves in a line of screws. But this picture was based on experimental data that could be interpreted in different ways.
With an early graduate student, Moudrianakis in 1967 set up a daringly simple experiment to try to look directly at chromatin in its natural state. He used a colony of human cells in culture and a method for controlling their growth cycle. When cells are getting ready to divide, they duplicate their DNA, a feat that requires some unpacking of the DNA and the chromosomal proteins. "We gently broke the cells open on the microscope grid at this point, before the chromatin had been packed into chromosomes again," Moudrianakis recalls, "and this came out." "This" was a string of black and white beads on a cluttered gray background, captured in an electron micrograph that sits on his bookshelf. The chromatin in the photograph, he notes proudly, is the longest ever to be imaged in its natural beaded form from end to end. To further understand how DNA and chromosomal proteins relate to each other, Moudrianakis came up with another simple experiment. He decided to see if he could force DNA to take on a chromatin-like structure without chromosomal proteins. To do this, he put DNA in a test tube and altered the properties of the test-tube environment to resemble the properties of chromosomal proteins. He knew chromosomal proteins are positively charged and neutralize the negative charge of DNA, so he put salt in with the DNA to neutralize its charge. He also knew chromosomal proteins are hydrophobic, which means they chemically push away water. To duplicate this effect, he added alcohol to the test tube. He took electron micrographs of the DNA as he did this, and at a particular point the DNA suddenly shortened and thickened dramatically. The micrograph from this moment is also framed and propped up in his office. "When I showed that picture to Mike Beer without telling him what it was, he said, 'Oh my God, what a beautiful picture of chromatin,' and I said, 'Mike, there's no chromatin in that picture. It's DNA,' " Moudrianakis says, laughing with delight at the memory. 
DNA molecules in his experiment were taking on a structure very similar to chromatin: about the same width, and twisted into a structure known as a supercoil (like a telephone cord that's been twisted through frequent use). Scientists conducted follow-up tests on other kinds of DNA, and always found the same result. "We concluded that what we proved was a property of the double helix itself," Moudrianakis said. The physical properties of the materials that make up DNA, and the configuration they're placed in, make DNA naturally want to coil up in response to certain chemical and physical conditions in its environment. "It just needs to come to the right thermodynamic state, and that's what chromosomal proteins help it do," Moudrianakis explains.

In the decades since then, the biophysicist has continued to work with chromatin. He was the first to determine the structure of a single unit of chromatin (known as a nucleosome). "It's a piece of DNA wrapped around a spool of eight chromosomal proteins," Moudrianakis explains. "The proteins in the spool are partitioned in three groups." The surface of the spool is like a bead with grooves on it. The DNA fits into the grooves, taking two turns around every bead. The chemical bonds between the three protein groups in the spool are weak and can be pried open and closed, just like those expandable library shelves.

"When the chromatin structure is stretched open, genes in the DNA are active, and when it's closed, the genes are inactive," Moudrianakis concludes. These processes controlling when genes are turned on and off, he explains, are key to understanding genes' effects on health and development.

Transcription, or Getting an order for the protein factory

Information is read from DNA in a process called transcription. It starts when a compound known as RNA polymerase pries the DNA in two, then binds to one side of the DNA and moves along the single bases sticking out from that side.
Depending on which side of the DNA the RNA polymerase is on, it may travel up or down the DNA--opposite sides "read" in opposite directions. Whatever direction the code reads on the side in question is "downstream." RNA polymerase uses the patterns in the bases as a guide for building a third compound: messenger RNA. Messenger RNA is like a genetic order form--it carries the recipe for a protein from the DNA to protein-building factories outside the nucleus. In the 1960s, scientists began to decipher how DNA's code of bases translates into amino acids, the building blocks of proteins. They also found sequences of bases that seemed to have a more elementary function: telling gene-reading equipment where to start and where to stop, for example. The area on the DNA where RNA polymerase binds to start reading a gene is known as the gene's "promoter" site. Robert Schleif had to put one DNA puzzle on a back burner for a few years. But he ultimately solved it, winning a permanent place in textbooks for showing that DNA could do something scientists hadn't expected it to do. In his postdoctoral studies at Harvard, Schleif had begun to study a protein known as AraC in the bacterium E. coli. The protein is involved with the bacterium's metabolism of a sugar, arabinose. Scientists had found some indications that AraC was active in an odd gene regulation process that could turn genes on or off depending upon the availability of arabinose. Schleif's mentor at Harvard, Walter Gilbert, had shown that it was possible for a protein to bind to DNA in a way that repressed the activity of a gene. The protein he isolated was produced by one gene and then bound to DNA at a different gene's promoter site. This made it impossible for RNA polymerase to bind to the DNA at the second gene's promoter site and start reading it. Schleif suspected that the AraC protein might be involved in a similar regulatory process in E. coli. 
Soon after he arrived at Brandeis University as an assistant professor in 1972, Schleif heard of puzzling experimental results produced by another researcher working with the AraC protein--results that suggested AraC could repress the activity of the other genes by binding to the DNA upstream from those genes, instead of binding on their promoter sites. Schleif conducted his own experiments, and his data suggested that AraC's DNA binding site might be even farther away than previously suspected--perhaps 300 base pairs away.
Scientists had no way of explaining such power at a distance. It was like a paperweight holding down a paper that it wasn't even partially on top of, or a dam creating a reservoir on a tributary miles downstream.

"We were shocked," remembers Schleif, a thin man with bright eyes, unruly gray hair, and a ready smile. "We tried very hard to proceed with the work, and we developed techniques for the isolation and study of the relevant DNA, but these [techniques] were not quite good enough."

Stumped, Schleif began to explore other questions about the AraC gene, but in the back of his mind he kept picking apart the problem. He had a cause, protein binding to DNA; and an effect, genes shutting down. How were the two linked? He started to think about the possibility that while the AraC protein was bound to DNA, it might be interacting with a second protein also bound to DNA. AraC binds to DNA upstream from the genes that were getting turned off; if it interacted with another protein downstream from the genes, the two proteins could potentially pinch off a segment of DNA in a loop that contained the genes. Pinching the genes in the loop, he reasoned, might turn the genes off by leaving little or no room for the gene-reading machinery to come in and bind to the DNA at the right spot to start reading those genes. Schleif wondered if inserting the right amount of DNA between the two proteins that created the loop wouldn't similarly disrupt the looping process.

When his lab, still at Brandeis years later, encountered further evidence that their earlier findings correctly estimated the distance between the genes and AraC's DNA binding site, he was ready with a test for his loopy idea. But he encountered unexpected resistance. "In our group meeting, I proposed three ways that this phenomenon might work," he recalls, smiling fondly. "One of them was DNA looping, and everyone laughed at me."
Schleif couldn't sell anyone on the notion of conducting the experiment and so ended up doing it himself. Once he had arranged and modified the necessary equipment, Schleif got just the results he was hoping for. An insertion of five base pairs re-enabled the genes, but with a 10 base pair insertion the genes continued to be repressed. "I was actually in the next building talking with a colleague at the time when my technician excitedly came over and said it had worked!" he remembers.
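One way to picture why five extra base pairs mattered and 10 did not is to track how far an insertion rotates one protein binding site around the helix relative to the other. The sketch below is illustrative only. It assumes a helical repeat of about 10.5 base pairs per full turn (a common textbook value, not a figure from this article), and the function names are hypothetical, invented for the example.

```python
# Illustrative helical-phasing arithmetic. The 10.5 bp-per-turn repeat is an
# assumed textbook value, and both function names are invented for this sketch.
HELICAL_REPEAT_BP = 10.5


def rotation_degrees(inserted_bp, repeat=HELICAL_REPEAT_BP):
    """Angle by which an insertion rotates one site around the helix axis."""
    return (inserted_bp / repeat * 360.0) % 360.0


def offset_from_original_face(inserted_bp, repeat=HELICAL_REPEAT_BP):
    """Smallest angle (0 to 180 degrees) between the new and original orientation."""
    angle = rotation_degrees(inserted_bp, repeat)
    return min(angle, 360.0 - angle)


for inserted in (0, 5, 10):
    offset = offset_from_original_face(inserted)
    print(f"{inserted:>2} bp inserted: about {offset:.0f} degrees off the original face")
```

Under that assumption, a five-base-pair insertion leaves the two sites on nearly opposite faces of the helix, while a 10-base-pair insertion brings them almost back into alignment. That is consistent with the pattern Schleif reports, with repression lost at five base pairs and retained at 10, if the two bound proteins must sit on the same face of the DNA to form a loop; that reading is a standard interpretation offered here as an assumption, not something the article spells out.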
[Photo: Schleif: His "loopy" ideas paid off.]

When he announced his findings at a meeting, Schleif recalls, interest in DNA looping spread rapidly. At the time, many scientists were studying the genetics of animal viruses, and a number of researchers had encountered a problem similar to Schleif's but slightly different in one respect. They were finding that proteins binding to DNA a distance from a gene could be key to that gene being turned on, instead of turning the genes off.

"These sites quite a distance away were called enhancers because they enhanced the activity of a gene's promoter," Schleif says. "People were conjecturing how in heck it worked. None of the proposals involved interaction at a distance through looping. I gave my presentation, and the next year at the meeting everybody was saying OK, looping is the way an enhancer works. So it had generated a big change in people's thinking."

Why would DNA want to loop? "DNA looping solves a problem for you in that you can only put so many proteins at a gene's promoter site to regulate it," says Schleif, who published his first looping findings in 1986, shortly before joining Hopkins. "But if you've got a complex system that must respond to a variety of signals, DNA looping allows proteins to bind at some distance from that promoter [site] and still affect it.

"Now, of course, it's clear that looping is not the only way you can get apparent action at a distance, and that things are generally more complex than the simple system we happened to have stumbled onto," says Schleif, who today continues to study regulation by the AraC gene in E. coli. "But it was really nice to find something that explained a problem that had puzzled me for a long time and was a general principle that nature used from time to time. In general in science you can't expect to be so lucky."

Schleif's lucky opportunity let him help science open a window onto a range of new mechanisms used to regulate DNA.
Splicing, or Even DNA needs an editor

Not everything in the genetic code goes into the protein-making message. Large sections of the coding, known as introns, are "junk DNA"--they don't code for anything. When the gene-reading machinery is working properly, it always splices introns out of the messenger RNA and discards them before splicing the exons, the legitimate protein-building instructions, back into a whole and sending them on their way out of the nucleus.

Scientists have found that exons sometimes can also get spliced out of messenger RNA and discarded. This can make the protein-building instructions in some genes less like a model-building kit and more like a box of Legos--there's not just one final product made by bringing all the parts together in a particular way, but potentially several different products can be made by bringing some or all of the parts together in different ways. Instead of one gene always creating one protein--a model that prevailed for decades--scientists know now that some genes can create many different proteins.

Somewhat lost in the huzzah over last February's announcement that an initial, sketchy map of the human genome had been completed was the biggest scientific surprise of those findings: The genome appeared to have significantly fewer genes than scientists were anticipating. Researchers thought there would be about 80,000 genes, but they found only about 30,000 (other biologists have recently challenged this number).

One week after the genome was published, Victor Corces, chairman of Hopkins's biology department, published a paper in the February 22, 2001, Nature that might help account for some of the deficit. Earlier estimates of genes to be found in the human genome were based on an estimate of the number of proteins in a human, and Corces's finding has the potential to increase the number of proteins that can be created from a given number of genes, perhaps lessening the deficit.
Corces isn't naturally inclined to seek headlines. He's been quietly working for decades to understand chromatin structure in the fruit fly genome--highly technical work that requires meticulous attention to detail. His research combines genetics, molecular biology, and cell biology, and sometimes gets so technical, Corces admits, that it can leave other scientists a little confused when he tries to give them a quick summary.
[Photo: Corces: accounting for the genome's gene "deficit"]

Corces's group was studying a gene known as mod(mdg4) that scientists had shown was essential to a fruit fly's ability to establish or maintain chromatin structure. Without proper chromatin structure, the genetic library has no shelves, the genetic blueprint no drafting table, and an organism isn't long for the world.

Corces's group had identified two mutations in the gene that could severely impair the gene's ability to function, and had shown that putting either of those mutations in both copies of a fruit fly's mod(mdg4) gene killed fly embryos. However, when they placed one type of fatal mutation in one copy of the gene and the second kind of fatal mutation in the other allele, the flies survived.

"We didn't understand what we were seeing initially and had assumed it was an error," Corces recalls. But when they took a closer look at how instructions for making the critical protein were put together, they encountered a surprise. Normally, the instructions for building a protein only come from the base pairs along one side of DNA after its strands have been split in two. Corces's group found that instructions for building the fruit fly protein appeared to be coming from both sides.

They considered three explanations. Something might be making RNA polymerase stop and jump from one side of the DNA to the other. The gene might be rearranging itself to bring all the right sequences to one side of the DNA. Or both sides of the DNA were being read at once into two different messenger RNA molecules that were later spliced together.
"We found no examples of the first two ideas in the literature, but there were a few references to the third," says Corces. Those few examples typically described the information from the other side of DNA being spliced out--nobody had ever before seen an example where the information from the other strand was kept in. Although unprecedented in that regard, trans-splicing still seemed the most plausible explanation.

Corces's group later showed that the two mutations occurred on different sides of the DNA molecule. Each copy of the gene could therefore contribute a valid and different part of the protein-making instructions. The cell's gene-reading machinery then spliced these together into the undamaged formula for building the whole protein.

The information from the side of the DNA opposite the mod(mdg4) gene that Corces's group had been studying is actually lifted from another fruit fly gene. This gene is thought to be involved in apoptosis, a process that sick or damaged cells can use to pull their own plug.

"We've known for some time that one gene can make multiple protein products, but this increases even further the possibility of making multiple protein products from just two genes," Corces says. He notes that mod(mdg4)'s links to trans-splicing mean that there are now more than 20 proteins that the gene can make wholly or in part.

At the time the trans-splicing paper was published, Steven Salzberg, a researcher with joint appointments at Hopkins and The Institute for Genomic Research in Rockville, Maryland, called it a very significant finding: "It suggests another way we need to look at the genome to look for more proteins." Since that time, Salzberg has developed and begun using computerized techniques to search for other potential instances of trans-splicing.
As the post-genomic era of research begins, another one of Erwin Schrödinger's odd creations may help offer a perspective on the challenges facing the new age: his imaginary pet cat. This luckless beast, he proposed, spends its days inside a black box with a small sample of a radioactive element and a vial of poisonous gas; the two are connected together so that when the radioactive element decays, the gas is released and the cat is killed. Including the radioactive element was Schrödinger's way of getting nature periodically to flip a coin for him inside the black box. Over shorter spans of time, radioactive elements decay unpredictably, and scientists can only give the odds that a decay will occur, not firm predictions of when. Schrödinger set up his cat scenario so that every hour there's a 50-50 chance that the unspecified radioactive element will decay and kill the cat. It's impossible to know how the coin flip has gone until you open the box. Schrödinger wasn't trying to work through any childhood traumas with felines. He wanted to demystify one particularly mind-blowing aspect of quantum mechanics. This aspect suggested that the most basic unit of light, a photon, could either take on the characteristics of a wave or a particle; but once it had been observed having the characteristics of a wave or a particle, it would stay that way. The cat in the sealed black box, Schrödinger said, was just like the photon before observation. Until the box is opened, it's impossible to know if the cat is alive or dead, and it must therefore be considered to be in an "in-between" state. Once the box is opened, the cat will permanently move into one state or the other. Biologists' picture of the gene is starting to resemble Schrödinger's photon. More and more, they're thinking of the gene as potential information that is capable of going down different roads and producing different results, depending on factors outside of what's coded into DNA. 
These factors include mechanisms like DNA looping or alterations in the way DNA is stored that can change the level at which a gene is expressed; the editing that the gene-reading machinery performs on protein-building instructions; and, possibly, whether those protein-building instructions come from one side of DNA or both sides. And there are many other factors not described in this article, such as changes to a protein's structure after it's been produced; imprinting, a process that can selectively make one copy of a gene more active on the basis of the parent that the gene originally came from; and transposons, genes that jump in the DNA. The path from DNA to life, therefore, has more than one black box, but these boxes are more permeable than those in quantum mechanics. The science of drilling holes in these theoretical black boxes and determining how information in the gene is connected with effects on life has its own name: epigenetics. The journal Science devoted a special issue to it this past summer. Meanwhile, changes and advances in basic DNA research are reflected in a broadening in the way medical researchers study the gene's roles in disease. For a variety of conditions like psychological illnesses, heart diseases, and addiction, researchers searching for causes are now looking not just for one problem in one gene, but for how a variety of genes interact with factors in the environment. As the basic and applied sciences pull closer together, and the many black boxes of epigenetics gradually start to reveal their contents, it's tempting to speculate that these discoveries may play a large part in life's ability to contain so many seemingly contradictory urges and imperatives. Even in the modern day, it's hard to avoid getting a little philosophical when considering the intricacies of DNA. Must be something in our genes. Michael Purdy is a Baltimore science writer and frequent contributor to Johns Hopkins Magazine. 
Return to November 2001 Table of Contents

