Johns Hopkins Magazine - September 1996 Issue

A Primer on Vision

By Elise Hancock
Is the world really the way it looks?

From a commonsense point of view, the answer is no. If the world were the way it appears on your retina, the universe would jerk every time you took a step or moved your head. It would disappear every time you blinked. And since the retina receives a flat picture, we'd all be two-dimensional, living in a world of meaningless fragments.

If you think about it, you'll notice that in fact you seldom see an entire object. Half the rocking chair will sit behind the doorframe. Chunks of tree are hidden by picket fence. Yet your mind makes these fragments whole, without giving it a thought. We infer continuity from blink to blink, jerk to jerk. We understand three dimensions so deeply that even when we view an object from an unfamiliar angle--imagine a rocking chair seen through a glass floor--we still know what it is. At a glance, literally in milliseconds, we routinely make sense of what we see.

No computer, however powerful, can perform such marvels. A camera can easily produce an image of a scene, but within that image a computer cannot reliably pick out even an edge--cannot decide where one object stops and the next begins.
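For a feel of why this is hard, here is a minimal sketch (a toy in Python with NumPy; the kernel, the threshold, and the test images are all illustrative choices, not anything the researchers here describe) of the textbook machine approach: estimate brightness gradients, then call the strong ones "edges." The trouble is the threshold: set it low and noise becomes edges; set it high and faint real edges vanish.

```python
# A toy gradient-based edge detector: the textbook machine approach.
# Any fixed threshold that keeps real edges also keeps texture and
# noise, which is why "find the edge" is not reliably solvable this way.
import numpy as np

def sobel_edges(image, threshold=0.25):
    """Return a boolean map of pixels whose brightness gradient is 'strong'."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = image.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(1, h - 1):           # brute-force convolution: it's a toy
        for j in range(1, w - 1):
            patch = image[i-1:i+2, j-1:j+2]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold * magnitude.max()

# A clean step edge is found perfectly...
clean = np.zeros((8, 8)); clean[:, 4:] = 1.0
# ...but add mild noise and the same threshold reports spurious "edges".
noisy = clean + 0.2 * np.random.default_rng(0).standard_normal((8, 8))
print(sobel_edges(clean).sum(), "edge pixels in the clean image")
print(sobel_edges(noisy).sum(), "edge pixels in the noisy image")
```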

Do other creatures have vision like ours?

Not precisely, though Old World rhesus monkeys are similar. All creatures have vision adapted to the way they live.

Humans see as much color as three receptor types for color make possible, as do many fish and birds. (Without color vision, the peacock's tail would hardly impress his lady.) We also see detail well, though not as well as do eagles, hawks, and vultures. But dogs and cats live in a world constructed from two color receptors, and cats, nocturnal by nature, see large dim objects better than details. Honeybees are customized for finding flowers (they have not only color vision but short-wave vision), while pigeons are adapted to pecking and guard duty. Having eyes on the sides of their head, these birds see a 340° surround; 3D vision is limited to where it matters most--right in front of the beak. This list could go on and on.

The major commonality presumably goes back hundreds of millions of years: In all vertebrates and some invertebrates, the receptor cells in the retina rely on the same two chemicals: opsin and retinal, a derivative of vitamin A. That's one reason why vitamin A deficiency is so devastating.

I studied all about rods and cones and corneas in 8th grade. Could you jump ahead to how the system interprets the disjointed jumble we see?

The neat stuff starts in the retina, which is technically part of the brain, only displaced toward the light. The rest of the eyeball we can keep minimalist.

So: Light energy bounces off each person's personal universe (whatever's out there); enters the eye; and is focused on the retina as an isomorphic (iso = same, morphic = shape) picture of what you see. The image is shaped by some 100 million rods and 5 million cones. These serve much like pixels on a TV screen, because each responds only to photons of a given wavelength and/or brightness. Those that are stimulated go ready, set, fire!--and the picture appears. We see.
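How can three receptor types span every color we see? A toy model makes the trick concrete. (The Gaussian curves below, and the rounded peak wavelengths, are illustrative stand-ins for the real cone spectra.) To the brain, a wavelength of light is nothing more than a triple of firing strengths.

```python
# Toy trichromacy: three cone types, each a broad sensitivity curve.
# The peaks (445, 535, 575 nm) are approximate, and the Gaussian
# shapes are an illustration, not real cone spectra.
import math

CONE_PEAKS_NM = {"S": 445.0, "M": 535.0, "L": 575.0}

def cone_responses(wavelength_nm, width_nm=60.0):
    """Relative response of each cone type to monochromatic light."""
    return {name: math.exp(-((wavelength_nm - peak) / width_nm) ** 2)
            for name, peak in CONE_PEAKS_NM.items()}

# A single wavelength is encoded only as a triple of firing strengths;
# every color we see is some such triple.
print(cone_responses(510))   # greenish light: M cones respond most
print(cone_responses(610))   # reddish light: L cones respond most
```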

The picture is upside-down on the retina, but your brain doesn't care. To know where things are, it only needs consistency. "People always ask about that," says Michael McCloskey, a professor of cognitive science. "They're kind of assuming that there's some kind of homunculus inside, somebody looking at the image who needs to have it right-side up."

It works better to think of the retinal image as 100 million bits of raw data--nothing the brain can use till it's been deconstructed and recoded into dozens and dozens and dozens of categories. These the cortex will finally reassemble some 4 to 5 hundredths of a second later. ("Be sure to step around that chair.")

Deconstruction: The retina itself starts the sorting, as many millions of signals pass through its two layers of neurons to converge on only one million ganglion cells. Random neural firings (noise) don't make the cut, because the ganglion cells pass along only strong signals. Nor do stimuli reporting more of the same. Rather, the ganglion cells single out indications of movement, edges, and color contrast, as coded by the retinal neurons.
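As a cartoon of that compression, here is a toy "ganglion layer" that compares each small patch's center to its surround and passes along only strong disagreements--the signature of an edge or a contrast. (The one-dimensional retina, the five-to-one convergence, and the threshold are all illustrative assumptions.)

```python
# Cartoon of retinal compression: many receptors feed one ganglion-like
# unit that fires only where center and surround disagree (contrast),
# so uniform regions and weak noise never leave the retina.
import numpy as np

def ganglion_layer(receptors, threshold=0.3):
    """Downsample a 1-D 'retina', keeping only strong center-surround signals."""
    signals = []
    for i in range(2, len(receptors) - 2, 5):   # ~5 receptors per ganglion
        center = receptors[i]
        surround = (receptors[i-2:i+3].sum() - center) / 4.0
        contrast = center - surround
        signals.append(contrast if abs(contrast) > threshold else 0.0)
    return np.array(signals)

retina = np.zeros(100); retina[48:] = 1.0                        # one edge
retina += 0.05 * np.random.default_rng(1).standard_normal(100)   # noise
out = ganglion_layer(retina)
print(len(retina), "receptors ->", len(out), "ganglion signals,",
      (out != 0).sum(), "of them nonzero (near the edge)")
```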

From the retina, the picture goes to the brain per se?

Not yet. First comes a structure in the thalamus that is much like a layer-cake: six copies of the isomorphic image, one stacked precisely on top of the other. This is the lateral geniculate nucleus, or LGN, and you can think of it as an organizer.

The layers are driven by the two eyes in alternation--three layers apiece--so the LGN can compare the two views, the better to see three dimensions (and therefore to know which stimuli belong together). But only two layers get clues about motion, destined for what neurologists call the "Where" pathway of the brain. The other four layers seize on stimuli to do with "What."
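A rough sketch of that wiring diagram, with invented labels (the class and field names below are illustrative, not anatomical terms):

```python
# Toy LGN: six stacked copies of the image, two motion-sensitive layers
# routed toward the Where pathway, four toward What (form and color).
# The eye assignments here stand in for the real alternation by layer.
from dataclasses import dataclass

@dataclass
class LGNLayer:
    index: int
    eye: str          # each layer is driven mainly by one eye
    pathway: str      # "Where" (motion) or "What" (form/color)

# Layers 1-2 are the motion-sensitive pair; 3-6 carry form and color.
LGN = [LGNLayer(i, eye=("left" if i in (1, 4, 6) else "right"),
                pathway=("Where" if i <= 2 else "What"))
       for i in range(1, 7)]

where_stream = [layer for layer in LGN if layer.pathway == "Where"]
what_stream = [layer for layer in LGN if layer.pathway == "What"]
print(len(where_stream), "layers feed Where;", len(what_stream), "feed What")
```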

The layers talk with one another, and the LGN talks with regions elsewhere in the brain. For example, it gets feedback from memory, presumably to help it process signals. Indeed, researchers have recently learned that traffic is heavier between the LGN and the cortex than between the LGN and eyeball.

From the LGN, streams for What and Where go separately to the striate cortex, where the isomorphic image will be analyzed for the orientation of edges. Specialized cells fire if stimulated by a line that is vertical, horizontal, at 122°, 127°, or whatever. Most signals for Where then go to the parietal lobe (in the crown of the head), and those for What to the temporal lobe (near the temples). Now the isomorphic picture transmutes into blips that remain enigmatic to researchers. But for the see-er, meaning emerges.
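A toy version of one such orientation-tuned cell, assuming a cosine-shaped tuning curve and a made-up bandwidth (real cortical tuning curves vary):

```python
# Toy orientation tuning: each cortical cell prefers one edge angle and
# fires in proportion to how close the stimulus angle comes. The cosine
# curve and the 30-degree bandwidth are illustrative choices.
import math

def firing_rate(stimulus_deg, preferred_deg, bandwidth_deg=30.0):
    """Response of a cell tuned to preferred_deg (orientation is mod 180)."""
    diff = (stimulus_deg - preferred_deg) % 180.0
    diff = min(diff, 180.0 - diff)            # angular distance, 0..90
    return max(0.0, math.cos(math.radians(diff) * (90.0 / bandwidth_deg)))

# A bank of cells tiling all orientations; a 127-degree line mostly
# drives the cells tuned near 127 degrees.
for preferred in range(0, 180, 15):
    print(f"{preferred:3d} deg cell: {firing_rate(127, preferred):.2f}")
```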

The striate cortex takes up some 20 percent of the cortex, more than is devoted to any other sense, and it used to be considered "the" visual cortex. In the last few years, however, neurologists have realized that in humans, about 60 percent of the cortical surface is involved in vision, which acts as a consultant for many other areas, such as the motor cortex. ("There's that mug.") Still, 60 percent is a lot of cortex, an amount that underlines the centrality of vision to the human animal.

Not all vertebrates rely so much on vision. Rats, for instance, use most of their brain to process smells. But for humans, what we see is the major way we understand and locate objects in the world, including ourselves.

Where does the conscious mind come in?

That's what neuroscientists would like to know, and do not. "The highest levels in the visual system are in the [infero]temporal lobe," explains Edward Conner, a neurophysiologist at the Krieger Center. "If there's any single place in the brain where you say, 'Oh, that's a cardinal,' that's it. But many people would argue that [conscious thought] is a function of the entire system."

It used to be thought that ultimately, for every concept or memory, there'd be a single cell. You'd have a single cell that recognized Grandma, for instance. Now, however, researchers think that conscious experience is diffuse, a matter of patterns, not single cells.

"If it's one neuron, one concept," says Hopkins neurologist John Hart, "my mental capacity is limited by my number of neurons. If it's patterns, I can keep on learning"--as people do. A single neuron may have several thousand connections, which points toward pattern. In addition, neurons die all the time, so encoding everything in single cells would be risky. Natural systems tend to be robust, with many backups and alternative pathways. It makes sense that the mind would be the same way. Otherwise, says Hart, "you'd lose too much."

So memories are stored as patterns?

Probably: patterns made from specific elements stored in specific regions of the brain. Whatever the system is, it is not mushy. To use a code again and again--codes for the color red, rightward movement, or a line angled at roughly 127°, for instance--the brain needs to know where each code is.

Let's take visual memory, which neuroscientists agree is primarily stored in the temporal lobes, by category. Each category constitutes a network of neuronal activation that stores "all the things we've learned since childhood," says Hart. These categories constitute our tools for visual thought, and they contribute to the famous adaptability of human beings, a species that can live in climates from the tropics to the polar ice cap. We all have much the same categories, Hart says, but their specific contents are learned.

Some categories are characteristics--color, shape, size, texture, movement, depth, "and maybe some things we don't have words for yet," says Hart. Others are objects: animals, plants, food, fruits, vegetables, letters, numbers, colors, body parts, and household objects.

The categories are anatomically separate in the brain, neurologists believe, on evidence from animals, and because the categories get disordered individually. For instance, a famous patient in Germany could not perceive movement. Rather, she would see a series of still pictures in which people and objects mysteriously appeared and disappeared. She'd see a car in the distance, then the car close by, but never the car moving. To cross streets, she had to figure out the speed of traffic from sounds.

Brain lesions in particular parts of the visual cortex tend to produce identical handicaps within a category: an inability to recognize colors, for instance, or fruits, or animals.

Is the brain structure of the categories similar for all of us?

That's controversial, but Hart thinks so. He's been testing cognitive function in pre-operative epilepsy patients for 10 years now, in order to help pinpoint tissues involved in memory and language, so the surgeons can avoid them. In Hart's experience, the occasional city-dwelling patient lacks a plant category. Other than that, he says, "we all put things in pretty much the same place. Memory seems to have a predisposition for particular locations. So there must be some kind of gross organization, hard-wired."

Otherwise, he argues, "why would lesions in the same part of the brain knock out the same category?" Also, the sensory systems work much the same in all of us, so why would the categories differ? Nature usually sticks with a design that works.

How does the entire image go back together again?

That's another question neuroscientists ask themselves. They call it the "binding problem."

One promising approach emphasizes "attention," now being studied by a number of researchers at Hopkins. The large question here is, When you pay attention to something visual--say your toddler's ball in front of striped wallpaper--what happens in your brain?

Other stimuli are being inhibited at the cortical level, as it turns out. Receptors respond to the wallpaper stripes, and signals reach the striate cortex, but then they peter out. The neural firing loses its vigor. So it is literally true that you can pay attention to only one thing at a time. Brain imaging shows that even when your attention flits, it flits to one thing at a time.
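One way to picture this--a sketch only, with arbitrary gain numbers rather than measured ones--is as a competition that attention biases: every stimulus gets an early cortical response, then the attended one is boosted and the rest are damped.

```python
# Toy "biased competition": cortical responses to everything in view,
# with attention boosting one object and inhibition damping the rest.
# The gain numbers are arbitrary; only the pattern matters.
def attend(responses, target, boost=1.2, suppression=0.3):
    """Return cortical firing after attention selects one stimulus."""
    return {obj: rate * (boost if obj == target else suppression)
            for obj, rate in responses.items()}

early = {"red ball": 0.9, "wallpaper stripes": 0.8, "doorframe": 0.5}
print(attend(early, "red ball"))
# The stripes still hit the striate cortex (0.8 -> 0.24), but the
# signal "peters out" downstream: one attended thing at a time.
```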

But, you may say, the ball's roundness, redness, and movement are all in different categories. How does the brain gather them up? And how does it know where to go, or which stimuli to knock out?

One theory, among several being worked on at Hopkins, holds that as something (the ball) fires groups of neurons, their ons and offs oscillate at a particular rate--which the brain uses to identify scattered neurons all responding to the same object. And it dampens everything else.
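Here is a toy of that synchrony idea: the cells driven by the ball oscillate in lockstep, the wallpaper's cells march to a different beat, and a simple correlation recovers which cells belong together. (The frequencies, the noise level, and the cell labels are all illustrative.)

```python
# Toy binding-by-synchrony: neurons driven by the same object oscillate
# in phase; correlating their firing recovers which cells "belong
# together". Frequencies, phases, and noise are illustrative.
import numpy as np

t = np.linspace(0, 1, 1000)
rng = np.random.default_rng(2)

def neuron(freq_hz, phase):
    """Oscillating firing rate plus noise."""
    return (np.sin(2 * np.pi * freq_hz * t + phase)
            + 0.3 * rng.standard_normal(t.size))

cells = {
    "ball-red":    neuron(40.0, 0.0),   # the ball's features share a rhythm
    "ball-round":  neuron(40.0, 0.0),
    "ball-moving": neuron(40.0, 0.0),
    "stripes":     neuron(33.0, 1.7),   # the wallpaper has its own beat
}

names = list(cells)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = np.corrcoef(cells[names[i]], cells[names[j]])[0, 1]
        print(f"{names[i]:12s} ~ {names[j]:12s}: correlation {r:+.2f}")
```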

