Our long-term goal is to reverse-engineer the human brain in order to understand how a system can construct the complex phenomenon that we call vision.

Many approaches to vision focus on its static aspects and analyze how the three-dimensional structure of the world is estimated from the two-dimensional images on the retina. However, because of movements of the observer, of the eyes, and of objects, natural vision is highly dynamic, as emphasized by researchers such as Josef Ternus, Gunnar Johansson, and James J. Gibson. The projection of the three-dimensional world on our retinae (proximal stimulus) undergoes complex real-time changes that depend on both the properties of our environment (distal stimulus) and our own movements.

REFERENCE-FRAMES AND THE METRIC OF VISUAL REPRESENTATIONS

Thus, the visual system needs to select appropriate reference frames and metrics in real time in order to disentangle the properties of the environment from those that result from our own actions.

Retinotopy, the initial representation in the visual system: The optics of the eyes map the three-dimensional environment onto two-dimensional images on the retina. The spatial layout of these images, known as retinotopy, is preserved in early visual areas of the cortex.

 

EXAMPLE OF RETINOTOPY: Spatially neighboring stimuli are mapped onto neighboring regions in early visual cortex. Top panels: Stimuli viewed by the observer; bottom panels: Areas activated by each stimulus (color coded) in early visual cortex.

 

Most approaches to vision build perceptual representations by spatial and motion mechanisms that operate on these retinotopic representations. However, it has long been known that a retinotopic image is neither necessary nor sufficient for the perception of form. When a moving object is viewed behind a narrow slit cut out of an opaque surface (anorthoscopic perception), all information about the moving object’s shape collapses temporally onto a narrow retinotopic locus in a fragmented manner, i.e., there is no spatially extended retinotopic image of the shape. Yet, observers perceive a spatially extended and perceptually integrated shape moving behind the slit instead of a series of fragmented patterns confined to the region of the slit. Anorthoscopic perception thus shows that a retinotopic image is not necessary for the perception of form.
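The geometry of slit viewing can be sketched in a few lines of Python. This is only an illustration of the stimulus situation, not a model of how the visual system solves it: the toy shape, the function names, and the assumption that the object's motion (one column per time step, rightward) is known are all introduced here for the example.

```python
# Toy shape on a small grid; '#' marks the object, '.' the background.
SHAPE = [
    "..##..",
    ".#..#.",
    "#....#",
    "######",
    "#....#",
]


def slit_views(shape):
    """Return what is visible through a one-column slit at each time step.

    The shape is assumed to move rightward by one column per step, so its
    rightmost column reaches the slit first. Every view falls on the SAME
    retinotopic location: the slit.
    """
    n_cols = len(shape[0])
    views = []
    for t in range(n_cols):
        col = n_cols - 1 - t  # column of the shape currently behind the slit
        views.append([row[col] for row in shape])
    return views


def integrate(views):
    """Reassemble the views in a frame that moves with the object.

    Assumes the motion is known, so the view seen at time t can be placed
    at the object column it came from.
    """
    n_cols = len(views)
    n_rows = len(views[0])
    grid = [[" "] * n_cols for _ in range(n_rows)]
    for t, view in enumerate(views):
        col = n_cols - 1 - t
        for r, pixel in enumerate(view):
            grid[r][col] = pixel
    return ["".join(row) for row in grid]


views = slit_views(SHAPE)
recovered = integrate(views)
print("\n".join(recovered))
```

In retinotopic coordinates the input is a sequence of single columns at one location; a spatially extended shape appears only once the columns are placed in a frame that moves with the object.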

ANORTHOSCOPIC PERCEPTION

Conversely, the visibility of a “target stimulus” can be completely suppressed by a retinotopically non-overlapping “mask stimulus” presented in the spatio-temporal vicinity of the target, phenomena known as paracontrast and metacontrast masking (Bachmann, 1984; Breitmeyer & Öğmen, 2006). These masking effects indicate that a retinotopic image is not sufficient for the perception of form and that the dynamic context in which the stimulus is embedded plays a major role in determining whether form perception takes place.

NON-RETINOTOPIC VISION

Our research shows that many processes, such as form and motion perception, visual search, and attention, hitherto thought to operate on retinotopic representations, operate instead on non-retinotopic representations.

WHY NON-RETINOTOPIC VISION? THE PROBLEMS OF MOTION BLUR AND MOVING GHOSTS

The visible persistence of a briefly presented stationary stimulus is approximately 120 ms under normal viewing conditions (e.g., Haber & Standing, 1970; see also Coltheart, 1980). Based on this duration of visible persistence, one would expect moving objects to appear highly blurred. For example, a target moving at a speed of 10 deg/s should generate a comet-like trailing smear of 1.2 deg extent. The situation is similar to pictures of moving objects taken at an exposure duration that mimics visible persistence. As illustrated below, in such a picture, stationary objects are relatively clear but moving objects exhibit extensive blur.
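The arithmetic behind this estimate can be sketched as follows. The function name and its parameters are illustrative, and the calculation assumes that the smear length is simply speed multiplied by persistence duration, with no deblurring mechanism at work.

```python
def blur_extent_deg(speed_deg_per_s: float, persistence_ms: float = 120.0) -> float:
    """Length of the trailing smear in degrees of visual angle.

    Assumes smear = speed x visible persistence, with persistence fixed
    at the approximate 120 ms value quoted for stationary stimuli.
    """
    return speed_deg_per_s * persistence_ms / 1000.0


# The 10 deg/s case from the text, plus a slower and a faster target.
for speed in (1.0, 10.0, 30.0):
    print(f"{speed:5.1f} deg/s -> {blur_extent_deg(speed):.2f} deg of smear")
```

Under this assumption the smear grows linearly with target speed, so the 10 deg/s target yields the 1.2 deg extent mentioned above.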

ILLUSTRATION OF MOTION BLUR AND MOVING GHOSTS PROBLEMS (Reproduced from Öğmen (2007), Adv Cogn Psychol, 3(1-2): 67–84. Original photo from FreeFoto.com, by permission).

Unlike photographic images, however, visual objects in motion typically appear relatively sharp and clear (e.g., Bex, Edgar, & Smith, 1995; Burr & Morgan, 1997; Farrell, Pavel, & Sperling, 1990; Hammett, 1997; Hogben & Di Lollo, 1985; Ramachandran, Rao, & Vidyasagar, 1974; Westerink & Teunissen, 1995).

Our research suggests that the extent of motion blur is controlled by visual masking mechanisms in retinotopic representations.

However, masking mechanisms only partly solve the motion blur problem. In the picture above, masking mechanisms would make the motion streaks appear shorter, thereby reducing the amount of blur. Yet, although deblurred, moving objects would still have a ghost-like appearance. Compare the appearance of fast-moving targets, slow-moving targets, and stationary objects: rapidly moving objects have a ghost-like appearance without any significant form information, slowly moving objects have a more developed form, and static objects have the clearest form. This is because static objects remain on a fixed region of the film long enough to sufficiently expose the chemicals, while moving objects expose each part of the film only briefly and thus fail to provide sufficient exposure to any specific part of the film. Similarly, in retinotopic space, a moving object stimulates each retinotopically localized receptive field only briefly, and incompletely processed form information would spread across the retinotopic space, just like the ghost-like appearances in the picture. We call this the “problem of moving ghosts”. We hypothesize that information about the form of moving targets is conveyed to a non-retinotopic space where it can accrue over time, allowing neural processing to synthesize shape information.
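The film analogy can be made concrete with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration (the `expose` function, the pixel and frame counts, and the idea that an object-centered frame can be obtained by subtracting a known displacement); it is not a model of the neural mechanism.

```python
def expose(width, object_width, speed, n_frames, object_centered=False):
    """Accumulate exposure on a 1-D film (or array of receptive fields).

    A bar `object_width` pixels wide shifts by `speed` pixels per frame.
    Each frame adds one unit of exposure to every cell the bar covers.
    With object_centered=True, positions are measured relative to the bar
    itself, i.e. in a reference frame that moves with the object.
    """
    film = [0] * width
    for t in range(n_frames):
        start = t * speed
        origin = start if object_centered else 0
        for x in range(start, start + object_width):
            cell = x - origin
            if 0 <= cell < width:
                film[cell] += 1
    return film


N_FRAMES = 12
static = expose(width=60, object_width=4, speed=0, n_frames=N_FRAMES)
slow = expose(width=60, object_width=4, speed=1, n_frames=N_FRAMES)
fast = expose(width=60, object_width=4, speed=4, n_frames=N_FRAMES)
tracked = expose(width=60, object_width=4, speed=4, n_frames=N_FRAMES,
                 object_centered=True)

for name, film in [("static", static), ("slow", slow),
                   ("fast", fast), ("fast, object-centered", tracked)]:
    print(f"{name:>22}: peak exposure {max(film)}, total {sum(film)}")
```

The total exposure is identical in every case; what differs is how it is distributed. In the fixed frame, the faster the bar moves, the more thinly its exposure is spread across cells, whereas in the frame that moves with the bar the same input accrues at fixed cells regardless of speed.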

COPERNICAN REVOLUTIONS IN THE BRAIN: REFERENCE FRAMES AND IMPLIED PERCEPTUAL AND COGNITIVE REVOLUTIONS

If retinotopic representations are mapped to non-retinotopic representations, what, then, are the appropriate representations to use in non-retinotopic spaces? Since the problem itself arises from a variety of motions, its solution can be found in reference frames built according to motion patterns.

To appreciate this, consider the problem faced by astronomers prior to the 16th century. They placed the Earth at the center of their system and tried to express planetary motions in this reference frame. This resulted in an overly complex system containing epicycles. A fundamental revolution occurred when Copernicus proposed a system in which the reference frame shifted from the Earth to the Sun.

In developmental psychology, Jean Piaget used the Copernican revolution analogy to highlight the shift of children’s perceptual and cognitive structures from self-centered (egocentric) reference frames to exocentric reference frames: “the child eventually comes to regard himself as an object among others in a universe that is made up of permanent objects (that is, structured in a spatio-temporal manner) and in which there is at work a causality that is both localized in space and objectified in things”.

Our research investigates the nature of non-retinotopic reference frames and their implications for perceptual and cognitive processes. Our recent and past findings can be found on the Publications page.