Our long-term goal is to reverse-engineer the human brain in order to understand how a system can construct the complex phenomenon that we call vision.
Many approaches to vision focus on its static aspects and analyze how the three-dimensional structure of the world is estimated from the two-dimensional images on the retina. However, because of the movements of the observer, of the eyes, and of objects in the environment, natural vision is clearly highly dynamic, as emphasized by researchers such as Joseph Ternus, Gunnar Johansson, and James J. Gibson. The projection of the three-dimensional world onto our retinae (the proximal stimulus) undergoes complex real-time changes that depend both on the properties of our environment (the distal stimulus) and on our own movements.
REFERENCE-FRAMES AND THE METRIC OF VISUAL REPRESENTATIONS
Thus, the visual system needs to select appropriate reference frames and metrics in real time in order to disentangle the properties of the environment from those that result from our own actions.
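As a purely illustrative sketch (not part of the argument above, and using a simplified one-dimensional, small-angle approximation), the following few lines show why the choice of reference frame matters: motion on the retina confounds object motion with eye motion, and recovering motion in a world-centered, non-retinotopic frame requires discounting the eye's own movement.

```python
# Minimal sketch (illustration only; 1-D angular velocities, small-angle
# approximation): retinal motion mixes object motion and eye motion.

def retinal_velocity(object_velocity_deg_s, eye_velocity_deg_s):
    """Approximate retinal image velocity (deg/s):
    what moves on the retina is the object's motion minus the eye's motion."""
    return object_velocity_deg_s - eye_velocity_deg_s

def world_velocity(retinal_velocity_deg_s, eye_velocity_deg_s):
    """Change of reference frame: add the eye's motion back to the retinal
    signal to estimate the object's motion relative to the world."""
    return retinal_velocity_deg_s + eye_velocity_deg_s

# Example: the eye smoothly pursues a target moving at 10 deg/s.
obj, eye = 10.0, 10.0
r = retinal_velocity(obj, eye)     # ~0 deg/s: the pursued target barely moves
print(r, world_velocity(r, eye))   # on the retina, yet in a world-centered
                                   # frame it moves at ~10 deg/s.
```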
Retinotopy, the initial representation in the visual system: The optics of the eyes map the three-dimensional environment onto two-dimensional images on the retina. This spatially ordered, two-dimensional layout, known as retinotopy, is preserved in the early visual areas of the cortex.
EXAMPLE OF RETINOTOPY: Spatially neighboring stimuli are mapped on neighboring regions in early visual cortex. Top panels: Stimuli viewed by the observer; bottom panels: Areas activated by each stimulus (color coded) in early visual cortex.
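As a rough illustration of this mapping (a minimal pinhole-camera sketch; the eye's actual optics are more complex, and the 17 mm focal length is only an assumed approximation of the nodal-point-to-retina distance), nearby points in the three-dimensional scene project to nearby two-dimensional retinal positions, which is the spatial ordering that retinotopic cortical maps preserve:

```python
# Pinhole-camera sketch of retinotopic projection (illustration only).
# A 3-D point (X, Y, Z) projects to a retinal position roughly proportional
# to (X/Z, Y/Z); neighboring points in the scene land on neighboring
# retinal positions.

def project(X, Y, Z, f=0.017):
    """Perspective projection with an assumed focal length f (meters),
    roughly the distance from the eye's nodal point to the retina."""
    return (f * X / Z, f * Y / Z)

# Two neighboring points at a viewing distance of 1 m ...
p1 = project(0.00, 0.00, 1.0)
p2 = project(0.01, 0.00, 1.0)
# ... project to neighboring retinal locations (about 0.17 mm apart here).
print(p1, p2)
```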
Most approaches to vision build perceptual representations by applying spatial and motion mechanisms to these retinotopic representations. However, it has long been known that a retinotopic image is neither necessary nor sufficient for the perception of form: When a moving object is viewed behind a narrow slit cut out of an opaque surface (anorthoscopic perception), all information about the moving object’s shape collapses temporally onto a narrow retinotopic locus in a fragmented manner, i.e., there is no spatially extended retinotopic image of the shape. Yet, observers perceive a spatially extended and perceptually integrated shape moving behind the slit rather than a series of fragmented patterns confined to the region of the slit. Anorthoscopic perception thus shows that a retinotopic image is not necessary for the perception of form (a toy sketch of the slit-viewing geometry is given below).

ANORTHOSCOPIC PERCEPTION
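The following toy simulation (an illustrative sketch only, not a model of neural processing) makes the slit-viewing geometry concrete: while the object moves, only a single retinotopic column is ever stimulated, yet collecting those samples over time and aligning them according to the object's motion recovers the complete shape.

```python
# Toy illustration of anorthoscopic viewing: a 2-D shape drifts behind a
# one-pixel-wide slit. At any instant only a single retinotopic column is
# stimulated, yet the full shape can be recovered by collecting those
# columns over time and aligning them according to the object's motion.
import numpy as np

# Hypothetical 8x8 binary "object" (a hollow square).
shape = np.zeros((8, 8), dtype=int)
shape[0, :] = shape[-1, :] = shape[:, 0] = shape[:, -1] = 1

slit_samples = []                    # what falls on the slit at each time step
for t in range(shape.shape[1]):      # object moves one column per time step
    slit_samples.append(shape[:, t].copy())   # only this column is visible now

# Non-retinotopic integration: place the column sampled at time t back at
# position t, i.e., align the samples using the (known) motion.
reconstructed = np.stack(slit_samples, axis=1)

assert np.array_equal(reconstructed, shape)
print(reconstructed)
```

The point of the sketch is only that the shape information is fully available in the temporal sequence at a single retinal locus; how the visual system actually carries out this non-retinotopic integration is precisely the research question.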
The visibility of a “target stimulus” can be completely suppressed by a retinotopically non-overlapping “mask stimulus” presented in the spatio-temporal vicinity of the target, phenomena known as para- and metacontrast masking (Bachmann, 1984; Breitmeyer & Öğmen, 2006). These masking effects indicate that a retinotopic image is not a sufficient condition for the perception of form, and that the dynamic context within which the stimulus is embedded plays a major role in determining whether form perception takes place.
NON-RETINOTOPIC VISION
Our research shows that many phenomena, such as form and motion perception, visual search, and attention, hitherto thought to operate on retinotopic representations, occur instead in non-retinotopic representations.
WHY NON-RETINOTOPIC VISION? THE PROBLEMS OF MOTION BLUR AND MOVING GHOSTS
The visible persistence of a briefly presented stationary stimulus is approximately 120 ms under normal viewing conditions (e.g., Haber & Standing, 1970; see also Coltheart, 1980). Based on this duration of visible persistence, one would expect moving objects to appear highly blurred: for example, a target moving at 10 deg/s should generate a comet-like trailing smear extending 1.2 deg. The situation is analogous to a photograph of a moving scene taken with an exposure duration that mimics visible persistence. As illustrated below, in such a picture stationary objects are relatively sharp whereas moving objects exhibit extensive blur.
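The expected extent of the smear is simply speed multiplied by persistence duration; the numbers used in the text work out as follows:

```python
# Worked version of the numbers in the text: with ~120 ms of visible
# persistence, a moving target should, in principle, leave a trailing smear
# whose extent is speed x persistence.

persistence_s = 0.120      # visible persistence (~120 ms; Haber & Standing, 1970)
speed_deg_per_s = 10.0     # example target speed from the text

expected_smear_deg = speed_deg_per_s * persistence_s
print(expected_smear_deg)  # 1.2 deg of expected motion blur
```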
