Multistage Model for Binocular Rivalry
Abstract
Binocular rivalry is the alternating perception that occurs when incompatible stimuli are presented to the two eyes: one monocular stimulus dominates vision and then the other stimulus dominates, with a perceptual switch occurring every few seconds. There is a need for a binocular rivalry model that accounts both for well-established results on the timing of dominance intervals and for more recent evidence on the distributed neural processing of rivalry. The model for binocular rivalry developed here consists of four parallel visual channels, two driven by the left eye and two by the right. Each channel consists of several consecutive processing stages representing successively higher cortical levels, with mutual inhibition between the channels at each stage. All stages are architecturally identical. With n the number of stages, the model is implemented as 4n nonlinear differential equations using a total of eight parameters. Despite the simplicity of its architecture, the model accounts for a variety of experimental observations: 1) the increasing depth of rivalry at higher cortical areas, as shown in electrophysiological, imaging, and psychophysical experiments; 2) the unimodal probability density of dominance durations, where the mode is less than the mean; 3) the lack of correlation between successive dominance durations; 4) the effect of interocular stimulus differences on dominance duration; and 5) eye suppression, as opposed to feature suppression. The model is potentially applicable to issues of visual processing more general than binocular rivalry.
INTRODUCTION
When incompatible stimuli (such as orthogonal gratings) are presented to the two eyes, they are not fused into a single image. Instead, the monocular stimuli take it in turns to dominate perception. This phenomenon, binocular rivalry, provides a valuable means of studying the perceptual process because it involves a changing percept without any change in the visual stimulus. There has been a surge of interest in binocular rivalry over the last ten years, largely because physiological and imaging experiments have shown that rivalry is a process distributed across a hierarchy of visual cortical areas (Alais and Blake 2004). Binocular rivalry modulates neural activity in the primary visual cortex (Polonsky et al. 2000; Tong and Engel 2001) and in areas in higher cortex including V2 and V4 (Leopold and Logothetis 1996), MT (Logothetis and Schall 1989), inferior temporal cortex and superior temporal sulcus (Sheinberg and Logothetis 1997), and other high-order areas (Tong et al. 1998).
Modeling and predictive studies, however, have not kept pace with the empirical work. Early models consisted of two channels, one for each eye, and a single processing stage (Lehky 1988; Sugie 1982). These models and more recent ones with similar architecture (Laing and Chow 2002) were able to reproduce the stochastic alternation between one percept and the other (Fox and Herrmann 1967). Lumer (1998) also used two channels in his model, but expanded the number of stages to four. Rivalry in this model produced larger neural modulation at the later stages than in the earlier stages, replicating some of the results from experiments on behaving monkeys (Sheinberg and Logothetis 1997). More recently, Wilson (2003) described a four-channel, two-stage model. Two of the channels were driven by the left eye, two by the right, and for each eye, one channel was selective for one stimulus feature and the second channel was selective for an incompatible feature. This model was able to account for the observation that the perceptual alternation typical of binocular rivalry can occur when incompatible stimuli are swapped between the eyes several times a second (Logothetis et al. 1996).
The model described here goes a step further. It consists of four channels and multiple stages. Because all stages are architecturally identical, the number of stages can be set to match the empirical data. The model has three aims:
1) to account for well-established findings on the duration and independence of dominance intervals in rivalry, and on the effect of differing stimulus strength in the two eyes;
2) to contribute to the ongoing debate about the nature of binocular rivalry suppression: low-level and monocular (eye suppression) or high-level (feature, or stimulus, suppression);
3) most importantly, to account for the growing electrophysiological, imaging, and psychophysical evidence that rivalry is a process distributed across a number of visual areas.
Two principles have been used in designing the model. First, it is designed to be as simple as possible, and to produce explanatory and predictive power rather than exact replication of published data. The result is a model with eight parameters, each of which has an easily understood role in the rivalry process. Second, the model is designed to be modular, in that it is built by putting together a sequence of modules, each of which can be separately analyzed. The model is a quantitative version of previous qualitative models (Freeman and Morley 1997; Nguyen et al. 2001).
METHODS
Two-channel model
MODEL ARCHITECTURE.
The full model, consisting of four channels, is described below. To simplify explanation, we start with the two-channel model shown in Fig. 1. One channel, denoted A, is selective for a particular stimulus property (e.g., horizontal contours) and is driven by stimuli presented to the left eye. The other channel, B, is selective for an incompatible property (e.g., vertical contours) and is driven by stimuli to the right eye. Stage 1 represents monocular cells in layer 4C of primary visual cortex, but the locations of the other stages are left undefined. Stage 2, for example, could represent simple cells in primary visual cortex. The general stage, considered below, has subscript k, which is an integer less than or equal to the total number of stages n.

FIG. 1. Two-channel model. A: channel A is selective for one stimulus feature (e.g., horizontal contours) and is driven by the left eye. Channel B is selective for a different feature (e.g., vertical contours) and is driven by the right eye. These 2 channels are mutually inhibitory by way of feedback connections at each stage. B: this diagram shows signal processing at stage k of channel A, by cell Ak. Cell receives 3 synaptic inputs, sums the weighted inputs, and integrates the sum to produce the postsynaptic potential pk. This potential passes through a soft threshold to form the action potential rate ak.
SINGLE STAGE.
Early investigations of binocular rivalry led to a search for a model that could explain the empirical findings. A constant in this search was mutual inhibition. It was assumed that incompatible stimuli to the two eyes activated two neural populations that inhibited each other (Lehky 1988; Sugie 1982). Despite the assumption of mutual inhibition in most of the existing models for binocular rivalry, there is little or no direct evidence for such inhibition. The circumstantial evidence, however, is compelling: when one eye's stimulus is visible, the other's is not. For this reason alone, mutual inhibition will be assumed in the model developed here.
Ak, representing a single cell on channel A at stage k, is illustrated in Fig. 1B. The cell has three synaptic inputs that it weights (positive for excitatory input, negative for inhibition), sums, and integrates over time to form a postsynaptic potential pk. The postsynaptic potential must exceed a threshold to produce the action potential rate ak. Both postsynaptic potential and action potential rate are functions of time, but this dependency on time will not be shown explicitly in what follows. Now assume a second cell, Bk, on channel B at the same stage, with action potential rate bk. Cells Ak and Bk are assumed to be mutually inhibitory, as illustrated in Fig. 1A for k = 1, 2. The rate of change of postsynaptic potential pk with time, dpk/dt, is defined by
(1)
(2)
The postsynaptic potential pk must reach a threshold level pt to produce action potentials. The conversion of postsynaptic potential to action potential rate is defined by the following equation
(3)
The equations for the postsynaptic potential qk and action potential rate bk in cell Bk are of the same form as those for cell Ak
(4)
(5)
SYNAPTIC WEIGHTS.
The model, as defined above, has three weights: wes, wis, and wib. One weight can be expressed in terms of the other two as follows. Assume that the average output of cells Ak and Bk is greater than their average input. All stages in Fig. 1 are architecturally identical, so that a growth of output across a single stage will lead to an average output at the final stage that is much larger than the average input to the first stage. This is physiologically untenable: we do not expect action potential rates to differ drastically from visual area to area. By the same argument, we do not expect the average output of a stage to be much less than its average input.
The conclusion, therefore, is that the average output of a stage is similar to its average input. This conclusion applies to both time-varying activity and steady state. At steady state, the average inputs and outputs are obtained by setting the derivatives in Eqs. 1 and 4 to zero, and adding the resulting equations
(6)
(7)
Four-channel model
The model described thus far consists of two channels. Such a model cannot address some of the important results in the binocular rivalry literature, such as stimulus rivalry (Logothetis et al. 1996). These results derive from experiments using two incompatible stimuli presented to the same eye at different times. If the model is to account for experiments such as these, it must have two channels for each eye, with one channel selective for each of the incompatible stimuli.
MODEL ARCHITECTURE.
The four-channel model, illustrated in Fig. 2, uses the two-channel model as foundation. The two channels driven by the left eye are 1 and 3, where channel 1 is selective for one stimulus feature (e.g., horizontal contours) and channel 3 is selective for an incompatible feature (e.g., vertical contours). The other two channels are driven by the right eye. The connections for the cell on channel 1 at stage 2, A12, are also shown in the figure. The cell receives excitatory input from the previous stage with weight wes, where e stands for excitatory and s for self. It also receives excitatory input from the channel with the same stimulus selectivity but driven by the other eye; this connection represents binocular summation. The synaptic weight in this case is weo, where o stands for interocular. The inhibitory inputs to a cell are all within the same stage, consistent with the evidence that inhibitory connections do not extend between visual areas (Bullier 2004). The weights of these inputs indicate their sources: wif (f for interfeature inhibition), wib (b for both interfeature and interocular inhibition), wio, and wis.
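The connection scheme just described can be collected into two 4 × 4 matrices, one for excitation from the previous stage and one for inhibition within a stage. The sketch below is illustrative only: the function name, the indexing arrays, and the numerical weights are assumptions (the parameter values of Eq. 15 are not reproduced in this text), and the matrix layout is inferred from the description of cell A12 rather than copied from the model equations.

```python
import numpy as np

# Channel order follows the text: channels 1 and 3 are driven by the left eye,
# channels 2 and 4 by the right eye; channels 1 and 2 share one feature,
# channels 3 and 4 share the incompatible feature.
EYE = np.array([0, 1, 0, 1])       # 0 = left eye, 1 = right eye
FEATURE = np.array([0, 0, 1, 1])   # 0 = first feature, 1 = incompatible feature


def weight_matrices(wes, weo, wis, wif, wio, wib):
    """Return (E, I), hypothetical 4x4 weight matrices for one stage.

    E[i, j] weights the previous-stage output of channel j+1 onto channel i+1.
    I[i, j] weights the same-stage output of channel j+1 onto channel i+1.
    """
    same_eye = EYE[:, None] == EYE[None, :]
    same_feature = FEATURE[:, None] == FEATURE[None, :]

    E = np.zeros((4, 4))
    E[same_eye & same_feature] = wes       # self excitation
    E[~same_eye & same_feature] = weo      # binocular summation

    I = np.zeros((4, 4))
    I[same_eye & same_feature] = wis       # self inhibition
    I[same_eye & ~same_feature] = wif      # interfeature inhibition
    I[~same_eye & same_feature] = wio      # interocular inhibition
    I[~same_eye & ~same_feature] = wib     # both interfeature and interocular
    return E, I


# Placeholder weights for illustration only; they are NOT the paper's values.
E, I = weight_matrices(wes=1.0, weo=0.2, wis=0.6, wif=0.05, wio=0.2, wib=0.4)
```

With weo, wio, and wif set to zero, only the self connections and the wib coupling remain, so that channels 1 and 4 (for example) behave as the two-channel model.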

FIG. 2. Four-channel model. Two of the channels (1 and 2) are selective for one stimulus feature and the other 2 channels for a different feature. Left eye drives 2 channels (1 and 3), and the right eye the other 2 channels. Each cell receives 2 excitatory synaptic inputs from the previous stage and 4 inhibitory inputs from its own stage, as illustrated for cell A12. Weights w of the synaptic inputs are shown for this cell.
MODEL EQUATIONS.
The equations for a stage in this model are best given in matrix terms. With pjk and ajk representing the postsynaptic potential and action potential rate, respectively, for channel j at stage k
(8)

(9)
(10)
(11)
(12)
MODEL INPUTS.
The input to each channel is a sum of stimulus and noise components, s and n, respectively
(13)
(14)
MODEL OUTPUTS.
The model is compared with psychophysical data involving two types of decision process. The model's output differs between these two cases.
In studies of the dynamics of rivalry (Fig. 5), and of the effect of unequal stimulus strength in the two eyes (Fig. 6), experimental subjects indicate the intervals during which a given stimulus is dominant. In the model, the dominant stimulus at any given time is defined to be that driving the channel whose final stage has the highest output action potential rate.
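The dominance rule above can be sketched as follows. This is a minimal illustration, assuming the final-stage rates have been sampled on a regular time grid; the function name and array layout are hypothetical, and ties are resolved in favor of the lower-numbered channel simply because that is how `argmax` behaves.

```python
import numpy as np


def dominance_intervals(final_rates, dt):
    """Split a simulation into dominance intervals.

    final_rates: array of shape (n_samples, n_channels) holding the
                 final-stage action potential rate of each channel.
    dt:          sampling interval in seconds.
    Returns a list of (channel_index, duration_in_seconds) tuples, one per
    run of consecutive samples with the same dominant channel.
    """
    final_rates = np.asarray(final_rates, dtype=float)
    if len(final_rates) == 0:
        return []
    # Dominant channel at each sample: the one with the highest final-stage rate.
    dominant = np.argmax(final_rates, axis=1)
    # Indices at which the dominant channel changes.
    change = np.flatnonzero(np.diff(dominant)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(dominant)]))
    return [(int(dominant[s]), float((e - s) * dt)) for s, e in zip(starts, ends)]


# Toy two-channel example with a 5-ms sampling interval.
toy_rates = np.array([[2, 1], [3, 1], [1, 2], [1, 3], [1, 4], [5, 1]], dtype=float)
toy_intervals = dominance_intervals(toy_rates, dt=0.005)
```

The first and last intervals returned are truncated by the start and end of the run, so an analysis of duration statistics may want to drop them.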
The remaining experimental studies (Figs. 4 and 7) measure discrimination sensitivities for monocular test stimuli delivered during either a dominance or suppression phase of the tested eye. For the model, it is assumed that the discrimination process is based on activity at a specific stage, the discrimination stage, which is not necessarily the final one. Sensitivity was determined as follows. First, the dominance intervals for a channel were found and the maximum action potential rate in that channel's discrimination stage was determined for each interval. Next, if these maxima occurred more frequently than 1 per 7 s (the average rate at which subjects triggered test stimuli in the experiments), the smallest maxima in excess of this rate were discarded. The remaining maxima were averaged. Finally, this action potential rate was converted to a sensitivity by delivering small increments to the channel's input and finding the resulting change in action potential rate at the discrimination stage; sensitivity is output increment divided by input increment. The procedure for suppression intervals was the same, except that minimum action potential rate was found in each interval and enough of the largest minima were discarded to ensure that the remaining ones occurred at 1 per 7 s.
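The selection and averaging of per-interval extrema described above might be sketched as below. The function and argument names are hypothetical, and the rounding rule (keep at most the whole number of 7-s periods in the run, and at least one value) is an assumption, since the text does not state how fractional counts were handled.

```python
import numpy as np


def mean_retained_extremum(extrema, run_time, keep="largest", seconds_per_test=7.0):
    """Average per-interval extrema after capping their rate at 1 per 7 s.

    extrema:  one value per interval (maxima for dominance intervals,
              minima for suppression intervals).
    run_time: total simulated time in seconds.
    keep:     "largest" discards the smallest maxima (dominance case);
              "smallest" discards the largest minima (suppression case).
    """
    values = np.sort(np.asarray(extrema, dtype=float))
    # Assumed rounding rule: whole number of test periods, never fewer than one.
    max_count = max(1, int(run_time // seconds_per_test))
    if values.size > max_count:
        values = values[-max_count:] if keep == "largest" else values[:max_count]
    return float(values.mean())


# Toy example: five intervals in a 21-s run, so at most three values are kept.
dominant_rate = mean_retained_extremum([5, 9, 7, 3, 8], run_time=21.0, keep="largest")
suppressed_rate = mean_retained_extremum([1, 4, 2, 6, 3], run_time=21.0, keep="smallest")
```

The resulting mean rates are the operating points at which the input increments would then be delivered to obtain sensitivity.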
Numerical methods
Both the two- and four-channel models are defined by Eqs. 8, 9, 10, 12, 13, and 14. The parameter values for the four-channel model
(15)
FIG. 3. Time course of neural activity. Equations defining the 2-channel model were numerically integrated over a period of 10 s. Inputs to the A and B channels consisted of independent noise processes with equal mean, and are illustrated in the top graph. Normalized action potential rates of 3 model stages are shown in the remaining graphs. Difference between activity in the A channel (black lines) and B channel (gray lines) grows from the input to the output.

FIG. 4. Depth of binocular rivalry suppression. Nguyen et al. (2003) determined the depth of binocular rivalry suppression by measuring visual sensitivity to a monocular stimulus during both the dominance and suppression phases of rivalry. Gray lines show the sensitivity during suppression divided by that during dominance, as a function of the complexity of the discrimination performed by the subject; task complexity increases from left to right. Suppression depth grows with task complexity, a growth interpreted to indicate increased suppression depth along the visual pathway. The model reproduces this behavior. Sensitivity of each stage on channel A was measured during suppression, when its activity is lower than that of channel B, and during dominance, when its activity is higher than that of B. Black line gives sensitivity during suppression divided by that during dominance, as a function of stage number. Psychophysical data are laterally shifted to match them with the model.

FIG. 5. Statistics of dominance durations. A: dominance duration indicates the length of the time interval for which one monocular stimulus is continuously visible. Gray line shows a previously published probability density (Fox and Herrmann 1967) of dominance durations. Black line, calculated by running the model, matches the single mode and long tail of the experimental data. B: Fox and Herrmann (1967) determined the correlation between a dominance duration and the following duration (separation = 1), between durations 2 intervals apart (separation = 2), and for all other separations up to 10. Correlation is very low except in the (trivial) case in which an interval is correlated with itself (separation = 0). Model data (black line) closely match the empirical measurement (gray line).
The equations were numerically integrated with a fourth-order Runge–Kutta algorithm, using the Matlab (The MathWorks) programming language. The accuracy of the implementation was determined by independently coding the same equations as a Simulink flow diagram (Simulink is part of the Matlab suite) and verifying that the two implementations produced identical time courses. Simulations used a time step of 5 ms.
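For readers unfamiliar with the integration scheme, a classical fourth-order Runge–Kutta step with the 5-ms time step can be sketched as follows. This is a generic Python illustration, not the Matlab code used for the simulations; the function names are hypothetical, and the test equation (a single leaky integrator with an arbitrary time constant) merely stands in for the model equations, which are not reproduced here.

```python
import numpy as np


def rk4_step(f, t, y, dt):
    """Advance dy/dt = f(t, y) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt / 2 * k1)
    k3 = f(t + dt / 2, y + dt / 2 * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, y0, t_end, dt=0.005):
    """Integrate from t = 0 to t_end; returns an array of shape (n_steps + 1, dim)."""
    n_steps = int(round(t_end / dt))
    y = np.asarray(y0, dtype=float)
    trajectory = np.empty((n_steps + 1,) + y.shape)
    trajectory[0] = y
    for i in range(n_steps):
        y = rk4_step(f, i * dt, y, dt)
        trajectory[i + 1] = y
    return trajectory


# Stand-in test equation: exponential decay with a hypothetical time constant.
tau = 0.1  # seconds; illustrative value only
trajectory = integrate(lambda t, y: -y / tau, [1.0], t_end=0.5)
```

Checking a solver against a case with a known analytic solution, as here, serves the same purpose as the cross-check against a second implementation described above.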
RESULTS
Two-channel model
Some of the results described here require only two channels for their explanation. We therefore start with the two-channel model.
TIME COURSE.
Figure 3 shows the model's time course computed over 10 s. Action potential rate in the A channel is shown in black and that in the B channel in gray. Two important characteristics of the model's behavior can be seen in the figure. First, there are time intervals where the A channel's output is higher than that of B, indicating that the A channel's stimulus dominates perception. These intervals apparently vary randomly in duration, reminiscent of the rivalry process. Second, the difference between the action potential rate in channel A and that in channel B is small for the early stages and increases as activity progresses to later stages. These two aspects of the model will now be examined quantitatively.
AMPLIFICATION OF NEURAL MODULATION.
Several lines of evidence indicate that binocular rivalry modulates neural signals at later stages of the visual pathway more than it modulates earlier signals. The evidence comes from studies of the correlation between single-neuron recordings and perceptual reports in monkeys (Sheinberg and Logothetis 1997), functional magnetic resonance imaging in humans (Polonsky et al. 2000; Tong et al. 1998), and psychophysical experiments (Nguyen et al. 2003).
The model also produces deeper modulation of activity in its later stages, as illustrated in Fig. 3. This can be shown analytically as follows. Mutual inhibition in the model decreases activity in one channel relative to the other. We are therefore interested in the difference in activity between the two channels at one stage relative to the activity difference at a previous stage. Assume steady state, by setting to zero the derivatives in Eqs. 1 and 4. Subtraction of the second equation from the first, and using Eq. 7 yields
(16)
(17)
AMPLIFICATION OF SUPPRESSION.
The amplification of binocular rivalry suppression shown in Fig. 3 can be compared with experimental data. The gray lines in Fig. 4 show psychophysical data from Nguyen et al. (2003), who induced binocular rivalry and presented a brief test stimulus to measure visual sensitivity during dominance and suppression periods. The vertical axis in Fig. 4 shows the sensitivity during suppression divided by that during dominance, and the gap between the data and the dashed line therefore shows suppression depth. The test stimulus consisted of two lobed semicircles, and the subject's task was to discriminate between them. These two stimulus components were made progressively more alike to require more complex form discriminations and to thereby tap into decision processes at neural locations further along the visual pathway: task complexity increases from left to right along the horizontal axis. Suppression deepens with task complexity.
Predictions from the model were obtained by calculating its time course over 100 s. Channel A's sensitivity was calculated for both its dominant and suppressed states, as described in methods. The black line in Fig. 4 shows the sensitivity of the suppressed channel divided by that in the dominant channel, as a function of stage number. As with the experimental observations, suppression deepens from left to right. We do not know to what visual area each model stage corresponds, nor can we nominate the neural sites underlying the visual tasks of Nguyen et al. (2003). The best that can be done, therefore, is to shift the model data laterally so that they match the psychophysical data. The model data were not adjusted vertically; despite this, the model fits the psychophysical data quite closely. Why is there a decline in sensitivity during suppression? In the model, sensitivity losses originate in the nonlinear relationship between postsynaptic potential and action potential rate: low action potential rates lie on the low-gradient portion of the nonlinearity.
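The last point, that sensitivity falls because low rates sit on the shallow part of the nonlinearity, can be illustrated with a small numerical sketch. Everything specific here is an assumption: the softplus function is only a stand-in for the model's soft threshold (Eq. 3 is not reproduced in this text), the threshold, softness, and operating points are arbitrary, and only the static nonlinearity is probed rather than the full model.

```python
import numpy as np


def soft_threshold(p, p_t=1.0, softness=0.2):
    """Hypothetical soft threshold (softplus) mapping potential to rate."""
    return softness * np.logaddexp(0.0, (p - p_t) / softness)


def sensitivity(p, dp=1e-4):
    """Output increment divided by input increment at operating point p."""
    return (soft_threshold(p + dp) - soft_threshold(p)) / dp


# Illustrative operating points: below threshold (suppressed) and above it (dominant).
suppressed = sensitivity(0.8)
dominant = sensitivity(1.4)
suppression_ratio = suppressed / dominant  # below 1 when suppression reduces sensitivity
```

Any saturating-from-below nonlinearity would give the same qualitative result; the softplus is used only because its gradient rises smoothly through the threshold.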
DURATION OF DOMINANCE INTERVALS.
Many studies of binocular rivalry include measurement of its time intervals. Levelt (1967), for example, measured the durations of successive dominance intervals. He was the first to show that the probability density of durations was skewed: it had a single mode that was less than its mean and a substantial number of intervals with durations much longer than the mean. The gray line in Fig. 5A shows a published estimate of this probability density.
Can the model reproduce these findings? To answer this question, the model was run for 500 s and dominance interval durations for channel A were compiled into a probability density, shown by the black line in Fig. 5A. The model density has a shape similar to that of the empirical curves, but differs in that it has an excess of very short and very long intervals. The mismatch for very short intervals is at least partly a result of the lack of the response latency found in human subjects. The very long intervals in the model data are more difficult to reconcile with the empirical data. They could be present because the model lacks an adaptation mechanism; adaptation would tend to produce a sensitivity loss in the dominant channel at long intervals, and a truncation of such intervals.
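Compiling durations into a probability density, and checking that the mode lies below the mean, could be done along the following lines. This is a sketch under stated assumptions: the function name and bin count are hypothetical, and the durations are synthetic draws from a gamma distribution used purely as a stand-in for model output.

```python
import numpy as np


def duration_density(durations, n_bins=40):
    """Histogram estimate of the probability density of dominance durations.

    Returns (bin_centers, density, mode, mean), where the mode is the
    center of the bin with the highest estimated density.
    """
    durations = np.asarray(durations, dtype=float)
    density, edges = np.histogram(
        durations, bins=n_bins, range=(0.0, durations.max()), density=True
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    mode = float(centers[np.argmax(density)])
    return centers, density, mode, float(durations.mean())


# Synthetic, skewed durations (seconds); NOT output of the rivalry model.
rng = np.random.default_rng(0)
synthetic_durations = rng.gamma(shape=3.0, scale=0.55, size=20000)
centers, density, mode, mean = duration_density(synthetic_durations)
```

For a right-skewed sample such as this one, the estimated mode falls below the mean, which is the property reported for the empirical and model densities.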
CORRELATION OF DOMINANCE INTERVALS.
There is a second well-established principle concerning the timing of binocular rivalry: successive dominance intervals have uncorrelated durations (Fox and Herrmann 1967). Simulation shows that this is also true of the model, as seen in Fig. 5B. The black line in this figure was obtained from the model by calculating the time course over 500 s and finding the durations of dominance intervals for both channels. Correlations were calculated between each duration and itself (separation = 0), the following interval (separation = 1), and with intervals at separations up to 10. The only substantial correlation was the (trivial) zero lag. The model therefore accords with the experimental data (gray line) in this respect.
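The correlation analysis described above can be sketched as follows. The function name is hypothetical, and the durations are independent synthetic draws used only to show the expected pattern (a correlation of 1 at separation 0 and values near zero elsewhere); they are not model output.

```python
import numpy as np


def lag_correlations(durations, max_separation=10):
    """Correlation between each duration and the one `separation` intervals later.

    durations: sequence of successive dominance durations, in order of
               occurrence (alternating between the two channels).
    Returns an array of length max_separation + 1; entry 0 is the trivial
    correlation of the sequence with itself.
    """
    d = np.asarray(durations, dtype=float)
    correlations = []
    for separation in range(max_separation + 1):
        earlier = d[: len(d) - separation]
        later = d[separation:]
        correlations.append(float(np.corrcoef(earlier, later)[0, 1]))
    return np.array(correlations)


# Independent synthetic durations (seconds); NOT output of the rivalry model.
rng = np.random.default_rng(1)
example_durations = rng.gamma(3.0, 0.55, size=5000)
example_correlations = lag_correlations(example_durations)
```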
STIMULUS STRENGTH.
Levelt (1966) proposed a further set of principles for the timing of binocular rivalry alternations, that is, that an increase in the strength of the inducing stimulus to one eye:
1) does not affect the mean dominance time for that stimulus;
2) reduces the mean dominance time for the other stimulus.
By “strength” Levelt meant stimulus parameters such as contrast and the sharpness of contours. Subsequent research has shown that although the second statement is correct, the first needs to be softened. Figure 6A shows empirical data (Leopold and Logothetis 1996): as stimulus contrast increases for one eye, dominance duration increases slightly for that stimulus but decreases relatively rapidly for the other stimulus.

FIG. 6. Dominance intervals and stimulus strength. A: effect of differing monocular stimulus strengths is shown by previously published data. One eye received a stimulus with a contrast of 1 and the other eye received an equal or lesser contrast as shown on the top horizontal axis. Eye receiving the higher strength was dominant for longer periods, as shown by the average dominance durations represented by open circles. Bottom horizontal axis estimates cortical input for each contrast, for easier comparison with the model. It gives the median action potential rate from an equal mixture of parvocellular and magnocellular cells in the lateral geniculate nucleus, as calculated from the contrast-response functions of Sclar et al. (1990). B: for the model, the mean input to the A channel was fixed at 1 and the mean input to the other channel was set equal to or lower than this value; the horizontal axis shows the ratio of the means. Model reproduces the form of the experimental findings, but over a more limited range of input strengths.
Figure 6B shows that the model reproduces this behavior, with one caveat: the stimulus range producing the required changes in dominance times is considerably smaller than the range of contrasts for the experimental data. This difference can be at least partly explained by the steep contrast-response functions measured in lateral geniculate nucleus and cortical cells (Sclar et al. 1990). The bottom horizontal axis in Fig. 6A shows lateral geniculate responses corresponding to the contrasts on the upper axis. These values were calculated from the contrast-response functions of an equal mixture of parvocellular and magnocellular cells in Sclar et al.'s sample. Although the new horizontal axis does not completely match the horizontal axis in Fig. 6B, it shows that contrast normalization can account for at least some of the difference between the two graphs. Another possibility is the lack of an adaptation process in the model: adaptation would tend to reduce the effect of a strong stimulus relative to a weak one.
Four-channel model
The last property of rivalry to be considered is eye suppression. The experiments in this case use binocularly incompatible stimuli presented to the same eye at different times. To compare model to experiment we need to switch to the four-channel model.
EYE SUPPRESSION.
The earliest models of binocular rivalry assumed that rivalry arose from mutual inhibition between primary visual cortical cells driven by the left eye and those driven by the right eye. This led to the idea of eye suppression, that is, that when one eye's stimulus is suppressed any stimulus to that eye will be suppressed, regardless of stimulus features. The alternative hypothesis is feature (or stimulus) suppression, which states that it is a stimulus feature (e.g., horizontal contours) that is suppressed, regardless of the eye to which the feature is presented (Logothetis et al. 1996). There is psychophysical evidence for eye suppression (Blake et al. 1980; Nguyen et al. 2001). Nguyen et al. induced rivalry with orthogonal gratings and then superimposed a test stimulus on one of the gratings to measure sensitivity. The orientation of the test varied: it matched that of the conditioning stimulus on which it was superimposed, that of the other stimulus, or took one of several values in between. Suppression depth was calculated by dividing test sensitivity when the tested eye's conditioning stimulus was suppressed, by that during dominance. Sensitivity during suppression averaged 62% of that during dominance, a value comparable with previous estimates (Blake and Camisa 1978; Makous and Sanders 1978). More important, suppression depth varied little with orientation, as shown by the gray lines in Fig. 7B, and as expected of eye suppression. The results do not conform to the expectation of feature suppression. At the right end of the axis, when the tested eye's stimulus is suppressed and the test stimulus has the same features as the dominant eye's stimulus, feature suppression predicts that the data should lie above 1.

FIG. 7. Eye suppression. A: stimuli were delivered to the left-eye channel selective for one feature (channel 1) and to the right-eye channel selective for a binocularly incompatible feature (channel 4); the remaining 2 channels were unstimulated. Model was run for 10 s and the time course of normalized action potential rate is shown for each of the 4 channel outputs. Channels 1 and 3, both driven by the left eye, tend to be dominant at the same time and to be suppressed at other times. Similarly channels 2 and 4, driven by the right eye, also tend to accompany each other. B: Nguyen et al. (2001) tested for eye suppression by delivering test stimuli to a single eye, and matching the test orientation to that of either the conditioning stimulus to the tested eye (left end of axis) or to that of the other eye (right end). Each symbol below the horizontal axis shows the left and right eye conditioning stimuli and, below that, the monocular test stimulus. Vertical axis shows sensitivity to the test stimulus when the tested eye's conditioning stimulus is suppressed, divided by test sensitivity when the tested eye's conditioning stimulus is dominant; the relative flatness of the data, shown by gray lines, confirms eye suppression. Model was tested in the same way, by finding the sensitivity of activity at stage 5 to stimuli delivered along a single channel. Model data, shown by the black line, overlie the empirical data.
Like the experimental data, the model produces eye suppression. This is illustrated in the time course shown in Fig. 7A. To generate this time course, stimuli were applied to the left-eye channel selective for one orientation and the right-eye channel selective for the orthogonal orientation: s1 = s4 = 1. Lesser inputs were applied to the other two channels: s2 = s3 = 0.8. It can be seen that the activities in the two left-eye channels (shown in black) tend to vary together. When activity in one left-eye channel is high, so is that in the other left-eye channel. Similarly, the two right-eye channels (shown in gray) also tend to vary together.
To compare the model with the experimental observations, the model was analyzed in the following steps. First, the model was run for 100 s with the stimuli set as above. The tested eye was assumed to be that driving channel 1. Second, the sensitivity of channel 1's stage 5 was found when that channel was dominant and also when it was suppressed; the procedure for calculating sensitivity is described in methods. Stage 5 was chosen because the experiment required the subject to decide at which of two locations the test stimulus was located: the neural site underlying this task presumably lies relatively early in the visual pathway. Third, the sensitivity during suppression was divided by that during dominance to provide the filled circle at the left side of Fig. 7B. Fourth, at the right side of the figure, the tested eye is the same but the test orientation is orthogonal. The analysis was therefore repeated, except that the sensitivities found were for channel 3. The resulting model data match well with the empirical data, confirming that the model produces eye suppression. Conversely, feature suppression requires that the data trend upward from left to right. Given that the model data slope in the opposite direction, if anything, the model offers no support for the feature suppression hypothesis.
DISCUSSION
Model parameters
The model developed here has the virtue of simplicity: it has eight independent parameters and yet can account for a variety of experimental observations. As stated in the introduction, the aim of this model-building exercise was explanatory power rather than exact reproduction of experimental data. Accordingly, the model parameters were set so that the model matched major features of the experimental data; an error minimization procedure was not used. The parameters were set as follows.
STAGE TIME CONSTANT, τ.
The mean duration of dominance intervals is proportional to τ, which therefore sets the overall timescale of the model. The time constant was set so that the mean dominance duration, 1.69 s, was close to an early empirical measurement, 1.63 s, of the same quantity (Fox and Herrmann 1967).
NUMBER OF STAGES, n; INHIBITORY WEIGHT, wib.
Both these parameters contribute to suppression depth, as shown by Eq. 17. With respect to suppression depth, therefore, an increase in one parameter can be compensated for by a decrease in the other. To match the gradual deepening of suppression with task complexity in psychophysical experiments (Fig. 4), the number of stages had to be at least six. The number of stages was therefore set at six, and wib adjusted to obtain a close fit with the psychophysical data.
INPUT TIME CONSTANT, τ0.
Two recent models (Laing and Chow 2002; Wilson 2003) show that the stochastic alternations in rivalry can be produced by chaotic relationships between action potential rates in mutually inhibitory channels. The input noise in the present model may therefore represent chaotic relationships between spike trains rather than variability in individual trains. The time constant for the noise was therefore set at a value, 40 ms, over which differences in action potential rates are likely to be significant.
NOISE AMPLITUDE, σ.
This parameter plays a key role when stimulus-dependent inputs differ between channels. A channel with lower-strength input can dominate only when the sum of its stimulus-dependent and noise components exceeds that of the dominant channel. The noise amplitude was set at a level for which the model approximates the shape of the empirical curves in Fig. 6. It should also be noted that there is a trade-off between the two parameters, τ0 and σ. When the input time constant is small, the rapid fluctuations in the input are smoothed by the low-pass filtering of the stages, and the noise amplitude has to be increased to compensate.
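The trade-off between τ0 and σ can be made concrete under a simplifying assumption that is not taken from the paper's methods: the input noise is treated as an Ornstein-Uhlenbeck process with standard deviation σ and correlation time τ0, and a single stage is treated as a first-order low-pass filter with time constant τ. In that idealized case the filtered noise has standard deviation σ·sqrt(τ0/(τ0 + τ)). The function names below are hypothetical, and the sketch covers one stage only.

```python
import math


def filtered_noise_std(sigma, tau0, tau):
    """Std of noise after one first-order low-pass stage.

    Assumption (not from the paper): the input is Ornstein-Uhlenbeck noise
    with std `sigma` and correlation time `tau0`; the stage is a first-order
    low-pass filter with time constant `tau`.
    """
    return sigma * math.sqrt(tau0 / (tau0 + tau))


def compensating_sigma(target_std, tau0, tau):
    """Input noise amplitude needed to reach `target_std` after filtering."""
    return target_std / math.sqrt(tau0 / (tau0 + tau))


if __name__ == "__main__":
    tau = 0.080  # stage time constant quoted in the text, in seconds
    for tau0 in (0.005, 0.040, 0.320):  # illustrative input time constants
        attenuation = filtered_noise_std(1.0, tau0, tau)
        print(f"tau0 = {tau0 * 1000:5.0f} ms -> attenuation {attenuation:.3f}")
```

The attenuation factor shrinks as τ0 falls below τ, so a smaller input time constant calls for a larger σ to leave the same amount of noise after filtering, which is the compensation described above.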
WEIGHTS Weo, Wio, AND Wif.
These weights are set equal to zero for the two-channel model and are therefore significant only in the four-channel model. Weights weo and wio have conflicting roles: weo mediates binocular excitation of a neuron and wio mediates interocular inhibition. Weight weo was set so that each neuron at the output stage of the model could be monocularly excited from both eyes (Burkhalter and Van Essen 1986). Eye suppression (Fig. 7) occurs only if the inhibition between channels driven by different eyes and selective for the same feature is greater than that between channels driven by the same eye and selective for different features. This required that wio take a value close to weo, and that wif be substantially smaller than both of the other weights.
Neural mechanisms
It is of interest to compare the model parameter settings with corresponding values measured experimentally. This is possible for two of the parameters.
STAGE TIME CONSTANT, τ.
The time constant setting, τ = 80 ms, is considerably longer than the time constant for excitation in single cortical cells (Gutnick and Crill 1995). This may be attributable to the lack of a mechanism for spatial spread in the model. It will take time for inhibition to spread across a population of cells at any one stage of the model (Wilson et al. 2001). The time for spatial spread is presumably incorporated into the time constant.
NUMBER OF STAGES, n.
Sheinberg and Logothetis (1997) showed that the responses of single cells in inferior temporal (IT) cortex correlate closely with perceptual reports during binocular rivalry. Visual signals pass through V1, V2, and V4 on their way to IT (Felleman and Van Essen 1991). It could be, therefore, that when the model is applied to the ventral visual pathway, its stages include the cortical areas V1, V2, V4, and IT.
Previous models
The model developed here builds on ideas used in a number of previous models. Like its predecessors, it has mutual inhibition between cells within a processing stage (Lehky 1988; Sugie 1982), noisy inputs to generate dominance of one population over another (Lumer 1998), and more than two channels (Wilson 2003). The major innovation in the current model is the use of multiple stages with identical architecture. Lumer (1998) used four stages with differing architecture to show that the activity difference between dominant and suppressed channels grows from stage to stage. The model developed here goes further by demonstrating that there is a sensitivity difference between dominant and suppressed channels, and that the increased sensitivity difference at higher stages is in quantitative agreement with psychophysical data and in qualitative agreement with electrophysiological data.
A notable difference between the present model and previous ones lies in the use of adaptation. Adaptation has been used previously to produce the switches from dominance to suppression, in two ways (Laing and Chow 2002; Wilson 2003). First, synaptic depression weakens the effect of inhibition on suppressed neurons, allowing them to become dominant. The second mechanism is action potential adaptation: dominant cells lose sensitivity because of their higher action potential rates, and lose dominance as the suppressed cells become more sensitive because of their low action potential rates. The present model differs in that it has no adaptation. It has a single nonlinearity (the action potential threshold), but that does not qualify as adaptive because it is instantaneous and produces no change in sensitivity over time. Switches between dominance and suppression are initiated at the model's input when the stochastic driving function for the dominant channel falls below that of the suppressed one. There is some evidence in the literature against a role for adaptation in binocular rivalry. Although adaptation should produce a lessening depth of suppression toward the end of a dominance interval, two studies have found depth to be constant across an interval (Fox and Check 1972; Norman et al. 2000). Nevertheless, given the ubiquity of adaptation in sensory systems, it would be surprising if it had no role at all in rivalry.
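The idea of switching without adaptation can be sketched with a deliberately reduced circuit: two channels, one stage, mutual inhibition through a rectifying threshold, and independent noisy inputs. This is not the paper's set of equations, which are given in methods; the inhibitory weight `w`, the drive, the noise amplitude, and the Ornstein-Uhlenbeck form of the noise are all assumptions chosen for illustration, and the dominance durations the sketch produces are not meant to match those of the full multistage model. Only the time constants (80 ms and 40 ms) are taken from the text.

```python
import math
import random


def count_switches(sigma, duration_s=20.0, dt=0.001, tau=0.08, tau0=0.04,
                   drive=1.0, w=0.5, seed=1):
    """Count dominance switches in a two-channel, single-stage circuit.

    There is no adaptation variable anywhere in the loop: the only
    time-varying terms are the two independent noise inputs.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau0)                # exact OU decay per step
    kick = sigma * math.sqrt(1.0 - decay ** 2)  # keeps the noise std at sigma
    noise = [0.0, 0.0]
    v = [0.0, 0.0]                              # postsynaptic potentials
    leader = 0                                  # index of the dominant channel
    switches = 0
    for _ in range(int(round(duration_s / dt))):
        noise = [decay * n + kick * rng.gauss(0.0, 1.0) for n in noise]
        rate = [max(0.0, x) for x in v]         # instantaneous threshold
        target = [drive + noise[0] - w * rate[1],
                  drive + noise[1] - w * rate[0]]
        # Forward-Euler update of the first-order stage dynamics.
        v = [x + (dt / tau) * (t - x) for x, t in zip(v, target)]
        if v[0] != v[1]:
            new_leader = 0 if v[0] > v[1] else 1
            if new_leader != leader:
                switches += 1
                leader = new_leader
    return switches


if __name__ == "__main__":
    print("switches without noise:", count_switches(sigma=0.0))
    print("switches with noise:   ", count_switches(sigma=0.2))
```

With the noise amplitude set to zero the two channels stay exactly balanced and no switch is counted, whereas any nonzero noise produces alternations, so in this reduced circuit the stochastic input is the sole source of switching.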
The site at which binocular rivalry is initiated constitutes another important difference between models. As in Wilson's (2003) model, the model developed here assumes that perceptual switches in binocular rivalry are instigated at the lowest stage of the model, corresponding to primary visual cortex. This assumption distinguishes these models from others that assume perceptual switches arise from top-down influences (Dayan 1998; Lumer et al. 1998) or from a brain stem oscillator (Miller et al. 2000). The existence of eye suppression and feature suppression provides important evidence in this debate about the initiation site of binocular rivalry. The present model concurs with the existence of eye rivalry (Fig. 7) because rivalry switches in the model are initiated through the mutual inhibition of monocularly driven cells. How does this match with the demonstration of feature suppression obtained by rapidly swapping stimuli between the two eyes (Logothetis et al. 1996)? Wilson (2003) used his model to show that the crucial factor here is the stimulus flicker that Logothetis et al. used in conjunction with eye-swapping. Flicker moves the site of rivalrous cellular activity from stage 1 of Wilson's model to stage 2, thereby reducing eye suppression and making the rivalry more like feature suppression.
Comparisons with physiology
The depth of binocular rivalry suppression is small in the early stages of the present model (Fig. 4). This corresponds well with the paucity of primary visual cortical cells whose activity correlates with behavioral reports of perception during rivalry (Leopold and Logothetis 1996). It does not match well, however, with two studies (Polonsky et al. 2000; Tong and Engel 2001) in which modulation of the magnetic resonance signal from primary visual cortex during rivalry was more than half as large as the modulation produced by physical alternation of stimuli. How is this discrepancy to be explained? Polonsky et al. discussed several possible reasons for the difference. There are a further two possibilities that they did not raise. First, the imaging studies used a grating presented to one eye and a grating with incompatible orientation and color presented to the other eye. Leopold and Logothetis used orthogonal gratings that did not differ in color. It could be, therefore, that the imaging studies yielded larger modulations because their stimuli evoked not just contour rivalry, but color rivalry as well. Second, Tong and Engel, who recorded the largest rivalry-driven modulation of all of the studies, sampled activity from an area of the cortex (the representation of the blind spot) in which all cells were driven by the same eye. It remains to be seen whether the large activity modulations they recorded can also be found in cortical tissue containing intermingled left-eye- and right-eye-driven cells.
These considerations aside, it still remains to reconcile the small suppression depths in the model's first and second stages with the imaging results. A common assumption about primary visual cortex is that it contains at least three levels of processing—monocular, simple, and complex cells—and that these levels are sequential (Hubel and Wiesel 1962). It could be, therefore, that three stages of the model also reside in primary visual cortex. The last of these produces activity modulation comparable with that in the imaging studies. If the model is to be tested in future electrophysiology experiments, what does it predict? The essential predictions are illustrated in Fig. 3: that at any given time during binocular rivalry, one population of cells has a high firing rate and another has a low rate, that this relationship reverses cyclically, and that the firing rate difference between populations is greater in higher than in lower visual cortex.
Generalizing the model
The last issue discussed is that of generalizing the model. As it stands, the model's behavior is determined by its subcortical input. A number of studies, however, show that binocular rivalry can also be influenced by top-down inputs such as attention. It has been shown that subjects can willfully change the switching rate in rivalry (Lack 1969) and that attention to one of the monocular stimuli producing rivalry slightly shortens the intervals for which the unattended stimulus is dominant (Meng and Tong 2004). Further, there is rivalry between stimuli, such as two figures with differing biological motion, that require high-level processing for their interpretation (Watson et al. 2004). These findings do not invalidate the feedforward design used in the present model for several reasons: 1) rivalry can occur in the absence of willful intervention; 2) the ability to keep a specific rivalrous stimulus in view, at the expense of the other, is weak; and 3) rivalrous stimuli requiring high-level processing for their interpretation also produce low-level incompatibilities between the monocular stimuli. Nevertheless, a more general model requires the addition of a feedback pathway to account for the observed top-down effects.
A broader question concerns the extent to which the model can be applied to visual phenomena other than binocular rivalry. The model produces a single percept from multiple sensory inputs through a winner-take-all mechanism. When one sensory input has a brief, small advantage over other inputs, the activity resulting from that input builds from stage to stage while the activity arising from other inputs declines. Activity differences between channels translate into sensitivity differences through the nonlinear transformation from postsynaptic potential to action potential rate. Although the model has been applied here to binocular rivalry data, there is nothing in it that restricts it to this field. The building blocks of the model—multiple channels, multiple stages, cross-channel excitation and inhibition—could in principle be applied to other areas of study, such as form vision. It would be of considerable interest to see whether such a generalization is possible.
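The stage-to-stage build-up described above can be illustrated with a steady-state sketch of two channels passing through a chain of identical stages. It is a minimal illustration under stated assumptions, not the paper's model: each stage's potential is taken to be a feedforward gain times its input minus `w` times the other channel's rate, rates are rectified at zero, and the gain is set to `1 + w` so that summed activity is preserved while both channels are active. The weight, gain, and input values are hypothetical.

```python
def stage_steady_state(u1, u2, w, gain):
    """Steady-state rates of one two-channel stage with mutual inhibition.

    Assumes 0 <= w < 1 and non-negative inputs. Each potential is
    gain * input - w * (other channel's rate); the rate is the potential
    rectified at zero.
    """
    a1, a2 = gain * u1, gain * u2
    # Case 1: both channels above threshold (solve the 2x2 linear system).
    r1 = (a1 - w * a2) / (1.0 - w * w)
    r2 = (a2 - w * a1) / (1.0 - w * w)
    if r1 >= 0.0 and r2 >= 0.0:
        return r1, r2
    # Case 2: the weaker channel is silenced; the other follows its input.
    return (a1, 0.0) if a1 >= a2 else (0.0, a2)


def run_stages(u1, u2, n_stages=6, w=0.3, gain=None):
    """Feed two inputs through identical stages; return the rates per stage."""
    if gain is None:
        gain = 1.0 + w  # assumption: conserves summed activity per stage
    rates = []
    for _ in range(n_stages):
        u1, u2 = stage_steady_state(u1, u2, w, gain)
        rates.append((u1, u2))
    return rates


if __name__ == "__main__":
    # A 5% input advantage for channel 1 (illustrative values).
    for k, (r1, r2) in enumerate(run_stages(1.05, 1.00), start=1):
        print(f"stage {k}: r1 = {r1:.3f}, r2 = {r2:.3f}")
```

In this sketch the difference between the channels is multiplied by (1 + w)/(1 − w) at each stage while both remain active, so a small input advantage grows until the weaker channel is driven below threshold.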
FOOTNOTES
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
I thank I. Cathers, C. Clifford, and V. Nguyen for comments on an earlier version of this paper.
REFERENCES
- Alais D and Blake R. Binocular Rivalry. Cambridge, MA: MIT Press, 2004.
- Blake R and Camisa J. Is binocular vision always monocular? Science 200: 1497–1499, 1978.
- Blake R, Westendorf DH, and Overton R. What is suppressed during binocular rivalry? Perception 9: 223–231, 1980.
- Bullier J. Communications between cortical areas of the visual system. In: The Visual Neurosciences, edited by Chalupa LM and Werner JS. Cambridge, MA: MIT Press, 2004, p. 522–540.
- Burkhalter A and Van Essen DC. Processing of color, form and disparity information in visual areas VP and V2 of ventral extrastriate cortex in the macaque monkey. J Neurosci 6: 2327–2351, 1986.
- Dayan P. A hierarchical model of binocular rivalry. Neural Comput 10: 1119–1135, 1998.
- Felleman DJ and Van Essen DC. Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1: 1–47, 1991.
- Fox R and Check R. Independence between binocular rivalry suppression duration and magnitude of suppression. J Exp Psychol 93: 283–289, 1972.
- Fox R and Herrmann J. Stochastic properties of binocular rivalry alternations. Percept Psychophys 2: 432–436, 1967.
- Freeman AW and Morley J. Now you see it, now you don't. Today's Life Sci 9: 32–36, 1997.
- Gutnick MJ and Crill WE. The cortical neuron as an electrophysiological unit. In: The Cortical Neuron, edited by Gutnick MJ and Mody I. New York: Oxford Univ. Press, 1995, p. 33–51.
- Hubel DH and Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J Physiol 160: 106–154, 1962.
- Lack LC. The effect of practice on binocular rivalry control. Percept Psychophys 6: 397–400, 1969.
- Laing CR and Chow CC. A spiking neuron model for binocular rivalry. J Comput Neurosci 12: 39–53, 2002.
- Lehky SR. An astable multivibrator model of binocular rivalry. Perception 17: 215–228, 1988.
- Leopold DA and Logothetis NK. Activity changes in early visual cortex reflect monkeys' percepts during binocular rivalry. Nature 379: 549–553, 1996.
- Levelt WJM. The alternation process in binocular rivalry. Br J Psychol 57: 225–238, 1966.
- Levelt WJM. Note on the distribution of dominance times in binocular rivalry. Br J Psychol 58: 143–145, 1967.
- Logothetis NK, Leopold DA, and Sheinberg DL. What is rivalling during binocular rivalry? Nature 380: 621–624, 1996.
- Logothetis NK and Schall JD. Neuronal correlates of subjective visual perception. Science 245: 761–763, 1989.
- Lumer ED. A neural model of binocular integration and rivalry based on the coordination of action-potential timing in primary visual cortex. Cereb Cortex 8: 553–561, 1998.
- Lumer ED, Friston KJ, and Rees G. Neural correlates of perceptual rivalry in the human brain. Science 280: 1930–1934, 1998.
- Makous W and Sanders RK. Suppression interactions between fused patterns. In: Visual Psychophysics and Physiology, edited by Armington AC, Krauskopf J, and Wooten BR. New York: Academic Press, 1978, p. 167–179.
- Meng M and Tong F. Can attention selectively bias bistable perception? Differences between binocular rivalry and ambiguous figures. J Vision 4: 539–551, 2004.
- Miller SM, Liu GB, Ngo TT, Hooper G, Riek S, Carson RG, and Pettigrew JD. Interhemispheric switching mediates perceptual rivalry. Curr Biol 10: 383–392, 2000.
- Nguyen VA, Freeman AW, and Alais D. Increasing depth of binocular rivalry suppression along two visual pathways. Vision Res 43: 2003–2008, 2003.
- Nguyen VA, Freeman AW, and Wenderoth P. The depth and selectivity of suppression in binocular rivalry. Percept Psychophys 63: 348–360, 2001.
- Norman H, Norman F, and Bilotta J. The temporal course of suppression during binocular rivalry. Perception 29: 831–841, 2000.
- Polonsky A, Blake R, Braun J, and Heeger DJ. Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry. Nat Neurosci 3: 1153–1159, 2000.
- Sclar G, Maunsell JHR, and Lennie P. Coding of image contrast in central visual pathways of the macaque monkey. Vision Res 30: 1–10, 1990.
- Sheinberg DL and Logothetis NK. The role of temporal cortical areas in perceptual organization. Proc Natl Acad Sci USA 94: 3408–3413, 1997.
- Sugie N. Neural models of brightness perception and retinal rivalry in binocular vision. Biol Cybern 43: 13–21, 1982.
- Tong F and Engel SA. Interocular rivalry revealed in the human cortical blind-spot representation. Nature 411: 195–199, 2001.
- Tong F, Nakayama K, Vaughan JT, and Kanwisher N. Binocular rivalry and visual awareness in human extrastriate cortex. Neuron 21: 753–759, 1998.
- Watson TL, Pearson J, and Clifford CW. Perceptual grouping of biological motion promotes binocular rivalry. Curr Biol 14: 1670–1674, 2004.
- Wilson HR. Computational evidence for a rivalry hierarchy in vision. Proc Natl Acad Sci USA 100: 14499–14503, 2003.
- Wilson HR, Blake R, and Lee S-H. Dynamics of travelling waves in visual perception. Nature 412: 907–910, 2001.

