Sensory and decisional components of endogenous attention are dissociable

Endogenously cueing a stimulus for attention enhances sensory processing of the attended stimulus, and efficiently selects information from the attended stimulus for guiding behavioral decisions. Yet, it is unknown if these sensory and decisional components of endogenous attention are under the control of common, overlapping or distinct mechanisms. Here, we tested human observers in a multi-alternative visuospatial attention task with probabilistic spatial cues, whose predictive validity varied across spatial locations. Observers’ behavioral choices were analyzed with a multidimensional signal detection model, derived from Bayesian decision theory. The model effectively decoupled attention’s effects on perceptual sensitivity from those on decisional thresholds (bias), and revealed striking dissociations between them. Sensitivity was highest at the cued location, and not significantly different among uncued locations, suggesting a spotlight-like allocation of sensory resources at the cued location. On the other hand, bias varied systematically with endogenous cue validity, suggesting a graded distribution of decision thresholds across all locations. Cueing-induced modulations of sensitivity and bias were uncorrelated within and across subjects. Finally, bias selectively correlated with key measures of decision-making: reaction times were strongly correlated with bias, cueing-induced benefits in reaction times were correlated with bias changes independently of sensitivity changes, and subjects who exhibited higher bias, rather than higher sensitivity, toward the cued location produced more optimal behavioral decisions. Our model and findings demonstrate that endogenous cueing of attention does not engage a unitary selection mechanism, but rather involves the operation of dissociable sensory enhancement and decisional selection processes. 
Significance statement
The capacity for selective attention allows us to engage efficiently with the most important objects in the world around us. Attention affects the way we perceive important stimuli and also affects how we make behavioral decisions about these stimuli. In this study, we test human observers on a multiple alternative attention task and show that attention’s effects on perceptual processing and decision-making can be dissociated. Specifically, attention’s sensory effects alone are in line with the conventional notion of an attentional “spotlight”. These insights are obtained with a novel Bayesian decision model, which we propose as a powerful tool for teasing apart component processes of cognitive phenomena, and which could aid the search for the neural correlates of these phenomena.


Introduction
Attention is the remarkable cognitive capacity that enables us to select and process only the most important information in our sensory environment. In laboratory tasks, endogenous attention is engaged by cues that are predictive of upcoming stimuli or events of interest.
The sensory and decisional effects of attention may be quantified, respectively, with changes in signal-to-noise ratio (SNR) associated with sensory processing, and changes in decision thresholds (criteria) associated with downstream decision processes (18,20,22,23). Signal detection theory (SDT), a highly successful Bayesian framework for the analysis of behavior (19,24-29), specifies two key psychophysical metrics to enable this quantification: i) sensitivity (d'), which measures the improvement in the quality (SNR) of sensory information processing of the attended stimulus and ii) bias (b), which measures the relative weighting of information from the attended stimulus for guiding behavioral decisions (18,20,26,30,31). A rich literature has reported diverse, and often contradictory, effects of endogenous cueing on sensitivity and bias. For instance, previous studies have reported a benefit (1) or no benefit (12,13) for sensitivity at the cued location, a cost (3,32,33) or no cost (1) for sensitivity at uncued locations, and a graded modulation of bias (13) or no modulation of bias (1) at cued and uncued locations. Studies investigating the neural basis of sensitivity and bias have only added to these controversies. For instance, neural responses in area V4 have been reported to modulate only with sensitivity, but not with bias (21), or with bias, but not with sensitivity (16), or with both (34) (see Discussion for further details).
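As a point of reference for the two metrics named above, the following minimal sketch computes sensitivity and criterion for the textbook one-dimensional, equal-variance Yes/No signal detection model. It is not the multidimensional model developed in this paper; the function name `yes_no_sdt` and the example rates are illustrative assumptions.

```python
from statistics import NormalDist


def yes_no_sdt(hit_rate, fa_rate):
    """Textbook equal-variance Yes/No SDT estimates (illustrative only).

    d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2, where z is the inverse
    of the standard normal cumulative distribution function.
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical observer with a symmetric (unbiased) criterion.
d_neutral, c_neutral = yes_no_sdt(0.80, 0.20)

# Hypothetical observer with more hits AND more false alarms:
# the criterion becomes negative (liberal).
d_liberal, c_liberal = yes_no_sdt(0.90, 0.35)

print(f"neutral: d'={d_neutral:.2f}, c={c_neutral:.2f}")
print(f"liberal: d'={d_liberal:.2f}, c={c_liberal:.2f}")
```

In this sketch a joint rise in hits and false alarms shows up mainly as a shift in the criterion, which is why the two quantities need to be estimated separately.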
A key reason for these contradictions is the lack of an appropriate psychophysical (SDT) model for analyzing behavioral responses in spatial attention tasks. In the spatial attention task design employed by the vast majority of studies in the attention literature (Fig. 1A) (21), the observer is cued to attend to one of two (or multiple) stimulus locations. At an unpredictable time following cue onset an event of interest, for example a change in grating orientation, occurs at one of the locations ("change" trials). On some trials, no events (changes) occur at any location ("catch" trials). On change trials, the observer must detect and report the location of the change or, on catch trials, must indicate that no change occurred. Such attention tasks have been commonly analyzed with a combination of multiple binary choice (Yes/No) one-dimensional SDT models, with independent decision variables at each location (Fig. 1B) (35-38). However, such a formulation is inadequate, at best, and incorrect, at worst, for modeling behavior in such tasks. Because the observer must report a single location of change, modeling behavior with multiple independent Yes/No decision models produces a response conflict if a "Yes" decision occurs at more than one location on the same trial (Fig. 1B, see also Discussion).
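The response conflict described above can be illustrated with a small simulation. This is a minimal sketch, assuming unit-variance Gaussian decision variables and hypothetical sensitivities and criteria; the function `conflict_rate` and all parameter values are illustrative and are not taken from the data.

```python
import random


def conflict_rate(d_primes, criteria, n_trials=20000, seed=0):
    """Fraction of simulated trials on which independent Yes/No models
    say "Yes" at more than one location (a response conflict).

    d_primes: mean of the decision variable at each location on this
              trial type (0 for locations without a change).
    criteria: Yes/No criterion at each location.
    """
    rng = random.Random(seed)
    conflicts = 0
    for _ in range(n_trials):
        n_yes = sum(
            1 for d, c in zip(d_primes, criteria) if rng.gauss(d, 1.0) > c
        )
        if n_yes > 1:
            conflicts += 1
    return conflicts / n_trials


# Hypothetical "change at location 1" trial in a four-location task.
rate = conflict_rate(d_primes=[2.0, 0.0, 0.0, 0.0], criteria=[1.0] * 4)
print(f"proportion of trials with conflicting 'Yes' decisions: {rate:.2f}")
```

With these assumed values a substantial fraction of trials yields more than one "Yes", even though the task permits only a single response.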
Here, we analyze human attention behavior by extending a multidimensional signal detection model, the m-ADC model, which we recently developed to overcome these pitfalls (39,40) (Fig. 1C). The model extends conventional signal detection theory and derives an optimal decision rule from Bayesian decision theory to effectively decouple sensitivity from bias in attention tasks. We show that the model successfully fits human behavioral choices in a multi-alternative (five-alternative) attention task employing probabilistic, endogenous cueing, and outperforms alternative models in model selection analysis. We apply the model to test if endogenous cueing engages common or distinct mechanisms to modulate sensitivity and bias. The results reveal essential dissociations among these fundamental components of attention.
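One way to see how a multidimensional rule avoids conflicting responses is sketched below. The rule shown (respond to the location whose decision variable exceeds its criterion by the largest margin, otherwise report no change) is an assumption made for illustration; the exact decision rule of the model used here is specified in the SI Methods and may differ in detail. The function name `madc_like_choice` is hypothetical.

```python
def madc_like_choice(psi, criteria):
    """Single-response rule over several locations (illustrative assumption).

    psi:      decision variable at each location on one trial.
    criteria: criterion at each location.
    Returns 0 for "no change", or the 1-based index of the chosen location.
    """
    margins = [p - c for p, c in zip(psi, criteria)]
    best = max(range(len(margins)), key=margins.__getitem__)
    # Report a change only if the largest margin is above zero.
    return best + 1 if margins[best] > 0 else 0


# No decision variable exceeds its criterion: "no change".
print(madc_like_choice([0.2, 0.1], [1.0, 1.0]))
# Both exceed criterion; the larger margin wins, so there is no conflict.
print(madc_like_choice([2.5, 1.8], [1.0, 1.0]))
# A lower criterion at location 2 can tip the choice toward it.
print(madc_like_choice([1.5, 1.2], [1.0, 0.2]))
```

Because exactly one response is produced on every trial, a rule of this form assigns every point in the decision space to a single response domain.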

Results
A model for predicting and fitting behavior in multi-alternative attention tasks
We measured the effect of endogenous cueing of attention on sensitivity and bias by extending the recently developed m-ADC model framework (41). We developed the model, from first principles, for multi-alternative attention tasks that employ the method of constant stimuli, i.e. in which stimuli can occur at different, unpredictable strengths at each location (SI Methods; Fig. 1C, SI Fig. S1A). This new model is essential, for example, for studies that seek to measure the effect of attention on the psychometric function at cued and uncued locations (40). The decision manifold in this model comprises a family of intersecting hyperplanes in a multidimensional decision space (Fig. 1C, SI Methods). These hyperplanes represent the decision criteria. These criteria are optimal for distinguishing each class of signal from noise (e.g. change at a given location versus no change), and provide a close approximation to the theoretical optimum for distinguishing signals of one class from another (e.g. changes at one location from another, SI Fig. S1B; SI Methods). The model is able to quantify perceptual sensitivities and decision criteria from stimulus-response contingency tables for tasks with any number of alternatives.

Figure 1. A. [...] After a variable delay (600-2200 ms; exponentially distributed) all patches disappeared briefly (100 ms). Upon reappearance, either one of four patches changed in orientation ("change" trials) or none changed ("no change" trials). After 200 ms the fixation cross changed to yellow, instructing the subject to indicate the location of change (or no change) by pressing one of five different buttons. B. Multiple one-dimensional signal detection models representing independent Yes vs. No (change vs. no change) decisions at each location. In such a composite model a response conflict may arise when decision variables exceed their respective criteria simultaneously at multiple locations (asterisks). Black Gaussians: noise distribution for "no change" trials; red Gaussian: signal distribution for "change" trials at the cued location; light blue Gaussians: signal distribution for "change" trials at an uncued location. Dashed vertical lines: means of signal and noise distributions, whose separation quantifies sensitivity (d') at each location. Solid vertical lines: Criterion at each location (c). C. A multidimensional (m-ADC) signal detection model, schematized here for changes at the cued and an uncued location. Decision variables at each location are represented along orthogonal decision axes in a multidimensional perceptual space. Circles: Contours of multivariate Gaussian decision variable distributions. Black: noise distribution; red: signal distribution for "change" trials at the cued location; light blue: signal distribution for "change" trials at an uncued location. Thin contours: lower sensitivity values; thick contours: higher sensitivity values. Thick gray lines: Decision manifold delineating domains in the perceptual space corresponding to cued, uncued and no change decisions. Gaussians at the side represent the marginals at each location. D. Proportion of change events at each location, relative to the cued location, in a block with high cue validity (left) and in a block with neutral cues (right). Red: Cued location; green: location opposite to the cue (Opp); blue: location adjacent to the cue, either in the visual hemifield ipsilateral to the cue (Adj-Ipsi) or in the visual hemifield contralateral to the cue (Adj-Contra). E. A contingency table from a representative behavioral session. The rows represent locations of change and the columns represent locations of response, both measured relative to the cue (data pooled across angles tested). Grey: "No change" trials. Color conventions are as in panel B. F. Fitting and predicting responses with a 4-ADC model. Choice proportions fitted with the m-ADC model (ordinate) plotted against actual choice proportions (abscissa). G. The three panels plot predicted proportions against actual response proportions by fitting either hits and false alarms (case i), misses and false alarms (case ii) or hits and misses (case iii). In each case, correct rejections were included in the fitting. Data points: Choice probabilities (joint) for each stimulus-response contingency at each angle (data pooled across subjects). Upper left insets: Subset of stimulus-response contingencies used for fitting. Lower right insets: Distribution of goodness-of-fit p-values across subjects.

We validated the model on a five-alternative visual change detection task (Fig. 1A).
Following fixation, four oriented Gabor gratings appeared, one in each visual quadrant, and one of the four gratings was cued for attention with a central, endogenous cue (Fig. 1A; central, directed arrow). After an unpredictable delay, the screen was blanked and the four gratings reappeared. Following reappearance, either one of the four gratings had changed in orientation ("change" trials, 75%) or none had changed ("no change" trials, 25%).
Participants indicated the location of change (one of four possibilities) or no change, with five distinct button press responses.
We tested two versions of the task on two different groups of subjects: a version that included only validly cued and invalidly cued trials, and a version that also included neutrally cued trials, presented in pseudorandomly interleaved blocks (Fig. 1A, right). Valid cues indicated the location of the upcoming change with 66.7% validity and invalid cues with 8.3%-16.7% validity (Fig. 1D, left). Neutral cues were not predictive of the upcoming change (25% validity; Fig. 1D, right). In the following analyses, results from the validly and invalidly cued trials were combined across the two groups of subjects (n=30 subjects), except when reporting neutral cueing effects (n=10 subjects).
We examined how well the modified m-ADC model described human observers' behavior in this multialternative attention task.
First, we fitted a 4-ADC model to each subject's 5x5 stimulus-response contingency table obtained from the five-alternative task (exemplar table in Fig. 1E). Because changes could occur at one of 6 orientation change values, the contingency table for each subject contained 63 independent observations (details in Methods). For these analyses, responses across two of the uncued locations (ipsilateral and contralateral) were averaged into a single contingency (adjacent) because responses were not significantly different across these locations (p>0.2 for hits, misses and false alarms at these locations, signed rank test). This simplification significantly reduced the number of model parameters to be estimated (Methods). Goodness-of-fit p-values obtained from a randomization test (based on the chi-squared statistic) were generally greater than 0.7 (median: 0.84; range: 0.57-0.98; Fig. 1F, lower inset), indicating that the model successfully fit observers' responses in this multi-alternative attention task.
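A randomization goodness-of-fit test of the kind mentioned above can be sketched as follows. This is a simplified illustration, assuming a single multinomial row of response counts and a vector of fitted response probabilities; the actual procedure (Methods) operates on the full contingency table and may differ. The names `chi_squared` and `randomization_gof_p`, and the example counts, are hypothetical.

```python
import random


def chi_squared(observed, expected):
    """Pearson chi-squared statistic over cells with non-zero expectation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def randomization_gof_p(observed, fitted_probs, n_sim=2000, seed=0):
    """Monte Carlo goodness-of-fit p-value (illustrative sketch).

    Simulates response counts from the fitted probabilities and returns the
    fraction of simulated chi-squared values that are at least as large as
    the observed one. Large p-values indicate no evidence of misfit.
    """
    rng = random.Random(seed)
    n = sum(observed)
    expected = [n * p for p in fitted_probs]
    stat_obs = chi_squared(observed, expected)
    categories = list(range(len(fitted_probs)))
    exceed = 0
    for _ in range(n_sim):
        counts = [0] * len(fitted_probs)
        for k in rng.choices(categories, weights=fitted_probs, k=n):
            counts[k] += 1
        if chi_squared(counts, expected) >= stat_obs:
            exceed += 1
    return exceed / n_sim


# Hypothetical counts that closely follow the fitted probabilities.
print(randomization_gof_p([52, 29, 19], [0.5, 0.3, 0.2]))
```

Under this convention a p-value near 1 means that simulated data deviate from the fitted model at least as much as the observed data do, i.e. the fit is acceptable.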
Next, we tested the model's ability to predict individual subjects' responses: for this we fit the model using only a subset of the observers' behavioral choices (33%-58%) and tested its ability to predict their remaining (42%-67%) choices (Fig. 1G). Three different subsets of contingencies were selected for fitting: i) hits and false alarms (Fig. 1G, left), ii) false alarms and misses (Fig. 1G, middle) or iii) hits and misses (Fig. 1G, right). Correct rejection responses were included either implicitly (cases i and ii) or explicitly (case iii; see Methods for details). The model was able to predict all of the remaining observations in the contingency table with high accuracy (Fig. 1G).

[...] Fig. 1D, left). We tested the effect of these different cue validities on modulations of sensitivity and bias, as estimated by the m-ADC model. Bias was quantified based both on the constant criterion (b CC ) and on the likelihood ratio (b LR ) measures (Methods). Cue validity was not different between cue-ipsilateral and cue-contralateral locations, and parameter values were strongly correlated and not significantly different between these locations (SI Fig. S4); these locations were treated as a single cue-"adjacent" location for these analyses (Methods).

First, we tested whether subjects were utilizing the information provided by the cue regarding the location of the imminent change. We quantified the effect of cueing on "raw" performance metrics, including hit, false alarm and correct rejection rates (Fig. 2A). Subjects (n=30) exhibited the highest hit rates for changes at the cued location as compared to the other locations, across the range of angles tested (Fig. 2A, top; p<0.001 for difference in hit rates between cued vs. opposite and cued vs. adjacent; Wilcoxon signed rank test, Benjamini-Hochberg correction for multiple comparisons), indicating that subjects were indeed relying heavily on the cue to perform the task.
Nevertheless, false-alarm rates were also highest at the cued location (Fig. 2A, bottom right; cued vs. opposite: p = 0.0014; cued vs. adjacent: p<0.001). We asked whether this pattern of higher hit rates concomitant with higher false alarm rates occurred due to a higher sensitivity at the cued location, a higher bias (lower criterion) at the cued location, or both ((26,34); Methods). To distinguish between these possibilities, we estimated these psychophysical parameters by analyzing the psychometric data with a 4-ADC model.

Next, we asked if these modulations of sensitivity and bias reflected a benefit (relative to baseline) at the cued location, a cost at the uncued locations, or both. For this, a subset of the subjects (n=10) who were tested on the cued detection task were also tested on a neutrally-cued version of the task presented in interleaved blocks (Methods). In this task, changes were equally likely (25%) at each of the four locations. Within this pool of subjects, sensitivity for the neutrally cued locations was significantly lower than that at the cued location (p=0.004), but only marginally significantly different from that at the uncued [...]

In sum, these results show that endogenous cueing of attention produced both a higher sensitivity as well as higher bias towards the cued location relative to uncued locations. The increase in sensitivity manifested primarily as a benefit at the cued location. In contrast, cueing produced both a strong benefit in bias at the cued location and a strong cost at uncued locations, relative to baseline.

Sensitivity and bias: Same or different attentional mechanisms?
Are the enhancements of sensitivity and bias by endogenous cueing controlled by the same or different attentional mechanisms? To answer this question, we evaluated two competing models (Fig. 3A). According to the "common" mechanism model, cue-induced selection biases compete for sensory processing resources to enhance sensitivity; in this model sensitivity at each location co-varies systematically with bias at that location. On the other hand, according to the "disjoint" mechanisms model, bias and sensitivity modulation are decoupled and independent of each other. We sought to distinguish between these two models by examining evidence from seven different approaches.

[...] Fig. 3D). The significantly higher evidence in favor of a model that incorporated identical sensitivities, but distinct criteria, at all uncued locations confirms that criteria, rather than sensitivities, varied in a graded manner with endogenous cue validity.
Third, despite these trends in parameters at the uncued locations, we asked if subjects who showed higher sensitivity at the cued location would also have a higher bias (or lower criterion) at the cued location. Sensitivity and criteria were strongly positively correlated at the cued location (ρ=0.6, p<0.001, SI Fig. S6A), indicating that across observers criteria co-[...]

Fourth, although sensitivity and bias at the cued location did not covary, we asked if attentional modulation of sensitivity and bias, produced by endogenous cueing, would be correlated across subjects. In other words, would subjects who showed the greatest differential sensitivity at the cued location (relative to uncued locations) also show the greatest differential bias toward the cued location? To answer this question we tested whether sensitivity and bias modulations, measured either as a difference index (ADI) or a modulation index (AMI) across cued and uncued locations (Methods), were correlated.
Neither bias value (constant criterion or likelihood ratio) was significantly co-modulated with [...]

Fifth, although sensitivity and bias were not co-modulated across the population of subjects, we asked if sensitivity and bias would be co-modulated for individual subjects across experimental blocks. We divided data from each subject's experimental session into two subsets of contiguous experimental blocks. Difference indices (ADI) or modulation indices (AMI) for bias from each block were subjected to an n-way analysis of variance with the corresponding sensitivity index (ADI or AMI) as a continuous predictor, and subjects as random effects. This analysis revealed that neither measure of bias was significantly co-modulated with sensitivity, as measured by either modulation index, across blocks (p>0.1, for all tests).
Sixth, although average sensitivities were not significantly different between opposite and adjacent locations across the population, some observers exhibited a higher sensitivity at the opposite location than at adjacent locations (d' Opp >d' Adj , n=19/30), or vice versa (d' Opp <d' Adj , n=11/30). To test whether these differences in sensitivity were also reflected in the observers' relative biases toward these two locations, we performed the following analysis: We divided subjects into two groups depending on the relative value of their sensitivities at opposite and adjacent locations (d' opp >d' adj or vice versa), and then tested for differences in constant criteria or likelihood ratio bias between opposite and adjacent locations within each group. This analysis did not reveal any distinct patterns in (either measure of) bias across the two groups (SI Fig. S7). Thus, inter-individual patterns of differential sensitivity toward the uncued locations (opposite vs. adjacent) did not translate into corresponding patterns of differential bias toward these same locations.
Finally, we examined the possibility that the graded variation of bias with cue validity was a consequence of parametric assumptions in the m-ADC model. Observers' false-alarm rates at each location (cued, opposite, adjacent), a metric independent of the m-ADC model's assumptions, systematically covaried with cue validity (Fig. 2A, bottom right), indicating that the graded variation of bias with cue validity was not due to m-ADC model assumptions.
Nevertheless, we sought to verify these trends with a "similarity choice model" (42) [...]

To summarize, sensitivity enhancements were strongest at the cued location and not significantly different across uncued locations; bias modulations were more graded across locations and varied systematically with endogenous cue validity. This result suggests that the "spotlight" model of attention applies primarily to sensitivity control, rather than to bias control, mechanisms. Moreover, modulations of sensitivity and bias by endogenous cueing were uncorrelated across subjects. These results overwhelmingly favor the hypothesis that sensitivity and bias are mediated by dissociable neural mechanisms during endogenous attention, favoring the "disjoint mechanisms" model (Fig. 3A, right).

Covariation of sensitivity and bias with decision dynamics
Attention produces systematic effects on decision dynamics, manifesting as faster response times (RT) when making perceptual decisions about stimuli at attended locations. To further disambiguate the "common" from the "disjoint" mechanisms models we asked if observers' RT correlated with sensitivity, with bias or with both. According to the "common mechanism" model, both sensitivity and bias would contribute in a closely similar manner towards determining behavioral response times. On the other hand, according to the "disjoint mechanisms" model sensitivity and bias would contribute independently to determining response times. In these analyses, we report correlations based on the likelihood ratio measure of bias (b LR ), as both measures of bias (b CC and b LR ) showed very similar trends. First, we examined response times for all (change) responses made to each location (including both correct and incorrect responses). RT (normalized to cued location; Methods) varied in a graded fashion with cue validity; the lowest RTs occurred for changes at the cued location, followed by the opposite, and then, adjacent, locations (p<0.05, corrected for multiple comparisons; Fig.4A). Similar graded trends were observed when the data were analyzed separately based on hits or false alarms ( Fig. 4B -C). This graded variation with cue validity suggested a close relationship between RT and bias.
Next, we quantified the relationship between RT, sensitivity and bias, considering only correct responses (hits). RT (normalized) and bias were robustly negatively correlated (Fig. 4D, right; ρ = -0.29, p=0.006), whereas, although RT showed a decreasing trend with d', their correlation was not significant (Fig. 4D, left; ρ = -0.17, p=0.10). To test if the correlation between RT and bias was rendered more robust by the latent covariation between RT and d', we performed a partial correlation analysis of RT versus bias controlling for the effect of d' (and vice versa). This analysis confirmed a significant negative partial correlation between RT and bias, controlling for d' (ρ p = -0.21, p=0.04); the partial correlation between RT and d', controlling for bias, was negative, but not significant (ρ p = -0.19, p=0.07). These results indicate a strong relationship between RT and bias such that shorter reaction times occurred for decisions at locations of greater bias.

Do these correlations with RT simply reflect a motoric response bias, rather than a cueing-induced choice bias, toward the cued location? According to the motoric bias hypothesis, subjects provided faster and more impulsive responses to cued locations, manifesting as higher false alarms and lower RTs at the cued location. The correlation of RT with bias was observed even when considering only correct response trials, arguing against this alternative hypothesis. Nevertheless, to rule out the hypothesis of motoric response bias we conducted the following analysis: We correlated false alarm rates in each of the six experimental blocks with the mean reaction time on false alarm trials in that block; this analysis was done separately for the cued and uncued locations. A significant correlation would indicate that faster responses (due to a motor bias) produced more false alarms and, correspondingly, a higher bias.
Contrary to this hypothesis, we found no significant correlation between RT and false alarm rates across blocks at any location, including at the cued location (cued: ρ = 0.05, p=0.58; opp: ρ = 0.13, p=0.18; adj: ρ = -0.15, p=0.10). These results were confirmed with an ANOVA with RT as the response variable, false alarm rates as continuous predictors, and subjects as random factors. These results indicate that decisional bias induced by endogenous cue validity, rather than a motor bias, explained the systematic pattern of RT correlations.
Thus, attention modulated RT and bias in a closely similar manner. Moreover, bias variations far outperformed sensitivity variations in explaining the variation in RT across cued and uncued locations. Taken together, these results suggest that bias, rather than sensitivity, modulations determine attention's response time effects, lending further support to the "disjoint mechanisms" model.
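The partial correlation analysis reported in this section can be illustrated with the first-order partial correlation formula applied to ranks. This is a minimal sketch, assuming a rank-based (Spearman-type) correlation; the correlation type and any significance testing used in the actual analysis are specified in the Methods, and the function names and toy data below are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Rank correlation between x and y, controlling for z."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy values (not data from this study): RT, bias and d' for five cases.
rt = [1.0, 2.0, 3.0, 4.0, 5.0]
bias = [5.0, 4.0, 3.0, 2.0, 1.0]
d_prime = [2.0, 1.0, 4.0, 3.0, 5.0]
print(partial_spearman(rt, bias, d_prime))
```

Swapping the roles of `bias` and `d_prime` in the last call gives the complementary analysis (RT versus d', controlling for bias).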

Sensitivity, bias and optimal decision metrics
Previous studies have hypothesized that attention enables optimal decision-making at the attended location (43). We evaluated this hypothesis using the m-ADC model, and tested if subjects' decisional optimality would vary with sensitivity, with bias or with both. First, we measured subjects' observed cost ratio (β obs ), defined as the ratio of the prior odds ratio to the bias at each location (SI Methods, equation 12), and compared it to the optimal cost ratio (see Methods for details). The m-ADC model decision rule is based on minimizing Bayesian risk (or maximizing utility) and, in this framework, β opt is equal to the ratio of the cost of correct rejections versus false alarms to the cost of hits versus misses (β opt = (C CR -C FA )/ (C Hit -C Miss ); SI Methods, equation 8). Therefore, subjects with a goal of maximizing successes and minimizing errors, without assigning differential costs to different types of errors or successes, should have assumed an optimal cost ratio of unity (β opt =1) at all locations (Fig. 5A, dashed horizontal line). Under these assumptions the bias at each location should match the prior odds ratio at that location. Yet, the vast majority of subjects exhibited systematic deviations from this optimum (Fig. 5A): the observed cost ratio was significantly greater than 1 at the cued location, and was significantly less than 1 at uncued locations (p<0.001, signed rank test; Fig. 5A), across subjects. Because the cost ratio is inversely related to bias (SI Methods, equation 12), this implies that subjects adopted a lower bias than optimal at the cued location and a higher bias than optimal at uncued locations.
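The arithmetic behind this comparison can be sketched as follows. Two assumptions are made here for illustration and are not confirmed by the text: the prior odds ratio at a location is taken as P(change at that location) / P(no change), and the reported cue validities of 66.7%, 16.7% and 8.3% are treated as the fractions 2/3, 1/6 and 1/12. The exact definitions are given in the SI Methods (equations 8 and 12). The function names and the bias value are hypothetical.

```python
P_CHANGE = 0.75  # proportion of "change" trials, as stated for the task


def prior_odds(validity, p_change=P_CHANGE):
    """Assumed prior odds of a change at a location versus no change."""
    return (p_change * validity) / (1.0 - p_change)


def observed_cost_ratio(odds, bias):
    """Observed cost ratio: prior odds divided by the bias at a location."""
    return odds / bias


# Prior odds implied by the stated cue validities (assumed fractions).
for label, validity in [("66.7%", 2 / 3), ("16.7%", 1 / 6), ("8.3%", 1 / 12)]:
    print(f"validity {label}: prior odds = {prior_odds(validity):.2f}")

# With an optimal cost ratio of 1, bias should equal the prior odds.
cued_odds = prior_odds(2 / 3)
print(observed_cost_ratio(cued_odds, bias=cued_odds))

# A hypothetical bias below the prior odds gives a cost ratio above 1.
print(observed_cost_ratio(cued_odds, bias=1.5))
```

Under these assumptions, a cost ratio above 1 at the cued location corresponds to a bias that falls short of the prior odds, which matches the direction of the deviation described above.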
We investigated the reason for this systematic pattern of sub-optimalities. A first, potential scenario is that subjects' observed cost ratio deviated from the optimal ratio because they perceived a prior odds ratio at each location that differed systematically from the actual prior odds ratio. This could have happened because subjects failed to detect some proportion of changes, especially when the change in orientation was small. However, this explanation is not tenable for the following reason: small orientation changes would be difficult to detect at both cued and uncued locations. Hence, the perceived prior ratio would have had to be lower than the actual ratio at all locations. While this scenario can account for the lower than optimal bias at the cued location (Fig. 5A, red bar), it cannot account for the higher than optimal bias at each of the uncued locations (Fig. 5A, green and blue bars). An alternative scenario is that subjects assumed different cost ratios (β-s) at the different locations, such that they judged errors arising from false alarms as more costly compared to errors arising from misses at the cued location (β>1), and vice versa at uncued locations (β<1).
Nevertheless, no rational explanation can be readily conceived of for subjects attributing different costs to false alarms and misses at the different locations.
A third, more plausible, scenario is that each subject sought to minimize Bayesian risk by adopting a "subjective" cost ratio, one that was uniform across all locations, but differed from 1 (β s opt ≠ 1). This ratio was subjectively optimal to each observer considering her/his own estimate of the relative cost of false alarms and misses. We determined this β s opt for each observer as the cost ratio that minimized the difference between the actual Bayesian risk and the optimal Bayesian risk (ΔR obs-opt ), parametrized by the ratio of false alarm to correct rejection costs (C FA /C CR ; Methods). With data pooled across subjects, β s opt occurred at a value of 1.3, across the range of C FA /C CR values tested (Fig. 5B). We investigated how sensitivity and bias covaried with optimality metrics, in this, third, scenario.
First, we examined two metrics, which measured the deviation from optimality of each observer's overall performance: i) an objective sub-optimality index ([...]

Next, we quantified the deviation from optimality of performance at each location. We defined a locational sub-optimality index (SI L ), which quantifies how the cost ratio at each location (β obs ) deviated from each individual's optimal cost ratio (β s opt ). Locational sub-optimality indices were significantly lower at the cued location than at either uncued location (p<0.001; Fig. 5E), indicating that subjects' decisions were more optimal at the cued location compared to uncued locations. Moreover, the locational sub-optimality index revealed opposite patterns of covariation with bias and sensitivity at the cued location: the sub-optimality index was negatively correlated with bias (ρ=-0.5, p=0.006), and was positively correlated with sensitivity (ρ=0.76, p<0.001; Fig. 5F).
To summarize, subjects who exhibited a greater bias toward the cued location, and lower bias toward uncued locations, made more optimal decisions overall in this attention task ( Fig. 5C-D). Decisional optimality was highest at the cued location and showed opposite patterns of correlations with sensitivity and bias, as measured with the locational suboptimality index (Fig. 5E-F). This dissociation between sensitivity and bias in terms of decision optimality further substantiates the "disjoint mechanisms" model.

Discussion
We demonstrate that the perceptual (sensitivity) and decisional (bias) components of endogenous visuospatial attention are not under the control of a single, unitary mechanism, but operate through dissociable mechanisms. Bias modulated systematically with endogenous cue validity (prior odds), was uncorrelated with sensitivity modulations, and correlated strongly with key decisional metrics, including response times and decisional
Bashinski and Bacharach (1) were among the earliest to test the effects of spatial cueing of attention on sensitivity and bias, using a dot stimulus detection task with probabilistic cueing (80% valid cues). They reported a benefit in sensitivity on the cued side, with little corresponding cost on the uncued side. Surprisingly, they found that bias was not different between cued and uncued locations. In direct contradiction to these findings, Shaw (12,17) reported that, in luminance detection tasks, focusing attention produced bias (criterion) changes without concomitant changes of sensitivity. Along similar lines, Muller and Findlay (13) reported results similar to those of Shaw (12), showing that attention primarily produced changes of bias at the cued (relative to uncued) location in luminance detection tasks.
These contradictions can be readily explained by shortcomings in the psychophysical models used to analyze these three-alternative task designs. The analysis of Bashinski and Bacharach (1) suffered from at least two critical flaws: First, the authors analyzed a three-alternative detection task with two one-dimensional models and incorrectly grouped misidentification (mislocalization) responses with misses (SI Fig. S2). Second, they either pooled the false-alarm rates across cued and uncued locations, or partitioned the false-alarm rates according to an incorrect, ad hoc rule, a pitfall highlighted by other studies as well (3,13). Both these factors likely resulted in incorrect estimates of sensitivity and bias.
Similarly, Muller and Findlay employed a two-stage signal detection model, which was unable to take into account all 9 stimulus-response contingencies in their three-alternative task. Specifically, their estimates of d' were based only on hit and false-alarm contingencies; misses and misidentification responses were ignored. Nevertheless, misses and misidentification responses constitute a significant proportion of overall responses for each event type (Fig.1E; SI Fig. S2). In general, analyses that ignore any category of response can produce inaccurate estimates of sensitivities and biases (SI Fig. S2).
To overcome the pitfalls of analyzing a three-alternative design with one-dimensional SDT models, Hawkins et al (3) employed a post-hoc response probe paradigm (following a protocol previously adopted by Downing (33)). In this paradigm subjects were cued to attend to one of four locations, and following a brief presentation of the target a response probe appeared. Subjects had to indicate whether or not a target had appeared at the location of the response probe. Cue validity was determined as the proportion of trials in which the attentional cue matched the location of the response probe. Despite multiple potential stimulus locations, the introduction of a response probe rendered this a 2-AFC (Yes/No) design that is amenable to analysis with a one-dimensional signal detection model. With this model, the authors found that endogenous cueing (central cueing) of attention produced systematic changes in sensitivity at cued versus uncued locations. Surprisingly, endogenous cueing did not reliably produce changes of bias. In contrast, the m-ADC model revealed a graded variation of bias across cued and uncued locations, with the highest bias at the endogenously cued location (Fig. 2C). Which result is correct?
These contradictory results can be readily explained within the framework of Bayesian decision theory, based on the respective task designs. In the Hawkins et al paradigm, also adopted by later studies (21,51), subjects' decisions need be based only on sensory evidence at the response probe location, and there is no need to evaluate sensory evidence at any location (e.g. cued location) against evidence at any other location (e.g. uncued location).
Moreover, because bias was calculated based on the likelihood ratio corresponding to the yes-no criterion, and because the prior probability of target appearance at each probed response location was 50% (cued or uncued), the likelihood ratio measure of bias was close to 1 at all locations (e.g. SI Methods, equation 12), and not significantly different across locations. On the other hand, in the m-ADC paradigm with probabilistic cueing (Fig. 1A), subjects must directly compare sensory evidence at the cued and uncued locations when formulating their decision (e.g. SI Methods, equation 14). Thus, the cued m-ADC paradigm, unlike the post-hoc response probe paradigm, permits measuring competition for decisional bias, which is essential for measuring and quantifying a change of bias with cueing.
The need for an appropriate psychophysical model and task design is also highlighted by neuroscience studies that have sought to identify the neural correlates of sensitivity and bias modulation. For example, Chanes et al (38) tested the roles of different frequencies of brain oscillations with rhythmic TMS over the right FEF, and reported frequency-specific effects on sensitivity (at high-beta frequencies) and criterion (at gamma frequencies). They adopted a three-alternative task design in which subjects had to detect and report the location of a Gabor grating with one of three button presses ('left', 'right', 'neither'). Again, because of the lack of an appropriate psychophysical model for analyzing this three-alternative task, mislocalizations were entirely removed from the analysis as "error" responses, a pitfall that could lead to incorrect estimates of sensitivity and bias. Along similar lines, Luo and Maunsell (21) sought to determine if attentional modulation of neural activity in area V4 was correlated with sensitivity or bias changes while the monkey performed either a conventional three-alternative attention task (their Fig. 1A) or a delayed match-to-sample task. The authors concluded that neural activity signatures, including an increase in the firing rate of V4 neurons, were correlated with sensitivity (signal-to-noise ratio) changes, but not with criterion changes. As before, sensitivity and bias were quantified with a combination of one-dimensional models. Interestingly, Baruni et al (16) reported results in direct contradiction to these findings. These authors manipulated absolute and relative rewards across different stimulus locations, and reported that neural modulation of V4 activity did not reflect the action of signal-to-noise (sensitivity) mechanisms. Again, these contradictions could perhaps be reconciled if the m-ADC model were employed to analyze animals' behavior in these attention tasks.
After a variable delay (600 ms-2200 ms, drawn from an exponential distribution), the stimuli briefly disappeared (100 ms), and reappeared. Following reappearance, either one of the four stimuli had changed in orientation, or none had changed. The subject had to indicate the location of change, or indicate "no change", by pressing one of five buttons on the response box (configuration in SI Fig. S2).
We term trials in which a change in orientation occurred in one of the four patches as "change" trials, and trials in which no change in orientation occurred as "catch" or "no change" trials. 25% of all trials were no change trials, and the remaining 75% were change trials. We term the location toward which the cue was directed as the "cued" (C) location, the location diagonally opposite to the cued location, as the "opposite" (O) location, and two other locations as "adjacent-ipsilateral" (A-I) or "adjacent-contralateral" (A-C) locations, depending on whether they were in the same visual hemifield or opposite visual hemifield to the cued location (Fig 1 D-E). Changes occurred at the cued location on two-thirds of the change trials, at the opposite location on one-sixth of the change trials, and at each of the adjacent locations on one-twelfth of the change trials. Thus, the cue had a conditional validity of 67% on change trials, and an overall validity of 50%. The experiment was run in six blocks of 48 trials each (total, 288 trials per subject), with no feedback. In an orienting session prior to the experiment, subjects completed 96 trials (two blocks) with explicit feedback provided at the end of each trial about the location of the change and the correctness of their response. Data from these "training" blocks were not used for further analyses.
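The trial proportions above fix the prior probability of each of the five possible events. The following minimal sketch works through that arithmetic with exact fractions; the dictionary keys and the function name are illustrative labels and not part of the original task code.

```python
from fractions import Fraction

# Proportions stated in the task description.
P_CHANGE = Fraction(3, 4)  # 75% of all trials contain a change
P_LOCATION_GIVEN_CHANGE = {
    "cued": Fraction(2, 3),
    "opposite": Fraction(1, 6),
    "adjacent_ipsi": Fraction(1, 12),
    "adjacent_contra": Fraction(1, 12),
}


def event_priors():
    """Prior probability of each trial event: four change locations plus no change."""
    priors = {loc: P_CHANGE * p for loc, p in P_LOCATION_GIVEN_CHANGE.items()}
    priors["no_change"] = 1 - P_CHANGE
    return priors


priors = event_priors()
for event, p in priors.items():
    print(f"{event:16s} {p}")
```

The cued-location prior (3/4 × 2/3) is the overall cue validity quoted in the text, and the five priors sum to one.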
Ten out of the thirty subjects were tested on a version of the task that incorporated neutrally cued blocks. In this task, subjects were tested on a total of 8 experimental blocks (48 trials each; total 384 trials), with 4 blocks comprising predictively cued trials (as before) and the remaining 4 blocks comprising neutrally cued trials. On neutrally cued trials, the cue was made up of four directed line segments, each pointing toward one of the stimuli in each of the four quadrants (Fig. 1A), and changes were equally likely at all four locations. Subjects were informed by on-screen instructions before the beginning of each block as to whether it was a predictive cueing or neutral cueing block, and the order of blocks was counterbalanced and pseudorandomized across subjects. All other training and testing protocols remained the same as before.
Model. The goal of our psychophysical model was to dissociate sensory from decisional components of attention by analyzing each subject's 5x5 contingency table of responses obtained from the multi-alternative attention task (Fig. 1E). Signal detection theory provides a rigorous framework for distinguishing changes in the quality of perceptual representations (perceptual sensitivity) from changes in the relative weighting of sensory evidence at different locations during decisions (spatial choice bias). Sensitivity is quantified with the index of discriminability (d'), which measures the overlap of signal and noise distributions at each location (here, change versus no-change distributions), whereas bias is quantified with a decision threshold or criterion (c), which indexes the amount of evidence that must be available at a particular location before the subject decides that the event of interest (here, the change) occurred at that location. Conventional, one-dimensional SDT models do not suffice to quantify sensitivity and bias in multi-alternative attention tasks; the reasons are summarized in the Introduction (Fig. 1B-C), elaborated in the Discussion, and discussed in extensive detail elsewhere (41).
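For orientation, the conventional one-dimensional quantities can be sketched as follows. This is the textbook equal-variance Gaussian yes/no model, not the m-ADC model used in this study; the function name and the example rates are hypothetical, and the criterion is expressed relative to the midpoint between the noise and signal distributions, which is an assumed convention.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, fa_rate):
    """One-dimensional, equal-variance SDT estimates from a yes/no task.

    d' = z(H) - z(FA);  c = -(z(H) + z(FA)) / 2, where z is the inverse
    standard normal CDF. Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical rates, for illustration only.
d_prime, criterion = dprime_and_criterion(hit_rate=0.8, fa_rate=0.2)
print(f"d' = {d_prime:.3f}, c = {criterion:.3f}")
```

Because this calculation uses only hits and false alarms, it has no place for mislocalization responses, which is one reason it does not carry over to multi-alternative designs.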
We recently developed a multidimensional extension to SDT to quantify sensitivity and bias in attention tasks involving multiple alternative detection/change-detection (m-ADC tasks).
Here, we extend this previous model by deriving, de novo, an optimal decision rule for task designs that employ the method of constant stimuli. In these tasks, performance is measured by presenting stimuli at various, unpredictable strengths (e.g. contrast, or magnitude of orientation change) at each location, for example, attention tasks in which the full psychometric function is measured at both cued and uncued locations. The decision rule is constructed from optimal decision theory for minimizing Bayesian risk, or maximizing Bayesian utility, and the resultant optimal decision manifold is shown in Figure 1C. The detailed derivation of the model is presented in Supporting Information Section 1. We present an intuition for how the model estimates sensitivity and bias from the 5x5 contingency table.
In our task, the subject has to detect an orientation change that can occur at any one (or

Data analysis.
Contingency tables. Subjects' responses in the task were used to construct 5x5 stimulus-response contingency tables, with change locations relative to the cue on the rows and response locations relative to the cue on the columns; no change events and responses were represented in the last row and last column respectively. Thus, each contingency table comprised five categories of responses: hits, misses, false-alarms, mislocalizations/misidentifications and correct rejections (SI Figure S2).
were repeated by excluding this latter set of 6 subjects, who were tested on the more limited range of angles, and in all cases we obtained results similar to those reported in the main text. A single, combined psychometric curve was generated by pooling contingencies across all subjects and computing the above metrics for the pooled data (Fig. 2A, top). False alarm and correct rejection rates were calculated based, respectively, on subjects' incorrect and correct responses during no-change trials.
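The construction of the 5x5 table and the five response categories can be sketched as below. The event labels, function names and trial format are assumptions made for illustration; the original analysis code is not reproduced here.

```python
# Row/column order: four locations relative to the cue, then "no change" last.
EVENTS = ["cued", "opposite", "adjacent_ipsi", "adjacent_contra", "no_change"]


def build_contingency_table(trials):
    """Count responses in a 5x5 table (rows: stimulus event, columns: response).

    `trials` is an iterable of (stimulus_event, response) label pairs.
    """
    index = {name: i for i, name in enumerate(EVENTS)}
    table = [[0] * len(EVENTS) for _ in EVENTS]
    for stimulus, response in trials:
        table[index[stimulus]][index[response]] += 1
    return table


def classify(stimulus, response):
    """Label one trial with one of the five response categories."""
    if stimulus == "no_change":
        return "correct rejection" if response == "no_change" else "false alarm"
    if response == "no_change":
        return "miss"
    return "hit" if response == stimulus else "mislocalization"


# Hypothetical trials, for illustration only.
example_trials = [
    ("cued", "cued"),            # hit
    ("cued", "opposite"),        # mislocalization
    ("opposite", "no_change"),   # miss
    ("no_change", "cued"),       # false alarm
    ("no_change", "no_change"),  # correct rejection
]
table = build_contingency_table(example_trials)
for row_name, row in zip(EVENTS, table):
    print(f"{row_name:16s} {row}")
```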

Model fitting and prediction analyses.
To compute the psychophysical function (sensitivity as a function of orientation change angle), individual subjects' response contingencies were fitted with the m-ADC model described previously. We estimated sensitivities and criteria with maximum likelihood estimation (MLE), using a procedure described previously (41,53
We simplified the estimation further using the following approach: As the probability of change was identical across the adjacent-ipsilateral and adjacent-contralateral locations (Fig. 1B), we compared the parameters (sensitivities, criteria) estimated for these two
Psychophysical function fits. The psychophysical function was generated by fitting the sensitivity values across angles at each location with a three-parameter Naka-Rushton function, allowing asymptotic sensitivity (d max ) and orientation change value at half-max (Δθ 50 ) to vary as free parameters, while keeping the slope parameter (n) fixed at 2, a value determined from pilot fits to the data (7). As before, a single combined psychophysical curve was generated by pooling contingencies across all subjects and computing the above metrics for the pooled data (Fig. 2B, top). Goodness-of-fit of the model to the data was assessed using a randomization test based on the chi-squared statistic; the procedure is described in detail elsewhere (40). A small p-value (e.g. p<0.05) for the goodness-of-fit statistic indicates that the observations deviated significantly from the model.
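The Naka-Rushton form used for the psychophysical function fits can be sketched as follows, assuming the standard parametrization d(Δθ) = d max · Δθ^n / (Δθ^n + Δθ 50^n) with n fixed at 2. The brute-force grid search is only a stand-in for whatever optimizer was actually used, and all function names and numerical values are hypothetical.

```python
def naka_rushton(delta_theta, d_max, theta_50, n=2.0):
    """Sensitivity as a saturating function of orientation change angle."""
    num = delta_theta ** n
    return d_max * num / (num + theta_50 ** n)


def fit_grid(angles, dprimes, d_max_grid, theta_50_grid, n=2.0):
    """Least-squares fit of d_max and theta_50 by exhaustive grid search.

    The slope n is held fixed, as in the text. Returns (d_max, theta_50).
    """
    best = None
    for d_max in d_max_grid:
        for theta_50 in theta_50_grid:
            sse = sum(
                (naka_rushton(a, d_max, theta_50, n) - d) ** 2
                for a, d in zip(angles, dprimes)
            )
            if best is None or sse < best[0]:
                best = (sse, d_max, theta_50)
    return best[1], best[2]


# Hypothetical angles (degrees) and noiseless synthetic sensitivities.
angles = [2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
dprimes = [naka_rushton(a, d_max=3.0, theta_50=20.0) for a in angles]
print(fit_grid(angles, dprimes, [1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0]))
```

By construction, the function returns half of d max when the change angle equals Δθ 50.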
To test if there was covariation between sensitivity and bias within each subject's data, we adopted the following procedure: Two contingency tables were constructed for each subject, with responses drawn from the first half of her/his respective session (first one half of the trials) and the second half (last one half of trials). Psychophysical parameters were estimated from these two subsets of data, yielding two measures of each psychophysical parameter per subject. An n-way ANOVA was performed with each measure of bias (b CC , b LR ) as the response variable and sensitivity as a continuous predictor, with subjects as random effects.

SI -Methods
Eye-tracking. Subjects' gaze was binocularly tracked and the deviation in their gaze from the fixation cross was recorded and stored in degrees. Trials in which the eye-position deviated by more than 2 degrees from fixation either in the x- or in the y-direction, from the onset of the Gabors until the final response, were removed from further analysis. All of our subjects were South Asian and several exhibited dark pigmentation of the iris, rendering it nearly indistinguishable from the pupil. Hence, the contrast of the pupil (relative to the iris) was weak, and the tracker occasionally lost the location of the pupil; trials in which this occurred for more than 100 ms continuously were also excluded from the analysis. Finally, we excluded all subjects for whom the combined rejection rate (from eye deviation and lost tracking) was more than 30% (7/37 subjects
Figure 1G), and staircasing requires maximum likelihood estimation of sensitivity from a running contingency table in real time, a significant technical challenge. In addition, adopting the method of constant stimuli enabled us to generate the entire psychometric function, and provided sufficient variation in the sensitivity and criterion parameters to measure their correlations.

Model comparison analysis.
We compared the m-ADC model against other candidate models. Three models (m-ADC eq-d , m-ADC eq-c and m-ADC obl ) were modified from the m-ADC model but incorporated simpler or more complex assumptions. We also compared the m-ADC model with a similarity choice theory model. In each case, models were compared with the Akaike Information Criterion (AIC), which represents a trade-off between model complexity (the number of fitted model parameters) and goodness-of-fit (based on the log-likelihood function); a lower AIC score represents a better candidate model. Similar results were obtained when using a Bayesian Information Criterion (BIC).
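The two criteria follow their standard definitions, AIC = 2k - 2 ln L and BIC = k ln(N) - 2 ln L, where k is the number of fitted parameters, L the maximized likelihood and N the number of observations. A minimal sketch follows; the log-likelihoods and parameter counts in the example are hypothetical and are not values from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better candidate."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; penalizes parameters by ln(n_obs)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical comparison: a richer model must improve the log-likelihood
# enough to offset its larger parameter count.
candidates = {
    "fewer parameters": {"log_likelihood": -410.0, "n_params": 9},
    "more parameters": {"log_likelihood": -405.0, "n_params": 39},
}
for name, m in candidates.items():
    print(name, aic(m["log_likelihood"], m["n_params"]),
          round(bic(m["log_likelihood"], m["n_params"], n_obs=288), 1))
```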
i. Comparison with models assuming identical parameter values at uncued locations.
We compared the goodness of fit of our default model with two other models, one that
The model is based on factorization of the probability densities into sensitivity and bias components. Specifically, the joint probability densities are factored as: p ij = η ij β j / Σ k η ik β k , where p ij denotes the proportion of responses to location j when an event (e.g. change) occurred at location i, η ij is a symmetric measure of similarity between signals at two locations (i,j) (η ij = η ji ) and β j represents the choice bias for location j. These quantities are defined on a ratio scale such that η ii = 1, and β j is a choice bias relative to its value for the no change response (β o = 1). To compare the values of these parameters against the m-ADC model (Fig. S9D), we defined: i) a sensitivity parameter (d i ) for each location and angle as being inversely related to the respective similarity parameter η io on a logarithmic scale (d i =log(η io )); ii) a bias parameter (β j ) defined as the geometric mean of the bias parameter across all angles; and iii) a constant criterion parameter defined as the negative of the logarithm of the bias (cc i = -log(β i )). Thus, the choice theory model was fitted with a total of 39 parameters (6 unique η ij -s for each of the six angles, and 3 β j parameters). Parameter estimates were strongly correlated between the m-ADC and choice theory models (SI Fig. S8E). Nevertheless, the significant difference in the number of parameters between the models resulted in the m-ADC model far outperforming choice theory in model comparison analyses based on AIC (SI Fig. S8A).
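The choice-theory factorization p ij = η ij β j / Σ k η ik β k can be evaluated directly, as in the sketch below. The three-alternative example, the function name and all numerical values are hypothetical; they only illustrate the symmetry constraint (η ij = η ji , η ii = 1) and the normalization of each row.

```python
def choice_probabilities(eta, beta):
    """Response probabilities under a similarity choice model.

    eta[i][j]: symmetric similarity between events i and j (eta[i][i] == 1).
    beta[j]:   choice bias toward response j.
    Returns p[i][j] = eta[i][j] * beta[j] / sum_k(eta[i][k] * beta[k]).
    """
    probs = []
    for eta_row in eta:
        weights = [e * b for e, b in zip(eta_row, beta)]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


# Hypothetical 3-alternative example; the last entry plays the role of
# the "no change" response, whose bias is fixed at 1.
eta = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
beta = [2.0, 0.5, 1.0]
for row in choice_probabilities(eta, beta):
    print([round(p, 3) for p in row])
```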

Analysis of reaction times.
Reaction times were computed as the time from change onset to the time of response; no limit was placed on permitted response intervals, but subjects were asked to respond as quickly as possible. For each subject, trials in which RTs fell outside three standard deviations from the mean RT were considered outliers and excluded from further analysis. Each subject's mean RT at each location was normalized by dividing by the median RT at the cued location across the population, and correlated with psychophysical parameter estimates at the respective location. Partial correlations were performed to identify dependencies between RT and measures of bias while controlling for the effects of d', and vice versa. We also performed multilinear regression analysis with RT as the response variable and change angle, mean sensitivity and bias at each location as predictors; all predictors were scaled to zero mean and unit variance before the analysis.
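The outlier exclusion and normalization steps can be sketched as follows. The use of the sample standard deviation, the data layout and the function names are assumptions made for illustration; the text does not specify these details.

```python
from statistics import mean, median, stdev


def exclude_rt_outliers(rts, n_sd=3.0):
    """Keep RTs within n_sd standard deviations of one subject's mean RT."""
    mu = mean(rts)
    sd = stdev(rts)  # sample standard deviation (assumed)
    return [rt for rt in rts if abs(rt - mu) <= n_sd * sd]


def normalize_by_cued_median(mean_rts):
    """Divide each subject's mean RT per location by the population median
    of the cued-location mean RTs.

    `mean_rts` maps subject -> {location: mean RT}.
    """
    cued_median = median(locs["cued"] for locs in mean_rts.values())
    return {
        subject: {loc: rt / cued_median for loc, rt in locs.items()}
        for subject, locs in mean_rts.items()
    }


# Hypothetical RTs in seconds: nineteen typical trials and one very slow one.
rts = [0.5] * 19 + [5.0]
print(len(exclude_rt_outliers(rts)))

mean_rts = {
    "s1": {"cued": 0.4, "opposite": 0.6},
    "s2": {"cued": 0.5, "opposite": 0.7},
    "s3": {"cued": 0.6, "opposite": 0.9},
}
print(normalize_by_cued_median(mean_rts)["s1"])
```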
Regression coefficient magnitudes for sensitivity (β d' ) and bias (β b-LR ) were compared with a bootstrap test; as our hypothesis was that the regression coefficient for bias would be higher than that for d' because of the similar graded variation of bias and RT across locations, we constructed a null distribution of regression coefficient differences by shuffling the location labels for the d' and bias values randomly and independently across subjects. In order to discount the hypothesis of motor bias, robust correlations were computed between mean RTs and mean FA rates for each experimental block after subtracting the mean RT and FA values subject-wise to account for subject-specific effects. An n-way ANOVA was applied, as before, treating RT as the response variable with FA rates as continuous predictors, and subjects as random factors.
Analysis of optimal decisions. Optimal decisions, in the m-ADC model framework, seek to minimize Bayesian risk (equation 3). Under certain assumptions (e.g. C l k = C l m , m ≠ l or for a given stimulus event type, the costs of all types of response error are identical) the optimal decision surfaces comprise hyperplanes in the multidimensional decision space. Our model fitting analysis suggests that such a model provides an excellent fit to observers' behavior in this multialternative attention task (Fig. 1F, bottom insets), suggesting that observers' decision criteria were closely in line with this optimal family of decision surfaces (as defined in equations 12, 13 and 14).
One potential definition of optimality is when the subject seeks to maximize the number of correct responses and minimize the number of errors. In this case, the cost ratio at location j, β obs-j = (C 0 0 -C 0 j ) / (C j j -C j 0 ) = (C CR -C FA )/ (C Hit -C Miss ) = 1 (where C x y is the cost of responding to location x when the event occurred at location y). Yet, we noticed that β j , calculated as β j = (p j / p i )/b LR-j (equations 8, 10 and 11), systematically deviated from 1 across the population of subjects (Fig. 5A), suggesting that subjects did not assume an equal relative cost for the two kinds of error responses (false alarm vs. correct reject and misses vs. hits). We also noticed that β obs was different for the different locations, being predominantly greater than 1 at the cued location and less than 1 at uncued locations (Fig. 5A). Since there is only one type of correct rejection response, and since the model assumes that the cost for false alarms to each location is identical (SI Methods, assumption in equation 7), one possibility is that subjects assumed a different relative cost of hits to misses at the different locations (cued, opposite, adjacent), a somewhat implausible assumption; in the optimality analysis section of the Results we evaluate this, and other scenarios. Rather, we propose that β obs differed across locations because the subject assumed a single, common cost ratio (β s opt ) at all locations, but deviated from this (subjectively) optimal ratio at some locations where she/he did not consider it necessary to perform optimally. Note that the family of decision surfaces under the m-ADC model remains unchanged under these assumptions (SI Methods, equations 13 and 14).
To determine this subject-specific β s opt we tested different values of β and selected the one that minimized the deviation of the actual Bayesian risk from the expected Bayesian risk assuming a single β s opt at all locations (ΔRisk obs-opt ); again, this deviation is zero if the subject did not deviate from β opt at any location (ΔRisk obs-opt = Σ i Σ j C i j (p obs j i - p opt j i )). Following some algebra, it is clear that C i j for hits and misses (or misidentifications) is a function of β opt , C FA and C CR (see equations 6, 8 and 9, along with the following assumptions: a) C j j = C 0 0 /β opt ; b) C j 0 = C 0 j /β opt ; also C j i = C j 0 for all i ≠ j), whereas p obs j i is a function of β j and d i , and p opt j i is a function of β opt and d i .
For each subject, three indices of optimal performance were computed. Two indices quantified the optimality of overall performance: i) an objective sub-optimality index (SI O ), defined as the deviation of the subject's cost ratio from an objectively optimal cost ratio (β opt =1) for maximizing correct responses, and computed as the magnitude of the logarithm of the ratio of β s opt to β opt (SI O = |log(β s opt /β opt )|), and ii) a global sub-optimality index (SI G ), defined and computed as the magnitude of the deviation of the optimal Bayesian risk from the actual Bayesian risk (SI G = ΔRisk obs-opt ). A third, locational sub-optimality index was defined as the deviation of the cost ratio at each location from the subject's own optimal cost ratio, and computed for each location (SI L ) as the magnitude of the logarithm of the ratio of β obs (observed cost ratio) at that location to the subject's own β s opt (SI L = |log (β obs / β s opt )|). These sub-optimality indices were correlated with sensitivities, biases and their modulations using robust correlations.
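The two log-ratio indices defined above, SI O = |log(β s opt /β opt )| and SI L = |log(β obs /β s opt )|, can be sketched as follows. The natural logarithm is assumed, since the base is not stated, and the function names and the observed cost ratios in the example are hypothetical; only the value 1.3 is taken from the pooled β s opt reported in the Results.

```python
import math


def objective_suboptimality(beta_s_opt, beta_opt=1.0):
    """SI_O: distance of a subject's own optimal cost ratio from the
    objectively optimal ratio (1 when maximizing correct responses)."""
    return abs(math.log(beta_s_opt / beta_opt))


def locational_suboptimality(beta_obs, beta_s_opt):
    """SI_L: distance of the observed cost ratio at one location from the
    subject's own optimal cost ratio."""
    return abs(math.log(beta_obs / beta_s_opt))


beta_s_opt = 1.3  # pooled value reported in the Results
# Hypothetical observed cost ratios per location, for illustration only.
beta_obs = {"cued": 1.5, "opposite": 0.6, "adjacent": 0.4}

print(round(objective_suboptimality(beta_s_opt), 3))
for location, b in beta_obs.items():
    print(location, round(locational_suboptimality(b, beta_s_opt), 3))
```

Because both indices use the magnitude of a log ratio, a cost ratio that is twice the reference value and one that is half of it are scored as equally sub-optimal.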