• How do spatial and object attention interact during visual search?
• How does view-invariant object category learning occur?
• How do task-sensitive focal and multifocal attention coexist?
• Why does crowding influence recognition?
• How is useful-field-of-view regulated?
How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how “attentional shrouds” are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, though learning or momentary changes in volition, by the basal ganglia. A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects.