Visual search for a spatiotemporal target occurs frequently in human experience---from military or aviation staff monitoring complex displays of multiple moving objects to daycare teachers monitoring a group of children on a playground for risky behaviors. In spatiotemporal search, unlike more traditional visual search tasks, the target cannot be identified from a single frame of visual experience; as the target is a spatiotemporal pattern that unfolds over time, detection of the target must also integrate information over time. We propose a new computational cognitive architecture used to model and understand human visual attention in the specific context of visual search for a spatiotemporal target. Results from a previous human participant study found that humans show interesting attentional capacity limitations in this type of search task. Our architecture, called the Spatiotemporal Template-based Search (STTS) architecture, solves the same search task from the study using a wide variety of parameterized models that each represent a different cognitive theory of visual attention from the psychological literature. We present results from initial computational experiments using STTS as a first step towards understanding the computational nature of attentional bottlenecks in this type of search task, and we discuss how continued STTS experiments will help determine which theoretical models best explain the capacity limitations shown by humans. We expect that results from this research will help refine the design of visual information displays to help human operators perform difficult, real-world monitoring tasks.