Multiple Object Tracking (MOT) is a paradigm first developed by Pylyshyn and Storm in 1988. In a typical MOT trial, a set of identical objects first appears on the screen. A subset of them (normally half or fewer) then changes color to mark them as targets. After this, all of the objects return to the same color and start to move around the display. Participants need to keep track of the original targets during the 5-15 s movement period and identify them at the end of the trial.
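The trial structure described above can be sketched in a few lines of Python. This is only an illustrative toy, not the code used in any of the studies mentioned here: the function names (`simulate_mot_trial`, `proportion_correct`), the display size, the speed, the frame count, and the bounce-off-the-walls motion rule are all assumptions made for the example.

```python
import math
import random


def simulate_mot_trial(n_objects=8, n_targets=4, speed=2.0,
                       n_frames=300, size=400.0, seed=0):
    """Simulate one toy MOT trial (all parameters are hypothetical).

    Returns the final object positions and the set of target indices.
    """
    rng = random.Random(seed)
    # Identical objects start at random positions in a square display.
    positions = [[rng.uniform(0, size), rng.uniform(0, size)]
                 for _ in range(n_objects)]
    # Each object moves at constant speed in a random direction.
    angles = [rng.uniform(0, 2 * math.pi) for _ in range(n_objects)]
    velocities = [[speed * math.cos(a), speed * math.sin(a)] for a in angles]
    # A random subset is cued as targets before motion begins.
    targets = set(rng.sample(range(n_objects), n_targets))

    for _ in range(n_frames):
        for pos, vel in zip(positions, velocities):
            for axis in (0, 1):
                pos[axis] += vel[axis]
                # Reverse direction at the display edge (assumed motion rule).
                if pos[axis] < 0 or pos[axis] > size:
                    vel[axis] = -vel[axis]
                    pos[axis] = min(max(pos[axis], 0.0), size)
    return positions, targets


def proportion_correct(responses, targets):
    """Fraction of true targets that the observer selected at the end."""
    return len(set(responses) & set(targets)) / len(targets)


positions, targets = simulate_mot_trial()
print(proportion_correct(list(targets)[:3], targets))
```

With 8 objects and 4 targets, the example matches the "half or fewer" convention; selecting three of the four true targets at the end would be scored as 0.75 under this scoring rule.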
MOT is a great paradigm for demonstrating the existence of object-based attention, as well as a convenient tool for studying this construct. It has been shown that people can generally track 3 to 5 targets at the same time, and that factors such as object speed, tracking duration, and the minimum distance between objects can affect tracking performance, though their roles are debated (e.g. Pylyshyn, 2004; Alvarez & Franconeri, 2007; Bae & Flombaum, 2012). For me, the most interesting questions are why people can track only this many objects, and how different factors interact to affect tracking performance.
In one project, I study how factors at the initial target identification stage affect tracking performance. Previous studies in the number estimation literature suggested that people are uncertain about the number of objects present in a display. It is therefore possible that a failure to enumerate the targets also contributes to the limit on tracking. Our findings supported this hypothesis. When we did not constrain the number of responses participants could make, the number of objects they reported as targets was very noisy, and most of the errors were underestimates. Some of the difficulty in the MOT paradigm therefore arises before tracking even starts: the initial image processing stage is also critical for successful tracking.
In another project, I take a computational modeling approach to study the algorithm that people use to perform the tracking task. We tested multiple modified Kalman filter models on MOT trajectories with different target loads and speeds. Our first finding was that inherent limits of the human visual system, such as spatial and temporal resolution, are very important in determining tracking performance. We also found that a model that 'smartly' uses past velocity information to predict future movement performed almost the same as a model that does not predict at all. This result suggests that the marginal advantage of extrapolation, given the inherent limits of human observers, is very small. People are either not extrapolating, or the effect of extrapolation is very hard to observe. This finding helps reconcile many conflicting findings in the previous literature about whether humans extrapolate. We are currently using our model to simulate human performance with different types of fixation patterns. This will help us understand more about the algorithm and strategy people use to track multiple targets at the same time.
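To make the contrast between a predicting and a non-predicting tracker concrete, here is a minimal one-dimensional sketch of the two model classes. It is not the modified Kalman filter from the papers listed below: the function names, the noise parameters `q` and `r`, the initial covariances, and the use of the same process noise for position and velocity are all assumptions chosen for illustration, and the sketch does not model spatial or temporal resolution limits.

```python
def track_static(measurements, q=0.01, r=1.0):
    """1-D Kalman filter without extrapolation.

    The predicted position is simply the previous estimate.
    q and r are assumed process and measurement noise variances.
    """
    x, p = measurements[0], r
    estimates = []
    for z in measurements:
        p = p + q              # predict: position unchanged, uncertainty grows
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update with the new measurement
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


def track_constant_velocity(measurements, dt=1.0, q=0.01, r=1.0):
    """1-D Kalman filter that extrapolates with an estimated velocity.

    State is (position x, velocity v); only position is measured.
    The covariance is stored as its three distinct entries.
    """
    x, v = measurements[0], 0.0
    pxx, pxv, pvv = r, 0.0, 1.0   # assumed initial covariance
    estimates = []
    for z in measurements:
        # Predict: x <- x + dt * v, P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        x = x + dt * v
        pxx, pxv, pvv = (pxx + 2.0 * dt * pxv + dt * dt * pvv + q,
                         pxv + dt * pvv,
                         pvv + q)
        # Update: gain K = P H^T / (H P H^T + r) with H = [1, 0].
        s = pxx + r
        kx, kv = pxx / s, pxv / s
        innovation = z - x
        x = x + kx * innovation
        v = v + kv * innovation
        # P <- (I - K H) P
        pxx, pxv, pvv = ((1.0 - kx) * pxx,
                         (1.0 - kx) * pxv,
                         pvv - kv * pxv)
        estimates.append(x)
    return estimates


# Idealized example: an object moving one unit per frame, measured without noise.
path = [float(t) for t in range(20)]
static_error = abs(track_static(path)[-1] - path[-1])
cv_error = abs(track_constant_velocity(path)[-1] - path[-1])
print(static_error, cv_error)
```

On this idealized straight path the velocity-based filter lags less than the static one, as expected. The finding described above concerns how small that advantage becomes once the resolution limits of human observers and realistic MOT trajectories are taken into account, which this toy example does not attempt to reproduce.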
Relevant papers and presentations:
Ma, Z., & Flombaum, J. I. (2013). Off to a bad start: Uncertainty about the presence of targets at the onset of multiple object tracking. Journal of Experimental Psychology: Human Perception & Performance, 39, 1421-1432. doi: 10.1037/a0031353
Zhong, S., Ma, Z., Wilson, C., Liu, Y., & Flombaum, J. I. (2014). Why do people appear not to extrapolate trajectories during multiple object tracking? A computational investigation. Journal of Vision, 14(12): 12, 1-30. doi: 10.1167/14.12.12
Ma, Z., Zhong, S., Wilson, C., & Flombaum, J. I. (2015). Multiple object tracking explained with neither fixed nor flexible resources. VSS Annual Meeting, St. Pete Beach, Florida, USA
Zhong, S., Ma, Z., Wilson, C., & Flombaum, J. I. (2014). Kalman filter models of multiple-object tracking within an attentional window. VSS Annual Meeting, St. Pete Beach, Florida, USA
Flombaum, J. I., Zhong, S., Ma, Z., Wilson, C., & Liu, Y. (2013). What is the marginal advantage of extrapolation during multiple object tracking? Insights from a Kalman filter model. VSS Annual Meeting, Naples, FL, USA