When objects undergo large pose change, illumination variation or partial occlusion, most existed visual tracking algorithms tend to drift away from targets and even fail in tracking them. To address this issue, in this study, the authors propose an online algorithm by combining multiple instance learning (MIL) and local sparse representation for tracking an object in a video system. The key idea in our method is to model the appearance of an object by local sparse codes that can be formed as training data for the MIL framework. First, local image patches of a target object are represented as sparse codes with an overcomplete dictionary, where the adaptive representation can be helpful in overcoming partial occlusion in object tracking. Then MIL learns the sparse codes by a classifier to discriminate the target from the background. Finally, results from the trained classifier are input into a particle filter framework to sequentially estimate the target state over time in visual tracking. In addition, to decrease the visual drift because of the accumulative errors when updating the dictionary and classifier, a two-step object tracking method combining a static MIL classifier with a dynamical MIL classifier is proposed. Experiments on some publicly available benchmarks of video sequences show that our proposed tracker is more robust and effective than others. © The Institution of Engineering and Technology 2013.
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition