Despite recent advances in large-scale video analysis, action detection remains one of the most challenging unsolved problems in computer vision. This difficulty is due in part to the large volume of data that must be analyzed to detect actions in videos. Existing approaches have mitigated the computational cost, but they still lack the rich high-level semantics that would help them localize actions quickly. In this paper, we introduce a Semantic Cascade Context (SCC) model that aims to detect actions in long video sequences. By embracing semantic priors associated with human activities, SCC produces high-quality class-specific action proposals and prunes unrelated activities in a cascade fashion. Experimental results on ActivityNet show that SCC achieves state-of-the-art performance for action detection while operating in real time.
|Original language||English (US)|
|Title of host publication||2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Number of pages||10|
|State||Published - Nov 9 2017|