A weakly supervised semantic segmentation (WSSS) method aims to learn a segmentation model from weak (image-level) as opposed to strong (pixel-level) labels. By avoiding the tedious pixel-level annotation process, it can exploit the unlimited supply of user-tagged images from media-sharing sites such as Flickr for large scale applications. However, these ‘free’ tags/labels are often noisy and few existing works address the problem of learning with both weak and noisy labels. In this work, we cast the WSSS problem into a label noise reduction problem. Specifically, after segmenting each image into a set of superpixels, the weak and potentially noisy image-level labels are propagated to the superpixel level resulting in highly noisy labels; the key to semantic segmentation is thus to identify and correct the superpixel noisy labels. To this end, a novel L1-optimisation based sparse learning model is formulated to directly and explicitly detect noisy labels. To solve the L1-optimisation problem, we further develop an efficient learning algorithm by introducing an intermediate labelling variable. Extensive experiments on three benchmark datasets show that our method yields state-of-the-art results given noise-free labels, whilst significantly outperforming the existing methods when the weak labels are also noisy.
|Original language||English (US)|
|Number of pages||15|
|Journal||IEEE Transactions on Pattern Analysis and Machine Intelligence|
|State||Published - Apr 9 2016|