Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization

Zhao Song, Ricardo Henao, David Carlson, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

4 Scopus citations

Abstract

Belief networks are commonly used generative models of data, but training and testing them requires expensive posterior estimation. Learning typically proceeds by posterior sampling, variational approximations, or recognition networks, combined with stochastic optimization. We propose using an online Monte Carlo expectation-maximization (MCEM) algorithm to learn the maximum a posteriori (MAP) estimator of the generative model or to optimize the variational lower bound of a recognition network. The E-step in this algorithm requires posterior samples, which existing learning schemes already generate. For the M-step, we augment the model with Pólya-Gamma (PG) random variables to obtain an analytic updating scheme. We show relationships to standard learning approaches by deriving stochastic gradient ascent in the MCEM framework. We apply the proposed methods to both binary and count data. Experimental results show that MCEM improves convergence speed and often improves hold-out performance over existing learning methods. Our approach is readily generalized to other recognition networks.
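The abstract gives no formulas, so the following is only an illustrative sketch of how Pólya-Gamma augmentation can make an M-step analytic, not the authors' algorithm. It assumes a single sigmoid unit with weights w, a Gaussian prior with precision lam, and a fixed batch of binary vectors H standing in for posterior samples of the hidden units; the batch (rather than online) update and all names (pg_mean, em_step, lam, true_w) are hypothetical. It relies on the standard result that a PG(b, psi) variable has mean b tanh(psi/2) / (2 psi), which turns the sigmoid log-likelihood into a quadratic in w.

```python
import math
import random


def pg_mean(psi, b=1.0):
    """Mean of a Polya-Gamma PG(b, psi) variable: b * tanh(psi/2) / (2*psi)."""
    if abs(psi) < 1e-8:
        return b / 4.0  # limit as psi -> 0
    return b * math.tanh(psi / 2.0) / (2.0 * psi)


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def log_posterior(w, H, v, lam):
    """Unnormalized log posterior: Bernoulli-sigmoid likelihood + Gaussian prior."""
    lp = -0.5 * lam * sum(wi * wi for wi in w)
    for h, y in zip(H, v):
        psi = sum(wi * hi for wi, hi in zip(w, h))
        lp += y * psi - (max(psi, 0.0) + math.log1p(math.exp(-abs(psi))))
    return lp


def em_step(w, H, v, lam):
    """One EM update: expected PG weights, then a closed-form quadratic solve."""
    d = len(w)
    A = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for h, y in zip(H, v):
        psi = sum(wi * hi for wi, hi in zip(w, h))
        omega = pg_mean(psi)  # expectation of the PG variable at the current w
        kappa = y - 0.5
        for i in range(d):
            rhs[i] += kappa * h[i]
            for j in range(d):
                A[i][j] += omega * h[i] * h[j]
    # Maximizer of sum_n (kappa_n*psi_n - omega_n*psi_n^2/2) - lam/2 * |w|^2
    return solve(A, rhs)


# Synthetic data: H plays the role of sampled hidden units (first entry is a bias).
random.seed(0)
true_w = [1.5, -2.0, 0.5]  # hypothetical ground-truth weights
H = [[1.0] + [float(random.random() < 0.5) for _ in range(2)] for _ in range(200)]
v = [int(random.random() < 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(true_w, h)))))
     for h in H]

w = [0.0, 0.0, 0.0]
trace = [log_posterior(w, H, v, lam=1.0)]
for _ in range(20):
    w = em_step(w, H, v, lam=1.0)
    trace.append(log_posterior(w, H, v, lam=1.0))
```

Each update here is a linear solve with no step size to tune, and the recorded log posterior in trace should not decrease from one iteration to the next, which is the usual EM guarantee for this kind of quadratic bound.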
Original language: English (US)
Title of host publication: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016
Publisher: PMLR
State: Published, Jan 1 2016
Externally published: Yes
