Probabilistic topic models

David Blei, Lawrence Carin, David Dunson

Research output: Contribution to journal › Article › peer-review

163 Scopus citations

Abstract

In this article, we review probabilistic topic models: graphical models that can be used to summarize a large collection of documents with a smaller number of distributions over words. Those distributions are called "topics" because, when fit to data, they capture the salient themes that run through the collection. We describe both finite-dimensional parametric topic models and their Bayesian nonparametric counterparts, which are based on the hierarchical Dirichlet process (HDP). We discuss two extensions of topic models to time-series data: one that lets the topics slowly change over time and one that lets the assumed prevalence of the topics change. Finally, we illustrate the application of topic models to nontext data, summarizing some recent research results in image analysis. © 2010 IEEE.
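
As a rough illustration of what "distributions over words" means here, the sketch below simulates documents from a finite-dimensional topic model in the style of latent Dirichlet allocation. The abstract does not name a specific model, so the choice of LDA, the function names, the toy vocabulary, and the hyperparameters `alpha` and `eta` are all assumptions made for illustration, not the authors' implementation.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Return an index drawn with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(num_docs, doc_len, num_topics, vocab,
                    alpha=0.5, eta=0.5, seed=0):
    """Simulate a toy corpus (hypothetical helper, LDA-style assumptions)."""
    rng = random.Random(seed)
    # Each topic is a distribution over the whole vocabulary.
    topics = [sample_dirichlet([eta] * len(vocab), rng)
              for _ in range(num_topics)]
    docs = []
    for _ in range(num_docs):
        # Each document mixes the shared topics in its own proportions.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_len):
            z = sample_categorical(theta, rng)      # pick a topic for this word
            w = sample_categorical(topics[z], rng)  # pick a word from that topic
            words.append(vocab[w])
        docs.append(words)
    return topics, docs

# Toy vocabulary, chosen only for illustration.
vocab = ["gene", "dna", "cell", "pixel", "image", "edge"]
topics, docs = generate_corpus(num_docs=3, doc_len=8, num_topics=2, vocab=vocab)
```

Fitting a topic model runs this process in reverse: given only the documents, inference recovers plausible values for `topics` and for each document's mixing proportions.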
Original language: English (US)
Pages (from-to): 55-65
Number of pages: 11
Journal: IEEE Signal Processing Magazine
Volume: 27
Issue number: 6
DOIs
State: Published - Jan 1 2010
Externally published: Yes
