Topic modeling with nonparametric Markov tree

Haojun Chen, David B. Dunson, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

A new hierarchical tree-based topic model is developed, based on nonparametric Bayesian techniques. The model has two unique attributes: (i) a child node in the tree may have more than one parent, with the goal of eliminating redundant sub-topics deep in the tree; and (ii) parsimonious sub-topics are manifested, by removing redundant usage of words at multiple scales. The depth and width of the tree are unbounded within the prior, with a retrospective sampler employed to adaptively infer the appropriate tree size based upon the corpus under study. Excellent quantitative results are manifested on five standard data sets, and the inferred tree structure is also found to be highly interpretable. Copyright 2011 by the author(s)/owner(s).
Original language: English (US)
Title of host publication: Proceedings of the 28th International Conference on Machine Learning, ICML 2011
Pages: 377-384
Number of pages: 8
State: Published - Oct 7 2011
Externally published: Yes
