Dynamic rank factor model for text streams

Shaobo Han, Lin Du, Esther Salazar, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Scopus citations

Abstract

We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (i) discovering topic prevalence over time, and (ii) learning contemporary multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.
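The forward-filtering backward-sampling (FFBS) step named in the abstract can be illustrated on a much simpler model than the one in the paper. The following is a minimal sketch, assuming a scalar Gaussian local-level model (a random-walk state observed with noise) instead of the paper's multivariate sparse factor model with heavy-tailed innovations. The function name `ffbs_local_level` and the parameters `q`, `r`, `m0`, `c0` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def ffbs_local_level(y, q, r, m0=0.0, c0=1e6, rng=None):
    """Draw one state path from p(x_1..x_T | y_1..y_T) for the assumed model
         x_t = x_{t-1} + w_t,  w_t ~ N(0, q)   (state evolution)
         y_t = x_t + v_t,      v_t ~ N(0, r)   (observation)
       with prior x_0 ~ N(m0, c0). Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    T = len(y)
    if T == 0:
        return []

    # Forward filtering (Kalman filter): store the one-step-ahead moments
    # (a, R) and the filtered moments (m, C) at every time step.
    a, R, m, C = [0.0] * T, [0.0] * T, [0.0] * T, [0.0] * T
    m_prev, c_prev = m0, c0
    for t in range(T):
        a[t] = m_prev                  # predicted mean
        R[t] = c_prev + q              # predicted variance
        k = R[t] / (R[t] + r)          # Kalman gain
        m[t] = a[t] + k * (y[t] - a[t])
        C[t] = (1.0 - k) * R[t]
        m_prev, c_prev = m[t], C[t]

    # Backward sampling: draw x_T from its filtered distribution, then
    # each x_t conditional on the already sampled x_{t+1}.
    x = [0.0] * T
    x[T - 1] = rng.gauss(m[T - 1], math.sqrt(C[T - 1]))
    for t in range(T - 2, -1, -1):
        b = C[t] / R[t + 1]
        mean = m[t] + b * (x[t + 1] - a[t + 1])
        var = C[t] * (1.0 - b)
        x[t] = rng.gauss(mean, math.sqrt(var))
    return x
```

Within a Gibbs sampler, a draw like this would be alternated with updates of the remaining parameters given the sampled path; here `q` and `r` are treated as known to keep the sketch short.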
Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems
Publisher: Neural information processing systems foundation
Pages: 2663-2671
Number of pages: 9
State: Published - Jan 1 2014
Externally published: Yes
