Efficient Estimation of Dynamic Density Functions with Applications in Data Streams

Abdulhakim Qahtan, Suojin Wang, Xiangliang Zhang

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Many applications, such as network monitoring, traffic management, and environmental studies, generate amounts of data too large to fit in computer memory. The data in such applications arrive continuously in the form of streams. The main challenges in mining data streams are the high speed and large volume of the arriving data. A typical solution is to learn a model that fits in memory. However, the underlying distribution of streaming data changes over time in unpredictable ways, so the learned model should be updated continuously and rely more heavily on the most recent data in the stream.

In this chapter, we present an online density estimator that builds a model called KDE-Track for characterizing the dynamic density of data streams. KDE-Track summarizes the distribution of a data stream by estimating the Probability Density Function (PDF) of the stream at a set of resampling points. KDE-Track is shown to be more accurate (as reflected by smaller error values) and more computationally efficient (as reflected by shorter running time) than existing density estimation techniques. We demonstrate the usefulness of KDE-Track in visualizing the dynamic density of data streams and in change detection.
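
The general idea of keeping a density estimate at a fixed set of points and letting recent data dominate can be illustrated with a minimal sketch. This is not the KDE-Track algorithm from the chapter, whose details are not given in the abstract. The class name `GridDensityTracker`, the evenly spaced grid, the Gaussian kernel, the fixed bandwidth, and the exponential forgetting rate are all assumptions chosen for illustration.

```python
import math
import random


class GridDensityTracker:
    """Online Gaussian-kernel density estimate stored at fixed grid points.

    Hypothetical illustration only; not the chapter's KDE-Track method.
    """

    def __init__(self, lo, hi, num_points=61, bandwidth=0.4, decay=0.01):
        step = (hi - lo) / (num_points - 1)
        self.grid = [lo + i * step for i in range(num_points)]
        self.density = [0.0] * num_points
        self.bandwidth = bandwidth  # assumed fixed; not tuned from the data
        self.decay = decay          # minimum weight given to each new sample
        self.count = 0

    @staticmethod
    def _kernel(u):
        # Standard Gaussian kernel.
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def update(self, x):
        """Fold one new sample into the estimate at every grid point."""
        self.count += 1
        # Running mean for the first samples, exponential forgetting after,
        # so that older samples gradually lose influence.
        rate = max(self.decay, 1.0 / self.count)
        h = self.bandwidth
        for i, g in enumerate(self.grid):
            k = self._kernel((g - x) / h) / h
            self.density[i] += rate * (k - self.density[i])

    def pdf(self, x):
        """Linearly interpolate between grid points; zero outside the grid."""
        grid = self.grid
        if x < grid[0] or x > grid[-1]:
            return 0.0
        step = grid[1] - grid[0]
        i = min(int((x - grid[0]) / step), len(grid) - 2)
        t = (x - grid[i]) / step
        return (1.0 - t) * self.density[i] + t * self.density[i + 1]


if __name__ == "__main__":
    random.seed(0)
    tracker = GridDensityTracker(-4.0, 8.0)
    for _ in range(2000):                 # stream centred at 0
        tracker.update(random.gauss(0.0, 1.0))
    print("before shift:", tracker.pdf(0.0), tracker.pdf(3.0))
    for _ in range(2000):                 # distribution shifts to 3
        tracker.update(random.gauss(3.0, 1.0))
    print("after shift: ", tracker.pdf(0.0), tracker.pdf(3.0))
```

In this sketch, memory use depends only on the number of grid points and not on the stream length. The forgetting rate controls how quickly the estimate follows a change in the underlying distribution.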
Original language: English (US)
Title of host publication: Learning from Data Streams in Evolving Environments
Publisher: Springer Nature
Pages: 247-278
Number of pages: 32
ISBN (Print): 9783319898025
State: Published - Jul 29 2018
