This paper addresses the challenges of detecting potential correlations between numerical data streams, which facilitates research on data stream mining and pattern discovery. We focus on local correlation with delay, which may occur in bursts at different times in different streams and last for a limited period. The uncertainty about when the correlation occurs and about the time delay makes it difficult to monitor the correlation online. Furthermore, the conventional correlation measure lacks the ability to reflect visual linearity, which is more desirable in practice. This paper proposes effective methods to continuously detect the correlation between data streams. Our approach is based on the Discrete Fourier Transform, which enables rapid cross-correlation calculation with time delay allowed. In addition, we introduce a shape-based similarity measure into the framework, which refines the results using representative trend patterns to enhance the significance of linearity. The similarity of the proposed linear representations can quickly estimate the correlation, and the segment-level window sliding strategy improves the efficiency of online detection. The empirical study demonstrates the accuracy of our detection approach, as well as an efficiency improvement of more than 30%. Copyright 2013 ACM.
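To make the idea of DFT-based cross-correlation with an allowed time delay concrete, the following is a minimal sketch and not the authors' algorithm. It assumes two equal-length windows of two streams, normalizes each window globally, and uses NumPy's FFT routines to score every lag in a bounded range at once. The function name `best_lag_fft` and the parameter `max_lag` are hypothetical, and the shape-based refinement and segment-level sliding described in the abstract are not covered.

```python
import numpy as np

def best_lag_fft(x, y, max_lag):
    """Return (lag, score) for the lag in [-max_lag, max_lag] that maximizes
    the normalized cross-correlation sum_t x[t] * y[t + lag].

    Assumptions (illustrative only): x and y have equal length n,
    neither is constant, and 0 <= max_lag < n.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Z-normalize both windows; dividing x by n makes the zero-lag value
    # equal to the Pearson correlation of the two windows.
    xz = (x - x.mean()) / (x.std() * n)
    yz = (y - y.mean()) / y.std()

    # Zero-pad to 2n so circular correlation equals linear correlation.
    size = 2 * n
    fx = np.fft.rfft(xz, n=size)
    fy = np.fft.rfft(yz, n=size)

    # Correlation theorem: cc[k] = sum_t xz[t] * yz[t + k] (indices mod size).
    cc = np.fft.irfft(np.conj(fx) * fy, n=size)

    # Negative lags are stored at the end of the circular result.
    lags = np.arange(-max_lag, max_lag + 1)
    vals = np.concatenate([cc[size - max_lag:], cc[:max_lag + 1]])

    i = int(np.argmax(vals))
    return int(lags[i]), float(vals[i])
```

A positive returned lag means `y` trails `x` by that many samples. Because all lags come from one pair of forward transforms and one inverse transform, the cost is O(n log n) per window rather than O(n · max_lag) for a direct scan. Note that this score divides by the full window length for every lag, so it slightly underestimates the correlation at large delays.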
|Original language||English (US)|
|Title of host publication||Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13|
|Publisher||Association for Computing Machinery (ACM)|
|Number of pages||10|
|State||Published - 2013|