Workshop on Information-Based Induction Sciences (IBIS 2007), Tokyo, Japan, November 5-7, 2007.

Anomaly Detection with Neighborhood Preservation Principle

Tsuyoshi Idé *1

Abstract: We consider the task of anomaly diagnosis from multiple sensor data. Our goal is to compute an anomaly score for each sensor under the condition that the sensor signals are highly dynamic and correlated, a setting where existing techniques are of limited use. We treat a set of sensor signals as a graph stream and decompose the graph into a set of neighborhood graphs. The anomaly score is then defined based on a novel notion of neighborhood preservation. Applying the method to a real-world sensor validation task, we demonstrate the utility of our approach.

Keywords: sensor data, anomaly detection, neighborhood graph

*1 IBM Research, Tokyo Research Laboratory, 1623-14 Shimo-Tsuruma, Yamato-shi, Kanagawa 242-8502, Japan. e-mail: goodidea@jp.ibm.com

1 Introduction

[Figure 1: (a) reference data and (b) target data, each an N x T collection of signals (N: number of signals, T: number of data points).]

We address the task of diagnosing anomalies from multivariate sensor data of the kind sketched in Figure 1: given a reference run known to be normal and a target run to be examined, we wish to assign an anomaly score to each individual signal. Anomaly detection from dynamic, graph- or matrix-structured data has been studied in [5, 9, 10, 11], and outlier and change-point detection from time series in [13, 12]. For the highly dynamic and correlated signals considered here, however, such existing techniques are of limited use.
The rest of this paper is organized as follows. Section 2 formulates the problem, Section 3 describes the proposed anomaly score based on neighborhood preservation, and Section 4 reports experimental results.

Among existing approaches, Papadimitriou et al. [8] track multiple time series with a PCA-based technique, and Idé and Inoue [4] detect changes using a representation related to MDS [1]. Both PCA and MDS summarize the data through its global structure, which is easily disturbed when the signals are highly dynamic. We instead focus on the local structure around each signal, namely its k nearest neighbors, in the spirit of stochastic neighbor models [3, 2].

2 Problem Setting

2.1 Dissimilarity matrix

Suppose we observe N signals over T time points, and let x_i^{(t)} denote the value of the i-th signal (i = 1, 2, ..., N) at time t (t = 1, 2, ..., T). From such data we compute a dissimilarity matrix D ∈ R^{N×N}, whose (i, j) element d_{i,j} quantifies how dissimilar signals i and j are. We require that (1) d_{i,j} ≥ 0, (2) d_{i,i} = 0, and (3) d_{i,j} = d_{j,i}. In this paper we use

    d_{i,j} = - ln a_{i,j},

where {a_{i,j}} are the absolute correlation coefficients

    a_{i,j} = |c_{i,j}| / sqrt(c_{i,i} c_{j,j}),
    c_{i,j} = (1/T) Σ_{t=1}^{T} [x_i^{(t)} - x̄_i][x_j^{(t)} - x̄_j],
    x̄_i = (1/T) Σ_{t=1}^{T} x_i^{(t)}.

If the signals are completely uncorrelated, a_{i,j} = δ_{i,j}, where δ_{i,j} is Kronecker's delta.

2.2 Neighborhood graphs

For each signal i, let N_i be the set of the k signals most similar to i, i.e. those with the smallest d_{i,j}, where k is much smaller than N. The subgraph consisting of node i and the nodes in N_i is the k-neighborhood graph of i, and our method works on the collection of these neighborhood graphs rather than on the whole graph.
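As a concrete illustration of Sections 2.1-2.2, the following NumPy sketch shows one way the dissimilarity matrix d_{i,j} = -ln a_{i,j} and the k-nearest-neighbor sets N_i could be computed. This is not the author's code; the function names (`dissimilarity_matrix`, `knn_sets`) and the small clipping constant are our own choices.

```python
import numpy as np

def dissimilarity_matrix(X):
    """X: (N, T) array of signals. Returns D with d_ij = -ln a_ij, where
    a_ij = |c_ij| / sqrt(c_ii * c_jj) is the absolute correlation coefficient."""
    N, T = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)      # subtract the mean of each signal
    C = Xc @ Xc.T / T                           # covariance c_ij
    s = np.sqrt(np.diag(C))
    A = np.abs(C) / np.outer(s, s)              # correlation magnitude a_ij (a_ii = 1)
    A = np.clip(A, 1e-12, 1.0)                  # guard against -ln(0) for uncorrelated pairs
    return -np.log(A)                           # d_ij >= 0, d_ii = 0, symmetric

def knn_sets(D, k):
    """Return a list whose i-th entry holds the indices of the k nearest
    neighbors of node i (smallest d_ij, excluding i itself)."""
    order = np.argsort(D, axis=1)
    return [row[row != i][:k] for i, row in enumerate(order)]
```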
[Figure 2: Overview of the proposed method. For both the reference run and the target run, the sensor signals are converted into a weight (dissimilarity) matrix and a k-NN graph, from which coupling probabilities are computed; the anomaly score is obtained by comparing the two.]

3 Anomaly Score Based on Neighborhood Preservation

Figure 2 summarizes the procedure. The key quantity is the coupling probability p(j|i), which expresses how strongly node i is attached to a node j in its k-neighborhood; the anomaly score of node i is defined by how much this attachment changes between the reference and target data. This formulation follows our related work [6].

3.1 Coupling probability

For each node i, we introduce a probability p(j|i) that node i couples with a node j in N_i, together with a self-coupling probability p(i|i), normalized as

    p(i|i) + Σ_{j∈N_i} p(j|i) = 1.    (1)

If node i is uniformly attached to its k neighbors, p(j|i) = (1/k)(1 - δ_{i,j}). To measure how concentrated the coupling is, define the entropy H_i ≡ - Σ_{j∈{i}∪N_i} p(j|i) ln p(j|i), which equals ln k for the uniform coupling above; e^{H_i} can thus be read as the effective number of neighbors of i, ranging from about 1 (coupling concentrated on a single node) to about k (uniform coupling). Writing ⟨d_i⟩ ≡ Σ_{j∈N_i} d_{i,j} p(j|i) for the expected dissimilarity of i to its neighborhood, we determine p(j|i) as the solution of

    min ⟨d_i⟩   s.t.   e^{H_i} = c and (1),    (2)

where c is a given constant. The solution of (2) is the Gibbs distribution

    p(j|i) = (1/Z_i) e^{-d_{i,j}/σ_i},    (3)

    Z_i ≡ 1 + Σ_{l∈N_i} e^{-d_{i,l}/σ_i},    (4)

where σ_i is fixed by the constraint e^{H_i} = c and thus adapts to the local scale of the dissimilarities d_{i,j}; the simplest choice is to set σ_i = 1 for all i. This coupling probability is closely related to the stochastic neighborhood of Hinton and Roweis [3]; the difference is that (3) retains the self-coupling term and restricts the distribution to the k-neighborhood N_i.
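To make Eqs. (1)-(4) concrete, here is a minimal sketch of how the coupling probabilities could be computed. The bisection search for σ_i is one plausible way to enforce e^{H_i} = c (the text does not prescribe a particular solver, and one may simply fix σ_i = 1 as noted above); the function name and numerical tolerances are our own.

```python
import numpy as np

def coupling_probabilities(D, neighbors, c, n_iter=50):
    """Compute p(j|i) = exp(-d_ij / sigma_i) / Z_i for j in N_i and
    p(i|i) = 1 / Z_i, with Z_i = 1 + sum_l exp(-d_il / sigma_i).
    sigma_i is found by bisection so that exp(H_i) is approximately c."""
    N = D.shape[0]
    P = np.zeros((N, N))
    for i in range(N):
        d = D[i, neighbors[i]]
        lo, hi = 1e-4, 1e4                       # bracket for sigma_i
        for _ in range(n_iter):
            sigma = 0.5 * (lo + hi)
            w = np.exp(-d / sigma)
            Z = 1.0 + w.sum()                    # the "1" is the self term exp(-d_ii / sigma)
            p = np.append(w, 1.0) / Z            # distribution over N_i and i itself
            H = -np.sum(p * np.log(p + 1e-300))  # entropy H_i (epsilon avoids log(0))
            if np.exp(H) > c:
                hi = sigma                       # coupling too diffuse: shrink sigma_i
            else:
                lo = sigma                       # coupling too concentrated: grow sigma_i
        P[i, neighbors[i]] = w / Z               # p(j|i) for j in N_i
        P[i, i] = 1.0 / Z                        # p(i|i)
    return P
```

With σ_i fixed to 1 instead, the bisection loop collapses to a single evaluation of w, Z, and the probabilities.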
3.2 Anomaly score

Let p(j|i) and p̄(j|i) be the coupling probabilities computed from the target and the reference data, respectively, and let N_i and N̄_i be the corresponding k-neighborhoods of node i. Define the tightness of node i with respect to its target-side neighborhood as

    e_i(N_i) ≡ Σ_{j∈N_i} p(j|i),    (5)

    ē_i(N_i) ≡ Σ_{j∈N_i} p̄(j|i),    (6)

and define e_i(N̄_i) and ē_i(N̄_i) analogously using N̄_i. Because of the self-coupling term in (1), these quantities are bounded as

    e_i, ē_i ≤ k/(k+1),    (7)

with equality when the coupling is uniform over the k neighbors. The anomaly score of node i is defined as the larger of the changes in tightness over the two neighborhoods,

    E_i ≡ max{ |e_i(N_i) − ē_i(N_i)|, |e_i(N̄_i) − ē_i(N̄_i)| },    (8)

which, by (7), takes values between 0 and k/(k+1). E_i is large when the attachment of node i to its own neighborhood is not preserved between the reference and the target data, hence the name neighborhood preservation.

3.3 Remarks

If node i has no similar signals at all, every d_{i,j} becomes large and the self-coupling probability p(i|i) approaches 1, so an isolated node does not produce spurious couplings and the score remains well defined. Regarding the parameters, σ_i in stochastic neighbor models [3, 2] is usually fixed through a perplexity-like constraint, and intrinsic-dimension estimators such as [14] may be used to guide the choice of k; in practice a small value such as k = 2 or 3 works well, and the sensitivity to k is examined in Section 4. The dominant computational costs are O(TN^2) for the correlation matrix and O(kN^2) for the k-neighborhood graphs and coupling probabilities, so the method scales to data with N on the order of 10^3 signals.

4 Experiments

We compare the proposed score with visualizations and scores based on PCA and MDS [8, 4] (see also [6]).

4.1 Data

MotorCurrent. This data set is taken from the UCR time-series archive [7]. It contains simulated electric-motor current signals recorded under a healthy condition and under a broken-bar fault condition. We used healthy signals as the reference data (N = 20, T = 1,500) and created the target data by replacing two of the signals, x_1 and x_2, with broken-bar signals. Figure 3 shows the reference and target signals.

Machine. This is a real data set from the sensor validation task mentioned in the abstract, consisting of N = 61 sensor signals for the reference and target runs.
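Before turning to the results, the quantities of Eqs. (5)-(8) and the overall pipeline of Figure 2 can be sketched as follows. This builds on the helper functions introduced above; the function names are ours, and the randomly generated arrays merely stand in for real reference and target runs (they are not the MotorCurrent or Machine data), with c = 2.0 a purely hypothetical choice.

```python
import numpy as np

def tightness(P, i, nbh):
    """e_i(N): total coupling probability from node i to the nodes in nbh."""
    return P[i, nbh].sum()

def anomaly_scores(P_tgt, P_ref, nbrs_tgt, nbrs_ref):
    """E_i = max(|e_i(N_i) - ebar_i(N_i)|, |e_i(Nbar_i) - ebar_i(Nbar_i)|), Eq. (8)."""
    N = P_tgt.shape[0]
    E = np.zeros(N)
    for i in range(N):
        d_tgt = abs(tightness(P_tgt, i, nbrs_tgt[i]) - tightness(P_ref, i, nbrs_tgt[i]))
        d_ref = abs(tightness(P_tgt, i, nbrs_ref[i]) - tightness(P_ref, i, nbrs_ref[i]))
        E[i] = max(d_tgt, d_ref)
    return E

# --- end-to-end pipeline on stand-in data -----------------------------------
rng = np.random.default_rng(0)
X_ref = rng.standard_normal((20, 1500))          # reference run, N = 20, T = 1500
X_tgt = rng.standard_normal((20, 1500))          # target run of the same shape

k = 3                                            # neighborhood size
D_ref, D_tgt = dissimilarity_matrix(X_ref), dissimilarity_matrix(X_tgt)
nbrs_ref, nbrs_tgt = knn_sets(D_ref, k), knn_sets(D_tgt, k)
P_ref = coupling_probabilities(D_ref, nbrs_ref, c=2.0)
P_tgt = coupling_probabilities(D_tgt, nbrs_tgt, c=2.0)

scores = anomaly_scores(P_tgt, P_ref, nbrs_tgt, nbrs_ref)
print(np.argsort(scores)[::-1][:5])              # the five highest-scoring signals
```

On real data, X_ref and X_tgt would be replaced with the (N, T) arrays of reference and target sensor signals.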
[Figure 3: MotorCurrent signals plotted against time: (a) reference data, (b) target data.]

[Figure 4: Sammon maps of the MotorCurrent data: (a) reference data, (b) target data; the replaced signals x_1 and x_2 are marked with '+'.]

[Figure 5: Sammon maps of the Machine data: (a) reference data, (b) target data.]

4.2 Results

To see how well global methods capture the anomaly, we first visualize the data with Sammon mapping [1], a nonlinear variant of MDS. Figures 4 and 5 show the Sammon maps of the MotorCurrent and the Machine data, respectively. Even for signals that were not changed at all, the maps of the reference and target data look substantially different, which indicates that approaches relying on the global structure of the data, such as the PCA-based method [8] and MDS-based methods such as [4], have difficulty localizing which signals are anomalous.

Figure 6 shows the anomaly scores computed by the proposed method for the MotorCurrent data with k = 3 and σ_i = 1. The replaced signals x_1 and x_2 receive clearly higher scores than the remaining signals. Figure 7 shows the scores for the Machine data computed with several values of k; the highly scored sensors are essentially the same across the different k, indicating that the result is not sensitive to the choice of k.
[Figure 6: Anomaly scores of the proposed method for the MotorCurrent data (signals x1-x20).]

[Figure 7: Anomaly scores for the Machine data computed with several values of k (horizontal axis: sensor index).]

References

[1] T. F. Cox and M. A. A. Cox. Multidimensional Scaling, 2nd ed. Chapman and Hall, 2001.

[2] J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov. Neighbourhood components analysis. In Advances in Neural Information Processing Systems, 17, pages 513-520, 2005.

[3] G. Hinton and S. T. Roweis. Stochastic neighbor embedding. In Advances in Neural Information Processing Systems, 15, pages 833-840, 2003.

[4] T. Idé and K. Inoue. Knowledge discovery from heterogeneous dynamic systems using change-point correlations. In Proc. SIAM Intl. Conf. Data Mining, pages 571-575, 2005.

[5] T. Idé and H. Kashima. Eigenspace-based anomaly detection in computer systems. In Proc. ACM SIGKDD Intl. Conf. Knowledge Discovery and Data Mining, pages 440-449, 2004.

[6] T. Idé, S. Papadimitriou, and M. Vlachos. Computing correlation anomaly scores using stochastic nearest neighbors. In Proc. IEEE Intl. Conf. Data Mining, 2007, to appear.

[7] E. Keogh and T. Folias. The UCR time series data mining archive [http://www.cs.ucr.edu/~eamonn/tsdma/index.html], 2002.

[8] S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming pattern discovery in multiple time-series. In Proc. Intl. Conf. Very Large Data Bases, pages 697-708, 2005.

[9] S. Papadimitriou, J. Sun, and P. Yu. Local correlation tracking in time series. In Proc. IEEE Intl. Conf. Data Mining, pages 456-465, 2006.

[10] J. Sun, H. Qu, D. Chakrabarti, and C. Faloutsos. Neighborhood formation and anomaly detection in bipartite graphs. In Proc. IEEE Intl. Conf. Data Mining, pages 418-425, 2005.

[11] J. Sun, Y. Xie, H. Zhang, and C. Faloutsos. Less is more: Compact matrix decomposition for large sparse graphs. In Proc. SIAM Intl. Conf. Data Mining, 2007.

[12] J. Takeuchi and K. Yamanishi. A unifying framework for detecting outliers and change points from time series. IEEE Trans. Knowledge and Data Engineering, 18(4):482-492, 2006.

[13] K. Yamanishi, J. Takeuchi, G. Williams, and P. Milne. On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms. Data Mining and Knowledge Discovery, 8(3):275-300, 2004.

[14] X. Yang, S. Michea, and H. Zha. Conical dimension as an intrinsic dimension estimator and its applications. In Proc. SIAM Intl. Conf. Data Mining, 2007.