Application of k-nearest neighbor method to time series data
↑ This question is a follow-up to the one linked above.
I plotted an anomaly score for the x-axis acceleration by applying the k-nearest neighbor method to time series data (acceleration data).
The anomalous part of the signal clearly produced a high anomaly score.
Two questions remain:
・ How large does the anomaly score have to be before a point is judged anomalous? In other words, how should the threshold be determined?
・ What kind of code should I write to evaluate how accurate the anomaly judgment is when compared with the original data?
I would appreciate it if someone could explain.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors

'''
Divide data into slice windows for each size
'''


def main():
    df = pd.read_csv("20191121.csv")
    # Remove extra data from DataFrame
    df = df.drop(['name', 'x_rad/s', 'y_rad/s', 'z_rad/s'], axis=1)
    df = df.set_index('time')

    # Visualize x, y, z-axis acceleration
    df.plot().legend(loc='upper left')

    # The first 2480 x-axis accelerations are used as training data,
    # and the next 2479 are used as test data.
    # df.iloc  --->53845130
    # df.iloc  --->53845150
    train_data = df.loc[:53845130, 'x_ags']
    test_data = df.loc[53845150:, 'x_ags'].reset_index(drop=True)

    # Window width
    width = 30
    # Number of neighbors k
    nk = 1

    # Create a set of vectors using the window width
    train = embed(train_data, width)
    test = embed(test_data, width)

    # Fit the nearest neighbor search on the training windows
    neigh = NearestNeighbors(n_neighbors=nk)
    neigh.fit(train)

    # Calculate distance; kneighbors returns (distances, indices)
    d, _ = neigh.kneighbors(test)

    # Distance normalization
    mx = np.max(d)
    d = d / mx

    # Training data
    plt.subplot(221)
    plt.plot(train_data, label='Training')
    plt.xlabel("Sample", fontsize=12)
    plt.ylabel("Amplitude", fontsize=12)
    plt.grid()
    leg = plt.legend(loc=1, fontsize=15)
    leg.get_frame().set_alpha(1)

    # Abnormality
    plt.subplot(222)
    plt.plot(d, label='d')
    plt.xlabel("Sample", fontsize=12)
    plt.ylabel("Amplitude", fontsize=12)
    plt.grid()
    leg = plt.legend(loc=1, fontsize=15)
    leg.get_frame().set_alpha(1)

    # Verification data
    plt.subplot(223)
    plt.plot(test_data, label='Test')
    plt.xlabel("Sample", fontsize=12)
    plt.ylabel("Amplitude", fontsize=12)
    plt.grid()
    leg = plt.legend(loc=1, fontsize=15)
    leg.get_frame().set_alpha(1)

    plt.show()


def embed(lst, dim):
    # Stack every window of length dim (reversed in time) as one row
    emb = np.empty((0, dim), float)
    for i in range(lst.size - dim + 1):
        tmp = np.array(lst[i:i + dim])[::-1].reshape((1, -1))
        emb = np.append(emb, tmp, axis=0)
    return emb


if __name__ == '__main__':
    main()
Answer # 1
You may have misunderstood the k-nearest neighbor method.
With k = 1, the k-nearest neighbor method means "adopt the label of the training sample that is closest to the query". How large that distance is plays no role in the decision.
Wikipedia: k-nearest neighbor method
"The k-nearest neighbor method when k = 1 is called the nearest neighbor method, and the class of the training example closest to it is adopted."
Note that k = 1 is not essential to this argument. Even if k > 1, the k closest training samples are selected regardless of their absolute distance, and the label is predicted by majority vote among them. Once k is fixed, prediction in the k-nearest neighbor method depends on the ordering of the distances, not on their absolute size.
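To illustrate the point about ordering versus absolute distance, here is a minimal sketch using scikit-learn's `KNeighborsClassifier`. The toy 1-D data and the variable names are made up for illustration and are not taken from the question.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 1-D training set (hypothetical): class 0 near 0, class 1 near 10.
X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

# One query close to the class-1 cluster, one extremely far from everything.
X_query = np.array([[11.5], [1000.0]])
pred = clf.predict(X_query)

# Distances to the 3 nearest training points differ by orders of magnitude...
dist, _ = clf.kneighbors(X_query)
mean_dist = dist.mean(axis=1)
print("mean neighbor distance:", mean_dist)

# ...yet both queries receive the same label, because only the identity
# of the k closest training points enters the majority vote.
print("predicted labels:", pred)
```

The far-away query is labeled exactly like the nearby one, which is why a classifier of this kind cannot, by itself, express "this point is too far from anything I have seen".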
Conversely, if your domain knowledge tells you that the judgment should change depending on the absolute distance, then the k-nearest neighbor method is not a suitable tool. Consider other techniques.
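One possible direction, sketched here under several assumptions, is distance-based outlier scoring: the nearest-neighbor distance itself is the score, a threshold is calibrated on held-out normal data, and the result is compared with labels. This is a different use of nearest neighbors from k-NN classification. The synthetic signal, the injected anomaly, the 0.99 quantile, and the `labels` array are all hypothetical stand-ins; real ground-truth labels would be needed to evaluate the actual acceleration data. The `embed` helper below is a vectorized stand-in for the one in the question (it does not reverse each window, which leaves Euclidean distances unchanged).

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
width = 30


def embed(series, dim):
    """Stack every sliding window of length `dim` as one row."""
    return np.lib.stride_tricks.sliding_window_view(series, dim).copy()


def noisy_sine(n):
    """Synthetic 'normal' signal (hypothetical stand-in for x_ags)."""
    return np.sin(0.1 * np.arange(n)) + 0.05 * rng.standard_normal(n)


# Normal data is split in two: one part to fit, one part to calibrate.
fit_series = noisy_sine(1000)
calib_series = noisy_sine(1000)

# Test data with an injected offset and per-sample labels (both made up).
test_series = noisy_sine(2000)
test_series[1000:1050] += 2.0
labels = np.zeros(test_series.size, dtype=int)
labels[1000:1050] = 1

nn = NearestNeighbors(n_neighbors=1).fit(embed(fit_series, width))

# Threshold: a high quantile of distances seen on held-out normal data.
calib_dist = nn.kneighbors(embed(calib_series, width))[0].ravel()
threshold = np.quantile(calib_dist, 0.99)

# Score the test windows and flag those above the threshold.
score = nn.kneighbors(embed(test_series, width))[0].ravel()
flagged = (score > threshold).astype(int)

# A window counts as anomalous if any sample inside it is labeled 1.
window_labels = embed(labels, width).max(axis=1)

precision = precision_score(window_labels, flagged)
recall = recall_score(window_labels, flagged)
print(f"threshold={threshold:.3f} precision={precision:.2f} recall={recall:.2f}")
```

The quantile controls the trade-off: a higher value gives fewer false alarms on normal data but may miss weaker anomalies, so it is worth checking precision and recall for a few candidate values rather than fixing one in advance.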