Pattern Recognition in Time Series

Given a time series chart, I would like to detect patterns that look similar to this:

[image: example of the target pattern]

Using a sample time series, I would like to be able to detect patterns as marked here:

[image: sample time series with detected patterns marked]

What AI algorithm (machine learning methods are fine) do I need to use for this? Is there a library (in C/C++) that I can use?

+49
time-series pattern-recognition machine-learning
5 answers

Here is an example from a small project I did to segment ECG data.

[image: ECG trace with detected beat boundaries marked]

My approach was a "switching autoregressive HMM" (google it if you haven't heard of it), where each data point is predicted from the previous data point using a Bayesian regression model. I created 81 hidden states: a junk state to capture the data between beats, and 80 separate hidden states corresponding to different positions within the heartbeat pattern. The 80 pattern states were built directly from a subsampled single-beat pattern and had two transitions: a self-transition and a transition to the next state in the pattern. The final state in the pattern transitions either to itself or to the junk state.

I trained the model with Viterbi training, updating only the regression parameters.

The results were adequate in most cases. A similarly structured conditional random field would probably perform better, but training a CRF requires manually labeled segments in the dataset if you haven't already labeled the data.

Edit:

Here is some example Python code - it is not perfect, but it shows the general approach. It implements EM rather than Viterbi training, which may be a bit more stable. The ECG dataset is from http://www.cs.ucr.edu/~eamonn/discords/ECG_data.zip

import numpy as np
import scipy.linalg as lin
import matplotlib.pyplot as plt

# load the ECG data (whitespace-separated columns; we use the second column)
data = np.loadtxt('chfdb_chf01_275.txt').T
dK = 230
pattern = data[1, :dK]   # a single example beat, used to build the model
data = data[1, dK:]

def create_mats(dat):
    '''
    create
        A  - an initial transition matrix
        pA - pseudocounts for A
        w  - emission distribution regression weights
        K  - number of hidden states
    '''
    step = 5   # adjust this to change the granularity of the pattern
    eps = .1
    dat = dat[::step]
    K = len(dat) + 1
    A = np.zeros((K, K))
    A[0, 1] = 1.
    pA = np.zeros((K, K))
    pA[0, 1] = 1.
    for i in range(1, K - 1):
        A[i, i] = (step - 1. + eps) / (step + 2 * eps)
        A[i, i + 1] = (1. + eps) / (step + 2 * eps)
        pA[i, i] = 1.
        pA[i, i + 1] = 1.
    A[-1, -1] = (step - 1. + eps) / (step + 2 * eps)
    A[-1, 1] = (1. + eps) / (step + 2 * eps)
    pA[-1, -1] = 1.
    pA[-1, 1] = 1.

    # regression weights: one (slope, intercept) pair per state
    w = np.ones((K, 2), dtype=float)
    w[0, 1] = dat[0]
    w[1:-1, 1] = (dat[:-1] - dat[1:]) / step
    w[-1, 1] = (dat[0] - dat[-1]) / step
    return A, pA, w, K

# initialize stuff
A, pA, w, K = create_mats(pattern)

eta = 10.   # precision parameter for the autoregressive portion of the model
lam = .1    # precision parameter for the weights prior

N = 1            # number of sequences
M = 2            # number of dimensions - the second variable is for the bias term
T = len(data)    # length of sequences

x = np.ones((T + 1, M))   # sequence data (just one sequence)
x[0, 1] = 1
x[1:, 0] = data

e = np.zeros((T, K))   # emissions
v = np.zeros((T, K))   # residuals

# storage for the forward and backward recurrences
f = np.zeros((T + 1, K))
fls = np.zeros((T + 1))
f[0, 0] = 1
b = np.zeros((T + 1, K))
bls = np.zeros((T + 1))
b[-1, 1:] = 1. / (K - 1)

z = np.zeros((T + 1), dtype=int)   # hidden states

ex_k = np.zeros((T, K))    # expected hidden states
ex_kk = np.zeros((K, K))   # expected pairs of hidden states
nkk = np.zeros((K, K))

def fwd(xn):
    global f, e
    for t in range(T):
        f[t + 1, :] = np.dot(f[t, :], A) * e[t, :]
        sm = np.sum(f[t + 1, :])
        fls[t + 1] = fls[t] + np.log(sm)
        f[t + 1, :] /= sm
        assert f[t + 1, 0] == 0

def bck(xn):
    global b, e
    for t in range(T - 1, -1, -1):
        b[t, :] = np.dot(A, b[t + 1, :] * e[t, :])
        sm = np.sum(b[t, :])
        bls[t] = bls[t + 1] + np.log(sm)
        b[t, :] /= sm

def em_step(xn):
    global A, w, eta
    global f, b, e, v
    global ex_k, ex_kk, nkk

    x = xn[:-1]       # current data vectors
    y = xn[1:, :1]    # next data vectors, predicted from current

    # compute residuals: (T,K) <- (T,M) x (M,K), one prediction per state
    v = np.dot(x, w.T)
    v -= y
    e = np.exp(-eta / 2 * v**2, out=e)

    fwd(xn)
    bck(xn)

    # compute expected hidden states
    for t in range(len(e)):
        ex_k[t, :] = f[t + 1, :] * b[t + 1, :]
        ex_k[t, :] /= np.sum(ex_k[t, :])

    # compute expected pairs of hidden states
    for t in range(len(f) - 1):
        ex_kk = A * f[t, :][:, np.newaxis] * e[t, :] * b[t + 1, :]
        ex_kk /= np.sum(ex_kk)
        nkk += ex_kk

    # maximize with respect to the transition probabilities
    A = pA + nkk
    A /= np.sum(A, 1)[:, np.newaxis]

    # solve the weighted regression problem for the emission weights
    # (x and y are from above)
    for k in range(K):
        ex = ex_k[:, k][:, np.newaxis]
        dx = np.dot(x.T, ex * x)
        dy = np.dot(x.T, ex * y).ravel()
        w[k, :] = lin.solve(dx + lam * np.eye(x.shape[1]), dy)

    # return the log-probability of the sequence (computed by the forward algorithm)
    return fls[-1]

if __name__ == '__main__':
    # run the EM algorithm
    for i in range(20):
        print(em_step(x))

    # get rough boundaries by taking the maximum expected hidden state for each position
    r = np.arange(len(ex_k))[np.argmax(ex_k, 1) < 3]

    # plot
    plt.plot(range(T), x[1:, 0])
    yr = [np.min(x[:, 0]), np.max(x[:, 0])]
    for i in r:
        plt.plot([i, i], yr, '-r')
    plt.show()
+44

Why not use a simple matched filter? Or its general statistical counterpart, called cross-correlation. Given a known pattern x(t) and a noisy composite time series containing your pattern shifted to positions a, b, ..., z, such as y(t) = x(t-a) + x(t-b) + ... + x(t-z) + n(t), the cross-correlation function between x and y should give peaks at a, b, ..., z.
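
For example, a minimal NumPy sketch of this (the Gaussian-bump pattern, the offsets, and the noise level are synthetic, purely for illustration):

import numpy as np

rng = np.random.default_rng(0)

# a known single-beat pattern x(t): here, a synthetic Gaussian bump
t = np.linspace(-1, 1, 50)
x = np.exp(-20 * t**2)

# a composite series y(t) with the pattern at offsets a, b, c plus noise n(t)
offsets = [100, 340, 620]
y = 0.1 * rng.standard_normal(800)
for a in offsets:
    y[a:a + len(x)] += x

# cross-correlate; zero-mean both signals so flat regions don't score high
xc = np.correlate(y - y.mean(), x - x.mean(), mode='valid')

# peaks of the cross-correlation should sit at the true offsets
detected = []
thresh = 0.7 * xc.max()
i = 0
while i < len(xc):
    if xc[i] > thresh:
        j = i + np.argmax(xc[i:i + len(x)])  # local maximum within one pattern width
        detected.append(j)
        i = j + len(x)                       # skip past this match
    else:
        i += 1

print(detected)  # approximately [100, 340, 620]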

+4
Sep 29 '15 at 16:46

Weka is a powerful collection of machine-learning software, and it supports some time-series analysis tools, but I don't know enough about the field to recommend a best method. However, it is Java-based, and you can call Java code from C/C++ without much trouble.

Packages for manipulating time series are mostly aimed at the stock market. I suggested Cronos in the comments; I have no idea how to do pattern recognition with it beyond the obvious: any good model of a stretch of your series should be able to predict that, after small bumps at a certain distance from the last small bump, big bumps follow. That is, your series exhibits self-similarity, and the models used in Cronos are designed to model it.

If you don't mind C#, you should ask the folks at HCIL for a version of TimeSearcher2 - pattern recognition, in that system, means drawing what the pattern looks like and then checking whether your model is general enough to capture most instances with a low false-positive rate. It is probably the most user-friendly approach you will find; all the others require quite a background in statistics or pattern-recognition strategies.

+2

I'm not sure which package would work best for this. At some point in college, I did something similar, where I tried to automatically detect certain similar shapes on an x-y axis across a bunch of different graphs. You could do something like the following; a rough sketch of the feature extraction appears after the lists.

Class labels, for example:

  • no class
  • start of region
  • middle of region
  • end of region

Features:

  • the relative and absolute difference along the Y axis between each point and the surrounding points, in a window 11 points wide
  • features such as the difference from the window average
  • the relative difference between the point before and the point after
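
Here is a minimal sketch of this windowed-feature idea in Python with scikit-learn (the window width of 11 matches the feature above; the toy series, label scheme, and random-forest classifier are assumptions for illustration):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(series, width=11):
    """Per-point features: absolute and relative Y differences to the other
    points in a centered window, plus the difference from the window mean."""
    half = width // 2
    feats = []
    for i in range(half, len(series) - half):
        win = series[i - half:i + half + 1]
        center = series[i]
        diffs = win - center                   # absolute differences
        rel = diffs / (abs(center) + 1e-9)     # relative differences
        feats.append(np.concatenate([diffs, rel, [center - win.mean()]]))
    return np.array(feats)

# toy data; labels 0 = no class, 1 = start, 2 = middle, 3 = end of region
series = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.randn(500)
labels = np.zeros(500, dtype=int)   # in practice, these come from hand-marked regions

X = window_features(series)
y = labels[5:-5]                    # align labels with the windowed points

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
pred = clf.predict(X)               # per-point class predictions

Runs of predicted start/middle/end labels can then be grouped into detected regions.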
+2
Aug 02 '12 at 18:02

I would use deep learning, if that's an option for you. It's in Java: Deeplearning4j. I've been experimenting with LSTMs. I tried one hidden layer and two hidden layers to process time series.

return new NeuralNetConfiguration.Builder()
        .seed(HyperParameter.seed)
        .iterations(HyperParameter.nItr)
        .miniBatch(false)
        .learningRate(HyperParameter.learningRate)
        .biasInit(0)
        .weightInit(WeightInit.XAVIER)
        .momentum(HyperParameter.momentum)
        .optimizationAlgo(
                OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT // RMSE: ????
        )
        .regularization(true)
        .updater(Updater.RMSPROP) // NESTEROVS
        // .l2(0.001)
        .list()
        .layer(0, new GravesLSTM.Builder()
                .nIn(HyperParameter.numInputs)
                .nOut(HyperParameter.nHNodes_1)
                .activation("tanh")
                .build())
        .layer(1, new GravesLSTM.Builder()
                .nIn(HyperParameter.nHNodes_1)
                .nOut(HyperParameter.nHNodes_2)
                .dropOut(HyperParameter.dropOut)
                .activation("tanh")
                .build())
        .layer(2, new GravesLSTM.Builder()
                .nIn(HyperParameter.nHNodes_2)
                .nOut(HyperParameter.nHNodes_2)
                .dropOut(HyperParameter.dropOut)
                .activation("tanh")
                .build())
        .layer(3, // "identity" makes this a regression output layer
                new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nIn(HyperParameter.nHNodes_2)
                        .nOut(HyperParameter.numOutputs)
                        .activation("identity")
                        .build())
        .backpropType(BackpropType.TruncatedBPTT)
        .tBPTTBackwardLength(100)
        .pretrain(false)
        .backprop(true)
        .build();

I found a few things:

  • LSTMs or RNNs are very good at picking out patterns in time series.
  • I tried a single time series and a group of different time series. The patterns were picked out easily.
  • It also picks out patterns at more than one cadence. If there are patterns by week and by month, the network will learn both.
0
Jan 25 '17 at 16:30
