Pattern Recognition in Time Series

Given a time series chart, I would like to detect patterns that look similar to this:

[image: example of the target pattern]

Using a sample time series, I would like to be able to detect patterns as marked here:

[image: sample time series with detected patterns marked]

What AI algorithm (machine learning methods are fine) do I need to use for this? Is there a library (in C/C++) that I can use?

+49
time-series pattern-recognition machine-learning
5 answers

Here is an example from a small project I did to segment ECG data.

[image: ECG trace with detected beat boundaries marked]

My approach was a "switching autoregressive HMM" (google it if you haven't heard of it), where each data point is predicted from the previous data point using a Bayesian regression model. I created 81 hidden states: a junk state to capture the data between beats, and 80 separate hidden states corresponding to different positions within the heartbeat pattern. The 80 pattern states were built directly from a subsampled single-beat pattern and had two transitions: a self-transition and a transition to the next state in the pattern. The final state in the pattern transitions either to itself or to the junk state.

I trained the model with Viterbi training, updating only the regression parameters.

The results were adequate in most cases. A similarly structured conditional random field would probably perform better, but training a CRF requires manually labeled segments in the dataset if you haven't already labeled the data.

Edit:

Here is some example Python code - it is not perfect, but it shows the general approach. It implements EM rather than Viterbi training, which may be a bit more stable. The ECG dataset is from http://www.cs.ucr.edu/~eamonn/discords/ECG_data.zip

import numpy as np
import scipy.linalg as lin
import matplotlib.pyplot as plt

# load the ECG data (whitespace-separated columns; we use the second column)
data = np.loadtxt('chfdb_chf01_275.txt').T
dK = 230
pattern = data[1, :dK]   # a single example beat, used to build the model
data = data[1, dK:]

def create_mats(dat):
    '''
    create
        A  - an initial transition matrix
        pA - pseudocounts for A
        w  - emission distribution regression weights
        K  - number of hidden states
    '''
    step = 5   # adjust this to change the granularity of the pattern
    eps = .1
    dat = dat[::step]
    K = len(dat) + 1
    A = np.zeros((K, K))
    A[0, 1] = 1.
    pA = np.zeros((K, K))
    pA[0, 1] = 1.
    for i in range(1, K - 1):
        A[i, i] = (step - 1. + eps) / (step + 2 * eps)
        A[i, i + 1] = (1. + eps) / (step + 2 * eps)
        pA[i, i] = 1.
        pA[i, i + 1] = 1.
    A[-1, -1] = (step - 1. + eps) / (step + 2 * eps)
    A[-1, 1] = (1. + eps) / (step + 2 * eps)
    pA[-1, -1] = 1.
    pA[-1, 1] = 1.

    # regression weights: one (slope, intercept) pair per state
    w = np.ones((K, 2), dtype=float)
    w[0, 1] = dat[0]
    w[1:-1, 1] = (dat[:-1] - dat[1:]) / step
    w[-1, 1] = (dat[0] - dat[-1]) / step
    return A, pA, w, K

# initialize stuff
A, pA, w, K = create_mats(pattern)

eta = 10.   # precision parameter for the autoregressive portion of the model
lam = .1    # precision parameter for the weights prior

N = 1            # number of sequences
M = 2            # number of dimensions - the second variable is for the bias term
T = len(data)    # length of sequences

x = np.ones((T + 1, M))   # sequence data (just one sequence)
x[0, 1] = 1
x[1:, 0] = data

e = np.zeros((T, K))   # emissions
v = np.zeros((T, K))   # residuals

# storage for the forward and backward recurrences
f = np.zeros((T + 1, K))
fls = np.zeros((T + 1))
f[0, 0] = 1
b = np.zeros((T + 1, K))
bls = np.zeros((T + 1))
b[-1, 1:] = 1. / (K - 1)

z = np.zeros((T + 1), dtype=int)   # hidden states

ex_k = np.zeros((T, K))    # expected hidden states
ex_kk = np.zeros((K, K))   # expected pairs of hidden states
nkk = np.zeros((K, K))

def fwd(xn):
    global f, e
    for t in range(T):
        f[t + 1, :] = np.dot(f[t, :], A) * e[t, :]
        sm = np.sum(f[t + 1, :])
        fls[t + 1] = fls[t] + np.log(sm)
        f[t + 1, :] /= sm
        assert f[t + 1, 0] == 0

def bck(xn):
    global b, e
    for t in range(T - 1, -1, -1):
        b[t, :] = np.dot(A, b[t + 1, :] * e[t, :])
        sm = np.sum(b[t, :])
        bls[t] = bls[t + 1] + np.log(sm)
        b[t, :] /= sm

def em_step(xn):
    global A, w, eta
    global f, b, e, v
    global ex_k, ex_kk, nkk

    x = xn[:-1]       # current data vectors
    y = xn[1:, :1]    # next data vectors, predicted from current

    # compute residuals: (T,K) <- (T,M) x (M,K), one prediction per state
    v = np.dot(x, w.T)
    v -= y
    e = np.exp(-eta / 2 * v**2, out=e)

    fwd(xn)
    bck(xn)

    # compute expected hidden states
    for t in range(len(e)):
        ex_k[t, :] = f[t + 1, :] * b[t + 1, :]
        ex_k[t, :] /= np.sum(ex_k[t, :])

    # compute expected pairs of hidden states
    for t in range(len(f) - 1):
        ex_kk = A * f[t, :][:, np.newaxis] * e[t, :] * b[t + 1, :]
        ex_kk /= np.sum(ex_kk)
        nkk += ex_kk

    # maximize with respect to the transition probabilities
    A = pA + nkk
    A /= np.sum(A, 1)[:, np.newaxis]

    # solve the weighted regression problem for the emission weights
    # (x and y are from above)
    for k in range(K):
        ex = ex_k[:, k][:, np.newaxis]
        dx = np.dot(x.T, ex * x)
        dy = np.dot(x.T, ex * y).ravel()
        w[k, :] = lin.solve(dx + lam * np.eye(x.shape[1]), dy)

    # return the log-probability of the sequence (computed by the forward algorithm)
    return fls[-1]

if __name__ == '__main__':
    # run the EM algorithm
    for i in range(20):
        print(em_step(x))

    # get rough boundaries by taking the maximum expected hidden state for each position
    r = np.arange(len(ex_k))[np.argmax(ex_k, 1) < 3]

    # plot
    plt.plot(range(T), x[1:, 0])
    yr = [np.min(x[:, 0]), np.max(x[:, 0])]
    for i in r:
        plt.plot([i, i], yr, '-r')
    plt.show()
+44

Why not use a simple matched filter? Or its general statistical counterpart, called cross-correlation. Given a known pattern x(t) and a noisy composite time series containing your pattern shifted to positions a, b, ..., z, such as y(t) = x(t-a) + x(t-b) + ... + x(t-z) + n(t), the cross-correlation function between x and y should give peaks at a, b, ..., z.
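
For example, a minimal NumPy sketch of this (the Gaussian-bump pattern, the offsets, and the noise level are synthetic, purely for illustration):

import numpy as np

rng = np.random.default_rng(0)

# a known single-beat pattern x(t): here, a synthetic Gaussian bump
t = np.linspace(-1, 1, 50)
x = np.exp(-20 * t**2)

# a composite series y(t) with the pattern at offsets a, b, c plus noise n(t)
offsets = [100, 340, 620]
y = 0.1 * rng.standard_normal(800)
for a in offsets:
    y[a:a + len(x)] += x

# cross-correlate; zero-mean both signals so flat regions don't score high
xc = np.correlate(y - y.mean(), x - x.mean(), mode='valid')

# peaks of the cross-correlation should sit at the true offsets
detected = []
thresh = 0.7 * xc.max()
i = 0
while i < len(xc):
    if xc[i] > thresh:
        j = i + np.argmax(xc[i:i + len(x)])  # local maximum within one pattern width
        detected.append(j)
        i = j + len(x)                       # skip past this match
    else:
        i += 1

print(detected)  # approximately [100, 340, 620]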

+4
Sep 29 '15 at 16:46

Weka is a powerful collection of machine-learning software, and it supports some time-series analysis tools, but I don't know enough about the field to recommend a best method. However, it is Java-based, and you can call Java code from C/C++ without much trouble.

Packages for manipulating time series are mostly aimed at the stock market. I suggested Cronos in the comments; I have no idea how to do pattern recognition with it beyond the obvious: any good model of a stretch of your series should be able to predict that, after small bumps at a certain distance from the last small bump, big bumps follow. That is, your series exhibits self-similarity, and the models used in Cronos are designed to model it.

If you don't mind C#, you should ask the folks at HCIL for a version of TimeSearcher2 - pattern recognition, in that system, means drawing what the pattern looks like and then checking whether your model is general enough to capture most instances with a low false-positive rate. It is probably the most user-friendly approach you will find; all the others require quite a background in statistics or pattern-recognition strategies.

+2

I'm not sure which package would work best for this. At some point in college, I did something similar, where I tried to automatically detect certain similar shapes on an x-y axis across a bunch of different graphs. You could do something like the following; a rough sketch of the feature extraction appears after the lists.

Class labels, for example:

  • no class
  • start of region
  • middle of region
  • end of region

Features:

  • the relative and absolute difference along the Y axis between each point and the surrounding points, in a window 11 points wide
  • features such as the difference from the window average
  • the relative difference between the point before and the point after
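
Here is a minimal sketch of this windowed-feature idea in Python with scikit-learn (the window width of 11 matches the feature above; the toy series, label scheme, and random-forest classifier are assumptions for illustration):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(series, width=11):
    """Per-point features: absolute and relative Y differences to the other
    points in a centered window, plus the difference from the window mean."""
    half = width // 2
    feats = []
    for i in range(half, len(series) - half):
        win = series[i - half:i + half + 1]
        center = series[i]
        diffs = win - center                   # absolute differences
        rel = diffs / (abs(center) + 1e-9)     # relative differences
        feats.append(np.concatenate([diffs, rel, [center - win.mean()]]))
    return np.array(feats)

# toy data; labels 0 = no class, 1 = start, 2 = middle, 3 = end of region
series = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.randn(500)
labels = np.zeros(500, dtype=int)   # in practice, these come from hand-marked regions

X = window_features(series)
y = labels[5:-5]                    # align labels with the windowed points

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
pred = clf.predict(X)               # per-point class predictions

Runs of predicted start/middle/end labels can then be grouped into detected regions.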
+2
Aug 02 '12 at 18:02

I would use deep learning, if that's an option for you. It's in Java: Deeplearning4j. I've been experimenting with LSTMs. I tried one hidden layer and two hidden layers to process time series.

return new NeuralNetConfiguration.Builder()
        .seed(HyperParameter.seed)
        .iterations(HyperParameter.nItr)
        .miniBatch(false)
        .learningRate(HyperParameter.learningRate)
        .biasInit(0)
        .weightInit(WeightInit.XAVIER)
        .momentum(HyperParameter.momentum)
        .optimizationAlgo(
                OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT // RMSE: ????
        )
        .regularization(true)
        .updater(Updater.RMSPROP) // NESTEROVS
        // .l2(0.001)
        .list()
        .layer(0, new GravesLSTM.Builder()
                .nIn(HyperParameter.numInputs)
                .nOut(HyperParameter.nHNodes_1)
                .activation("tanh")
                .build())
        .layer(1, new GravesLSTM.Builder()
                .nIn(HyperParameter.nHNodes_1)
                .nOut(HyperParameter.nHNodes_2)
                .dropOut(HyperParameter.dropOut)
                .activation("tanh")
                .build())
        .layer(2, new GravesLSTM.Builder()
                .nIn(HyperParameter.nHNodes_2)
                .nOut(HyperParameter.nHNodes_2)
                .dropOut(HyperParameter.dropOut)
                .activation("tanh")
                .build())
        .layer(3, // "identity" makes this a regression output layer
                new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nIn(HyperParameter.nHNodes_2)
                        .nOut(HyperParameter.numOutputs)
                        .activation("identity")
                        .build())
        .backpropType(BackpropType.TruncatedBPTT)
        .tBPTTBackwardLength(100)
        .pretrain(false)
        .backprop(true)
        .build();

I found a few things:

  • LSTMs or RNNs are very good at picking out patterns in time series.
  • I tried a single time series and a group of different time series. The patterns were picked out easily.
  • It also picks out patterns at more than one cadence. If there are patterns by week and by month, the network will learn both.
0
Jan 25 '17 at 16:30
