Efficient pairwise DTW calculation using numpy or cython

I am trying to calculate pairwise distances between several time series contained in a numpy array. See the code below:

print(type(sales))
print(sales.shape)

<class 'numpy.ndarray'>
(687, 157)

So sales contains 687 time series of length 157. I am using pdist to calculate the DTW distances between the time series.

import fastdtw
import scipy.spatial.distance as sd

def my_fastdtw(sales1, sales2):
    return fastdtw.fastdtw(sales1,sales2)[0]

distance_matrix = sd.pdist(sales, my_fastdtw)

--- EDIT: tried to do this without pdist()-----

distance_matrix = []
m = len(sales)    
for i in range(0, m - 1):
    for j in range(i + 1, m):
        # fastdtw returns (distance, path); keep only the distance
        distance_matrix.append(fastdtw.fastdtw(sales[i], sales[j])[0])

--- EDIT: parallelizing the inner loop -----

from joblib import Parallel, delayed
import multiprocessing
import fastdtw

num_cores = multiprocessing.cpu_count() - 1
N = 687

def my_fastdtw(sales1, sales2):
    return fastdtw.fastdtw(sales1,sales2)[0]

results = [[] for i in range(N)]
for i in range(0, N- 1):
    results[i] = Parallel(n_jobs=num_cores)(delayed(my_fastdtw) (sales[i],sales[j])  for j in range(i + 1, N) )

All methods are very slow. The parallel method takes about 12 minutes. Can anyone suggest an efficient way?

--- EDIT: following the steps given in the answer below ---

This is what the lib folder looks like:

VirtualBox:~/anaconda3/lib/python3.6/site-packages/fastdtw-0.3.2-py3.6-linux-x86_64.egg/fastdtw$ ls
_fastdtw.cpython-36m-x86_64-linux-gnu.so  fastdtw.py   __pycache__
_fastdtw.py                               __init__.py

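So the cython version of fastdtw is there. Even so, when I press CTRL-C during execution, the traceback shows the pure python version (fastdtw.py) being used: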

/home/vishal/anaconda3/lib/python3.6/site-packages/fastdtw/fastdtw.py in fastdtw(x, y, radius, dist)

/home/vishal/anaconda3/lib/python3.6/site-packages/fastdtw/fastdtw.py in __fastdtw(x, y, radius, dist)



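TL;DR

fastdtw appears to have been installed without its fast cpp implementation, so it falls back to the slow pure-python one.

You need to fix the fastdtw installation.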


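This does not look like a performance problem of fastdtw itself. More likely something is wrong with the fastdtw installation (?).

fastdtw is O(n) per comparison, so the whole computation is on the order of 10^9 operations, which should take seconds rather than minutes if it is done in C. So something is off here.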
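As a rough sanity check on the numbers given in the question (687 series, about 12 minutes for the parallel run), the sketch below counts the pairs and the average wall-clock time per pair. The variable names are illustrative only.

```python
# Figures taken from the question: 687 series, about 12 minutes in total.
n_series = 687
wall_clock_s = 12 * 60

# Number of unordered pairs (i, j) with i < j, as produced by pdist.
n_pairs = n_series * (n_series - 1) // 2

# Average wall-clock time per pair, in milliseconds.
per_pair_ms = wall_clock_s / n_pairs * 1000

print(n_pairs)                 # 235641
print(round(per_pair_ms, 2))   # 3.06
```

Roughly 3 ms of wall-clock time per pair, with several cores working in parallel, seems high for a compiled routine on series of length 157.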

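Looking at the fastdtw code, there are two implementations: a cython/cpp one, which is tried first, and a pure python one. If the cython version cannot be imported, the python version is used silently.

You can check this by interrupting the run with Ctrl+C: the traceback shows where the program currently is, and here it ends up in pure-python code. It is also worth looking into the lib folder of the installation to see whether only the pure-python version is there.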
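Another way to check which implementation is in use, without interrupting a run, is to inspect the imported function. This is a minimal sketch, assuming that compiled (C or Cython) callables are not instances of types.FunctionType. The helper names describe_callable and report are made up for illustration and are not part of fastdtw.

```python
import importlib
import types


def describe_callable(func):
    """Classify a callable as an ordinary Python function or something else.

    Assumption: functions from C extensions or Cython modules are not
    instances of types.FunctionType, so they are reported as 'compiled'.
    """
    return "pure-python" if isinstance(func, types.FunctionType) else "compiled"


def report(module_name="fastdtw", attr="fastdtw"):
    """Describe module_name.attr, or say that the module is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name} is not installed"
    func = getattr(module, attr)
    origin = getattr(func, "__module__", None)
    return f"{module_name}.{attr}: {describe_callable(func)} (defined in {origin})"


print(report())
```

If this reports pure-python, the fallback is probably what runs when fastdtw.fastdtw is called.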

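So the fast fastdtw build seems to have failed. Because the fallback is silent, nothing tells you that you ended up with the slow Python version.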

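What to do?

  • Get the source and build it by hand, e.g. git clone https://github.com/slaypni/fastdtw
  • In the fastdtw folder, run python setup.py build
  • Watch out for errors.

In my case the build could not find the header numpy/npy_math.h.

  1. Fix the error.

In my case that meant the following changes in setup.py: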

import numpy # THIS ADDED
extensions = [Extension(
        'fastdtw._fastdtw',
        [os.path.join('fastdtw', '_fastdtw' + ext)],
        language="c++",
        include_dirs=[numpy.get_include()], # AND ADDED numpy.get_include()
        libraries=["stdc++"]
    )]
  2. Repeat the build and check for errors again.
  3. Run python setup.py install

Now your program should be about 100 times faster.


Compared with other libraries, fastdtw is slow:

import numpy as np
from cdtw import pydtw
from dtaidistance import dtw
from fastdtw import fastdtw
from scipy.spatial.distance import euclidean
s1=np.array([1,2,3,4],dtype=np.double)
s2=np.array([4,3,2,1],dtype=np.double)

%timeit dtw.distance_fast(s1, s2)
4.1 µs ± 28.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit d2 = pydtw.dtw(s1,s2,pydtw.Settings(step = 'p0sym', window = 'palival', param = 2.0, norm = False, compute_path = True)).get_dist()
45.6 µs ± 3.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit d3,_=fastdtw(s1, s2, dist=euclidean)
901 µs ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

fastdtw is 219 times slower than the dtaidistance lib and 20 times slower than cdtw.

Consider switching. dtaidistance is on GitHub:

https://github.com/wannesm/dtaidistance

To install:

pip install dtaidistance