Python - plot pattern search

Question

This graph is generated by the following gnuplot script. The estimated.csv file is located at this link: https://drive.google.com/open?id=0B2Iv8dfU4fTUaGRWMm9jWnBUbzg

    # ###### GNU Plot
    set style data lines
    set terminal postscript eps enhanced color "Times" 20
    set output "cubic33_cwd_estimated.eps"
    set title "Estimated signal"
    set style line 99 linetype 1 linecolor rgb "#999999" lw 2
    #set border 1 back ls 11
    set key right top
    set key box linestyle 50
    set key width -2
    set xrange [0:10]
    set key spacing 1.2
    #set nokey
    set grid xtics ytics mytics
    #set size 2
    #set size ratio 0.4
    #show timestamp
    set xlabel "Time [Seconds]"
    set ylabel "Segments"
    set style line 1 lc rgb "#ff0000" lt 1 pi 0 pt 4 lw 4 ps 0
    # Congestion control send window
    plot "estimated.csv" using ($1):2 with lines title "Estimated";

I want to find the pattern of the estimated signal in the previous graph so that it comes close to the next graph, which shows my ground truth (the actual signal).

Here is my initial approach:

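    #!/usr/bin/env python
    import sys

    import numpy as np
    from shapely.geometry import LineString

    #-------------------------------------------------------------------------------
    def load_data(fname):
        # Each CSV row is an (x, y) pair; the rows together form one polyline
        return LineString(np.genfromtxt(fname, delimiter = ','))
    #-------------------------------------------------------------------------------

    lines = list(map(load_data, sys.argv[1:]))

    # Print every point where the two curves cross
    intersection = lines[0].intersection(lines[1])
    # Multi-part results expose their parts via .geoms; a single geometry is wrapped in a list
    for g in getattr(intersection, 'geoms', [intersection]):
        if g.geom_type != 'Point':
            continue
        print('%f,%f' % (g.x, g.y))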

I then call this Python script directly from my gnuplot script, as shown below:

    set terminal pngcairo
    set output 'fig.png'
    set datafile separator comma
    set yr [0:700]
    set xr [0:10]
    set xtics 0,2,10
    set ytics 0,100,700
    set grid
    set xlabel "Time [seconds]"
    set ylabel "Segments"
    plot \
        'estimated.csv' w l lc rgb 'dark-blue' t 'Estimated', \
        'actual.csv' w l lc rgb 'green' t 'Actual', \
        '<python filter.py estimated.csv actual.csv' w p lc rgb 'red' ps 0.5 pt 7 t ''

This gives the following graph. However, it does not seem to give me the correct pattern, and gnuplot is not the best tool for this kind of task.

Is it possible to find the pattern of the first graph (estimated.csv) by forming its peaks into a plot using Python? Looking from the end of the signal, the pattern really does seem to be visible. Any help would be greatly appreciated.

python python-3.x numpy scipy time-series

Desta Haileselassie Hagos Jun 09 '17 at 13:17

1 answer

Franz · Accepted Answer · 2017-06-09 14:49

I think pandas.rolling_max() is the right approach (in newer pandas versions it is spelled Series.rolling(...).max()). We load the data into a DataFrame and calculate the rolling maximum over a window of 8500 values. After that, the curves look similar. You can tune the window length a bit to optimize the result.

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    plt.ion()

    names = ['actual.csv','estimated.csv']

    #-------------------------------------------------------------------------------
    def load_data(fname):
        return np.genfromtxt(fname, delimiter = ',')
    #-------------------------------------------------------------------------------

    data = [load_data(name) for name in names]
    actual_data = data[0]
    estimated_data = data[1]

    df = pd.read_csv('estimated.csv', names=('x','y'))
    # Rolling maximum over a window of 8500 samples
    # (older pandas versions: pd.rolling_max(df['y'], 8500))
    df['rolling_max'] = df['y'].rolling(8500).max()

    plt.figure()
    plt.plot(actual_data[:,0],actual_data[:,1], label='actual')
    plt.plot(estimated_data[:,0],estimated_data[:,1], label='estimated')
    plt.plot(df['x'], df['rolling_max'], label = 'rolling')

    plt.legend()
    plt.title('Actual vs. Interpolated')
    plt.xlim(0,10)
    plt.ylim(0,500)
    plt.xlabel('Time [Seconds]')
    plt.ylabel('Segments')
    plt.grid()
    plt.show(block=True)
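To see why a rolling maximum recovers the upper outline of a noisy signal, here is a minimal sketch on synthetic data. The sawtooth envelope, the noise model, and the window of 100 samples are assumptions made for illustration only; they are not taken from estimated.csv.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for estimated.csv (assumed shape: time column, value column)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 2000)
envelope = 100 + 50 * (x % 2.5)                    # slow sawtooth "pattern"
y = envelope * rng.uniform(0.0, 1.0, size=x.size)  # noisy samples at or below the envelope

df = pd.DataFrame({'x': x, 'y': y})

window = 100  # illustrative value; the answer uses 8500 for the real data
df['rolling_max'] = df['y'].rolling(window).max()

# The first window - 1 entries are NaN because the window is not yet full
n_missing = int(df['rolling_max'].isna().sum())
print(n_missing)

# Wherever it is defined, the rolling maximum lies on or above the raw samples
valid = df['rolling_max'].notna()
print((df.loc[valid, 'rolling_max'] >= df.loc[valid, 'y']).all())
```

A larger window gives a smoother outline but reacts more slowly to drops in the signal, which is why the window length is worth tuning.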

To answer the question from the comments:

Since rolling() computes its result over fixed-size windows of your data, the first values of rolling().max() will be NaN, because the window is not yet full there. To replace these NaNs, I suggest reversing the entire series and calculating the windows backwards. We can then replace all NaNs with the values from the backward calculation. I shortened the window length for the backward calculation (850 instead of 8500); otherwise we get erroneous data.
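The effect can be checked on a tiny series. This is only a sketch: the values and the window lengths 4 and 2 are made up for illustration and stand in for the 8500 and 850 used on the real data.

```python
import pandas as pd

s = pd.Series([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

window = 4       # forward window (illustrative)
back_window = 2  # shorter backward window (illustrative)

# Forward rolling maximum: the first window - 1 values are NaN
forward = s.rolling(window).max()

# Reverse the series, roll over it, then restore the original order.
# Each position now holds the maximum of itself and the values after it.
backward = s[::-1].rolling(back_window).max().sort_index()

# Fill the leading NaNs of the forward result with the backward values
filled = forward.fillna(backward)

print(forward.tolist())
print(filled.tolist())
```

Because fillna() aligns on the index, the backward values end up at the correct positions even though they were computed on the reversed series.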

This code works:

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    plt.ion()

    df = pd.read_csv('estimated.csv', names=('x','y'))
    # Forward rolling maximum; the leading values are NaN
    df['rolling_max'] = df['y'].rolling(8500).max()
    # Rolling maximum over the reversed series, with a shorter window
    df['rolling_max_backwards'] = df['y'][::-1].rolling(850).max()
    # Fill the leading NaNs with the backward values (aligned on the index)
    df['rolling_max'] = df['rolling_max'].fillna(df['rolling_max_backwards'])

    plt.figure()
    plt.plot(df['x'], df['rolling_max'], label = 'rolling')

    plt.legend()
    plt.title('Actual vs. Interpolated')
    plt.xlim(0,10)
    plt.ylim(0,700)
    plt.xlabel('Time [Seconds]')
    plt.ylabel('Segments')
    plt.grid()
    plt.show(block=True)

And we get the following result:
