Prediction: predicting future events using the SVR module

I want to predict future events using the SVR module from scikit-learn. Here is my source code I'm trying to work with:

    import csv
    import numpy as np
    from sklearn.svm import SVR
    import matplotlib.pyplot as plt

    plt.switch_backend('newbackend')

    seq_num = []
    win = []

    def get_data(filename):
        with open(filename, 'r') as csvfile:
            csvFileReader = csv.reader(csvfile)
            next(csvFileReader)  # skipping column names
            for row in csvFileReader:
                seq_num.append(int(row[0]))
                win.append(int(row[6]))
        return

    def predict_win(X, y, x):
        win = np.reshape(X, (len(X), 1))
        svr_lin = SVR(kernel='linear', C=1e3)
        svr_poly = SVR(kernel='poly', C=1e3, degree=2)
        svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
        svr_rbf.fit(X, y)
        svr_lin.fit(X, y)
        svr_poly.fit(X, y)
        plt.scatter(X, y, color='black', label='Data')
        plt.plot(y, svr_rbf.predict(X), color='red', label='RBF model')
        plt.plot(y, svr_lin.predict(X), color='green', label='Linear model')
        plt.plot(y, svr_poly.predict(X), color='blue', label='Polynomial model')
        plt.xlabel('X, other features')
        plt.ylabel('win')
        plt.title('Support Vector Regression')
        plt.legend()
        plt.show()
        return svr_rbf.predict(x)[0], svr_lin.predict(x)[0], svr_poly.predict(x)[0]

    get_data('net_data.csv')
    predicted_win = predict_win(X, y, 29)

My dataset is very large, so only part of my CSV data is included at the end. I am interested in the 7th column. I want to predict when the values in the 7th column increase and when they decrease. Is it possible to look only at the 7th column and make a time-series prediction? Any help on this would be greatly appreciated. Thanks!

    0.007804347,10.0.0.11:49438,10.0.12.12:5001,32,3796291040,3796277984,10,2147483647,28960,3034,29312
    0.007856739,10.0.0.11:49438,10.0.12.12:5001,32,3796293936,3796278008,11,2147483647,29056,2999,29312
    0.010605189,10.0.0.11:49438,10.0.12.12:5001,32,3796320000,3796291040,20,2147483647,55040,2969,29312
    0.010850907,10.0.0.11:49438,10.0.12.12:5001,32,3796348960,3796305520,30,2147483647,84096,2946,29312
    0.013598458,10.0.0.11:49438,10.0.12.12:5001,32,3796377920,3796320000,40,2147483647,113024,2951,29312
    0.01368011,10.0.0.11:49438,10.0.12.12:5001,32,3796434392,3796348960,60,2147483647,170880,2956,29312
    0.015104265,10.0.0.11:49438,10.0.12.12:5001,32,3796434392,3796363440,70,2147483647,199936,2940,29312
    0.016406964,10.0.0.11:49438,10.0.12.12:5001,32,3796490864,3796377920,80,2147483647,220160,2943,29312
    0.016465876,10.0.0.11:49438,10.0.12.12:5001,32,3796537200,3796432944,81,80,330240,2925,29312
    0.018355321,10.0.0.11:49438,10.0.12.12:5001,32,3796547336,3796434392,81,80,333056,2914,29312
    0.020171945,10.0.0.11:49438,10.0.12.12:5001,32,3796603808,3796490864,83,80,382336,2956,29312
    0.237314523,10.0.0.11:49438,10.0.12.12:5001,32,3810417728,3809658976,529,396,1775360,7109,29312
    0.237409075,10.0.0.11:49438,10.0.12.12:5001,44,3810417728,3809700968,530,397,1859328,7381,29312
    0.237486647,10.0.0.11:49438,10.0.12.12:5001,44,3810417728,3809700968,371,371,1960704,7365,29312
    0.237807596,10.0.0.11:49438,10.0.12.12:5001,44,3810417728,3809700968,371,371,1980928,7362,29312
    0.237989588,10.0.0.11:49438,10.0.12.12:5001,44,3810417728,3809700968,371,371,1989632,7400,29312
    0.259123971,10.0.0.11:49438,10.0.12.12:5001,32,3811590608,3811251776,261,260,2267648,5885,29312
    0.259174008,10.0.0.11:49438,10.0.12.12:5001,32,3811655768,3811289424,261,260,2267648,5918,29312
    0.262546461,10.0.0.11:49438,10.0.12.12:5001,32,3811720928,3811354584,261,260,2267648,5823,29312
python scikit-learn time-series machine-learning
1 answer

OK, the predict_win function below has a problem:

The second line, win = ..., is never used and will result in an error. Delete it.

    def predict_win(X, y, x):
        win = np.reshape(X, (len(X), 1))  # <---- This line
        svr_lin = SVR(kernel='linear', C=1e3)
        svr_poly = SVR(kernel='poly', C=1e3, degree=2)
        svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
        svr_rbf.fit(X, y)
        svr_lin.fit(X, y)
        svr_poly.fit(X, y)
        plt.scatter(X, y, color='black', label='Data')
        plt.plot(y, svr_rbf.predict(X), color='red', label='RBF model')
        plt.plot(y, svr_lin.predict(X), color='green', label='Linear model')
        plt.plot(y, svr_poly.predict(X), color='blue', label='Polynomial model')
        plt.xlabel('X, other features')
        plt.ylabel('win')
        plt.title('Support Vector Regression')
        plt.legend()
        plt.show()
        return svr_rbf.predict(x)[0], svr_lin.predict(x)[0], svr_poly.predict(x)[0]
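If the intent of that line was to give X the 2-D shape that scikit-learn expects, the reshape belongs on the data before it is passed in, not buried inside the plotting function. A minimal sketch, assuming the seq_num and win lists built by get_data are meant to be the single feature and the target:

    import numpy as np

    # Assumption: seq_num and win come from get_data() as plain Python lists
    X = np.asarray(seq_num, dtype=float).reshape(-1, 1)  # shape (n_samples, 1) for a single feature
    y = np.asarray(win, dtype=float)
    predicted_win = predict_win(X, y, [[29]])             # x also has to be 2-D for predict()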

Secondly, I do not see why there is an entire function just for reading the CSV. Drop it and use pandas instead. Here is an example of code that should work:

    from sklearn import svm
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    def predict_win(X, y, x):
        # Fit three SVR models with different kernels
        svr_lin = svm.SVR(kernel='linear', C=1e3)
        svr_poly = svm.SVR(kernel='poly', C=1e3, degree=2)
        svr_rbf = svm.SVR(kernel='rbf', C=1e3, gamma=0.1)
        svr_rbf.fit(X, y)
        svr_lin.fit(X, y)
        svr_poly.fit(X, y)

        plt.plot(y, svr_rbf.predict(X), color='red', label='RBF model')
        plt.plot(y, svr_lin.predict(X), color='green', label='Linear model')
        plt.plot(y, svr_poly.predict(X), color='blue', label='Polynomial model')
        plt.xlabel('X, other features')
        plt.ylabel('win')
        plt.title('Support Vector Regression')
        plt.legend()
        plt.show()

        x = np.asarray(x).reshape(1, -1)  # predict() expects a 2-D array
        return [svr_rbf.predict(x)[0], svr_lin.predict(x)[0], svr_poly.predict(x)[0]]

    df = pd.read_csv('data.csv')
    data_np_array = df.values

    y = np.ndarray.copy(data_np_array[:, 6])        # 7th column is the regression target
    Xleft = np.ndarray.copy(data_np_array[:, :6])   # everything before the target...
    Xright = np.ndarray.copy(data_np_array[:, 7:])  # ...and everything after it
    X = np.hstack((Xleft, Xright))                  # rebuilt feature matrix without the target

    x0 = np.ndarray.copy(X[0, :])                   # predict the first row as a sanity check
    xp = predict_win(X, y, x0)
    percent_off = [min(y[0], p) / max(y[0], p) for p in xp]  # how close each prediction is to the actual value
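One more thing to watch with the sample rows shown above: the second and third fields are address:port strings, so they cannot go straight into SVR.fit, and the sample has no header row. A minimal sketch of one way to handle that, assuming the string columns are simply dropped (data.csv is a placeholder file name):

    import pandas as pd
    import numpy as np

    df = pd.read_csv('data.csv', header=None)             # the sample rows have no column names
    y = df[6].to_numpy(dtype=float)                       # 7th column as the target
    X = df.drop(columns=[1, 2, 6]).to_numpy(dtype=float)  # drop the address:port strings and the target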

The intermediate steps clean up the imported data: convert the DataFrame into a NumPy array, copy the 7th column out as the regression target, remove that column from the training data, and stitch the remaining columns back together into the feature matrix before fitting the SVR.

    df = pd.read_csv('data.csv')
    data_np_array = df.values
    y = np.ndarray.copy(data_np_array[:, 6])
    Xleft = np.ndarray.copy(data_np_array[:, :6])
    Xright = np.ndarray.copy(data_np_array[:, 7:])
    X = np.hstack((Xleft, Xright))
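As for the original question of predicting when the 7th column increases or decreases: you can treat that column on its own as a univariate time series and fit the SVR on lagged values of it. A minimal sketch of that framing, assuming three lags and the same placeholder file name (this is an illustration, not part of the code above):

    import numpy as np
    import pandas as pd
    from sklearn.svm import SVR

    df = pd.read_csv('data.csv', header=None)
    series = df[6].to_numpy(dtype=float)   # the 7th column as a univariate series

    n_lags = 3
    # Each row of X holds the n_lags previous values; y is the value that follows them
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]

    svr = SVR(kernel='rbf', C=1e3, gamma=0.1)
    svr.fit(X, y)

    # Predict the next value from the last n_lags observations and read off the direction
    next_val = svr.predict(series[-n_lags:].reshape(1, -1))[0]
    direction = 'increase' if next_val > series[-1] else 'decrease'

How well this works depends on how regular the series is; scaling the features and tuning C and gamma usually matter a lot for SVR.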

Let me know if this works. I only tested it with a few rows from the data table above.

