How to get a weighted Gaussian filter

I have a set of weighted x, y points, as shown below (full set here):

    # x y w
    -0.038 2.0127 0.71
    0.058 1.9557 1
    0.067 2.0016 0.9
    0.072 2.0316 0.83
    ...

I need to find a smooth line that fits these points according to the importance assigned to each one; that is, a point with more weight should have more influence on the line.

This is the code I have so far, which basically applies gaussian_filter1d to the data (I got the idea from this question: Line smoothing algorithm in Python?):

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    # Read data from file: one column each for x, y and the weight w.
    data = np.loadtxt('data_file', unpack=True)
    x, y, w = data[0], data[1], data[2]

    # Parameterize the points on [0, 1] and define 100 evenly spaced samples.
    t = np.linspace(0, 1, len(x))
    t2 = np.linspace(0, 1, 100)

    # One-dimensional linear interpolation onto the new samples.
    x2 = np.interp(t2, t, x)
    y2 = np.interp(t2, t, y)

    # Apply a Gaussian filter with a fixed sigma value.
    sigma = 7
    x3 = gaussian_filter1d(x2, sigma)
    y3 = gaussian_filter1d(y2, sigma)

    # Make plot (plt.cm.get_cmap was removed in recent Matplotlib releases).
    cm = plt.get_cmap('RdYlBu')
    plt.scatter(x, y, marker="o", c=w, s=40, cmap=cm, lw=0.5, vmin=0, vmax=1)
    plt.plot(x3, y3, "r", lw=2)
    plt.show()

This code creates the following graph (the bluer dots have a higher weight value):

[Image: plot]

The problem is that this fit does not take into account the weights assigned to each point. How can I incorporate this information into the Gaussian filter?

1 answer

Please note that the following idea is not an exact solution, but it is worth a try.

The idea is to use the weight parameter w to repeat the corresponding values in x and y. If you scale w to, for example, the range [1, 10], each value in x, and the matching value in y, is duplicated as many times as its scaled weight, so 10 times for w equal to 10. This gives a new, longer x and y. In other words, the weight is treated as the frequency of each value in x and y. Passing the new arrays to your algorithm will hopefully give the desired results, as shown in the examples below.

  • For the first figure, the blue-to-red spectrum corresponds to lower-to-higher weights. The numbers shown are the duplication factors described above.
  • For the second figure, which uses your data, the color format is left unchanged.

[Image: first figure]

[Image: second figure, the asker's data]
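The duplication idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact code behind the figures: the helper names `repeat_by_weight` and `smooth_weighted` are hypothetical, and the linear min-max rescaling of w to [1, max_factor] with rounding is just one possible way to turn weights into integer repeat counts. The smoothing step reuses the interpolation and gaussian_filter1d calls from the question.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d


def repeat_by_weight(x, y, w, max_factor=10):
    """Duplicate each (x, y) point according to its weight.

    Assumption: weights are rescaled linearly to [1, max_factor] and
    rounded, so the lowest-weight point appears once and the
    highest-weight point appears max_factor times.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    span = w.max() - w.min()
    if span == 0:
        # All weights are equal: keep a single copy of every point.
        factors = np.ones(len(w), dtype=int)
    else:
        scaled = 1 + (w - w.min()) / span * (max_factor - 1)
        factors = np.rint(scaled).astype(int)
    return np.repeat(x, factors), np.repeat(y, factors), factors


def smooth_weighted(x, y, w, sigma=7, n_samples=100, max_factor=10):
    """Run the question's interpolate-then-filter pipeline on the
    duplicated points, so heavier points occupy more of the curve."""
    xr, yr, _ = repeat_by_weight(x, y, w, max_factor)
    t = np.linspace(0, 1, len(xr))
    t2 = np.linspace(0, 1, n_samples)
    x2 = np.interp(t2, t, xr)
    y2 = np.interp(t2, t, yr)
    return gaussian_filter1d(x2, sigma), gaussian_filter1d(y2, sigma)


# The four sample rows quoted in the question.
x = [-0.038, 0.058, 0.067, 0.072]
y = [2.0127, 1.9557, 2.0016, 2.0316]
w = [0.71, 1, 0.9, 0.83]

xr, yr, factors = repeat_by_weight(x, y, w)
x3, y3 = smooth_weighted(x, y, w)
```

With real data you would load x, y and w from the file as in the question and plot x3 against y3. A larger max_factor makes the high-weight points dominate more strongly, at the cost of longer arrays.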
