Extrapolating Pandas DataFrames
A DataFrame can be extrapolated; however, there is no simple method call for this within Pandas, and another library (e.g. scipy.optimize) is required.
Extrapolation
Extrapolation generally requires making certain assumptions about the data being extrapolated. One way is to curve-fit a general parametrized equation to the data to find the parameter values that best describe the existing data, which are then used to calculate values that extend beyond the range of that data. The difficult and limiting issue with this approach is that some assumption about the trend must be made when the parametrized equation is selected. A suitable equation can be found through trial and error with different equations until the result looks right, or it can sometimes be inferred from the source of the data. The data provided in the question is really not a large enough dataset to obtain a well-fit curve; however, it is good enough for illustration.
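As a minimal sketch of the idea, the snippet below fits an assumed straight-line trend to a few points and then evaluates the fitted equation past the last known point. The model line() and the data values are made up for illustration; only scipy.optimize.curve_fit is a real library call.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, m, c):
    """Hypothetical model: the trend is assumed to be a straight line."""
    return m * x + c


# Illustrative data lying exactly on y = 2x + 1 (invented for this sketch)
x_known = np.array([0.0, 1.0, 2.0, 3.0])
y_known = 2.0 * x_known + 1.0

# Find the parameter values that best describe the existing data
params, _ = curve_fit(line, x_known, y_known)

# Use those parameters to calculate values beyond the known range
x_new = np.array([4.0, 5.0])
y_new = line(x_new, *params)
```

If the real trend is not a straight line, the values in y_new will be wrong no matter how well the line fits the known points, which is the limitation described above.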
Below is an example of extrapolating the DataFrame with a 3rd-order polynomial:
f(x) = a x^3 + b x^2 + c x + d (Eq. 1)
This generic function (func()) is curve-fit to each column to obtain unique, column-specific parameters (i.e. a, b, c and d). These parametrized equations are then used to extrapolate the data in each column for all the index values with NaNs.
import pandas as pd
from io import StringIO  # Python 3 replacement for cStringIO
from scipy.optimize import curve_fit

# Build the example DataFrame; the unlabeled first column becomes the index
df = pd.read_table(StringIO('''
                neg       neu       pos       avg
0               NaN       NaN       NaN       NaN
250        0.508475  0.527027  0.641292  0.558931
500             NaN       NaN       NaN       NaN
1000       0.650000  0.571429  0.653983  0.625137
2000            NaN       NaN       NaN       NaN
3000       0.619718  0.663158  0.665468  0.649448
4000            NaN       NaN       NaN       NaN
6000            NaN       NaN       NaN       NaN
8000            NaN       NaN       NaN       NaN
10000           NaN       NaN       NaN       NaN
20000           NaN       NaN       NaN       NaN
30000           NaN       NaN       NaN       NaN
50000           NaN       NaN       NaN       NaN'''), sep=r'\s+')
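The fit-and-fill step described above could be sketched roughly as follows. This is one possible way to write it, not necessarily the author's code: func() follows Eq. 1, while the helper extrapolate_columns(), the initial guess, and the small demo frame are assumptions made for this example. The demo data is generated from known polynomials so the result can be checked.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def func(x, a, b, c, d):
    """Third-order polynomial of Eq. 1."""
    return a * x**3 + b * x**2 + c * x + d


def extrapolate_columns(frame, model, guess):
    """Hypothetical helper: fit `model` to each column, then fill that column's NaNs."""
    filled = frame.copy()
    col_params = {}
    for col in frame.columns:
        # Fit only on the rows where this column has data
        known = frame[col].dropna()
        x_known = known.index.to_numpy(dtype=float)
        params, _ = curve_fit(model, x_known, known.to_numpy(), p0=guess)
        col_params[col] = params
        # Evaluate the fitted equation at the index values that are NaN
        missing = frame[col].isna().to_numpy()
        x_missing = frame.index[missing].to_numpy(dtype=float)
        filled.loc[missing, col] = model(x_missing, *params)
    return filled, col_params


# Made-up demo data: two columns that follow known cubic equations exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0])
truth = pd.DataFrame({'u': func(x, 0.01, -0.1, 0.5, 1.0),
                      'v': func(x, 0.0, 0.0, 2.0, 3.0)}, index=x)

# Blank out the first row and the last two rows so they must be extrapolated
demo = truth.copy()
demo.loc[[0.0, 6.0, 8.0], :] = np.nan

filled, col_params = extrapolate_columns(demo, func, guess=(0.5, 0.5, 0.5, 0.5))
```

Note that a cubic has four parameters, so each column needs at least four non-NaN points for curve_fit to work with its default method.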
Extrapolating Results
Interpolated data:
            neg       neu       pos       avg
0           NaN       NaN       NaN       NaN
250    0.508475  0.527027  0.641292  0.558931
500    0.508475  0.527027  0.641292  0.558931
1000   0.650000  0.571429  0.653983  0.625137
2000   0.650000  0.571429  0.653983  0.625137
3000   0.619718  0.663158  0.665468  0.649448
4000        NaN       NaN       NaN       NaN
6000        NaN       NaN       NaN       NaN
8000        NaN       NaN       NaN       NaN
10000       NaN       NaN       NaN       NaN
20000       NaN       NaN       NaN       NaN
30000       NaN       NaN       NaN       NaN
50000       NaN       NaN       NaN       NaN

Extrapolated data:
               neg          neu         pos          avg
0         0.411206     0.486983    0.631233     0.509807
250       0.508475     0.527027    0.641292     0.558931
500       0.508475     0.527027    0.641292     0.558931
1000      0.650000     0.571429    0.653983     0.625137
2000      0.650000     0.571429    0.653983     0.625137
3000      0.619718     0.663158    0.665468     0.649448
4000      0.621036     0.969232    0.708464     0.766245
6000      1.197762     2.799529    0.991552     1.662954
8000      3.281869     7.191776    1.702860     4.058855
10000     7.767992    15.272849    3.041316     8.694096
20000    97.540944   150.451269   26.103320    91.365599
30000   381.559069   546.881749   94.683310   341.042883
50000  1979.646859  2686.936912  467.861511  1711.489069

Data was extrapolated with these column functions:
f_neg(x) = 1.864e-11 x^3 + -1.471e-07 x^2 + 0.0003 x + 0.4112
f_neu(x) = 2.348e-11 x^3 + -1.023e-07 x^2 + 0.0002 x + 0.4870
f_avg(x) = 1.542e-11 x^3 + -9.016e-08 x^2 + 0.0002 x + 0.5098
f_pos(x) = 4.144e-12 x^3 + -2.107e-08 x^2 + 0.0000 x + 0.6312
Plot for avg column

Without a larger dataset or knowledge of the data's source, this result may be completely wrong, but it should illustrate the process of extrapolating a DataFrame. The assumed equation in func() will probably need to be tweaked to get a correct extrapolation. Also, no attempt was made to make the code efficient.
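To illustrate what tweaking the assumed equation might look like, here is a sketch that swaps the polynomial for a curve that levels off instead of growing without bound. The function saturating(), its parameter names, the starting guess, and the data are all hypothetical; whether such a shape suits a given dataset is itself an assumption that has to be justified by the data source.

```python
import numpy as np
from scipy.optimize import curve_fit


def saturating(x, top, span, rate):
    """Hypothetical alternative model: rises toward `top` and then levels off."""
    return top - span * np.exp(-rate * x)


# Made-up data generated from the model itself (top=0.7, span=0.3, rate=0.5)
x_known = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_known = saturating(x_known, 0.7, 0.3, 0.5)

# A rough starting guess; nonlinear models are more sensitive to this than polynomials
params, _ = curve_fit(saturating, x_known, y_known, p0=(0.6, 0.2, 1.0))

# Far outside the fitted range the prediction approaches `top` rather than blowing up
x_far = np.array([10.0, 100.0])
y_far = saturating(x_far, *params)
```

Compare this with the cubic results above, where the extrapolated values climb into the thousands at large index values.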
Update:
If your index is not numeric, like DatetimeIndex , see this answer for how to extrapolate it.
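One common approach, sketched here under the assumption that elapsed time is a sensible x variable, is to convert the DatetimeIndex to a numeric array before fitting. This is not necessarily what the linked answer does; the model line(), the dates, and the values are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def line(x, slope, intercept):
    """Hypothetical model: a straight-line trend over time."""
    return slope * x + intercept


# Made-up daily series with the last two values missing
dates = pd.date_range('2020-01-01', periods=6, freq='D')
series = pd.Series([1.0, 2.0, 3.0, 4.0, np.nan, np.nan], index=dates)

# Convert timestamps to elapsed days since the first timestamp (floats)
elapsed_days = ((series.index - series.index[0]) / pd.Timedelta(days=1)).to_numpy(dtype=float)

# Fit on the known points only, using the numeric x values
known = series.notna().to_numpy()
params, _ = curve_fit(line, elapsed_days[known], series.to_numpy()[known])

# Fill the missing entries with the fitted model
filled = series.copy()
filled.loc[~known] = line(elapsed_days[~known], *params)
```

The result keeps its original DatetimeIndex; only the fitting step sees the numeric values.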
tmthydvnprt