Fill in missing data by interpolation in Google Spreadsheet

I have a Google Spreadsheet with the following data

         A           B       D
    1    Date        Weight  Computation
    2    2015/12/09          =B2*2
    3    2015/12/10  65      =B3*2
    4    2015/12/11          =B4*2
    5    2015/12/12          =B5*2
    6    2015/12/14  62      =B6*2
    7    2015/12/15          =B7*2
    8    2015/12/16  61      =B8*2
    9    2015/12/17          =B9*2

I want to chart the weight against the date and/or use it in other columns that compute further values from the weight. However, you will notice that some entries are missing. What I want is another column based on the Weight column, with the missing values interpolated and filled in. For instance:

         A           B       C        D
    1    Date        Weight  WeightI  Computation
    2    2015/12/09          65       =C2*2   # use first known value
    3    2015/12/10  65      65       =C3*2
    4    2015/12/11          64       =C4*2   # =(62-65)/3*(1)+65
    5    2015/12/12          63       =C5*2   # =(62-65)/3*(2)+65
    6    2015/12/14  62      62       =C6*2
    7    2015/12/15          61.5     =C7*2   # =(61-62)/2*(1)+62
    8    2015/12/16  61      61       =C8*2
    9    2015/12/17          61       =C9*2   # use the last known value

Column C contains the Weight values, with linear interpolation used to fill in any missing value that lies between two known points.
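For reference, here is a minimal Python sketch of the fill rule described above, not a spreadsheet formula. The function name `fill_missing` is made up for illustration. It uses the sheet row number as x, an assumption taken from the comments in the example, which divide by the number of rows between known weights rather than by the number of days.

```python
from typing import List, Optional, Sequence


def fill_missing(xs: Sequence[float], ys: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries in ys by linear interpolation over xs.

    Gaps before the first known value or after the last one are filled by
    repeating the nearest known value.
    """
    known = [i for i, y in enumerate(ys) if y is not None]
    if not known:
        return list(ys)  # nothing to interpolate from

    filled: List[Optional[float]] = list(ys)
    # Leading and trailing gaps: hold the first / last known value.
    for i in range(known[0]):
        filled[i] = ys[known[0]]
    for i in range(known[-1] + 1, len(ys)):
        filled[i] = ys[known[-1]]

    # Interior gaps: straight line between the neighbouring known points.
    for left, right in zip(known, known[1:]):
        slope = (ys[right] - ys[left]) / (xs[right] - xs[left])
        for i in range(left + 1, right):
            filled[i] = ys[left] + slope * (xs[i] - xs[left])
    return filled


rows = [2, 3, 4, 5, 6, 7, 8, 9]  # assumption: sheet row numbers used as x
weights = [None, 65, None, None, 62, None, 61, None]
print(fill_missing(rows, weights))
```

Passing the dates (as day numbers) instead of `rows` would interpolate by elapsed days, which gives slightly different values wherever a date is skipped.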

I believe this is a very simple and common use case, so I'm sure the answer is trivial, but I cannot find a solution using the built-in functions. I do not have much experience with spreadsheets. I spent several hours experimenting with =INDEX, =MATCH, =VLOOKUP, =LINEST, =TREND, etc., but could not get any of the examples I found to work. The only solution I managed was a custom function written in Google Apps Script. Although it works, it runs very slowly, and my table is huge.

Any pointers, solutions?

+6
3 answers

I found a solution that satisfies most of my requirements, using:

  • =FILTER() to first remove the rows where no data is available (thanks to "pnuts" for the help).

  • =MATCH() to find the two consecutive rows of the filtered table that surround each date. I could use this function because, in my case, column A is sorted and has no duplicates.

  • The straight-line formula to interpolate the values between those two rows.

Thus, the output will look like this:

          A           B       C           D        E
    1     Date        Weight  FDdate      FWeight  IWeight
    2     2015/05/09          2015/05/10  65.00    #N/A
    3     2015/05/10  65.00   2015/05/13  62.00    65.00
    4     2015/05/11          2015/05/15  61.00    64.00
    5     2015/05/12                               63.00
    6     2015/05/13  62.00                        62.00
    7     2015/05/14                               61.50
    8     2015/05/15  61.00                        61.00
    9     2015/05/16                               61.00
    10    2015/05/17                               61.00

Cells C2 and D2 contain the following range formulas (a minor note: you can, of course, combine these formulas if columns A and B are adjacent):

    C2 =FILTER($A$2:$A$10, NOT(ISBLANK($B$2:$B$10)))
    D2 =FILTER($B$2:$B$10, NOT(ISBLANK($B$2:$B$10)))

Cells E2 through E10 contain the following linear interpolation formula, y = y1 + (y2 - y1) / (x2 - x1) * (x - x1):

    E2 =(INDEX($D:$D, MATCH($A2, $C:$C, 1), 1))
        +(INDEX($D:$D, MATCH($A2, $C:$C, 1) + 1, 1) - INDEX($D:$D, MATCH($A2, $C:$C, 1), 1))
        /(INDEX($C:$C, MATCH($A2, $C:$C, 1) + 1, 1) - INDEX($C:$C, MATCH($A2, $C:$C, 1), 1))
        *(INDEX($C:$C, MATCH($A2, $C:$C, 1), 1) - $A2) * -1

This solution does not work when the first cell, B2, has no value; the formula then returns #N/A. All this would be much more efficient if we had something like =INTERPOLATE_LINE( A2, $A$2:$A$10, $B$2:$B$10 ) in Google Spreadsheet, but unfortunately this does not exist. Please correct me if I missed it in my reading of the supported functions in Google Spreadsheet.
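To make the three steps concrete, here is a rough Python sketch of what such a function could do. `interpolate_line` is a hypothetical name mirroring the wished-for =INTERPOLATE_LINE; it is not an existing Sheets function. Returning `None` before the first known point stands in for the #N/A case, and holding the last known value after the final point is an assumption of this sketch.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional, Sequence


def interpolate_line(x: float, xs: Sequence[float], ys: Sequence[Optional[float]]) -> Optional[float]:
    """Hypothetical INTERPOLATE_LINE(x, xs, ys); xs must be sorted with no duplicates."""
    # Step 1, like FILTER: keep only the rows that have a y value.
    fx = [a for a, b in zip(xs, ys) if b is not None]
    fy = [b for b in ys if b is not None]

    # Step 2, like MATCH(x, fx, 1): index of the last filtered x that is <= x.
    i = bisect_right(fx, x) - 1
    if i < 0:
        return None  # x precedes every known point (the #N/A case)
    if i == len(fx) - 1:
        return fy[i]  # assumption: hold the last known value

    # Step 3: y = y1 + (y2 - y1) / (x2 - x1) * (x - x1)
    return fy[i] + (fy[i + 1] - fy[i]) / (fx[i + 1] - fx[i]) * (x - fx[i])


# Dates as day numbers, matching column A of the table above.
days = [date(2015, 5, d).toordinal() for d in range(9, 18)]
weights = [None, 65, None, None, 62, None, 61, None, None]
iweights = [interpolate_line(d, days, weights) for d in days]
print(iweights)
```

The binary search plays the role of MATCH with search type 1, which is why the x values have to be sorted.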

+4

You might want to use FORECAST, for which it may be more convenient to first separate the dates that have readings from those that don't (and regroup them later). So, say, with only the three readings:

         A           B
    1    10/12/2015  65
    2    14/12/2015  62
    3    16/12/2015  61

and the dates that need values listed lower down, in column A:

    6    09/12/2015  65.6
    7    11/12/2015  64.3
    8    12/12/2015  63.6
    9    15/12/2015  61.5
    10   17/12/2015  60.2

The formula that gives 65.6 in B6 (copied down from there to suit) is:

 =forecast(A6,$B$1:$B$3,$A$1:$A$3) 

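This is not calculated the way you show: FORECAST fits a single least-squares straight line through all the known readings rather than joining neighbouring points. It can be considered somewhat more accurate, in particular because it extrapolates the missing end values instead of just repeating their nearest available value.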
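As a rough illustration of that calculation, the Python sketch below fits one least-squares line through the three readings and evaluates it at the missing dates. The helper `forecast` is written here only to mirror the argument order of the spreadsheet function; it is a sketch, not the Sheets implementation.

```python
from datetime import date
from typing import Sequence


def forecast(x: float, known_ys: Sequence[float], known_xs: Sequence[float]) -> float:
    """Predict y at x from the least-squares line through the known points."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    # Slope of the regression line: covariance of x and y over variance of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(known_xs, known_ys))
    sxx = sum((a - mean_x) ** 2 for a in known_xs)
    slope = sxy / sxx
    return mean_y + slope * (x - mean_x)


# The three readings from rows 1 to 3, with dates converted to day numbers.
known_days = [date(2015, 12, d).toordinal() for d in (10, 14, 16)]
known_weights = [65, 62, 61]

for d in (9, 11, 12, 15, 17):
    print(d, forecast(date(2015, 12, d).toordinal(), known_weights, known_days))
```

Because the line is a best fit, it does not necessarily pass through the known readings themselves, which is the main practical difference from point-to-point interpolation.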

After calculating the values, you will probably want to put the data back in date order. I therefore suggest copying B6:B10, pasting it back over itself with Edit, Paste special, Paste values only, and then sorting accordingly.

The table below compares the results above (blue) with the results in your OP (green) and marks the data:

[Image: SO34150309 example]

+7

I found a solution that fully meets the requirements. I used a separate sheet to break the calculation into pieces.

Create a new sheet. Enter the following formulas in cells A2 to F2, and then copy them down the sheet.

  • Cell A2: Copy the weight data into the first column. (In this example, the sheet name is Daily Record, and the weights are written in column D.)

    ='Daily Record'!D2

  • Cell B2: Find the most recently recorded weight.

    =INDEX(FILTER(A$2:A2,A$2:A2 <> ""),COUNT(FILTER(A$2:A2,A$2:A2 <> "")),1)

  • Cell C2: Count the number of days since the last weighing.

    =IF(A2<>"",0,IF(ROW(C2)<3,0,C1+1))

  • Cell D2: Find the next recorded weight (from the current date or later).

    =IFERROR(INDEX(FILTER(A2:A,A2:A <> ""),1,1),"")

  • Cell E2: Count the number of days before the next weighing.

    =IF(A2<>"",0,IF(E3="","",E3+1))

  • Cell F2: Calculate the interpolated weight.

    =IF(A2 <> "", A2, IF(D2 = "", "", B2 + (D2-B2)*C2/(C2+E2)))
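The helper columns above amount to one forward pass and one backward pass over the weights. Below is a minimal Python sketch of the same idea; the function name `interpolate_by_rows` is invented for illustration. Like columns C and E, it counts rows, so it assumes one row per day. Returning `None` when there is no earlier or no later reading is an assumption of this sketch.

```python
from typing import List, Optional, Sequence


def interpolate_by_rows(weights: Sequence[Optional[float]]) -> List[Optional[float]]:
    n = len(weights)
    prev_w: List[Optional[float]] = [None] * n  # column B: last recorded weight
    since: List[int] = [0] * n                  # column C: rows since that weight
    next_w: List[Optional[float]] = [None] * n  # column D: next recorded weight
    until: List[Optional[int]] = [None] * n     # column E: rows until that weight

    # Forward pass (columns B and C).
    for i, w in enumerate(weights):
        if w is not None:
            prev_w[i], since[i] = w, 0
        elif i > 0:
            prev_w[i], since[i] = prev_w[i - 1], since[i - 1] + 1

    # Backward pass (columns D and E).
    for i in range(n - 1, -1, -1):
        w = weights[i]
        if w is not None:
            next_w[i], until[i] = w, 0
        elif i < n - 1 and until[i + 1] is not None:
            next_w[i], until[i] = next_w[i + 1], until[i + 1] + 1

    # Column F: keep recorded weights, interpolate the rest.
    result: List[Optional[float]] = []
    for i, w in enumerate(weights):
        if w is not None:
            result.append(w)
        elif prev_w[i] is None or next_w[i] is None:
            result.append(None)  # assumption: no value without a reading on both sides
        else:
            step = since[i] / (since[i] + until[i])
            result.append(prev_w[i] + (next_w[i] - prev_w[i]) * step)
    return result


print(interpolate_by_rows([None, 65, None, None, 62, None, 61, None]))
```

Each pass touches every row once, so the work grows linearly with the length of the table.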

+1
