In fact, Welford's algorithm can AFAICT easily be adapted to compute a weighted variance. And by setting the weight to -1, you should be able to effectively remove items again. I have not checked the math, i.e. whether it really allows negative weights, but at first glance it should!
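To make the idea concrete, here is a minimal sketch of a weighted Welford-style update (West's formulation), assuming a put(value, weight) interface like ELKI's MeanVariance offers; the class and method names are illustrative only, not ELKI's actual implementation:

// Illustrative sketch of a weighted Welford / West update; not ELKI source code.
class WeightedMeanVariance {
  private double totalWeight = 0.0; // sum of weights seen so far
  private double mean = 0.0;        // running weighted mean
  private double m2 = 0.0;          // running sum of weighted squared deviations

  /** Add a value with weight 1. */
  public void put(double value) {
    put(value, 1.0);
  }

  /** Add a value with an arbitrary weight; weight -1 undoes a previous put. */
  public void put(double value, double weight) {
    double newWeight = totalWeight + weight; // must stay positive for removal
    double delta = value - mean;
    mean += (weight / newWeight) * delta;
    m2 += weight * delta * (value - mean); // uses the updated mean
    totalWeight = newWeight;
  }

  /** Sample variance, assuming unit weights count observations. */
  public double getSampleVariance() {
    return m2 / (totalWeight - 1.0);
  }
}

Calling put(x, -1.) runs the same update in reverse, which is what makes the sliding-window trick below work, as long as the remaining total weight stays positive.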
I did a little experiment using ELKI:
void testSlidingWindowVariance() {
  MeanVariance mv = new MeanVariance(); // ELKI implementation of weighted Welford!
  MeanVariance mc = new MeanVariance(); // Control.
  Random r = new Random();
  double[] data = new double[1000];
  for (int i = 0; i < data.length; i++) {
    data[i] = r.nextDouble();
  }
  // Pre-roll:
  for (int i = 0; i < 10; i++) {
    mv.put(data[i]);
  }
  // Compare to window approach
  for (int i = 10; i < data.length; i++) {
    mv.put(data[i - 10], -1.); // Remove
    mv.put(data[i]);
    mc.reset(); // Reset statistics
    for (int j = i - 9; j <= i; j++) {
      mc.put(data[j]);
    }
    assertEquals("Variance does not agree.", mv.getSampleVariance(), mc.getSampleVariance(), 1e-14);
  }
}
I get around 14 digits of precision compared to the exact two-pass algorithm; this is about as much as you can expect from doubles. Note that Welford does come at some computational cost because of the extra divisions - it takes about twice as long as the exact two-pass algorithm. If your window size is small, it may well be more sensible to actually recompute the mean and then the variance in a second pass each time.
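For comparison, here is a small sketch of that exact two-pass computation over a window - recompute the mean first, then the variance in a second pass. The method name and index handling are just for illustration:

// Exact two-pass sample variance over data[from..to) - an illustrative sketch,
// a straightforward alternative when the window is small.
static double twoPassVariance(double[] data, int from, int to) {
  int n = to - from;
  // First pass: the mean.
  double sum = 0.0;
  for (int i = from; i < to; i++) {
    sum += data[i];
  }
  double mean = sum / n;
  // Second pass: sum of squared deviations from the mean.
  double sq = 0.0;
  for (int i = from; i < to; i++) {
    double d = data[i] - mean;
    sq += d * d;
  }
  return sq / (n - 1); // sample variance
}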
I have added this experiment as a unit test to ELKI; you can see the full source here: http://elki.dbs.ifi.lmu.de/browser/elki/trunk/test/de/lmu/ifi/dbs/elki/math/TestSlidingVariance.java - it also compares against the exact two-pass variance.
However, it may behave differently on skewed data sets. Obviously, this data set is uniformly distributed; but I also tried a sorted array, and it worked.
Erich Schubert Jan 05 '13 at 13:47