What is the most efficient data structure for storing time series of a partially changing array?

Question

What is the most efficient data structure for storing time series of a partially changing array?

My problem: I have an array of objects of size N. after each time (t + 1), some of the values of the array may or may not change. so let's say that the index t + 1 15 changes, but everything else remains the same.

What is the most efficient way to store something like this (in memory), other than just an array of arrays, of course? I want to be able to get all array values at any time, say getValues (long time) in the most efficient way.

Say 4 arrays

time 1 zero zero zero hoog

time 2 zero zero alphabet xyz

(note that only abc is here), but we keep the values of the last index from time 1.

+4

time data-structures time-series core-data series

Dez uezue 30 sept '10 at 6:13

source share

1 answer

jbapple · Accepted Answer · 2010-10-03T14:43:30+0000

What you are asking for is known in the CS field as “partially constant arrays”. For more information about conservation, see the Kaplan poll .

One simple solution is to use search trees instead of arrays. You can use purely functional balanced search trees to ensure that every read or write takes O (lg n) the worst time (where n is the size of the array) instead of O (1). (If you save timestamped versions in an expandable array, adding a new version of O (1) is amortized and accessing the old version is O (1) in the worst case.)

If you are not familiar with purely functional search trees, you can read this introduction to Clojure persistent arrays . These are purely functional trees, indexed integers of a fixed width (in the form of trie) and, therefore, a fixed depth. This may be the fastest solution to your problem, both in the worst case and in the real world.

The only other partially persistent arrays that I know of can take longer in the worst case. Some of them have higher amortization, and some of them have more expected complexity if the arrays are accessed in various ways.

What is the most efficient data structure for storing time series of a partially changing array?

More articles: