I have a large dataset, possibly over a million records. Every element has a time stamp, and elements are added to the set at runtime (usually, but not always, with a newer time stamp). I need to show the subset of this data that falls within a specific time range. This range is usually quite small compared to the whole dataset: out of 1,000,000+ elements, typically no more than about 1,000 fall within a given interval. The time range moves at a constant pace, for example every second it advances by one second. In addition, the user can adjust the time range at any time ("move" through the dataset) or set additional filters (for example, a text filter).
So far I have not worried about performance; I focused on getting everything working correctly and only used smaller test datasets. I am not sure how to solve this problem efficiently and would appreciate any suggestions. Thanks.
Edit: The language used is C# 4.
Update: I now use an interval tree; an implementation can be found here: https://github.com/mbuchetics/RangeTree
It also comes with an asynchronous version that rebuilds the tree using the Task Parallel Library (TPL).
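To make the window query concrete, here is a minimal sketch of one possible approach. It is not the RangeTree API. It swaps in a simpler technique, a list kept sorted by time stamp plus binary search, which is enough when every element has a single time stamp rather than an interval. The names `Record`, `TimeIndex`, `Add`, and `Query` are hypothetical, the half-open window `[from, to)` is an assumption, and the code sticks to C# 4 features. It is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type; the name and fields are assumptions for illustration.
public sealed class Record
{
    public DateTime Time;
    public string Text;

    public Record(DateTime time, string text)
    {
        Time = time;
        Text = text;
    }
}

public sealed class TimeIndex
{
    // Invariant: always sorted by Time in ascending order.
    private readonly List<Record> _items = new List<Record>();

    // Index of the first record with Time >= t (Count if there is none).
    private int LowerBound(DateTime t)
    {
        int lo = 0, hi = _items.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (_items[mid].Time < t) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Common case (newest time stamp): amortized O(1) append.
    // Out-of-order case: binary search plus an O(n) insert.
    public void Add(Record r)
    {
        if (_items.Count == 0 || _items[_items.Count - 1].Time <= r.Time)
            _items.Add(r);
        else
            _items.Insert(LowerBound(r.Time), r);
    }

    // Records with from <= Time < to. A null or empty textFilter matches everything;
    // otherwise the record's Text must contain the filter as a substring.
    public List<Record> Query(DateTime from, DateTime to, string textFilter)
    {
        var result = new List<Record>();
        for (int i = LowerBound(from); i < _items.Count && _items[i].Time < to; i++)
        {
            Record r = _items[i];
            if (string.IsNullOrEmpty(textFilter) ||
                (r.Text != null && r.Text.Contains(textFilter)))
            {
                result.Add(r);
            }
        }
        return result;
    }
}
```

With this layout a query costs O(log n + k), where k is the number of records inside the window, so a window of about 1,000 records stays cheap even with millions of records in total. The text filter is applied only to the records already selected by time.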