I am currently learning F#, and I am investigating its use for analysing financial time series. Can anyone recommend a good data structure for storing time series data?
F# offers a wide selection of native types, and I'm looking for some simple combination that will provide an elegant, concise and efficient solution.
I want to store tick data, which consists of millions of time-stamped records, each with several (~5-20) numeric and text fields, some of which may be missing.
My first thought was a sequence of tuples or records, but I was wondering if anyone could kindly suggest something that has worked well in the real world.
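To make the "sequence of records" idea concrete, here is a minimal sketch of roughly what I had in mind. The field names (`Symbol`, `Bid`, `Ask`, `Venue`) and the sample values are purely hypothetical, and representing missing values with `option` is just one assumption among several possible choices:

```fsharp
open System

// Hypothetical tick record: field names are illustrative only.
// Missing values are modelled with option types.
type Tick =
    { Time   : DateTime
      Symbol : string
      Bid    : float option
      Ask    : float option
      Venue  : string option }

// An array of records, assumed to be sorted ascending by Time.
let ticks : Tick[] =
    [| { Time = DateTime(2010, 1, 4, 9, 30, 0); Symbol = "ABC"
         Bid = Some 10.0; Ask = Some 10.5; Venue = Some "X" }
       { Time = DateTime(2010, 1, 4, 9, 30, 1); Symbol = "ABC"
         Bid = None; Ask = Some 10.5; Venue = None } |]

// The mid price is only defined when both sides are present.
let mid (t : Tick) : float option =
    match t.Bid, t.Ask with
    | Some b, Some a -> Some ((b + a) / 2.0)
    | _ -> None

let mids = ticks |> Array.map mid
```

With this layout a missing field forces an explicit pattern match, which is safe but may or may not be efficient enough at millions of rows.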
EDIT:
A few additional points for clarification:
Common operations that I will most likely need:
- Lookup by time, i.e. find the most recent data point as of a given time.
- Time joins (combining series by timestamp)
- Appends (updates and deletions will be rare).
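As a rough illustration of the time lookup and append operations, here is a sketch under my own assumptions: a cut-down hypothetical `Tick` record, a `ResizeArray` kept sorted by time, and a hand-written binary search named `asOf`. It is not meant as a recommended design, only as a baseline for comparison:

```fsharp
open System

// Hypothetical minimal tick; a real record would carry more fields.
type Tick = { Time : DateTime; Price : float }

// Returns the latest tick with Time <= t, or None if t precedes all data.
// Assumes `ticks` is sorted ascending by Time.
let asOf (ticks : ResizeArray<Tick>) (t : DateTime) : Tick option =
    let mutable lo = 0
    let mutable hi = ticks.Count - 1
    let mutable found = -1
    while lo <= hi do
        let m = lo + (hi - lo) / 2
        if ticks.[m].Time <= t then
            found <- m          // candidate; keep searching to the right
            lo <- m + 1
        else
            hi <- m - 1
    if found >= 0 then Some ticks.[found] else None

let series = ResizeArray<Tick>()
let t0 = DateTime(2010, 1, 4, 9, 30, 0)

// Appending at the end preserves order as long as ticks arrive in time order.
series.Add({ Time = t0; Price = 100.0 })
series.Add({ Time = t0.AddSeconds(5.0); Price = 101.0 })
series.Add({ Time = t0.AddSeconds(9.0); Price = 99.5 })

let hit  = asOf series (t0.AddSeconds(7.0))   // the tick stamped at +5s
let miss = asOf series (t0.AddSeconds(-1.0))  // None: before the first tick
```

Lookup here is O(log n) and appends are amortised O(1), but out-of-order inserts, updates and deletions would be O(n), which I can live with given how rare they are.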
I should make clear that I am looking at F# primarily as an interactive research tool, with the ability to compile as a (really big) added bonus.
ANOTHER EDIT:
I should also have mentioned my role and how I use F#: my work with this data is purely research, not development. The goal is that once we understand the data (and what we want to do with it), we can specify the tools that our developers will then build, e.g. data warehouses, at which point we would start using those data structures.
One concern, though, is that our models are computationally intensive, use a lot of memory and cannot always be coded in a recursive manner, so we will ultimately have to query large chunks of data anyway.
I should also say that I have always used Matlab or R for these tasks, but I am now interested in F# because it offers high-level, interactive flexibility for research while allowing the same code to be used in production.
I apologize for not giving this context at the beginning (this is my first question); I now see that it helps people formulate their answers.
Thanks again to everyone who took the time to help me.