Memory-mapped file implementation of IList for storing large data sets "in memory"?

I need to perform chronological operations on huge time series implemented as IList. The data is ultimately stored in a database, but it would be impractical to send tens of millions of queries to it.

Currently, an in-memory IList throws an OutOfMemoryException when I try to store more than 8 million (small) objects, although I will need to handle tens of millions.

After some research, it seems the best approach is to store the data on disk and access it through an IList wrapper.

Memory-mapped files (introduced in .NET 4.0) seem like the right mechanism to use, but I'm wondering what the best way is to write a class that implements IList (for easy access) and handles the memory-mapped file internally.
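To make the idea concrete, here is a minimal sketch of the kind of wrapper I mean, assuming fixed-size value-type elements. The class and member names (`MmfList`) are my own invention, and only the indexer is shown; a full IList implementation would add Count, Add, GetEnumerator, and so on.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Quick demonstration: store and fetch an element far into an 80 MB file
// without holding the whole data set in managed memory.
string path = Path.GetTempFileName();
using (var list = new MmfList<double>(path, 10_000_000))
{
    list[9_999_999] = 42.0;
    Console.WriteLine(list[9_999_999]);   // prints 42
}
File.Delete(path);

// Hypothetical sketch: a list of value-type items backed by a
// memory-mapped file; the OS pages data in and out on demand.
class MmfList<T> : IDisposable where T : struct
{
    private readonly MemoryMappedFile _mmf;
    private readonly MemoryMappedViewAccessor _view;
    private readonly int _itemSize = Marshal.SizeOf(typeof(T));

    public MmfList(string path, long capacity)
    {
        // The backing file is sized up front for `capacity` items.
        _mmf = MemoryMappedFile.CreateFromFile(path, FileMode.OpenOrCreate,
                                               null, capacity * _itemSize);
        _view = _mmf.CreateViewAccessor();
    }

    public T this[long index]
    {
        get { _view.Read(index * _itemSize, out T item); return item; }
        set { _view.Write(index * _itemSize, ref value); }
    }

    public void Dispose() { _view.Dispose(); _mmf.Dispose(); }
}
```

Note that the indexer takes a `long` rather than the `int` that IList requires, since the data set is larger than 2^31 bytes; a real implementation would have to decide how to reconcile that.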

I am also interested to hear about other approaches! I was thinking, for example, of an IList wrapper backed by db4o (someone here mentioned using a memory-mapped file as the IoAdapterFile), although db4o would probably add overhead compared with working directly with a memory-mapped file.

I saw a similar question from 2009, but it did not get any useful answers or serious ideas.

+7

3 answers

I found this PersistentDictionary<>, but it only works with strings, and from reading the source code I'm not sure it was designed for very large datasets.

More scalable (up to 16 TB), the ESENT PersistentDictionary<> uses the ESENT database engine built into Windows (XP and later) and can store any serializable objects composed of simple types.
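Usage looks roughly like this, assuming the EsentCollections library from the ManagedEsent project (Windows-only; the directory name and key/value types below are my own example, not anything prescribed by the library):

```csharp
using System;
using Microsoft.Isam.Esent.Collections.Generic; // ManagedEsent / EsentCollections

// The dictionary persists itself into the given directory; keys and values
// must be basic serializable types. Capacity is bounded by ESENT (terabytes),
// not by RAM.
using (var ticks = new PersistentDictionary<long, double>("TimeSeriesDb"))
{
    ticks[634812345678900000L] = 101.25;   // e.g. timestamp ticks -> price
    Console.WriteLine(ticks.Count);
}
```

It exposes the standard IDictionary surface, so existing dictionary-based code mostly works unchanged; it does not, however, give you IList-style positional access.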

Disk-based data structures (a dictionary, list, and array built on a smart serializer) looked like exactly what I needed, but it does not handle extremely large data sets smoothly, especially since it does not use the native .NET MemoryMappedFiles, and its support for 32-bit systems is experimental.

Update 1: I ended up implementing my own version that makes extensive use of .NET MemoryMappedFiles; it is very fast, and I will probably release it on CodePlex once I have made it more general-purpose.

Update 2 : TeaFiles.Net also worked great for my purpose. Highly recommended (and free).

+8

I see several options:

  • an in-memory DB
    SQLite, for example, can be used this way; it needs no setup or configuration, just deploy a DLL (1 or 2) with the application, and everything else can be done programmatically
  • load all the data into a temporary table in the database; for unknown (but large) amounts of data I found that this pays off very quickly (and the processing can usually then be done inside the database, which is even better!)
  • use a MemoryMappedFile and a fixed-size structure (accessed like an array via offsets), but beware that physical address space is the limit, unless you use some kind of "sliding window" to map only parts of the file into memory at a time
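The "sliding window" in the last option can be sketched as follows. This is illustrative only (the names, window size, and item type are my own assumptions): the backing file is larger than what we want resident, so one fixed-size view is mapped at a time and remapped whenever an access falls outside it.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

const int ItemSize = sizeof(long);
const long ItemCount = 1_000_000;
const long WindowItems = 64 * 1024;             // items per mapped window (~512 KB)

string path = Path.GetTempFileName();
using var mmf = MemoryMappedFile.CreateFromFile(
    path, FileMode.OpenOrCreate, null, ItemCount * ItemSize);

MemoryMappedViewAccessor view = null;
long windowStart = -1;                          // first item of the current window

// Returns a view covering the window that contains `index`,
// unmapping the previous window if we slid past its edge.
MemoryMappedViewAccessor ViewFor(long index)
{
    long start = (index / WindowItems) * WindowItems;
    if (start != windowStart)
    {
        view?.Dispose();                        // unmap the old window
        view = mmf.CreateViewAccessor(start * ItemSize, WindowItems * ItemSize);
        windowStart = start;
    }
    return view;
}

// Write then read an item far into the file; only one window is mapped at once.
ViewFor(900_000).Write((900_000 - windowStart) * ItemSize, 123L);
Console.WriteLine(ViewFor(900_000).ReadInt64((900_000 - windowStart) * ItemSize));   // prints 123
```

Sequential scans are cheap with this scheme because consecutive indices almost always hit the already-mapped window; random access pays a remap cost whenever it jumps windows, which echoes the point made in the answer below about random access being slow.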
+3

Memory-mapped files are a good way to do this. But they will be very slow if you need random access.

It is best to use a fixed structure size when storing the data (if possible); you can then use the offset as the identifier of a list item. However, deleting and sorting are always a problem.
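The offset-as-identifier idea amounts to simple arithmetic, sketched below. The `Tick` layout is my own invention for illustration: with a fixed record size, ID ↔ file position is just multiplication, but physically deleting or sorting records would move them and invalidate every ID, which is why a "tombstone" flag is a common workaround.

```csharp
using System;
using System.Runtime.InteropServices;

int recordSize = Marshal.SizeOf(typeof(Tick));   // fixed size: 8 + 8 + 1 = 17 bytes
long id = 42;
long offset = id * recordSize;                   // ID -> byte offset in the file
Console.WriteLine($"record {id} at byte offset {offset} (record size {recordSize})");

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Tick
{
    public long TimestampTicks;   // 8 bytes
    public double Price;          // 8 bytes
    public byte Deleted;          // 1 byte: tombstone flag instead of physical removal
}
```

Compaction (reclaiming tombstoned slots) then becomes an offline rebuild step, after which all IDs must be reissued.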

+1
