Memory management in C#

Good afternoon,

I have text files containing lists of (2-gram, count) pairs, compiled by analyzing the body text of newspaper articles, which I need to load into memory when the application I am developing starts. To store these pairs, I use the following structure:

private static Dictionary<String, Int64>[] ListaDigramas = new Dictionary<String, Int64>[27]; 

The idea of having an array of dictionaries is related to performance, as I read somewhere that one very large dictionary hurts performance. Each 2-gram goes into the dictionary whose index is the ASCII code of its first character minus 97 (or 26 if the first character is not in the range 'a' to 'z').
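The bucketing rule described above can be sketched roughly as follows. This is only an illustration: the class name DigramStore and the methods BucketIndex, Add and TryGetCount are hypothetical and not from the original post, and sending an empty string to bucket 26 is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class DigramStore
{
    // 26 buckets for 'a'..'z' plus one catch-all bucket at index 26.
    private static readonly Dictionary<string, long>[] ListaDigramas = CreateBuckets();

    private static Dictionary<string, long>[] CreateBuckets()
    {
        var buckets = new Dictionary<string, long>[27];
        for (int i = 0; i < buckets.Length; i++)
        {
            buckets[i] = new Dictionary<string, long>();
        }
        return buckets;
    }

    // First character minus 97 ('a'), or 26 for anything outside 'a'..'z'.
    // Assumption: null or empty input also goes to the catch-all bucket.
    public static int BucketIndex(string digram)
    {
        if (string.IsNullOrEmpty(digram)) return 26;
        char first = digram[0];
        return (first >= 'a' && first <= 'z') ? first - 'a' : 26;
    }

    // Stores (or overwrites) the count for a 2-gram in its bucket.
    public static void Add(string digram, long count)
    {
        ListaDigramas[BucketIndex(digram)][digram] = count;
    }

    // Looks up a 2-gram in the bucket chosen by its first character.
    public static bool TryGetCount(string digram, out long count)
    {
        return ListaDigramas[BucketIndex(digram)].TryGetValue(digram, out count);
    }
}
```

Every lookup has to go through the same BucketIndex function that was used when the pair was added, otherwise it searches the wrong dictionary.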

When I load the (2-gram, count) pairs into memory, the application takes up a total of 800 MB of RAM and stays that way until I use a program called Memory Cleaner to free up memory. After that, the memory used by the program drops to somewhere between 7 MB and 100 MB, without loss of functionality (I think).

Is there a way to free memory like this without using an external application? I tried using GC.Collect(), but it does not work in this case.

Thank you very much.

+4
7 answers

The only other idea I can come up with, if you really want to reduce memory usage, is to store the dictionary in a stream and compress it. Factors to consider are how often you access and load this data, and how well the data compresses. Text from newspaper articles will compress very well, and the performance hit may be smaller than you think.

Using an open source library like SharpZipLib ( http://www.icsharpcode.net/opensource/sharpziplib/ ), your code will look something like this:

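    // Serialize the dictionaries into a byte array.
    MemoryStream stream = new MemoryStream();
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(stream, ListaDigramas);
    byte[] dictBytes = stream.ToArray();

    // Compress the bytes. Keep a reference to the target stream so the result can
    // be read back, and dispose the deflater so all buffered data is flushed.
    MemoryStream compressed = new MemoryStream();
    using (Stream zipStream = new DeflaterOutputStream(compressed))
    {
        zipStream.Write(dictBytes, 0, dictBytes.Length);
    }
    byte[] compressedBytes = compressed.ToArray();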

To inflate, you need an InflaterInputStream and a loop to inflate a stream in pieces, but it's pretty simple.
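A read loop of that kind might look like the sketch below. To keep it free of external dependencies it uses the built-in System.IO.Compression.DeflateStream as a stand-in rather than SharpZipLib's InflaterInputStream; the loop has the same shape either way. The class name CompressionSketch and its two methods are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionSketch
{
    // Compresses a byte array into a new byte array.
    public static byte[] Compress(byte[] data)
    {
        using (var target = new MemoryStream())
        {
            // Disposing the DeflateStream flushes the remaining compressed data.
            using (var deflate = new DeflateStream(target, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            }
            // ToArray() still works after the MemoryStream has been closed.
            return target.ToArray();
        }
    }

    // Inflates the data piece by piece through a fixed-size buffer.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var source = new MemoryStream(compressed))
        using (var inflate = new DeflateStream(source, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            var buffer = new byte[4096];
            int read;
            // Read may return fewer bytes than requested and returns 0 at the end.
            while ((read = inflate.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, read);
            }
            return result.ToArray();
        }
    }
}
```

The bytes returned by Decompress would then be handed to whatever deserializer produced them in the first place.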

You will need to play with the application to make sure that the performance is acceptable. Remember, of course, that you will need enough memory to store the dictionary when you inflate it for use (unless someone has a clever idea to work with the object in a compressed state).

Honestly, keeping it in memory as it is and letting Windows swap it out to the page file is probably your best / fastest option.

Edit
I have never tried it, but you could serialize directly into the compression stream, which keeps the extra overhead of compression minimal (you still pay the overhead of serialization):

    BinaryFormatter formatter = new BinaryFormatter();

    // Serialize straight into the deflater; disposing it flushes the compressed data.
    MemoryStream compressed = new MemoryStream();
    using (Stream zipStream = new DeflaterOutputStream(compressed))
    {
        formatter.Serialize(zipStream, ListaDigramas);
    }
0

You are using a static field, so once the data is loaded it will most likely never be garbage collected. Unless you call the .Clear() method on these dictionaries, their contents will probably never become eligible for garbage collection.

+8

It's pretty mysterious to me how utilities like that ever end up on anyone's machine. All they do is call EmptyWorkingSet(). It may look good in Taskmgr.exe, but otherwise it's just a way to keep the hard drive busy for no reason. You will get the same effect by minimizing the main window of your application.
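For reference, calling that same Win32 function from your own process could be sketched roughly like this. The class name WorkingSetTrimmer and the method TryTrimCurrentProcess are hypothetical, and the platform check assumes a runtime that provides RuntimeInformation. As noted above, this only pushes pages out of the working set; it does not free any managed objects.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class WorkingSetTrimmer
{
    // Win32 function exported by psapi.dll; removes as many pages as possible
    // from the working set of the given process.
    [DllImport("psapi.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool EmptyWorkingSet(IntPtr hProcess);

    // Returns true if the working set was trimmed, false on failure
    // or when not running on Windows.
    public static bool TryTrimCurrentProcess()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return false;
        }

        using (Process current = Process.GetCurrentProcess())
        {
            return EmptyWorkingSet(current.Handle);
        }
    }
}
```

The pages are read back from disk as soon as the dictionaries are touched again, so the number shown in Task Manager climbs back up with use.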

+4

I don't know the details of how Memory Cleaner works, but since it is unlikely to know anything about the internals of a program's memory allocations, the best it can probably do is force the pages to be swapped out to disk, reducing the program's apparent memory usage.

Garbage collection will not help unless you have objects that you no longer use. If you are using the dictionaries, which the GC assumes you are because they are reachable from a static field, then all the objects in them are considered in use and must remain part of the program's active memory. There is no way around this.
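A small experiment can make this visible. The sketch below is illustrative only: RootDemo, Cache, Load, Release and Collect are hypothetical names, and a WeakReference is used purely to observe whether the dictionary is still alive.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class RootDemo
{
    // A static field acts as a GC root for everything reachable from it.
    private static Dictionary<string, long> Cache;

    // Fills the static field and returns a weak reference that lets the
    // caller observe the dictionary without keeping it alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Load()
    {
        Cache = new Dictionary<string, long> { { "of the", 1 } };
        return new WeakReference(Cache);
    }

    // Drops the only strong reference, making the dictionary collectable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Release()
    {
        Cache = null;
    }

    // Forces a full collection, for demonstration purposes only.
    public static void Collect()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }
}
```

Calling Load() and then Collect() leaves IsAlive on the returned WeakReference true, because the static field still references the dictionary. Only after Release() can a later collection reclaim it, and that is not an option when the data has to stay available.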

+3

What you are seeing is the application's total memory usage. It is 800 MB and will stay that way. As the comments point out, cleaning the memory only makes it look as if the application uses less memory. Try accessing all the values in the dictionary after running the memory cleaner: you will see memory usage climb again (the data is being read back from swap).

What you probably want is not to load all this data into memory. Is there a way to get the same results using an algorithm?

Alternatively, and this is likely to be the best option, if you are really just storing information here, you can use a database. If a regular database such as SQL Express is too cumbersome, you can always go with SQLite.

+1

Thanks so much for all the answers. In fact, the data should be loaded during the whole time the application is running, so based on your answers, I think there is nothing better to do... I could try an external database, but since I already have to deal with two other databases at the same time, I think this is not a good idea.

Do you consider it possible to use three databases simultaneously and not lose performance?

0

If you properly manage application resources, the actual memory used may not be what you see (if it is checked using the task manager).

The garbage collector will free unused memory as far as it can. It is usually not a good idea to force a collection ... see this post

"the data should be loaded during the whole time the application is running" - why?

0
