Caching data from a large file in memory in Java

Hi, I am working on a spelling corrector (a natural language correction project), and I read the data from a file whose size is 6.2 MB (assume it could be as large as 1 GB). Although it works fine, the problem is that every time I run the Java program it has to load the data into memory again, and it takes the same amount of time on every start.

How can this data be cached in memory in Java? Can someone suggest a workaround?

Basically, what I want to know is the procedure for storing the contents of a large file in memory so that I do not have to read it again. Let's say the file is gigabytes in size.

+4
4 answers

As I understand it, loading and parsing the data from the file and building the cache takes time, and you want to avoid paying that cost on every run.

In this case, I suggest you use Ehcache. Ehcache (which is open source and Apache-licensed) will maintain the cache for you, help keep your application from running out of memory, and can persist the cache state to disk.

Thus, the next time you start your application, you can configure it to load directly from the Ehcache data file, so you will not parse your file again and again.

You can still load everything you use into memory; the only difference is that it is loaded through the Ehcache APIs.

+2

6.2 MB of data is likely to be held in your operating system's file cache, since it is a relatively small amount of data, so reading it should not take long. You should investigate whether parsing the data is the slow part and, if so, whether you can cache the parsed data in a binary file for fast loading.
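
As a rough sketch of the "cache the parsed data in a binary file" idea: the class below parses a text file once, writes the result to a compact binary file, and reads that binary file on later runs as long as it is not older than the source. The class name `WordCountCache`, the "word count" line format, and the cache layout are assumptions made up for illustration; the question does not say what the data looks like.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: caches a parsed word -> count table in a binary file.
final class WordCountCache {

    // Stand-in for the real (slow) parsing step; assumes lines like "word 42".
    static Map<String, Integer> parse(Path source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : Files.readAllLines(source)) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {
                counts.put(parts[0], Integer.parseInt(parts[1]));
            }
        }
        return counts;
    }

    // Returns the parsed data, using the binary cache when it is up to date.
    static Map<String, Integer> load(Path source, Path cache) throws IOException {
        boolean cacheFresh = Files.exists(cache)
                && Files.getLastModifiedTime(cache)
                        .compareTo(Files.getLastModifiedTime(source)) >= 0;
        if (cacheFresh) {
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(Files.newInputStream(cache)))) {
                int size = in.readInt();
                Map<String, Integer> counts = new HashMap<>(size * 2);
                for (int i = 0; i < size; i++) {
                    counts.put(in.readUTF(), in.readInt());
                }
                return counts;
            }
        }
        // Cache missing or stale: parse the source and rewrite the cache.
        Map<String, Integer> counts = parse(source);
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(cache)))) {
            out.writeInt(counts.size());
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue());
            }
        }
        return counts;
    }
}
```

Calling `WordCountCache.load(source, cache)` on the first run parses the text and writes the cache; later runs skip the parsing. Whether this helps depends on parsing actually being the expensive part, so measure first.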

+5

6.2 MB is not very big. Unless loading it takes a long time and you cannot use a background thread to load the file, I would not worry about it.
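
For the background-thread option, a minimal sketch could look like the following. It starts reading the file on another thread so the rest of the startup work can proceed, and only blocks when the data is actually needed. `BackgroundLoader` and `loadAsync` are hypothetical names, and reading the file as UTF-8 lines is an assumption about the data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper: loads a file on a background thread.
final class BackgroundLoader {

    // Starts loading immediately and returns a future holding the lines.
    static CompletableFuture<List<String>> loadAsync(Path file) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } catch (IOException e) {
                // supplyAsync cannot throw checked exceptions, so wrap it.
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

Typical use would be to call `loadAsync(path)` first thing at startup, do the other initialization, and call `join()` on the returned future at the point where the dictionary is first needed.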

You can use memory-mapped files, but working with them is not so simple. Memory-mapped files are useful if you have 1 GB to 1 TB of data.
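
To give an idea of what working with memory-mapped files involves, here is a small sketch that maps a file read-only with `FileChannel.map` and scans it for newline bytes. A single mapping cannot exceed `Integer.MAX_VALUE` bytes, so larger files are mapped in chunks. `MappedFileDemo` and `countLines` are made-up names, and counting lines is just a placeholder for whatever processing you need.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: scans a file through read-only memory mappings.
final class MappedFileDemo {

    // Counts '\n' bytes in the file, mapping it in chunks of up to ~2 GB.
    static long countLines(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            long lines = 0;
            long position = 0;
            while (position < size) {
                long chunk = Math.min(Integer.MAX_VALUE, size - position);
                MappedByteBuffer buffer =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, chunk);
                while (buffer.hasRemaining()) {
                    if (buffer.get() == '\n') {
                        lines++;
                    }
                }
                position += chunk;
            }
            return lines;
        }
    }
}
```

The operating system pages the mapped data in on demand, so nothing is copied into the Java heap up front. The price is that you work with raw bytes and have to decode any structure yourself.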

+4

If this is about coding and debugging your program, and reloading the resources after every change you make takes too much time, then consider JRebel Social (if it's a non-commercial project, or JRebel if it is commercial). It lets you fix bugs or make changes in the code without restarting the JVM, so the loaded data (for example, data held in a static variable) stays in memory without any cache and without a restart. See my previous question: Resource loading after Java.

But if this is for production and your goal is to save memory rather than loading time (which in most cases is only a problem at startup), then Ehcache or another caching library should be sufficient.

+1
