Are files in memory the same size as on the file system?

Question


I have been working with large log files (~100 MB) in Java and noticed that gzip compresses them to about 3 MB, making them roughly 35 times smaller.

Therefore, I wonder: Do modern OSs compress files before loading them into memory? It seems silly to use 100 MB of RAM to store a file that really only has 3 MB of information.

Or is it the other way around? Does the process of reading a file (decoding character encodings and so on) mean that a file that takes 100 MB on disk actually occupies more than 100 MB in memory?

* bonus points: any recommendations for preprocessing my files before loading them, to reduce JVM memory use? (The files have the same format as Apache logs.)

+5

java file filesystems memory centos

mchen.ja Aug 05 '15 at 1:24


2 answers

You get only what you ask for: if you compress the data, it is compressed; otherwise it is not. Most of the time there is a slight difference between the size in memory and the size on disk, but only because space on disk is allocated in fixed-size units (sectors). Even a 1-byte file usually takes up more than 1 byte on disk, because the OS reserves a whole sector for it. The sector size depends on the OS; typical values are 512, 2048 or 4096 bytes.
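The rounding described above can be sketched as plain arithmetic. This is only an illustration, assuming the file's data is stored in whole allocation units of a fixed size; the class and method names are made up for the example, and real file systems add metadata or may handle tiny files differently.

```java
final class DiskFootprint {

    /** Rounds a file length up to a whole number of allocation units (hypothetical helper). */
    static long allocatedBytes(long fileLength, long unitSize) {
        // Ceiling division: a partially filled unit still occupies a full unit.
        long units = (fileLength + unitSize - 1) / unitSize;
        return units * unitSize;
    }

    public static void main(String[] args) {
        // The unit sizes mentioned in the answer above.
        for (long unit : new long[] {512, 2048, 4096}) {
            System.out.println("1-byte file, " + unit + "-byte unit: "
                    + allocatedBytes(1, unit) + " bytes on disk");
        }
    }
}
```

For a 100 MB log file the rounding costs less than one unit, so it is negligible next to the file itself.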

0

innoSPG Aug 05 '15 at 1:29


Jason · Accepted Answer · 2015-08-05T01:29:15+0000

Can modern OSs compress files before loading them into memory? It seems silly to use 100 MB of RAM to store a file that really only has 3 MB of information.

This will depend on the application. Some applications may compress data stored in memory, others may not.
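As an illustration of an application choosing to keep data compressed in memory, here is a minimal sketch using the JDK's `java.util.zip` classes. The class name `InMemoryGzip` and the sample log line are invented for the example; `readAllBytes` needs Java 9+ and `String.repeat` needs Java 11+.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class InMemoryGzip {

    /** Compresses a byte array into a gzip-framed byte array held in memory. */
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the trailer has been written.
        return buffer.toByteArray();
    }

    /** Restores the original bytes from a gzip-framed byte array. */
    static byte[] decompress(byte[] packed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up, highly repetitive sample data; real logs will compress less well.
        String line = "127.0.0.1 - - [05/Aug/2015:01:24:00 +0000] \"GET /index.html HTTP/1.1\" 200 1024\n";
        byte[] raw = line.repeat(10_000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println("raw bytes:    " + raw.length);
        System.out.println("packed bytes: " + packed.length);
    }
}
```

The trade-off is CPU time: the data has to be decompressed again before it can be searched or parsed, which is why this is a per-application decision.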

Or is it the other way around? Does the process of reading a file (decoding character encodings and so on) mean that a file that takes 100 MB on disk actually occupies more than 100 MB in memory?

Again, it is completely application dependent.
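One concrete way the in-memory size can differ in Java is character encoding. The sketch below compares the byte count of the same ASCII text in UTF-8 (a common on-disk encoding) and UTF-16 (the logical representation of a Java `String`). The class name is made up for the example. Whether a given JVM really spends two bytes per character is an assumption: JVMs with compact strings (Java 9 and later) can store Latin-1-only strings in one byte per character, and every `String` also carries object overhead that is not counted here.

```java
import java.nio.charset.StandardCharsets;

final class EncodingSizes {

    /** Number of bytes the text occupies when encoded as UTF-8. */
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes the text occupies as UTF-16 (little-endian, so no byte-order mark is added). */
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String text = "GET /index.html HTTP/1.1";
        System.out.println("UTF-8 bytes:  " + utf8Bytes(text));
        System.out.println("UTF-16 bytes: " + utf16Bytes(text));
    }
}
```

So if the application reads the bytes into a `byte[]`, the size matches the file; if it decodes them into strings, the footprint depends on the runtime.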

* bonus points: any recommendations for preprocessing my files before loading them, to reduce JVM memory use? (The files have the same format as Apache server logs.)

Do not load any data into memory that you do not need for processing or display. Anything that is needed only to compute an average or a sum can be loaded temporarily, added to a running total, and then discarded.
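A minimal sketch of that approach: read the log one line at a time and keep only a running total. It assumes the Apache common log format, where the last field is the response size in bytes (or `-` when no body was sent); the class name, method name and sample lines are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class StreamingLogStats {

    /** Averages the last field of each line while holding only one line in memory at a time. */
    static double averageResponseBytes(Reader source) {
        long total = 0;
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumption: the response size is the last space-separated field.
                String field = line.substring(line.lastIndexOf(' ') + 1);
                try {
                    total += Long.parseLong(field);
                    count++;
                } catch (NumberFormatException e) {
                    // "-" or a malformed line: skip it rather than fail.
                }
                // The line becomes garbage here; nothing but the totals is retained.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count == 0 ? 0.0 : (double) total / count;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for a real log file.
        String log =
                "127.0.0.1 - - [05/Aug/2015:01:24:00 +0000] \"GET /a HTTP/1.1\" 200 100\n"
              + "127.0.0.1 - - [05/Aug/2015:01:24:01 +0000] \"GET /b HTTP/1.1\" 200 300\n"
              + "127.0.0.1 - - [05/Aug/2015:01:24:02 +0000] \"GET /c HTTP/1.1\" 304 -\n";
        System.out.println(averageResponseBytes(new StringReader(log)));
    }
}
```

For a real file, the reader could come from `Files.newBufferedReader(path)`. If the logs are kept gzipped on disk, wrapping `Files.newInputStream(path)` in a `GZIPInputStream` and an `InputStreamReader` lets the same loop run without ever holding the full 100 MB in memory.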
