Erlang file I/O: large binary files and gzip streaming

I have two questions about file I/O in Erlang. What is the best way to do each of the following:

  • reading large binary files (many gigabytes) without copying the entire file into memory
  • reading a gzipped binary file as a stream

Thanks!

2 answers

In my experience, file:read/2 is very slow if it is called often with small amounts of data, even with the read_ahead and raw options. You have to implement a binary buffer on top of it. If that is what is meant by block-oriented processing, I agree.

I am talking about a running time of several hours (with file:read/2 only) compared to 2 minutes (with buffering implemented in pure Erlang).

Here are my measurements for reading a few tens of bytes at a time:

    %% Bufsize    Runtime [ns]
    %% 50         169369703
    %% 100        118288832
    %% 1000       70187233
    %% 10000      64615506
    %% 100000     65087411
    %% 1000000    64747497

In this example, performance does not really improve beyond a buffer size of about 10 KB, because by then the relative overhead of the file:read/2 calls has become quite small.
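For illustration, the binary buffer described above might look roughly like the sketch below. This is not the code behind the measurements. The module name bufread, the #buf record and the open/2, read/2 and close/1 functions are all hypothetical. Only file:open/2, file:read/2 and file:close/1 are real OTP calls. The idea is that small reads are served from an in-memory binary, and file:read/2 is called only when that binary runs short.

```erlang
-module(bufread).
-export([open/2, read/2, close/1]).

%% Hypothetical buffered reader (illustrative names, not an OTP API).
%% Small reads are served from `data`; file:read/2 is called only to
%% refill it, in blocks of at least `size` bytes.
-record(buf, {fd, size, data = <<>>}).

%% open(Path, BufSize) -> {ok, Buf} | {error, Reason}
open(Path, BufSize) when is_integer(BufSize), BufSize > 0 ->
    case file:open(Path, [read, binary, raw]) of
        {ok, Fd} -> {ok, #buf{fd = Fd, size = BufSize}};
        Error    -> Error
    end.

%% read(Buf, N) -> {ok, Bin, Buf1} | {eof, Buf1} | {error, Reason}
%% Bin has N bytes, or fewer only when the end of the file is reached.
read(#buf{data = Data} = Buf, N) when byte_size(Data) >= N ->
    %% Enough bytes are buffered: split them off without touching the file.
    <<Bin:N/binary, Rest/binary>> = Data,
    {ok, Bin, Buf#buf{data = Rest}};
read(#buf{fd = Fd, size = Size, data = Data} = Buf, N) ->
    %% Not enough buffered: refill with at least one full block.
    Want = max(Size, N - byte_size(Data)),
    case file:read(Fd, Want) of
        {ok, More} ->
            read(Buf#buf{data = <<Data/binary, More/binary>>}, N);
        eof when Data =:= <<>> ->
            {eof, Buf};
        eof ->
            %% End of file: hand back whatever is left in the buffer.
            {ok, Data, Buf#buf{data = <<>>}};
        {error, _} = Error ->
            Error
    end.

%% close(Buf) -> ok | {error, Reason}
close(#buf{fd = Fd}) ->
    file:close(Fd).
```

A caller would open the file once, for example with bufread:open(Path, 65536), and then thread the returned buffer through successive bufread:read/2 calls. Only one block is held in memory at a time, so this also suits files of many gigabytes. For the gzip case, file:open/2 accepts a compressed option for reading gzip files. Whether it can be combined with raw in a given OTP release is something to check in the file module documentation.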
