Erlang File I/O: large binary files and gzip streaming

Question

I have two questions about file I/O in Erlang. What is the best way to do the following:

reading large binary files (many gigabytes) without copying the entire file to memory
reading a gzipped binary file as a compressed stream

Thanks!

+6

file file-io erlang gzip

Erlang Sep 27 '10 at 20:56

2 answers

In my experience, file:read/2 is very slow if it is called often with small amounts of data, even with the read_ahead and raw options. You have to implement a binary buffer on top of it. If that is what is meant by block-oriented processing, I agree.

I am talking about a running time of several hours (with file:read/2 only) compared to 2 minutes (with buffering implemented in pure Erlang).

Here are my measurements for reading a few tens of bytes at a time:

 %% Bufsize vs. runtime [ns]
 %% 50        169369703
 %% 100       118288832
 %% 1000       70187233
 %% 10000      64615506
 %% 100000     65087411
 %% 1000000    64747497

In this example, performance does not really improve beyond a buffer size of about 10 KB, because by then the relative overhead of the file:read/2 calls has become quite small.
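
The buffering idea above could be sketched roughly as follows. This is not the code behind the measurements, only a minimal illustration: the module name buffered_reader, the reader record, and the open/read/close functions are hypothetical names chosen for the example. Only file:open/2, file:read/2 and file:close/1 are real OTP calls.

```erlang
-module(buffered_reader).
-export([open/2, read/2, close/1]).

%% Hypothetical reader state: file handle, refill size, pending bytes.
-record(reader, {fd, bufsize, buf = <<>>}).

%% Open File for raw binary reading; BufSize is the refill size in bytes.
open(File, BufSize) when is_integer(BufSize), BufSize > 0 ->
    case file:open(File, [read, raw, binary]) of
        {ok, Fd} -> {ok, #reader{fd = Fd, bufsize = BufSize}};
        {error, _} = Error -> Error
    end.

%% Return {ok, Bytes, NewReader}, eof, or {error, Reason}.
%% Bytes is shorter than N only when the file ends first.
read(#reader{buf = Buf} = R, N) when byte_size(Buf) >= N ->
    %% Enough bytes are buffered: serve the request without touching the file.
    <<Bytes:N/binary, Rest/binary>> = Buf,
    {ok, Bytes, R#reader{buf = Rest}};
read(#reader{fd = Fd, bufsize = BufSize, buf = Buf} = R, N) ->
    %% Buffer too small: refill it with one large file:read/2 call.
    case file:read(Fd, max(BufSize, N - byte_size(Buf))) of
        {ok, Data} -> read(R#reader{buf = <<Buf/binary, Data/binary>>}, N);
        eof when Buf =:= <<>> -> eof;
        eof -> {ok, Buf, R#reader{buf = <<>>}};
        {error, _} = Error -> Error
    end.

close(#reader{fd = Fd}) -> file:close(Fd).
```

With a reader opened as buffered_reader:open(File, 10000), many small read/2 calls are served from the in-memory binary, and file:read/2 is only called about once per 10 KB of input.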

+5

tzp Mar 19 '12 at 13:58

Hynek -Pichi- Vychodil · Accepted Answer · 2010-09-27T21:50:26+0000

See file:read/2 for sequential, block-wise access and file:pread/2,3 for random access.
See the compressed option of file:open/2 for reading gzipped files.
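
As a rough illustration of those calls, here is a minimal sketch. The module name block_reader and the functions fold_blocks/5 and read_at/3 are hypothetical; file:open/2, file:read/2, file:pread/3 and file:close/1 are the OTP functions mentioned above. The sketch assumes that only one block needs to be in memory at a time.

```erlang
-module(block_reader).
-export([fold_blocks/5, read_at/3]).

%% Fold Fun over File in blocks of at most BlockSize bytes.
%% Pass [compressed] as Extra to read a gzip file as decompressed data.
fold_blocks(File, Extra, BlockSize, Fun, Acc0) ->
    case file:open(File, [read, binary | Extra]) of
        {ok, Fd} ->
            try loop(Fd, BlockSize, Fun, Acc0)
            after file:close(Fd)
            end;
        {error, _} = Error -> Error
    end.

%% Sequential access: one file:read/2 call per block until eof.
loop(Fd, BlockSize, Fun, Acc) ->
    case file:read(Fd, BlockSize) of
        {ok, Block} -> loop(Fd, BlockSize, Fun, Fun(Block, Acc));
        eof -> {ok, Acc};
        {error, _} = Error -> Error
    end.

%% Random access: read Len bytes at byte Offset (uncompressed files only).
read_at(File, Offset, Len) ->
    {ok, Fd} = file:open(File, [read, raw, binary]),
    try file:pread(Fd, Offset, Len)
    after file:close(Fd)
    end.
```

For example, block_reader:fold_blocks(File, [compressed], 65536, fun(B, N) -> N + byte_size(B) end, 0) would count the decompressed bytes of a gzip file without holding the whole file in memory.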
