.NET binary read performance

I have some very large binary files from which several thousand raw video frames are read sequentially and processed, and I'm now trying to optimize the code since it appears to be CPU-bound rather than I/O-bound.

Currently, frames are being read this way, and I suspect this is the biggest culprit:

    private byte[] frameBuf;
    BinaryReader binRead = new BinaryReader(FS);

    // Initialize a new buffer of sizeof(frame)
    frameBuf = new byte[VARIABLE_BUFFER_SIZE];

    // Read sizeof(frame) bytes from the file
    frameBuf = binRead.ReadBytes(VARIABLE_BUFFER_SIZE);

Would it make much difference in .NET to reorganize the I/O so that it avoids creating all of these new byte arrays with each frame?

My understanding of .NET's memory allocation machinery is weak, since I come from a pure C/C++ background. My idea is to rewrite this to share a static buffer class that holds one very large shared buffer together with an integer tracking the actual frame size, but I like the simplicity and readability of the current implementation, and I would rather keep it if the CLR already handles this in some way I'm not aware of.
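Roughly, what I have in mind is something like the following sketch (the class name, member names, and MAX_FRAME_SIZE are purely illustrative, not from my actual code):

    // Rough sketch of the shared-buffer idea: one big reusable array plus the current frame length.
    static class FrameBuffer
    {
        public const int MAX_FRAME_SIZE = 8 * 1024 * 1024;       // assumed upper bound on frame size
        public static readonly byte[] Data = new byte[MAX_FRAME_SIZE];
        public static int Length;                                 // number of valid bytes in the current frame
    }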

Any input is welcome.

+6
performance c# file binary
2 answers

You don't need to initialize frameBuf if you use binRead.ReadBytes: it returns a new byte array, which replaces the one you just created. That does, however, create a new array for every read.

If you want to avoid allocating a new byte array each time, you can use binRead.Read, which puts the bytes into an array you supply. However, if other threads are using that array, they will see its contents change right in front of them. Be sure you are done with the buffer before reusing it.
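For example, here is a minimal sketch of that approach, assuming FS and VARIABLE_BUFFER_SIZE are the stream and frame size from the question, and ProcessFrame is a placeholder for your actual per-frame processing:

    byte[] frameBuf = new byte[VARIABLE_BUFFER_SIZE];   // allocated once, reused for every frame
    BinaryReader binRead = new BinaryReader(FS);

    int bytesRead;
    while ((bytesRead = binRead.Read(frameBuf, 0, frameBuf.Length)) > 0)
    {
        // Only the first bytesRead bytes of frameBuf are valid for this frame.
        // Note that Read may return fewer bytes than requested, so an inner loop
        // may be needed if you must always fill a complete frame.
        ProcessFrame(frameBuf, bytesRead);
    }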

+7

You need to be careful here. It is very easy to get completely bogus benchmark results for code like this, results that never reproduce in real use. The problem is the file system cache: it caches the data you read from the file. The trouble starts when you run your test over and over, tweaking the code and looking for improvements.

The second and every subsequent time the test runs, the data no longer comes off the disk at all. It is still sitting in the cache; only a memory-to-memory copy is needed to get it into your buffer. That is very fast, a microsecond or so of overhead plus the time for the copy, which runs at 5 gigabytes per second or better on modern machines.

Your test will therefore show that you spend a lot of time allocating the buffer and processing the data, relative to the time spent reading the data.

That will rarely reproduce in real use. The data won't be in the cache yet; now the disk drive has to seek for it (many milliseconds) and it has to be read off the platter (at best a few tens of megabytes per second). Reading the data now takes a good three or four orders of magnitude longer. If you manage to make the processing step twice as fast, your program will only run about 0.05% faster. Give or take.
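One way to see this is to time the read and the processing separately, for example with System.Diagnostics.Stopwatch, and compare the first run after a reboot (cold cache) against later runs (warm cache). This is only a sketch; binRead, frameBuf, and ProcessFrame stand in for the reader, buffer, and processing step from the question:

    var readTime = new Stopwatch();
    var processTime = new Stopwatch();

    int bytesRead;
    readTime.Start();
    while ((bytesRead = binRead.Read(frameBuf, 0, frameBuf.Length)) > 0)
    {
        readTime.Stop();

        processTime.Start();
        ProcessFrame(frameBuf, bytesRead);   // placeholder for the real processing step
        processTime.Stop();

        readTime.Start();
    }
    readTime.Stop();

    Console.WriteLine("read: " + readTime.ElapsedMilliseconds +
                      " ms, process: " + processTime.ElapsedMilliseconds + " ms");

On a warm cache the read time will look tiny next to the processing time; on a cold cache the proportions flip.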

+1
