Writing Text Files - Performance?

We are starting a new project that, at the end of its process, writes about 5,000 files of varying sizes. All of them are plain text files, and I wonder how best to write them (if anyone has experience).

I was thinking about using file templates (preloaded into memory) or direct file streams.

If anyone has experience with this and can share it, I would appreciate it. Thanks.

+7
4 answers

I would suggest writing a prototype to check in advance whether you can meet the performance requirements with your intended implementation. But keep in mind that hard drives are sometimes difficult to benchmark (although their name is probably not related to this fact :-)): they have caches, and their performance can vary widely depending on background processes, fragmentation, the file system, etc.

The rule of thumb is to reduce the number of write calls. It is usually fastest to write everything to a memory buffer first and then write that buffer to disk in one go. (Writing character by character would be very slow.)
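A minimal C# sketch of the buffer-then-write approach (the file name and contents here are made up):

    using System.IO;
    using System.Text;

    // Build the whole file in memory first...
    var buffer = new StringBuilder();
    for (int i = 0; i < 1000; i++)
        buffer.AppendLine("line " + i);   // placeholder content

    // ...then hit the disk with a single large write.
    File.WriteAllText("output1.txt", buffer.ToString());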

Depending on the file system, it may also be faster to write one large file instead of many small ones, so creating a ZIP archive may be an alternative.
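A sketch of that alternative using the System.IO.Compression classes available since .NET 4.5 (archive name and contents are placeholders):

    using System.IO;
    using System.IO.Compression;

    // Pack all 5,000 small text files into one archive,
    // so the file system only ever sees a single large file.
    using (var archive = ZipFile.Open("output.zip", ZipArchiveMode.Create))
    {
        for (int i = 0; i < 5000; i++)
        {
            var entry = archive.CreateEntry("file" + i + ".txt");
            using (var writer = new StreamWriter(entry.Open()))
                writer.Write("contents of file " + i);   // placeholder content
        }
    }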

Windows has the native MultiMedia File I/O (mmio) API, which in several cases can be faster than the standard I/O mechanisms (http://home.roadrunner.com/~jgglatt/tech/mmio.htm), even if your content is not "multimedia."
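If you wanted to try it from .NET, a rough P/Invoke sketch might look like this. The winmm.dll entry points are real, but the flag values and marshaling here are written from memory and should be verified against mmsystem.h before use:

    using System;
    using System.Runtime.InteropServices;
    using System.Text;

    class MmioDemo
    {
        // Flag values as defined in mmsystem.h (verify before use).
        const uint MMIO_WRITE  = 0x00000001;
        const uint MMIO_CREATE = 0x00001000;

        [DllImport("winmm.dll", CharSet = CharSet.Ansi)]
        static extern IntPtr mmioOpen(string szFilename, IntPtr lpmmioinfo, uint dwOpenFlags);

        [DllImport("winmm.dll")]
        static extern int mmioWrite(IntPtr hmmio, byte[] pch, int cch);

        [DllImport("winmm.dll")]
        static extern int mmioClose(IntPtr hmmio, uint wFlags);

        static void Main()
        {
            IntPtr hmmio = mmioOpen("test.txt", IntPtr.Zero, MMIO_CREATE | MMIO_WRITE);
            byte[] data = Encoding.ASCII.GetBytes("This is a test");
            mmioWrite(hmmio, data, data.Length);
            mmioClose(hmmio, 0);
        }
    }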

+2

The curious thing is that only you know the "best way."

For example, writing a large file in small chunks may be an acceptable solution, since you do not consume too much memory but accept slower writes. Bad: long I/O. Good: low memory.

Or collect the data into large chunks and perform single atomic writes. Bad: requires a lot of memory. Good: this is the usually suggested choice; Open / Read-Write / Close as quickly as possible.

Or use MemoryMappedFiles: you keep a persistent pointer (usually) into some file, trading off between available performance and low memory use. Usually a very good, if not the only, choice for very large files, such as multimedia file processing (see the sketch after this list).

The choice is up to you.
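For the memory-mapped option, a minimal C# sketch (the file name, capacity, and contents are made up):

    using System.IO;
    using System.IO.MemoryMappedFiles;
    using System.Text;

    byte[] data = Encoding.UTF8.GetBytes("This is a test");

    // Map a new 1 MB file into memory and write through a view stream;
    // the OS pages the data out to disk behind the scenes.
    using (var mmf = MemoryMappedFile.CreateFromFile(
               "mapped.txt", FileMode.Create, null, 1024 * 1024))
    using (var stream = mmf.CreateViewStream())
    {
        stream.Write(data, 0, data.Length);
    }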

As for material for in-depth performance analysis, I would suggest an excellent source such as Rico Mariani's blog.

0

If you use the standard .NET libraries and do something like this (inside a try/catch block):

    using (StreamWriter writer = new StreamWriter("filenumber1.txt"))
    {
        writer.Write("This is a test");     // write without a newline
        writer.WriteLine("This is a test"); // write with a newline
    }

Performance should be reasonable. When writing a file, just keep the writes to a decent size (read and write in chunks if necessary) to avoid memory problems. For example, if the data making up a file is 10 gigabytes, you will need to write it in chunks.
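A minimal sketch of chunked writing (the buffer size and file names are arbitrary):

    using System.IO;

    // Copy a large source to a destination in 64 KB chunks,
    // so the whole payload never has to fit in memory at once.
    using (var source = File.OpenRead("huge-source.dat"))
    using (var dest = File.Create("filenumber1.txt"))
    {
        byte[] chunk = new byte[64 * 1024];
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            dest.Write(chunk, 0, read);
    }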

I once had to read thousands of blobs from a database and push them out to the file system on distribution servers. My initial approach was a single reader and writer. That worked, and then I moved to a multi-threaded approach and got decent performance.

First, I would implement the single-threaded approach and measure it. If it takes X amount of time and everyone is happy, you are done. If you need to get to Y, implement the multi-threaded approach.

Just a note: I would make the number of threads configurable so that performance can be tuned. Too many threads and it slows down; you need to find the sweet spot, so make it configurable. The right number usually depends on the hardware.
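As an illustration of configurable parallelism using Parallel.ForEach (the thread count, file names, and contents are all hypothetical):

    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;

    int threadCount = 4;  // read this from configuration in practice

    // Hypothetical work list: 5,000 file names and their contents.
    var files = Enumerable.Range(0, 5000)
                          .Select(i => ("file" + i + ".txt", "contents " + i));

    // Cap the number of concurrent writers at the configured value.
    Parallel.ForEach(
        files,
        new ParallelOptions { MaxDegreeOfParallelism = threadCount },
        f => File.WriteAllText(f.Item1, f.Item2));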

0

With this much writing to disk, I would look more at the disk layout (RAID, etc.), since saving a few CPU cycles may not be as useful as having a faster disk subsystem.

0
