Node: fs write() doesn't write inside a loop. Why not?

I want to create a write stream and write to it as my data comes in. However, while I can create the file, nothing is written to it. Eventually, the process runs out of memory.

I have narrowed the problem down to the fact that I am calling write() from inside a loop.

Here is a simple example:

    'use strict'
    var fs = require('fs');
    var wstream = fs.createWriteStream('myOutput.txt');

    for (var i = 0; i < 10000000000; i++) {
      wstream.write(i + '\n');
    }

    console.log('End!');
    wstream.end();

Nothing is written, not even a simple 'hello'. Why? And how do I write to a file from inside a loop?

+4
javascript fs
2 answers

The problem is that you never give the stream a chance to drain its buffer. Eventually that buffer fills up and you run out of memory.

WriteStream.write returns a boolean indicating whether the chunk was handled immediately or had to be buffered. When it returns false, you should wait for the 'drain' event, which indicates that the internal buffer has been drained.

Here is one way to write code that uses the return value of write and the drain event:

    'use strict'
    var fs = require('fs');
    var wstream = fs.createWriteStream('myOutput.txt');

    function writeToStream(i) {
      for (; i < 10000000000; i++) {
        if (!wstream.write(i + '\n')) {
          // Wait for it to drain, then start writing again from where we left off
          wstream.once('drain', function() {
            writeToStream(i + 1);
          });
          return;
        }
      }
      console.log('End!');
      wstream.end();
    }

    writeToStream(0);
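On newer Node.js versions the same back-pressure handling can be written without the recursive callback, using async/await and events.once (available since Node 11.13). This is a minimal sketch of that alternative, not part of the original answer:

    'use strict';
    const fs = require('fs');
    const { once } = require('events'); // events.once requires Node 11.13+

    async function writeToStream() {
      const wstream = fs.createWriteStream('myOutput.txt');
      for (let i = 0; i < 10000000000; i++) {
        if (!wstream.write(i + '\n')) {
          // Buffer is full: suspend the loop until the stream has drained
          await once(wstream, 'drain');
        }
      }
      wstream.end();
      console.log('End!');
    }

    writeToStream();

The logic is identical to the callback version above; the loop simply suspends instead of recursing.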
+5

To add to @MikeC's excellent answer, here are some relevant details from the current documentation (v8.4.0) for writable.write():

If false is returned, further attempts to write data to the stream should stop until the 'drain' event is emitted.

While a stream is not draining, calls to write() will buffer chunk, and return false. Once all currently buffered chunks are drained (accepted for delivery by the operating system), the 'drain' event will be emitted. It is recommended that once write() returns false, no more chunks be written until the 'drain' event is emitted. While calling write() on a stream that is not draining is allowed, Node.js will buffer all written chunks until maximum memory usage occurs, at which point it will abort unconditionally. Even before it aborts, high memory usage will cause poor garbage collector performance and high RSS (which is not typically released back to the system, even after the memory is no longer required).

and from the guide on backpressuring in streams:

In any scenario where the data buffer has exceeded the highWaterMark or the write queue is currently busy, .write() will return false.

When false is returned, the backpressure system kicks in.

Once the data buffer is emptied, a .drain() event will be emitted and resume the incoming data flow.

Once the queue is finished, backpressure will allow data to be sent again. The space in memory that was being used will free itself up and prepare for the next batch of data.

    +-------------------+         +=================+
    |  Writable Stream  +--------->  .write(chunk)  |
    +-------------------+         +=======+=========+
                                          |
                       +------------------v---------+
     +-> if (!chunk)   |  Is this chunk too big?    |
     |     emit .end();|  Is the queue busy?        |
     +-> else          +-------+----------------+---+
     |     emit .write();      |                |
     ^                     +---v---+        +---v---+
     ^---------------------< No    |        |  Yes  |
                           +-------+        +---v---+
                                                |
              emit .pause();   +=================+
              ^----------------+  return false;  <-----+---+
                               +=================+         |
                                                           |
     when queue is empty   +============+                  |
     ^---------------------<  Buffering |                  |
     |                     |============|                  |
     +> emit .drain();     |  ^Buffer^  |                  |
     +> emit .resume();    +------------+                  |
                           |  ^Buffer^  |                  |
                           +------------+  add chunk to queue
                           |            <---^--------------<
                           +============+
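To see the buffering behaviour the docs describe in isolation, here is a small self-contained sketch (my own illustration, not from the docs) using a Writable with a deliberately tiny highWaterMark and a slow consumer:

    'use strict';
    const { Writable } = require('stream');

    const slow = new Writable({
      highWaterMark: 4, // bytes; deliberately tiny so the buffer fills fast
      write(chunk, encoding, callback) {
        setTimeout(callback, 100); // simulate a slow destination
      }
    });

    console.log(slow.write('ab'));   // true  - still under highWaterMark
    console.log(slow.write('cdef')); // false - buffer is now over the mark
    slow.once('drain', () => {
      console.log('drained - safe to write again');
    });

The second write() returns false because the first chunk is still in flight and the queued bytes exceed highWaterMark; 'drain' fires once the queue has emptied.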

Here are some visualizations (from running the script with the V8 heap limited to 512 MB via --max-old-space-size=512).

This visualization shows heap memory usage (red) and the time delta (purple), sampled every 10,000 iterations (the X axis shows i):

    'use strict'
    var fs = require('fs');
    var wstream = fs.createWriteStream('myOutput.txt');
    var latestTime = (new Date()).getTime();
    var currentTime;

    for (var i = 0; i < 10000000000; i++) {
      wstream.write(i + '\n');
      if (i % 10000 === 0) {
        currentTime = (new Date()).getTime();
        console.log([ // Output CSV data for visualisation
          i,
          (currentTime - latestTime) / 5,
          process.memoryUsage().heapUsed / (1024 * 1024)
        ].join(','));
        latestTime = currentTime;
      }
    }

    console.log('End!');
    wstream.end();

Slow - stats

The script runs slower and slower as memory usage approaches the 512 MB limit, until it crashes when the limit is reached.


This visualization uses v8.setFlagsFromString() with --trace_gc to show the current memory usage (red) and execution time (purple) of each garbage collection (the X axis shows total elapsed time in seconds):

    'use strict'
    var fs = require('fs');
    var v8 = require('v8');
    var wstream = fs.createWriteStream('myOutput.txt');

    v8.setFlagsFromString('--trace_gc');

    for (var i = 0; i < 10000000000; i++) {
      wstream.write(i + '\n');
    }

    console.log('End!');
    wstream.end();

Slow - GC

Memory usage reaches about 80% in roughly 4 seconds, at which point the garbage collector gives up on Scavenge and is forced to use Mark-sweep (more than 10 times slower) - see this article for more details.


For comparison, here are the same visualizations for @MikeC's code, which waits for 'drain' when the write buffer is full:

Fast - stats

Fast - GC

+3
