Node.js: Best way to do some asynchronous operations and then do something else?

In the following code, I try to make several (about 10) HTTP requests and parse the resulting RSS feeds at the same time.

I use the standard forEach construct on the array of URIs I need to fetch and parse.

The code:

    var articles = [];

    feedsToFetch.forEach(function (feedUri) {
      feed(feedUri, function (err, feedArticles) {
        if (err) {
          throw err;
        } else {
          articles = articles.concat(feedArticles);
        }
      });
    });

    // Code I want to run once all feedUris have been visited

I understand that to run something once all the feeds have been handled, I have to use callbacks. However, the only way I can see to do that in this example is to have the callback count how many times it has been called and only continue once it has been called feedsToFetch.length times, which feels hacky.
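To make that concrete, the counting approach I mean would look roughly like this (a sketch using the same feed function):

    var articles = [];
    var completed = 0;

    feedsToFetch.forEach(function (feedUri) {
      feed(feedUri, function (err, feedArticles) {
        if (err) throw err;
        articles = articles.concat(feedArticles);

        // count finished requests and only continue once all have reported back
        completed++;
        if (completed === feedsToFetch.length) {
          // code I want to run once all feedUris have been visited
        }
      });
    });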

So my question is: what is the best way to handle this kind of situation in Node.js?

Preferably without any blocking! (I still want it to be fast.) Is this a job for promises, or something else?

Thanks, Danny

3 answers


Promises are slated for inclusion in the next version of JavaScript.

Popular Promise libraries provide a .all() method for this exact use case (waiting for a set of asynchronous calls to complete, then doing something else). It's a perfect match for your scenario.
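For instance, with native promises (or any library's .all()), the shape is roughly this; promisedFeed is a hypothetical promise-returning wrapper around the feed function from the question:

    // map each URI to a promise of its parsed articles
    var promises = feedsToFetch.map(function (feedUri) {
      return promisedFeed(feedUri);
    });

    Promise.all(promises).then(function (results) {
      // 'results' is an array of article arrays, in the same order as feedsToFetch
      var articles = [].concat.apply([], results);
      // do something with all the articles...
    }).catch(function (err) {
      // the first error from any feed lands here
    });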

Bluebird also has .map(), which can take an array of values and use it to start a Promise chain.

Here is an example using Bluebird's .map():

    var Promise = require('bluebird');
    var request = Promise.promisifyAll(require('request'));

    function processAllFeeds(feedsToFetch) {
      return Promise.map(feedsToFetch, function (feed) {
        // I renamed your 'feed' fn to 'processFeed'
        return processFeed(feed);
      })
      .then(function (articles) {
        // 'articles' is now an array with the results of all 'processFeed' calls
        // do something with all the results...
      })
      .catch(function (e) {
        // feed server was down, etc.
      });
    }

    function processFeed(feed) {
      // use the promisified version of 'get'
      return request.getAsync(feed.url)...
    }
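Usage would then be something like this (a sketch; the final .then fires once every feed has been processed):

    processAllFeeds(feedsToFetch).then(function () {
      console.log("all feeds processed");
    });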

Note that you do not need a closure to accumulate the results.

The Bluebird API docs are really well written, with lots of examples, which makes them easier to pick up.

Once I learned the Promise pattern, it made life a lot easier. I can't recommend it enough.

In addition, there is an excellent article on the various approaches to working with asynchronous functions using promises, the async module, and others.

Hope this helps!


No hacks needed

I would recommend using the async module, as it simplifies these kinds of tasks.

async provides async.eachSeries as an asynchronous replacement for arr.forEach and lets you pass a done callback for when everything has finished. It processes each item in series, just as forEach does. It also conveniently bubbles errors up to your callback, so you don't need error-handling logic inside the loop. If you need or want parallel processing, you can use async.each instead.

There will be no blocking between the async.eachSeries call and the callback.

    async.eachSeries(feedsToFetch, function (feedUri, done) {
      // call your async function
      feed(feedUri, function (err, feedArticles) {
        // if there's an error, "bubble" it up to the callback
        if (err) return done(err);

        // your operation here
        articles = articles.concat(feedArticles);

        // this task is done
        done();
      });
    }, function (err) {
      // errors generated in the loop above are accessible here
      if (err) throw err;

      // we're all done!
      console.log("all done!");
    });

Alternatively, you can build an array of asynchronous tasks and pass them to async.series. async.series will process your tasks in series (not in parallel) and call the callback when every function has finished. The only reason to use this over async.eachSeries is if you prefer the familiar arr.forEach-style approach of building up a task array.

    // create an array of async tasks
    var tasks = [];

    feedsToFetch.forEach(function (feedUri) {
      // add each task to the task array
      tasks.push(function (done) {
        // your operations
        feed(feedUri, function (err, feedArticles) {
          // pass errors to the task's callback instead of throwing
          if (err) return done(err);
          articles = articles.concat(feedArticles);
          done(); // signal that this task is complete
        });
      });
    });

    // call async.series with the task array and callback
    async.series(tasks, function (err) {
      if (err) throw err;
      console.log("done!");
    });

Or you can Roll Your Own ™

Maybe you're feeling ambitious, or maybe you don't want to rely on async as a dependency. Maybe you're just bored, like me. In any case, I deliberately copied the async.eachSeries API to make it easy to understand how this works.

Once we strip the comments out, it's just 9 lines of code that can be reused for any array we want to process asynchronously! It doesn't modify the original array, errors can be passed along to "short-circuit" the iteration, and a separate callback can be used. It also works on empty arrays. Quite a lot of functionality in just 9 lines :)

    // void asyncForEach(Array arr, Function iterator, Function callback)
    //   * iterator(item, done) - done can be called with an err to short-circuit to callback
    //   * callback(err) - receives an error if an iterator sent one
    function asyncForEach(arr, iterator, callback) {
      // create a cloned queue of arr
      var queue = arr.slice(0);

      // create a recursive iterator
      function next(err) {
        // if there's an error, bubble it up to the callback
        if (err) return callback(err);

        // if the queue is empty, call the callback with no error
        if (queue.length === 0) return callback(null);

        // call the iterator with the next task
        // we pass `next` here so the task can tell us when to move on
        iterator(queue.shift(), next);
      }

      // start the loop
      next();
    }

Now let's create an example asynchronous function to use with it. We'll fake a 500 ms delay with setTimeout.

    // void sampleAsync(String uri, Function done)
    //   * done receives a message string after 500 ms
    function sampleAsync(uri, done) {
      // fake a delay of 500 ms
      setTimeout(function () {
        // our operation
        // <= "foo"
        // => "async foo !"
        var message = ["async", uri, "!"].join(" ");

        // call done with our result
        done(message);
      }, 500);
    }

Ok, let's see how they work!

    var tasks = ["cat", "hat", "wat"];

    asyncForEach(tasks, function (uri, done) {
      sampleAsync(uri, function (message) {
        console.log(message);
        done();
      });
    }, function () {
      console.log("done");
    });

Output (with a 500 ms delay before each line):

    async cat !
    async hat !
    async wat !
    done
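The "short-circuit" behavior mentioned above works by passing an error to done; a quick sketch:

    asyncForEach(["good", "bad", "never reached"], function (uri, done) {
      // pretend the second item fails
      if (uri === "bad") return done(new Error("boom"));
      console.log(uri);
      done();
    }, function (err) {
      if (err) return console.log("stopped early:", err.message);
      console.log("done");
    });

    // => good
    // => stopped early: boom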

Using a copy of the URL list as a queue to track arrivals keeps it simple (all changes are commented):

    var q = feedsToFetch.slice(); // duplicate the list so we can cross off each URL as it arrives (to track progress)

    feedsToFetch.forEach(function (feedUri) {
      feed(feedUri, function (err, feedArticles) {
        if (err) {
          throw err;
        } else {
          articles = articles.concat(feedArticles);
        }

        q.splice(q.indexOf(feedUri), 1); // remove this url from the list
        if (!q.length) done();           // if all urls have been removed, fire the waiting code
      });
    });

    function done() {
      // Code I want to run once all feedUris have been visited
    }

In the end, this is no "messier" than promises, and it gives you the ability to retry failed URLs (a counter by itself won't tell you which ones failed). For this simple parallel-download task, Promises would actually add more code to your implementation than a simple queue does, and Promise.all() is not the most intuitive thing to stumble upon. Once you get into sub-queries or want more robust error handling, I highly recommend using Promises, but you don't need a booster rocket to kill a squirrel...
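As a sketch of that retry idea (illustrative only; feed, articles, and the final done are as above), you can collect failures instead of throwing and re-fetch them at the end:

    var q = feedsToFetch.slice();
    var failed = []; // URLs whose fetch errored

    feedsToFetch.forEach(function (feedUri) {
      feed(feedUri, function (err, feedArticles) {
        if (err) {
          failed.push(feedUri); // remember the failure instead of throwing
        } else {
          articles = articles.concat(feedArticles);
        }
        q.splice(q.indexOf(feedUri), 1);
        if (!q.length) done();
      });
    });

    function done() {
      if (failed.length) {
        // e.g. feed each entry of 'failed' again before running the final code
        console.log("retry these:", failed);
      }
      // Code I want to run once all feedUris have been visited
    }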

