A newer way, using ES7 / promises
Usually when you're scraping, you want some method to
- Get a resource from a web server (usually an HTML document)
- Read that resource and work with it either as
  - a DOM / tree structure that you can navigate, or
  - a token stream, parsed with something like SAX.
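To make the difference between the two styles concrete, here is a toy sketch in plain JavaScript. The `tokenize` and `buildTree` helpers and the sample `html` string are made up for illustration; they are not part of any library, and the regex handles only bare tags and text, so this is not a real HTML parser.

```javascript
// Hypothetical toy input. Attributes, comments, entities and void elements
// are all ignored here; real HTML needs a real parser.
const html = '<ul><li>one</li><li>two</li></ul>';

// Token (SAX-style) parsing: walk the input once and emit events in order.
function tokenize(input, onEvent) {
  const pattern = /<\/(\w+)>|<(\w+)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(input)) !== null) {
    if (match[1]) onEvent({ type: 'close', name: match[1] });
    else if (match[2]) onEvent({ type: 'open', name: match[2] });
    else onEvent({ type: 'text', value: match[3] });
  }
}

// Tree (DOM-style) parsing: build a nested structure first, query it later.
function buildTree(input) {
  const root = { name: '#root', children: [] };
  const stack = [root];
  tokenize(input, (event) => {
    const parent = stack[stack.length - 1];
    if (event.type === 'open') {
      const node = { name: event.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (event.type === 'close') {
      stack.pop();
    } else {
      parent.children.push({ text: event.value });
    }
  });
  return root;
}

// Token style: the caller tracks state by hand while events stream past.
const itemsFromTokens = [];
let insideItem = false;
tokenize(html, (event) => {
  if (event.type === 'open' && event.name === 'li') insideItem = true;
  else if (event.type === 'close' && event.name === 'li') insideItem = false;
  else if (event.type === 'text' && insideItem) itemsFromTokens.push(event.value);
});

// Tree style: navigate the finished structure.
const list = buildTree(html).children[0];
const itemsFromTree = list.children.map((li) => li.children[0].text);

console.log(itemsFromTokens, itemsFromTree);
```

Both approaches collect the same two list items. The token version never holds the whole document in memory but needs the `insideItem` flag; the tree version pays for building the structure and then gets simple navigation.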
Both tree and token parsing have advantages, but tree parsing is usually much simpler, so that is what we will do here. Check out request-promise; here is how it works:
    const rp = require('request-promise');
    const cheerio = require('cheerio'); // Basically jQuery for node.js

    const options = {
        uri: 'http://www.google.com',
        transform: function (body) {
            return cheerio.load(body);
        }
    };

    rp(options)
        .then(function ($) {
            // Process html like you would with jQuery...
        })
        .catch(function (err) {
            // Crawling failed or Cheerio choked...
        });
This uses cheerio, which is essentially a lightweight, server-side, jQuery-esque library (it doesn't need a window object or jsdom).
Since you are using promises, you can also write this in an async function. The code looks synchronous, but it still runs asynchronously, using the async/await syntax usually associated with ES7 (it was standardized in ES2017):
    async function parseDocument() {
        let $;
        try {
            $ = await rp(options);
        } catch (err) {
            console.error(err);
            return; // nothing to parse if the request failed
        }
        console.log( $('title').text() );
    }
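If you want to convince yourself that the two styles behave the same, here is a minimal sketch that runs without a network connection. `fakeRequest` is a hypothetical stand-in for `rp(options)`, and the title string is invented; only the control flow is meant to carry over.

```javascript
// Hypothetical stand-in for rp(options): resolves with a page title, or
// rejects, so the control flow can be exercised without a network call.
function fakeRequest(shouldFail) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (shouldFail) reject(new Error('request failed'));
      else resolve('Example title');
    }, 10);
  });
}

// Promise-chain style: success and failure are handled in callbacks.
function titleWithThen(shouldFail) {
  return fakeRequest(shouldFail)
    .then((title) => title.toUpperCase())
    .catch((err) => 'error: ' + err.message);
}

// async/await style: same behaviour, written with ordinary try/catch.
async function titleWithAwait(shouldFail) {
  try {
    const title = await fakeRequest(shouldFail);
    return title.toUpperCase();
  } catch (err) {
    return 'error: ' + err.message;
  }
}

async function main() {
  console.log(await titleWithThen(false));
  console.log(await titleWithAwait(false));
  console.log(await titleWithThen(true));
  console.log(await titleWithAwait(true));
}

main();
```

Note that calling `titleWithAwait()` still returns a promise right away; `await` only pauses the async function it appears in, not the rest of the program.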
Evan Carroll, May 31 '16 at 2:17