Scraping a web page with Node.js using an authentication cookie

I have recently been trying to scrape information from a website (kicktipp) using Node.js, the request module and cheerio. Since this site requires authentication to view most of its pages, I tried to log in with a POST request and then check whether the user was logged in, using the following code (I replaced the credentials with dummy data, but I use real data in my actual script):

    var request = require('request');
    var cheerio = require('cheerio');

    // One cookie jar shared by all requests
    var jar = request.jar();
    request = request.defaults({ jar: jar, followAllRedirects: true });

    // Submit the login form
    request.post({
      url: 'http://www.kicktipp.de/info/profil/loginaction',
      headers: { 'content-type': 'application/x-www-form-urlencoded' },
      method: 'post',
      jar: jar,
      body: 'kennung=test@example.com&passwort=1234567890&_charset_=UTF-8&submitbutton=Anmelden'
    }, function (err, res, body) {
      if (err) { return console.error(err); }

      // Request the start page to check whether the login worked
      request.get({
        url: 'http://www.kicktipp.de/',
        method: 'get',
        jar: jar
      }, function (err, res, body) {
        if (err) { return console.error(err); }

        var $ = cheerio.load(body);
        var text = $('.dropdownbox > li > a').text();
        console.log(text);
        var error = $('#kicktipp-content > div.messagebox.errors > p').text();
        console.log(error);
        var cookies = jar.getCookies('http://www.kicktipp.de/');
        console.log(cookies);
      });
    });

The parameters submitted by the HTML form (as verified in the browser) look like this:

    kennung=test@example.com&passwort=1234567890&_charset_=UTF-8&submitbutton=Anmelden
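Hand-written form bodies are easy to get subtly wrong (stray spaces, unescaped characters). As a minimal sketch, the body can be built from an object with Node's built-in `URLSearchParams`; the helper name `buildLoginBody` is hypothetical, and the field names are the ones from the form above with dummy values.

```javascript
// Build an application/x-www-form-urlencoded body from the form fields.
// buildLoginBody is a hypothetical helper name; the values are dummy data.
function buildLoginBody(fields) {
  // URLSearchParams percent-encodes reserved characters such as '@'
  return new URLSearchParams(fields).toString();
}

var body = buildLoginBody({
  kennung: 'test@example.com',
  passwort: '1234567890',
  _charset_: 'UTF-8',
  submitbutton: 'Anmelden'
});

console.log(body);
```

Note that the `@` in the address is sent as `%40` once encoded. The request module's `form` option performs the same encoding when given a plain object, and it sets the content-type header as well.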

With this script, my cookie jar looks like this:

 [ Cookie="JSESSIONID=F650D7F5CD6AF4F6B0944B2190EE2D29.kt213; Path=/; hostOnly=true; aAge=1ms; cAge=179ms" ] 

The JSESSIONID cookie is saved successfully, but the server does not treat me as logged in: console.log(text) prints Login, whereas it should print Logout if the user is signed in correctly.

After inspecting the login request in the browser, I found that the browser receives a new cookie, sent via Set-Cookie in the response header, every time a page on this domain is requested:

 Set-Cookie: login=bS5zcGxpZXRob2V2ZXJAZ21haWwuY29tOjE0NzU0MDA3MjAxMjA6Mzg1NTI4OGY3ODgzN2FkMzllNTA0NWNkY2ZjMjBjZGM; Domain=.kicktipp.de; Expires=Sun, 02-Oct-2016 09:32:00 GMT; Path=/; HttpOnly 
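To see which cookies a response actually sets, its Set-Cookie headers can be inspected directly; in Node, `res.headers['set-cookie']` is an array of such strings. The following is a simplified sketch of splitting one header value into its parts. The function name `parseSetCookie` is hypothetical and the cookie value is a placeholder; a real cookie jar handles many more edge cases.

```javascript
// Split one Set-Cookie header value into name, value, and attributes.
// Simplified sketch for debugging only; parseSetCookie is a hypothetical helper.
function parseSetCookie(header) {
  var parts = header.split(';').map(function (p) { return p.trim(); });
  var first = parts.shift();
  var eq = first.indexOf('=');
  var cookie = {
    name: first.slice(0, eq),
    value: first.slice(eq + 1),
    attributes: {}
  };
  parts.forEach(function (part) {
    if (!part) { return; }
    var i = part.indexOf('=');
    // Attribute names are case-insensitive, so normalize them to lower case
    var key = (i === -1 ? part : part.slice(0, i)).toLowerCase();
    // Flag attributes such as HttpOnly have no value
    cookie.attributes[key] = i === -1 ? true : part.slice(i + 1);
  });
  return cookie;
}

// Placeholder value; the attributes mirror the header shown above
var example = parseSetCookie('login=abc123; Domain=.kicktipp.de; Path=/; HttpOnly');
console.log(example.name, example.attributes.domain, example.attributes.httponly);
```

Logging the parsed headers of each response makes it easier to tell which request is expected to deliver which cookie.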

However, I can't get this cookie into my cookie jar (or just don't know how), so I can't visit the page as a logged-in user.

Is there something I'm missing here in order to stay logged in (or even to log in at all)? Thanks in advance.

javascript cookies request cheerio
1 answer

The problem is that this page requires a specific cookie that you get when you first visit the page (in this case, it looks like a time zone cookie). To receive this cookie, you just need to visit the page (with a GET request) before sending the login (POST) request to the server. In this case, it is as simple as wrapping the code above in another GET request:

    var loginLink = 'http://www.kicktipp.de/info/profil/login';

    // creating a clean jar
    var j = request.jar();

    request.get({ url: loginLink, jar: j }, function (err, httpResponse, html) {
      // place POST request and rest of the code here
    });
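Put together, the whole flow might look like the sketch below: GET the login page, POST the credentials, then GET the target page, all with the same jar. The function name `loginAndFetch` and its signature are assumptions made for illustration. The request module is passed in as a parameter so the sequence can be exercised with a stub, and the `form` option is used in place of a hand-built body.

```javascript
// Sketch of the three-step flow using one shared cookie jar.
// loginAndFetch is a hypothetical helper; `request` is the request module
// (or a stub exposing jar/get/post with the same call shapes).
function loginAndFetch(request, credentials, callback) {
  var base = 'http://www.kicktipp.de';
  var jar = request.jar();

  // 1. Visit the login page first so the server can set its initial cookies
  request.get({ url: base + '/info/profil/login', jar: jar }, function (err) {
    if (err) { return callback(err); }

    // 2. Submit the login form with the same jar
    request.post({
      url: base + '/info/profil/loginaction',
      jar: jar,
      followAllRedirects: true,
      form: {
        kennung: credentials.kennung,
        passwort: credentials.passwort,
        _charset_: 'UTF-8',
        submitbutton: 'Anmelden'
      }
    }, function (err) {
      if (err) { return callback(err); }

      // 3. Fetch the page that requires authentication
      request.get({ url: base + '/', jar: jar }, function (err, res, body) {
        if (err) { return callback(err); }
        callback(null, body, jar);
      });
    });
  });
}

// Usage, assuming the request module is installed:
// loginAndFetch(require('request'),
//   { kennung: 'test@example.com', passwort: '1234567890' },
//   function (err, body) { if (err) { return console.error(err); } console.log(body.length); });
```

The returned body can then be loaded into cheerio as in the question to check whether the menu shows Logout.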