Scraping a password-protected forum in R

I am having trouble logging in from my script. Despite all the good answers I found on Stack Overflow, none of the solutions worked for me.

I am scraping a web forum for my PhD research; its URL is http://forum.axishistory.com .

The page I want to scrape is the member list - a page that lists links to all member profiles. The member list is only accessible when you are logged in. If you try to access it without logging in, it shows you the login form instead.

Member list URL: http://forum.axishistory.com/memberlist.php .

I tried the httr package:

    library(httr)
    library(rvest)

    # Attempt HTTP basic authentication (this does not submit the forum's login form)
    members <- GET("http://forum.axishistory.com/memberlist.php",
                   authenticate("username", "password"))
    members_html <- read_html(members)

The output is just the login form.

Then I tried RCurl:

    library(RCurl)
    library(XML)  # provides htmlParse()

    # Attempt to pass credentials via RCurl (again basic auth, not the form login)
    members_html <- htmlParse(getURL("http://forum.axishistory.com/memberlist.php",
                                     userpwd = "username:password"))
    members_html

The result is the login form - again.

Then I tried the handle() approach from this post - Scrape password protected website in R :

    library(httr)

    handle <- handle("http://forum.axishistory.com/")
    path   <- "ucp.php?mode=login"

    login <- list(
      amember_login = "username",
      amember_pass  = "password",
      amember_redirect_url = "http://forum.axishistory.com/memberlist.php"
    )

    response <- POST(handle = handle, path = path, body = login)

And again - the result is the login form.

The next thing I plan to try is RSelenium, but after all these attempts I'm trying to figure out whether I'm missing something (maybe something completely obvious).

I looked at other relevant posts here, but couldn't figure out how to apply their code to my case:

How to use R to download a zipped file from an SSL page that requires cookies

Scrape password protected website in R


https://stackoverflow.com/questions/27485311/scrape-password-protected-https-website-in-r

Password protected website using R

r web-scraping rcurl httr rselenium
1 answer

Thanks to Simon, I found the answer here: Using rvest or httr to log in to non-standard forms on a web page

    library(rvest)

    url <- "http://forum.axishistory.com/memberlist.php"

    # Start a session and grab the login form (the second form on the page)
    pgsession <- html_session(url)
    pgform    <- html_form(pgsession)[[2]]

    # Fill in the phpBB login fields and submit
    filled_form <- set_values(pgform,
                              "username" = "username",
                              "password" = "password")
    submit_form(pgsession, filled_form)

    # The session is now authenticated; fetch the member list
    memberlist <- jump_to(pgsession, "http://forum.axishistory.com/memberlist.php")
    page <- read_html(memberlist)

    # Extract the usernames
    usernames <- html_nodes(x = page, css = "#memberlist .username")
    data_usernames <- html_text(usernames, trim = TRUE)
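Once the session is authenticated, the same pattern extends to the rest of the paginated member list. A minimal sketch, assuming `pgsession` is the logged-in session from above and that this phpBB forum paginates memberlist.php with the usual `start` offset parameter (the page size of 25 is an assumption - check the forum's actual setting):

```r
library(rvest)

# Assumes `pgsession` is already logged in (see the answer above).
# phpBB typically paginates memberlist.php via a `start` offset;
# the page size (25) is an assumption - adjust to the forum's value.
page_size     <- 25
offset        <- 0
all_usernames <- character(0)

repeat {
  url   <- paste0("http://forum.axishistory.com/memberlist.php?start=", offset)
  page  <- read_html(jump_to(pgsession, url))
  found <- html_text(html_nodes(page, "#memberlist .username"), trim = TRUE)

  if (length(found) == 0) break  # past the last page

  all_usernames <- c(all_usernames, found)
  offset <- offset + page_size
  Sys.sleep(1)  # be polite to the server
}
```

This stops as soon as a page returns no usernames, so an incorrect page-size guess only costs extra requests, not missed profiles.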
