How to select a specific css node by id

I am trying to use the rvest package to scrape data from a web page. In simplified form, the HTML looks like this:

<div class="style"> <input id="a" value="123"> <input id="b"> </div> 

I want to get the value 123 from the first input. I tried the following R code:

    library(rvest)
    url <- "xxx"
    output <- html_nodes(url, ".style input")

This will return a list of input tags:

    [[1]]
    <input id="a" value="123">

    [[2]]
    <input id="b">

Next, I tried using html_node to refer to the first input tag by id:

 html_node(output, "#a") 

This returned a list of NULLs instead of the input tag I wanted:

    [[1]]
    NULL

    [[2]]
    NULL

My question is: how can I refer to an input tag using its id?

2 answers

You can use xpath:

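    library(rvest)

    text <- '<div class="style"> <input id="a" value="123"> <input id="b"> </div>'
    h <- read_html(text)

    # Select the element whose id is "a" and read its value attribute.
    # html_attr() is rvest's own accessor, so xml2 does not need to be attached.
    h %>%
      html_nodes(xpath = '//*[@id="a"]') %>%
      html_attr("value")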

The easiest way to find a CSS or XPath selector is to use http://selectorgadget.com/ . For a specific element like yours, you can also get the XPath from Chrome Developer Tools: right-click the element in the Elements panel and choose Copy > Copy XPath.
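If you need to do this for several ids, the XPath can be built from the id instead of being copied by hand each time. The following is only a sketch: `get_value_by_id` is a hypothetical helper name, not part of rvest, and it assumes the id contains no double quotes.

```r
library(rvest)

# Hypothetical helper: return the value attribute of the element with the given id
get_value_by_id <- function(doc, id) {
  xpath <- sprintf('//*[@id="%s"]', id)  # assumes id has no double quotes
  html_attr(html_node(doc, xpath = xpath), "value")
}

h <- read_html('<div class="style"> <input id="a" value="123"> <input id="b"> </div>')

value_a <- get_value_by_id(h, "a")
value_b <- get_value_by_id(h, "b")  # this input has no value attribute
```

An element that exists but has no `value` attribute comes back as `NA` rather than raising an error, so the result is worth checking with `is.na()`.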


This also works with plain CSS selectors:

    library(rvest)

    doc <- '<div class="style"> <input id="a" value="123"> <input id="b"> </div>'
    pg <- read_html(doc)  # html() is deprecated in favour of read_html()

    # First input that is a direct child of a div
    html_attr(html_nodes(pg, "div > input:first-of-type"), "value")
    ## [1] "123"
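
Since the question asks about the id specifically, the `#a` selector can also be applied to the parsed document rather than to the inputs that were already selected. The original attempt most likely came back empty because `html_node(output, "#a")` looks for `#a` among the descendants of each selected input, and an input has no children. A minimal sketch, reusing the sample HTML from the question (the variable names here are illustrative):

```r
library(rvest)

pg <- read_html('<div class="style"> <input id="a" value="123"> <input id="b"> </div>')

# Select by id from the whole document
value_by_id <- html_attr(html_node(pg, "#a"), "value")

# Or keep the class scope from the question and add the id to it
value_scoped <- html_attr(html_node(pg, ".style input#a"), "value")
```

Both calls should give the same value here, since ids are meant to be unique within a page.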