Find a value by position in plain HTML in Ruby

There are no classes in my HTML file. I am trying to extract a number from plain HTML:

<html>
 <head></head>
  <body>
     PO Number : [4587958]   
  </body>
</html>

I can find the "PO Number" text with:

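require 'nokogiri'

PAGE_URL = "a.html"

# Parse the local HTML file
page = Nokogiri::HTML(File.read(PAGE_URL))

# Text content of <body>
data = page.css("body").text
puts data

# scan with a String argument returns only the literal matches: ["PO Number"]
ponumber = data.scan('PO Number')
puts ponumber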

But I can't get the number itself.

1 answer

You can get the number by scanning the text with a regular expression that matches runs of digits:

page.css('body').text.scan(/\d+/)
# ["4587958"]

page.css('body').text.scan(/\d+/).first.to_i
# 4587958

scan returns an array with all matches. If the document contains several numbers, pick the one you want by its index:

# Example:
#   Invoice Number : [78945824] PO Number : [4587958]

page.css('body').text.scan(/\d+/)
# ["78945824", "4587958"]

page.css('body').text.scan(/\d+/)[1].to_i
# 4587958
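
Picking by index depends on the order of the numbers in the page. As an alternative, a minimal sketch that anchors the match on the "PO Number" label instead, assuming the value always appears as digits in square brackets after the label. The helper name extract_po_number is hypothetical, and the string below stands in for page.css('body').text:

```ruby
# Hypothetical helper (not part of Nokogiri): returns the PO number as an
# Integer, or nil when the label is not found in the text.
def extract_po_number(text)
  # String#[] with a regexp and a capture index returns that capture group,
  # here the digits inside the brackets that follow "PO Number :".
  text[/PO Number\s*:\s*\[(\d+)\]/, 1]&.to_i
end

# Stand-in for page.css('body').text
text = "Invoice Number : [78945824] PO Number : [4587958]"

puts extract_po_number(text)
# 4587958
```

This keeps working if another number is added before the PO number, at the cost of depending on the exact label text.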