I cannot remove spaces from a string processed by Nokogiri

I cannot remove spaces from a string.

My HTML:

<p class='your-price'> Cena pro Vás: <strong>139&nbsp;<small>Kč</small></strong> </p>

My code is:

    #encoding: utf-8
    require 'rubygems'
    require 'mechanize'

    agent = Mechanize.new
    site = agent.get("http://www.astratex.cz/podlozky-pod-raminka/doplnky")
    price = site.search("//p[@class='your-price']/strong/text()")
    val = price.first.text  => "139 "
    val.strip               => "139 "
    val.gsub(" ", "")       => "139 "

gsub, strip, etc. do not work. Why, and how can I fix it?

    val.class                 => String
    val.dump                  => "\"139\\u{a0}\""      !
    val.encoding              => #<Encoding:UTF-8>
    __ENCODING__              => #<Encoding:UTF-8>
    Encoding.default_external => #<Encoding:UTF-8>

I am using Ruby 1.9.3, so Unicode should not be a problem.

1 answer

strip removes only ASCII whitespace, and the character you have here is a Unicode non-breaking space (U+00A0), as the \u{a0} in your val.dump output shows.

Removing the character is easy. You can use gsub with a regular expression that contains the character's code point: gsub(/\u00a0/, '')

You can also call gsub(/[[:space:]]/, '') to remove all Unicode whitespace. See the documentation for more details.
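To see both approaches side by side without fetching the page, here is a minimal sketch. It assumes the scraped value is "139" followed by a single U+00A0, which is what the val.dump output in the question suggests. The variable names and the last trim-only-the-ends variant are illustrative additions, not part of the original code.

```ruby
# Stand-in for the scraped value: "139" followed by a no-break space (U+00A0).
val = "139\u00a0"

# strip leaves the string unchanged, because U+00A0 is not ASCII whitespace.
unchanged = (val.strip == val)

# Option 1: remove only the no-break space.
without_nbsp = val.gsub(/\u00a0/, '')

# Option 2: remove every whitespace character, including Unicode ones.
# The POSIX bracket class [[:space:]] is Unicode-aware in Ruby regexps.
without_space = val.gsub(/[[:space:]]/, '')

# Illustrative variant: trim Unicode whitespace only at the ends, like strip,
# keeping any whitespace inside the string.
trimmed = " 1 39\u00a0".gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')

puts unchanged              # true
puts without_nbsp.inspect   # "139"
puts without_space.inspect  # "139"
puts trimmed.inspect        # "1 39"
```

If the price is needed as a number, calling to_i on the cleaned string (for example without_nbsp.to_i) should give 139.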
