How to use substring() with import.io?

I am having problems with XPath and import.io and I hope you can help me. :)

HTML code:

<a href="page.php?var=12345"> 

At the moment, I am able to extract the contents of href (page.php?var=12345) with this:

 ./td[3]/a[1]/@href 

However, I would like to collect only: 12345

substring() may be a solution, but it does not work on import.io when I use it like this:

 substring(./td[3]/a[1]/@href,13) 

Any ideas on what the problem is?

Thank you very much in advance!

+5
2 answers

Try using this for the XPath (set the field type to text):

 .//*[@class='oeil']/a/@href 

Then use this for your regular expression:

 ([^=]*)$ 

This will give you the ISBN number you are looking for.
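To check what that regular expression captures outside import.io, here is a minimal Python sketch. It assumes the href value from the question; the names `TAIL_AFTER_EQUALS` and `value_after_last_equals` are made up for illustration, and import.io's own regex engine may differ in details.

```python
import re

# The href value shown in the question.
href = "page.php?var=12345"

# The pattern suggested above: capture everything after the last "=".
TAIL_AFTER_EQUALS = re.compile(r"([^=]*)$")


def value_after_last_equals(text: str) -> str:
    """Return the run of non-"=" characters at the end of text."""
    # The pattern can match an empty string at the end,
    # so search() always finds a match here.
    match = TAIL_AFTER_EQUALS.search(text)
    return match.group(1)


print(value_after_last_equals(href))  # prints 12345
```

Because `[^=]*` cannot cross an `=`, the match starts right after the last `=` in the string.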

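import.io only supports XPath functions when they return a list of nodes. substring() returns a string rather than a node list, which would explain why your expression fails there.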

+7

Your path expression is fine, but perhaps it should be

 substring(./td[3]/a[1]/@href,14) 
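The reason for 14 rather than 13 is that XPath's substring() counts characters from 1, and the "=" in page.php?var=12345 is the 13th character. A small Python sketch of that arithmetic follows; `xpath_substring` is a hypothetical helper that only emulates the two-argument form with an integer start, not the full XPath 1.0 rounding rules.

```python
def xpath_substring(text: str, start: int) -> str:
    """Emulate XPath substring(text, start) for an integer start.

    XPath positions are 1-based, so position N corresponds to
    Python index N - 1. A start below 1 behaves like a start of 1.
    """
    return text[max(start, 1) - 1:]


href = "page.php?var=12345"

print(xpath_substring(href, 13))  # prints =12345
print(xpath_substring(href, 14))  # prints 12345
```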

“It doesn't seem to work” is not a very clear description of what is wrong. Are you getting error messages? Is the output wrong? Do you have code surrounding the path expression that you could show?


You can use substring(), but substring-after() would be even better.

 substring-after(/a/@href,'=') 

Assuming the input is the tiny fragment you showed:

 <a href="page.php?var=12345"/> 

this will select

 12345 

and, given the structure of your input, use:

 substring-after(./td[3]/a[1]/@href,'=') 
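For trying the same idea outside import.io, here is a rough Python sketch using only the standard library. ElementTree's limited XPath support has no string functions, so the element is located with the path and `str.partition` stands in for substring-after(). The table row markup and the name `substring_after` are assumptions for illustration; only the `<a>` element comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical row: only the <a> element is taken from the question.
ROW_MARKUP = """
<tr>
  <td>first</td>
  <td>second</td>
  <td><a href="page.php?var=12345">link</a></td>
</tr>
"""


def substring_after(text: str, separator: str) -> str:
    """Mimic XPath substring-after(): the part after the first separator,
    or an empty string when the separator does not occur."""
    return text.partition(separator)[2]


row = ET.fromstring(ROW_MARKUP)

# Same location path as above, minus the @href step,
# which is read with .get() instead.
link = row.find("./td[3]/a[1]")
print(substring_after(link.get("href"), "="))  # prints 12345
```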

The leading . in the path expression selects only the immediate td child nodes of the current context node. I hope you know what you are doing.

+1
