How to properly set XPaths in Import.io to scrape Craigslist

Question

I am trying to set up an extractor in Import.io, and I am having trouble getting it to publish an API. Every time, it tells me that it cannot publish the API and suggests that I try using XPaths. After some further research, I found that the title links on the Craigslist page are contained in a span tag, which the following XPath matches:

span[@class='pl']

I tried entering the following in Import.io's XPath box for the field:

//span[@class='pl']

but to no avail. No matter what I try, I cannot get the API to publish. I can export the data to a dataset, but I would really like to get a published API.

Has anyone succeeded in using Import.io to do a small scrape of Craigslist? If so, what were the steps for publishing the API correctly?

As a side note, I have read several articles about Scrapy, but I know nothing about Python or how to install and run it, even though I found a specific piece of code that relates directly to this issue. Does anyone have an idea how I can get Import.io to publish the API?

+4

xpath web scraping scrapy import.io

Mrtechie May 30 '15 at 8:57


1 answer

Mrtechie · Accepted Answer · 2015-05-30T20:25:02+0000

For anyone looking for an answer to this question: the way to get the correct XPath for scraping the titles on Craigslist with Import.io is to set the extended XPath override to the following:

.//span[@class='pl']/.
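To sanity-check an expression like this outside Import.io, a minimal Python sketch along these lines may help. It uses only the standard library's xml.etree.ElementTree, which supports a limited XPath subset. The sample markup, the hrefs, and the function name extract_titles are made up for illustration; the real Craigslist page may be structured differently. In XPath, the trailing /. selects the same nodes as the expression before it, so the sketch leaves it off.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, loosely modelled on a listing page.
# The real page structure is an assumption and may differ.
SAMPLE = """
<div class="content">
  <p class="row">
    <span class="pl"><a href="/apa/1.html">Sunny 2BR apartment</a></span>
    <span class="price">$1200</span>
  </p>
  <p class="row">
    <span class="pl"><a href="/apa/2.html">Studio near downtown</a></span>
    <span class="price">$800</span>
  </p>
</div>
"""


def extract_titles(markup):
    """Return (title, href) pairs for every span with class 'pl'."""
    root = ET.fromstring(markup)
    results = []
    # Relative XPath: search from the current node down for matching spans.
    for span in root.findall(".//span[@class='pl']"):
        link = span.find("a")
        if link is not None:
            results.append(((link.text or "").strip(), link.get("href")))
    return results


titles = extract_titles(SAMPLE)
for title, href in titles:
    print(title, href)
```

Note that ElementTree needs well-formed XML, so a real HTML page would probably have to go through a more forgiving parser first.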

Now my problem is the 403 errors returned by Craigslist, which mean I have been blocked.
