I am trying to get data from the Bing Search API, and since the existing libraries seem to be based on older, discontinued APIs, I decided to use the request library, which appears to be the most popular choice for this. My code looks like:
    var request = require('request');

    var SKEY = "myKey....",
        ServiceRootURL = 'https://api.datamarket.azure.com/Bing/Search/v1/Composite';

    function getBingData(query, top, skip, cb) {
      var params = {
        Sources: "'web'",
        Query: "'" + query + "'",
        '$format': 'JSON',
        '$top': top,
        '$skip': skip
      };
      request.get({ url: ServiceRootURL, qs: params }, cb)
        .auth(SKEY, SKEY, false);
    }

    getBingData("bookline.hu", 50, 0, someCallbackWhichParsesTheBody);
Bing returns JSON, and sometimes I can work with it, but if the response body contains a large number of non-ASCII characters, JSON.parse fails, complaining that the string is malformed. I tried switching to the ATOM content type, but it made no difference: the XML was invalid too. Inspecting the response body available in the request() callback indeed shows mangled characters.
So, I tried the same query with some Python code, and that works fine every time. For reference:
    import requests
    from requests.auth import HTTPBasicAuth

    r = requests.get(
        'https://api.datamarket.azure.com/Bing/Search/v1/Composite?Sources=%27web%27&Query=%27sexy%20cosplay%20girls%27&$format=json',
        auth=HTTPBasicAuth(SKEY, SKEY))
    stuffWithResponse(r.json())
I cannot reproduce the problem with smaller responses (for example, by limiting the number of results), and I could not isolate a single offending result (by increasing the offset). My hunch is that the response is read in chunks, transcoded somehow, and reassembled, so the JSON/ATOM data becomes invalid whenever a multibyte character gets split across a chunk boundary; that would happen with large responses but not with small ones.
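To illustrate the kind of corruption I suspect, here is a small standalone snippet (not my actual code, just a demonstration of the chunk-splitting theory) showing that decoding UTF-8 buffers independently mangles a multibyte character that straddles the boundary:

```javascript
// 'é' is two bytes in UTF-8: 0xC3 0xA9
var whole = Buffer.from('héllo', 'utf8');
var a = whole.slice(0, 2);   // ends mid-character: 'h' + 0xC3
var b = whole.slice(2);      // starts with the orphaned 0xA9

// Decoding each chunk on its own produces replacement characters:
var naive = a.toString('utf8') + b.toString('utf8');
console.log(naive === 'héllo');  // false

// Concatenating the raw bytes first decodes correctly:
var safe = Buffer.concat([a, b]).toString('utf8');
console.log(safe === 'héllo');   // true
```

If request (or my use of it) does the equivalent of the "naive" branch somewhere, that would explain why only large responses break.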
Being a newbie to node, I'm not sure whether I need to do anything special (set the encoding somewhere, perhaps? Bing returns UTF-8, so that doesn't seem necessary).
Does anyone have an idea of what is going on?
FWIW, I'm on OS X 10.8, with node v0.8.20 installed via MacPorts and request v2.14.0 installed via npm.