UTF-8 auto encoding in Node.js HTTP client

Hi there, I am trying to download XML content from a remote host using Node.js.

The problem is that German umlauts, such as ä, come out broken. As in a browser, this is usually a simple encoding problem. But because the XML content on the remote host is encoded in iso-8859-2, I have not managed to get the characters right.

The functionality is very simple. I just use the default HTTP client built into Node.js to connect to the remote host with a simple request.

Some environmental facts:

  • The remote system uses iso-8859-2 encoding.
  • The encoding is set correctly in the response header.
  • The characters are still broken in the data (chunk) received by response.onData(chunk).
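Since the encoding is announced in the response header, the client can read it from there instead of hard-coding it. A minimal sketch, assuming the header value looks like `text/xml; charset=iso-8859-2`; the helper name `charsetFromContentType` and the UTF-8 fallback are illustrative choices, not part of Node.js:

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
function charsetFromContentType(contentType, fallback = 'utf-8') {
  if (!contentType) return fallback;
  // Matches charset=iso-8859-2 as well as charset="iso-8859-2".
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
  return match ? match[1].toLowerCase() : fallback;
}

console.log(charsetFromContentType('text/xml; charset=ISO-8859-2')); // iso-8859-2
```

Inside a response callback the value would come from `response.headers['content-type']`.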

Node.js is version 0.2, running on a default Debian server.

The code is based on the default HTTPClient, as described in the Node.js documentation.

I tried the following:

 response.defaultAsciiEncoding = true/false
 response.encoding = UTF-8/ascii

I ran a single chunk through a UTF-8 encoder/decoder. After that I tried encoding/decoding the whole response body.

I am not very familiar with using buffers, and I suspect the problem lies in that direction. My second guess is that Node.js (or its HTTP client) simply cannot handle other encodings out of the box. In that case I would need to write my own HTTP client on top of a lower-level library, I think. I just want to make sure that I am not heading in the wrong direction :)
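A small sketch of why treating the response as UTF-8 cannot repair the text: the byte that iso-8859-2 uses for ä is not valid UTF-8 on its own, so the decoder substitutes the replacement character and the original byte is lost. The sample word and the variable names are only illustrative.

```javascript
// "Bär" as it arrives on the wire in iso-8859-2: ä is the single byte 0xE4.
const wireBytes = Buffer.from([0x42, 0xe4, 0x72]);

// Decoding as UTF-8 turns the stray 0xE4 into U+FFFD, the replacement character.
const asUtf8 = wireBytes.toString('utf8');
console.log(asUtf8 === 'B\uFFFDr'); // true

// Re-encoding does not bring the byte back: U+FFFD is written as EF BF BD.
const roundTrip = Buffer.from(asUtf8, 'utf8').toString('hex');
console.log(roundTrip); // 42efbfbd72
```

Once a chunk has been decoded with the wrong encoding, no later encode/decode step can recover it, so the raw bytes have to be kept until the correct decoder is applied.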

3 answers

I had a quick go with Node.js, and it looks like svick is right: Node.js does not support ISO-8859 encodings such as iso-8859-2. However, you can get the response as a binary stream, and then either pass it on to the browser in its original encoding, or convert it with node-iconv (again, as svick suggested).

Here is a small example: http://gist.github.com/576884
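On a current Node.js release the binary approach could look roughly like this. It is a sketch, not the code from the gist: it assumes the global `TextDecoder` is backed by full ICU (the default in official builds since Node.js 13) so that the `iso-8859-2` label is available, and the names `decodeBody` and `fetchXml` as well as the URL are placeholders.

```javascript
const http = require('http');

// Join the raw chunks first, then decode the whole body once.
function decodeBody(chunks, charset) {
  return new TextDecoder(charset).decode(Buffer.concat(chunks));
}

// Hypothetical helper: fetch a URL and hand the decoded text to a callback.
function fetchXml(url, charset, callback) {
  http.get(url, (response) => {
    const chunks = [];
    // No setEncoding() call, so every chunk stays a Buffer of raw bytes.
    response.on('data', (chunk) => chunks.push(chunk));
    response.on('end', () => callback(null, decodeBody(chunks, charset)));
  }).on('error', callback);
}

// Example call; the host is a placeholder.
// fetchXml('http://example.com/feed.xml', 'iso-8859-2', (err, xml) => console.log(err || xml));
```

Decoding once after `Buffer.concat` keeps the example simple; for a multi-byte encoding it also avoids splitting a character across two chunks.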


Try setting the encoding parameter in the XML declaration:

 <?xml version="1.0" encoding="iso-8859-2" ?>
 <xml>
   <!-- whatever -->
 </xml>

XML files are assumed to be UTF-8 unless you explicitly declare their encoding.
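On the receiving side, the declaration can be inspected before the body is decoded. The following sketch reads the first bytes as Latin-1 (safe here because the declaration is plain ASCII in any ASCII-compatible encoding) and extracts the `encoding` value; `sniffXmlEncoding` is a hypothetical helper, and it does not handle UTF-16 input or byte order marks.

```javascript
// Hypothetical helper: read the encoding named in an XML declaration.
function sniffXmlEncoding(buffer, fallback = 'utf-8') {
  // The declaration is ASCII, so a byte-per-character decode of the head is enough.
  const head = buffer.subarray(0, 200).toString('latin1');
  const match = /^<\?xml[^>]*\bencoding\s*=\s*["']([A-Za-z][\w.-]*)["']/.exec(head);
  return match ? match[1].toLowerCase() : fallback;
}

const sample = Buffer.from('<?xml version="1.0" encoding="iso-8859-2" ?><root/>', 'latin1');
console.log(sniffXmlEncoding(sample)); // iso-8859-2
```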


It seems to me that Node.js cannot work with encodings other than UTF-8. Perhaps something like node-iconv would work.
