How to convert Unicode strings (\u00e2 etc.) to NSString for display?

I am trying to support arbitrary Unicode from many international users. They have already put a lot of data into SQLite databases on their iPhones, and now I want to upload that data to a database on a server and then send it back to the device. I am currently using a PHP page that returns data from a MySQL database on the Internet. The data is stored in the MySQL database correctly, but when it is sent back, it shows up as escaped Unicode text, for example

Frank\u00e2\u0080\u0099s iPad

instead of

Frank’s iPad

where the apostrophe really should be a curly apostrophe.

The answer to another question says that Cocoa has no built-in method for converting the "\u00e2\u0080\u0099" part of the web server's response into the corresponding Unicode characters in an NSString. Is that correct?

This seems really surprising (and disappointing), because Cocoa definitely lets you enter many different Unicode characters, and I need to support arbitrary languages, including ones I have never heard of, and all possible characters. I can save them to the local SQLite database and read them back just fine, but once I send the data to the web server, I may get different data back. I want the data pulled from the web server to come back correctly formatted.

+2
3 answers

[...] there are no Cocoa built-in methods for converting [...]. Is that correct?

This is not true.

You may be interested in CFStringTransform and its capabilities. It is a full-blown ICU transform engine, and it can also perform the conversion you are asking for.

See "Using Objective-C/Cocoa to unescape Unicode characters, i.e. \u1234".
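As a minimal sketch of that conversion, assuming ARC and Foundation: the ICU transform ID "Any-Hex/Java", applied in reverse, turns literal \uXXXX escapes back into characters. The helper name UnescapeWithTransform is hypothetical and introduced only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not an Apple API): converts literal \uXXXX escapes
// in `escaped` into the characters they name. Assumes ARC.
static NSString *UnescapeWithTransform(NSString *escaped) {
    NSMutableString *result = [escaped mutableCopy];
    // "Any-Hex/Java" maps characters to \uXXXX; passing true for `reverse`
    // runs the transform the other way. NULL means "the whole string".
    Boolean ok = CFStringTransform((__bridge CFMutableStringRef)result,
                                   NULL,
                                   CFSTR("Any-Hex/Java"),
                                   true);
    // If the transform fails, hand back the input unchanged.
    return ok ? [result copy] : escaped;
}
```

Called as UnescapeWithTransform(@"Frank\\u2019s iPad"), this should give Frank’s iPad. For the exact string in the question, though, the same call produces three separate characters (U+00E2, U+0080, U+0099). Those are the UTF-8 bytes of the curly apostrophe (E2 80 99), each escaped as its own code point, which suggests the text was mis-decoded somewhere before it reached the device.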

+5

All NSStrings are Unicode.

The problem with "Frank\u00e2\u0080\u0099s iPad" is not that it is Unicode; it is that it has been escaped to ASCII. "Frank’s iPad" is valid Unicode in any UTF, and that is what you want.

So you need to find out whether the database is returning data that is already escaped, or whether the PHP layer escapes it at some point. If so, fix that if you can; the PHP resource should return UTF-8/16/32. Only if that approach fails should you try to unescape the string on the Cocoa side.

You are correct that there is no built-in way to unescape strings in Cocoa. If you get to this point, see if you can find open source code to do this; if not, you need to do it yourself, possibly using NSScanner.
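If you do end up writing it yourself, a rough sketch might look like the following. It assumes ARC, handles only the \uXXXX form with exactly four hex digits, and copies anything else through unchanged. ManualUnescape is a hypothetical name, and NSScanner is used only to parse the four digits.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces each literal \uXXXX (exactly four hex digits)
// with the UTF-16 code unit it names; anything else is copied through as-is.
static NSString *ManualUnescape(NSString *input) {
    NSCharacterSet *nonHex = [[NSCharacterSet characterSetWithCharactersInString:
                               @"0123456789abcdefABCDEF"] invertedSet];
    NSMutableString *output = [NSMutableString stringWithCapacity:input.length];
    NSUInteger length = input.length;
    NSUInteger i = 0;
    while (i < length) {
        unichar c = [input characterAtIndex:i];
        // An escape needs six code units: '\', 'u', and four hex digits.
        if (c == '\\' && i + 6 <= length && [input characterAtIndex:i + 1] == 'u') {
            NSString *hex = [input substringWithRange:NSMakeRange(i + 2, 4)];
            if ([hex rangeOfCharacterFromSet:nonHex].location == NSNotFound) {
                unsigned int value = 0;
                [[NSScanner scannerWithString:hex] scanHexInt:&value];
                unichar unit = (unichar)value;
                [output appendString:[NSString stringWithCharacters:&unit length:1]];
                i += 6;
                continue;
            }
        }
        // Not a well-formed escape: copy the code unit unchanged.
        [output appendString:[NSString stringWithCharacters:&c length:1]];
        i += 1;
    }
    return output;
}
```

Because NSString stores UTF-16 code units, two consecutive escapes that form a surrogate pair end up adjacent in the output and combine into one character. Other escape forms, such as \n or \\, are deliberately left alone in this sketch.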

+2

Make sure your web service response specifies a content type and encoding, and that the XML declares an encoding as well. In PHP, add the following before printing the XML:

header('Content-type: text/xml; charset=UTF-8');
print '<?xml version="1.0" encoding="UTF-8"?>';

My guess is that no encoding is currently being specified.

0