NSData to NSString conversion problem

I receive an HTML file as NSData and have to parse it to extract some information. My approach was to convert it to an NSString using UTF-8 encoding (the HTML contains non-English characters, for example Russian), but this failed. I used something like this:

    NSString *respData = [NSString stringWithUTF8String:[theData bytes]];

but it returned nil.

The only thing that actually worked was

    [NSString stringWithCString:[theData bytes] length:[theData length]];

but when it encounters Russian characters, for example, it returns gibberish.

My next approach was to scan the raw byte array, extract only the bytes I need, and convert those to an NSString. I tried something like this:

    - (NSString *)UTF8StringFromData:(NSData *)theData {
        Byte *arr = [theData bytes];
        NSUInteger begin1 = [self findIndexOf:@"<li>" bArr:arr size:[theData length]] + 4;
        NSUInteger end1 = [self findIndexOf:@"</li></ol>" bArr:arr size:[theData length]];
        Byte *arr1 = (Byte *)malloc(sizeof(Byte) * (end1 - begin1 + 1));
        int j = 0;
        for (int i = begin1; i < end1; i++) {
            arr1[j] = arr[i];
            j++;
        }
        arr1[j] = '\0';
        NSData *temp = [NSData dataWithBytes:arr1 length:j];
        return [[NSString alloc] initWithData:temp encoding:NSUTF8StringEncoding];
    }
objective-c iphone encoding nsstring nsdata
3 answers

Suppose you have received an NSURLResponse *response and an NSData *data:

    CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding((CFStringRef)[response textEncodingName]);
    NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    NSString *string = [[NSString alloc] initWithData:data encoding:encoding];
    // Do stuff here..
    [string release];
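
One likely reason the attempts in the question behaved differently is that stringWithUTF8String: expects a NUL-terminated buffer of valid UTF-8, which [theData bytes] does not guarantee, while initWithData:encoding: reads exactly [data length] bytes and returns nil when they are not valid in the requested encoding. The sketch below illustrates the second point with a made-up sample. The helper names SampleCP1251Data and DecodeData are hypothetical, ARC is assumed, and whether the page in the question was actually served as windows-1251 is a guess.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical sample: the word "Привет" encoded as windows-1251,
// one byte per Cyrillic letter. There is no NUL terminator.
static NSData *SampleCP1251Data(void) {
    const unsigned char bytes[] = {0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2};
    return [NSData dataWithBytes:bytes length:sizeof(bytes)];
}

// Decodes exactly data.length bytes in the given encoding.
// Returns nil when the bytes are not valid in that encoding.
static NSString *DecodeData(NSData *data, NSStringEncoding encoding) {
    return [[NSString alloc] initWithData:data encoding:encoding];
}
```

With this sample, DecodeData(SampleCP1251Data(), NSUTF8StringEncoding) is nil because the bytes are not valid UTF-8, while passing NSWindowsCP1251StringEncoding yields @"Привет". This is why taking the encoding from the response, as above, matters more than the choice of conversion method.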

I am responding to Martijn Thé's answer above as a separate answer, because I could not fit a readable code snippet into a comment.

I found that if the response content type on the server is "text/plain", then (__bridge CFStringRef)[response textEncodingName] will be nil, and if you pass that to CFStringConvertIANACharSetNameToEncoding, you will get EXC_BAD_ACCESS.

If the response content type is set to "text/html; charset=utf-8", then everything works as expected. To handle the text/plain content type, this is what I did:

    CFStringRef sRef = (__bridge CFStringRef)[response textEncodingName];
    if (sRef) {
        CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding(sRef);
        encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    } else {
        encoding = NSASCIIStringEncoding;
    }
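
The same guard can be wrapped in a small helper that also covers a charset name Core Foundation does not recognise, in which case CFStringConvertIANACharSetNameToEncoding returns kCFStringEncodingInvalidId. This is only a sketch: the function name EncodingForCharsetName and its fallback parameter are hypothetical, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Maps an IANA charset name (for example @"utf-8") to an NSStringEncoding.
// Returns `fallback` when the name is nil, empty, or not recognised.
static NSStringEncoding EncodingForCharsetName(NSString *name,
                                               NSStringEncoding fallback) {
    if (name.length == 0) {
        return fallback; // covers nil as well, since messaging nil yields 0
    }
    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)name);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        return fallback; // unknown charset name
    }
    return CFStringConvertEncodingToNSStringEncoding(cfEncoding);
}
```

It might be called as EncodingForCharsetName([response textEncodingName], NSASCIIStringEncoding) to match the fallback chosen above. Which fallback suits a given server is a judgment call.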

First of all, this is my code

    - (NSString *)UTF8StringFromData:(NSData *)theData {
        Byte *arr = [theData bytes];
        NSUInteger begin1 = [self findIndexOf:@"<li>" bArr:arr size:[theData length]] + 4;
        NSUInteger end1 = [self findIndexOf:@"</li></ol>" bArr:arr size:[theData length]];
        Byte *arr1 = (Byte *)malloc(sizeof(Byte) * (end1 - begin1 + 1));
        int j = 0;
        for (int i = begin1; i < end1; i++) {
            arr1[j] = arr[i];
            j++;
        }
        arr1[j] = '\0';
        NSData *temp = [NSData dataWithBytes:arr1 length:j];
        return [[NSString alloc] initWithData:temp encoding:NSUTF8StringEncoding];
    }

and second, I get the contents of the file from the Internet, so I can't be sure about anything. This is the HTML of a Google Translate page, if that helps ...
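
Once the whole NSData has been decoded with the correct encoding, the byte-level scan in the code above could be replaced by a search on the resulting string. This avoids the manual malloc and copy. The following is a minimal sketch assuming ARC. TextBetween is a hypothetical helper, not part of Foundation, and the markers are the ones used in the code above.

```objective-c
#import <Foundation/Foundation.h>

// Returns the text between the first occurrence of `open` and the next
// occurrence of `close` after it, or nil if either marker is missing.
static NSString *TextBetween(NSString *html, NSString *open, NSString *close) {
    NSRange start = [html rangeOfString:open];
    if (start.location == NSNotFound) {
        return nil;
    }
    NSUInteger from = NSMaxRange(start); // first character after `open`
    NSRange rest = NSMakeRange(from, html.length - from);
    NSRange end = [html rangeOfString:close options:0 range:rest];
    if (end.location == NSNotFound) {
        return nil;
    }
    return [html substringWithRange:NSMakeRange(from, end.location - from)];
}
```

For example, TextBetween(@"<ol><li>Привет</li></ol>", @"<li>", @"</li></ol>") returns @"Привет".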
