(# ゚Д゚) is a 5-letter word. But on iOS, [@"(# ゚Д゚)" length] is 7. Why?

(# ゚Д゚) is a 5-letter word. But on iOS, [@"(# ゚Д゚)" length] is 7.

  • Why?

  • I use <UITextInput> to change the text in a UITextField or UITextView. When I make a UITextRange 5 characters long, it exactly covers (# ゚Д゚). So why does (# ゚Д゚) look like a 5-character word to UITextField and UITextView, but like a 7-character word to NSString?

  • How can I get the correct string length in this case?

+6
2 answers

1) As mentioned in the comments, your string consists of 5 composed character sequences (or character clusters, if you prefer). When you count unichars, as NSString's length method does, you get 7, which is the number of unichars required to represent your string in memory.

2) Apparently, UITextField and UITextView handle strings in a way that is aware of composed character sequences. Good news: so can you. See No. 3.

3) You can get the number of composed character sequences using the NSString APIs that handle composed character sequences correctly. A quick example, thrown together really fast, is a small NSString category:

    #import <Foundation/Foundation.h>

    @interface NSString (ComposedCharacterSequences_helper)
    - (NSUInteger)numberOfComposedCharacterSequences;
    @end

    @implementation NSString (ComposedCharacterSequences_helper)

    // Counts composed character sequences rather than unichars.
    - (NSUInteger)numberOfComposedCharacterSequences {
        __block NSUInteger count = 0;
        [self enumerateSubstringsInRange:NSMakeRange(0, self.length)
                                 options:NSStringEnumerationByComposedCharacterSequences
                              usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
            NSLog(@"%@", substring); // Just for fun
            count++;
        }];
        return count;
    }

    @end

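Again, this is quick code, but it should get you started. And if you use it like this:

    NSString *string = @"(# ゚Д゚)";
    // NSUInteger is cast to unsigned long so that %lu is correct on all platforms.
    NSLog(@"string length %lu", (unsigned long)string.length);
    NSLog(@"composed character count %lu", (unsigned long)[string numberOfComposedCharacterSequences]);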

You will see that you get the desired result.

For a detailed explanation of the NSString API, check out the WWDC 2012 Session 215 video, "Text and Linguistic Analysis".
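Since the question involves ranges as well as counts, a related sketch may help: converting a range counted in composed character sequences into the NSRange, counted in unichars, that NSString methods expect. UnicharRangeForComposedRange is a hypothetical helper name, not a Foundation API. It relies only on -rangeOfComposedCharacterSequenceAtIndex: and is a minimal sketch rather than a tuned implementation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): converts a range counted in
// composed character sequences into an NSRange counted in unichars.
// Returns {NSNotFound, 0} if `start` is out of bounds or `count` is 0.
// A `count` that runs past the end of the string is clamped to the end.
static NSRange UnicharRangeForComposedRange(NSString *string,
                                            NSUInteger start,
                                            NSUInteger count) {
    NSUInteger index = 0;               // current position, in unichars
    NSUInteger seen = 0;                // composed sequences passed so far
    NSUInteger location = NSNotFound;   // unichar index where the result starts
    NSUInteger length = string.length;

    while (index < length && seen < start + count) {
        NSRange r = [string rangeOfComposedCharacterSequenceAtIndex:index];
        if (seen == start) {
            location = r.location;
        }
        index = NSMaxRange(r);          // jump to the start of the next sequence
        seen++;
    }

    if (location == NSNotFound) {
        return NSMakeRange(NSNotFound, 0);
    }
    return NSMakeRange(location, index - location);
}
```

With the string from the question, and assuming the clustering described in 1), asking for 2 sequences starting at sequence 2 should yield the 4 unichars that make up the two two-unichar sequences.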

+7

Both " ゚" and "Д゚" are represented by a sequence of two Unicode characters (even though each is visually rendered as one). -[NSString length] reports the number of Unicode characters, that is, UTF-16 code units:

"The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear."

If you want to see the underlying code units:

    #import <Foundation/Foundation.h>

    // Returns the UTF-16 code units of str as space-separated hex values.
    NSString *describeUnicodeCharacters(NSString *str) {
        NSMutableString *codePoints = [NSMutableString string];
        for (NSUInteger i = 0; i < [str length]; ++i) {
            long ch = (long)[str characterAtIndex:i];
            [codePoints appendFormat:@"%0.4lX ", ch];
        }
        return codePoints;
    }

    int main(int argc, char *argv[]) {
        @autoreleasepool {
            NSString *s = @" ゚Д゚";
            // NSUInteger is cast to unsigned long so that %lu is correct on all platforms.
            NSLog(@"%lu unicode chars. bytes: %@", (unsigned long)[s length], describeUnicodeCharacters(s));
        }
        return 0;
    }

Output: 4 unicode chars. bytes: 0020 FF9F 0414 FF9F

2) and 3): what NJones said.
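To see where the sequence boundaries fall in a dump like the one above, the code units can be grouped per composed character sequence. This is only a sketch: DescribeComposedSequences is a made-up name, and the bracketed output format is chosen purely for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: lists the UTF-16 code units of str in hex, with the
// units of each composed character sequence grouped inside one pair of brackets.
static NSString *DescribeComposedSequences(NSString *str) {
    NSMutableArray<NSString *> *groups = [NSMutableArray array];
    NSUInteger index = 0;

    while (index < str.length) {
        // Range (in unichars) of the composed sequence containing `index`.
        NSRange r = [str rangeOfComposedCharacterSequenceAtIndex:index];

        NSMutableArray<NSString *> *units = [NSMutableArray array];
        for (NSUInteger i = r.location; i < NSMaxRange(r); ++i) {
            unsigned int unit = (unsigned int)[str characterAtIndex:i];
            [units addObject:[NSString stringWithFormat:@"%04X", unit]];
        }
        [groups addObject:[NSString stringWithFormat:@"[%@]",
                           [units componentsJoinedByString:@" "]]];

        index = NSMaxRange(r);  // continue with the next sequence
    }
    return [groups componentsJoinedByString:@" "];
}
```

For a string made of "e", U+0301 (combining acute accent) and "x", this returns [0065 0301] [0078]. For the string in the question, each two-unit sequence should appear inside a single pair of brackets.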

+1