String.IndexOf() does not recognize modified characters

When using IndexOf to search for a char that is followed by a char with a large code point (e.g. char 700, which looks like an apostrophe: ʼ), IndexOf fails to find the char you are looking for.

E.g.:

 string find = "abcʼabcabc";
 int index = find.IndexOf("c");

In this code, the index should be 2, but it returns 6.

Is there any way around this?

2 answers

By default the sequence is compared linguistically, not as raw character values. Pass StringComparison.Ordinal to compare the raw values instead.

  string find = "abcʼabcabc";
  int index = find.IndexOf("c", StringComparison.Ordinal);

Unicode character 700 (U+02BC) is the modifier letter apostrophe: in other words, it modifies the letter c before it. Similarly, if you use an "e" followed by character 769 (0x301), it really won't be an "e" anymore: the e has been changed into an e with an acute accent, like this: é. You can see that this letter is actually two characters: copy it into Notepad and press Backspace (neat, right?).

You need to perform an "Ordinal" comparison, which compares the raw character values without applying any linguistic rules. This will find "c" and ignore the linguistic fact that it is modified by the following character. In my example, the values for "é" are (101) (769), so if you scan the values looking for 101, you will find it, ignoring the fact that (101) (769) is linguistically (233): é. Conversely, if you search for (233) linguistically, it will find the "equivalent" (101) (769):

 string find = "abéabcabc";
 int index = find.IndexOf("é"); // gives you 2, even though the "é" in find is two characters and the one passed to IndexOf is one
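
To make the difference between the two modes easier to see, here is a minimal sketch that writes both forms of "é" with escape sequences, so the code does not depend on how an editor stores the character. The class and method names (ComparisonDemo, Linguistic, Ordinal) are made up for illustration. The result of the linguistic search can depend on the runtime and its culture data, so no output is claimed for it.

```csharp
using System;

static class ComparisonDemo
{
    // 'e' (U+0065, decimal 101) followed by COMBINING ACUTE ACCENT (U+0301, decimal 769):
    // two chars that render as a single "é".
    public const string Decomposed = "abe\u0301abcabc";

    // Precomposed "é" (U+00E9, decimal 233): a single char.
    public const string Precomposed = "\u00E9";

    // Linguistic search: applies the current culture's comparison rules.
    public static int Linguistic(string text, string value) =>
        text.IndexOf(value, StringComparison.CurrentCulture);

    // Ordinal search: compares the raw UTF-16 char values only.
    public static int Ordinal(string text, string value) =>
        text.IndexOf(value, StringComparison.Ordinal);

    static void Main()
    {
        Console.WriteLine(Decomposed.Length);                   // 10
        Console.WriteLine(Ordinal(Decomposed, "e"));            // 2
        Console.WriteLine(Ordinal(Decomposed, Precomposed));    // -1: no char 233 in the string
        Console.WriteLine(Linguistic(Decomposed, Precomposed)); // depends on runtime and culture data
    }
}
```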

Hope this doesn't confuse you too much. If you do this in real code, explain exactly what you are doing in the comments: for instance, as in my "e" example, you would probably want linguistic equivalence for user data and ordinal equivalence for things like constants (which, I hope, would not be written so differently, lest your successor come after you with an axe).
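
One way to keep that intent visible in code is to wrap each kind of search in a named helper, so the choice of comparison is documented at the call site. This is only a sketch: SearchHelpers, IndexOfUserText and IndexOfToken are hypothetical names, not part of .NET.

```csharp
using System;

static class SearchHelpers
{
    // For text typed by users: linguistic match, so canonically equivalent
    // sequences can be treated as the same letter.
    public static int IndexOfUserText(string text, string value) =>
        text.IndexOf(value, StringComparison.CurrentCulture);

    // For constants, delimiters and other fixed tokens: exact char values only.
    public static int IndexOfToken(string text, string token) =>
        text.IndexOf(token, StringComparison.Ordinal);

    static void Main()
    {
        // The string from the question; \u02BC is char 700.
        string input = "abc\u02BCabcabc";

        Console.WriteLine(IndexOfToken(input, "c"));    // 2
        Console.WriteLine(IndexOfUserText(input, "c")); // depends on runtime and culture data
    }
}
```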
