Can string comparison really differ based on culture when the string is guaranteed not to change?

I am reading encrypted credentials / connection strings from a configuration file. ReSharper tells me "String.IndexOf(string) is culture-specific here" on this line:

    if (line.Contains("host=")) {
        _host = line.Substring(line.IndexOf("host=") + "host=".Length, line.Length - "host=".Length);
    }

... and therefore wants to change it to:

    if (line.Contains("host=")) {
        _host = line.Substring(line.IndexOf("host=", System.StringComparison.Ordinal) + "host=".Length, line.Length - "host=".Length);
    }

The value I'm reading will always be "host=", regardless of where the application is deployed. Is it really sensible to add this System.StringComparison.Ordinal bit?

More importantly, could it hurt anything to use it?

+51
c# string-comparison resharper configuration-files cultureinfo
Jun 07 2018-12-12T00:
3 answers

Absolutely. According to MSDN ( http://msdn.microsoft.com/en-us/library/d93tkzah.aspx ):

This method performs a word (case-sensitive and culture-sensitive) search using the current culture.

So you may get different results if you run it under a different culture (set via the regional and language settings in Control Panel).

In this particular case you probably won't have a problem, but put an i in the search string and run it under a Turkish culture, and it will probably ruin your day.

See MSDN: http://msdn.microsoft.com/en-us/library/ms973919.aspx

These new recommendations and APIs exist to eliminate erroneous assumptions about the default behavior of standard APIs. The canonical example of bugs that emerge when non-linguistic string data is interpreted linguistically is the "Turkish-I" problem.

For nearly all Latin alphabets, including English, the character i (\u0069) is the lowercase version of the character I (\u0049). This casing rule quickly becomes the default assumption for someone programming in such a culture. However, in Turkish ("tr-TR") there is a capital "i with a dot", the character (\u0130), which is the capital version of i. Similarly, in Turkish there is a lowercase "i without a dot" (\u0131), which capitalizes to I. This behavior also occurs in the Azerbaijani culture ("az").

Therefore, the assumptions normally made about capitalizing i or lowercasing I are not valid in all cultures. If the default overloads of the string comparison routines are used, they are subject to variance between cultures. For non-linguistic data, as in the following example, this can produce undesired results:

    // Requires: using System; using System.Globalization; using System.Threading;
    Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
    Console.WriteLine("Culture = {0}", Thread.CurrentThread.CurrentCulture.DisplayName);
    Console.WriteLine("(file == FILE) = {0}", (String.Compare("file", "FILE", true) == 0));

    Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
    Console.WriteLine("Culture = {0}", Thread.CurrentThread.CurrentCulture.DisplayName);
    Console.WriteLine("(file == FILE) = {0}", (String.Compare("file", "FILE", true) == 0));

Because I is compared differently, the result of the comparison changes when the thread culture changes. This is the output:

    Culture = English (United States)
    (file == FILE) = True
    Culture = Turkish (Turkey)
    (file == FILE) = False
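For contrast, here is a minimal sketch of the same check done with StringComparison.OrdinalIgnoreCase, which does not consult the thread culture. The class OrdinalIgnoreCaseDemo and the helper SameIdentifier are made-up names for illustration and are not part of the MSDN sample. It also assumes culture data for "tr-TR" is available on the machine.

```csharp
using System;
using System.Globalization;
using System.Threading;

// Hypothetical demo class; not part of the MSDN sample quoted above.
public static class OrdinalIgnoreCaseDemo
{
    // Compares two non-linguistic identifiers without consulting any culture.
    public static bool SameIdentifier(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        foreach (var name in new[] { "en-US", "tr-TR" })
        {
            // Switching the thread culture has no effect on an ordinal comparison.
            Thread.CurrentThread.CurrentCulture = new CultureInfo(name);
            Console.WriteLine("{0}: (file == FILE) = {1}", name, SameIdentifier("file", "FILE"));
        }
    }
}
```

With an ordinal, case-insensitive comparison the result should be True under both cultures, because only the character values are compared.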

Here is an example that does not involve case:

    var s1 = "\u00E9";   // é as one character (ALT+0233)
    var s2 = "e\u0301";  // 'e' plus combining acute accent U+0301 (two characters)

    Console.WriteLine(s1.IndexOf(s2, StringComparison.Ordinal));           // -1
    Console.WriteLine(s1.IndexOf(s2, StringComparison.InvariantCulture));  // 0
    Console.WriteLine(s1.IndexOf(s2, StringComparison.CurrentCulture));    // 0
+60
Jun 08 2018-12-12T00:

CA1309: UseOrdinalStringComparison

It does no harm to leave it out, but "by explicitly setting the parameter to either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, your code often gains speed, improves correctness, and becomes more reliable."




What is Ordinal and why does it matter for your case?

An operation that uses ordinal sort rules performs a comparison based on the numeric value (Unicode code point) of each Char in the string. An ordinal comparison is fast but culture-insensitive. When you use ordinal sort rules to sort strings that start with Unicode characters (U+), the string U+xxxx comes before the string U+yyyy if xxxx is numerically less than yyyy.
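A small sketch of what "numeric value of each Char" means in practice, using string.CompareOrdinal next to a culture-aware comparison. OrdinalOrderDemo and OrdinalBefore are hypothetical names chosen for this illustration.

```csharp
using System;

// Hypothetical demo class contrasting ordinal and culture-aware ordering.
public static class OrdinalOrderDemo
{
    // True when a sorts before b by the numeric value of each Char.
    public static bool OrdinalBefore(string a, string b)
    {
        return string.CompareOrdinal(a, b) < 0;
    }

    public static void Main()
    {
        // 'B' is U+0042 and 'a' is U+0061, so "B" precedes "a" ordinally.
        Console.WriteLine(OrdinalBefore("B", "a"));  // True

        // A linguistic (word) comparison sorts "a" before "B" instead.
        Console.WriteLine(string.Compare("a", "B", StringComparison.InvariantCulture) < 0);  // True
    }
}
```

The two comparisons disagree about the order of the same pair of strings, which is why the choice of comparison type should be explicit.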

And, as you stated ... the string value you are reading is not culture-sensitive, so it makes sense to use an ordinal comparison rather than a word comparison. Just remember, Ordinal means "this is not culture-sensitive."
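Applied to the code in the question, an ordinal lookup might look like the sketch below. HostParser and ExtractHost are hypothetical names, and taking everything after "host=" to the end of the line is an assumption about the file format. That matches the question's length arithmetic only when "host=" sits at the very start of the line.

```csharp
using System;

// Hypothetical helper; the name and the "rest of the line" rule are assumptions.
public static class HostParser
{
    private const string Key = "host=";

    // Returns the text after "host=", or null when the key is absent.
    public static string ExtractHost(string line)
    {
        // Ordinal: the key is a fixed, non-linguistic token.
        int start = line.IndexOf(Key, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }
        // Substring(int) reads to the end of the line, so no length arithmetic is needed.
        return line.Substring(start + Key.Length);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractHost("host=db01.example.com"));  // db01.example.com
    }
}
```

A single IndexOf call replaces the Contains plus IndexOf pair, so the line is searched once and both searches are guaranteed to use the same comparison rules.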

+26
Jun 07 2018-12-12T00:

To answer your specific question: No, but a static analysis tool cannot know that your input value will never contain locale-specific information.

+5
Jun 07 2018-12-12T00: