Normalizing strings with String.ToUpperInvariant()

I currently store normalized versions of strings in my SQL Server database in lower case. For example, in my Users table, I have a UserName field and a LoweredUserName field. Depending on the context, I use either the T-SQL LOWER() function or the C# String.ToLower() method to generate the lower case version of the user name that fills the LoweredUserName field. According to Microsoft's guidelines and the Visual Studio code analysis rule CA1308, I should use the C# String.ToUpperInvariant() method instead of ToLower(). According to Microsoft, this is both a performance issue and a globalization issue: converting to upper case is safe, while converting to lower case can cause a loss of information (for example, the Turkish "I" problem).
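To illustrate the Turkish "I" problem mentioned above, here is a minimal sketch. The class and method names are hypothetical, and it assumes the tr-TR culture data is installed on the machine; it is only meant to show why a culture-sensitive ToLower() can give a different result from the invariant methods.

```csharp
using System;
using System.Globalization;

public static class TurkishIDemo
{
    // Hypothetical helper: lower-cases a string using the rules of a specific culture.
    public static string LowerWith(string value, CultureInfo culture)
    {
        return value.ToLower(culture);
    }

    public static void Main()
    {
        string name = "INFO";

        // Invariant casing does not depend on the current thread culture.
        string invariantLower = name.ToLowerInvariant();

        // Assumption: tr-TR is available. Under Turkish rules, upper case 'I'
        // is expected to map to dotless 'ı' (U+0131) rather than 'i'.
        // In globalization-invariant mode this constructor may throw or behave differently.
        CultureInfo turkish = new CultureInfo("tr-TR");
        string turkishLower = LowerWith(name, turkish);

        Console.WriteLine(invariantLower);
        Console.WriteLine(turkishLower);
        Console.WriteLine(string.Equals(invariantLower, turkishLower, StringComparison.Ordinal));
    }
}
```

If the two lowered strings differ, a LoweredUserName value written under one culture would not match a lookup key computed under another, which is the kind of mismatch the rule is trying to prevent.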

If I switch to ToUpperInvariant for string normalization, I will also have to change my database schema, since it is based on Microsoft's ASP.NET Membership framework (see this related question), which normalizes strings to lower case.

Isn't Microsoft contradicting itself by telling us to use upper case normalization in C# while its own code in the Membership tables and procedures uses lower case normalization? Should I switch everything to upper case normalization, or just keep using lower case normalization?

+10
c# sql-server code-analysis asp.net-membership
Apr 21 '09 at 17:32
3 answers

To answer your first question: yes, Microsoft is a bit inconsistent. To answer your second question: do not switch anything until you have confirmed that this is causing a bottleneck in your application.

Think about how much forward progress you could make on your project instead of spending time switching everything over. Your development time is much more valuable than the savings you would get from such a change.

Remember:

Premature optimization is the root of all evil (or at least most of it) in programming. - Donald Knuth

+3
Apr 21 '09 at 17:38

According to CA1308, the reason is that some characters cannot make a round trip once they are converted to lower case. The important thing is that you always convert in the same direction, so if your standard is to always convert to lower case, there is no reason to change it.
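One way to make sure the conversion always goes in the same direction is to route every normalization through a single helper. The following is only a sketch: UserNameNormalizer, Normalize, and SameUser are hypothetical names, and it uses ToLowerInvariant() rather than the culture-sensitive ToLower() from the question so that the result does not depend on the current thread culture. Note that CA1308 would still flag the ToLowerInvariant() call.

```csharp
using System;

public static class UserNameNormalizer
{
    // Hypothetical helper: the only place where the normalization direction is chosen.
    // Lower case is used here to match a schema that already stores lowered names.
    public static string Normalize(string userName)
    {
        if (userName == null) throw new ArgumentNullException(nameof(userName));
        return userName.ToLowerInvariant();
    }

    // Compares two user names by their normalized forms, using an ordinal comparison.
    public static bool SameUser(string left, string right)
    {
        return string.Equals(Normalize(left), Normalize(right), StringComparison.Ordinal);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UserNameNormalizer.Normalize("JSmith"));          // jsmith
        Console.WriteLine(UserNameNormalizer.SameUser("JSmith", "jsmith")); // True
    }
}
```

Because both the stored value and the lookup key pass through the same function, moving to upper case later would mean changing one line and re-populating the stored column, not hunting for every ToLower() call.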

+6
Apr 21 '09 at 17:44

Keep using lower case normalization. Only change to comply with Microsoft's standards if a major problem arises.

This is unfortunate, but it is how things are. Microsoft "standards" are generally poorly understood and not very consistent; experience with them has shown that, unless there is a good reason to change, it is better to stick with what works for as long as it works. Note that this generally does NOT apply to non-Microsoft technologies; the arbitrariness of Microsoft "standards" is what makes them a notable exception.

Edit: I should clarify here; my opinion of Microsoft is very low, based on many years of experience working with their standards. As noted in the comments, I have no specific references to back up the "everyone else but Microsoft" claim; it comes only from my personal experience. Your mileage may vary greatly. This answer should really be read as my opinion. Sorry that this was not clear before.

-2
Apr 21 '09 at 17:44


