I am bulk loading data from a CSV file, and I need to replace the non-ASCII character "�" with a normal space.
The character "�" corresponds to "\uFFFD" in C/C++/Java and is called the REPLACEMENT CHARACTER. In addition, the official C# documentation mentions other characters such as U+FEFF, U+205F, U+200B, U+180E, and U+202F.
I am trying to do the replacement in this method:
public string Errors = "";
public void test()
{
    string textFromCsvCell = "";
    string validCharacters = "^[0-9A-Za-z().:%-/ ]+$";
    textFromCsvCell = "This is my text from csv file";
    // ...
}
The validation method shows me this text:
"This is my�text from csv file"
I have also tried several solutions:
Tried Solution 1: Using Trim
Regex.Replace(value.Trim(), @"[^\S\r\n]+", " ");
Tried Solution 2: Using Replace
System.Text.RegularExpressions.Regex.Replace(str,@"\s+"," ");
Tried Solution 3: Using Trim
str.Trim(new char[]{'\uFEFF','\u200B'});
Tried Solution 4: Adding [\S\r\n] to validCharacters
string validCharacters = @"^[\S\r\n0-9A-Za-z().:%-/ ]+$";
Nothing works.
Does anyone have an idea? How can I replace it? I would be very grateful for any help, thanks.
Sources:
http://www.fileformat.info/info/unicode/char/0fffd/index.htm
Trying to replace all white space with a single space
Strip Byte Order Mark from string in C#
C# Regex - Remove extra whitespaces but keep new lines
EDITED
This is the source line:
"GLUCOSE CONTINUOUS MONITORING SYSTEM"
In 0x... notation (the non-breaking space shows up as 0xA0):
SYSTEM OF0xA0MONITORING CONTINUED GLUCOSE
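One possible explanation for the "�" is a decoding mismatch: a lone 0xA0 byte is not valid UTF-8, so a UTF-8 decoder substitutes U+FFFD, while a single-byte encoding such as ISO-8859-1 maps the same byte to U+00A0 (no-break space). The sketch below illustrates this under the assumption that the CSV was saved in a single-byte encoding, which the question does not confirm. The class name `DecodingDemo` and the sample bytes are made up for illustration.

```csharp
using System;
using System.Text;

public static class DecodingDemo
{
    // Bytes for "OF", one 0xA0 byte, then "MONITORING", as they might
    // appear in a file saved with a single-byte encoding (assumption).
    public static readonly byte[] Sample =
    {
        0x4F, 0x46, 0xA0,
        0x4D, 0x4F, 0x4E, 0x49, 0x54, 0x4F, 0x52, 0x49, 0x4E, 0x47
    };

    // Decoded as UTF-8: the lone 0xA0 is an invalid sequence,
    // so the decoder emits U+FFFD in its place.
    public static string DecodeAsUtf8() =>
        Encoding.UTF8.GetString(Sample);

    // Decoded as ISO-8859-1: 0xA0 maps to U+00A0 (no-break space).
    public static string DecodeAsLatin1() =>
        Encoding.GetEncoding("ISO-8859-1").GetString(Sample);
}
```

If this is what happens in your case, reading the file with the matching encoding (for example by passing an `Encoding` to `File.ReadAllText` or `StreamReader`) would give you U+00A0 instead of U+FFFD, and U+00A0 is matched by `\s`.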
Solution
Use the Unicode code converter at http://r12a.imtqy.com/apps/conversion/ to look at the conversions, then do the replacement.
In my case, I am doing a simple replacement:
string value = "SYSTEM OF\u00A0MONITORING CONTINUES OF GLUCOSE"; // value contains a non-breaking space
// when displayed, value looks like "SYSTEM OF�MONITORING CONTINUES OF GLUCOSE"
string cleaned = "";
string pattern = @"[^\u0000-\u007F]+"; // any run of non-ASCII characters
string replacement = " ";
Regex rgx = new Regex(pattern);
cleaned = rgx.Replace(value, replacement);
if (Regex.IsMatch(cleaned, "^[0-9A-Za-z().:<>%-/ ]+$"))
{
    // all code for insert
}
else
{
    // Errors message
}
This character class covers all possible whitespace characters: space, tab, form feed, line feed, carriage return, vertical tab, and the Unicode space and line separator characters:
[ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]
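As a rough sketch, the character class above can be used directly in a .NET `Regex`. The helper below is hypothetical (the names `SpaceNormalizer` and `Normalize` are mine). It also adds `\ufffd` to the class, since U+FFFD is not whitespace and would otherwise not be matched, which may be why the `\s`-based attempts had no effect on "�".

```csharp
using System.Text.RegularExpressions;

public static class SpaceNormalizer
{
    // The whitespace class from above, with U+FFFD appended (my addition).
    // The trailing + collapses each run of matches into one replacement.
    private static readonly Regex OddSpaces = new Regex(
        @"[ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004" +
        @"\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f" +
        @"\u3000\ufffd]+");

    // Replaces every run of the characters above with a single ASCII space.
    public static string Normalize(string input) =>
        OddSpaces.Replace(input, " ");
}
```

Note that this class includes `\n` and `\r`, so line breaks are replaced as well. For a single CSV cell that is probably fine. Remove them from the class if line breaks have to be preserved.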
Links: https://developer.mozilla.org/en/docs/Web/JavaScript/Guide/Regular_Expressions