How can I split (',') a string while ignoring commas between quotation marks?

I am calling .Split(',') on a string that I know contains comma-separated values, and I want those values separated and placed in a string[] object. This works fine for lines like this:

78,969.82,GW440,

But the values start to look different once the second value exceeds 1000, as in this example:

79,"1,013.42",GW450,...

These values come from a spreadsheet control, where I use the control's built-in ExportToCsv(...) method, which explains why I get the formatted version of the actual numeric value.

Question

Is there a way to make the .Split(',') method ignore commas inside quotes? I really do not want the value "1,013.42" to be split into "1" and "013.42".

Any ideas? Thanks!

Update

I would really like to do this without bringing in a third-party tool, since my use case does not involve much beyond this, and even though it is part of my work's solution, adding such a tool would not really be worthwhile at the moment. I was hoping there was a quick way to solve this particular use case that I was missing, but since it is now the weekend, I will try to post another update to this question on Monday with the solution I eventually come up with. Thank you all for your help; I will evaluate each answer further on Monday.

string split c#
3 answers

This is a fairly simple CSV reader implementation that we use in several projects. It is easy to use and handles the cases you are talking about.

First, the Csv class:

    public static class Csv
    {
        // Doubles embedded quotes, then wraps the value in quotes
        // if it contains a comma, a quote, or a newline.
        public static string Escape(string s)
        {
            if (s.Contains(QUOTE))
                s = s.Replace(QUOTE, ESCAPED_QUOTE);

            if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
                s = QUOTE + s + QUOTE;

            return s;
        }

        // Strips the surrounding quotes and collapses doubled quotes.
        public static string Unescape(string s)
        {
            // The length check keeps a lone quote character from
            // producing a negative Substring length.
            if (s.Length >= 2 && s.StartsWith(QUOTE) && s.EndsWith(QUOTE))
            {
                s = s.Substring(1, s.Length - 2);

                if (s.Contains(ESCAPED_QUOTE))
                    s = s.Replace(ESCAPED_QUOTE, QUOTE);
            }

            return s;
        }

        private const string QUOTE = "\"";
        private const string ESCAPED_QUOTE = "\"\"";
        private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };
    }

Then a fairly nice reader implementation, if you need it. You should be able to do what you need with just the Csv class above.

    using System.IO;
    using System.Text.RegularExpressions;

    public sealed class CsvReader : System.IDisposable
    {
        public CsvReader(string fileName)
            : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
        }

        public CsvReader(Stream stream)
        {
            __reader = new StreamReader(stream);
        }

        public System.Collections.IEnumerable RowEnumerator
        {
            get
            {
                if (null == __reader)
                    throw new System.ApplicationException("I can't start reading without CSV input.");

                __rowno = 0;
                string sLine;
                string sNextLine;

                while (null != (sLine = __reader.ReadLine()))
                {
                    // A line with an unbalanced quote continues on the next line.
                    while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                        sLine += "\n" + sNextLine;

                    __rowno++;
                    string[] values = rexCsvSplitter.Split(sLine);

                    for (int i = 0; i < values.Length; i++)
                        values[i] = Csv.Unescape(values[i]);

                    yield return values;
                }

                __reader.Close();
            }
        }

        public long RowIndex { get { return __rowno; } }

        public void Dispose()
        {
            if (null != __reader) __reader.Dispose();
        }

        //============================================

        private long __rowno = 0;
        private TextReader __reader;

        // Splits on commas that are followed by an even number of quotes.
        private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

        // Matches a line that contains an odd number of quotes.
        private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");
    }

Then you can use it like this:

    var reader = new CsvReader(new FileStream(file, FileMode.Open));

Note: this opens an existing CSV file, but it can be modified quite easily to take a string[] if that is what you need.


Since you are reading a CSV file, the best approach would be to use an existing CSV reader. There is more to CSV than just commas between quotation marks, and finding all the cases you need to handle will be more work than it is worth.

Here's a question about CSV readers on SO.
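If a third-party dependency is the concern, one reader that ships with .NET itself is TextFieldParser in the Microsoft.VisualBasic.FileIO namespace. The sketch below is only an illustration and assumes the project can reference the Microsoft.VisualBasic assembly; the BuiltInCsv class and ParseLine method are hypothetical names, not something from the question.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper: parses one CSV line with the framework's TextFieldParser.
public static class BuiltInCsv
{
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat commas inside "..." as part of the field, not as delimiters.
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

Calling BuiltInCsv.ParseLine on the line 79,"1,013.42",GW450 should give the three fields 79, 1,013.42 and GW450, with the surrounding quotes already removed.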


You should probably read this article: Comma-based Regular Expression Ignoring Commas Inside Quotes. Although it is written for Java, the regular expression is the same.
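As a rough C# sketch of the regular-expression approach, the snippet below applies the same lookahead pattern as rexCsvSplitter in the first answer to a single line and then strips the surrounding quotes. The QuotedCsvSplit class and SplitLine method are hypothetical names used only for illustration, and the sketch assumes each record fits on one line.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: splits one CSV line on commas outside double quotes.
public static class QuotedCsvSplit
{
    // A comma is a separator only if an even number of quotes follows it,
    // which means the comma itself is not inside a quoted field.
    private static readonly Regex Splitter =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

    public static string[] SplitLine(string line)
    {
        string[] fields = Splitter.Split(line);

        for (int i = 0; i < fields.Length; i++)
        {
            string f = fields[i];

            // Remove the enclosing quotes and collapse doubled quotes.
            if (f.Length >= 2 && f[0] == '"' && f[f.Length - 1] == '"')
                fields[i] = f.Substring(1, f.Length - 2).Replace("\"\"", "\"");
        }

        return fields;
    }
}
```

For the line from the question, QuotedCsvSplit.SplitLine("79,\"1,013.42\",GW450") returns the three values 79, 1,013.42 and GW450. A line with a trailing comma, such as 78,969.82,GW440, yields an empty string as its last element, just as .Split(',') would.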
