How to split a string on pipe delimiters that are not inside double quotes

I have a line like the one below, delimited by pipes. Some of the values are wrapped in double quotes (for example: "ANI").

How do I split it on the pipe separators that are not inside double quotes?

511186|"ANI"|"ABCD-102091474|E|EFG"||"2013-07-20 13:47:19.556" 

The separated values should look like this:

 511186 "ANI" "ABCD-102091474|E|EFG" "2013-07-20 13:47:19.556" 

Any help would be appreciated!

EDIT

The answer I accepted did not work for strings that have double quotes inside a quoted value. Any idea what the problem is?

    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    string regexFormat = string.Format(@"(?:^|\{0})(""[^""]*""|[^\{0}]*)", '|');
    string[] result = Regex.Matches(
            "111001103|\"E\"|\"BBB\"|\"XXX\"|||10000009|153086649|\"BCTV\"|\"REV\"|||1.00000000|||||\"ABC-BT AD\"|\"\"\"ABC - BT\"\" AD\"|||\"N\"||\"N\"|||\"N\"||\"N",
            regexFormat)
        .Cast<Match>()
        .Select(m => m.Groups[1].Value)
        .ToArray();

    foreach (var i in result)
        Console.WriteLine(i);
4 answers

You can use a regular expression to match elements in a string:

    string[] result = Regex.Matches(s, @"(?:^|\|)(""[^""]*""|[^|]*)")
        .Cast<Match>()
        .Select(m => m.Groups[1].Value)
        .ToArray();

Explanation:

    (?:          A non-capturing group
      ^|\|       Matches start of string or a pipe character
    )            End of group
    (            Capturing group
      "[^"]*"    Zero or more non-quotes surrounded by quotes
      |          Or
      [^|]*      Zero or more non-pipes
    )            End of group
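
Regarding the EDIT in the question: the quoted branch "[^"]*" ends at the first quote it meets after the opening one, so for a value such as """ABC - BT"" AD" it matches only the leading "" and the rest of the value is lost. A possible variation is sketched below. It assumes the usual CSV convention that a doubled quote inside a quoted value stands for one literal quote, which the sample data suggests but the question does not state. The class and method names are made up for illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class PipeRegexSplitter
{
    // Same shape as the pattern above, except that the quoted branch
    // also accepts "" pairs inside the quotes: "(?:[^"]|"")*"
    private static readonly Regex Field =
        new Regex(@"(?:^|\|)(""(?:[^""]|"""")*""|[^|]*)");

    public static string[] Split(string line)
    {
        return Field.Matches(line)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value) // quotes stay part of the value
            .ToArray();
    }
}
```

Calling PipeRegexSplitter.Split on the line from the question should give five values, including the empty one between the two adjacent pipes, with the quotes left in place.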

Here is one way to do this:

    using System;
    using System.Collections.Generic;

    public List<string> Parse(string str)
    {
        var parts = str.Split(new[] { "|" }, StringSplitOptions.None);
        List<string> result = new List<string>();

        for (int i = 0; i < parts.Length; i++)
        {
            string part = parts[i];

            // A part that opens a quote without closing it was cut at a pipe
            // inside the quoted value, so glue the following parts back on.
            if (IsPartStart(part))
            {
                List<string> sub_parts = new List<string> { part };

                // Stop at the closing part, or at the end of the input if
                // the quote is never closed (avoids running past the array).
                while (i + 1 < parts.Length)
                {
                    i++;
                    sub_parts.Add(parts[i]);
                    if (IsPartEnd(parts[i]))
                        break;
                }

                part = string.Join("|", sub_parts);
            }

            result.Add(part);
        }

        return result;
    }

    private bool IsPartStart(string part)
    {
        return part.StartsWith("\"") && !part.EndsWith("\"");
    }

    private bool IsPartEnd(string part)
    {
        return !part.StartsWith("\"") && part.EndsWith("\"");
    }

This works by splitting on every pipe and then rejoining the parts that belong together: it looks for a part that starts with " and joins everything up to the corresponding part that ends with ".

    inputString.Split('|');

... will give you the individual parts, but it fails if any of the parts contains a pipe character.

If this is a CSV file that follows all the usual CSV escaping rules (but uses a pipe instead of a comma), you should look at CsvHelper, a NuGet package for reading and writing CSV files. It does all the hard work and handles the corner cases that you would otherwise have to deal with yourself.


This is how I do it. It is quite simple, and I think you will find it very fast. I have not run any benchmarks, but I expect it to be faster than regular expressions.

    using System.Collections.Generic;

    IEnumerable<string> Parse(string s)
    {
        int pos = 0;
        while (pos < s.Length)
        {
            char endChar = '|';
            bool quoted = s[pos] == '"';

            // Test for quoted value
            if (quoted)
            {
                pos++;
                endChar = '"';
            }

            // Extract this value (quoted values are returned without their quotes)
            int newPos = s.IndexOf(endChar, pos);
            if (newPos < 0)
                newPos = s.Length;
            yield return s.Substring(pos, newPos - pos);

            // Move to start of next value
            pos = newPos + 1;

            // After a closing quote, also skip the pipe that follows it.
            // Doing this after an unquoted value would swallow an empty field.
            if (quoted && pos < s.Length && s[pos] == '|')
                pos++;
        }
    }
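
A variation on the same scanning idea, in case the quotes should stay in the output (as in the expected result in the question) and values may contain doubled quotes. This is only a sketch: the class and method names are made up, and it assumes that quotes in the input are balanced. It walks the string once and flips a flag on every quote character, so a pipe counts as a separator only while the flag is off.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class QuoteAwareSplitter
{
    // Splits on pipes that are outside double quotes; quotes are kept.
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // Every quote flips the state. A doubled quote ("") flips it
                // twice, so the scanner stays inside the quoted value.
                inQuotes = !inQuotes;
                current.Append(c);
            }
            else if (c == '|' && !inQuotes)
            {
                // Separator outside quotes: finish the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing pipe, so add it after the loop.
        fields.Add(current.ToString());
        return fields;
    }
}
```

Unlike the method above, this keeps empty fields at the end of the line, because the final field is always added after the loop.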
