Match items in a comma-separated list that are not surrounded by single or double quotes

I want to match the text of each item in a comma-separated list. For plain lists, the following regular expression works fine:

/[^,]+/g 

(Regex101 demo).

The problem is that I want to ignore any commas that are inside single or double quotes, and I'm not sure how to extend the pattern above to do this.

Here is an example line:

 abcd, efgh, ij"k,l", mnop, 'q,rs't 
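For illustration, here is a minimal sketch of what the simple pattern returns for this line. The variable names `line` and `naive` are made up for the example:

```javascript
// The example line, including its leading and trailing space.
const line = ` abcd, efgh, ij"k,l", mnop, 'q,rs't `;

// The simple pattern breaks on every comma, quoted or not.
const naive = line.match(/[^,]+/g);

console.log(naive.length); // 7 pieces instead of the 5 wanted
console.log(naive);        // ij"k,l" and 'q,rs't are each cut in two
```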

I want to match either the five pieces of text, or the four separating commas (so that I can retrieve the data using split() instead of match()):

  • abcd
  • efgh
  • ij"k,l"
  • mnop
  • 'q,rs't

Or:

 abcd, efgh, ij"k,l", mnop, 'q,rs't
     ^     ^        ^     ^

How can I do this?


There are three related questions, but none of them handles both ' and " in JavaScript.

3 answers

OK, so each of the groups you want to match can contain:

  • Plain characters
  • A matching pair of "
  • A matching pair of '

So this should work:

 /((?:[^,"']+|"[^"]*"|'[^']*')+)/g 

RegEx101 demo

As an added bonus, it allows single quotes inside double quotes and vice versa. However, you would probably need a state machine to support escaped double quotes inside double-quoted text (for example, "aa\"aa").

Unfortunately, it also matches the leading space, so you have to trim the matches.
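As a rough sketch of how the pattern above might be used with match(), including the trimming step: the helper name `matchItems` is made up for this example, and the input is the example line from the question.

```javascript
// The pattern from this answer: runs of plain characters,
// double-quoted text, or single-quoted text, repeated.
const itemPattern = /((?:[^,"']+|"[^"]*"|'[^']*')+)/g;

// Hypothetical helper: return the trimmed items of one line.
function matchItems(line) {
  // match() returns null when nothing matches, so fall back to [].
  const matches = line.match(itemPattern) || [];
  // Each match keeps its surrounding whitespace, so trim it.
  return matches.map((item) => item.trim());
}

const matchedItems = matchItems(` abcd, efgh, ij"k,l", mnop, 'q,rs't `);
console.log(matchedItems);
// [ 'abcd', 'efgh', 'ij"k,l"', 'mnop', "'q,rs't" ]
```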


Use two lookaheads to assert that the matched comma is outside quotes:

 /(?=(([^"]*"){2})*[^"]*$)(?=(([^']*'){2})*[^']*$)\s*,\s*/g 

  • (?=(([^"]*"){2})*[^"]*$) asserts that an even number of double quotes follows the match position, up to the end of the string.
  • (?=(([^']*'){2})*[^']*$) makes the same assertion for single quotes.
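A minimal sketch of using this approach with split(): because String.prototype.split also returns the contents of capturing groups, the sketch rewrites the groups inside the lookaheads as non-capturing ones. That rewrite, the helper name `splitItems`, and the initial trim() are assumptions of this example rather than part of the answer.

```javascript
// Same two lookaheads, but with non-capturing groups (?:...) so that
// split() does not mix captured text into the result.
const commaOutsideQuotes =
  /(?=(?:(?:[^"]*"){2})*[^"]*$)(?=(?:(?:[^']*'){2})*[^']*$)\s*,\s*/;

// Hypothetical helper: split one line on commas that are outside quotes.
function splitItems(line) {
  // trim() removes the whitespace at both ends of the line;
  // whitespace around each comma is consumed by the pattern itself.
  return line.trim().split(commaOutsideQuotes);
}

const splitResult = splitItems(` abcd, efgh, ij"k,l", mnop, 'q,rs't `);
console.log(splitResult);
// [ 'abcd', 'efgh', 'ij"k,l"', 'mnop', "'q,rs't" ]
```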

PS: It does not handle unbalanced, nested or escaped quotes.
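For the escaped-quote case, a character-by-character scan is one alternative to a regular expression. The following is only a sketch under stated assumptions: a backslash escapes the next character, both characters are kept verbatim, and an unbalanced quote swallows the rest of the line. The function name `splitOutsideQuotes` is made up for this example.

```javascript
// Hypothetical helper: split on commas that are outside quotes,
// tracking which quote character (if any) is currently open.
function splitOutsideQuotes(line) {
  const items = [];
  let current = "";
  let quote = null; // the open quote character, or null when outside quotes

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === "\\" && i + 1 < line.length) {
      // Assumption: backslash escapes the next character; keep both as-is.
      current += ch + line[i + 1];
      i++;
    } else if (quote !== null) {
      // Inside quotes: only the matching quote character closes them.
      if (ch === quote) quote = null;
      current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
    } else if (ch === ",") {
      // A comma outside quotes ends the current item.
      items.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  items.push(current.trim());
  return items;
}

console.log(splitOutsideQuotes(` abcd, efgh, ij"k,l", mnop, 'q,rs't `));
console.log(splitOutsideQuotes('a, "b\\"c,d", e'));
```

With this approach the quoted comma in "b\"c,d" stays inside its item, because the escaped quote does not close the quoted text.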

RegEx demo


Try this in JavaScript:

 (?:(?:[^,"'\n]*(?:(?:"[^"\n]*")|(?:'[^'\n]*'))[^,"'\n]*)+)|[^,\n]+ 

Demo

Add named groups to make it more readable (remove the ?<name> parts for JavaScript engines that lack named capture groups):

 (?<has_quotes>(?:[^,"'\n]*(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*'))[^,"'\n]*)+)|(?<simple>[^,\n]+) 

Demo

Explanation:

(?<double_quotes>"[^"\n]*") matches ", then any characters except " and newline, then " = (1) (double-quoted)
(?<single_quotes>'[^'\n]*') matches ', then any characters except ' and newline, then ' = (2) (single-quoted)
(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*')) matches (1) or (2) = (3)
[^,"'\n]* matches any text that contains no " ' , or newline = (w)
(?:(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*'))[^,"'\n]*) matches (3)(w)
(?:(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*'))[^,"'\n]*)+ matches one or more repetitions of (3)(w) = (3w+)
(?<has_quotes>[^,"'\n]*(?:(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*'))[^,"'\n]*)+) matches (w)(3w+) = (4) (has quotes)
[^,\n]+ matches the remaining case = (5) (simple)
So the final pattern is (4)|(5) (has quotes, or simple).

Input:

 abcd,efgh, ijkl
 abcd, efgh, ij"k,l", mnop, 'q,rs't
 'q, rs't
 "'q,rs't, ij"k, l""

Output:

 MATCH 1 simple [0-4] `abcd`
 MATCH 2 simple [5-9] `efgh`
 MATCH 3 simple [10-15] ` ijkl`
 MATCH 4 simple [16-20] `abcd`
 MATCH 5 simple [21-26] ` efgh`
 MATCH 6 has_quotes [27-35] ` ij"k,l"` double_quotes [30-35] `"k,l"`
 MATCH 7 simple [36-41] ` mnop`
 MATCH 8 has_quotes [42-50] ` 'q,rs't` single_quotes [43-49] `'q,rs'`
 MATCH 9 has_quotes [51-59] `'q, rs't` single_quotes [51-58] `'q, rs'`
 MATCH 10 has_quotes [60-74] `"'q,rs't, ij"k` double_quotes [60-73] `"'q,rs't, ij"`
 MATCH 11 has_quotes [75-79] ` l""` double_quotes [77-79] `""`
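To reproduce a listing like this outside a regex tester, the named-group pattern can be run with String.prototype.matchAll. This is a minimal sketch that assumes an engine with named capture groups (ES2018 or later), so the ?<name> parts are kept; `classifyItems` and the shape of the objects it returns are made up for this example.

```javascript
// The named-group pattern from this answer, with the g flag for matchAll().
const namedPattern =
  /(?<has_quotes>(?:[^,"'\n]*(?:(?<double_quotes>"[^"\n]*")|(?<single_quotes>'[^'\n]*'))[^,"'\n]*)+)|(?<simple>[^,\n]+)/g;

// Hypothetical helper: list every match with its kind and [start, end) offsets.
function classifyItems(text) {
  return Array.from(text.matchAll(namedPattern), (m) => ({
    // has_quotes is undefined when the "simple" alternative matched.
    kind: m.groups.has_quotes !== undefined ? "has_quotes" : "simple",
    text: m[0],
    start: m.index,
    end: m.index + m[0].length,
  }));
}

// The four input lines, joined with newlines.
const sampleInput = [
  "abcd,efgh, ijkl",
  "abcd, efgh, ij\"k,l\", mnop, 'q,rs't",
  "'q, rs't",
  "\"'q,rs't, ij\"k, l\"\"",
].join("\n");

for (const item of classifyItems(sampleInput)) {
  console.log(`${item.kind} [${item.start}-${item.end}] \`${item.text}\``);
}
```

Items keep their leading whitespace, as in the listing; call trim() on `text` if that is not wanted.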