JavaScript regex to split a comma-separated string into words

Question

I am trying to split a comma-separated string using a regex.

    var a = 'hi,mr.007,bond,12:25PM'; // there are no white spaces between commas
    var b = /(\S+?),(?=\S|$)/g;
    b.exec(a); // does not catch the last item.

Any suggestions for catching all the items?

+4

javascript regex

kurro Feb 28 '13 at 19:19

3 answers

Why not just use .split?

    > 'hi,mr.007,bond,12:25PM'.split(',')
    ["hi", "mr.007", "bond", "12:25PM"]

If you need to use regex for any reason:

    > str.match(/(\S+?)(?:,|$)/g)
    ["hi,", "mr.007,", "bond,", "12:25PM"]

(Note that every match except the last includes its trailing comma.)
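
If the trailing commas are unwanted, one option is to loop with exec and read the first capture group instead of the whole match. This is only a minimal sketch that reuses the pattern from this answer; the helper name splitByRegex is made up for illustration.

```javascript
// Hypothetical helper: collects capture group 1 from every match,
// so the comma consumed by (?:,|$) is left out of the result.
function splitByRegex(str) {
  var re = /(\S+?)(?:,|$)/g;
  var items = [];
  var m;
  // With the g flag, each exec call resumes at re.lastIndex.
  while ((m = re.exec(str)) !== null) {
    items.push(m[1]);
  }
  return items;
}

console.log(splitByRegex('hi,mr.007,bond,12:25PM'));
// [ 'hi', 'mr.007', 'bond', '12:25PM' ]
```

Every match here consumes at least one character, so the loop cannot stall on a zero-length match.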

+5

Explosion pills Feb 28 '13 at 19:20

If you are parsing a CSV file, some of your values may be enclosed in double quotes, so you might need something more complex. For instance, in Java:

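    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    Pattern splitCommas = Pattern.compile("(?:^|,)((?:[^\",]|\"[^\"]*\")*)");
    Matcher m = splitCommas.matcher("11,=\"12,345\",ABC,,JKL");
    while (m.find()) {
        // group(1) is the column value without the leading comma
        System.out.println(m.group(1));
    }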

or in Groovy:

    java.util.regex.Pattern.compile('(?:^|,)((?:[^",]|"[^"]*")*)')
        .matcher("11,=\"12,345\",ABC,,JKL")
        .iterator()
        .collect { it[1] }

This code handles:

- empty lines (no values or commas on them)
- empty columns, including an empty last column
- values enclosed in double quotes, including commas inside the quotes

It does not handle two double quotes used to escape a double quote.

The pattern consists of:

(?:^|,) matches the start of the line or the comma that follows the previous column, without capturing it
((?:[^",]|"[^"]*")*) matches the value of the column and is built up as follows:
- two alternatives, each matching one piece of the value:
  - [^",] is a single character that is neither a comma nor a double quote
  - "[^"]*" is a double quote, then zero or more non-quote characters, then a closing double quote
- the two are combined with | inside a non-capturing group: (?:[^",]|"[^"]*")
- * repeats that group any number of times: (?:[^",]|"[^"]*")*
- a capturing group around the whole thing yields the column value: ((?:[^",]|"[^"]*")*)

Handling escaped double quotes is left as an exercise for the reader.
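
Since the question is about JavaScript, here is a rough port of the same pattern, with one possible way of dealing with doubled quotes. Treat it as a sketch under assumptions rather than a full CSV parser: the helper names splitCsvLine and unquoteField are made up for illustration, and unquoteField only touches values that are quoted from the first character to the last.

```javascript
// JavaScript port of the pattern from this answer (same regex, g flag added).
var CSV_FIELD = /(?:^|,)((?:[^",]|"[^"]*")*)/g;

// Hypothetical helper: returns the raw column values of one CSV line.
function splitCsvLine(line) {
  // matchAll steps past zero-length matches, so empty columns are safe.
  return Array.from(line.matchAll(CSV_FIELD), function (m) {
    return m[1];
  });
}

// Hypothetical helper: strips the outer quotes of a fully quoted value
// and turns each doubled quote ("") back into a single one.
function unquoteField(value) {
  var last = value.length - 1;
  if (value.length >= 2 && value[0] === '"' && value[last] === '"') {
    return value.slice(1, -1).replace(/""/g, '"');
  }
  return value;
}

console.log(splitCsvLine('11,="12,345",ABC,,JKL'));
// [ '11', '="12,345"', 'ABC', '', 'JKL' ]
console.log(splitCsvLine('a,"say ""hi"", ok",b').map(unquoteField));
// [ 'a', 'say "hi", ok', 'b' ]
```

A doubled quote is matched as two adjacent quoted pieces, so the pattern already keeps such a value in one column; only the unescaping step has to be added afterwards.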

0

Piran Jun 19 '18 at 15:12

iamnotmaynard · Accepted Answer · 2013-02-28T19:24:46+0000

Use a negated character class:

    /([^,]+)/g

will match runs of characters that are not commas.

    > a = 'hi,mr.007,bond,12:25PM'
    "hi,mr.007,bond,12:25PM"
    > b = /([^,]+)/g
    /([^,]+)/g
    > a.match(b)
    ["hi", "mr.007", "bond", "12:25PM"]
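
One difference from .split is worth keeping in mind: because [^,]+ needs at least one character, empty items between two commas are skipped. A small comparison (the sample string withGap is made up for illustration):

```javascript
var withGap = 'hi,,bond';

// The negated class skips the empty item between the two commas.
console.log(withGap.match(/[^,]+/g)); // [ 'hi', 'bond' ]

// split keeps it as an empty string.
console.log(withGap.split(','));      // [ 'hi', '', 'bond' ]

// match returns null rather than an empty array when nothing matches.
console.log(''.match(/[^,]+/g));      // null
```

If empty items matter, .split(',') is probably the simpler choice.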
