Validating the number of elements in a CSV string with a regular expression

I have a CSV string that I am trying to test with a regex to make sure that it has only N elements. I tried the following pattern (which looks for 2 elements):

/([^,]+){2}/ 

But this does not seem to work, I guess because the inner pattern is not greedy enough.

Any ideas? Ideally, it should work with both PHP's regular expression engines and JavaScript.

Update:

For technical reasons, I really want to do this with a regex rather than another solution. The CSV is not quoted, and the values will not contain commas, so that is not a problem.

 /([^,]*[,]{1}[^,]*){1}/ 

I have now found that this works, but it is still a little ugly and has problems matching a single element.

CSV looks like this:

 apples,bananas,pears,oranges,grapefruit 
+4
7 answers

Got it.

 /^([^,]+([,]{1}|$)){1}$/ 

Set the final {N} to the number of elements you want to require, or use a range such as {1,3}.
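As a rough sketch of how the repeat count could be made a parameter, the pattern can be built with the RegExp constructor. The helper name `matchesFieldCount` and its arguments are my own, not part of the answer. Note that, as written, this pattern also accepts a trailing comma after the last field.

```javascript
// Hypothetical helper: wraps the pattern above and makes the quantifier a parameter.
// Pass only min for an exact count, or min and max for a {min,max} range.
function matchesFieldCount(csv, min, max) {
  var quantifier = (max === undefined)
    ? '{' + min + '}'
    : '{' + min + ',' + max + '}';
  var re = new RegExp('^([^,]+([,]{1}|$))' + quantifier + '$');
  return re.test(csv);
}

matchesFieldCount('apples,bananas', 2);          // true
matchesFieldCount('apples,bananas,pears', 2);    // false
matchesFieldCount('apples,bananas,pears', 1, 3); // true
matchesFieldCount('apples,bananas,', 2);         // true (trailing comma is accepted)
```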

0

In PHP you will be much better off using this function:

http://www.php.net/manual/en/function.str-getcsv.php

It will deal with the likes of:

 a,"b,c" 

... which contains two elements, not three.

I do not know of an equivalent function for JavaScript.
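JavaScript's standard library has no CSV parser, so a small hand-written scanner is one option. The following is only a minimal sketch (the function name `countCsvFields` is hypothetical) that counts the fields on a single line, treating commas inside double quotes as part of the value. It does not handle line breaks between records and does not reject malformed quoting.

```javascript
// Hypothetical helper: counts fields in one CSV line, ignoring commas inside double quotes.
function countCsvFields(line) {
  var count = 1;        // a line with no commas still has one field
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line.charAt(i);
    if (ch === '"') {
      // A doubled quote ("") inside a quoted value toggles twice,
      // so the scanner ends up back in the quoted state.
      inQuotes = !inQuotes;
    } else if (ch === ',' && !inQuotes) {
      count++;
    }
  }
  return count;
}

countCsvFields('a,"b,c"'); // 2
countCsvFields('a,b,c');   // 3
```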

+5

Untested, because I don't know what your input looks like:

 /^([^,]+,){1}([^,]+$)/ 

This requires two fields (one comma, so there is no comma after the last field).
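To generalise this to N fields, the {1} would become {N-1}: N-1 fields that are each followed by a comma, then one final field with no comma after it. A minimal sketch, assuming unquoted, non-empty values as described in the question (the helper name `hasExactlyNFields` is made up for illustration):

```javascript
// Hypothetical helper: true when csv has exactly n non-empty, comma-separated fields.
function hasExactlyNFields(csv, n) {
  if (n < 1) return false; // the pattern always requires at least one field
  // (n - 1) groups of "field," followed by one last field and the end of the string
  var re = new RegExp('^(?:[^,]+,){' + (n - 1) + '}[^,]+$');
  return re.test(csv);
}

hasExactlyNFields('apples,bananas,pears,oranges,grapefruit', 5); // true
hasExactlyNFields('apples,bananas', 2);                          // true
hasExactlyNFields('apples,bananas,', 2);                         // false (trailing comma)
hasExactlyNFields('apples', 1);                                  // true
```

Unlike the `([,]{1}|$)` variant, this form rejects a trailing comma, because the last field is matched separately from the comma-terminated ones.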

+1

How about using the g (global) modifier so that match() returns every field, and then checking how many matches there are?

 var foobar = 'foo,bar',
     foobarbar = 'foo,bar,"bar"',
     foo = 'foo,',
     bar = 'bar';

 foo.match(/([^,]+)/g).length === 2;       //=> false
 bar.match(/([^,]+)/g).length === 2;       //=> false
 foobar.match(/([^,]+)/g).length === 2;    //=> true
 foobarbar.match(/([^,]+)/g).length === 2; //=> false
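One thing to watch with this approach: match() returns null rather than an empty array when nothing matches, so calling .length directly throws on an empty string. A small sketch with a guard (the function name `hasNFields` is hypothetical); note that empty fields such as the middle one in 'a,,b' are not counted by this pattern.

```javascript
// Hypothetical helper: counts non-empty fields with a global match and compares to n.
function hasNFields(csv, n) {
  // match() returns null when there are no matches, so fall back to an empty array
  var fields = csv.match(/[^,]+/g) || [];
  return fields.length === n;
}

hasNFields('foo,bar', 2); // true
hasNFields('foo,', 2);    // false
hasNFields('', 0);        // true (no exception thrown)
```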
+1

 var vals = "something,sthelse,anotherone,woohoo".split(','),
     maxlength = 4;
 var isValid = vals.length <= maxlength; // true when there are at most maxlength fields

should work in JS.

0

Depending on how the CSV is formatted, you can split it on /\",\"/ (i.e. double quote, comma, double quote) and take the length of the resulting array.
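For illustration only, and assuming every field is wrapped in double quotes with no whitespace around the separators and no `","` sequence inside a value, the split could look like this (the function name `countQuotedFields` is made up):

```javascript
// Hypothetical helper: counts fields in a fully double-quoted CSV line
// by splitting on the quote-comma-quote separator.
function countQuotedFields(line) {
  return line.split(/","/).length;
}

countQuotedFields('"a","b,c","d"'); // 3
```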

Regular expressions are not very good for parsing, so if the string is complex, you may need to parse it in another way.

0

Check out this answer.

Quote:

 re_valid = r"""
 # Validate a CSV string having single, double or un-quoted values.
 ^                                   # Anchor to start of string.
 \s*                                 # Allow whitespace before value.
 (?:                                 # Group for value alternatives.
   '[^'\\]*(?:\\[\S\s][^'\\]*)*'     # Either Single quoted string,
 | "[^"\\]*(?:\\[\S\s][^"\\]*)*"     # or Double quoted string,
 | [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*    # or Non-comma, non-quote stuff.
 )                                   # End group of value alternatives.
 \s*                                 # Allow whitespace after value.
 (?:                                 # Zero or more additional values
   ,                                 # Values separated by a comma.
   \s*                               # Allow whitespace before value.
   (?:                               # Group for value alternatives.
     '[^'\\]*(?:\\[\S\s][^'\\]*)*'   # Either Single quoted string,
   | "[^"\\]*(?:\\[\S\s][^"\\]*)*"   # or Double quoted string,
   | [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*  # or Non-comma, non-quote stuff.
   )                                 # End group of value alternatives.
   \s*                               # Allow whitespace after value.
 )*                                  # Zero or more additional values
 $                                   # Anchor to end of string.
 """

Or, in a usable single-line form (since JS cannot handle multi-line regular expression literals):

 var re_valid = /^\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*(?:,\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*)*$/; 

It can be called using RegExp.test():

 if (!re_valid.test(text)) return null; 

The first alternative matches valid single-quoted strings, the second matches valid double-quoted strings, and the third matches unquoted strings.
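To see the three alternatives in action, here is a small check using the same expression, copied unchanged from above. The sample inputs are my own.

```javascript
// Same validation regex as above, repeated so the snippet runs on its own.
var re_valid = /^\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*(?:,\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*)*$/;

re_valid.test("apples,bananas,pears"); // true: unquoted values
re_valid.test("a, 'b,c', \"d\"");      // true: single- and double-quoted values
re_valid.test('a,"unterminated');      // false: the double quote is never closed
```

Note that this only validates the string; it does not count the elements, so it would need to be combined with one of the counting approaches to answer the original question.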

If you remove the single-quote alternative, it is almost a 100% implementation of IETF RFC 4180 for CSV validation.

Note: it may be 100%, but I can't remember whether it can handle newline characters in values (I think that [\S\s] is a JavaScript-specific hack to match newline characters).

Note: this implementation is for JavaScript only; there is no guarantee that the regex source string will work in PHP.

If you plan to do anything non-trivial with CSV data, I suggest that you adopt an existing library. This gets pretty ugly if you are looking for an RFC-compliant implementation.

0
