Check out this answer.
Quote:
re_valid = r"""
# Validate a CSV string having single, double or un-quoted values.
^                                   # Anchor to start of string.
\s*                                 # Allow whitespace before value.
(?:                                 # Group for value alternatives.
  '[^'\\]*(?:\\[\S\s][^'\\]*)*'     # Either Single quoted string,
| "[^"\\]*(?:\\[\S\s][^"\\]*)*"     # or Double quoted string,
| [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*    # or Non-comma, non-quote stuff.
)                                   # End group of value alternatives.
\s*                                 # Allow whitespace after value.
(?:                                 # Zero or more additional values
  ,                                 # Values separated by a comma.
  \s*                               # Allow whitespace before value.
  (?:                               # Group for value alternatives.
    '[^'\\]*(?:\\[\S\s][^'\\]*)*'   # Either Single quoted string,
  | "[^"\\]*(?:\\[\S\s][^"\\]*)*"   # or Double quoted string,
  | [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*  # or Non-comma, non-quote stuff.
  )                                 # End group of value alternatives.
  \s*                               # Allow whitespace after value.
)*                                  # Zero or more additional values
$                                   # Anchor to end of string.
"""
Or in usable form (since JS cannot handle multi-line regular expression strings):
var re_valid = /^\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*(?:,\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*)*$/;
It can be called using RegExp.prototype.test():
if (!re_valid.test(text)) return null;
The first alternative matches valid single-quoted strings, the second matches valid double-quoted strings, and the third matches unquoted values.
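To make the three alternatives concrete, here is a minimal sketch that runs the validator against a few sample lines. The helper name `isValidCsvLine` and the sample strings are made up for illustration; the pattern itself is copied unchanged from `re_valid` so the snippet runs on its own.

```javascript
// Same pattern as re_valid above, repeated so this snippet is self-contained.
const reValid = /^\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*(?:,\s*(?:'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,'"\s\\]*(?:\s+[^,'"\s\\]+)*)\s*)*$/;

// Hypothetical helper (not part of the quoted answer).
function isValidCsvLine(text) {
  return reValid.test(text);
}

const samples = [
  "a, b, c",                                         // unquoted values
  "'single quoted', \"double quoted\", bare words",  // all three alternatives
  "'it\\'s escaped', 42",                            // backslash-escaped quote
  "'unterminated, value",                            // missing closing quote
  "bare \"quote\" inside",                           // quote inside an unquoted value
];

for (const s of samples) {
  console.log(JSON.stringify(s), isValidCsvLine(s));
}
```

The first three samples should pass and the last two should be rejected. Note that the empty string also passes, because the unquoted alternative can match nothing at all.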
If you remove the single-quote alternatives, it is an almost 100% implementation of a working IETF RFC 4180 CSV validator.
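As a rough sketch of what "removing the single-quote matches" could look like, the variant below drops the single-quoted alternative. It also treats apostrophes as ordinary characters in unquoted values, which is my assumption rather than something the quoted answer states. The names `value` and `reDoubleOnly` are hypothetical.

```javascript
// One value: either a double-quoted string or unquoted text.
// Assumption: apostrophes are allowed in unquoted text once single quoting is gone.
const value = String.raw`(?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|[^,"\s\\]*(?:\s+[^,"\s\\]+)*)`;

// A value, then zero or more comma-separated values, anchored at both ends.
const reDoubleOnly = new RegExp(String.raw`^\s*${value}\s*(?:,\s*${value}\s*)*$`);

console.log(reDoubleOnly.test('"a", b'));      // true
console.log(reDoubleOnly.test("it's, fine"));  // true
console.log(reDoubleOnly.test('"a ""b"""'));   // false
```

The last case shows one reason the match with the RFC is only approximate: RFC 4180 escapes a double quote inside a quoted field by doubling it, whereas this pattern expects a backslash escape.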
Note: It may be 100%, but I can't remember whether it handles newline characters in values (I think the [\S\s] is a JavaScript-specific hack for matching newline characters).
Note: This implementation is for JavaScript only; there is no guarantee that the regex source string will work in PHP.
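The newline question can be checked directly. The sketch below isolates the double-quoted alternative (the names `quotedWithClass` and `quotedWithDot` are mine) and compares `[\S\s]` with a plain `.`, which does not match a newline in JavaScript unless the `s` flag is set.

```javascript
// Double-quoted alternative from re_valid, anchored so it can be tested alone.
const quotedWithClass = /^"[^"\\]*(?:\\[\S\s][^"\\]*)*"$/;
// Same pattern with "." in place of [\S\s], for comparison only.
const quotedWithDot = /^"[^"\\]*(?:\\.[^"\\]*)*"$/;

// A raw newline inside the quotes is consumed by the negated class [^"\\].
const rawNewline = "\"line one\nline two\"";
// A backslash followed by a newline is where [\S\s] matters.
const escapedNewline = "\"a\\\nb\"";

console.log(quotedWithClass.test(rawNewline));     // true
console.log(quotedWithClass.test(escapedNewline)); // true
console.log(quotedWithDot.test(escapedNewline));   // false
```

So in this quick check, quoted values containing newlines do pass; `[\S\s]` is only needed for the character that follows a backslash.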
If you plan to do anything non-trivial with CSV data, I suggest you use an existing library. It gets pretty ugly if you're looking for an RFC-compliant implementation.