I have the following input text:
@"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38"
I would like to parse out the values written with the syntax @name = value as name/value pairs. Parsing the line above should result in the following named captures:
name:"foo" value:"bar"
name:"name" value:"John \""The Anonymous One\"" Doe"
name:"age" value:"38"
I tried the following regex, which almost gets me there:
@"(?:(?<=\s)|^)@(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>[A-Za-z0-9_-]+|(?="").+?(?=(?<!\\)""))"
The main problem is that it captures the opening quote in "John \""The Anonymous One\"" Doe". I feel it should be a lookbehind instead of a lookahead, but that does not seem to work at all.
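To illustrate the behaviour described above, here is a small sketch using a shortened, hypothetical input (`"quoted" rest`) rather than the full line. It suggests that the lookahead is zero-width, so `.+?` starts matching at the quote itself, and that swapping in a lookbehind fails because the character before that position is `=`, not a quote. Consuming the quote outside the group is one possible alternative; the variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical shortened input: "quoted" rest
string text = @"""quoted"" rest";

// Lookahead (?=") asserts a quote but does not consume it,
// so .+? begins at the quote and the match includes it.
string withLookahead = Regex.Match(text, @"(?="").+?(?=(?<!\\)"")").Value;

// Lookbehind (?<=") right after '=' looks at the '=' itself,
// so the pattern cannot match at that position.
bool lookbehindMatches = Regex.IsMatch(@"=""quoted""", @"=(?<="").+?(?=(?<!\\)"")");

// Consuming the quotes outside the named group keeps them out of the capture.
string consumed = Regex.Match(text, @"""(?<value>.+?)(?<!\\)""").Groups["value"].Value;

Console.WriteLine(withLookahead);      // includes the opening quote
Console.WriteLine(lookbehindMatches);  // False
Console.WriteLine(consumed);           // quoted
```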
Here are the rules the regex needs to express:
The name must begin with a letter and may contain any letter, digit, underscore or hyphen.
An unquoted value must have at least one character and may contain any letter, digit, underscore or hyphen.
A quoted value can contain any character, including whitespace and escaped quotation marks.
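The rules above can be sketched as a single .NET pattern. This is one possible reading of them, not a definitive solution: it assumes the surrounding quotes should be excluded from the captured value, that a backslash escapes the next character inside quotes, and it relies on .NET allowing the same group name (`value`) in both alternatives. The `pairs` list is introduced here purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Input from the question (in a verbatim string, "" is one literal quote).
string input = @"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38";

string pattern =
    @"(?<!\S)@" +                              // '@' at the start or after whitespace
    @"(?<name>[A-Za-z][A-Za-z0-9_-]*)" +       // name: a letter, then letters/digits/_/-
    @"\s*=\s*" +                               // '=' with optional whitespace
    @"(?:""(?<value>(?:\\.|[^""\\])*)""" +     // quoted: escaped chars or non-quote chars
    @"|(?<value>[A-Za-z0-9_-]+))";             // unquoted: at least one allowed char

var pairs = new List<(string Name, string Value)>();

// Singleline lets '.' in the escape branch match a newline as well.
foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Singleline))
{
    pairs.Add((m.Groups["name"].Value, m.Groups["value"].Value));
}

foreach (var (name, value) in pairs)
{
    Console.WriteLine($"name:{name} value:{value}");
}
```

With the input line from the question, this should print three pairs: `foo` / `bar`, `name` / `John \"The Anonymous One\" Doe` (escaped quotes kept, outer quotes dropped), and `age` / `38`.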
Edit:
Here is the explanation from regex101.com:
(?:(?<=\s)|^)@(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)"))

(?:(?<=\s)|^) Non-capturing group
@ matches the character @ literally
(?<name>\w+[A-Za-z0-9_-]+?) Named capturing group name
\s* match any white space character [\r\n\t\f ]
= matches the character = literally
\s* match any white space character [\r\n\t\f ]
    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)")) Named capturing group value
    1st Alternative: [A-Za-z0-9_-]+
        [A-Za-z0-9_-]+ match a single character present in the list below
            Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
            A-Z a single character in the range between A and Z (case sensitive)
            a-z a single character in the range between a and z (case sensitive)
            0-9 a single character in the range between 0 and 9
            _- a single character in the list _- literally
    2nd Alternative: (?=").+?(?=(?<!\\)")
        (?=") Positive Lookahead - Assert that the regex below can be matched
            " matches the characters " literally
        .+? matches any character (except newline)
            Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
        (?=(?<!\\)") Positive Lookahead - Assert that the regex below can be matched
            (?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below
                \\ matches the character \ literally
            " matches the characters " literally