Bash Compliance Email Regex

I am trying to match some email addresses using regular expressions in bash. Currently received expression

"^[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+(\.[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+)*@([a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\$" 

Which successfully matches all the email addresses I need, but when I try to add the "To:" field, I cannot get any matches, and I'm not sure why. This is my code with the To field.

 "^To:\s[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+(\.[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+)*@([a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\$" 

Which AFAIK should go well with "To: bob@bob.co.uk ", but not :( Any tips?

Code example

 Reply-To: " service@paypal.com " < service@paypal.com > To: bob@bob.co.uk Date: Mon, 21 Jun 2012 21:34:10 -0300 

Code used to find a file and add to an array

 regex="^[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+(\.[a-zA-Z0-9!#\$%&'\*\+/=?^_\`{|}~-]+)*@([a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\$" for i in $(cat mailbox.mbx); do if [[ $i =~ $regex ]]; then echo $i sortarray[$index]=$i index=$(($index+1)) fi done 
+4
source share
2 answers

bash regexes do not understand perl-ish \s . You must use posix-ish [[:space:]] . Also you should add a quantifier there

I see that you have bindings in $regex : have you disabled?

For massive regular expressions like this, I like to create them in parts:

 char='[[:alnum:]!#\$%&'\''\*\+/=?^_\`{|}~-]' name_part="${char}+(\.${char}+)*" domain="([[:alnum:]]([[:alnum:]-]*[[:alnum:]])?\.)+[[:alnum:]]([[:alnum:]-]*[[:alnum:]])?" begin='(^|[[:space:]])' end='($|[[:space:]])' # include capturing parentheses, # these are the ** 2nd ** set of parentheses (there a pair in $begin) re_email="${begin}(${name_part}@${domain})${end}" line="To: joe.smith@example.com " [[ $line =~ $re_email ]] && echo ${BASH_REMATCH[2]} # prints: joe.smith@example.com 

Of course, email addresses are surprisingly complex - http://www.w3.org/Protocols/rfc822/#z8 - and comments and spaces should be allowed anywhere. In fact, (hi there) "My First Name".lastname (another comment) @ domain.(really)invalid should be considered a valid address. There is a Perl Email :: Address module that generates this regular expression:

 $ perl -MEmail::Address -E 'say $Email::Address::addr_spec' (?-xism:(?-xism:(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*(?-xism:[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+)*)(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*"(?-xism:(?-xism:[^\\"])|(?-xism:\\(?-xism:[^\x0A\x0D])))+"(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*))\@(?-xism:(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*(?-xism:[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+)*)(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*\[(?:\s*(?-xism:(?-xism:[^\[\]\\])|(?-xism:\\(?-xism:[^\x0A\x0D]))))*\s*\](?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|))*\s*\)\s*)))*\s*\)\s*)|\s+)*))) 
+3
source

This regular expression should match the desired line:

 "^To: ( .+@. +)$" 

Email is stored at $1

+1
source

All Articles