Regex lookahead discards a match

I am trying to make a regex that completely discards the lookahead.

\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* 

This is what it matches, and this is my regex101 test.

But when an email starts with -, _ or ., the regex should not match it at all, instead of just dropping the leading characters and matching the rest. Any ideas are welcome. I have been searching for the past half hour, but cannot figure out how to discard the whole email when it starts with one of these characters.

2 answers

You can use a word boundary next to @, together with a positive lookbehind that checks whether we are at the beginning of the string or right after a whitespace character, and then require that the first character is not in the unwanted set by using the negated class [^\s\-_.]:

 (?<=^|\s)[^\s\-_.]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*

See demo

Match List:

 support@github.com
 s.miller@mit.edu
 j.hopking@york.ac.uk
 steve.parker@soft.de
 info@company-hotels.org
 kiki@hotmail.co.uk
 no-reply@github.com
 s.peterson@mail.uu.net
 info-bg@software-software.software.academy

Additional usage notes and alternative notations

Note that it is best to use as few escapes as possible in a regular expression, so [^\s\-_.] can be written as [^\s_.-]: a hyphen at the end of a character class still denotes a literal hyphen, not a range. Also, if you plan to use the pattern in other regex engines, you may run into trouble with the alternation inside the lookbehind; in that case, replace (?<=\s|^) with the equivalent (?<!\S). See this regex:

 (?<!\S)[^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*
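
As a rough illustration of the (?<!\S) variant, here is a minimal Python sketch. Python's built-in re module is one of the engines that rejects a lookbehind whose alternatives differ in width, which is the kind of problem mentioned above. The sample text and the variable names are made up for this example.

```python
import re

# The alternation form is expected to be rejected by Python's re module,
# because ^ (width 0) and \s (width 1) make the lookbehind variable-width.
try:
    re.compile(r"(?<=^|\s)\w")
except re.error as exc:
    print("lookbehind rejected:", exc)

# The equivalent (?<!\S) form has a fixed width and compiles fine.
EMAIL = re.compile(
    r"(?<!\S)[^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*"
)

# Hypothetical input: three addresses start with an unwanted character.
text = (
    "support@github.com -bad@github.com _bad@mit.edu "
    ".bad@york.ac.uk no-reply@github.com"
)

# No capturing groups in the pattern, so findall returns whole matches.
matches = EMAIL.findall(text)
print(matches)  # ['support@github.com', 'no-reply@github.com']
```

The addresses that begin with -, _ or . are skipped entirely, because the only position where the lookbehind succeeds is the one where the first character is rejected by [^\s_.-].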

Last but not least, if you need to use it in JavaScript (before ES2018 added lookbehinds) or in other languages that do not support lookbehinds, replace (?<!\S) / (?<=\s|^) with a (non-)capturing group (\s|^), wrap the whole email part of the pattern in another pair of capturing parentheses, and use the language's means to grab that group's contents (group 2 in the pattern below, or group 1 if the first group is made non-capturing):

 (\s|^)([^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*)

See the regex demo.
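
The capturing-group workaround can be sketched as follows. The example is written in Python only to keep it easy to run; the same group logic should carry over to JavaScript or any other engine without lookbehind support. The sample text and names are assumptions made for illustration.

```python
import re

# Group 1 consumes the leading whitespace (or matches the start of the string);
# group 2 holds the email address itself.
PATTERN = re.compile(
    r"(\s|^)([^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*)"
)

# Hypothetical input: the middle address starts with a dot and must be skipped.
text = "kiki@hotmail.co.uk .skip@soft.de info@company-hotels.org"

# Read group 2 rather than the whole match, which may include a leading space.
emails = [m.group(2) for m in PATTERN.finditer(text)]
print(emails)  # ['kiki@hotmail.co.uk', 'info@company-hotels.org']
```

Note that the whole match (group 0) for the second address includes the space in front of it, which is why the address is read from its own group.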


I use this for multiple email addresses, separated by ';':

 ([A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4};)*

For one mail:

 [A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}
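
A small Python sketch of how these two patterns behave when used to validate a whole string. The use of re.fullmatch and the sample inputs are assumptions for illustration, not part of the answer above.

```python
import re

SINGLE = r"[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}"
# Every address in the list, including the last one, must end with ';'.
MULTI = "(" + SINGLE + ";)*"

single_ok = re.fullmatch(SINGLE, "s.miller@mit.edu") is not None
list_ok = re.fullmatch(MULTI, "a@b.com;c@d.org;") is not None
list_no_trailing = re.fullmatch(MULTI, "a@b.com;c@d.org") is not None

# {2,4} limits the last domain label to four letters.
long_tld = re.fullmatch(SINGLE, "info-bg@software-software.software.academy") is not None

# The local-part class contains '-', '_' and '.', so a leading '-' is accepted.
leading_hyphen = re.fullmatch(SINGLE, "-bad@github.com") is not None

print(single_ok, list_ok, list_no_trailing, long_tld, leading_hyphen)
```

Two things are worth keeping in mind with these patterns: the {2,4} quantifier rejects longer top-level domains such as .academy, and addresses that start with -, _ or . are still accepted, so they do not by themselves solve the problem in the question.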
