Lookaheads explanation in this regex

Question

I understand regular expressions reasonably well, but I don't use them often enough to be an expert. I came across a regex that I use to check password strength, but it contains some regex concepts that I am not familiar with. The regex is:

^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$

In plain English, it means that the string must contain at least one lowercase character, one uppercase character and one digit, and that the string must be at least six characters long. Can someone break this down for me and explain how the pattern describes that rule? I see the start-of-string anchor ^ and the end-of-string anchor $, three groups with lookaheads, the match-any-character . and the {6,} repetition.

Thanks to any regular expression guru who can help me figure this out.

+7

regex

Rich miller Aug 6 '09 at 20:17

5 answers

A lookahead group does not consume input. As a result, the same characters are actually matched by each of the lookahead groups.

You can think of it this way: scan over anything (.*) until you find a digit (\d). If that succeeds, go back to where this group started (that is the lookahead concept). Now scan over anything (.*) until you find a lowercase letter. Repeat for an uppercase letter. Finally, match any 6 or more characters.
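To see the "go back to the beginning" behaviour directly, here is a minimal Python sketch (Python's `re` module is my choice for illustration, and the sample string is made up). It compares where the match position ends up with and without the lookahead wrapper:

```python
import re

text = "abc1DEF"  # made-up sample input

# A plain pattern consumes what it matches: the match ends just after the digit.
consuming = re.match(r".*\d", text)
print(consuming.end())          # 4

# The same pattern inside a lookahead is only a test: nothing is consumed,
# so the match is empty and the position is still 0.
lookahead = re.match(r"(?=.*\d)", text)
print(lookahead.end())          # 0
print(repr(lookahead.group()))  # ''

# Stacking three lookaheads still leaves the position at 0,
# because each one starts again from the same place.
stacked = re.match(r"(?=.*\d)(?=.*[a-z])(?=.*[A-Z])", text)
print(stacked.end())            # 0
```

Because every lookahead starts from position 0, the order of the three groups does not affect whether the string is accepted.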

+5

Sinan taifour Aug 6 '09 at 20:25

To break it down completely:

    ^            -- Match beginning of line
    (?=.*\d)     -- The following string contains a number
    (?=.*[a-z])  -- The following string contains a lowercase letter
    (?=.*[A-Z])  -- The following string contains an uppercase letter
    .{6,}        -- Match at least 6, as many as desired of any character
    $            -- Match end of line
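A quick way to check this reading is to run the pattern against a few candidates, each of which breaks exactly one rule. This is only a sketch in Python; `is_strong` is a hypothetical helper name and the sample passwords are made up:

```python
import re

# The pattern from the question.
PASSWORD_RE = re.compile(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$")

def is_strong(candidate: str) -> bool:
    """Hypothetical helper: True if the candidate satisfies the pattern."""
    return PASSWORD_RE.match(candidate) is not None

print(is_strong("abc123"))  # False: no uppercase letter
print(is_strong("ABC123"))  # False: no lowercase letter
print(is_strong("abcDEF"))  # False: no digit
print(is_strong("aB3"))     # False: fewer than 6 characters
print(is_strong("abC123"))  # True: all four conditions hold
```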

+4

Sean vieira Aug 6 '09 at 20:29

I went and checked how it would look when using Perl:

    perl -Mre=debug -E'q[ abc 345 DEF ]=~/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$/'

    Compiling REx "^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$"
    synthetic stclass "ANYOF[\0-\11\13-\377{unicode_all}]".
    Final program:
       1: BOL (2)
       2: IFMATCH[0] (9)
       4: STAR (6)
       5: REG_ANY (0)
       6: DIGIT (7)
       7: SUCCEED (0)
       8: TAIL (9)
       9: IFMATCH[0] (26)
      11: STAR (13)
      12: REG_ANY (0)
      13: ANYOF[a-z] (24)
      24: SUCCEED (0)
      25: TAIL (26)
      26: IFMATCH[0] (43)
      28: STAR (30)
      29: REG_ANY (0)
      30: ANYOF[A-Z] (41)
      41: SUCCEED (0)
      42: TAIL (43)
      43: CURLY {6,32767} (46)
      45: REG_ANY (0)
      46: EOL (47)
      47: END (0)
    floating ""$ at 6..2147483647 (checking floating) stclass ANYOF[\0-\11\13-\377{unicode_all}] anchored(BOL) minlen 6
    Guessing start of match in sv for REx "^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$" against " abc 345 DEF "
    Found floating substr ""$ at offset 16...
    start_shift: 6 check_at: 16 s: 0 endpos: 11
    Does not contradict STCLASS...
    Guessed: match at offset 0
    Matching REx "^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$" against " abc 345 DEF "
       0 <> < abc 345> | 1:BOL(2)
       0 <> < abc 345> | 2:IFMATCH[0](9)
       0 <> < abc 345> | 4: STAR(6)
                         REG_ANY can match 16 times out of 2147483647...
      16 <c 345 DEF > <> | 6: DIGIT(7) # failed...
      15 <c 345 DEF > < > | 6: DIGIT(7) # failed...
      14 <c 345 DEF> < > | 6: DIGIT(7) # failed...
      13 <c 345 DE> <F > | 6: DIGIT(7) # failed...
      12 <c 345 D> <EF > | 6: DIGIT(7) # failed...
      11 <c 345 > <DEF > | 6: DIGIT(7) # failed...
      10 <c 345> < DEF > | 6: DIGIT(7) # failed...
       9 <c 34> <5 DEF > | 6: DIGIT(7)
      10 <c 345> < DEF > | 7: SUCCEED(0)
                         subpattern success...
       0 <> < abc 345> | 9:IFMATCH[0](26)
       0 <> < abc 345> | 11: STAR(13)
                         REG_ANY can match 16 times out of 2147483647...
      16 <c 345 DEF > <> | 13: ANYOF[a-z](24) # failed...
      15 <c 345 DEF > < > | 13: ANYOF[a-z](24) # failed...
      14 <c 345 DEF> < > | 13: ANYOF[a-z](24) # failed...
      13 <c 345 DE> <F > | 13: ANYOF[a-z](24) # failed...
      12 <c 345 D> <EF > | 13: ANYOF[a-z](24) # failed...
      11 <c 345 > <DEF > | 13: ANYOF[a-z](24) # failed...
      10 <c 345> < DEF > | 13: ANYOF[a-z](24) # failed...
       9 <c 34> <5 DEF > | 13: ANYOF[a-z](24) # failed...
       8 <bc 3> <45 DEF > | 13: ANYOF[a-z](24) # failed...
       7 <abc > <345 DEF > | 13: ANYOF[a-z](24) # failed...
       6 < abc > < 345 DEF > | 13: ANYOF[a-z](24) # failed...
       5 < abc> < 345 DEF > | 13: ANYOF[a-z](24) # failed...
       4 < ab> <c 345 DEF> | 13: ANYOF[a-z](24)
       5 < abc> < 345 DEF > | 24: SUCCEED(0)
                         subpattern success...
       0 <> < abc 345> | 26:IFMATCH[0](43)
       0 <> < abc 345> | 28: STAR(30)
                         REG_ANY can match 16 times out of 2147483647...
      16 <c 345 DEF > <> | 30: ANYOF[A-Z](41) # failed...
      15 <c 345 DEF > < > | 30: ANYOF[A-Z](41) # failed...
      14 <c 345 DEF> < > | 30: ANYOF[A-Z](41) # failed...
      13 <c 345 DE> <F > | 30: ANYOF[A-Z](41)
      14 <c 345 DEF> < > | 41: SUCCEED(0)
                         subpattern success...
       0 <> < abc 345> | 43:CURLY {6,32767}(46)
                         REG_ANY can match 16 times out of 2147483647...
      16 <c 345 DEF > <> | 46: EOL(47)
      16 <c 345 DEF > <> | 47: END(0)
    Match successful!
    Freeing REx: "^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$"

I modified the output slightly.
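The pattern visible in the trace (the greedy .* first runs to the end of the string, then gives characters back one at a time until the next element matches) can be observed without a debugger. Below is a rough Python sketch, not Perl, using a capture group to show where .* ends up; the sample string is my own and is not the exact input from the trace:

```python
import re

text = "abc 345 DEF"  # made-up sample, 11 characters

# Greedy .* grabs everything, then backs off until \d can match.
# It therefore stops just before the LAST digit, not the first.
digit = re.match(r"(.*)\d", text)
print(digit.end(1))    # 6  (the '5' sits at index 6)
print(digit.group(0))  # abc 345

# Same idea for the lowercase check: .* backs off to the last lowercase letter.
lower = re.match(r"(.*)[a-z]", text)
print(lower.end(1))    # 2  (the 'c' sits at index 2)
```

This matches what the trace shows: each lookahead succeeds at the last digit, last lowercase letter and last uppercase letter respectively, and then the engine returns to offset 0.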

+1

Brad gilbert Aug 6 '09 at 20:41

The lookahead assertions are used to ensure that at least one digit, one lowercase letter, and one uppercase letter are present in the string.

-1

Gumbo Aug 6 '09 at 20:20

Richiehindle · Accepted Answer · 2009-08-06T20:23:56+0000

Under normal circumstances, a piece of a regular expression matches part of the input string and consumes that part of the string. The next piece of the expression then matches the next part of the string, and so on.

Lookahead assertions do not consume any of the string, so your three lookahead assertions:

(?=.*\d)
(?=.*[a-z])
(?=.*[A-Z])

each mean "This pattern (anything followed by a digit, a lowercase letter, or an uppercase letter, respectively) must appear somewhere in the string," but they do not move the current match position forward, so the rest of the expression:

.{6,}

(which means "six or more characters") must still match the entire input string, because it is the only part between the ^ and $ anchors that consumes anything.
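One way to convince yourself of this is to write the four conditions as separate checks and compare them with the single regex. The following is a sketch in Python under the assumption of single-line input; both function names are hypothetical and the sample strings are made up:

```python
import re

def check_with_lookaheads(s: str) -> bool:
    # The pattern from the question, applied in one go.
    return re.match(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,}$", s) is not None

def check_step_by_step(s: str) -> bool:
    # Each lookahead becomes an independent "appears somewhere" test;
    # only the final .{6,} has to cover the whole string.
    return (
        re.search(r"\d", s) is not None
        and re.search(r"[a-z]", s) is not None
        and re.search(r"[A-Z]", s) is not None
        and re.fullmatch(r".{6,}", s) is not None
    )

samples = ["abC123", "abc123", "ABC123", "abcDEF", "aB3", "1aB___"]
for s in samples:
    print(s, check_with_lookaheads(s), check_step_by_step(s))
```

For these samples the two functions agree on every string. The lookahead version simply packs the independent tests into one expression, which is handy when an API accepts only a single pattern.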
