How can I fix this RegEx to optionally capture the file extension?
I am trying to match a string with an optional component, but something seems to be wrong. (The strings being matched are taken from a printer log.)
My RegEx (.NET Flavor) is as follows:
.*(header_\d{10,11}_).*(_.*_\d{8}).*(\.\w{3,4}).*
I expect this to match strings like:
str1 = "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]"
str2 = "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt"
str3 = "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]"
Where capture groups return something like:
$1 = header_0000000602_
$2 = _mc2e1nrobr1a3s55niyrrqvy_20081212
$3 = .doc
Here $3 may be empty if no file extension is found. $3 is the optional part, as you can see in str3 above.
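To make the starting point concrete, here is a minimal reproduction of the original pattern against the three sample strings. It is only a sketch: it uses Python's `re` module rather than .NET, on the assumption that both engines treat these greedy constructs the same way, and the names `PATTERN`, `samples` and `capture` are made up for the illustration.

```python
import re

# Pattern copied from the question; group 3 (the extension) is mandatory here.
PATTERN = re.compile(r".*(header_\d{10,11}_).*(_.*_\d{8}).*(\.\w{3,4}).*")

samples = {
    "str1": "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]",
    "str2": "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt",
    "str3": "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]",
}

def capture(text):
    """Return the three capture groups as a tuple, or None if the pattern does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

for name, text in samples.items():
    print(name, capture(text))
```

With the mandatory extension group, str1 yields the three groups listed above, while str3 (which contains no dot at all) does not match.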
If I add "?" to the end of the third capture group, "(\.\w{3,4})?", the RegEx no longer captures $3 for any string. If I add "+" instead, "(\.\w{3,4})+", the RegEx no longer matches str3 at all, which is to be expected.
I feel like using "?" at the end of the third capture group is the appropriate thing to do, but it does not work as I expect. I'm probably being too naive with the ".*" sections that I use to skip over parts of the string.
Doesn't work as expected:
.*(header_\d*_).*(_.*_.{8}).*(\.\w{3,4})?.*
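The symptom can be reproduced with the short sketch below. It again uses Python's `re` as a stand-in for .NET (an assumption), and `OPTIONAL` and `extension` are hypothetical names. A likely explanation, consistent with the suspicion about ".*": the greedy ".*" in front of the optional group consumes the rest of the string, so the optional group is satisfied by matching nothing and the engine has no reason to backtrack.

```python
import re

# Second pattern from the question, with the extension group made optional.
OPTIONAL = re.compile(r".*(header_\d*_).*(_.*_.{8}).*(\.\w{3,4})?.*")

str1 = "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]"
str2 = "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt"
str3 = "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]"

def extension(text):
    """Return (matched, group 3); group 3 is None when the optional group did not participate."""
    m = OPTIONAL.match(text)
    return (m is not None, m.group(3) if m else None)

for text in (str1, str2, str3):
    print(extension(text))
```

All three strings match, but group 3 comes back unset every time, even for str1 and str2, which do contain an extension.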