Why does this RegEx work the way I want it to?

I have a RegEx that works for me, but I don't know WHY it works for me. I will explain.

RegEx: \s*<in.*="(<?.*?>)"\s*/>\s* 


The text it matches (it also matches the whitespace before and after the input tag):

 <td class="style9"> <input name="guarantor4" id="guarantor4" size="50" type="text" tabindex="10" value="<?php echo $data[guarantor4]; ?>" /> </td> </tr> 


The part I don't understand:

 <in.*="

As I understand it, this should only match up to the first =" so it should only find <input name=" but it actually finds: <input name="guarantor4" id="guarantor4" size="50" type="text" tabindex="10" value=" which happens to be what I was trying to do.

What am I misunderstanding about this RegEx?

+4
4 answers

It looks like you are using greedy matching.

A greedy match says: "Eat as much as you can while still making this work."

Try

 <in[^=]*= 

For starters, this stops the pattern from matching "=" as part of the ".*".
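
A minimal sketch of the difference, using Python's re module (my choice for illustration; the question does not say which regex engine is in use) and the sample line from the question:

```python
import re

# Sample line from the question, split across two string literals for readability.
html = ('<td class="style9"> <input name="guarantor4" id="guarantor4" size="50" '
        'type="text" tabindex="10" value="<?php echo $data[guarantor4]; ?>" /> </td> </tr>')

# Greedy: .* runs to the end of the line, then backs up to the LAST ="
greedy = re.search(r'<in.*="', html).group(0)

# Negated class: [^=]* cannot cross an "=", so it stops at the FIRST one
negated = re.search(r'<in[^=]*=', html).group(0)

print(greedy)   # <input name="guarantor4" ... tabindex="10" value="
print(negated)  # <input name=
```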

But for the future, you can read up on

 .*? 

and

 .+? 

which stop at the first possible position that matches instead of the last.

Using the non-greedy syntax would be better if you were trying to stop only when you saw TWO characters,

i.e.:

 <in.*?=id 

which would stop at the first "=id", regardless of whether any other "=" appears before it.
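
To illustrate why the non-greedy form suits a multi-character stop sequence, here is a small sketch in Python. The tag string is hypothetical (it is not from the question); it contains a lone "=" before the first =" so that the three approaches give different results:

```python
import re

# Hypothetical tag: a bare "=" (a=1) appears before the first ="
tag = '<input a=1 b="x" c="y">'

lazy = re.search(r'<in.*?="', tag)       # stops at the first ="
negated = re.search(r'<in[^=]*="', tag)  # cannot get past the bare "="
greedy = re.search(r'<in.*="', tag)      # runs on to the last ="

print(lazy.group(0))    # <input a=1 b="
print(negated)          # None
print(greedy.group(0))  # <input a=1 b="x" c="
```

The negated class fails outright here because it is forced to stop at the first "=", which is not followed by a quotation mark.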

+8

.* is greedy. You want .*? to match only up to the first = .

+7

.* is greedy, so it will match up to the last =. If you want it to be non-greedy, add a question mark, for example: .*?

+4

As I understand it, this should only find up to the first =" as in it should only find <input name="

You don't say what language you are writing in, but almost all regular expression systems are "greedy matchers", meaning they match the longest possible substring of the input. In your case, this means everything from the beginning of the input tag to the last equals-sign-and-quotation-mark sequence that still lets the rest of the pattern match.

Most regex systems have a way to indicate that a pattern should match the shortest possible substring rather than the longest: a "non-greedy match".
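
As a rough check of this explanation, the sketch below runs the full pattern from the question against the sample line, again assuming Python's re module. It also tries a variant with the first .* made non-greedy (the variant is mine, not the asker's) to show how the captured group changes:

```python
import re

# Sample line from the question.
html = ('<td class="style9"> <input name="guarantor4" id="guarantor4" size="50" '
        'type="text" tabindex="10" value="<?php echo $data[guarantor4]; ?>" /> </td> </tr>')

# Original pattern: the first .* is greedy and backs up to the last ="
original = re.search(r'\s*<in.*="(<?.*?>)"\s*/>\s*', html)

# Variant for comparison: the first .* made non-greedy, so it stops at name="
lazy_variant = re.search(r'\s*<in.*?="(<?.*?>)"\s*/>\s*', html)

print(repr(original.group(0)))   # whole match, including the surrounding spaces
print(original.group(1))         # <?php echo $data[guarantor4]; ?>
print(lazy_variant.group(1))     # guarantor4" id="guarantor4" ... ?>
```

With the greedy version, the group captures only the PHP block, which is why the original pattern "happened" to do what the asker wanted.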

As an aside, do not assume that the first attribute will be name= unless you have complete control over the input. Both HTML and XML allow attributes to be specified in any order.
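
One way to avoid depending on attribute order is to anchor on the attribute you actually want. The following is only a sketch in Python: the pattern and both tag strings are hypothetical, and it is not a substitute for a real HTML parser (for instance, it would also accept an attribute such as data-value):

```python
import re

# Hypothetical tags with the same attributes in different orders.
name_first = '<input name="a" value="1" />'
value_first = '<input value="1" name="a" />'

# Skip any attributes (anything except ">") before value="..."
pattern = re.compile(r'<input\b[^>]*?\bvalue="([^"]*)"')

print(pattern.search(name_first).group(1))         # 1
print(pattern.search(value_first).group(1))        # 1
print(re.search(r'<input name="', value_first))    # None: breaks when name is not first
```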

+2
