Regex to match tags like <A>, <BB>, <CCC>, but not <ABC>

I need a regex to match tags that look like <A>, <BB>, <CCC> but not <ABC>, <aaa>, <>. In other words, the tag must consist of the same capital letter, repeated. I tried <[A-Z]+> but this does not work, because it also matches tags with mixed letters such as <ABC>. Of course, I could write something like <(A+|B+|C+|...)> and so on, but I wonder if there is a more elegant solution.

regex
Jun 24 '10 at 13:23
1 answer

You can use something like this ( see this at rubular.com ):

 <([A-Z])\1*>

This uses a capturing group and a backreference. Essentially:

  • You use (pattern) to "capture" a match
  • Then you can use \n in your pattern, where n is the group number, to "refer back" to whatever that group matched.
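
As a general illustration of a backreference, here is a minimal Python sketch that finds a doubled word. The sample sentence and the variable names are made up for this example and are not part of the original question.

```python
import re

# (\w+) captures a word as group 1; \1 then requires that same text again.
doubled_word = re.compile(r"\b(\w+) \1\b")

sentence = "this is is a test"  # hypothetical sample input
found = doubled_word.search(sentence)

repeated_text = found.group(0)  # the whole match: the word, a space, the word again
captured_word = found.group(1)  # only what group 1 captured
print(repeated_text, "/", captured_word)
```

Note that \1 matches the same text the group captured, not the same sub-pattern: (\w+) \1 will not match two different words.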

So in this case:

  • Group 1 captures ([A-Z]), an uppercase letter immediately after <
  • Then we try to match \1*, i.e. zero or more repetitions of that same letter, followed by the closing >
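
To check the pattern against the tags from the question, a short Python sketch could look like the following. The helper name is_repeated_letter_tag and the use of re.fullmatch (which requires the whole string to be a tag) are choices made for this example, not something the answer prescribes.

```python
import re

# Same pattern as in the answer: one uppercase letter, then zero or more
# repetitions of that exact letter, between angle brackets.
TAG = re.compile(r"<([A-Z])\1*>")

def is_repeated_letter_tag(text):
    """Return True if text is a tag made of a single uppercase letter repeated."""
    return TAG.fullmatch(text) is not None

samples = ["<A>", "<BB>", "<CCC>", "<ABC>", "<aaa>", "<>"]
results = {s: is_repeated_letter_tag(s) for s in samples}
print(results)
```

To find such tags inside a longer string, TAG.finditer(text) with match.group(0) should work, since the angle brackets already delimit each match.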


Jun 24 2018-10-06T00:


