Java regex to check whether a string contains an XML tag

I am trying to determine whether a string contains at least one XML tag using String.matches(). Because of how the project is set up, I would prefer not to have to use Pattern.

I am currently using this regex:

<[A-Za-z0-9]+>

Which, obviously, only checks for a left and a right angle bracket with text between them. What I need is a regex that checks whether the string contains at least one complete XML tag, i.e. it should accept input such as:

blah <abc foo="bar">blah</abc> blah
blah <abc foo="bar"/>

but not input such as:

blah <abc> blah
blah <abc </abc> blah

Is it possible?

+4
3 answers

This:

if (input.matches("(?s).*(<(\\w+)[^>]*>.*</\\2>|<(\\w+)[^>]*/>).*"))

matches tags of both types (standard and self-closing):

<abc foo="bar">blah</abc>
<abc foo="bar"/>

without matching incomplete tags, for example:

<abc>

See the live regex demo.
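As a quick sanity check, here is a minimal sketch that runs the pattern above against the four sample inputs from the question. The class name TagCheck and the method containsXmlElement are made up for illustration; only String.matches() is used, as the question requests.

```java
class TagCheck {
    // Pattern from the answer above. Group 2 captures the opening tag name,
    // and the backreference \2 requires the closing tag to repeat it.
    static final String XML_ELEMENT =
        "(?s).*(<(\\w+)[^>]*>.*</\\2>|<(\\w+)[^>]*/>).*";

    // Hypothetical helper: true if the input holds a paired or self-closing element.
    static boolean containsXmlElement(String input) {
        return input.matches(XML_ELEMENT);
    }

    public static void main(String[] args) {
        String[] samples = {
            "blah <abc foo=\"bar\">blah</abc> blah", // true
            "blah <abc foo=\"bar\"/>",               // true
            "blah <abc> blah",                       // false
            "blah <abc </abc> blah"                  // false
        };
        for (String s : samples) {
            System.out.println(containsXmlElement(s) + "  " + s);
        }
    }
}
```

Note that \w+ does not cover every legal XML name (names containing a colon, hyphen, or dot, for instance), so this is a heuristic check rather than a well-formedness test.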

+5

Try this:

if (input.matches("(?s).*?<(\\S+?)[^>]*>.*?</\\1>.*")) {
    // String has an XML tag
}

(?s) enables DOTALL mode, which makes the dot match newline characters as well.
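To illustrate what the flag changes, here is a small sketch comparing the pattern with and without (?s) on an element that spans several lines. The class name DotallDemo and the helper hasTag are hypothetical. The last call also shows a limitation worth knowing: because this pattern requires a closing tag, a self-closing tag on its own is not matched.

```java
class DotallDemo {
    // Pattern from the answer, with and without the DOTALL flag.
    static final String WITH_DOTALL = "(?s).*?<(\\S+?)[^>]*>.*?</\\1>.*";
    static final String WITHOUT_DOTALL = ".*?<(\\S+?)[^>]*>.*?</\\1>.*";

    // Hypothetical helper wrapping String.matches().
    static boolean hasTag(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String multiLine = "blah <abc foo=\"bar\">\nblah\n</abc> blah";

        // With (?s) the dot can cross the line breaks inside the element.
        System.out.println(hasTag(multiLine, WITH_DOTALL));    // true

        // Without it, ".*?" stops at the first newline and the match fails.
        System.out.println(hasTag(multiLine, WITHOUT_DOTALL)); // false

        // A self-closing tag has no </abc>, so this pattern rejects it.
        System.out.println(hasTag("blah <abc foo=\"bar\"/>", WITH_DOTALL)); // false
    }
}
```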

+3

Well, this regex will match most HTML/XML tags.
If you only need node (element) tags, the rest can be removed.

Node tags only (final edit):

 # "(?s)<(?:/?[\\w:]+\\s*|[\\w:]+(?:\".*?\"|'.*?'|[^>]*?)+)>"

 (?s)
 <
 (?:
      /?
      [\w:]+ 
      \s* 
   |  
      [\w:]+ 
      (?: " .*? " | ' .*? ' | [^>]*? )+
 )
 >
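This pattern describes a single tag rather than the whole string, so with String.matches() it has to be wrapped in ".*" on both sides. A minimal sketch, assuming that wrapping is acceptable (the class NodeTagCheck and the method containsTag are illustrative names, not part of the answer):

```java
class NodeTagCheck {
    // The node-tag pattern from above, without the leading (?s);
    // the flag is added once in the wrapper below.
    static final String NODE_TAG =
        "<(?:/?[\\w:]+\\s*|[\\w:]+(?:\".*?\"|'.*?'|[^>]*?)+)>";

    // Hypothetical helper: String.matches() must consume the whole input,
    // so the tag pattern is surrounded by ".*".
    static boolean containsTag(String input) {
        return input.matches("(?s).*" + NODE_TAG + ".*");
    }

    public static void main(String[] args) {
        System.out.println(containsTag("blah <abc foo=\"bar\">blah</abc> blah")); // true
        System.out.println(containsTag("blah <abc> blah"));                       // true
        System.out.println(containsTag("1 < 2 and 3 > 2"));                       // false
        System.out.println(containsTag("no tags here"));                          // false
    }
}
```

Unlike the patterns that use a backreference, this one accepts a lone opening or closing tag such as <abc>, so it answers "is there any tag?" rather than "is there a complete element?".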

Full version:

 # "<(?:(?:/?[\\w:]+\\s*/?)|(?:[\\w:]+\\s+(?:(?:(?:\"[\\S\\s]*?\")|(?:'[\\S\\s]*?'))|(?:[^>]*?))+\\s*/?)|\\?[\\S\\s]*?\\?|(?:!(?:(?:DOCTYPE[\\S\\s]*?)|(?:\\[CDATA\\[[\\S\\s]*?\\]\\])|(?:--[\\S\\s]*?--)|(?:ATTLIST[\\S\\s]*?)|(?:ENTITY[\\S\\s]*?)|(?:ELEMENT[\\S\\s]*?))))>"

 <
 (?:
      (?:
           /? 
           [\w:]+ 
           \s* 
           /? 
      )
   |  
      (?:
           [\w:]+ 
           \s+ 
           (?:
                (?:
                     (?: " [\S\s]*? " )
                  |  (?: ' [\S\s]*? ' )
                )
             |  (?: [^>]*? )
           )+
           \s* 
           /? 
      )
   |  
      \?
      [\S\s]*? 
      \?
   |  
      (?:
           !
           (?:
                (?:
                     DOCTYPE
                     [\S\s]*? 
                )
             |  (?:
                     \[CDATA\[
                     [\S\s]*? 
                     \]\]
                )
             |  (?:
                     --
                     [\S\s]*? 
                     --
                )
             |  (?:
                     ATTLIST
                     [\S\s]*? 
                )
             |  (?:
                     ENTITY
                     [\S\s]*? 
                )
             |  (?:
                     ELEMENT
                     [\S\s]*? 
                )
           )
      )
 )
 >
+1