In your RegEx, try changing this:
"(.*)"
to this:
"([^<]*)"
So, instead of matching ANY character, you match any characters up to (but not including) the next less-than sign (<). That keeps the capture group from running past the first closing tag.
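To see the difference, here is a minimal sketch in Python (the question's language is not stated here, so this is only an illustration). The sample string, the `html_tag` variable, and the opening-tag part of the pattern are assumptions made for the example; only the `(.*)` and `([^<]*)` groups come from the answer.

```python
import re

# Hypothetical input: two <b> elements on the same line.
html = "<b>one</b> and <b>two</b>"
html_tag = "b"  # stand-in for the htmlTag variable; assumed value

# Greedy group: .* runs as far as it can, then backtracks to the LAST closing tag.
greedy = re.compile("<" + html_tag + ">(.*)</" + html_tag + ">")

# Negated class: [^<]* stops at the first '<', so each element matches on its own.
negated = re.compile("<" + html_tag + ">([^<]*)</" + html_tag + ">")

greedy_matches = greedy.findall(html)
negated_matches = negated.findall(html)

print(greedy_matches)   # ['one</b> and <b>two']
print(negated_matches)  # ['one', 'two']
```

Note that `[^<]*` also means the group will not match content that itself contains nested tags, which may or may not be what you want.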
You can also change this:
"</" + htmlTag + ">"
to this:
"</ ?" + htmlTag + ">"
This allows for an optional space after the slash. (You can skip this second change if you have full control over the HTML documents and know exactly how they were written.)
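As a quick check of the optional-space pattern, here is a small Python sketch. The `html_tag` value and the sample strings are made up for illustration; only the two closing-tag patterns come from the answer.

```python
import re

html_tag = "b"  # stand-in for the htmlTag variable; assumed value

# Original closing-tag pattern: no space allowed after the slash.
strict = re.compile("</" + html_tag + ">")

# Modified pattern: " ?" makes a single space after the slash optional.
lenient = re.compile("</ ?" + html_tag + ">")

samples = ["</b>", "</ b>"]  # hypothetical closing tags
strict_hits = [bool(strict.search(s)) for s in samples]
lenient_hits = [bool(lenient.search(s)) for s in samples]

print(strict_hits)   # [True, False]
print(lenient_hits)  # [True, True]
```

The `" ?"` only covers a single space; if the documents might contain tabs or several spaces there, a wider whitespace class would be needed instead.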