Validating an HTML string for unopened tags

I have HTML source in a string, and I want to check whether it contains a closing tag that was never opened.

For example, the line below contains </u> after WAVEFORM, with no opening <u> before it.

WAVEFORM</u> YES, <u>NEGATIVE AUSCULTATION OF EPIGASTRUM</u> YES,

I just want to detect this kind of unopened tag and then add the missing opening tag at the beginning of the line. How can I do that?

2 answers

In this particular case, you can use the HTML Agility Pack to check whether the HTML is well-formed or whether it contains closing tags that were never opened.

using System;
using HtmlAgilityPack;

var htmlDoc = new HtmlDocument();

htmlDoc.LoadHtml(
    "WAVEFORM</u> YES, <u>NEGATIVE AUSCULTATION OF EPIGASTRUM</u> YES,");

foreach (var error in htmlDoc.ParseErrors)
{
    // Prints: TagNotOpened
    Console.WriteLine(error.Code);
    // Prints: Start tag <u> was not found
    Console.WriteLine(error.Reason); 
}

<(\w+)(?:\s+[-\w]+(?:\s*(?:=\s*(?:"[^"]*"|'[^']*'|[^'">\s][^>\s]*)))?)*\s*>
|</(\w+)\s*>
|<!--.*?-->
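
The pattern above captures an opening tag's name in group 1 and a closing tag's name in group 2, and matches comments without setting either group. One way it could be used, sketched here as an assumption rather than as the answer's own method, is to walk the matches with a stack of open tags and report every closing tag that has no matching opener. The class name `UnopenedTagChecker`, its two methods, and the short list of void elements are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class UnopenedTagChecker
{
    // The pattern from the answer: group 1 = opening tag name,
    // group 2 = closing tag name; comments set neither group.
    private static readonly Regex TagPattern = new Regex(
        @"<(\w+)(?:\s+[-\w]+(?:\s*(?:=\s*(?:""[^""]*""|'[^']*'|[^'"">\s][^>\s]*)))?)*\s*>
          |</(\w+)\s*>
          |<!--.*?-->",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

    // Assumption: a small, incomplete set of elements that never take a closing tag.
    private static readonly HashSet<string> VoidElements =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "br", "hr", "img", "input", "meta", "link" };

    // Returns the names of closing tags that have no earlier opening tag,
    // in the order they appear in the input.
    public static List<string> FindUnopenedTags(string html)
    {
        var open = new Stack<string>();
        var unopened = new List<string>();

        foreach (Match m in TagPattern.Matches(html))
        {
            if (m.Groups[1].Success)
            {
                string name = m.Groups[1].Value.ToLowerInvariant();
                if (!VoidElements.Contains(name))
                {
                    open.Push(name);
                }
            }
            else if (m.Groups[2].Success)
            {
                string name = m.Groups[2].Value.ToLowerInvariant();
                if (open.Contains(name))
                {
                    // Pop up to and including the matching opening tag.
                    while (open.Pop() != name) { }
                }
                else
                {
                    unopened.Add(name);
                }
            }
        }
        return unopened;
    }

    // Prepends one opening tag per unopened closing tag. The tags are added in
    // reverse order so that the first stray closing tag ends up innermost.
    public static string PrependMissingOpenTags(string line)
    {
        var missing = FindUnopenedTags(line);
        return string.Concat(Enumerable.Reverse(missing).Select(n => "<" + n + ">")) + line;
    }
}
```

Calling `UnopenedTagChecker.PrependMissingOpenTags` on the line from the question should yield the same line with `<u>` added at the front. This sketch treats the input as a flat tag stream, so it does not handle `<script>` contents, CDATA sections, or self-closing syntax such as `<br/>`; a real parser like the HTML Agility Pack is the safer choice for arbitrary HTML.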

