Library for parsing XHTML files with XLINQ

Question


When I realized that I needed to create an index for about 50 XHTML pages that might be added, deleted, renamed, or moved in the future, I thought: "No problem, I'll write a quick index generator using LINQ to XML, since XHTML is by definition XML."

Of course, as soon as I tried to run it, I learned that XLINQ chokes on XHTML entities such as &nbsp;. I worked around it with the following algorithm:

Read the XHTML file into a string.
Use a regular-expression search and replace on that string to add a section to the DOCTYPE that defines all the relevant entities (because I only care about the "title" element in the files I read, and my output file does not use any entities right now, it just sets them all to empty, but I can add the actual values later).
Parse the result into an XDocument.
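
The three steps above could look roughly like the following C# sketch. The class name XhtmlLoader, the regular expression, and the two-entry entity table are illustrative assumptions, not the code from the question. It also assumes each page has a DOCTYPE without an internal subset, and it declares real character values instead of empty ones.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

public static class XhtmlLoader
{
    // Hypothetical entity table: extend it with whatever entities the pages use.
    private const string EntityDeclarations =
        "<!ENTITY nbsp \"&#160;\"> <!ENTITY copy \"&#169;\">";

    // Matches a DOCTYPE declaration that has no internal subset yet.
    private static readonly Regex DoctypeWithoutSubset =
        new Regex(@"<!DOCTYPE\s+[^>\[]*>", RegexOptions.IgnoreCase);

    public static XDocument Parse(string xhtml)
    {
        // Append an internal subset with the entity declarations to the DOCTYPE.
        // If the text has no DOCTYPE, nothing is patched and entities still fail.
        string patched = DoctypeWithoutSubset.Replace(
            xhtml,
            m => m.Value.Substring(0, m.Value.Length - 1)
                 + " [" + EntityDeclarations + "]>",
            1);

        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // read the injected declarations
            XmlResolver = null                   // do not fetch the external XHTML DTD
        };

        // Parse the patched text into an XDocument.
        using (var text = new StringReader(patched))
        using (var reader = XmlReader.Create(text, settings))
        {
            return XDocument.Load(reader);
        }
    }

    // Reads the whole file into a string first, then parses it.
    public static XDocument Load(string path)
    {
        return Parse(File.ReadAllText(path));
    }
}
```

If the pages declare the XHTML namespace, element lookups need it as well, for example with XNamespace x = "http://www.w3.org/1999/xhtml" and doc.Root.Element(x + "head").Element(x + "title").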

To save the file, I do the reverse:

Save the XDocument to a string.
Strip out the entity definitions.
Save the string to a file.
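
A possible sketch of the save direction in C# follows. Note that it swaps in a different technique from the one described: instead of removing the entity definitions from the serialized string, it clears XDocumentType.InternalSubset on a copy before serializing. The class and method names (XhtmlSaver, ToXhtmlString, Save) are hypothetical.

```csharp
using System.IO;
using System.Xml.Linq;

public static class XhtmlSaver
{
    public static string ToXhtmlString(XDocument doc)
    {
        // Work on a copy so the caller's document keeps its entity declarations.
        var copy = new XDocument(doc);

        // Drop the internal subset (the injected entity definitions), if any.
        if (copy.DocumentType != null)
        {
            copy.DocumentType.InternalSubset = null;
        }

        // Serialize to a string; ToString() omits the XML declaration.
        return copy.ToString(SaveOptions.DisableFormatting);
    }

    public static void Save(XDocument doc, string path)
    {
        // File.WriteAllText writes UTF-8 by default.
        File.WriteAllText(path, ToXhtmlString(doc));
    }
}
```

Entities that were expanded while parsing are written back as literal characters (for example U+00A0), not as &nbsp;, so some extra string processing is still needed if the named entities must be preserved.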

My question is: are there any libraries (especially ones built into .NET) that I can use to read XHTML files into XDocuments? The code I wrote has served its purpose (generating the current index and letting me test the rest of the generator program), and I would rather not spend time testing it further if someone else has already written and tested the same thing.

Thank you for your time,

Ria.

Edit: Thank you so much; it works! I still need to process the strings a bit when I save the XHTML (I think the library wasn't really made for this :)), and I had to tweak the Agility Pack's source a little to make it stop indiscriminately wrapping a CDATA section around the contents of every style element (even if one was already present), but that's the point of Open Source, right?

+6

xml .net xhtml linq linq-to-xml

Ria Jan 28 '09 at 9:09


1 answer

Gonzalo quero · Accepted Answer · 2009-01-28T11:39:48+0000

This may be useful: LINQ & Lambda, Part 3: Html Agility Pack to LINQ to XML Converter
