How can I load only certain elements in AngleSharp?

I am using AngleSharp to parse HTML5. At the moment, I wrap the elements I want to parse in a bit of extra HTML to make a valid HTML5 document and then run the parser on that. Is there a better way to do this, that is, parsing individual elements directly and verifying that the structure is really valid HTML5?

1 answer

Hm, a small example would have been nice. But AngleSharp supports fragment parsing, which sounds like what you want. Fragment parsing is also what happens in general when you set properties such as InnerHtml, which convert strings to DOM nodes.

You can use the ParseFragment method of the HtmlParser class to get the list of nodes contained in the given source code. Example:

using System.Diagnostics;
using AngleSharp.Parser.Html;
// ...

var source = "<div><span class=emphasized>Works!</span></div>";
var parser = new HtmlParser();
// Passing null as the second argument means no context element is given
var nodes = parser.ParseFragment(source, null);

if (nodes.Length == 0)
    Debug.WriteLine("Apparently something bad happened...");

foreach (var node in nodes)
{
    // Examine the node
}

Usually all nodes will be either IText or IElement. Comments (IComment) are also possible. You will never see IDocument or IDocumentFragment nodes in such an INodeList. However, since HTML5 parsing is quite robust, it is very likely that you will never experience "errors" using this method.
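To make the node types mentioned above concrete, here is a minimal sketch of how the returned nodes could be told apart. The NodeInspector class and its Describe method are hypothetical names introduced only for illustration; the sketch assumes the IElement, IText and IComment interfaces from the AngleSharp.Dom namespace and a compiler that supports C# 7 pattern matching.

```csharp
using System.Collections.Generic;
using AngleSharp.Dom;

// Hypothetical helper, not part of AngleSharp
static class NodeInspector
{
    // Yields a one-line description for each node of a parsed fragment
    public static IEnumerable<string> Describe(IEnumerable<INode> nodes)
    {
        foreach (var node in nodes)
        {
            if (node is IElement element)
                yield return "element <" + element.LocalName + ">";
            else if (node is IText text)
                yield return "text: " + text.Data;
            else if (node is IComment comment)
                yield return "comment: " + comment.Data;
            else
                yield return "other node: " + node.NodeName;
        }
    }
}
```

Since an INodeList can be enumerated as a sequence of INode, the result of ParseFragment can be passed in directly, for example: foreach (var line in NodeInspector.Describe(nodes)) Debug.WriteLine(line);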

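What you can do is look for parsing errors. For that you need to provide an IConfiguration with an event aggregator that collects such events. A simple implementation could look like this: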

using System.Collections.Generic;
using AngleSharp.Events;
// ...

class SimpleEventAggregator : IEventAggregator
{
    readonly List<HtmlParseErrorEvent> _errors = new List<HtmlParseErrorEvent>();

    public void Publish<TEvent>(TEvent data)
    {
        var error = data as HtmlParseErrorEvent;

        if (error != null)
            _errors.Add(error);
    }

    public List<HtmlParseErrorEvent> Errors
    {
        get { return _errors; }
    }

    public void Subscribe<TEvent>(ISubscriber<TEvent> listener) { }

    public void Unsubscribe<TEvent>(ISubscriber<TEvent> listener) { }
}

The event aggregator is then passed to a new Configuration instance:

using AngleSharp;
// ...

var errorEvents = new SimpleEventAggregator();
var config = new Configuration(events: errorEvents);
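The snippet above does not show how the configuration reaches the parser. The following sketch is one way it might be wired up. It reuses the SimpleEventAggregator class from this answer and assumes that HtmlParser offers a constructor accepting an IConfiguration and that fragment parsing publishes its errors through that configuration; both are assumptions tied to the AngleSharp version used here, and the ErrorCheck class is a hypothetical name.

```csharp
using AngleSharp;
using AngleSharp.Parser.Html;

// Hypothetical helper, not part of AngleSharp
static class ErrorCheck
{
    // Parses a fragment and returns how many parse error events were collected
    public static int CountParseErrors(string source)
    {
        var errorEvents = new SimpleEventAggregator();
        var config = new Configuration(events: errorEvents);

        // Assumption: HtmlParser has a constructor that takes an IConfiguration
        var parser = new HtmlParser(config);
        parser.ParseFragment(source, null);

        return errorEvents.Errors.Count;
    }
}
```

A count of zero would then suggest that the fragment parsed without any reported error, while the collected HtmlParseErrorEvent objects in Errors could be inspected for details.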

Note: the errors reported this way are parse errors as defined by the W3C specification.

