Remove an HTML node from an HtmlDocument: HtmlAgilityPack

In my code, I want to remove every img tag that has an empty src. I am using the HtmlAgilityPack HtmlDocument. I find each img with an empty src and try to delete it, but I get this error: "Collection was modified; enumeration operation may not execute." Can someone help me with this? The code I used is:

    foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
    {
        if (node.Name.ToLower() == "img")
        {
            string src = node.Attributes["src"].Value;
            if (string.IsNullOrEmpty(src))
            {
                node.ParentNode.RemoveChild(node, false);
            }
        }
        else
        {
            // .......... I am performing other operations on the document
        }
    }
+9
4 answers

What I've done:

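    List<string> xpaths = new List<string>();

    foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
    {
        if (node.Name.ToLower() == "img")
        {
            // GetAttributeValue returns "" instead of throwing when src is missing
            string src = node.GetAttributeValue("src", "");
            if (string.IsNullOrEmpty(src))
            {
                xpaths.Add(node.XPath);
            }
        }
    }

    // Remove in reverse document order: each XPath is positional (e.g. img[2]),
    // so removing an earlier node first would shift the later indices.
    xpaths.Reverse();
    foreach (string xpath in xpaths)
    {
        doc.DocumentNode.SelectSingleNode(xpath).Remove();
    }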
+6

It seems you are modifying the collection while enumerating it, by calling the HtmlNode.RemoveChild method inside the foreach loop.

To fix this, copy the nodes into a separate list or array before removing anything, for example by calling Enumerable.ToList<T>() or Enumerable.ToArray<T>():

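    // SelectNodes returns null when nothing matches, hence the null checks.
    var nodesToRemove = doc.DocumentNode
        .SelectNodes("//img[not(string-length(normalize-space(@src)))]")
        ?.ToList();

    if (nodesToRemove != null)
    {
        foreach (var node in nodesToRemove)
            node.Remove();
    }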

If I am right, the problem will disappear.
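
For the single-pass loop from the question, the same snapshot idea might look like the sketch below. It is only an illustration under assumptions: the class EmptyImageCleaner, the method RemoveEmptyImages, and the sample HTML are made-up names and data, not part of HtmlAgilityPack, and the package is assumed to be installed from NuGet.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is referenced

public static class EmptyImageCleaner
{
    // Hypothetical helper (not part of HtmlAgilityPack): removes every <img>
    // whose src attribute is missing, empty, or whitespace, and returns the
    // resulting markup.
    public static string RemoveEmptyImages(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // ToList() takes a snapshot of the nodes, so the loop below iterates
        // over the copy and is free to modify the live tree.
        var nodes = doc.DocumentNode.Descendants().ToList();

        foreach (HtmlNode node in nodes)
        {
            if (string.Equals(node.Name, "img", StringComparison.OrdinalIgnoreCase))
            {
                // GetAttributeValue returns the default ("") when src is absent,
                // so a missing attribute does not cause a NullReferenceException.
                string src = node.GetAttributeValue("src", "");
                if (string.IsNullOrWhiteSpace(src))
                {
                    node.Remove();
                }
            }
            else
            {
                // other per-node operations would go here
            }
        }

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Illustrative input: one image with a src, two without a usable src.
        string html = "<div><img src=\"a.png\"><img src=\"\"><img><p>text</p></div>";
        Console.WriteLine(RemoveEmptyImages(html));
    }
}
```

The snapshot still contains the removed nodes, which is harmless here because the loop never revisits them.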

+22

    var emptyImages = doc.DocumentNode
        .Descendants("img")
        .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
        .Select(x => x.XPath)
        .Reverse() // positional XPaths: remove later nodes first so indices stay valid
        .ToList();

    emptyImages.ForEach(xpath =>
    {
        var node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node != null) { node.Remove(); }
    });
+2

    var emptyElements = doc.DocumentNode
        .Descendants("img")
        .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
        .ToList();

    // The list is a snapshot, so removing nodes here does not disturb an enumeration.
    emptyElements.ForEach(node => node.Remove());
0