How to remove an HTML element with a specific class, along with its content, using the Java Html class

I am currently working on an Android project that parses data from a WordPress API. The detailed posts are delivered in HTML format, and I need to remove the HTML tags. Using Html.fromHtml().toString(), I stripped all the tags. But some images come with captions that I also have to delete. To remove a caption, I have to find the tag by its class. So, how can I remove this content using the Html class?

<p class="wp-caption-text">android m marshmallow</ 

EDIT:

Using regex, I solved my problem.

Paste your HTML into a regex tool and work out your regular expression there.

  yourHtml = yourHtml.replaceAll("Your_Regular_Expression", "");
  yourHtml = Html.fromHtml(yourHtml).toString();
1 answer

To match the whole element, including its content, you can try the following:

 <(\w+).*?class="wp-caption-text".*?>[\s\S]*?<\/\1> 

Regex101
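
In Java the pattern has to be written as a string literal, so the backslashes are doubled and the quotes are escaped. Below is a minimal sketch of how that could look; the names CaptionStripper and stripCaptions are made up for illustration. Note that it is a slightly tightened variant, not the pattern exactly as given above: the two .*? parts inside the opening tag are swapped for [^>]*, because . can also match >, which lets the original pattern start at an earlier tag on the same line and remove more than intended. The \/ escape is dropped since Java does not need it.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Android or of the original answer.
final class CaptionStripper {

    // Tightened variant of the pattern above: [^>]* keeps the class lookup
    // inside a single opening tag, \1 refers back to the captured tag name.
    private static final Pattern CAPTION = Pattern.compile(
            "<(\\w+)[^>]*\\bclass=\"wp-caption-text\"[^>]*>[\\s\\S]*?</\\1>");

    // Removes every element whose class attribute is exactly "wp-caption-text",
    // together with its content.
    static String stripCaptions(String html) {
        return CAPTION.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Intro</p>"
                + "<p class=\"wp-caption-text\">android m marshmallow</p>"
                + "<p>Body</p>";
        // Prints: <p>Intro</p><p>Body</p>
        System.out.println(stripCaptions(html));
    }
}
```

On Android, the cleaned string could then be passed to Html.fromHtml(...).toString() as in the question's edit.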

I would like to mention that this is not an ideal solution. Regular expressions are not well suited to parsing HTML, because the structures in this markup language are too complex to be parsed reliably with regular expressions.
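
To make the limitation concrete, here is a small sketch of one case where the pattern above goes wrong: a caption element that contains another element with the same tag name. The class name NestedCaptionDemo and the sample markup are assumptions made up for this illustration. The lazy [\s\S]*? stops at the first closing tag it finds, so the match ends too early.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing where the regex approach breaks down.
final class NestedCaptionDemo {

    // The pattern from the answer, written as a Java string literal
    // (the \/ escape is not needed in Java and is left out).
    private static final Pattern ANSWER_PATTERN = Pattern.compile(
            "<(\\w+).*?class=\"wp-caption-text\".*?>[\\s\\S]*?</\\1>");

    static String strip(String html) {
        return ANSWER_PATTERN.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // The caption <div> contains a nested <div>.
        String html = "<div class=\"wp-caption-text\"><div>inner</div>tail</div>"
                + "<p>Body</p>";
        // The match ends at the inner </div>, so caption text and a stray
        // closing tag are left behind.
        // Prints: tail</div><p>Body</p>
        System.out.println(strip(html));
    }
}
```

A regex cannot count nesting depth, so inputs like this need a real HTML parser to be handled reliably.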

