Objective-C: Removing HTML Attributes from a String

There are many answers about removing HTML tags from a string, but I would like to remove only one specific attribute: style. The HTML I'm dealing with has some seriously nasty inline styles and often looks something like this:

<p class="someclass" style="margin-left:2cm;text-indent:-36.0pt">Blah.</p> 

In order to customize the display for my application, I need to strip out this style attribute. Is there a quick way to process a document to do this? It needs to work on iOS.

Thanks!

3 answers

Ultimately, I went with a combination of ElementParser and regular expressions (using RegexKitLite), stripping out the tags that I didn't want and replacing them with ones I made myself, as needed. Given that my HTML comes from a reliable source, this should be fine.

This is far from ideal, but it works. :-)
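For the style attribute specifically, the regular-expression half of this approach can be sketched as follows. This is a minimal illustration, not the code I actually shipped: it swaps RegexKitLite for Foundation's NSRegularExpression (available since iOS 4), the helper name StripStyleAttributes is hypothetical, and it assumes trusted markup in which attribute values are always quoted.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (illustration only): removes every quoted
// style="..." or style='...' attribute from an HTML string.
static NSString *StripStyleAttributes(NSString *html) {
    // \s+                 whitespace that precedes the attribute
    // style\s*=\s*        the attribute name and the equals sign
    // ("[^"]*"|'[^']*')   a double-quoted or single-quoted value
    NSString *pattern = @"\\s+style\\s*=\\s*(\"[^\"]*\"|'[^']*')";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input untouched.
        return html;
    }
    return [regex stringByReplacingMatchesInString:html
                                           options:0
                                             range:NSMakeRange(0, html.length)
                                      withTemplate:@""];
}

int main(void) {
    @autoreleasepool {
        NSString *html = @"<p class=\"someclass\" style=\"margin-left:2cm;text-indent:-36.0pt\">Blah.</p>";
        NSLog(@"%@", StripStyleAttributes(html));
    }
    return 0;
}
```

For the sample paragraph from the question this yields <p class="someclass">Blah.</p>. The pattern does not understand HTML structure, so it would also remove a matching style="..." sequence that appears in ordinary text content, which is why it is only reasonable for markup from a source you trust.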


Well, perhaps the simplest way (but also a quite processor-intensive one) is to use NSAttributedString+HTML to turn the HTML into an NSAttributedString. Then you can get the plain NSString from it.

Something like this:

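  // attributedStringWithHTML:options: is provided by the NSAttributedString+HTML category.
  NSData *htmlData = [htmlString dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
  NSAttributedString *attrstring = [NSAttributedString attributedStringWithHTML:htmlData options:nil];

  // Access the string itself like this.
  NSString *plainText = [attrstring string];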

[Warning: although this is the easiest way (for you), it may not be the best way, since it is quite expensive to do and will block your user interface if it is done on the main thread.]
