There are plenty of questions about how to strip HTML tags, but not many about functions or methods that close them.
Here is the situation. I have a 500-character message summary (which includes HTML tags), but I only want the first 100 characters. The problem is that if I simply truncate the message, the cut may land in the middle of an HTML tag, or leave tags unclosed, which messes up the rest of the page.
Assume the HTML looks something like this:
<div class="bd">"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br/>
<br/>Some Dates: April 30 - May 2, 2010 <br/>
<p>Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. <em>Duis aute irure dolor in reprehenderit</em> in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. <br/>
</p>
For more information about Lorem Ipsum doemdloe, visit: <br/>
<a href="http://www.somesite.com" title="Some Conference">Some text link</a><br/>
</div>
How would I go about grabbing the first ~100 characters? (Ideally, that would be the first ~100 characters of the actual content, i.e. the text between the HTML tags, not counting the tags themselves.)
I imagine the best way to do this would be a recursive algorithm that keeps track of the open HTML tags and appends closing tags for any that get cut off by the truncation, but that may not be the best approach.
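To make the idea concrete, here is a minimal sketch of the tag-tracking approach, written in Python with the standard library's `html.parser`. It uses a stack instead of recursion, counts only text characters (each entity such as `&amp;` counts as one), and closes whatever tags are still open at the cut. The names `truncate_html`, `_Truncator`, and `VOID_TAGS` are made up for this illustration, and the sketch assumes reasonably well-formed input; it is not a hardened implementation.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are never tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _Truncator(HTMLParser):
    """Copies markup through until `limit` text characters have been seen."""

    def __init__(self, limit):
        # convert_charrefs=False so entities arrive via handle_entityref/charref
        # and can be copied through verbatim.
        super().__init__(convert_charrefs=False)
        self.limit = limit
        self.count = 0          # text characters emitted so far
        self.out = []           # output fragments
        self.open_tags = []     # stack of tags opened but not yet closed
        self.done = limit <= 0  # True once the limit has been reached

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # keep original attributes
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: copy it, nothing to track.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close everything up to and including the matching open tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        data = data[: self.limit - self.count]
        self.count += len(data)
        self.out.append(data)
        self.done = self.count >= self.limit

    def handle_entityref(self, name):
        self._entity(f"&{name};")

    def handle_charref(self, name):
        self._entity(f"&#{name};")

    def _entity(self, text):
        # An entity is copied whole and counted as a single character.
        if self.done:
            return
        self.count += 1
        self.out.append(text)
        self.done = self.count >= self.limit


def truncate_html(html, limit):
    """Return `html` cut to `limit` text characters, with open tags closed."""
    parser = _Truncator(limit)
    parser.feed(html)
    parser.close()
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


print(truncate_html('<div class="bd">Hello <em>big world</em> bye</div>', 9))
```

With the sample call above, the cut falls inside the `<em>` element, so the sketch appends `</em></div>` after the truncated text. Cutting at a word boundary or appending an ellipsis would be small additions inside `handle_data`.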