How to optimize HTML text copied from MS Word using GWT?

I have a problem with RichTextArea: when I paste text copied from MS Word or OpenOffice into a RichTextArea, it preserves all the text styles, which is fine, but the resulting HTML is huge. The database grows because of all the unnecessary HTML tags.

My question is: is there an easy way to optimize this HTML?

Thanks!!!

java copy-paste richtextbox gwt
3 answers

Finally, I found the answer to my own question: TinyMCE for GWT is good enough for me. It has a paste-from-MS-Word option, and its HTML cleanup is excellent.


RichTextArea is based on the browser's contentEditable feature. This means that the HTML tag soup you get will be specific to the platform, the source application, and the browser. When you say "optimize," what is your ultimate goal? How much of the original formatting do you want to keep? Beyond simple minification of the pasted HTML, any significant reduction in HTML complexity is likely to cost some visual fidelity.

Utilities such as HTML Tidy, or any of its derivatives, may help with minification. If your goal is to reduce HTML complexity, you can use HtmlUnit as a headless server-side browser to render the pasted content in memory, and then extract only the attributes you find useful from the HtmlUnit DOM. FWIW, this is also one way to make AJAX applications crawlable by search engines.
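To make the "reduce complexity" idea concrete, here is a minimal server-side sketch that uses only the Java standard library. It is not a substitute for HTML Tidy or a real HTML parser: the class name `WordHtmlCleaner` and its patterns are hypothetical, and it assumes that the bulk of the bloat comes from comments, Office namespace tags such as `<o:p>`, `Mso*` classes, and inline `style` attributes. Dropping every inline style is exactly the fidelity trade-off described above.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a rough regex-based cleanup for HTML pasted from Word.
// Assumption: runs on the server (plain JVM), not in GWT client code.
final class WordHtmlCleaner {
    // HTML comments, including Word's conditional comments (<!--[if gte mso 9]> ... <![endif]-->).
    private static final Pattern COMMENTS =
            Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    // Namespaced Office tags such as <o:p>, </o:p>, <w:WordDocument>.
    private static final Pattern OFFICE_TAGS =
            Pattern.compile("</?[a-zA-Z][a-zA-Z0-9]*:[^>]*>");
    // class attributes whose value starts with "Mso" (e.g. class="MsoNormal").
    private static final Pattern MSO_CLASS =
            Pattern.compile("\\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso[^\\s>]*)");
    // Quoted inline style attributes; removing them discards colors, fonts, margins.
    private static final Pattern STYLE_ATTR =
            Pattern.compile("\\s+style=(\"[^\"]*\"|'[^']*')");
    // Runs of whitespace; note this would also collapse whitespace inside <pre>.
    private static final Pattern WHITESPACE_RUN =
            Pattern.compile("\\s{2,}");

    private WordHtmlCleaner() {
    }

    static String clean(String html) {
        String out = COMMENTS.matcher(html).replaceAll("");
        out = OFFICE_TAGS.matcher(out).replaceAll("");
        out = MSO_CLASS.matcher(out).replaceAll("");
        out = STYLE_ATTR.matcher(out).replaceAll("");
        out = WHITESPACE_RUN.matcher(out).replaceAll(" ");
        return out.trim();
    }
}
```

For example, `WordHtmlCleaner.clean("<p class=\"MsoNormal\" style=\"mso-margin-top-alt:auto\"><b>Hi</b><o:p></o:p></p>")` returns `<p><b>Hi</b></p>`. Structural tags such as `<b>` and `<p>` survive, so basic formatting is kept while the Word-specific markup is dropped.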

While reduced visual fidelity can be a bit confusing for the user who pasted the content, it gives you the ability to unify the visual style of all pasted content. If you are building a site based on input from many users, this uniformity reduces the mental effort readers need to orient themselves in the content.



HTML Tidy has an API that you can use from Java programs (for example, through JTidy, a Java port of HTML Tidy).

