Definition of HTML whitespace rules?

I am looking for this definition to make my HTML renderer a little better. It currently guesses which whitespace to keep, which to collapse, and which to drop. The SGML standard is hard to find, and the HTML standard does not seem to cover the subject in enough depth for my needs.

Currently, my renderer parses the HTML into a tree, and then a recursive layout pass computes the positions of all the elements and their contents. I am experimenting with discarding some whitespace at the parsing stage, i.e. not emitting whitespace-only text fragments in certain circumstances. That sort of works in most cases, but there are a few rare cases that are difficult to handle.

(I am also working on an editor subclass of the HTML control, and layout-time decisions are somewhat complicated in the editor, so I am trying to move them to the parsing stage. The layout is not available until reflow, some time after the document has been edited.)

Link or flame away.

+6
html sgml whitespace
3 answers

I think section 9.1, "White space", of the HTML 4 specification is what you are looking for.

+5

If you are writing your own HTML parser, I highly recommend that you use the parsing algorithm in the HTML 5 specification: http://www.whatwg.org/html5. It covers a large number of edge and corner cases and general browser weirdness. Browsers do not follow SGML rules, but they all aim to do what the HTML 5 specification says, or the functional equivalent. There are several open-source parsers that implement the algorithm, so it should have everything you need.

+3

So, I think the closest I am going to get to an answer is here: http://www.w3.org/TR/CSS2/text.html#white-space-model
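As a rough illustration of what that section describes for `white-space: normal`, here is a minimal Python sketch. The function name `collapse_runs` is hypothetical, and the sketch assumes that all text runs belong to the same line of one block, that none of them is preformatted, and that the parser has already normalized line endings to `\n`. It is a simplification of the model, not a full implementation.

```python
import re

def collapse_runs(runs):
    """Collapse whitespace across consecutive inline text runs.

    Hypothetical helper: assumes every run has white-space: normal and
    that all runs end up on a single line (no wrapping is modelled).
    """
    out = []
    prev_space = True  # behave as if a space preceded the start of the line
    for text in runs:
        # Remove spaces, tabs, and carriage returns surrounding a linefeed.
        text = re.sub(r"[ \t\r]*\n[ \t\r]*", "\n", text)
        # Turn the remaining linefeeds and tabs into spaces.
        text = text.replace("\n", " ").replace("\t", " ")
        # Drop any space that follows another space, even across runs.
        chars = []
        for ch in text:
            if ch == " " and prev_space:
                continue
            chars.append(ch)
            prev_space = (ch == " ")
        out.append("".join(chars))
    # A space left at the end of the line is removed as well.
    for i in range(len(out) - 1, -1, -1):
        if out[i] == "":
            continue
        if out[i].endswith(" "):
            out[i] = out[i][:-1]
        break
    return out

print(collapse_runs(["  Hello \n  ", "world\t", " ! "]))
print(collapse_runs(["a", "   ", "b"]))
print(collapse_runs(["a ", "   ", "b"]))
```

The last two calls show why whitespace-only fragments cannot simply be dropped at the parsing stage: between `"a"` and `"b"` the fragment must survive as a single space, while after `"a "` it collapses to nothing. Whether a fragment matters depends on what precedes it on the line.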

+3