HTML validation error: non-space characters found before DOCTYPE

I have a blog (based on WordPress), and I am trying to validate one of its pages with the W3C validator. The first error:

Line 1, Column 1: Non-space characters found without seeing a doctype first. Expected <!DOCTYPE html>. <!DOCTYPE html><!-- HTML 5 --> 

In addition, DebugBar (http://www.my-debugbar.com/wiki/IETester/HomePage) agrees: when I open the same page from the HTML Validation tab inside this tool, it displays two invisible characters before <!. BUT:

  • This line of HTML comes from the header.php file in my WordPress theme.
  • I downloaded this file from my hosting provider to my local hard drive.
  • The first line of header.php is <!DOCTYPE html><!-- HTML 5 -->
  • When I open header.php in RJ TextEd (an advanced text editor), it says that the current encoding of header.php is UTF-8 without BOM (!).
  • When I open header.php in a hex viewer, I see that bytes 0 and 1 are 3c and 21, which is exactly <! .
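The hex-viewer check from the last bullet can also be scripted. Below is a minimal sketch in Python, assuming the file has already been downloaded locally; the helper names `has_utf8_bom` and `file_has_utf8_bom` are made up for illustration. A UTF-8 BOM is the three bytes EF BB BF at offset 0, so a file that begins with 3C 21 does not have one.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string begins with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def file_has_utf8_bom(path: str) -> bool:
    """Read only the first three bytes of the file and test them for a BOM."""
    with open(path, "rb") as fh:
        return has_utf8_bom(fh.read(3))
```

For example, `file_has_utf8_bom("header.php")` should return False for the header.php described above, since it starts with the bytes of <! directly.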

So everything has been checked. Why do I get these “odd characters”, and where do they come from?

html utf-8 wordpress byte-order-mark w3c-validation
1 answer

I found the root of the problem. General rule:

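If any (absolutely any!) file that takes part in building the final HTML page (the one that is sent to the client) is saved as UTF-8 with a BOM, the final HTML page WILL contain the BOM. That is, the whole site must NOT contain even 1 file saved with a BOM. PHP sends any bytes that come before the opening <?php tag straight to the output, so a BOM in a file loaded before header.php lands ahead of the doctype.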

In my case, about 1.3K files make up my site. Only 4 of them were saved with a BOM:

  • wp-config.php (at the root of the site)
  • jquery.query.js (in the include folder)
  • cyr-to-lat.php (in the plugin folder)
  • footer.php (in the theme's root folder)
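With over a thousand files, looking for the offenders by hand is impractical. The following is a rough sketch of how such a search could be automated in Python, assuming a local copy of the site; the function name `find_bom_files` and the default extension list are my own choices, not something WordPress provides.

```python
import codecs
import os


def find_bom_files(root, extensions=(".php", ".js", ".css", ".html")):
    """Walk `root` and return the sorted paths of files that begin with a UTF-8 BOM."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only inspect file types that can end up in the generated page.
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                # The UTF-8 BOM is exactly three bytes: EF BB BF.
                if fh.read(3) == codecs.BOM_UTF8:
                    hits.append(path)
    return sorted(hits)
```

Calling `find_bom_files("path/to/site")` on the downloaded copy returns a list of paths to inspect; the path here is a placeholder.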

I had to re-save every one of these 4 files as “UTF-8 without BOM” in order to get rid of the “Non-space characters” validation error. Once I did this (and re-uploaded the files), the error disappeared.
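Re-saving in an editor works, but the same thing can be done with a script. This is only a sketch in Python, assuming you work on a local copy and keep a backup; `strip_utf8_bom` is a hypothetical helper name. It removes the three BOM bytes and leaves the rest of the file untouched.

```python
import codecs


def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file in place.

    Returns True if the file was rewritten, False if it had no BOM.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as fh:
        # Write everything after the three BOM bytes back unchanged.
        fh.write(data[len(codecs.BOM_UTF8):])
    return True
```

After running it on each affected file, the files still have to be uploaded back to the server for the change to take effect.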
