Nitpick 3: The <meta> tag is only a band-aid for shitty webhosting where you cannot access the webserver config to make it send the correct Content-Type in the actual HTTP response headers. The modern <!DOCTYPE html> instead implies a default of UTF-8 which works well for most.
Nitpick nitpick: the html doctype doesn't imply UTF-8. Valid modern HTML documents must be encoded using UTF-8, but the standard also requires that the encoding be specified somehow.
> The Encoding standard requires use of the UTF-8 character encoding and requires use of the "utf-8" encoding label to identify it... If an HTML document does not start with a BOM, and its encoding is not explicitly given by Content-Type metadata, and the document is not an iframe srcdoc document, then the encoding must be specified using a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state.
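To make the quoted rule concrete, here is a rough Python sketch of the check it describes: a document counts as declaring its encoding if it starts with a BOM, if the Content-Type header carries a charset, or if a meta declaration appears early in the markup. The function name `encoding_declaration` and the regex are my own assumptions, not anything from the standard. The regex is much looser than the prescan a real browser performs, and the iframe srcdoc exception is ignored.

```python
import re
from typing import Optional

# Byte order marks for UTF-8, UTF-16BE and UTF-16LE.
BOMS = (b"\xef\xbb\xbf", b"\xfe\xff", b"\xff\xfe")

# Loose match for <meta charset="..."> and for the http-equiv form
# (<meta http-equiv="Content-Type" content="text/html; charset=...">).
# This is a simplification, not the standard's prescan algorithm.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([a-zA-Z0-9_-]+)", re.IGNORECASE
)


def encoding_declaration(body: bytes, content_type: Optional[str] = None) -> Optional[str]:
    """Return 'bom', 'http' or 'meta' depending on where the encoding
    is declared, or None if no declaration is found."""
    if body.startswith(BOMS):
        return "bom"
    if content_type and "charset=" in content_type.lower():
        return "http"
    # The meta declaration has to sit within the first 1024 bytes.
    if META_CHARSET.search(body[:1024]):
        return "meta"
    return None
```

With this sketch, a bare `<!DOCTYPE html><title>x</title>` served without a charset in the header returns `None`, which is the case the standard says is not allowed.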
Oh I actually quoted the website's source. They have that DTD meta crap in there.
But I think you can just do <html> nowadays and it empirically just works. Seriously, screw the anti-DRY people that want me to put some !DOCTYPE or xmlns tags with some W3C links or some DTD nonsense inside ... I should only have to specify "html" exactly once, no more.
If I had designed the spec I would have just made it
Far more readable, and easier to memorize. A markup language (literally), by virtue of being a markup language, should not be impossible to memorize. Making scary strings like "-//W3C//DTD" part of the spec is counterproductive.
That's SGML tag inference at work (theoretically at least, since browsers have HTML parsing rules hardcoded). SGML knows, by the DOCTYPE declaration, that the document must start with an "html" element, so it infers it if it isn't there. Next, by the content model declared for the "html" element (normally obtained via the ugly public identifier that sibling comments complain about), a "head" element is expected, so SGML infers it as well if it's declared omissible, and so on.
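A toy Python sketch of the inference described above, working on a flat list of start-tag names. It is neither the SGML algorithm (which is driven by the DTD's content models) nor the HTML parsing rules browsers hardcode. The `HEAD_CONTENT` set and the function name `infer_containers` are assumptions made up for illustration.

```python
# Elements assumed to belong in <head>; a hand-picked subset for illustration.
HEAD_CONTENT = {"title", "meta", "link", "style", "base", "script"}


def infer_containers(tags):
    """Return the tag sequence with omitted html/head/body start tags filled in."""
    out = []
    state = "initial"  # initial -> before_head -> in_head -> in_body
    for tag in tags:
        tag = tag.lower()
        if state == "initial":
            state = "before_head"
            if tag == "html":
                out.append(tag)
                continue
            out.append("html")  # inferred: document must start with html
        if state == "before_head":
            state = "in_head"
            if tag == "head":
                out.append(tag)
                continue
            out.append("head")  # inferred: html must start with head
        if state == "in_head":
            if tag in HEAD_CONTENT:
                out.append(tag)
                continue
            state = "in_body"
            if tag == "body":
                out.append(tag)
                continue
            out.append("body")  # inferred: non-head content implies body
        out.append(tag)
    return out
```

Feeding it a page that opens with `title` and then goes straight to `center` yields `html`, `head`, `title`, `body`, `center`: the containers the author never wrote.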
In the "old days", web pages were often just the bare content (no html, head, body containers, no DOCTYPE declaration). A few sites also included just the body tag (and respective content) in order to set the page background through its attributes (bgcolor for a color, background for an image).
E.g., this is the entire code of Netscape's first home page:
<TITLE>Welcome to Mosaic Communications Corporation!</TITLE>
<CENTER>
<A HREF="MCOM/index2.html"><IMG SRC="MCOM/images/mcomwelcome1.gif" BORDER=1></A>
<H3>
<A HREF="MCOM/index2.html">Click on the Image or here to advance</A>
</H3>
</CENTER>