The HTML Living Standard - Book Review

Notes

Section 1 - Introduction

Notes

The Living Standard is generally what folks mean when they refer to HTML5 (which is really just a marketing term)
You'll probably want to be familiar with some HTML before trying to dig into this doc.
Gives a little history of HTML with the side turns into XHTML.
Makes mention of XForms (which I'd never heard of before)
There ended up being two organizations working on an HTML standard with different goals. The W3C wanted to "finish" HTML, while the WHATWG wanted to continually evolve it. In 2019 the two groups signed an agreement to work on the Living Standard (which is this book/document) together.
The design of HTML and its various related APIs is kinda all over the place. That happens when you have decades-long development by multiple folks who sometimes didn't know about each other.
There are times when the HTML spec does something in violation of other related specs. This comes from conflicting goals that can show up with such a large spec surface area. When those conflicts occur they are called out as a "willful violation".
Extensibility:

The class attribute can extend elements to effectively create new ones. (mentions microformats here)

Can freely use data-* attributes without any worry that the browser will mess with them (or be thrown off by them) since they get ignored at the browser level. Great way to allow scripts to include data on HTML elements that can then be looked for by the same or other scripts

Can use <meta name="" content=""> to add page wide metadata (I've never used this for anything, but it's an interesting idea to think about assuming scripts can access it)

Can use the rel="" format to work with other link types (microformats is mentioned here too)

Can use <script type=""> to embed custom kinds of scripts

APIs can be extended with JavaScript's prototyping mechanisms (widely used in script libraries)

Can use itemscope and itemprop to embed nested name-value pairs (I'd never heard of this one)

Can make custom elements (e.g. web components) that use a - in the name that won't conflict with future versions of HTML, SVG, or MathML because they won't ever add a standard element with a - in the name
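
A few of the extension points above (data-* attributes, page-wide meta, and custom elements) are easy to spot from a script. Here's a minimal sketch using Python's standard html.parser. The ExtensionScanner class, the sample markup, and names like build-id and fancy-button are made up for illustration, and the "contains a hyphen" check is a simplification of the spec's full custom element name rules.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, invented for this sketch.
SAMPLE = """
<meta name="build-id" content="2024-01">
<div class="h-card" data-user-id="42" data-role="admin"></div>
<fancy-button data-size="large"></fancy-button>
"""


class ExtensionScanner(HTMLParser):
    """Collects a few of the extensibility hooks found in a document."""

    def __init__(self):
        super().__init__()
        self.page_meta = {}        # <meta name="..." content="..."> pairs
        self.data_attrs = []       # (tag, attribute name, value) triples
        self.custom_elements = []  # tag names containing a hyphen

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and "name" in attr_map:
            self.page_meta[attr_map["name"]] = attr_map.get("content")
        # Simplification: any hyphenated tag name is treated as a custom element.
        if "-" in tag:
            self.custom_elements.append(tag)
        for name, value in attrs:
            if name.startswith("data-"):
                self.data_attrs.append((tag, name, value))


scanner = ExtensionScanner()
scanner.feed(SAMPLE)
scanner.close()
print(scanner.page_meta)
print(scanner.data_attrs)
print(scanner.custom_elements)
```

In a browser the same information would come from the DOM (e.g. an element's dataset), but the idea is the same: the browser ignores these hooks and scripts go looking for them.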
HTML v XML

The in-memory representation of a page's resources is known as "DOM HTML", or just "the DOM" for short.

There are two concrete syntaxes for DOMs in this spec: HTML and XML.

HTML is recommended for most cases.

Anything sent with a text/html MIME type will be treated as HTML

The XML MIME type is application/xhtml+xml. Anything sent with that MIME type is parsed by an XML processor.

Biggest thing to remember is that minor syntax errors will prevent an XML doc from rendering while the same type of errors are ignored in HTML and the page generally can render regardless.

The DOM, HTML, and XML can't all represent the same content. E.g. namespaces can be represented in the DOM and XML, but not in HTML. noscript stuff can be represented in HTML but not in the DOM or XML. Comments containing the string --> can be represented in the DOM but not in HTML or XML.
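
The error handling difference between the two syntaxes can be sketched with Python's standard library. This is only an approximation: xml.etree.ElementTree stands in for an XML processor, and html.parser is a lenient tokenizer rather than a spec-compliant HTML parser. The helper names (parses_as_xml, text_from_html) are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Mis-nested tags: a small syntax error.
BROKEN = "<p><strong><em>text</strong></em></p>"
FIXED = "<p><strong><em>text</em></strong></p>"


def parses_as_xml(markup: str) -> bool:
    """True if an XML parser accepts the markup, False on a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Collects the text content seen by the lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        self.text.append(data)


def text_from_html(markup: str) -> str:
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.text)


print(parses_as_xml(BROKEN))   # the XML parser rejects the whole document
print(parses_as_xml(FIXED))
print(text_from_html(BROKEN))  # the HTML side still yields the text
```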
Document Structure And Conventions

Lays out the structure of the document and how type is used to identify things
Quick Intro

HTML docs consist of a tree of elements and text. Elements generally have start and end tags (e.g. <p> and </p>), though the tags can sometimes be omitted.

Tags are always nested without overlapping.

For example, this is valid

<strong><em>text</em></strong>

While this is not:

<strong><em>text</strong></em>

There's a standard set of HTML elements.

Elements can have attributes that affect how they work.

Attributes go inside the start tag and generally consist of a name and value.

Attribute values are generally quoted but can be left unquoted as long as the value isn't empty and contains no whitespace or other conflicting character (i.e. double quote, single quote, backtick, equals sign, greater than, or less than).

Attributes don't have to have a value (e.g. disabled)
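
The attribute syntax rules above can be sketched as a small serializer. This is a minimal sketch, not anything from the spec itself: the function names can_be_unquoted and render_attribute are made up, and passing None is this example's way of saying "attribute with no value".

```python
from typing import Optional

# Characters that force quoting: ASCII whitespace plus " ' ` = < >
FORBIDDEN_UNQUOTED = set("\t\n\f\r \"'`=<>")


def can_be_unquoted(value: str) -> bool:
    """True if the value may be written without surrounding quotes."""
    return value != "" and not any(ch in FORBIDDEN_UNQUOTED for ch in value)


def render_attribute(name: str, value: Optional[str] = None) -> str:
    """Serialize one attribute, using the shortest syntax that stays valid."""
    if value is None:
        return name  # valueless form, e.g. disabled
    escaped = value.replace("&", "&amp;")
    if can_be_unquoted(value):
        return f"{name}={escaped}"
    escaped = escaped.replace('"', "&quot;")
    return f'{name}="{escaped}"'


print(render_attribute("disabled"))
print(render_attribute("class", "note"))
print(render_attribute("title", "hello world"))
```

In practice most folks just quote everything, which sidesteps the rules entirely.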

The HTML is parsed and turned into a DOM tree (which is an in-memory representation of the document).

DOM trees contain several types of nodes. (e.g. DocumentType, Element, Text, Comment, ProcessingInstruction)

The "document element" of HTML documents is the html element itself.

There tend to be more Text nodes than you'd initially expect because spaces and newlines between tags all end up as Text nodes.

There are two exceptions: any whitespace before the <head> start tag is silently dropped, and all whitespace after the </body> end tag gets placed at the end of the body instead.
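
To see the extra Text nodes for yourself, here's a minimal sketch using Python's xml.dom.minidom. Big assumption: minidom is an XML parser, not an HTML parser, so it shows the whitespace-becomes-Text-nodes behavior but not the head/body exceptions, which are specific to the HTML parser. The sample markup is made up.

```python
from xml.dom import minidom

# A small, well-formed fragment with newlines and indentation between tags.
MARKUP = "<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>"

document = minidom.parseString(MARKUP)
children = document.documentElement.childNodes

# Every run of whitespace between the tags shows up as its own Text node.
node_names = [node.nodeName for node in children]
whitespace_only = [
    node
    for node in children
    if node.nodeType == node.TEXT_NODE and node.data.strip() == ""
]

print(node_names)
print(len(whitespace_only))
```

So a list with two items has five child nodes, not two, which matters any time a script walks childNodes directly.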

DOM trees can be manipulated from scripts on the page.

Scripts can be embedded using <script> tags or with "event handler content attributes" (which we'll see later)

Most of the time this document refers to DOM trees rather than directly to the markup, since DOM trees are what browsers actually use.

HTML docs can be rendered on screen, through a speech synthesizer, a braille reader, etc. CSS can be used to affect how the various renderings take place.

Love this line: "the novice author is cautioned that this specification, by necessity, defines the language with a level of detail that might be difficult to understand at first."
Writing Secure Apps With HTML

This document can't fully cover security. Authors are encouraged to seek out and study possible security issues in detail.

A few of the common pitfalls to watch out for:

The security model is based on the concept of "origins" and many attacks come from cross-origin actions. Problems can include

Not validating user input, cross site scripting (XSS), SQL injection.

If you accept tags from users, safelist what's allowed (e.g. specific elements and attributes) rather than trying to block known-bad ones.

Make sure you don't accept URLs with the javascript: scheme.

Allowing a base element to be inserted in a document means any script elements with relative links can be hijacked. Same goes for forms.
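
The safelist idea for URLs can be sketched like this. It's a minimal sketch under my own assumptions: the ALLOWED_SCHEMES set is an arbitrary choice, is_safe_url is a made-up name, and treating relative URLs as safe only holds if nobody can inject a base element. The clean-up step is there because URL parsers strip tabs, newlines, and leading control characters, which is a common way to sneak javascript: past a naive check.

```python
import re

# Assumed safelist for this example; pick what your application needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):")
_C0_AND_SPACE = "".join(chr(code) for code in range(0x21))


def is_safe_url(url: str) -> bool:
    """Accept relative URLs and URLs whose scheme is on the safelist."""
    # Approximate the URL parser's clean-up so "java\nscript:" can't slip by.
    cleaned = url.strip(_C0_AND_SPACE)
    cleaned = re.sub(r"[\t\n\r]", "", cleaned)
    match = _SCHEME.match(cleaned)
    if match is None:
        return True  # no scheme, so it's a relative URL
    return match.group(1).lower() in ALLOWED_SCHEMES


print(is_safe_url("https://example.com/"))
print(is_safe_url("javascript:alert(1)"))
print(is_safe_url(" JaVa\tScRiPt:alert(1)"))
```

Checking against a list of allowed schemes (instead of rejecting javascript: specifically) also catches other surprising schemes like data:.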

Study up on Cross-Site Request Forgery (CSRF)

Study up on Clickjacking. The spec's advice includes:

"sites that do not expect to be used in frames are encouraged to only enable their interface if they detect that they are not in a frame (e.g. by comparing the window object to the value of the top attribute)."
Common Scripting Pitfalls

Scripts in HTML have "run-to-completion" semantics. That is, the browser will generally run the script uninterrupted before doing anything else (e.g. firing other events or continuing to parse the document).

HTML parsing is different. It's incremental. Meaning the parser can pause at any point to let scripts run. That means one thing to watch out for is to avoid adding event handlers after the event could have fired.

Two techniques for that are to use "event handler content attributes", or to create the element and add the handler in the same script. Doing things in the same script is safest because scripts run to completion before other events can fire.
Catching Mistakes

Check out: https://whatwg.org/validator/

for validators.
Conformance Requirements

This doc specifies a lot of processing for invalid docs as well as valid ones.
Presentation Markup

The majority of presentation markup from prior versions of HTML is no longer allowed.

It was removed because of e.g. poor accessibility (it added complexity in figuring out how assistive tech should handle it).

By using media/presentation-independent markup it's easier to author documents that work for more users (e.g. text based browsers)

It's also generally a lot easier to maintain markup when it's style-independent (e.g. changing the text color on a site that uses <font color=""> everywhere is a lot more work than changing a single value in a global stylesheet).

Reducing the overall document size is another reason to remove the presentation stuff.

The only presentation stuff left in the HTML spec is the <style> element and the style="" attribute. Using the attribute is generally discouraged in prod but can work nicely for rapid prototyping, or in unusual cases where adding another stylesheet would be inconvenient.

The b, i, hr, s, small, and u tags all used to be presentational. They are now redefined to be media-independent (though the implementations may affect presentation as well).
Syntax Errors

Basically, shit can get weird with the DOM tree, but browsers are pretty good at figuring out what you meant.
Content Model and Attribute Value Restrictions

Again, shit can get weird, but browsers do their best.
Further Reading

Other things that spec readers might be interested in:

Character Model for the World Wide Web 1.0: Fundamentals

Unicode Security Considerations

Web Content Accessibility Guidelines (WCAG)

Common Infrastructure

Mainly notes about wording in the document. There's a bunch of language stuff and it all makes sense to me intuitively. A few other notes are:

Notes

If an attribute value on a DOM node is set to the same value it already has, it is not considered to have changed.
HTML elements can have: element insertion steps, post-connection steps, and element removing steps. The doc goes into details there that feel like browser implementation level stuff.
If DOM objects are live, any references to them reflect their current data, not a snapshot of the data. (Possibly this is why values sometimes change in the console.)
Plugins are content handlers that neither act as a child navigable of the Document nor introduce Node objects into the DOM (e.g. a PDF viewer).
The spec doesn't require supporting plugins though.
Character encodings are ways to convert byte streams into Unicode strings and vice versa.
There's a list of other specs this spec depends on. It takes up most of the document.
Vendor specific proprietary user agent extensions to the spec are strongly discouraged. Doing so fragments the user base since only those users with the given UA can use the feature.

Common microsyntaxes

Notes

For boolean attributes, if it's there it's true, if not, it's false.
If it's there and has a value, the value must either be an empty string or a value that is an ASCII case-insensitive match for the attribute's canonical name with no surrounding whitespace
The values "true" and "false" are not allowed on boolean attributes.
To represent "false" the attribute must be omitted altogether.
Walks through the details of how keyword-based (enumerated) attribute states are determined. There's a multi-step process that covers what happens if a value is missing or invalid.
Integers are represented as base 10 numbers with or without a leading U+002D HYPHEN-MINUS character.
Floating point numbers can have an E (or e) for exponents.
Lists of floating point numbers are a valid thing. The numbers are separated by commas, and there can't be whitespace between them.
Covers date and time details which I'm not going to try to summarize here.
Mentions that space separated and comma separated tokens are available
There are also References, Media Queries, and Unique Internal Values which have their own links out.
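
The boolean and number rules in these notes are simple enough to sketch as validators. This is my own minimal sketch, not the spec's parsing algorithms (those are more forgiving than the validity rules). All function names are made up, and None stands for "attribute present with no value".

```python
import re
from typing import Optional

_INTEGER = re.compile(r"-?[0-9]+")
# Optional "-", digits and/or a fractional part, optional exponent.
_FLOAT = re.compile(r"-?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def is_valid_boolean_attribute(name: str, value: Optional[str]) -> bool:
    """Valid if there's no value, an empty value, or the attribute's own name."""
    if value is None or value == "":
        return True
    # Case-insensitive match, with no surrounding whitespace allowed.
    return value.lower() == name.lower()


def is_valid_integer(value: str) -> bool:
    return _INTEGER.fullmatch(value) is not None


def is_valid_float(value: str) -> bool:
    return _FLOAT.fullmatch(value) is not None


def is_valid_float_list(value: str) -> bool:
    """Comma-separated floats with no whitespace anywhere."""
    return all(is_valid_float(part) for part in value.split(","))


print(is_valid_boolean_attribute("disabled", "true"))
print(is_valid_integer("-42"))
print(is_valid_float("1.5e-3"))
print(is_valid_float_list("1, 2"))
```

Note that a leading + is not valid for either number type, and disabled="true" is invalid even though browsers will still treat the attribute as set.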

2.4 URLs

Notes

Some of the stuff here feels like it should have examples, but it doesn't.
Goes through the definition of valid URL strings
You can use about:legacy-compat as a reserved URL in the DOCTYPE of HTML documents when they need compatibility with XML tools.
A little touch on CORS in fetch requests but nothing too deep.
Touches on Referrer policy attributes and how they are calculated.
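Touches on nonce attributes. A nonce is a cryptographic "number used once" that Content Security Policy can use to determine whether a request can proceed.
If a nonce is in play there are mechanisms to make sure it's only available to scripts and not to side channels like CSS attribute selectors.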
The Lazy Loading attribute can be either "lazy" or "eager". "eager" is the default. If it's "eager" the browser tries to fetch it immediately. If it's "lazy" it waits for some conditions the user agent has associated with the element
A Blocking attribute explicitly indicates certain things should be blocked on the fetching of an external resource.
Currently the only blocking token (which I think means value) is "render". It doesn't really go into how to use this (blocking="render" perhaps? TBD)
There are also fetch priority attributes. They can have the following states: "high", "low", or "auto", with "auto" being the default. No examples are shown of how to use them.
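
Since this section is short on examples, here's a minimal sketch of how the loading, fetchpriority, and blocking attributes might look in markup and how their values resolve to states. The sample markup and file names are made up, the helper names are mine, and the lists of elements each attribute applies to are a subset chosen for this example. The fallback behavior follows the keyword rules from the microsyntaxes notes: a missing or invalid value lands on the default state.

```python
from html.parser import HTMLParser

# Hypothetical markup showing the three attributes in use.
SAMPLE = """
<link rel="stylesheet" href="app.css" blocking="render">
<img src="hero.jpg" fetchpriority="high" alt="">
<img src="footer.jpg" loading="lazy" alt="">
<img src="typo.jpg" loading="lazzy" alt="">
"""

# attribute -> (valid keywords, state used when the value is missing or invalid)
ENUMERATED = {
    "loading": ({"lazy", "eager"}, "eager"),
    "fetchpriority": ({"high", "low", "auto"}, "auto"),
}


def resolve_state(attribute, value):
    """Map an attribute value (or None if absent) to its state."""
    keywords, default = ENUMERATED[attribute]
    if value is None:
        return default
    keyword = value.lower()
    return keyword if keyword in keywords else default


def blocking_tokens(value):
    """Keep only the supported blocking tokens (currently just "render")."""
    if value is None:
        return set()
    return {token.lower() for token in value.split()} & {"render"}


class FetchHintScanner(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hints = []  # (resource, resolved hints) pairs

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        hints = {}
        if tag in ("img", "iframe"):
            hints["loading"] = resolve_state("loading", attr_map.get("loading"))
        if tag in ("img", "link", "script"):
            hints["fetchpriority"] = resolve_state(
                "fetchpriority", attr_map.get("fetchpriority")
            )
        if tag in ("link", "script", "style"):
            hints["blocks_render"] = "render" in blocking_tokens(
                attr_map.get("blocking")
            )
        if hints:
            self.hints.append((attr_map.get("src") or attr_map.get("href"), hints))


scanner = FetchHintScanner()
scanner.feed(SAMPLE)
scanner.close()
for resource, hints in scanner.hints:
    print(resource, hints)
```

The misspelled loading="lazzy" quietly falls back to eager, which is a nice illustration of why typos in these attributes are hard to notice.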

-- end of line --