What, exactly, is the DOM?

The Document Object Model, or the “DOM”, is an interface to web pages. It is essentially an API to the page, allowing programs to read and manipulate the page’s content, structure, and styles. Let’s break this down.

How is a web page built?

How a browser goes from a source HTML document to displaying a styled and interactive page in the viewport is called the “Critical Rendering Path”. Although this process can be broken down into several steps, as I cover in my article on Understanding the Critical Rendering Path, these steps can be roughly grouped into two stages. The first stage involves the browser parsing the document to determine what will ultimately be rendered on the page, and the second stage involves the browser performing the render.

[Figure: from HTML source, to render tree, to final rendered page]

The result of the first stage is what is called a “render tree”. The render tree is a representation of the HTML elements that will be rendered on the page and their related styles. In order to build this tree, the browser needs two things:

  1. The CSSOM, a representation of the styles associated with elements
  2. The DOM, a representation of the elements

How is the DOM created (and what does it look like)?

The DOM is an object-based representation of the source HTML document. It has some differences, as we will see below, but it is essentially an attempt to convert the structure and content of the HTML document into an object model that can be used by various programs.

The object structure of the DOM is represented by what is called a “node tree”. It is so called because it can be thought of as a tree with a single parent stem that branches out into several child branches, each of which may have leaves. In this case, the parent “stem” is the root <html> element, the child “branches” are the nested elements, and the “leaves” are the content within the elements.

Let’s take this HTML document as an example:

<!doctype html>
<html lang="en">
  <head>
    <title>My first web page</title>
  </head>
  <body>
    <h1>Hello, world!</h1>
    <p>How are you?</p>
  </body>
</html>

This document can be represented as the following node tree:

  • html
    • head
      • title
        • My first web page
    • body
      • h1
        • Hello, world!
      • p
        • How are you?
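
As a rough illustration, an outline like the one above can be produced by walking the tree recursively. The sketch below assumes nodes that expose the standard nodeType, nodeName, nodeValue, and childNodes properties. The function name outline is made up for this example, and it deliberately skips whitespace-only text nodes (the line breaks and indentation of the source), which a real DOM does contain.

```javascript
// Numeric node types from the DOM standard (Node.ELEMENT_NODE and Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: returns one outline line per element or non-empty text node.
function outline(node, depth = 0) {
  const indent = "  ".repeat(depth);
  if (node.nodeType === TEXT_NODE) {
    const text = node.nodeValue.trim();
    // Ignore text nodes that hold nothing but whitespace.
    return text === "" ? [] : [indent + "• " + text];
  }
  if (node.nodeType !== ELEMENT_NODE) {
    return []; // comments, doctypes, and so on are left out of this sketch
  }
  const lines = [indent + "• " + node.nodeName.toLowerCase()];
  for (const child of node.childNodes) {
    lines.push(...outline(child, depth + 1));
  }
  return lines;
}

// Only runs where a DOM is available, for example in a browser console.
if (typeof document !== "undefined") {
  console.log(outline(document.documentElement).join("\n"));
}
```

Because the function only relies on those four properties, it works on any object of the same shape, not just on real DOM nodes.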

What the DOM is not

In the example I gave above, it seems like the DOM is a 1-to-1 mapping of the source HTML document or of what you see in your DevTools. However, as I mentioned, there are differences. In order to fully understand what the DOM is, we need to look at what it is not.

The DOM is not your source HTML

Although the DOM is created from the source HTML document, it is not always exactly the same. There are two instances in which the DOM can be different from the source HTML.

1. When the HTML is not valid

The DOM is an interface for valid HTML documents. While creating the DOM, the browser may correct some errors in the HTML code.

Let’s take this HTML document for example:

<!doctype html>
<html>
Hello, world!
</html>

The source omits the <head> and <body> tags, yet every HTML document must contain both elements. If we look at the resulting DOM tree, we will see that this has been corrected:

  • html
    • head
    • body
      • Hello, world!
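
One way to check this yourself is to hand the same source to the browser's parser and inspect the result. This is only a sketch: the helper name topLevelElements is invented for illustration, and it assumes an environment where the standard DOMParser API is available (it is not part of plain Node.js).

```javascript
// Hypothetical helper: parse an HTML string and list the element children of <html>.
// `parser` is expected to behave like the browser's DOMParser.
function topLevelElements(source, parser) {
  const doc = parser.parseFromString(source, "text/html");
  return Array.from(doc.documentElement.children, (el) => el.tagName.toLowerCase());
}

const source = "<!doctype html><html>Hello, world!</html>";

// DOMParser is a browser API, so this only runs where it exists.
if (typeof DOMParser !== "undefined") {
  console.log(topLevelElements(source, new DOMParser()));
}
```

Run in a browser console, this should list both head and body, even though neither tag appears in the source string.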

2. When the DOM is modified by JavaScript

Besides being an interface for viewing the content of an HTML document, the DOM can also be modified, making it a living resource.

We can, for example, add nodes to the DOM using JavaScript.

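// Create a <p> element and a text node to hold its content
const newParagraph = document.createElement("p");
const paragraphContent = document.createTextNode("I'm new!");
// Put the text inside the paragraph, then attach the paragraph to <body>
newParagraph.appendChild(paragraphContent);
document.body.appendChild(newParagraph);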

This will update the DOM, but of course not our HTML document.
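
To make the "living" part a little more concrete, here is a small sketch that counts the paragraphs before and after such an insert. It relies on getElementsByTagName returning a live collection, so the same object reports the new count without being queried again. The function name countParagraphs and the doc parameter are assumptions made for this example.

```javascript
// Hypothetical helper: append a paragraph and report the <p> count before and after.
// `doc` is expected to behave like the browser's `document`.
function countParagraphs(doc, text) {
  // A live collection: it reflects later changes to the DOM automatically.
  const paragraphs = doc.getElementsByTagName("p");
  const before = paragraphs.length;

  const newParagraph = doc.createElement("p");
  newParagraph.textContent = text;
  doc.body.appendChild(newParagraph);

  return { before, after: paragraphs.length };
}

// Only runs where a DOM is available. Note that it really does modify the page.
if (typeof document !== "undefined") {
  console.log(countParagraphs(document, "I'm new!"));
}
```

The after count should be one higher than the before count, while the HTML file the page was loaded from stays exactly as it was.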

The DOM is not what you see in the browser (i.e., the render tree)

What you see in the browser viewport is the render tree which, as I mentioned, is a combination of the DOM and the CSSOM. What really separates the DOM from the render tree is that the latter consists only of what will eventually be painted on the screen.

Because the render tree is only concerned with what is rendered, it excludes elements that are not rendered at all, such as an element styled with display: none:

<!doctype html>
<html lang="en">
  <head></head>
  <body>
    <h1>Hello, world!</h1>
    <p style="display: none;">How are you?</p>
  </body>
</html>

The DOM will include the <p> element:

  • html
    • head
    • body
      • h1
        • Hello, world!
      • p
        • How are you?

However, the render tree, and therefore what is seen in the viewport, will not include that element.

  • html
    • body
      • h1
        • Hello, world!
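
The pruning described above can be sketched with a toy model. Everything here is an assumption made for illustration: the nodes are plain objects rather than real DOM nodes, the display field stands in for an element's computed style, and a browser's actual render tree is an internal structure that does far more than this. The <head> element is marked display: none because browsers' default stylesheets hide it.

```javascript
// Toy stand-ins for DOM nodes: a name, an optional computed `display`, and children.
const dom = {
  name: "html",
  children: [
    { name: "head", display: "none", children: [] }, // hidden by the browser's default styles
    {
      name: "body",
      children: [
        { name: "h1", children: [{ name: "Hello, world!", children: [] }] },
        { name: "p", display: "none", children: [{ name: "How are you?", children: [] }] },
      ],
    },
  ],
};

// Copy the tree, dropping every node (and its whole subtree) whose display is "none".
function toRenderTree(node) {
  if (node.display === "none") return null;
  const children = node.children.map(toRenderTree).filter((child) => child !== null);
  return { name: node.name, children };
}

// Flatten a tree into indented lines, matching the outlines used in this article.
function toLines(node, depth = 0) {
  const line = "  ".repeat(depth) + "• " + node.name;
  return [line, ...node.children.flatMap((child) => toLines(child, depth + 1))];
}

console.log(toLines(toRenderTree(dom)).join("\n"));
```

The original dom object is left untouched, which mirrors the point of this section: hiding an element changes what is rendered, not what is in the DOM.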

The DOM is not what is in DevTools

This difference is more subtle, because the DevTools element inspector provides the closest approximation to the DOM that we have in the browser. However, the DevTools inspector includes additional information that isn’t in the DOM.

The best example of this is CSS pseudo-elements. Pseudo-elements created using the ::before and ::after selectors form part of the CSSOM and render tree, but are not technically part of the DOM. This is because the DOM is built from the source HTML document alone, not including the styles applied to the element.

Despite the fact that pseudo-elements are not part of the DOM, they do appear in the DevTools element inspector.

[Figure: a pseudo-element shown in the DevTools element inspector]

This is why JavaScript cannot select pseudo-elements: they are not part of the DOM, so there is no node to return.
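
Although there is no node to select, the styles computed for a pseudo-element can still be read through the element it belongs to. A minimal sketch, assuming the page contains an <h1>: the helper name pseudoElementContent and the win parameter are invented for this example, while the optional second argument to getComputedStyle is a standard browser API.

```javascript
// Hypothetical helper: read the computed `content` of an element's pseudo-element.
// `win` is expected to behave like the browser's `window` object.
function pseudoElementContent(win, element, pseudo) {
  return win.getComputedStyle(element, pseudo).getPropertyValue("content");
}

// Only runs where a DOM is available, for example in a browser console.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const heading = document.querySelector("h1");
  if (heading !== null) {
    console.log(pseudoElementContent(window, heading, "::before"));
  }
}
```

Note that the lookup starts from a real element; the pseudo-element itself is never returned as a node.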

Recap

The DOM is an interface to an HTML document. It is used by browsers as a first step towards determining what to render in the viewport, and by JavaScript programs to modify the content, structure, or styling of the page.

Although similar to other forms of the source HTML document, the DOM is different in a number of ways:

  • It is always valid HTML
  • It is a living model that can be modified by JavaScript
  • It doesn't include pseudo-elements (e.g. ::after)
  • It does include hidden elements (e.g. with display: none)