Understanding the Virtual DOM

I’ve recently been writing about what exactly the DOM and the shadow DOM are and how they differ. To recap, the Document Object Model is an object-based representation of an HTML document and an interface to manipulating that object. The shadow DOM can be thought of as a “lite” version of the DOM. It is also an object-based representation of HTML elements, but not of a full standalone document. Instead, the shadow DOM allows us to separate our DOM into smaller, encapsulated bits that can be used across HTML documents.

Another similar term you may have come across is the “virtual DOM”. Although the concept has been around for several years, it was made more popular by its use in the React framework. In this article, I will cover exactly what the virtual DOM is, how it differs from the original DOM, and how it is used.

Why do we need a virtual DOM?

To understand why the concept of the virtual DOM arose, let’s revisit the original DOM. As I mentioned, there are two parts to the DOM - the object-based representation of the HTML document and the API to manipulating that object.

For example, let’s take this simple HTML document with an unordered list and one list item.

<!doctype html>
<html lang="en">
 <head></head>
 <body>
    <ul class="list">
        <li class="list__item">List item</li>
    </ul>
  </body>
</html>

This document can be represented as the following DOM tree:

  • html
    • head lang="en"
    • body
      • ul class="list"
        • li class="list__item"
          • "List item"

Let’s say we want to modify the content of the first list item to "List item one" and also add a second list item. To do this, we will need use the DOM APIs to find the elements we want to update, create the new elements, add attributes and content, then finally update the DOM elements themselves.

const listItemOne = document.getElementsByClassName("list__item")[0];
listItemOne.textContent = "List item one";

const list = document.getElementsByClassName("list")[0];
const listItemTwo = document.createElement("li");
listItemTwo.classList.add("list__item");
listItemTwo.textContent = "List item two";
list.appendChild(listItemTwo);

The DOM wasn’t made for this…

When the first specification for the DOM was released in 1998, we built and managed web pages in very differently. There was far less reliance on the DOM APIs to create and update the page content as frequently as we do today.

Simple methods such as document.getElementsByClassName() are fine to use on a small scale, but if we are updating multiple elements on a page every few seconds, it can start to become really expensive to constantly query and update the DOM.

Even further, because of the way the APIs are setup, it is usually simpler to perform more expensive operations where we update larger parts of the document than to find and update the specific elements. Going back to our list example, it is in some ways easier to replace the entire unordered list with a new one than to modify the specific elements.

const list = document.getElementsByClassName("list")[0];
list.innerHTML = `
<li class="list__item">List item one</li>
<li class="list__item">List item two</li>
`;

In this particular example, the performance difference between the methods is probably insignificant. However, as the size of the web page grows, it becomes more important to only select and update what is needed.

… but the virtual DOM was!

The virtual DOM was created to solve these problems of needing to frequently update the DOM in a more performant way. Unlike the DOM or the shadow DOM, the virtual DOM isn't an official specification, but rather a new method of interfacing with the DOM.

A virtual DOM can be thought of as a copy of the original DOM. This copy can be frequently manipulated and updated, without using the DOM APIs. Once all the updates have been made to the virtual DOM, we can look at what specific changes need to be made to the original DOM and make them in a targetted and optimised way.

What does a virtual DOM look like?

The name "virtual DOM" tends to add to the mystery of what the concept actually is. In fact, a virtual DOM is just a regular Javascript object.

Let’s revisit the DOM tree we created earlier:

  • html
    • head lang="en"
    • body
      • ul class="list"
        • li class="list__item"
          • "List item"

This tree can also be represented as a Javascript object.

const vdom = {
    tagName: "html",
    children: [
        { tagName: "head" },
        {
            tagName: "body",
            children: [
                {
                    tagName: "ul",
                    attributes: { "class": "list" },
                    children: [
                        {
                            tagName: "li",
                            attributes: { "class": "list__item" },
                            textContent: "List item"
                        } // end li
                    ]
                } // end ul
            ]
        } // end body
    ]
} // end html

We can think of this object as our virtual DOM. Like the original DOM, it is an object-based representation of our HTML document. But since it is a plain Javascript object, we can manipulate it freely and frequently without touching the actual DOM until we need to.

Instead of using one object for the entire object, it is more common to work with small sections of the virtual DOM. For example, we may work on a list component, which would corespond to our unordered list element.

const list = {
    tagName: "ul",
    attributes: { "class": "list" },
    children: [
        {
            tagName: "li",
            attributes: { "class": "list__item" },
            textContent: "List item"
        }
    ]
};

Under the hood of the virtual DOM

Now that we’ve seen what a virtual DOM looks like, how does it work to solve the performance and usability problems of the DOM?

As I mentioned, we can use the virtual DOM to single out the specific changes that need to be made to the DOM and make those specific updates alone. Let's go back to our unordered list example and make the same changes we did using the DOM API.

The first thing we would do is make a copy of the virtual DOM, containing the changes we want to make. Since we don't need to use the DOM APIs, we can actually just create a new object alltogether.

const copy = {
    tagName: "ul",
    attributes: { "class": "list" },
    children: [
        {
            tagName: "li",
            attributes: { "class": "list__item" },
            textContent: "List item one"
        },
        {
            tagName: "li",
            attributes: { "class": "list__item" },
            textContent: "List item two"
        }
    ]
};

This copy is used to create what is called a “diff” between the original virtual DOM, in this case the list, and the updated one. A diff could look something like this:

const diffs = [
    {
        newNode: { /* new version of list item one */ },
        oldNode: { /* original version of list item one */ },
        index: /* index of element in parent's list of child nodes */
    },
    {
        newNode: { /* list item two */ },
        index: { /* */ }
    }
]

This diff provides instructions for how to update the actual DOM. Once all the diffs are collected, we can batch changes to the DOM, making only the updates that are needed.

For example we could loop through each diff and either add a new child or update an old one depending on what the diff specifies.

const domElement = document.getElementsByClassName("list")[0];

diffs.forEach((diff) => {

    const newElement = document.createElement(diff.newNode.tagName);
    /* Add attributes ... */
    
    if (diff.oldNode) {
        // If there is an old version, replace it with the new version
        domElement.replaceChild(diff.newNode, diff.index);
    } else {
        // If no old version exists, create a new node
        domElement.appendChild(diff.newNode);
    }
})

Note that this is a really simplified and stripped-back version of how a virtual DOM could work and there are lot of cases I didn't cover here.

The virtual DOM and frameworks

It's more common to work with the virtual DOM via a framework, rather than interfacing with it directly as I showed in the example above.

Frameworks such as React and Vue use the virtual DOM concept to make more performant updates to the DOM. For example, our list component can be written in React in the following way.

import React from 'react';
import ReactDOM from 'react-dom';

const list = React.createElement("ul", { className: "list" },
    React.createElement("li", { className: "list__item" }, "List item")
);

ReactDOM.render(list, document.body);

If we wanted to update our list, we could just rewrite the entire list template, and call ReactDOM.render() again, passing in the new list.

const newList = React.createElement("ul", { className: "list" },
    React.createElement("li", { className: "list__item" }, "List item one"),
    React.createElement("li", { className: "list__item" }, "List item two");
);

setTimeout(() => ReactDOM.render(newList, document.body), 5000);

Because React uses the virtual DOM, even though we are re-rendering the entire template, only the parts that actually change are updated. If we look at our developer tools when the change happens, we will see the specific elements and the specific parts of the elements that change.

The DOM vs the virtual DOM

To recap, the virtual DOM is a tool that enables us to interface with DOM elements in an easier and more performant way. It is a Javascript object representation of the DOM, which we can modify as frequently as we need to. Changes made to this object are then collated, and modifications to the actual DOM are targetted and made less often.

blog comments powered by Disqus