Monday, November 13, 2006

W3C DOM - Introduction

The Document Object Model (DOM) is the model that describes how all elements in an HTML page, like input fields, images, paragraphs etc., are related to the topmost structure: the document itself. By calling the element by its proper DOM name, we can influence it.

On this page I give an introduction to the W3C Level 1 DOM that has been implemented in the newest generation of browsers. It will give you an overview of how the DOM works and what you can do with it.

First some words about the DOM Recommendation and the purpose of the DOM, then I teach you what nodes are and how you can walk the DOM tree from node to node. Then it's time to learn how to get a specific node and how to change its value or attributes. Finally, I'll teach you how to create nodes yourself, the ultimate purpose of the new DOM.

The Recommendation

The Level 1 DOM Recommendation has been developed by the W3C to provide any programming language with access to each part of an XML document. As long as you use the methods and properties that are part of the recommendation, it doesn't matter if you parse an XML document with VBScript, Perl or JavaScript. In each language you can read out whatever you like and make changes to the XML document itself.

As some of you might have guessed, this paragraph describes an ideal situation and differences (between browsers, for instance) do exist. Generally, however, they're far smaller than usual so that learning to use the W3C DOM in JavaScript will help you to learn using it in another programming language.

In a way HTML pages can be considered as XML documents. Therefore the Level 1 DOM will work fine on an HTML document, as long as the browser can handle the necessary scripts.

You can read out the text and attributes of every HTML tag in your document, you can delete tags and their content, you can even create new tags and insert them into the document so that you can really rewrite your pages on the fly, without a trip back to the server.

Because it is developed to offer access to and change every aspect of XML documents, the DOM has many possibilities that the average web developer will never need. For instance, you can use it to edit the comments in your HTML document, but I don't see any reason why you would want to do so. Similarly, there are sections of the DOM that deal with the DTD/Doctype, with DocumentFragments (tiny bits of a document) or the enigmatic CDATA. You won't need these parts of the DOM in your HTML pages, so I ignore them and concentrate instead on the things that you'll need in your daily work.


Nodes

The DOM is a Document Object Model, a model of how the various objects of a document are related to each other. In the Level 1 DOM, each object, whatever it may be exactly, is a Node. So if you do

<P>This is a paragraph</P>

you have created two nodes: an element P and a text node with content 'This is a paragraph'. The text node is inside the element, so it is considered a child node of the element. Conversely, the element is considered the parent node of the text node.

        <P>            <-- element node
         |
         |
This is a paragraph    <-- text node

If you do

<P>This is a <B>paragraph</B></P>

the element node P has two children, one of which has a child of its own:


           <P>
            |
     --------------
     |            |
 This is a       <B>
                  |
                  |
              paragraph

Finally there are attribute nodes. (Confusingly, they are not counted as children of an element node. In fact, while writing this page I've done a few tests that seem to indicate that Explorer 5 on Windows doesn't see attributes as nodes at all.) So

<P ALIGN="right">This is a <B>paragraph</B></P>

would give something like


           <P> ----------------
            |                 |
     --------------         ALIGN
     |            |           |
 This is a       <B>          |
                  |         right
                  |
              paragraph

So these are element nodes, text nodes and attribute nodes. They constitute about 99% of the content of an HTML page and you'll usually busy yourself with manipulating them. There are more kinds of nodes, but I skip them for the moment.
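To make the three kinds of nodes a bit more tangible, here is a minimal sketch that labels a node by its nodeType (1 for element nodes, 2 for attribute nodes, 3 for text nodes in the W3C DOM). The helper name describeNode is my own invention, and the plain objects at the bottom are only stand-ins so that the snippet also runs outside a browser; in a page you would pass real nodes instead.

```javascript
// Hypothetical helper (not part of the DOM): names the kind of a node
// by looking at its nodeType property.
function describeNode(node) {
  switch (node.nodeType) {
    case 1: return 'element ' + node.nodeName;
    case 2: return 'attribute ' + node.nodeName + '=' + node.nodeValue;
    case 3: return 'text "' + node.nodeValue + '"';
    default: return 'other node (type ' + node.nodeType + ')';
  }
}

// Stand-in objects (assumption: NOT real DOM nodes), shaped like the
// P element, its text node and its ALIGN attribute.
var sampleNodes = [
  { nodeType: 1, nodeName: 'P', nodeValue: null },
  { nodeType: 3, nodeName: '#text', nodeValue: 'This is a ' },
  { nodeType: 2, nodeName: 'align', nodeValue: 'right' }
];

for (var i = 0; i < sampleNodes.length; i++) {
  console.log(describeNode(sampleNodes[i]));
}
```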

As you'll understand, the element node P also has its own parent, this is usually the document, sometimes another element like a DIV. So the whole HTML document can be seen as a tree consisting of a lot of nodes, most of them having child nodes (and these, too, can have children).

                           <BODY>
                              |
            -------------------------------------
            |                                   |
           <P> ----------------          lots more nodes
            |                 |
     --------------         ALIGN
     |            |           |
 This is a       <B>          |
                  |         right
                  |
              paragraph

Walking through the DOM tree

Knowing the exact structure of the DOM tree, you can walk through it in search of the element you want to influence. For instance, assume the element node P has been stored in the variable x (later on I'll explain how you do this). Then if we want to access the BODY we do

x.parentNode

We take the parent node of x and do something with it. To reach the B node:

x.childNodes[1]

childNodes is an array-like list (a NodeList) that contains all children of the node x. Of course numbering starts at zero, so childNodes[0] is the text node 'This is a' and childNodes[1] is the element node B.

Two special cases: x.firstChild accesses the first child of x (the text node), while x.lastChild accesses the last child of x (the element node B).
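The walking properties can be tried out with a small sketch. Since a real document is only available in a browser, I assume a tiny stand-in tree here: makeNode is a hypothetical builder of my own that links plain objects together using the same property names the DOM uses (parentNode, childNodes, firstChild, lastChild). With real nodes the last four lines work in exactly the same way.

```javascript
// Stand-in builder (an assumption, not a DOM API): creates a plain object
// and wires up the same walking properties a DOM node has.
function makeNode(name, children) {
  var node = { nodeName: name, parentNode: null, childNodes: children || [] };
  for (var i = 0; i < node.childNodes.length; i++) {
    node.childNodes[i].parentNode = node;
  }
  var count = node.childNodes.length;
  node.firstChild = count ? node.childNodes[0] : null;
  node.lastChild = count ? node.childNodes[count - 1] : null;
  return node;
}

// Mimics <BODY><P>This is a <B>paragraph</B></P></BODY>
var b = makeNode('B', [makeNode('#text')]);
var x = makeNode('P', [makeNode('#text'), b]);
var body = makeNode('BODY', [x]);

console.log(x.parentNode.nodeName);            // BODY
console.log(x.childNodes[1].nodeName);         // B
console.log(x.firstChild.nodeName);            // #text
console.log(x.lastChild === x.childNodes[1]);  // true
```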

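So supposing the P is the first child of the body, which in turn is the first child of the document, you can reach the element node B by either of these commands:

document.firstChild.firstChild.lastChild;
document.firstChild.childNodes[0].lastChild;
document.firstChild.childNodes[0].childNodes[1];

or even (though it's a bit silly)

document.firstChild.childNodes[0].parentNode.firstChild.childNodes[1];

Getting an element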

However, walking through the document in this way is quite cumbersome. You'll need to be absolutely certain of the structure of the entire DOM tree and since the whole purpose of the Level 1 DOM is to allow you to modify the DOM tree, this could lead to problems really quickly.

Therefore there are several ways of jumping directly to an element of your choice. Once you have arrived there, you can walk the last bit of the DOM tree to where you want to be.

So let's continue with our example. You want to access the element node B. The very simplest way is to jump to it directly. With the method document.getElementsByTagName you can construct an array of all B elements in the document and then go to one of them. Let's assume that this B is the first one in the document, then you can simply do

var x = document.getElementsByTagName('B')[0];

and x contains the element node B. First you order the browser to get all elements B in the document (document.getElementsByTagName('B')), then you select the first of all B's in the document ([0]) and you've arrived where you want to be.

You could also do

var x = document.getElementsByTagName('P')[0].lastChild;

Now you go to the first paragraph in the document (we assume that our P is the first one) and then go to its lastChild.

The best way, the only way to be certain that you reach the correct element regardless of the current structure of the DOM tree, is to give the B an ID:

<P ALIGN="right">This is a <B ID="hereweare">paragraph</B></P>

Now you can simply say

var x = document.getElementById('hereweare');

and the element node B is stored in x.
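The two ways of jumping to an element can be combined. The sketch below is only one possible arrangement: findElement is a hypothetical helper that first tries the ID and, if nothing has that ID, falls back to the first element with the given tag name. It takes the document as an argument (called doc here) so that it can also be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper: prefer the ID lookup, fall back to the first tag match.
// `doc` is expected to behave like the browser's document object.
function findElement(doc, id, tagName) {
  var byId = doc.getElementById(id);
  if (byId) return byId;
  var list = doc.getElementsByTagName(tagName);
  return list.length > 0 ? list[0] : null;
}

// In a browser: look for the B with ID "hereweare", else the first B on the page.
if (typeof document !== 'undefined') {
  var x = findElement(document, 'hereweare', 'B');
  console.log(x ? x.nodeName : 'no such element');
}
```

The ID lookup comes first because, as said above, it keeps working when the structure of the tree changes.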

Changing a node

Now that we have reached the node, we want to change something. Suppose we want to change the bold text to 'bold bit of text'. We then have to access the correct node and change its nodeValue. Now the correct node in this case is not the element node B but its child text node: we want to change the text, not the element. So we simply do

document.getElementById('hereweare').firstChild.nodeValue='bold bit of text';

and the node changes.
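A slightly more careful version of this one-liner might first check that the first child really is a text node before touching its nodeValue. This is a sketch under my own assumptions: setFirstText is a hypothetical helper, and the test for nodeType 3 (a text node) is an extra safety check that the one-liner does not need when you know the structure.

```javascript
// Hypothetical helper: replaces the text of an element's first child,
// but only if that child really is a text node (nodeType 3).
// Returns true when the text was changed, false otherwise.
function setFirstText(element, text) {
  var child = element.firstChild;
  if (!child || child.nodeType !== 3) return false;
  child.nodeValue = text;
  return true;
}

// In a browser: change the bold text if the B with ID "hereweare" exists.
if (typeof document !== 'undefined') {
  var bold = document.getElementById('hereweare');
  if (bold) setFirstText(bold, 'bold bit of text');
}
```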

You can change the nodeValue of each text node or each attribute. Thus you can also change the ALIGN attribute of the paragraph.

This, too, is quite simple. Take the node you need (the B's parentNode, in this case), then use the setAttribute() method to set the ALIGN attribute to the value you want:

function test2(val) {
  // Only run in browsers that support the Level 1 DOM
  if (document.getElementById && document.createElement) {
    var node = document.getElementById('hereweare').parentNode;
    node.setAttribute('align', val);
  }
  else alert('Your browser doesn\'t support the Level 1 DOM');
}

Creating and removing nodes

Changing nodes is nice, it can even be useful, but it's nothing compared to actually creating your own nodes and inserting them into your document. For example, you can add an HR right below a paragraph and remove it just as easily.

Creating the element is done by a special method:

var x = document.createElement('HR');

Thus an HR is created and temporarily stored in x. The second step is to insert x into the document. I wrote a special SPAN with ID="inserthrhere" at the point where it should appear. So we use the appendChild() method on the SPAN and the HR is made a child of the SPAN and it magically appears:

document.getElementById('inserthrhere').appendChild(x);

Removing it is slightly more complex. First I create a temporary variable node to store the SPAN, then I tell it to remove its first child (the HR).

var node = document.getElementById('inserthrhere');
node.removeChild(node.firstChild);
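Creating and removing can be put together in one function. The following is a minimal sketch, not the script this page uses: toggleRule is a hypothetical helper that appends an HR when the container has no children yet, and otherwise removes the container's first child. It assumes the SPAN starts out completely empty (no whitespace text node inside it), and it takes the document as an argument so it can be tried with a stand-in object.

```javascript
// Hypothetical helper: adds an HR to the container if it is empty,
// otherwise removes the container's first child.
// Returns 'added', 'removed', or null when the container does not exist.
function toggleRule(doc, containerId) {
  var container = doc.getElementById(containerId);
  if (!container) return null;
  if (container.firstChild) {
    container.removeChild(container.firstChild);
    return 'removed';
  }
  container.appendChild(doc.createElement('HR'));
  return 'added';
}

// In a browser: the first call inserts the HR, the next call removes it again.
if (typeof document !== 'undefined') {
  toggleRule(document, 'inserthrhere');
}
```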

In the same way we can create a new text node and append it to our faithful B ID="hereweare":

var x = document.createTextNode(' A new text node has been appended!');
document.getElementById('hereweare').appendChild(x);

Note that changing the nodeValue of the B's first child, as shown earlier, does not remove the new text node, because it has become a separate node:

            <B>
             |
     ----------------------
     |                    |
 paragraph         A new text node
                  has been appended!

(To merge them into one node use the normalize() method, which sadly is not supported by Explorer 5 on Windows.)

I won't tell you how to remove the text node, try writing that script yourself. It'll be a useful exercise.

Level 0 DOM

This page treats some DOM history and then describes the Level 0 DOM.

First of all a little introduction to the Document Object Model, followed by a bit of history. Then we'll take a look at accessing elements through the Level 0 DOM and how to use the Level 0 DOM.

Document Object Model

The Document Object Model has been around ever since browsers began to support JavaScript. From Netscape 2 onwards, web programmers wanted to access bits of HTML and change their properties. For instance, when you write a mouseover script you want to go to a certain image on the page and change its src property. When you do this, the browser reacts by changing the image on the screen.

The function of the Document Object Model is to provide this kind of access to HTML elements on your page. Exactly what elements you can access in which way and exactly what you can change depends on the browser. Each higher browser version gives you more freedom to reach any element you like and change anything you like.

DOM history

There are three DOM levels:

  1. The Level 0 DOM, supported from Netscape 2 onwards by all browsers.
    This DOM is treated below on this page.
  2. The two Intermediate DOMs, supported by Netscape 4 and Explorer 4 and 5.
    Note that the use of these DOMs is not necessary any more; I advise you to ignore them. These DOMs are treated on the archived Intermediate DOMs page.
  3. The Level 1 DOM, or W3C DOM, supported by Mozilla and Explorer 5.
    This DOM is treated in its own section.

Now let's take a look at the origins and development of the Document Object Model.

Level 0 DOM

The Level 0 DOM was invented by Netscape at the same time JavaScript was invented and was first implemented in Netscape 2. It offers access to a few HTML elements, most importantly forms and (later) images.

For reasons of backward compatibility the more advanced browsers, even those that support the Level 1 DOM, still support the old, faithful Level 0 DOM. Not supporting it would mean that the most common scripts suddenly wouldn't work any more. So even though the Level 0 DOM doesn't entirely fit into the new DOM concepts, browsers will continue to support it.

For the same reason Microsoft was at first forced to copy the Netscape DOM for Explorer 3. They wanted a real competitor for Netscape and having it produce lots of error messages on every page that contained JavaScript would have been strategically unsound.

Therefore the Level 0 DOM is really unified: all browsers that support parts of it support these parts in the same way. With the later DOMs this situation changed.

Intermediate DOMs

When the Version 4 browsers were released, the hype of the day was DHTML so both browsers had to support it. DHTML needs access to layers, separate parts of a page that can be moved across the page. Not surprisingly in view of their increasing competition, Netscape and Microsoft chose to create their own, proprietary DOMs to provide access to layers and to change their properties (their position on the page, for instance). Netscape created the layer model and the DOM document.layers, while Microsoft used document.all. Thus a proper cross-browser DHTML script needs both intermediate DOMs.

Fortunately, nowadays these intermediate DOMs are not important any more. You can safely forget them.

Level 1 DOM

Meanwhile W3C had developed the Level 1 DOM specification. The Document Object Model W3C proposed was at first written for XML documents, but since HTML is after all a sort of XML, it could serve for browsers too.

Besides, the Level 1 DOM is a real advance. For the first time, a DOM not only gives an exact model of the entire HTML (or XML) document, it also makes it possible to change the document on the fly: to take out a paragraph or change the layout of a table, for instance.

Since both Netscape and Microsoft had participated in the specification of the new DOM, since both browser vendors wanted to support XML in their version 5 browser and since public pressure groups like the Web Standards Project exhorted them to behave sensibly just this once, both decided to implement the Level 1 DOM.

Of course, this doesn't mean that Mozilla and Explorer 5 are the same. Again for reasons of backward compatibility Microsoft decided to continue support of document.all so that Explorer 5 now supports two DOMs (three if you count the Level 0 DOM).
On the other hand, the core of Mozilla is being built by the open source Mozilla Project and the leaders of this project have decided to completely ditch the old document.layers DOM of Netscape 4 and have Mozilla support only the Level 1 DOM.

See the Level 1 DOM page for more information.

Accessing elements

Each DOM gives access to HTML elements in the document. It requires you, the web programmer, to invoke each HTML element by its correct name. If you have done so, you can influence the element (read out bits of information or change the content or layout of the HTML element). For instance, if you want to influence the image with name="first" you have to invoke it by its proper name

document.images['first']

and you are granted access. The Level 0 DOM supports the following nodeLists:

  • document.images[], which grants access to all images on the page.
  • document.forms[], which grants access to all forms on the page.
  • document.forms[].elements[], which grants access to all form fields in one form, whatever their tag name. This nodeList is unique to the Level 0 DOM; the W3C DOM does not have a similar construct.
  • document.links[], which grants access to all links (<a href>) on the page.
  • document.anchors[], which grants access to all anchors (<a name>) on the page.

How to use the Level 0 DOM

When the browser concludes that an HTML document has been completely loaded, it starts making arrays for you. It creates the array document.images[] and puts all images on the page in it, it creates the array document.forms[] and puts all forms on the page in it etc.

This means that you now have access to all forms and images, you just have to go through the array in search of the exact image or form that you want to influence. This can be done in two ways: by name or by number.

Suppose you have this HTML document:

-------------------------------------------
|                document                 |
|  --------    -------------------        |
|  | img  |    | second image    |        |
|  --------    |                 |        |
|              -------------------        |
|  -------------------------------------  |
|  | form                              |  |
|  |    ---------------------          |  |
|  |    | address           |          |  |
|  |    ---------------------          |  |
|  -------------------------------------  |
-------------------------------------------


The first image has name="thefirst", the second has name="thesecond". Then the first image can be accessed by either of these two calls:

document.images['thefirst']
document.images[0]

The second one can be accessed by either of these calls:

document.images['thesecond']
document.images[1]

The first call is by name: simply fill in the name (between quotes, it's a string!) within the [] and you're ready.

The second call is by number. Each image gets a number in the document.images array, in order of appearance in the source code. So the first image on a page is document.images[0], the second one is document.images[1] etc.


The same goes for forms. Suppose the form on the page has name="contactform", then you can reach it by these two calls:

document.forms['contactform']
document.forms[0]

But in the case of forms, usually you don't want to access just the form, but a specific form field. No problem, for each form the browser automatically creates the array document.forms[].elements[] that contains all elements in the form.

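The form above holds as first element an <input name="address">. You can access it by these four calls:

document.forms['contactform'].elements['address']
document.forms['contactform'].elements[0]
document.forms[0].elements['address']
document.forms[0].elements[0]

These four calls are completely interchangeable: it's allowed to first use one, then another. Exactly which method of access you use depends on your script.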
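Because names and numbers are interchangeable, one small function can handle both. This is only a sketch: getField is a hypothetical helper of mine, the names 'contactform' and 'address' are the ones assumed in the example page above, and the document is passed in as doc so the function can be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper: `form` and `field` may each be a name (string)
// or a number (position in source order). Returns null when either is missing.
function getField(doc, form, field) {
  var f = doc.forms[form];
  if (!f) return null;
  return f.elements[field] || null;
}

// In a browser: reach the same field once by name and once by number.
if (typeof document !== 'undefined' && document.forms) {
  var byName = getField(document, 'contactform', 'address');
  var byNumber = getField(document, 0, 0);
  console.log(byName ? 'found by name' : 'not found by name');
  console.log(byNumber ? 'found by number' : 'not found by number');
}
```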

Doing what you need to do

Once you have correctly accessed a form field or an image through the Level 0 DOM, you'll have to do what you want to do. Images are usually accessed to create a mouseover effect that changes the property src of an image:

document.images['thefirst'].src = 'another_image.gif';
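For a mouseover effect you usually want to put the old image back afterwards. A minimal sketch, assuming a hypothetical helper swapImage that changes the src and hands back the previous value so you can restore it on mouseout:

```javascript
// Hypothetical helper: swaps the src of an image (by name or number)
// and returns the old src, or null when there is no such image.
function swapImage(doc, image, newSrc) {
  var img = doc.images[image];
  if (!img) return null;
  var oldSrc = img.src;
  img.src = newSrc;
  return oldSrc;
}

// In a browser: swap the first named image, if the page has one.
if (typeof document !== 'undefined' && document.images) {
  var previous = swapImage(document, 'thefirst', 'another_image.gif');
  console.log(previous === null ? 'no such image' : 'old src was ' + previous);
}
```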

Usually you want to access forms to check what a user has filled in. For instance, to read out what the user has typed into a field, check its value property:

var x = document.forms[0].elements[0].value;

and then check x for whatever is necessary. See the introduction to forms for details on how to access specific form fields (checkboxes, radio buttons etc.).
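As an illustration of "checking x", here is a sketch of the simplest check there is: has the user filled in anything at all? The helper isFilledIn is hypothetical, and treating a value of only spaces as empty is my own choice, not a rule of the Level 0 DOM.

```javascript
// Hypothetical check: true when the given field of the given form
// contains something other than whitespace. `form` and `field` may be
// names or numbers. A missing form or field counts as not filled in.
function isFilledIn(doc, form, field) {
  var f = doc.forms[form];
  var el = f ? f.elements[field] : null;
  if (!el) return false;
  var x = el.value;
  // Strip leading and trailing whitespace before testing for emptiness
  return x.replace(/^\s+|\s+$/g, '') !== '';
}

// In a browser: check the first field of the first form, if there is one.
if (typeof document !== 'undefined' && document.forms) {
  console.log(isFilledIn(document, 0, 0) ? 'filled in' : 'empty or missing');
}
```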