A Survey of the JavaScript Programming Language

Douglas Crockford
www.crockford.com

© 2002 Douglas Crockford

Overview

This document is an introduction to the JavaScript Programming Language for professional programmers. It is a small language, so if you are familiar with other languages, then this won't be too demanding.

JavaScript is not Java. They are two very different languages. JavaScript is not a subset of Java. It is not interpreted Java. (Java is interpreted Java!) JavaScript shares C-family syntax with Java, but at a deeper level it shows greater similarity to the languages Scheme and Self. It is a small language, but it is also a surprisingly powerful and expressive language. You should take a look at it. You will find that it is not a toy language, but a full programming language with many distinctive properties.

JavaScript is a regular language which won't take much time to learn. It is better suited to some tasks, such as client programming, than Java is. In my own practice, I have found that working with JavaScript has made me a better Java programmer because it introduced me to a useful set of dynamic techniques.

When JavaScript was first introduced, I dismissed it as being not worth my attention. Much later, I took another look at it and discovered that hidden in the browser was an excellent programming language. My initial attitudes were based on the initial positioning of JavaScript by Sun and Netscape. They made many misstatements about JavaScript in order to avoid positioning JavaScript as a competitor to Java. Those misstatements continue to echo in the scores of badly written JavaScript books aimed at the dummies and amateurs market.

History

JavaScript was developed by Brendan Eich at Netscape as the in-page scripting language for Navigator 2. It is a remarkably expressive dynamic programming language. Because of its linkage to web browsers, it instantly became massively popular. It never got a trial period in which it could be corrected and polished based on actual use. The language is powerful and flawed.

This document describes ECMAScript Edition 3 (aka JavaScript 1.5). Microsoft and Netscape are developing a static revision which does not correct the language's flaws. That new language is not JavaScript and is beyond the scope of this document.

Data Types

JavaScript contains a small set of data types. It has the three primitive types boolean, number, and string and the special values null and undefined. Everything else is variations on the object type.

Boolean has two values: true and false.

Number is 64-bit floating point, similar to Java's double and Double. There is no integer type. Division between two integers may produce a fractional result. Number also includes the special values NaN (not a number) and Infinity.
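A short sketch of the number behavior described above. The variable names are only illustrative.

```javascript
var quotient = 7 / 2;        // 3.5: two integer-valued operands, fractional result
var notANumber = 0 / 0;      // NaN
var unbounded = 1 / 0;       // Infinity

// NaN is not equal to anything, including itself, so test for it with isNaN.
var nanEqualsItself = (notANumber === notANumber);   // false
var detected = isNaN(notANumber);                    // true
```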

String is a sequence of zero or more Unicode characters. There is no separate character type. A character is represented as a string of length 1. Literal strings are quoted using the ' or " characters. The quote characters can be used interchangeably, but they have to match.

'This is a string.'
"Isn't this a string? Yes!"
'A' // The character A
"" // An empty string

Escapement is done with the \ character, like in Java. Strings are immutable. Strings have a length member which is used to determine the number of characters in the string.

var s = "Hello World!";
s.length == 12

It is possible to add methods to the simple types. So, for example, you can add an integer() method to all numbers, so that Math.PI.integer() produces 3.
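One way such a method might be added is by assigning to Number.prototype. This is a minimal sketch, not a built-in feature. It uses the hypothetical name integer rather than int, because int is on the reserved word list and so could not be reached with dot notation.

```javascript
// Hypothetical method: integer is not part of the language.
Number.prototype.integer = function () {
    // Truncate toward zero: floor for positive values, ceil for negative ones.
    return this < 0 ? Math.ceil(this) : Math.floor(this);
};

var three = Math.PI.integer();       // 3
var minusTwo = (-2.7).integer();     // -2
```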

An implementation may provide other types, such as Dates and Regular Expressions, but these are really just objects. Everything else is just objects.

Objects

JavaScript has very nice notational conveniences for manipulating hashtables.

var myHashtable = {};

This statement makes a new hashtable and assigns it to a new local variable. JavaScript is loosely typed, so we don't use type names in declarations. We use subscript notation to add, replace, or retrieve elements in the hashtable.

myHashtable["name"] = "Carl Hollywood";

There is also a dot notation which is a little more convenient.

myHashtable.city = "Anytown";

The dot notation can be used when the subscript is a string constant in the form of a legal identifier. Because of an error in the language definition, reserved words cannot be used in the dot notation, but they can be used in the subscript notation.

You can see that JavaScript's hashtable notation is very similar to Java's object and array notations. JavaScript takes this much farther: objects and hashtables are the same thing, so I could have written

var myHashtable = new Object();

and the result would have been exactly the same.

There is an enumeration capability built into the for statement.

for (var n in myHashtable) {
    if (myHashtable.hasOwnProperty(n)) {
        document.writeln("<p>" + n + ": " + myHashtable[n] + "</p>");
    }
}

The result will be

<p>name: Carl Hollywood</p>
<p>city: Anytown</p>

An object is a referenceable container of name/value pairs. The names are strings (or other elements such as numbers that are converted to strings). The values can be any of the data types, including other objects. Objects are usually implemented as hash-tables, but none of the hash-table nature (such as hash functions or rehashing methods) is visible.

Objects can easily be nested inside of other objects, and expressions can reach into the inner objects.

this.div = document.body.children[document.body.children.length - 1];

In the object literal notation, an object description is a set of comma-separated name/value pairs inside curly braces. The names can be identifiers or strings followed by a colon. Because of an error in the language definition, reserved words cannot be used in the identifier form, but they can be used in the string form. The values can be literals or expressions of any type.

var myObject = {name: "Jack B. Nimble", 'goto': 'Jail', grade: 'A', level: 3};
return {
    event: event,
    op: event.type,
    to: event.srcElement,
    x: event.clientX + document.body.scrollLeft,
    y: event.clientY + document.body.scrollTop};
emptyObject = {};

JavaScript's object literals are the basis of the JSON data interchange format.

New members can be added to any object at any time by assignment.

myObject.nickname = 'Jackie the Bee';

Arrays and functions are implemented as objects.

Arrays

Arrays in JavaScript are also hashtable objects. This makes them very well suited to sparse array applications. When you construct an array, you do not need to declare a size. Arrays grow automatically, much like Java vectors. The values are located by a key, not by an offset. This makes JavaScript arrays very convenient to use, but not well suited for applications in numerical analysis.

The main difference between objects and arrays is the length property. The length property is always 1 larger than the largest integer key in the array. There are two ways to make a new array:

var myArray = [];
var myArray = new Array();

Arrays are not typed. They can contain numbers, strings, booleans, objects, functions, and arrays. You can mix strings and numbers and objects in the same array. You can use arrays as general nested sequences, much like s-expressions. The first index in an array is usually zero.

When a new item is added to an array and the subscript is an integer that is greater than or equal to the current value of length, then the length is changed to the subscript plus one. This is a convenience feature that makes it easy to use a for loop to go through the elements of an array.
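The length rule can be seen in a small sketch. The array and its contents are made up for illustration.

```javascript
var sparse = [];
sparse[0] = 'first';
sparse[9] = 'tenth';     // subscript 9 is beyond the old length, so length becomes 10

// A for loop visits every subscript below length, including the empty ones.
var filled = 0;
for (var i = 0; i < sparse.length; i += 1) {
    if (sparse[i] !== undefined) {
        filled += 1;
    }
}
```

Here sparse.length is 10 even though only two elements were stored, and the unassigned elements read as undefined.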

Arrays have a literal notation, similar to that for objects.

myList = ['oats', 'peas', 'beans', 'barley'];

emptyArray = [];

month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

slides = [
    {url: 'slide0001.html', title: 'Looking Ahead'},
    {url: 'slide0008.html', title: 'Forecast'},
    {url: 'slide0021.html', title: 'Summary'}
];

A new item can be added to an array by assignment.

a[i + j] = f(a[i], a[j]);

Functions

Functions in JavaScript look like C functions, except that they are declared with the function keyword instead of a type. When calling a function, it is not required that you pass a fixed number of parameters. Excess parameters are ignored. Missing parameters are given the value undefined. This makes it easy to write functions that deal with optional arguments.
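A minimal sketch of an optional argument, assuming a hypothetical greet function. The missing parameter arrives as undefined, and the extra one is simply ignored.

```javascript
function greet(name, title) {
    if (title === undefined) {       // the caller omitted the second parameter
        title = 'friend';
    }
    return 'Hello, ' + title + ' ' + name + '.';
}

var shortCall = greet('Carl');                     // 'Hello, friend Carl.'
var longCall = greet('Carl', 'Mr.', 'ignored');    // 'Hello, Mr. Carl.'
```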

A function has access to an arguments array (strictly, an array-like object with a length member). It contains all of the parameters that were actually sent by the caller. It makes it easy to deal with functions taking a variable number of arguments. For example,

function sum() {  // Take any number of parameters and return the sum
    var total = 0;
    for (var i = 0; i < arguments.length; ++i) {
        total += arguments[i];
    }
    return total;
}

JavaScript has inner functions, which serve the same purpose as inner classes in Java, but are much lighter. JavaScript also has anonymous functions, which act as lambda expressions. Functions have lexical scoping.

Functions are first class objects in JavaScript. That means that they can be stored in objects and passed as arguments to functions.

Definition

There are three notations for defining functions: function statement, function operator, and function constructor.

Function statement

The function statement creates a named function within the current scope.

function name(argumentlist) block

Functions can be nested. See Closures. An argumentlist is zero or more argument names, separated with commas. A block is a list of zero or more statements enclosed in { }.

The function statement is a shorthand for the function operator form:

var name = function name (argumentlist) block ;

Function operator

The function operator is a prefix operator that produces a function object. It looks similar to the function statement.

function name(argumentlist) block

The name is optional. If it is provided, then it can be used by the function body to call itself recursively. It can also be used to access the function's object members (except on IE). If the name is omitted, then it is an anonymous function.

The function operator is commonly used to assign functions to a prototype.

The function operator can also be used to define functions in-place, which is handy when writing callbacks.
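The three uses of the function operator mentioned above might look like this. The names fac and Stack are hypothetical and chosen only for the sketch.

```javascript
// A named function operator: the name fac is available inside the body.
var factorial = function fac(n) {
    return n < 2 ? 1 : n * fac(n - 1);
};

// An anonymous function defined in place as a callback for sort.
var sorted = [10, 9, 1].sort(function (a, b) {
    return a - b;
});

// Assigning a function to a prototype.
function Stack() {
    this.items = [];
}
Stack.prototype.push = function (item) {
    this.items[this.items.length] = item;
    return this.items.length;
};
```

Without the callback, sort would compare the elements as strings, which is rarely what is wanted for numbers.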

Function constructor

The function constructor takes strings containing the arguments and body, and produces a function object.

new Function(strings...)

Do not use this form. The quoting conventions of the language make it very difficult to correctly express a function body as a string. In the string form, early error checking cannot be done. It is slow because the compiler must be invoked every time the constructor is called. And it is wasteful of memory because each function requires its own independent implementation.

Objects and this

A function is an object. It can contain members just as other objects can. This allows a function to contain its own data tables. It also allows an object to act as a class, containing a constructor and a set of related methods.

A function can be a member of an object. When a function is a member of an object, it is called a method. There is a special variable, called this that is set to the object when a method of the object is called.

For example, in the expression foo.bar(), the this variable is set to the object foo as a sort of extra argument for the function bar. The function bar can then refer to this to access the object of interest.

In a deeper expression like do.re.mi.fa(), the this variable is set to the object do.re.mi, not to the object do. In a simple function call, this is set to the Global Object (aka window), which is not very useful. The correct behavior should have been to preserve the current value of this, particularly when calling inner functions.
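A small sketch of how this is bound in a method call. The objects band, singer, and roadie are hypothetical. The same function object gives different answers depending on the object it is called through.

```javascript
var band = {
    name: 'outer',
    singer: {
        name: 'inner',
        whoAmI: function () {
            return this.name;
        }
    }
};

// this is band.singer, the object immediately before the final dot.
var viaSinger = band.singer.whoAmI();     // 'inner'

// The same function stored in another object sees that object as this.
var roadie = {name: 'roadie', whoAmI: band.singer.whoAmI};
var viaRoadie = roadie.whoAmI();          // 'roadie'
```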

Constructor

Functions which are used to initialize objects are called constructors. The calling sequence for a constructor is slightly different from that for ordinary functions. A constructor is called with the new prefix:

new Constructor(parameters...)

By convention, the name of a constructor is written with an initial capital.

The new prefix changes the meaning of the this variable. Instead of its usual value, this will be the new object. The body of the constructor function will usually initialize the object's members. The constructor will return the new object, unless explicitly overridden with the return statement.

The constructed object will contain a secret prototype link field, which contains a reference to the constructor's prototype member.
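A minimal sketch of a constructor, using the hypothetical names Point and Wrapped. The second constructor shows the return statement overriding the new object.

```javascript
function Point(x, y) {
    // With the new prefix, this is the freshly made object.
    this.x = x;
    this.y = y;
}
var p = new Point(3, 4);        // p.x is 3, p.y is 4

function Wrapped() {
    this.kept = false;
    return {kept: true};        // an explicit object return replaces the new object
}
var w = new Wrapped();          // w.kept is true
```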

Prototype

Objects contain a hidden link property. This link points to the prototype member of the constructor of the object.

When items are accessed from an object by the dot notation or the subscript notation, if the item is not found in the object then the link object is examined. If it is not found in the link object, and if the link object itself has a link object, then that link object is examined. If the chain of link objects is exhausted, then undefined is returned.

This use of prototype link chains provides a sort of inheritance.

Members can be added to the prototype by assignment. Here we define a new class Demo, which inherits from class Ancestor, and adds its own method foo.

function Demo() {}
Demo.prototype = new Ancestor();
Demo.prototype.foo = function () {};
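A fuller, runnable sketch of the same pattern follows. The greet method and kind member of Ancestor are assumptions added so that the lookup through the link chain has something to find.

```javascript
function Ancestor() {}
Ancestor.prototype.kind = 'ancestor';
Ancestor.prototype.greet = function () {
    return 'hello from ' + this.kind;
};

function Demo() {}
Demo.prototype = new Ancestor();      // link Demo's prototype to Ancestor's
Demo.prototype.kind = 'demo';         // shadows Ancestor.prototype.kind
Demo.prototype.foo = function () {
    return 'foo';
};

var d = new Demo();
var greeting = d.greet();                    // 'hello from demo'
var ownGreet = d.hasOwnProperty('greet');    // false: greet was found through the chain
var missing = d.missing;                     // undefined: the chain was exhausted
```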

Vars

Named variables are defined with the var statement. When used inside of a function, var defines variables with function-scope. The vars are not accessible from outside of the function. There is no other granularity of scope in JavaScript. In particular, there is no block-scope.
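The lack of block scope can be seen in a short sketch. The function name scopeDemo is hypothetical.

```javascript
function scopeDemo() {
    if (true) {
        var inner = 'declared inside a block';
    }
    return inner;      // still visible: the block did not create a new scope
}

var seenInside = scopeDemo();       // 'declared inside a block'
var seenOutside = typeof inner;     // 'undefined': inner is local to scopeDemo
```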

Any variables used in a function which are not explicitly defined as var are assumed to belong to an outer scope, possibly to the Global Object.

Vars which are not explicitly initialized are given the value undefined.

Vars are not typed. A var can contain a reference to an object, or a string or a number or a boolean or null or undefined.

A new set of vars is made every time the function is called. This allows functions to be recursive.

Closure

Functions can be defined inside of other functions. The inner function has access to the vars and parameters of the outer function. If a reference to an inner function survives (for example, as a callback function), the outer function's vars also survive.
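A minimal sketch of a closure, assuming a hypothetical makeCounter function. The var count survives because the returned inner function still refers to it.

```javascript
function makeCounter(start) {
    var count = start;          // private to this call of makeCounter
    return function () {
        count += 1;
        return count;
    };
}

var next = makeCounter(10);
var first = next();                     // 11
var second = next();                    // 12
var independent = makeCounter(0)();     // 1: each call gets its own count
```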

Return

JavaScript does not have a void type, so every function must return a value. The default value is undefined, except for constructors, where the default return value is this.

Statements

The set of named statements includes var, if, switch, for, while, do, break, continue, return, try, throw, and with. Most of them work the same as in other C-like languages.

The var statement is a list of one or more variable names, separated by commas, with optional initialization expressions.

var a, b = window.document.body;

If the var statement appears outside of any function, it adds members to the Global Object. If it appears inside of a function, it defines local variables of the function.

In if statements, while statements, do statements, and logical operators, JavaScript treats false, null, undefined, "" (the empty string), the number 0, and NaN as false. All other values are treated as true.

The case labels in a switch statement can be expressions. They don't have to be constants. They can be strings.
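A sketch of a switch with string and non-constant case labels. The function classify and its parameters are hypothetical.

```javascript
function classify(word, special) {
    switch (word) {
    case 'yes':               // a string constant
        return 1;
    case special:             // a variable, not a constant
        return 2;
    case 'n' + 'o':           // an expression evaluated at run time
        return 0;
    default:
        return -1;
    }
}

var a = classify('yes', 'maybe');      // 1
var b = classify('maybe', 'maybe');    // 2
var c = classify('no', 'maybe');       // 0
var d = classify('other', 'maybe');    // -1
```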

There are two forms of the for statement. The first is the common (init; test; inc) form. The second is an object iterator.

for (name in object) {
    if (object.hasOwnProperty(name)) {
        value = object[name];
    }
}

The block is executed for each name in the object. The order in which the names are produced is not guaranteed.

Statements can have a label prefix, which is an identifier followed with a colon.
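Labels are mostly useful with break and continue in nested loops. A minimal sketch, assuming a hypothetical findPair function that searches an array of arrays:

```javascript
function findPair(rows, target) {
    var found = null;
    outer: for (var r = 0; r < rows.length; r += 1) {
        for (var c = 0; c < rows[r].length; c += 1) {
            if (rows[r][c] === target) {
                found = [r, c];
                break outer;        // leave both loops at once
            }
        }
    }
    return found;
}

var hit = findPair([[1, 2], [3, 4]], 3);     // [1, 0]
var miss = findPair([[1, 2], [3, 4]], 9);    // null
```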

The with statement should not be used.

Operators

JavaScript has a fairly large set of operators. Most of them work the same way as in other C-like languages. There are a few differences to watch out for.

The + operator is used for both addition and concatenation. If either of the operands is a string, it concatenates. This can cause errors. For example, '$' + 3 + 4 produces '$34', not '$7'.

+ can be used as a prefix operator, converting its string operand to a number.

!! can be used as a prefix operator, converting its operand to a boolean.
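A few illustrative uses of the two prefix conversions. The variable names are made up for the sketch.

```javascript
var count = +'42';           // the number 42
var total = +'3' + +'4';     // 7, where '3' + '4' would have produced '34'
var notNumeric = +'abc';     // NaN: the string does not describe a number

var hasText = !!'text';      // true
var isEmpty = !!'';          // false
var zeroFlag = !!0;          // false
```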

The && operator is commonly called logical and. It can also be called guard. If the first operand is false, null, undefined, "" (the empty string), the number 0, or NaN, then it returns the first operand. Otherwise, it returns the second operand. This provides a convenient way to write a null-check:

var value = p && p.name; /* The name value will
only be retrieved from p if p has a value, avoiding an error. */

The || operator is commonly called logical or. It can also be called default. If the first operand is false, null, undefined, "" (the empty string), the number 0, or NaN, then it returns the second operand. Otherwise, it returns the first operand. This provides a convenient way to specify default values:

value = v || 10; /* Use the value of v, but if v
doesn't have a value, use 10 instead. */
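The guard and default idioms can be combined, but the default idiom has a catch: a legitimate 0 or empty string is also replaced. A sketch, assuming a hypothetical widthOrDefault function:

```javascript
function widthOrDefault(options) {
    var width = options && options.width;    // guard: safe even if options is null
    return width || 10;                      // default: use 10 when width is falsy
}

var fromNull = widthOrDefault(null);             // 10
var fromEmpty = widthOrDefault({});              // 10
var fromValue = widthOrDefault({width: 25});     // 25
var fromZero = widthOrDefault({width: 0});       // 10, not 0: zero counts as "no value"
```

When 0 is a meaningful value, an explicit comparison against undefined is the safer test.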

JavaScript supplies a set of bitwise and shift operators, but does not have an Integer type to apply them to. What happens is the Number operand (a 64-bit floating-point number) is converted to a 32-bit integer before the operation, and then converted back to floating point after the operation.
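The round trip through a 32-bit integer has visible effects, as this sketch suggests. The variable names are illustrative.

```javascript
var truncated = 3.7 | 0;            // 3: the fraction is dropped by the conversion
var wrapped = 2147483648 | 0;       // -2147483648: 2^31 does not fit in a signed 32-bit integer
var unsigned = -1 >>> 0;            // 4294967295: >>> produces an unsigned 32-bit result
var lowBits = 13 & 6;               // 4: binary 1101 & 0110 is 0100
```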

In JavaScript, void is a prefix operator, not a type. It always returns undefined. This has very little value. I only mention it in case you accidentally type void out of habit and are puzzled by the strange behavior.

The typeof operator returns a string based on the type of its operand.

Mistakes were made.

Operand     typeof result
Object      'object'
Array       'object'
Function    'function'
String      'string'
Number      'number'
Boolean     'boolean'
null        'object'
undefined   'undefined'
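Because typeof reports 'object' for arrays and for null, programs often wrap it. The following is one possible sketch, not a standard function. The name typeName is hypothetical, and the array test relies on the string that Object.prototype.toString produces for arrays.

```javascript
function typeName(value) {
    if (value === null) {
        return 'null';
    }
    if (Object.prototype.toString.call(value) === '[object Array]') {
        return 'array';
    }
    return typeof value;
}

var forArray = typeName([]);                  // 'array'
var forNull = typeName(null);                 // 'null'
var forObject = typeName({});                 // 'object'
var forFunction = typeName(function () {});   // 'function'
```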

Potpourri

Global Object

The Global Object is the keeper of all of the functions and variables which were not defined inside of other functions and objects. Surprisingly, the Global Object does not have an explicit name in the language. Sometimes the this variable points at it, but often not. In web browsers, window and self are members of the Global Object which point to the Global Object, thus giving an indirect way of addressing it.

If a variable is accessed, but is not found in the current scope, it is looked for in the Global Object. If it is not found there, an error will result.

The ECMAScript specification does not talk about the possibility of multiple Global Objects, or contexts, but browsers support this. Each window has its own Global Object.

Semicolon Insertion

One of the mistakes in the language is semicolon insertion. This is a technique for making semicolons optional as statement terminators. It is reasonable for IDEs and shell programs to do semicolon insertion. It is not reasonable for the language definition to require compilers to do it. Use semicolons.
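A well-known hazard shows why this matters. In the sketch below (the function names are hypothetical), a line break after return causes a semicolon to be inserted, so the object literal is never returned.

```javascript
function broken() {
    return              // a semicolon is inserted here
    {
        ok: true        // parsed as a block containing a labeled statement
    };
}

function intended() {
    return {
        ok: true
    };
}

var lost = broken();         // undefined
var kept = intended().ok;    // true
```

Keeping the opening brace on the same line as return avoids the problem.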

Reserved Words

JavaScript is very heavy handed in its restrictions on reserved words. The reserved words are

abstract
boolean break byte
case catch char class const continue
debugger default delete do double
else enum export extends
false final finally float for function
goto
if implements import in instanceof int interface
long
native new null
package private protected public
return
short static super switch synchronized
this throw throws transient true try typeof
var volatile void
while with

Most of those words are not even used in the language. A reserved word cannot be used

  1. As a name in literal object notation
  2. As a member name in dot notation
  3. As a function argument
  4. As a var
  5. As an unqualified global variable
  6. As a statement label

There is no excuse for the first two restrictions. None. There is an excuse for the third and fourth restrictions, but it is very weak.