Identifier Resolution and Closures in the JavaScript Scope Chain

From my previous post, we now know that every function has an associated execution context that contains a variable object [VO], which is composed of all the variables, functions and parameters defined inside that given local function.

The scope chain property of each execution context is simply a collection of the current context's [VO] + all parent execution contexts' [VO]s.

Scope = VO + All Parent VOs
Eg: scopeChain = [ [VO] + [VO1] + [VO2] + [VO n+1] ];

Determining a Scope Chain’s Variable Objects [VO]s


We now know that the first [VO] of the scope chain belongs to the current execution context, and we can find the remaining parent [VO]s by looking at the parent context’s scope chain:

function one() {

    two();

    function two() {

        three();

        function three() {
            alert('I am at function three');
        }

    }

}

one();

The example is straightforward: starting from the global context we call one(), one() calls two(), which in turn calls three(), thus alerting that it is at function three. At the time alert('I am at function three') is fired, the call stack holds the global context, one(), two() and three(). We can see that the scope chain at this point in time looks as follows:

three() Scope Chain = [ [three() VO] + [two() VO] + [one() VO] + [Global VO] ];

Lexical Scope


An important feature of JavaScript to note is that the interpreter uses Lexical Scoping, as opposed to Dynamic Scoping. This is just a complicated way of saying that all inner functions are statically (lexically) bound to the parent context in which the inner function was physically defined in the program code.

In the previous example, it does not matter in which sequence the inner functions are called. three() will always be statically bound to two(), which in turn will always be bound to one(), and so on. This gives a chaining effect where all inner functions can access the outer function's VO through the statically bound Scope Chain.
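To see lexical scoping in action, here is a minimal sketch. The names label, show and caller are made up for illustration; the point is that show() resolves label where it was defined, not where it was called from.

```javascript
var label = 'global';

function show() {
    // show() was defined in the global context, so its scope chain
    // is [show VO, Global VO] no matter who calls it.
    return label;
}

function caller() {
    var label = 'local to caller'; // never consulted by show()
    return show();
}

var result = caller();
console.log(result); // 'global'
```

Under dynamic scoping the same call would have returned 'local to caller', because the lookup would have followed the call stack instead of the source code.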

This lexical scope is the source of confusion for many developers. We know that every invocation of a function will create a new execution context and associated VO, which holds the values of variables evaluated in the current context.

It is this dynamic, runtime evaluation of the VO, paired with the lexically (statically) defined scope of each context, that leads to unexpected results in program behaviour. Take the following classic example:

var myAlerts = [];

for (var i = 0; i < 5; i++) {
    myAlerts.push(
        function inner() {
            alert(i);
        }
    );
}

myAlerts[0](); // 5
myAlerts[1](); // 5
myAlerts[2](); // 5
myAlerts[3](); // 5
myAlerts[4](); // 5

At first glance, those new to JavaScript would assume alert(i); to be the value of i on each increment where the function was physically defined in the source code, alerting 0, 1, 2, 3 and 4 respectively.

This is the most common point of confusion. Function inner was created in the global context, therefore its scope chain is statically bound to the global context.

Lines 11 ~ 15 invoke inner(), which looks in inner.ScopeChain to resolve i, which is located in the global context. By the time of each invocation, i has already been incremented to 5, giving the same result every time inner() is called. The statically bound scope chain, which holds [VOs] from each context containing live variables, often catches developers by surprise.
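One common way to get the value of i at each iteration is to give every inner function its own parent context. The sketch below is one possible fix, not the only one: it wraps the push in an immediately invoked function with a hypothetical parameter named captured, and it returns values instead of alerting them so it can run outside the browser.

```javascript
var myFunctions = [];

for (var i = 0; i < 5; i++) {
    // Each call to this wrapper creates a new execution context whose
    // VO holds its own copy of the current value of i.
    (function (captured) {
        myFunctions.push(function inner() {
            return captured;
        });
    }(i));
}

var results = myFunctions.map(function (fn) {
    return fn();
});

console.log(results); // [0, 1, 2, 3, 4]
```

Each inner() now resolves captured in the wrapper's VO, which sits in its scope chain ahead of the global VO where i lives.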

Resolving the value of variables


The following example alerts the sum of variables a, b and c, which gives us a result of 6.

function one() {

    var a = 1;
    two();

    function two() {

        var b = 2;
        three();

        function three() {

            var c = 3;
            alert(a + b + c); // 6

        }

    }

}

one();

Line 14 is intriguing. At first glance it seems that a and b are not “inside” function three, so how can this code still work? To understand how the interpreter evaluates this code, we need to look at the scope chain of function three at the time line 14 was executed, which is [ [three() VO] + [two() VO] + [one() VO] + [Global VO] ].

When the interpreter executes line 14: alert(a + b + c), it resolves a first by looking into the scope chain and checking the first variable object, three's [VO]. It checks to see if a exists inside three's [VO] but cannot find any property with that name, so it moves on to check the next [VO].

The interpreter keeps checking each [VO] in sequence for the existence of the variable name. If the name is found, its value is returned to the code being evaluated; if none of the [VO]s contains it, the program throws a ReferenceError. Therefore, given the example above, you can see that a, b and c are all resolvable given function three’s scope chain.
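The lookup described above can be modelled in a few lines. This is only a simplified sketch of the idea, not how an engine is actually implemented: resolve and the plain objects standing in for each [VO] are illustrative names.

```javascript
// Walk the chain from the innermost VO outwards and return the first match.
function resolve(name, scopeChain) {
    for (var i = 0; i < scopeChain.length; i++) {
        if (Object.prototype.hasOwnProperty.call(scopeChain[i], name)) {
            return scopeChain[i][name];
        }
    }
    throw new ReferenceError(name + ' is not defined');
}

// Stand-ins for the variable objects of the example above.
var globalVO = {};
var oneVO = { a: 1 };
var twoVO = { b: 2 };
var threeVO = { c: 3 };

var threeScopeChain = [threeVO, twoVO, oneVO, globalVO];

var sum = resolve('a', threeScopeChain) +
          resolve('b', threeScopeChain) +
          resolve('c', threeScopeChain);

console.log(sum); // 6
```

Because the search stops at the first match, a variable declared in an inner function hides a variable with the same name further up the chain.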

How does this work with closures?


In JavaScript, closures are often regarded as some sort of magical unicorn that only advanced developers can really understand, but truth be told it is just a simple understanding of the scope chain. A closure, as Crockford says, is simply:

An inner function always has access to the vars and parameters of its outer function, even after the outer function has returned…

function foo() {
    var a = 'private variable';
    return function bar() {
        alert(a);
    }
}

var callAlert = foo();

callAlert(); // private variable

The code above is an example of a closure. The global context has a function named foo() and a variable named callAlert, which holds the returned value of foo(). What often surprises and confuses developers is that the private variable, a, is still available even after foo() has finished executing.

However, if we look at each of the contexts in detail, we will see the following:

// Global Context when evaluated
global.VO = {
    foo: pointer to foo(),
    callAlert: returned value of global.VO.foo,
    scopeChain: [global.VO]
}

// Foo Context when evaluated
foo.VO = {
    bar: pointer to bar(),
    a: 'private variable',
    scopeChain: [foo.VO, global.VO]
}

// Bar Context when evaluated
bar.VO = {
    scopeChain: [bar.VO, foo.VO, global.VO]
}

Now we can see that callAlert holds what foo() returned, which is the pointer to bar(), so invoking callAlert() enters bar(). On entering bar(), bar.VO.scopeChain is [bar.VO, foo.VO, global.VO].

By alerting a, the interpreter checks the first VO in the bar.VO.scopeChain for a property named a but can not find a match, so promptly moves on to the next VO, foo.VO.

It checks for the existence of the property and this time finds a match, returning the value back to the bar context, which explains why the alert gives us 'private variable' even though foo() had finished executing some time ago.
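Since every invocation of the outer function creates a new VO, each returned function keeps its own private copy of the outer variables. Here is a small sketch of that, using a hypothetical makeCounter function:

```javascript
function makeCounter() {
    var count = 0; // lives in the VO created for this call to makeCounter()

    return function increment() {
        count += 1; // resolved through the scope chain, one level up
        return count;
    };
}

var first = makeCounter();
var second = makeCounter();

first();
var firstTotal = first();   // 2
var secondTotal = second(); // 1, a separate VO and a separate count

console.log(firstTotal, secondTotal);
```

Neither counter can be reset or read from the outside, because count is only reachable through the scope chain of the increment function that closed over it.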

By this point in the article, we have covered the details of the scope chain and its lexical environment, along with how closures and variable resolution work. The rest of this article looks at some interesting situations in relation to those covered above.

Wait, how does the prototype chain affect variable resolution?


JavaScript is prototypal by nature and almost every value in the language, except for null and undefined, behaves like an object. When trying to access a property on an object, the interpreter will try to resolve it by looking for the existence of the property in the object. If it can’t find the property, it will continue to look up the prototype chain, which is an inherited chain of objects, until it finds the property or reaches the end of the chain.

This leads to an interesting question: does the interpreter resolve an object property using the scope chain or the prototype chain first? It uses both. When trying to resolve a property or identifier, the scope chain will be used first to locate the object. Once the object has been found, the prototype chain of that object will then be traversed looking for the property name. Let’s look at an example:

var bar = {};

function foo() {

    bar.a = 'Set from foo()';

    return function inner() {
        alert(bar.a);
    }

}

foo()(); // 'Set from foo()'

Line 5 creates the property a on the object bar, which lives in the global context, and sets its value to 'Set from foo()'. When inner() runs, the interpreter looks into the scope chain and, as expected, finds bar in the global context and the property a on it. Now, let's consider the following:

var bar = {};

function foo() {

    Object.prototype.a = 'Set from prototype';

    return function inner() {
        alert(bar.a);
    }

}

foo()(); // 'Set from prototype'

At runtime, we invoke inner(), which tries to resolve bar.a by looking in its scope chain for the existence of bar. It finds bar in the global context, and proceeds to search bar for a property named a. However, a was never set on bar, so the interpreter traverses the object’s prototype chain and finds a was set on Object.prototype.

It is this exact behavior which explains identifier resolution: locate the object in the scope chain, then proceed up the object’s prototype chain until the property is found, or return undefined if the end of the chain is reached.
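The two lookups fail in different ways, which is worth seeing side by side. In this sketch the names proto, bar and notDeclaredAnywhere are illustrative only, and Object.create is used so that Object.prototype itself is left untouched.

```javascript
var proto = { a: 'Set on the prototype' };
var bar = Object.create(proto); // bar's prototype chain: bar -> proto -> Object.prototype

var inherited = bar.a;          // not on bar itself, found one step up the prototype chain

bar.a = 'Set on bar';
var own = bar.a;                // an own property shadows the inherited one

var missingProperty = bar.b;    // end of the prototype chain reached: undefined

var missingIdentifier;
try {
    notDeclaredAnywhere;        // end of the scope chain reached: ReferenceError
} catch (e) {
    missingIdentifier = e.name;
}

console.log(inherited, own, missingProperty, missingIdentifier);
```

A failed prototype chain lookup quietly gives you undefined, while a failed scope chain lookup throws.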

When to use Closures?


Closures are a powerful concept given to JavaScript and some of the most common situations to use them are:

  • Encapsulation

    Allows us to hide the implementation details of a context from outside scopes, while exposing a controlled public interface. This is commonly referred to as the module pattern or revealing module pattern.

  • Callbacks

    Perhaps one of the most powerful uses for closures is callbacks. JavaScript, in the browser, typically runs in a single threaded event loop, blocking other events from starting until one event has finished. Callbacks allow us to defer the invocation of a function, typically in response to an event completing, in a non-blocking manner. An example of this is making an AJAX call to the server and using a callback to handle the response, while still maintaining the bindings in which it was created.

  • Closures as arguments

    We can also pass closures as arguments to a function, which is a powerful functional paradigm for creating more graceful solutions for complex code. Take for example a minimum sort function. By passing closures as parameters, we could define the implementation for different types of data sorting, while still reusing a single function body as a schematic.
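As a rough sketch of two of these uses, the code below shows encapsulation with a module-style object and a closure passed as an argument to sort(). The names counter, byProperty and people are made up for the example.

```javascript
// Encapsulation: count is only reachable through the returned functions.
var counter = (function () {
    var count = 0;

    return {
        increment: function () {
            count += 1;
            return count;
        },
        current: function () {
            return count;
        }
    };
}());

counter.increment();
counter.increment();
console.log(counter.current()); // 2
console.log(counter.count);     // undefined, there is no public property

// Closures as arguments: byProperty() returns a comparator that
// remembers which key to compare on.
function byProperty(key) {
    return function (left, right) {
        if (left[key] < right[key]) { return -1; }
        if (left[key] > right[key]) { return 1; }
        return 0;
    };
}

var people = [
    { name: 'Carol', age: 41 },
    { name: 'Alice', age: 35 },
    { name: 'Bob', age: 28 }
];

function names(list) {
    return list.map(function (person) { return person.name; });
}

var byName = names(people.slice().sort(byProperty('name')));
var byAge = names(people.slice().sort(byProperty('age')));

console.log(byName); // ['Alice', 'Bob', 'Carol']
console.log(byAge);  // ['Bob', 'Alice', 'Carol']
```

The sort call itself never changes; only the closure handed to it decides what "smaller" means.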

When not to use Closures?


Although closures are powerful, they should be used sparingly due to some performance concerns:

  • Large scope lengths

    Multiple nested functions are a typical sign that you might run into some performance issues. Remember, every time you need to evaluate a variable, the Scope Chain must be traversed to find the identifier, so it goes without saying that the further down the chain the variable is defined, the longer the lookup time.

Garbage collection


JavaScript is a garbage collected language, which means developers generally don’t have to worry about memory management, unlike lower level programming languages. However, this automatic garbage collection often leads developers' applications to suffer from poor performance and memory leaks.

Different JavaScript engines implement garbage collection slightly differently, since ECMAScript does not define how the implementation should be handled, but the same philosophy can apply across engines when trying to create high performance, leak free JavaScript code. Generally speaking, the garbage collector will try to free the memory of objects when they cannot be referenced by any other live object running in the program, or are unreachable.

Circular references

This leads us to closures, and the possibility of circular references in a program, which is a term used to describe a situation where one object references another object, and that object points back to the first object. Closures are especially susceptible to leaks: remember that an inner function can reference a variable defined further up the scope chain even after the parent has finished executing and returned. Most JavaScript engines handle these situations quite well (damn you IE), but it’s still worth noting and taking into consideration when doing your development.

For older versions of IE, referencing a DOM element would often cause memory leaks. Why? In IE, the JavaScript (JScript?) engine and DOM both have their own individual garbage collector. So when referencing a DOM element from JavaScript, the native collector hands off to the DOM and the DOM collector points back to native, resulting in neither collector knowing about the circular reference.

Summary


From working with many developers over the past few years, I often found that the concepts of scope chain and closures were known about, but not truly understood in detail. I hope this article has helped to take you from knowing the basic concept, to an understanding in more detail and depth.

Going forward, you should be armed with all the knowledge you need to determine how the resolution of variables, in any situation, works when writing your JavaScript. Happy coding!

What is the Execution Context & Stack in JavaScript?

In this post I will take an in-depth look at one of the most fundamental parts of JavaScript, the Execution Context. By the end of this post, you should have a clearer understanding about what the interpreter is trying to do, why some functions / variables can be used before they are declared and how their value is really determined.

What is the Execution Context?


When code is run in JavaScript, the environment in which it is executed is very important, and is evaluated as 1 of the following:

  • Global code – The default environment where your code is executed for the first time.
  • Function code – Whenever the flow of execution enters a function body.
  • Eval code – Text to be executed inside the internal eval function.

You can read a lot of resources online that refer to scope, and for the purpose of this article to make things easier to understand, let’s think of the term execution context as the environment / scope the current code is being evaluated in. Now, enough talking, let’s see an example that includes both global and function / local context evaluated code.

Nothing special is going on here, we have 1 global context represented by the purple border and 3 different function contexts represented by the green, blue and orange borders. There can only ever be 1 global context, which can be accessed from any other context in your program.

You can have any number of function contexts, and each function call creates a new context, which creates a private scope where anything declared inside of the function can not be directly accessed from outside the current function scope. In the example above, a function can access a variable declared outside of its current context, but an outside context can not access the variables / functions declared inside. Why does this happen? How exactly is this code evaluated?
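As a small illustration of that one-way visibility, here is a sketch with made-up names (globalValue, outer, inner and outerValue). typeof is used for the outside check because reading an undeclared identifier directly would throw.

```javascript
var globalValue = 'declared in the global context';

function outer() {
    var outerValue = 'declared inside outer()';

    function inner() {
        // inner() can see its own parent and the global context.
        return globalValue + ' / ' + outerValue;
    }

    return inner();
}

var seenFromInside = outer();
var seenFromOutside = typeof outerValue; // the global context cannot see into outer()

console.log(seenFromInside);  // 'declared in the global context / declared inside outer()'
console.log(seenFromOutside); // 'undefined'
```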

Execution Context Stack


The JavaScript interpreter in a browser is implemented as a single thread. What this actually means is that only 1 thing can ever happen at one time in the browser, with other actions or events being queued in what is called the Execution Stack. The diagram below is an abstract view of a single threaded stack:

As we already know, when a browser first loads your script, it enters the global execution context by default. If, in your global code you call a function, the sequence flow of your program enters the function being called, creating a new execution context and pushing that context to the top of the execution stack.

If you call another function inside this current function, the same thing happens. The execution flow of code enters the inner function, which creates a new execution context that is pushed to the top of the existing stack. The browser will always execute the current execution context that sits on top of the stack, and once the function completes executing the current execution context, it will be popped off the top of the stack, returning control to the context below in the current stack. The example below shows a recursive function and the program’s execution stack:

(function foo(i) {
    if (i === 3) {
        return;
    }
    else {
        foo(++i);
    }
}(0));

The code simply calls itself 3 times, incrementing the value of i by 1. Each time the function foo is called, a new execution context is created. Once a context has finished executing, it pops off the stack and control returns to the context below it until the global context is reached again.
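To watch the contexts being pushed and popped, the recursive example can be instrumented with a log. This is a sketch with a hypothetical trace array, and it passes i + 1 rather than ++i so that each context keeps its own value of i for the second log line.

```javascript
var trace = [];

(function foo(i) {
    trace.push('push foo(' + i + ')'); // a new context goes on top of the stack
    if (i < 3) {
        foo(i + 1);
    }
    trace.push('pop foo(' + i + ')');  // this context is finished and comes off
}(0));

console.log(trace.join('\n'));
// push foo(0)
// push foo(1)
// push foo(2)
// push foo(3)
// pop foo(3)
// pop foo(2)
// pop foo(1)
// pop foo(0)
```

The last context pushed is always the first one popped, which is exactly the behaviour of a stack.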

There are 5 key points to remember about the execution stack:

  • Single threaded.
  • Synchronous execution.
  • 1 Global context.
  • Any number of function contexts, limited in practice by the engine's maximum stack size.
  • Each function call creates a new execution context, even a call to itself.

Execution Context in Detail


So we now know that every time a function is called, a new execution context is created. However, inside the JavaScript interpreter, each new execution context goes through 2 stages:

  1. Creation Stage [when the function is called, but before it executes any code inside]:
    • Create variables, functions and arguments.
    • Create the Scope Chain.
    • Determine the value of "this".
  2. Activation / Code Execution Stage:
    • Assign values, references to functions and interpret / execute code.

It is possible to represent each execution context conceptually as an object with 3 properties:

executionContextObj = {
    variableObject: { /* function arguments / parameters, inner variable and function declarations */ },
    scopeChain: { /* variableObject + all parent execution context's variableObject */ },
    this: {}
}

Activation / Variable Object [AO/VO]


This executionContextObj is created when the function is invoked, but before the actual function has been executed. This is known as stage 1, the Creation Stage. Here, the interpreter creates the executionContextObj by scanning the function for parameters or arguments passed in, local function declarations and local variable declarations. The result of this scan becomes the variableObject in the executionContextObj.

Here is a pseudo-overview of how the interpreter evaluates the code:

  1. Find some code to invoke a function.
  2. Before executing the function code, create the execution context.
  3. Enter the creation stage:
    • Create the variable object:
      • Create the arguments object, check the context for parameters, initialize the name and value and create a reference copy.
      • Scan the context for function declarations:
        • For each function found, create a property in the variable object that is the exact function name, which has a reference pointer to the function in memory.
        • If the function name exists already, the reference pointer value will be overwritten.
      • Scan the context for variable declarations:
        • For each variable declaration found, create a property in the variable object that is the variable name, and initialize the value as undefined.
        • If the variable name already exists in the variable object, do nothing and continue scanning.
    • Initialize the Scope Chain.
    • Determine the value of "this" inside the context.
  4. Activation / Code Execution Stage:
    • Run / interpret the function code in the context and assign variable values as the code is executed line by line.

Let’s look at an example:

function foo(i) {
    var a = 'hello';
    var b = function privateB() {

    };
    function c() {

    }
}

foo(22);

On calling foo(22), the creation stage looks as follows:

fooExecutionContext = {
    variableObject: {
        arguments: {
            0: 22,
            length: 1
        },
        i: 22,
        c: pointer to function c(),
        a: undefined,
        b: undefined
    },
    scopeChain: { ... },
    this: { ... }
}

As you can see, the creation stage handles defining the names of the properties, not assigning a value to them, with the exception of formal arguments / parameters. Once the creation stage has finished, the flow of execution enters the function and the activation / code execution stage looks like this after the function has finished execution:

fooExecutionContext = {
    variableObject: {
        arguments: {
            0: 22,
            length: 1
        },
        i: 22,
        c: pointer to function c(),
        a: 'hello',
        b: pointer to function privateB()
    },
    scopeChain: { ... },
    this: { ... }
}
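The creation stage rules can also be observed from inside a running function. The sketch below uses a hypothetical function named creationStage to check three of them: a later function declaration overwrites an earlier one with the same name, a variable starts out as undefined, and a var that reuses an existing name (here the parameter i) changes nothing.

```javascript
function creationStage(i) {
    var report = {
        dupBeforeDeclaration: dup(),        // both declarations were already processed
        laterBeforeAssignment: typeof later // created, but still undefined
    };

    function dup() { return 'first'; }
    function dup() { return 'second'; }     // same name: the reference is overwritten

    var later = 'assigned during execution';
    var i;                                  // name already exists, so nothing happens

    report.parameterAfterVar = i;
    report.laterAfterAssignment = later;
    return report;
}

var report = creationStage(22);

console.log(report.dupBeforeDeclaration);  // 'second'
console.log(report.laterBeforeAssignment); // 'undefined'
console.log(report.parameterAfterVar);     // 22
console.log(report.laterAfterAssignment);  // 'assigned during execution'
```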

A Word On Hoisting


You can find many resources online defining the term hoisting in JavaScript, explaining that variable and function declarations are hoisted to the top of their function scope. However, none explain in detail why this happens, and armed with your new knowledge about how the interpreter creates the activation object, it is easy to see why. Take the following code example:

(function() {

    console.log(typeof foo); // function
    console.log(typeof bar); // undefined

    var foo = 'hello',
        bar = function() {
            return 'world';
        };

    function foo() {
        return 'hello';
    }

}());

The questions we can now answer are:

  • Why can we access foo before we have declared it?
    • If we follow the creation stage, we know the variables have already been created before the activation / code execution stage. So as the function flow started executing, foo had already been defined in the activation object.
  • foo is declared twice, so why is typeof foo function and not undefined or string?
    • Even though foo is declared twice, we know from the creation stage that functions are created on the activation object before variables, and if the property name already exists on the activation object, we simply bypass the declaration.
    • Therefore, a reference to function foo() is first created on the activation object, and when the interpreter gets to var foo, it sees that the property name foo already exists, so it does nothing and proceeds.
  • Why is bar undefined?
    • bar is actually a variable that has a function assignment, and we know the variables are created in the creation stage but they are initialized with the value of undefined.
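The answers above only describe the creation stage. Once the code execution stage reaches the assignments, the values change, which the following sketch checks by reading typeof both before and after. The observed object is just a convenient way to get the results out of the function.

```javascript
var observed = (function () {
    var before = { foo: typeof foo, bar: typeof bar };

    var foo = 'hello',
        bar = function () {
            return 'world';
        };

    function foo() {
        return 'hello';
    }

    var after = { foo: typeof foo, bar: typeof bar };

    return { before: before, after: after };
}());

console.log(observed.before); // { foo: 'function', bar: 'undefined' }
console.log(observed.after);  // { foo: 'string', bar: 'function' }
```

The var foo declaration is skipped during creation, but the assignment foo = 'hello' still runs during execution and replaces the function reference with a string.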

Summary


Hopefully by now you have a good grasp about how the JavaScript interpreter is evaluating your code. Understanding the execution context and stack allows you to know the reasons behind why your code is evaluating to different values that you had not initially expected.

Do you think knowing the inner workings of the interpreter is too much overhead or a necessity to your JavaScript knowledge? Does knowing the execution context phase help you write better JavaScript?

Note: Some people have been asking about closures, callbacks, timeout etc which I will cover in the next post, focusing more on the Scope Chain in relation to the execution context.

Further Reading

Futures and Promises in JavaScript

With JavaScript usage constantly on the increase, asynchronous event-driven applications are becoming more and more popular. However, a common issue many developers face is with result-dependent operations being used in an asynchronous environment, you often end up with something like:

doA(function(aResult) {
    // do some stuff inside b then fire callback
    doB(aResult, function(bResult) {
        // ok b is done, now do some stuff in c and fire callback
        doC(bResult, function(cResult) {
            // finished, do something here with the result from doC()
        });
    });
});

Since each step requires the previous step's result, you will regularly see a pattern where people start nesting the callback functions within each other’s callbacks. These nested callbacks become difficult to maintain, understand and follow in larger asynchronous applications. Simple async flow such as do (A + B + C) then do D becomes an increasingly complex task.

A solution to use in this situation is the Promise / Futures pattern, which represents the result of a callback that has not happened yet. The concept is quite simple: instead of a function blocking and waiting to complete before returning the result, it returns immediately when invoked, with an object that promises the future computation / result. This results in non-blocking behaviour:

doA()
    .then(function(aResult) { return doB(aResult); })
    .then(function(bResult) { return doC(bResult); })
    .done(function(cResult) { /* do finished stuff here */ });

Writing your code using the Promise / Future pattern gives you most of the benefits of using nested callbacks, along with cleaner, more structured code that is easier to maintain, understand and follow in most asynchronous environments.

Promises / Futures are not the ultimate solution, and there are dozens upon dozens of other solutions that all have their own benefits and drawbacks, each of which should be explored in its own right for different situations.
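For a version of the chain that actually runs, here is a minimal sketch. It assumes the standard Promise object built into modern engines rather than any particular library, so the final step is a plain then() instead of done(). The step helper and the doA, doB and doC functions are hypothetical stand-ins that resolve on a short timer.

```javascript
// Hypothetical async step: resolves with its label appended to the previous result.
function step(label) {
    return function (previous) {
        return new Promise(function (resolve) {
            setTimeout(function () {
                resolve(previous ? previous + '->' + label : label);
            }, 10);
        });
    };
}

var doA = step('A');
var doB = step('B');
var doC = step('C');

// Each then() receives the previous step's result and returns the next promise.
var finalResult = doA()
    .then(function (aResult) { return doB(aResult); })
    .then(function (bResult) { return doC(bResult); });

finalResult.then(function (cResult) {
    console.log(cResult); // 'A->B->C'
});
```

Each step still depends on the previous one, but the code stays flat instead of drifting to the right with every extra callback.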

Yahoo! Mojito @ JSDC.tw

I recently gave a talk at the JavaScript Developer’s Conference in Taiwan about Yahoo!’s new MVC JavaScript framework, called Mojito. A common problem developers have today when developing a single product is that they need to support so many different devices such as mobile, desktop, tablet etc. Each of these devices needs the same application to be coded natively in a different language, which is not cost effective or productive.

With Mojito, you get an environment agnostic web application framework that allows you to write applications for multiple devices using only 1 language, JavaScript, on both the client and the server runtimes.

Here is the slide deck from the talk which explains more about the concept:

JavaScript’s Undefined Explored

It sounds like a simple concept, but how do you actually check that a variable or property in JavaScript really exists? What is the best way to do this? How do we cover all of the edge cases? First, let’s look at what is undefined…

Overview of undefined


The value of a variable is given a type, and there are several built-in native types in JavaScript:

  1. Undefined
  2. Null
  3. Boolean
  4. String
  5. Number
  6. Object
  7. Reference
  8. etc…

Looking at 1, the built-in Undefined type can only ever have a single value, which is called undefined. This value is a primitive, and whenever a variable is declared it is assigned this undefined value, until you programmatically assign it a different value. Also, whenever a function finishes executing and returns without a given value, it returns undefined by default.

var foo,
    bar = (function() {
        // do some stuff
    }()),
    baz = (function() {
        var hello;
        return hello;
    }());

typeof foo; // undefined
typeof bar; // undefined
typeof baz; // undefined

So when a variable is declared but not assigned a value, it is given a value of undefined. We should also note that undefined is a variable / property that is available in the global scope, that also has the value of undefined.

typeof undefined; // undefined

var foo;

foo === undefined; // true

However, undefined is not a reserved word, so historically it could be redefined. Luckily, as of ECMAScript 5 the global undefined property is read-only, but in previous versions and older browsers it was possible to do the following:

typeof undefined; // undefined
undefined = 99;
typeof undefined; // number
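Even with the global value locked down, a local variable can still be named undefined, because it is an identifier and not a keyword. The sketch below shows that, plus one defensive way to compare against the real value using void 0. The function names are made up for the example.

```javascript
function shadowsUndefined() {
    var undefined = 99; // legal inside a function: this is just a local variable
    return typeof undefined;
}

function isReallyUndefined(value) {
    return value === void 0; // void 0 always evaluates to the undefined value
}

var shadowedType = shadowsUndefined();
var declaredOnly;

var checks = [
    isReallyUndefined(declaredOnly),
    isReallyUndefined(null),
    isReallyUndefined(0)
];

console.log(shadowedType); // 'number'
console.log(checks);       // [true, false, false]
```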

What is this null business all about?


Take the following:

null == undefined // true
null !== undefined // true

Many people are confused by the above, but the explanation is quite simple. The only real relationship between null and undefined is that they both evaluate to false when coerced to a boolean, and the == operator treats them as equal to each other. So null == undefined // true is because == does not perform a strict comparison, whereas !== also compares types. Whenever you see null as the value, it has always been programmatically assigned and never set by default.
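A few more comparisons make the relationship clearer. The comparisons object below is only there to collect the results in one place.

```javascript
var comparisons = {
    looseNullUndefined: null == undefined,   // true
    strictNullUndefined: null === undefined, // false
    looseNullZero: null == 0,                // false
    looseNullFalse: null == false,           // false
    booleanNull: Boolean(null),              // false
    booleanUndefined: Boolean(undefined),    // false
    typeOfNull: typeof null,                 // 'object'
    typeOfUndefined: typeof undefined        // 'undefined'
};

console.log(comparisons);
```

So although both values are falsy, null is loosely equal to undefined only, not to other falsy values such as 0 or false.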

Accessing properties on an object


When you try to use a property on an object that does not exist you will also get undefined. However, if you try to call the non-existent property as a function, a TypeError is raised.

var foo = {};

foo.bar; // undefined
foo.bar(); // TypeError

This poses a massive problem. What happens if you want to tell the difference between a property that has a value undefined and a property that does not exist at all? Both typeof and === will report undefined in either case. Thankfully this can be solved by using the in operator, which checks whether a certain property exists in an object:

var foo = {};

typeof foo.bar; // undefined (not good, we cannot tell whether bar was ever set on foo)

'bar' in foo; // false (use this if inherited properties on the prototype chain should count)
foo.hasOwnProperty('bar'); // false (use this if only foo's own properties should count)

Should you use typeof or in / hasOwnProperty?


It depends. Generally, if you want to test for the existence of a property then use in / hasOwnProperty and if you want to check for the value of a property / variable use typeof instead.
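To make the difference concrete, here is a sketch with three kinds of property: one explicitly set to undefined, one that is missing, and one inherited from a prototype. The names proto, explicit, missing and inherited are illustrative.

```javascript
var proto = { inherited: 'from the prototype' };
var foo = Object.create(proto);
foo.explicit = undefined; // the property exists, but holds the undefined value

var results = {
    typeofExplicit: typeof foo.explicit,              // 'undefined'
    typeofMissing: typeof foo.missing,                // 'undefined', indistinguishable
    explicitIn: 'explicit' in foo,                    // true
    missingIn: 'missing' in foo,                      // false
    inheritedIn: 'inherited' in foo,                  // true, in walks the prototype chain
    inheritedOwn: foo.hasOwnProperty('inherited'),    // false, own properties only
    explicitOwn: foo.hasOwnProperty('explicit')       // true
};

console.log(results);
```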