Sanjay PJ

In-Depth Guide to Variable Handling and Runtime Mechanics

Programming Compilers Memory Management Software Development

How a Program Knows the Location of Variables?

Symbol Table (Compiler’s Map)

  • The compiler creates a symbol table during the compilation phase.

Stack Frame (For Local Variables)

  • When a function is called, the program creates a stack frame to store local variables.
  • The compiler calculates offsets from the base pointer (rbp or ebp) for each variable.

Heap Allocation (Dynamic Variables)

  • For dynamically allocated variables, the program asks the operating system for memory using functions like malloc.
  • The OS returns a memory address, which is stored in a pointer variable.

Registers (Optimization)

  • Sometimes, compilers decide that using a CPU register is faster than memory, especially for small, frequently accessed variables.

Summary

The program knows each variable’s location because:

  • The compiler assigns memory or registers during compilation.
  • The symbol table maps variable names to their locations.
  • The assembly instructions reference those locations explicitly.

At runtime, the program executes these instructions to access the correct memory!

Managing Variables: Symbol Tables and Function Stacks

1. Symbol Tables (Compile-Time)

The symbol table is created during the compilation process. It:

  • Maps variable names, function names, and other symbols to their:
    • Data types.
    • Memory locations (relative to stack/frame pointers, global addresses, etc.).
    • Scope and lifetime information.

When and How:

  • Compile Time: The compiler uses the symbol table to resolve references, check for errors, and generate memory access instructions.
  • Lifetime: It exists only during the compilation phase. At runtime, the program does not use the symbol table.

2. Function Stack (Runtime)

The function stack, or call stack, is created and managed during program execution (runtime). It:

  • Stores local variables, function arguments, return addresses, and temporary data needed during function execution.
  • Is dynamically allocated and deallocated as functions are called and return.

When and How:

  • Runtime: When a function is called, a stack frame is created for it on the call stack.
  • Lifetime: The stack frame for a function exists only until the function finishes execution.

Summary:

  • The symbol table helps the compiler decide where variables and functions are stored and accessed.
  • The function stack manages runtime execution, handling actual memory for local variables, arguments, and control flow.

Think of the symbol table as the blueprint and the function stack as the construction site where the blueprint is realized!

Compilers vs Interpreters in Symbol Table Management

Yes, interpreters do maintain symbol tables at runtime, but the way they use and manage them can be different from how compilers manage symbol tables during the compilation process.

How Symbol Tables Work in an Interpreter (Runtime)

When you run a program with an interpreter, the following things happen related to the symbol table:

  1. Program Execution Starts:
    • As the interpreter reads the source code line by line, it starts processing the first statement.
    • For every variable or function used or defined, the interpreter updates its symbol table. For example, if a variable x is assigned a value, the symbol table will store an entry mapping x to the value it holds.
  2. Dynamic Scope Management:
    • Interpreters often have to deal with scopes (local vs global), and the symbol table helps with this. For example, if a variable x is defined inside a function, the interpreter will ensure that it is accessible only within that function’s scope.
    • If a function or method is called, the interpreter might have a separate symbol table for that function’s local scope, in addition to the global scope.
  3. Type Management:
    • In dynamically typed languages (like Python), types of variables are often resolved at runtime, and the interpreter updates the symbol table accordingly.
    • If a variable is reassigned to a new type, the interpreter will update the symbol table to reflect this.
  4. Function Calls and Lookups:
    • When a function is called, the interpreter checks the symbol table to find the function’s address or code block.
    • It may create a new entry for the function call with information like the arguments being passed, and update it with the result or return value.
  5. Symbol Table for Built-in Functions:
    • Many interpreted languages come with built-in functions (e.g., print() in Python), and the symbol table in the interpreter includes these built-ins.
    • The interpreter will resolve the function calls to these built-in symbols during execution, which are pre-defined in the language runtime.

When Does Push/Pop Happen?

Function Calls

  • When a function is called, arguments, return addresses, and saved registers might be pushed onto the stack.

Function Call Management (Calling Conventions)

When a function is called in C, the CPU needs to manage a few things:

  • Arguments to the function (the values that you pass in).
  • Return address (the point in the code to return to after the function completes).
  • Saved registers (to preserve the values of registers used by the function).
  • Local variables (the variables declared within the function).

Push and pop are essential for managing the stack frame for each function call.

Here’s how:

Push:

  • Push function arguments onto the stack when they can’t fit into available registers.
  • Push the return address onto the stack to remember where to return after the function finishes.
  • Push saved registers onto the stack before using them in the function, to avoid overwriting important values.

Pop:

  • Pop registers back to restore their values when the function finishes execution.
  • Pop the return address from the stack, so the program can jump back to the correct place after the function finishes.

Managing the Stack Frame

Each function has a stack frame, which contains:

  • The function’s arguments (if passed via the stack).
  • The return address.
  • The saved registers.
  • Local variables.

The stack frame is created when the function is called and destroyed when the function returns. The push and pop operations are essential for managing the stack frame:

  • Push adds data to the stack.
  • Pop removes data from the stack.

This ensures that after the function completes, the stack is properly cleaned up, and the program can continue execution from where it left off.

Stack and Stack Frames

The stack is a region of memory used to store function calls and local variables. When a function is called, a stack frame is created, which contains the following:

  • Return address: The location in the program where control should return after the function finishes.
  • Function arguments: The values passed to the function.
  • Saved registers: If any registers are used by the function and need to be preserved (so the calling function can use them after the call), they are saved to the stack.
  • Local variables: The variables declared inside the function.

Each time a function is called, a new stack frame is added to the stack. The stack grows as functions are called and shrinks as they return.

Closures (Functions as First-Class Citizens)

A closure is a function that remembers and captures the environment in which it was created. This means that a closure can access variables from its surrounding scope, even after that scope has finished executing.

In languages that support closures (such as JavaScript), you can define a function inside another function, and the inner function can access the variables of the outer function even after the outer function has completed execution. This is due to lexical scoping and how closures “close over” their environment.

Example: Callback Functions and Event Handling

Closures are especially useful for callback functions and event handling in asynchronous programming. A closure can remember the context in which it was created, making it easier to handle callbacks and pass state between different stages of execution.

function fetchData(url, callback) {
  // Simulate an async operation
  setTimeout(() => {
    console.log(`Data fetched from: ${url}`);
    callback(url);  // Closure captures `url` and makes it available to the callback
  }, 1000);
}

fetchData('https://api.example.com', function(url) {
  console.log(`Callback called for: ${url}`);
});
  • In this example, the callback function can access the url variable even after the fetchData function has finished executing, thanks to the closure.

To understand how closures work under the hood, we need to dive into the execution context, lexical scoping, and how JavaScript or other languages implement closures.

Key Concepts for Understanding Closures:

  1. Lexical Scoping
  2. Execution Context and the Call Stack
  3. The Closure (Environment)
  4. How Variables Are Stored in Memory

1. Lexical Scoping:

  • Lexical scoping means that the scope of a variable is determined by the location where the function is defined, not where it is called.
  • When you define a function inside another function, the inner function has access to the variables of the outer function due to lexical scoping.

2. Execution Context and the Call Stack:

When JavaScript code is executed, an execution context is created for each function call. Each execution context has three key components:

  • Variable Object (VO): Stores the function’s local variables, parameters, and function declarations.
  • Scope Chain: A stack of variable objects (each representing a scope in the function call stack).
  • this value: Refers to the current execution context (usually the object calling the function).

Each time a function is called, it creates an execution context and is pushed onto the call stack.

3. The Closure (Environment):

A closure is created when a function is defined inside another function and gains access to the outer function’s variables. The closure is the function along with the environment (i.e., the variables it has access to).

When the inner function is returned or executed outside the parent function, it remembers the environment in which it was created, even if the outer function has finished executing. The inner function has access to the lexical scope of its outer function.

4. How Variables Are Stored in Memory:

When closures are created, the environment in which the closure was created is stored in memory. This environment includes references to variables from the outer function’s scope. Even if the outer function finishes execution, the closure retains access to these variables because the environment is not destroyed.

Here’s how closures are handled under the hood:

  1. Scope Chain: Each function has a scope chain which contains references to variables in its own scope and the scopes of any outer functions. When a variable is referenced inside a function, JavaScript first looks inside the function’s own local variables, and if it doesn’t find the variable there, it looks in the outer function’s scope, and so on up the chain.
  2. Environment Record: The environment for closures is stored in an environment record. This record contains the references to the variables in the scope at the time the function is defined. These variables are not copied but referenced, meaning that any changes to the variables in the outer function (if they are still alive) will be reflected in the closure.
  3. Garbage Collection: Closures prevent the garbage collection of outer function variables as long as the closure exists. This is because the closure retains a reference to the outer function’s environment, meaning those variables are kept alive in memory.

Under-the-Hood Breakdown of Closures in JavaScript:

  • When a closure is created: The function body and the scope in which it was defined are bundled together. The function body is the closure, and it “captures” the environment in which it was created (the lexical scope of variables).
  • Memory Handling: JavaScript uses heap memory to store the environment of the closure. When the function is executed, the environment remains in memory until the closure is no longer used.
  • Scope Chain: The closure maintains a reference to its outer function’s scope chain, meaning it can still access those outer variables even after the outer function execution context is popped from the call stack.
  • Lifecycle of the Closure: The closure can be invoked later (even after its enclosing function finishes execution) because it has a reference to the environment in which it was created. If no longer referenced, the closure can eventually be garbage collected.

The new keyword: Functions and Objects

Yes, the new keyword in JavaScript is used to create a new object from a constructor function. It essentially creates a new execution context for the constructor function and performs several operations to set up the new object. Let’s break this down step by step:

What Happens When You Use the new Keyword?

  1. New Object Creation: When you invoke a function with the new keyword, a new empty object is created. This object is a blank container that will hold the properties and methods defined in the constructor function.
  2. Setting up the Execution Context: When the constructor function is called, a new execution context is created. This context will have:
    • A this reference that points to the newly created object (instead of the global object or undefined in strict mode).
    • A scope chain that includes the variables of the constructor function.
  3. Adding Properties/Methods to the Object: Inside the constructor function, any properties or methods defined using this will be added to the newly created object. For example, if you define this.name = 'John' inside the constructor, this property will be added to the new object.
  4. Returning the Object: By default, the constructor function returns the newly created object. However, if the constructor function explicitly returns an object (other than this), that object will be returned instead. If a primitive value (like a string or number) is returned, it is ignored, and the newly created object is returned.

this Inside an Object Method

In JavaScript, this is a special keyword that refers to the current object on which a method or function is being executed. Its value is determined by the execution context, which depends on how the function is called.

When this is used inside an object method, it refers to the object that the method belongs to.

When you define a method inside an object and call that method, this inside the method will refer to the object that the method is a part of.

Arrow Functions and this:

Arrow functions behave differently when it comes to this. Unlike regular functions, arrow functions do not have their own this. Instead, this in an arrow function is inherited from the surrounding lexical scope (the context in which the arrow function is defined). This is called lexical scoping.

Function Prototype

In JavaScript, functions are objects themselves, and they have a special property called prototype. This property is crucial for JavaScript’s inheritance model and is used to add methods and properties to all instances created from a function constructor.

What is a Function’s Prototype?

Every JavaScript function has a prototype property, which is an object that is used as a template for instances created with the new keyword. The prototype object is where you can define methods and properties that will be shared among all instances created from the function.

When you create an object using the new keyword with a constructor function, that object’s internal [[Prototype]] (also known as __proto__) is set to the constructor function’s prototype object.

The prototype Chain:

The prototype chain allows an object to look up properties and methods from its constructor’s prototype, and from that prototype’s prototype, and so on, until it reaches Object.prototype (the base prototype for all JavaScript objects).

By using prototypes, JavaScript provides a way to implement inheritance and shared behaviour between objects, which is fundamental to how objects and functions interact in JavaScript.

Method Chaining with Immutable Data

Method chaining is a technique in JavaScript (and other programming languages) where you can call multiple methods on the same object in a single line of code, one after the other. This is possible because each method in the chain returns the object itself (or another object that allows further method calls), which allows the next method to be invoked on it.

How Does Method Chaining Work?

Method chaining works by returning the object (or another object) from each method call, which allows multiple method calls to be “chained” together. For example:

const result = object.method1().method2().method3();