Dossier - ecmascript-dossier

Preface: Why a Systems Programmer Should Care About JavaScript

The Unlikely Marriage of Low-Level Thinking and High-Level Chaos

The Cognitive Dissonance

You’ve spent years—perhaps decades—thinking in terms of registers, stack frames, and memory layouts. You understand that a program is ultimately a sequence of machine instructions, that data structures are arrangements of bytes in memory, and that performance comes from understanding what the hardware actually does. JavaScript, at first glance, seems to be the antithesis of everything you value.

JavaScript doesn’t care about memory alignment. It has garbage collection instead of manual memory management. It converts types implicitly, sometimes in ways that defy logic. The == operator can make [] == ![] evaluate to true. The language was designed in ten days, and it shows. For someone who appreciates the elegance of C’s simplicity or the brutal honesty of assembly language, JavaScript appears to be a cosmic joke.
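
The [] == ![] result stops looking arbitrary once each conversion is written out. The sketch below is only an illustration: the variable names are ours, and the explicit Number() and String() calls stand in for the coercions the == operator performs internally rather than reproducing the specification algorithm.

```javascript
// Step-by-step trace of why [] == ![] evaluates to true.
const right = ![];                            // every object is truthy, so ![] is false
const rightAsNumber = Number(right);          // a boolean operand is converted to a number: 0
const leftAsPrimitive = String([]);           // the array is reduced to a primitive via toString(): ""
const leftAsNumber = Number(leftAsPrimitive); // the empty string converts to 0

console.log(right, rightAsNumber);            // false 0
console.log(leftAsNumber);                    // 0
console.log([] == ![]);                       // true, because the comparison ends up as 0 == 0
console.log([] === ![]);                      // false: strict equality never coerces
```

The practical takeaway is the last line: === compares without conversion, which is why generated code usually prefers it.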

The Uncomfortable Truth

Yet here we are. JavaScript is the most deployed runtime environment in human history. It runs on billions of devices—every smartphone, every laptop, every desktop computer with a web browser. It executes in environments you wouldn’t expect: embedded systems, IoT devices, databases (MongoDB), serverless functions, and even spacecraft. The V8 engine alone represents one of the most sophisticated JIT compiler infrastructures ever built, comparable in complexity to LLVM.

More importantly for you: JavaScript and WebAssembly are becoming the universal compilation targets. When you need your code to run anywhere—truly anywhere—you increasingly have two choices: target native machine code for each platform (x86-64, ARM, RISC-V, etc.) or target JavaScript/WebAssembly. The latter is often more practical.

The Strategic Position

JavaScript occupies a strategic position in computing infrastructure that cannot be ignored. It is:

  • The assembly language of the web: Just as you might not write assembly directly but need to understand it, JavaScript is the bytecode that runs everywhere.

  • A compilation target: TypeScript, CoffeeScript, ClojureScript, Elm, PureScript, and hundreds of other languages compile to JavaScript. WebAssembly is designed to run alongside it.

  • A plugin mechanism: Many applications use JavaScript as an embedded scripting language (Photoshop, Unity, game engines, CAD software).

  • An optimization target: Modern JavaScript engines perform sophisticated optimizations—type inference, inline caching, hidden classes—that rival static compilers.

If you’re building a transpiler, you need to understand JavaScript as a target. If you’re building a compiler, WebAssembly is increasingly the deployment mechanism. If you’re building tools, browser extensions are the distribution channel. Like it or not, this is the infrastructure layer of modern computing.

The Systems Perspective

This book takes a different approach than typical JavaScript tutorials. We’re not here to teach you how to build a todo list app or manipulate the DOM to create animated buttons. We’re treating JavaScript as a systems-level technology:

  • The runtime as a virtual machine: Understanding the event loop, call stack, and heap allocation strategies.

  • The language as a compilation target: Studying the AST, code generation, and optimization pipelines.

  • The ecosystem as infrastructure: Package management, module systems, and build toolchains as analogous to linkers and loaders.

  • The specification as documentation: Reading ECMA-262 the way you’d read the Intel SDM or the ARM Architecture Reference Manual.

This perspective reveals that JavaScript, despite its surface chaos, has underlying mechanisms that are comprehensible and even elegant when viewed through the right lens.

The WebAssembly Revolution

WebAssembly changes the equation fundamentally. It provides:

  • A binary format: Compact, fast to parse, fast to validate.

  • A stack machine model: Simple, deterministic, and easy to target from a compiler.

  • Near-native performance: JIT-compiled to machine code with predictable performance characteristics.

  • Language agnosticism: C, C++, Rust, Go, and dozens of other languages can compile to WebAssembly.

  • Security: Sandboxed execution with capability-based security (via WASI).

WebAssembly is the first serious attempt at a portable, safe, fast binary format for code distribution since Java bytecode. Unlike Java, it’s succeeding because it doesn’t try to be a complete platform—it’s a compilation target that integrates with existing ecosystems.

For a systems programmer, WebAssembly is fascinating because it brings low-level concepts back to the web:

  • Manual memory management: You allocate from linear memory, just like malloc.

  • Static typing: i32, i64, f32, f64—no implicit conversions.

  • Deterministic execution: No garbage collection pauses (unless you implement your own GC).

  • Low-level control flow: Structured blocks and loops with explicit branches (br, br_if), computed branches via br_table, and indirect calls through function tables.

It’s almost like someone took the good parts of a systems language and made them run in a browser.
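
To make the binary format and the static typing concrete, here is a minimal sketch that loads a hand-assembled module from JavaScript. The byte sequence and the export name "double" are our own illustration, not taken from any toolchain output; the module contains one function that multiplies an i32 by two.

```javascript
// Hand-assembled binary for:
//   (module (func (export "double") (param i32) (result i32)
//     local.get 0  i32.const 2  i32.mul))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: function 0 has type 0
  0x07, 0x0a, 0x01, 0x06, 0x64, 0x6f, 0x75, 0x62, // export section: name "double" ...
  0x6c, 0x65, 0x00, 0x00,                         // ... exported as function index 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section: one body, no extra locals
  0x20, 0x00, 0x41, 0x02, 0x6c, 0x0b,             // local.get 0; i32.const 2; i32.mul; end
]);

// Synchronous compilation is acceptable for a module this small.
const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule);
const doubleI32 = instance.exports.double;

console.log(doubleI32(21));         // 42
console.log(doubleI32(0x7fffffff)); // -2: i32 arithmetic wraps around, as it would in C with unsigned math
```

The second call shows that the arithmetic really is 32-bit: the result is reinterpreted as a signed i32 when it crosses back into JavaScript.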

What This Book Is (and Emphatically Isn’t)

What This Book Is

This is a technical manual for understanding JavaScript and WebAssembly from a compiler writer’s and systems programmer’s perspective. It assumes you have strong fundamentals in:

  • Systems programming: You’ve written substantial code in C, C++, or similar languages.

  • Computer architecture: You understand what machine code is, how CPUs execute instructions, and what assembly language looks like.

  • Compilers: You’ve at least read about lexing, parsing, AST construction, and code generation, even if you haven’t built a complete compiler.

  • Operating systems concepts: You know what virtual memory is, how processes work, and what system calls do.

With that foundation, we’ll explore:

  1. The JavaScript language specification (ECMA-262) in detail, focusing on semantics rather than syntax.

  2. The execution model: How JavaScript engines actually work—the JIT compilation pipeline, inline caching, hidden classes, garbage collection strategies.

  3. The module systems: From global scope chaos to ES6 modules, CommonJS, and the Node.js runtime.

  4. JavaScript as a compilation target: How to parse JavaScript, manipulate ASTs, and generate code.

  5. WebAssembly’s design: The binary format, text format, type system, and execution model.

  6. Building a WebAssembly compiler: How to translate from an intermediate representation to the Wasm stack machine.

  7. Interoperability: How JavaScript and WebAssembly communicate, share memory, and call each other’s functions.

  8. Practical applications: Browser extensions, userscripts, Node.js tooling, and WASI for server-side WebAssembly.

What This Book Isn’t

This is not:

  • A web development tutorial: We won’t build React apps or discuss CSS frameworks.

  • A beginner’s programming book: We assume you can program and understand algorithmic complexity.

  • A JavaScript best practices guide: We care more about how things work than about style guides or linting rules.

  • Framework-focused: No Angular, React, Vue, or any other frontend framework unless it illustrates a fundamental concept.

  • A replacement for the specifications: We reference ECMA-262 and the WebAssembly spec extensively, but we can’t replace them. You should have both documents available.

The Tone and Approach

I’m going to be honest with you: JavaScript has warts. Many of them. We’ll point them out, explain why they exist (historical context matters), and show you how to work around them. We won’t pretend that 0.1 + 0.2 !== 0.3 is elegant. It is a consequence of IEEE 754 binary floating point that every language using doubles shares, but JavaScript makes it more visible because every Number is a 64-bit double: there is no default integer type and no decimal type to reach for.
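
A short sketch of the floating-point behaviour mentioned above, together with one common workaround. The nearlyEqual helper and its tolerance are our own choices for illustration, not a standard API.

```javascript
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false
console.log(sum);         // 0.30000000000000004

// Hypothetical helper: compare with a relative tolerance instead of exact equality.
function nearlyEqual(a, b, relTol = Number.EPSILON * 4) {
  const scale = Math.max(Math.abs(a), Math.abs(b), 1);
  return Math.abs(a - b) <= relTol * scale;
}
console.log(nearlyEqual(sum, 0.3)); // true

// Integers are exact up to 2^53 - 1, so scaled units (cents, not dollars) avoid the issue.
console.log(10 + 20 === 30);                             // true
console.log(Number.isSafeInteger(Number.MAX_SAFE_INTEGER)); // true
```

The same comparison written in C with double would behave identically; nothing here is specific to JavaScript’s coercion rules.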

At the same time, we’ll acknowledge the genuine engineering achievements: V8’s TurboFan compiler is a marvel of optimization technology. The ES6 module system is actually well-designed. WebAssembly’s security model is sophisticated and practical.

This book has opinions, but they’re engineering opinions backed by technical reasoning. When we say something is poorly designed, we’ll explain why and what the alternatives would have been. When we say something is clever, we’ll show you the mechanism.

Practical Outcomes

By the end of this book, you should be able to:

  1. Write a source-to-source compiler that targets JavaScript, handling scoping, closures, and the module system correctly.

  2. Build browser extensions for Firefox and Chrome that do non-trivial work (not just DOM manipulation).

  3. Create userscripts for Greasemonkey/Tampermonkey that can intercept and modify web traffic at the JavaScript level.

  4. Understand the Node.js runtime well enough to build command-line tools, understand the event loop, and integrate with native C/C++ addons.

  5. Read and understand ECMA-262: Navigate the specification, understand abstract operations, and predict behavior in edge cases.

  6. Write WebAssembly by hand in the text format (WAT) for educational purposes.

  7. Build a compiler backend that targets WebAssembly, handling function calls, linear memory, and the import/export system.

  8. Debug WebAssembly in browser DevTools and understand the generated binary format.

  9. Use WASI to run WebAssembly outside the browser with system interface access.

  10. Understand JavaScript engine internals at a high level: what V8’s TurboFan does, how SpiderMonkey’s Ion-based optimizing tier (long known as IonMonkey, more recently as Warp) optimizes, and how these relate to traditional compiler technology.

These are practical skills that enable you to work at the intersection of high-level and low-level programming.

Your Background Assessment: C, Assembly, and the Unix Philosophy

What I’m Assuming You Know

Looking at your GitHub portfolio (https://github.com/Chubek), I can see projects involving:

  • Compiler construction: Lexers, parsers, and code generation.

  • Assembly language programming: x86, ARM, and potentially other architectures.

  • Systems-level C/C++: Memory management, pointer manipulation, bit operations.

  • Unix/Linux tools: Shell scripting, text processing, build systems.

  • Language implementation: Interpreters, VMs, and runtime systems.

This tells me you think in terms of:

  • Memory as a flat array of bytes: Addresses, pointers, and manual layout.

  • Explicit control flow: Jumps, branches, and call/return mechanisms.

  • Minimal abstractions: You prefer mechanisms you can see through.

  • Tools as composable primitives: The Unix philosophy of small, focused utilities.

The Conceptual Bridge

JavaScript violates many of your intuitions:

Systems Intuition          | JavaScript Reality                   | The Truth Underneath
---------------------------+--------------------------------------+---------------------------------------------------------
Variables have types       | Variables hold references to values  | Values have types; variables are just names
Memory is explicit         | Garbage collection                   | Generational GC with compaction, hidden from you
Functions are code         | Functions are objects                | Functions are closures with [[Environment]] slots
== tests equality          | == converts types                    | Abstract Equality uses a complex coercion algorithm
Performance is predictable | JIT compilation is non-deterministic | Hidden classes and inline caches create performance cliffs

The key to understanding JavaScript is realizing that it’s a high-level language with low-level implementation details that matter. The spec describes abstract operations, but V8 implements them with sophisticated compiler techniques. You need to understand both layers.
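
A few of these mismatches can be observed directly. The following sketch uses made-up names (slot, greet) purely for illustration.

```javascript
// Values have types; variables are just names.
let slot = 42;
const typeBefore = typeof slot;   // "number"
slot = "forty-two";
const typeAfter = typeof slot;    // "string"
console.log(typeBefore, typeAfter);

// Functions are objects: they carry properties like any other object.
function greet() { return "hi"; }
greet.callCount = 0;
console.log(typeof greet, greet instanceof Object, greet.name, greet.length);
// function true greet 0

// == coerces, and the results are not always intuitive.
console.log(null == undefined, null == 0, "1" == 1, NaN == NaN);
// true false true false
```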

The Translation Guide

Throughout this book, I’ll provide “translation guides” that map JavaScript concepts to systems-level equivalents:

  • Closures ↔︎ Stack frames captured on the heap

  • Prototypes ↔︎ Vtable pointers with delegation

  • The event loop ↔︎ select()/epoll() with callback queues

  • Promises ↔︎ Continuation-passing style with state machines

  • Typed Arrays ↔︎ Pointers to raw memory with typed access

  • WebAssembly linear memory ↔︎ mmap()-allocated region

These analogies aren’t perfect, but they provide mental models that map to what you already know.
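
As one instance of these mappings, the sketch below treats an ArrayBuffer as a small block of raw memory and reads it through two different views, much as you would cast a pointer in C. The byte values are arbitrary; DataView is used for the store so that endianness is explicit rather than platform-dependent.

```javascript
const buffer = new ArrayBuffer(8);      // 8 zeroed bytes, roughly calloc(8, 1)
const bytes = new Uint8Array(buffer);   // byte-wise view, like unsigned char *
const view = new DataView(buffer);      // loads and stores with explicit endianness

view.setUint32(0, 0xdeadbeef, true);    // little-endian 32-bit store at offset 0
console.log(Array.from(bytes.subarray(0, 4), (b) => b.toString(16)));
// [ 'ef', 'be', 'ad', 'de' ]

console.log(view.getUint16(0, true).toString(16)); // beef

bytes[4] = 0xff;                        // write through one view ...
console.log(view.getInt8(4));           // -1 ... and reinterpret through another
```

Both views alias the same storage; no copying takes place. WebAssembly linear memory is exposed to JavaScript in the same way, as an ArrayBuffer.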

The Systems Programmer’s Advantage

You have advantages that web developers often lack:

  1. You understand the cost of abstractions: When we discuss engine internals, you’ll grasp why certain patterns perform well.

  2. You can read specifications: ECMA-262 uses algorithmic pseudocode similar to compiler textbooks.

  3. You know binary formats: WebAssembly’s binary encoding will make immediate sense.

  4. You understand ISAs: The WebAssembly instruction set is simpler than x86 or ARM.

  5. You’re comfortable with low-level debugging: Understanding stack traces and heap dumps comes naturally.

Where web developers struggle with “why is this slow?”, you’ll be able to profile, understand deoptimization, and fix the root cause.

How to Use This Book

Reading Strategies

This book is designed to support multiple reading strategies:

Strategy 1: Linear Read

Read chapters 1-18 in order. This builds concepts progressively:

  • Chapters 1-3: Foundation—language semantics and execution model.

  • Chapters 4-7: Core JavaScript—functions, objects, arrays, modules.

  • Chapters 8-10: Runtime environments—browser, Node.js, extensions.

  • Chapters 11-12: Transpilation—building compilers that target JavaScript.

  • Chapters 13-17: WebAssembly—from fundamentals to building a compiler.

  • Chapter 18: Engine internals—how JavaScript and WebAssembly are implemented.

Strategy 2: Goal-Oriented Read

Jump to the chapters that support your immediate goals:

  • Goal: Build a transpiler → Chapters 1-2 (foundation), 7 (modules), 11-12 (transpilation)

  • Goal: Browser extensions → Chapters 1-2 (foundation), 8 (browser environment), 10 (extensions)

  • Goal: WebAssembly compiler → Chapters 1-2 (foundation), 13-16 (WebAssembly)

  • Goal: Node.js tools → Chapters 1-2 (foundation), 7 (modules), 9 (Node.js)

  • Goal: Understand engines → Chapters 1-3 (foundation), 18 (engines)

Strategy 3: Reference Use

Keep the book nearby while working on projects. Use the appendices for quick lookups:

  • Appendix A: Syntax quick reference for when you forget the syntax.

  • Appendix B: Feature compatibility for when you need to support older environments.

  • Appendix C: WebAssembly instruction reference for when you’re hand-coding WAT.

  • Appendix D: Tools and libraries for when you need to find the right library.

Working with the Specifications

Throughout the book, I reference:

  • ECMA-262 (the JavaScript specification): ecma-262-std.pdf (847 pages)

  • WebAssembly Specification: wasm-spec.pdf (321 pages)

These documents are dense but invaluable. I’ll teach you how to read them:

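  • ECMA-262 defines abstract operations (named algorithms such as ToPrimitive) as numbered algorithmic steps. Calls inside those steps are often prefixed with ? or !, which is shorthand for how errors are handled. It’s actually quite readable once you understand the notation.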

  • WebAssembly Spec uses formal semantics (mathematical notation), which is more challenging but precise.

I’ll provide page references so you can cross-reference the specifications. For example:

The [[Environment]] internal slot of a function object (ECMA-262, §10.2.3, p. 203) stores the lexical environment in which the function was created, enabling closures.

This means you can turn to page 203 of ecma-262-std.pdf to see the actual specification text.
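
The effect of that slot is easy to observe from ordinary code. In this sketch (makeCounter is an illustrative name), each call creates a fresh environment, and the returned functions keep it alive after makeCounter has returned.

```javascript
function makeCounter(start) {
  let count = start;            // lives in the environment created by this call
  return {
    increment() { count += 1; return count; },
    read() { return count; },
  };
}

const a = makeCounter(0);
const b = makeCounter(100);     // a second, independent environment
a.increment();
a.increment();
b.increment();
console.log(a.read(), b.read()); // 2 101
```

In C terms, count behaves like a field in a heap-allocated frame that both function pointers share, which is the mental model behind the phrase "stack frames captured on the heap".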

Code Examples and Conventions

All code examples follow these conventions:

JavaScript Code
// Comments explain what's happening
function example(parameter) {
    // We use modern ES6+ syntax by default
    const result = parameter * 2;
    return result;
}
WebAssembly Text Format (WAT)
;; Comments in WAT use semicolons
(module
    (func $example (param $parameter i32) (result i32)
        ;; Stack-based operations
        local.get $parameter
        i32.const 2
        i32.mul
    )
)
C/C++ (for comparison)
// C code for comparison when useful
int example(int parameter) {
    int result = parameter * 2;
    return result;
}
Shell Commands
# Commands you run in a terminal
npm install some-package
node script.js

Exercises and Experiments

Each chapter includes “Experiments” sections suggesting things to try. These are not traditional exercises with solutions—they’re explorations:

  • Run this code in the browser console and observe the output.

  • Modify this example and see what breaks.

  • Use the debugger to step through execution.

  • Read this section of the spec and see if you can understand it now.

I don’t provide solutions because the point is exploration and building intuition. When you modify code and it breaks in an unexpected way, that’s when you learn.

Online Resources

While I’ve tried to make this book comprehensive, you’ll need external resources:

  1. MDN Web Docs (https://developer.mozilla.org): The best JavaScript reference.

  2. Node.js Documentation (https://nodejs.org/docs): For Node.js-specific APIs.

  3. WebAssembly.org (https://webassembly.org): Official WebAssembly documentation.

  4. WABT (https://github.com/WebAssembly/wabt): WebAssembly Binary Toolkit for working with Wasm.

  5. V8 Blog (https://v8.dev/blog): Deep dives into engine internals.

I’ll reference these throughout the book.

Feedback and Errata

This is a technical book about rapidly evolving technologies. JavaScript evolves yearly (ES2025, ES2026, etc.), and WebAssembly is still adding features. I’ve focused on:

  • Stable features: Things unlikely to change (closures, prototypes, the event loop).

  • Current standards: ES2025 and WebAssembly 3.0 as of publication.

  • Timeless concepts: Compilation strategies, runtime models, and engineering trade-offs.

When new features arrive, the fundamentals we cover will help you understand them.

Acknowledgments

Standing on the Shoulders of Giants

This book wouldn’t exist without:

  • Dr. Axel Rauschmayer’s “JavaScript for Impatient Programmers”: A clear, well-organized introduction that inspired parts of our structure.

  • The ECMA TC39 Committee: For stewarding JavaScript’s evolution with care.

  • The WebAssembly Community Group: For designing a remarkably elegant bytecode format.

  • Engine Developers: The teams behind V8, SpiderMonkey, JavaScriptCore, and ChakraCore who push the boundaries of JIT compilation.

  • The Open Source Community: For creating tools like Babel, Acorn, and WABT that make working with these technologies practical.

Personal Acknowledgments

To Chubek (the intended reader): I’ve studied your GitHub projects and tried to write the book I think you need. Your background in compilers, assembly, and systems programming is evident, and I’ve assumed that level of sophistication throughout. If I’ve misjudged, let me know—this book is for you.

To the broader systems programming community: Many of you are skeptical of JavaScript, and rightfully so. I hope this book convinces you that underneath the chaos, there’s interesting technology worth understanding. Not because you’ll love JavaScript (you might not), but because it’s infrastructure you can’t avoid, and infrastructure should be understood.

Tools and Environment

This book was written using:

  • ECMA-262 16th Edition (June 2025): The authoritative JavaScript specification.

  • WebAssembly Specification 3.0: The latest WebAssembly standard.

  • Node.js v20+: For testing all code examples.

  • Firefox Developer Edition and Chrome/Chromium: For browser-based examples.

  • WABT (WebAssembly Binary Toolkit): For assembling and disassembling WebAssembly.

All code examples have been tested and work as shown (or fail in the instructive way described).


A Note on Attitude

If you’ve gotten this far, you’re committed to understanding JavaScript and WebAssembly despite your misgivings. That’s good. You don’t have to love these technologies—I’m not asking you to. But you should understand them with the same rigor you’d apply to any other systems-level technology.

JavaScript is messy, yes. It has more warts than most languages. But it’s also the result of decades of evolution, enormous investment in engine optimization, and the collective work of thousands of talented engineers. Dismissing it as a “toy language for web developers” is intellectually lazy.

WebAssembly, on the other hand, is a genuinely well-designed system. It learned from Java’s mistakes, avoided JavaScript’s legacy baggage, and provides a clean compilation target with strong security properties. It deserves your respect.

Let’s begin.


Today’s date: 1404/07/21 (Jalali) / 2025/10/13 (Gregorian)


Chapter 1: The ECMAScript Standard and Its Discontents

From Mocha to ES2025: A Brief, Opinionated History

The Ten-Day Language

In April 1995, Brendan Eich joined Netscape Communications Corporation with a specific mandate: create a scripting language for the web browser. The timeline was absurd: the company wanted it done in ten days, and Eich wrote the first prototype in May 1995. The result was Mocha, later renamed LiveScript, and finally JavaScript to capitalize on Java’s marketing momentum (despite having almost nothing in common with Java beyond some syntactic similarities).

This rushed creation explains many of JavaScript’s quirks. When you have ten days to design a language, you don’t have time to think through every edge case. You borrow liberally from existing languages—Scheme for first-class functions and closures, Self for prototypal inheritance, Java for syntax—and you ship it. The consequences of those hasty decisions compound over decades when backward compatibility becomes sacrosanct.

Systems Programmer’s Note: Imagine designing a processor ISA in ten days. You’d probably copy x86’s instruction encoding, ARM’s conditional execution, and MIPS’s delayed branches. Then imagine that ISA being frozen for 30 years with billions of devices depending on every quirk. That’s JavaScript’s predicament.

The Standardization Process (1997-1999)

By 1996, JavaScript was too important to remain a proprietary Netscape technology. Microsoft had reverse-engineered it as JScript for Internet Explorer, introducing subtle incompatibilities. The Browser Wars had begun, and developers were caught in the crossfire.

Netscape submitted JavaScript to Ecma International (a European standards body) in 1996. The Technical Committee 39 (TC39) was formed to standardize the language. The first edition of ECMA-262 was published in June 1997, codifying what would become known as ECMAScript 1.

Why “ECMAScript” and not “JavaScript”? Sun Microsystems (later acquired by Oracle) owned the trademark “JavaScript.” To avoid legal issues, the standard uses “ECMAScript” as the formal name. In practice, everyone still says “JavaScript.”

The timeline of early editions:

  • ES1 (June 1997): The baseline. Standardized roughly what Netscape’s JavaScript and Microsoft’s JScript of the Navigator 3 and Internet Explorer 3 era agreed on.

  • ES2 (June 1998): Editorial changes to align with ISO/IEC 16262.

  • ES3 (December 1999): Added regular expressions, better string handling, try/catch, and other improvements. This became the stable base for years.

Then came ES4, the edition that never was.

The ES4 Debacle (2000-2008)

ES4 was ambitious—perhaps too ambitious. The committee wanted to add:

  • Classes and classical inheritance: Moving away from prototypes.

  • Modules: A proper module system instead of global scope pollution.

  • Namespaces: Organizing code into hierarchical structures.

  • Type annotations: Optional static typing.

  • Generators and iterators: Lazy evaluation and custom iteration.

  • Operator overloading: Custom behavior for +, *, etc.

This was a complete overhaul of the language. Microsoft, Yahoo, and others opposed it, arguing it was too complex and strayed too far from JavaScript’s roots. Adobe had ActionScript 3 (for Flash), which implemented many ES4 proposals, but the broader JavaScript community was divided.

The conflict became political. Two factions emerged:

  1. ES4 maximalists: Led by Mozilla and Adobe, wanting a comprehensive upgrade.

  2. ES3.1 minimalists: Led by Microsoft and Yahoo, wanting incremental improvements.

By 2008, it was clear ES4 wouldn’t pass. In a compromise, ES4 was abandoned, and ES3.1 became ES5. Some ES4 ideas (generators, modules, classes) would later appear in ES6, but in different forms.

Engineering Lesson: Standards by committee are slow, but that’s a feature, not a bug. Backward compatibility and consensus prevent one vendor from fragmenting the ecosystem. The price is conservatism and occasionally absurd compromises.

The Modern Era (ES5 to ES2025)

ES5 (December 2009)

ES5 brought modest but important improvements:

  • Strict mode ("use strict";): Opts into stricter error checking and removes dangerous features.

  • JSON: Native JSON.parse() and JSON.stringify().

  • Array methods: map, filter, reduce, forEach, etc.

  • Property descriptors: Fine-grained control over object properties (Object.defineProperty()).

  • Getters and setters: Accessor properties.

ES5 was conservative, but it stabilized the language. It’s the last version universally supported without transpilation (though even old IE versions had spotty support).
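
Strict mode and property descriptors interact in a way worth seeing once. In the sketch below (the config object and demo function are illustrative), a write to a non-writable property throws in strict mode; in sloppy mode the same write would fail silently. The sample uses const and arrow functions for consistency with the rest of the book, although those arrived later than ES5.

```javascript
function demo() {
  "use strict";
  const config = {};
  Object.defineProperty(config, "version", {
    value: 5,
    writable: false,      // assignments are rejected
    enumerable: true,
    configurable: false,  // the descriptor itself cannot be changed later
  });

  let error = null;
  try {
    config.version = 6;   // throws TypeError because of "use strict"
  } catch (e) {
    error = e;
  }
  return { value: config.version, threw: error instanceof TypeError };
}

const outcome = demo();
console.log(outcome); // { value: 5, threw: true }

// The other ES5 additions in one line each.
const roundTrip = JSON.parse(JSON.stringify({ a: [1, 2, 3] }));
const evensDoubled = [1, 2, 3, 4].filter((n) => n % 2 === 0).map((n) => n * 2);
console.log(roundTrip.a.length, evensDoubled); // 3 [ 4, 8 ]
```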

ES6/ES2015: The Turning Point

ES6, officially named ES2015, was transformative. After years of stagnation, the committee delivered a massive upgrade:

  • let and const: Block-scoped variables, finally!

  • Arrow functions: Lexical this binding and concise syntax.

  • Classes: Syntactic sugar over prototypes, but cleaner.

  • Modules: import and export for proper modularity.

  • Promises: Standardized asynchronous programming.

  • Generators: Functions that can pause and resume (function*).

  • Template literals: String interpolation with `Hello, ${name}`.

  • Destructuring: Pattern matching for assignment.

  • Rest/spread operators: ...args for functions and arrays.

  • Symbols: A new primitive type for unique identifiers.

  • Iterators and for...of: Standardized iteration protocol.

  • Map, Set, WeakMap, WeakSet: New collection types.

  • Typed arrays: Uint8Array, Int32Array, etc., for binary data.

ES6 was so large it took years for engines to fully implement. It marked the transition from “JavaScript is a toy language” to “JavaScript is a serious programming platform.”
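
Several of these features compose naturally. The following sketch (fibonacci is our own example function) combines a generator, destructuring, spread, a Set, for...of, and a template literal.

```javascript
// A generator is a function that can suspend at each yield.
function* fibonacci(limit) {
  let [previous, current] = [0, 1];
  while (previous <= limit) {
    yield previous;
    [previous, current] = [current, previous + current]; // swap via destructuring
  }
}

const values = [...fibonacci(20)];        // spread drives the iterator to completion
const [first, second, ...rest] = values;  // destructuring with a rest element
const unique = new Set(values);           // the duplicate 1 collapses

console.log(values);                      // [ 0, 1, 1, 2, 3, 5, 8, 13 ]
console.log(`first=${first}, second=${second}, rest has ${rest.length} items`);
// first=0, second=1, rest has 6 items

let total = 0;
for (const n of unique) total += n;       // for...of works on any iterable
console.log(unique.size, total);          // 7 32
```

For a compiler writer, the generator is the interesting part: an ES5 target has to lower it to an explicit state machine.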

The Yearly Release Cycle (ES2016-ES2025)

After ES6, TC39 adopted a yearly release cycle. Each edition is named by year: ES2016, ES2017, etc. Features are smaller and more focused:

  • ES2016 (ES7): Array.prototype.includes(), exponentiation operator (**).

  • ES2017 (ES8): async/await, Object.entries(), Object.values(), shared memory and atomics.

  • ES2018 (ES9): Asynchronous iteration, rest/spread for objects, Promise.prototype.finally().

  • ES2019 (ES10): Array.prototype.flat(), Object.fromEntries(), optional catch binding.

  • ES2020 (ES11): Optional chaining (?.), nullish coalescing (??), BigInt, Promise.allSettled(), globalThis.

  • ES2021 (ES12): Logical assignment operators (&&=, ||=, ??=), numeric separators (1_000_000), String.prototype.replaceAll().

  • ES2022 (ES13): Class fields, private methods (#private), top-level await, Array.prototype.at().

  • ES2023 (ES14): Array.prototype.findLast(), Array.prototype.toSorted(), hashbang grammar (#!/usr/bin/env node).

  • ES2024 (ES15): Promise.withResolvers(), Object.groupBy(), ArrayBuffer transfer, well-formed Unicode strings.

  • ES2025 (ES16): Regular expression modifiers, Set methods (.union(), .intersection()), duplicate named capture groups in regex.

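Key Insight: The yearly cycle allows incremental evolution. TC39 moves each proposal through numbered stages (Stage 0: Strawperson, Stage 1: Proposal, Stage 2: Draft, Stage 3: Candidate, Stage 4: Finished), with an intermediate Stage 2.7 added in late 2023 for proposals that are approved in principle and awaiting tests. Only Stage 4 proposals make it into the spec.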

The Shadow History: Engines Drive Innovation

While TC39 standardizes, browser vendors innovate. V8 (Chrome/Node.js), SpiderMonkey (Firefox), and JavaScriptCore (Safari) compete on performance. Several language features and implementation techniques emerged from engine experiments:

  • V8’s hidden classes: Optimization technique that became foundational to modern JS performance.

  • asm.js: Mozilla’s subset of JavaScript designed for AOT compilation, precursor to WebAssembly.

  • Typed Arrays: Originally a WebGL requirement (need fast binary data), later standardized.

The relationship is symbiotic: engines push boundaries, TC39 standardizes what works, and the spec constrains engine behavior to ensure interoperability.
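
The asm.js idea is easiest to appreciate through its coercion idioms, which double as type annotations an engine can rely on. The sketch below shows only those idioms in plain JavaScript; it is not a valid asm.js module (there is no "use asm" directive or module wrapper), and the function names are ours.

```javascript
// "x | 0" forces a value to a signed 32-bit integer, so an engine can treat
// the parameters and the result as i32 without speculation.
function addI32(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0;
}

// Math.imul performs a true 32-bit multiply; a plain * would go through doubles.
function mulI32(a, b) {
  return Math.imul(a | 0, b | 0);
}

console.log(addI32(0x7fffffff, 1));   // -2147483648 (wraps like a C int on two's-complement hardware)
console.log(addI32(2.9, 1.9));        // 3 (both inputs are truncated first)
console.log(mulI32(0x10000, 0x10000)); // 0 (2^32 does not fit in 32 bits)
```

WebAssembly replaced these conventions with real i32 instructions, which is why it no longer needs the coercions.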

The ECMA-262 Document: Structure and Navigation

Obtaining the Specification

ECMA-262 is freely available at https://ecma-international.org/publications-and-standards/standards/ecma-262/. The document is published annually, with the most recent being the 16th Edition (June 2025)—referred to as ES2025.

Physical Properties: The PDF (ecma-262-std.pdf) is 847 pages. It’s dense, technical, and uses specialized notation. Don’t try to read it cover-to-cover; treat it as a reference manual.

High-Level Structure

The specification is organized into major sections:

Front Matter (Pages 1-24)
  • Scope (§1, p. 1): One-paragraph overview of what ECMAScript is.

  • Conformance (§2, pp. 1-2): What it means for an implementation to be conformant.

  • Normative References (§3, p. 2): References to Unicode, ISO 8601 (dates), etc.

  • Overview (§4, pp. 2-11): High-level description of the language, hosts (browser vs. Node.js), and terms.

  • Notational Conventions (§5, pp. 11-24): How to read the spec’s algorithmic notation.

Systems Programmer’s Note: Think of this as the “Instruction Set Architecture” manual’s introductory chapters—defining notation, conventions, and scope before diving into opcodes.

Core Semantics (Pages 25-168)
  • ECMAScript Data Types and Values (§6, pp. 25-62): Language types (the primitives Undefined, Null, Boolean, String, Symbol, Number, and BigInt, plus Object) and specification types (internal constructs like Reference Record and Completion Record).

  • Abstract Operations (§7, pp. 63-92): Reusable algorithms for type conversion, comparisons, and object operations. Think of these as the “standard library” of the spec itself.

  • Syntax-Directed Operations (§8, pp. 93-136): How syntax maps to semantics (evaluation rules, scope analysis).

  • Executable Code and Execution Contexts (§9, pp. 137-168): The runtime environment—how code executes, what an execution context is, the job queue, agents (threads), and realms.

Object Model (Pages 169-213)
  • Ordinary and Exotic Objects Behaviours (§10, pp. 169-213): How objects work internally. Ordinary objects follow standard property access rules. Exotic objects (Arrays, bound functions, Proxies) have custom internal methods.

Key Concept: JavaScript objects have internal slots (like [[Prototype]], [[Extensible]]) and internal methods (like [[Get]], [[Set]]). These are specification mechanisms, not directly accessible in code.
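
Although the slots and internal methods cannot be named directly, some built-ins are thin wrappers over them, and Proxy traps let you intercept the internal methods. A minimal sketch, with illustrative object names:

```javascript
// Object.getPrototypeOf and Object.isExtensible report [[Prototype]] and [[Extensible]].
const base = { kind: "base" };
const derived = Object.create(base);
console.log(Object.getPrototypeOf(derived) === base); // true
console.log(Object.isExtensible(derived));            // true
Object.preventExtensions(derived);
console.log(Object.isExtensible(derived));            // false

// Proxy traps correspond to internal methods: get to [[Get]], set to [[Set]].
const log = [];
const traced = new Proxy({ x: 1 }, {
  get(target, key, receiver) {
    log.push(`get ${String(key)}`);
    return Reflect.get(target, key, receiver);        // default [[Get]] behaviour
  },
  set(target, key, value, receiver) {
    log.push(`set ${String(key)}`);
    return Reflect.set(target, key, value, receiver); // default [[Set]] behaviour
  },
});

const readBack = traced.x;
traced.y = 2;
console.log(readBack, traced.y === 2, log.slice(0, 2)); // 1 true [ 'get x', 'set y' ]
```

Each Reflect function performs the default behaviour of the matching internal method, which is why a trap that simply forwards to Reflect is transparent.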

Language Syntax and Semantics (Pages 214-699)

This is the bulk of the spec:

  • Source Text (§11, pp. 214-217): How source code is interpreted (as a sequence of Unicode code points; the encoding of the file itself is left to the host).

  • Lexical Grammar (§12, pp. 218-240): Tokens—identifiers, keywords, literals, operators.

  • Expressions (§13, pp. 241-356): How expressions are parsed and evaluated.

  • Statements and Declarations (§14, pp. 357-446): Control flow, loops, variable declarations.

  • Functions and Classes (§15, pp. 447-522): Function definitions, arrow functions, class syntax.

  • Scripts and Modules (§16, pp. 523-542): Top-level code execution, module imports/exports.

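  • Error Handling and Language Extensions (§17, pp. 543-550): When an implementation must report errors and which extensions are forbidden. The native error types themselves are specified with the built-in objects.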

Built-in Objects (Pages 551-826)

Every standard object (Object, Array, Function, Promise, Map, Set, etc.) is specified here:

  • Fundamental Objects (§20, preceded by the built-in conventions in §18 and the global object in §19): Object, Function, Boolean, Symbol, Error.

  • Numbers and Dates (§21): Number, BigInt, Math, Date.

  • Text Processing (§22): String, RegExp.

  • Indexed Collections (§23): Array, TypedArray.

  • Keyed Collections (§24): Map, Set, WeakMap, WeakSet.

  • Structured Data (§25): ArrayBuffer, SharedArrayBuffer, DataView, Atomics, JSON.

  • Managing Memory (§26): WeakRef, FinalizationRegistry.

  • Control Abstraction Objects (§27): Iterators, generators, Promise, async functions.

  • Reflection (§28): Reflect, Proxy, module namespace objects.

Appendices (Pages 827-847)
  • Grammar Summary (Annex A, pp. 827-835): Complete lexical and syntactic grammar in one place.

  • Strict Mode (Annex C, p. 836): Differences between strict and sloppy mode.

  • Corrections and Clarifications (Annex D, pp. 837-842): Changes from previous editions.

  • Bibliography: References to Unicode, ISO standards, etc.

How to Navigate the Spec

Reading Algorithmic Steps

The spec writes its algorithms as numbered pseudocode steps; a named, reusable algorithm is called an abstract operation. Example from §7.1.1 (ToPrimitive, p. 63):

ToPrimitive ( input [ , preferredType ] )

  1. If Type(input) is Object, then
    1. If preferredType is not present, let hint be “default”.
    2. Else if preferredType is STRING, let hint be “string”.
    3. Else,
      1. Assert: preferredType is NUMBER.
      2. Let hint be “number”.
    4. Let exoticToPrim be ? GetMethod(input, @@toPrimitive).
    5. If exoticToPrim is not undefined, then
      1. Let result be ? Call(exoticToPrim, input, « hint »).
      2. If Type(result) is not Object, return result.
      3. Throw a TypeError exception.
    6. If hint is “default”, set hint to “number”.
    7. Return ? OrdinaryToPrimitive(input, hint).
  2. Return input.

Notation guide:

  • ?: Propagates exceptions. If the called operation returns an abrupt completion (error), this step returns immediately with that error.

  • !: Asserts the operation never fails. Used when the spec guarantees success.

  • [ , optional ]: Optional parameters.

  • « »: A list (like an array).

  • Type(x): Returns the type tag of x (Object, Number, etc.).

  • @@toPrimitive: A well-known symbol (Symbol.toPrimitive).

Systems Analogy: This is like reading CPU microcode or a state machine definition. Each step is deterministic; follow them sequentially.
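The ToPrimitive steps above can be mirrored almost line for line in JavaScript. The following is a study sketch, not engine code: the function names isObject, ordinaryToPrimitive, and toPrimitive are our own, and the spec's ? shorthand is modelled by simply letting exceptions propagate.

```javascript
// Sketch of ToPrimitive and OrdinaryToPrimitive as plain functions.
// Assumption: the spec's "?" is modelled by letting exceptions propagate.

function isObject(value) {
  return (typeof value === "object" && value !== null) || typeof value === "function";
}

function ordinaryToPrimitive(obj, hint) {
  // hint "string" tries toString first; hint "number" tries valueOf first.
  const methodNames = hint === "string" ? ["toString", "valueOf"] : ["valueOf", "toString"];
  for (const name of methodNames) {
    const method = obj[name];
    if (typeof method === "function") {
      const result = method.call(obj);
      if (!isObject(result)) return result;
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

function toPrimitive(input, preferredType) {
  if (!isObject(input)) return input;                                   // step 2
  let hint = preferredType === undefined ? "default" : preferredType;   // steps 1.1-1.3
  const exoticToPrim = input[Symbol.toPrimitive];                       // step 1.4 (GetMethod)
  if (exoticToPrim !== undefined && exoticToPrim !== null) {            // step 1.5
    if (typeof exoticToPrim !== "function") {
      throw new TypeError("Symbol.toPrimitive is not callable");
    }
    const result = exoticToPrim.call(input, hint);
    if (!isObject(result)) return result;
    throw new TypeError("Symbol.toPrimitive returned an object");
  }
  if (hint === "default") hint = "number";                              // step 1.6
  return ordinaryToPrimitive(input, hint);                              // step 1.7
}

console.log(JSON.stringify(toPrimitive([])));    // ""
console.log(toPrimitive({}));                    // [object Object]
console.log(toPrimitive(new Date(0), "number")); // 0
```

Passing the hint as a plain string ("string" or "number") is a simplification; the spec distinguishes the preferredType argument from the hint it derives.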

Finding Information Quickly

Use the table of contents (pp. i-x). It’s detailed. For example, to find how Array.prototype.map works:

  1. Look up “Indexed Collections” → Section 24.

  2. Navigate to § → “Array.prototype.map” (p. 588).

The spec has internal hyperlinks in the PDF. Click on a reference like “§7.1.1” to jump there.

Use the PDF’s text search as an index. Search for “ToPrimitive,” “Completion Record,” “Lexical Environment,” etc.

Cross-Referencing with MDN

While ECMA-262 is authoritative, MDN Web Docs (https://developer.mozilla.org) is more readable for learning. MDN explains what and why; the spec explains how and edge cases.

Workflow:

  1. Learn from MDN.

  2. Verify behavior in the spec.

  3. Understand nuances (like why [] == ![] is true) by tracing through abstract operations.

Internal Slots and Methods

JavaScript objects have internal slots (data) and internal methods (behavior). These are spec constructs, not accessible in code.

Example: Every function has:

  • [[Environment]] (internal slot): The lexical environment where it was created (enables closures).

  • [[Call]] (internal method): What happens when you invoke it.

Ordinary objects implement [[Get]] and [[Set]] for property access. Exotic objects (like Proxies) override these.

Systems Analogy: Internal slots are like private fields in a C++ class. Internal methods are virtual functions—exotic objects provide custom implementations.
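Although internal methods cannot be called directly, a Proxy lets you observe them, because each trap corresponds to one internal method. The sketch below logs [[Get]] and [[Set]] and forwards to the default behavior through Reflect; the names target, log, and proxy are illustrative.

```javascript
// A Proxy's "get" and "set" traps stand in for the [[Get]] and [[Set]]
// internal methods. Reflect.get / Reflect.set perform the ordinary behavior.
const target = { x: 1 };
const log = [];

const proxy = new Proxy(target, {
  get(obj, key, receiver) {
    log.push(`get ${String(key)}`);
    return Reflect.get(obj, key, receiver);
  },
  set(obj, key, value, receiver) {
    log.push(`set ${String(key)}`);
    return Reflect.set(obj, key, value, receiver);
  },
});

const seen = proxy.x; // triggers the get trap
proxy.y = 2;          // triggers the set trap, then writes through to target

console.log(seen, target.y, log);
```

In the C++ analogy, the handler object plays the role of a subclass that overrides two virtual functions and calls the base implementation.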

Completion Records

Most abstract operations return a Completion Record, which is either:

  • Normal completion: { [[Type]]: normal, [[Value]]: v }

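  • Abrupt completion: [[Type]] is throw, return, break, or continue, with any thrown or returned value carried in [[Value]].

The ? prefix (shorthand for ReturnIfAbrupt) checks whether a completion is abrupt and, if so, returns it from the current algorithm immediately, which propagates it up the chain of callers.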

Why this matters: Understanding Completion Records explains how exceptions work internally and why certain operations can fail.
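A toy model may make the ? shorthand concrete. The record shape and the helper names below (normal, throwCompletion, toDigit, addDigits) are invented for illustration; real engines use native exceptions rather than allocating records like this.

```javascript
// Toy Completion Records: every operation returns { type, value }.
const normal = (value) => ({ type: "normal", value });
const throwCompletion = (value) => ({ type: "throw", value });

// A fallible "abstract operation": converts a single digit character.
function toDigit(ch) {
  return /^[0-9]$/.test(ch)
    ? normal(Number(ch))
    : throwCompletion(new TypeError(`not a digit: ${ch}`));
}

// Models the spec text:
//   1. Let a be ? toDigit(x).
//   2. Let b be ? toDigit(y).
//   3. Return a + b.
function addDigits(x, y) {
  const c1 = toDigit(x);
  if (c1.type !== "normal") return c1; // what "?" expands to
  const c2 = toDigit(y);
  if (c2.type !== "normal") return c2;
  return normal(c1.value + c2.value);
}

console.log(addDigits("2", "3").value); // 5
console.log(addDigits("2", "x").type);  // throw
```

The pattern is the same as returning an error code in C or using the ? operator on a Result in Rust.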

TC39 and the Standards Process

Who is TC39?

Technical Committee 39 is the Ecma International committee responsible for ECMAScript. Members include:

  • Browser vendors: Google (V8), Mozilla (SpiderMonkey), Apple (JavaScriptCore), Microsoft (formerly Chakra, now contributes to V8).

  • Large tech companies: Facebook/Meta, Netflix, PayPal, Airbnb, Bloomberg.

  • Individual experts: Academics, language designers, and community representatives.

TC39 meets every two months (in-person or remote) to discuss proposals.

The Proposal Process

New features follow a stage-based process:

Stage 0: Strawperson

Anyone can submit an idea. It’s just a concept, not a formal proposal. Many ideas die here.

Example: “What if JavaScript had pattern matching?”

Stage 1: Proposal

The committee agrees the problem is worth solving. A champion (a TC39 member or invited expert) takes ownership. The proposal needs:

  • A clear problem statement.

  • High-level API design.

  • Potential challenges identified.

Example: Pattern matching reaches Stage 1 with a champion outlining use cases and syntax options.

Stage 2: Draft

The proposal has a formal specification text (written in ECMA-262 notation). The committee believes the feature will eventually be standardized, but details may change.

Engines may implement experimental versions (behind flags) for testing.

Example: Pattern matching gets draft spec text. An engine such as V8 could then ship it behind an experimental flag (the name --harmony-pattern-matching is illustrative here, not a real option).

Stage 3: Candidate

The spec text is complete. Engines are expected to implement it for real-world testing. Only critical issues will cause changes.

Example: Pattern matching is implemented in Firefox Nightly and Chrome Canary. Feedback from developers refines edge cases.

Stage 4: Finished

The feature is ready for inclusion in the next annual release. Test262 acceptance tests must be merged, two compatible implementations must pass them, and there must be significant in-the-field experience with shipping implementations.

Example (hypothetical): Pattern matching ships in a future annual edition after successful Stage 3 testing.

Consensus-Based Decision Making

TC39 operates by consensus, not majority vote. This means:

  • Everyone must agree (or at least not object strongly enough to block).

  • Compromises are common: Features are diluted or redesigned to satisfy objections.

  • Progress is slow but stable: The committee avoids repeating ES4, an overly ambitious revision that was abandoned after years of work.

Engineering Trade-off: Consensus prevents fragmentation but can lead to “design by committee” where elegant solutions are compromised for political reasons.

Test262: The Conformance Test Suite

Test262 (https://github.com/tc39/test262) is the official conformance test suite. It has over 70,000 tests covering every feature in ECMA-262.

Engines run Test262 to ensure compliance. If your JavaScript engine fails Test262, it’s not conformant.

For implementers: Test262 is your ground truth. If you’re building a JavaScript compiler or interpreter, you must pass Test262.

Strict vs. Sloppy Mode: Why It Matters to You

The Historical Accident

JavaScript was designed to be forgiving. If you forgot var, the variable became global. If you used reserved words as identifiers, it often worked. If you assigned to undefined, it silently failed.

This forgiveness was a mistake. It led to bugs, performance cliffs, and security issues. But backward compatibility meant these mistakes couldn’t be fixed without breaking the web.

Solution: ES5 introduced strict mode in 2009.

Enabling Strict Mode

Add "use strict"; at the top of a file or function:

"use strict";

function example() {
    // This function runs in strict mode
    x = 10; // ReferenceError: x is not defined
}

In ES6 modules, strict mode is implicit. You don’t need "use strict"; because modules are always strict.

// Inside an ES6 module file (imported via <script type="module">)
x = 10; // ReferenceError, even without "use strict"

Key Differences

| Sloppy Mode (Default)                                 | Strict Mode              | Why It Matters              |
|-------------------------------------------------------|--------------------------|-----------------------------|
| Undeclared variables become global                    | ReferenceError           | Prevents accidental globals |
| Assignment to read-only properties fails silently     | TypeError                | Catches bugs earlier        |
| delete on non-configurable properties fails silently  | TypeError                | Avoids confusion            |
| Octal literals (0123) allowed                         | SyntaxError              | Octal is error-prone        |
| with statement allowed                                | SyntaxError              | with breaks optimizations   |
| this in functions is global object                    | this is undefined        | Safer default               |
| Duplicate parameter names allowed                     | SyntaxError              | Prevents ambiguity          |
| arguments is aliased to parameters                    | arguments is independent | Simplifies semantics        |
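Several of these rows can be checked at runtime. The sketch below uses a function-level "use strict" directive so that it behaves the same whether the file is a classic script or a module; strictDemo and __dossierUndeclared are illustrative names, and the syntax-error rows (octal literals, with, duplicate parameters) are omitted because they would prevent the file from parsing at all.

```javascript
function strictDemo() {
  "use strict"; // applies to this function and everything nested in it
  const results = {};

  // Row: this in a plain function call is undefined.
  results.thisValue = (function () { return this; })();

  // Row: assignment to a read-only property throws.
  const frozen = Object.freeze({ x: 1 });
  try {
    frozen.x = 2;
    results.readOnlyWrite = "no error";
  } catch (e) {
    results.readOnlyWrite = e.constructor.name;
  }

  // Row: assignment to an undeclared variable throws.
  try {
    __dossierUndeclared = 1;
    results.undeclared = "no error";
  } catch (e) {
    results.undeclared = e.constructor.name;
  }

  // Row: arguments no longer tracks parameter reassignment.
  results.argumentsAliased = (function (a) { a = 99; return arguments[0]; })(1) === 99;

  return results;
}

console.log(strictDemo());
```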

Performance Implications

Strict mode enables optimizations:

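  • No with statement: Identifier lookups can be resolved statically instead of consulting a runtime scope object.

  • No this boxing: A primitive this is passed through unchanged, and a plain call leaves this as undefined instead of substituting the global object.

  • Elimination of arguments aliasing: Simplifies stack frame layout.

Figures such as 10-20% faster are sometimes quoted for specific workloads, but the effect depends heavily on the engine and the code, so measure before relying on it.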

When to Use Strict Mode

Always. There’s no reason to use sloppy mode in new code. ES6 modules and class bodies enforce it automatically, and tooling can enforce it elsewhere (ESLint’s strict rule, TypeScript’s alwaysStrict option).

Exception: You’re maintaining legacy code that depends on sloppy behavior. In that case, incrementally migrate to strict mode.

The Two Faces of JavaScript: Browser (Mobile) vs. Node.js (Resident)

Mobile ECMA-262: The Browser Environment

“Mobile” refers to JavaScript running in a browser, where it’s transient—code is downloaded, executed, and discarded. The environment provides:

  • DOM APIs: document, window, Element, etc.

  • Browser APIs: fetch, localStorage, WebSocket, WebRTC.

  • Event-driven model: User clicks, network responses, timers.

  • Security sandboxing: Same-origin policy, CSP, no file system access.

The browser is a hostile environment—your code runs alongside untrusted code from other origins, so security is paramount.

Key characteristics:

  • No file system: You can’t open files directly.

  • Limited persistence: localStorage is capped at ~5-10MB.

  • Asynchronous everything: Network, user input, timers—all callback-based.

Resident ECMA-262: The Node.js Environment

“Resident” refers to JavaScript running on a server (or desktop) via Node.js, where it’s long-lived—the process stays up, handling requests or tasks continuously. Node.js provides:

  • File system APIs: fs.readFile, fs.writeFile, etc.

  • Network APIs: http, https, net, dgram (UDP).

  • Process control: child_process, cluster, process.exit().

  • Streams: Efficient I/O with backpressure.

Node.js is a trusted environment—you control the process, and it has full system access (with OS permissions).

Key characteristics:

  • Full file system access: Read/write/delete files.

  • Long-lived state: Variables persist across requests.

  • Synchronous file operations: fs.readFileSync() blocks, unlike the browser.

The Common Core

Both environments implement ECMA-262, so core JavaScript is identical:

  • Primitives, objects, functions, closures.

  • Promises, async/await, generators.

  • ES6 modules (with caveats).

Differences are in the host environment (DOM vs. Node.js APIs), not the language.

The Module System Divide

Browser: ES6 modules (import/export) are the standard, but CommonJS isn’t supported natively. Bundlers (Webpack, Rollup) transform modules for the browser.

Node.js: Originally used CommonJS (require()/module.exports). ES6 modules are now supported (behind a flag in early Node 12 releases, unflagged from 12.17 and 13.2), but the ecosystem is split:

  • .mjs files: ES6 modules.

  • .cjs files: CommonJS modules.

  • .js files: Depends on package.json "type": "module" or "type": "commonjs".

Systems Programmer’s Note: This is like the mess of linking C libraries—static vs. dynamic, .a vs. .so, name mangling. Node.js is trying to unify, but legacy code remains.

Tooling Overlap

Many tools work in both environments:

  • Babel: Transpiles modern JavaScript to older versions.

  • ESLint: Lints code for errors and style.

  • Jest: Testing framework.

  • Webpack: Bundles modules (primarily for browsers, but can target Node.js).

Performance Differences

Chrome and Node.js both embed V8, but the two hosts have different priorities:

  • Browser: Optimizes for quick startup (website loads must be fast).

  • Node.js: Optimizes for throughput (servers run continuously).

Garbage collection priorities tend to differ as well (these are tendencies, not fixed engine modes):

  • Browser: Frequent small GC pauses (don’t block rendering).

  • Node.js: Fewer, larger GC pauses (throughput matters more than latency).

Reading the Spec Like a Compiler Writer

The Mindset Shift

If you’ve written compilers, you’re familiar with:

  • Formal grammars: BNF, EBNF, or similar notation.

  • Attribute grammars: Syntax with semantic rules attached.

  • Operational semantics: Step-by-step execution rules.

ECMA-262 uses all three:

  • Grammar: §12-16 define lexical and syntactic grammar (similar to BNF).

  • Semantic rules: §8 defines syntax-directed operations (like attribute grammars).

  • Abstract operations: §7 defines operational semantics (step-by-step algorithms).

Your advantage: You already think this way. The spec is just documentation of the abstract machine.

Example: Tracing Type Coercion

Let’s trace why [] == ![] is true:

  1. Parse: [] == ![]

    • Left: [] (empty array)

    • Right: ![] (logical NOT of empty array)

  2. Evaluate right side:

    • ![] → Is [] truthy? Yes (all objects are truthy). → !true → false.

  3. Now we have: [] == false

  4. Apply Abstract Equality (§7.2.15, p. 74):

    • Step 8: If one operand is Boolean, convert it to Number.

    • false → ToNumber(false) → 0.

    • Now: [] == 0.

  5. Apply Abstract Equality again:

    • Step 10: If one operand is Object and the other is Number, convert Object to primitive.

    • [] → ToPrimitive([]) (§7.1.1, p. 63).

    • ToPrimitive([]) calls OrdinaryToPrimitive([], "number").

    • Tries valueOf(): [].valueOf() → [] (still an object).

    • Tries toString(): [].toString() → "" (empty string).

    • Now: "" == 0.

  6. Apply Abstract Equality again:

    • Step 6: If one operand is String and the other is Number, convert String to Number.

    • "" → ToNumber("") → 0.

    • Now: 0 == 0.

  7. Result: true.

Takeaway: By following the spec’s algorithmic steps, you can predict any JavaScript behavior. This is essential when writing a transpiler or debugging generated code.
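Each intermediate value in the trace above can be observed from ordinary code. The sketch below uses Number() as a stand-in for the spec's ToNumber; the steps object is just a convenient way to print the results.

```javascript
// Reproduces the intermediate values of the [] == ![] trace.
const empty = [];

const steps = {
  notArray: !empty,                                     // step 2: false
  boolToNumber: Number(false),                          // step 4: 0
  valueOfIsObject: typeof empty.valueOf() === "object", // step 5: valueOf() is rejected
  toStringResult: empty.toString(),                     // step 5: ""
  stringToNumber: Number(""),                           // step 6: 0
  finalResult: empty == !empty,                         // step 7: true
};

console.log(steps);
```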

Grammars in the Spec

The spec uses two grammars:

Lexical Grammar (§12)

Defines tokens: identifiers, keywords, literals, punctuation.

Example (§12.6.1, p. 221):

IdentifierName ::
    IdentifierStart
    IdentifierName IdentifierPart

IdentifierStart ::
    UnicodeIDStart
    $
    _
    \ UnicodeEscapeSequence

IdentifierPart ::
    UnicodeIDContinue
    $
    _
    \ UnicodeEscapeSequence

This says identifiers start with a Unicode ID start character, $, or _, and can contain those plus Unicode ID continue characters.

Systems Analogy: This is like the lexer rules in lex or flex.
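As a rough approximation, the IdentifierName productions can be written as one regular expression using Unicode property escapes. This sketch deliberately ignores \u escape sequences and does not reject reserved words, both of which a real lexer must handle; the names identifierName and isIdentifierName are our own.

```javascript
// ID_Start / ID_Continue are the Unicode properties behind
// UnicodeIDStart / UnicodeIDContinue. The "u" flag enables \p{...}.
const identifierName = /^[\p{ID_Start}$_][\p{ID_Continue}$]*$/u;

function isIdentifierName(text) {
  return identifierName.test(text);
}

for (const candidate of ["foo", "$el", "_private", "café", "1abc", "a-b", ""]) {
  console.log(JSON.stringify(candidate), isIdentifierName(candidate));
}
```

The underscore needs no separate entry in the second character class because U+005F already has the ID_Continue property.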

Syntactic Grammar (§13-16)

Defines how tokens combine into statements and expressions.

Example (§13.5, p. 256):

ConditionalExpression[In, Yield, Await] :
    ShortCircuitExpression[?In, ?Yield, ?Await]
    ShortCircuitExpression[?In, ?Yield, ?Await] ? AssignmentExpression[+In, ?Yield, ?Await] : AssignmentExpression[?In, ?Yield, ?Await]

This defines the ternary operator: condition ? trueExpr : falseExpr.

The [In, Yield, Await] are grammar parameters (context-sensitive flags). For example, [+In] means “in” is allowed, [?In] means “inherit from parent.”

Systems Analogy: This is like yacc or bison grammar rules.

Semantic Rules: RS: Evaluation

Runtime Semantics: Evaluation (§8.1, p. 93) defines what each syntactic construct does.

Example (§13.5.1, p. 257):

ConditionalExpression : ShortCircuitExpression ? AssignmentExpression : AssignmentExpression

  1. Let lref be ? Evaluation of ShortCircuitExpression.

  2. Let lval be ToBoolean(? GetValue(lref)).

  3. If lval is true, then

    1. Let trueRef be ? Evaluation of the first AssignmentExpression.
    2. Return ? GetValue(trueRef).
  4. Else,

    1. Let falseRef be ? Evaluation of the second AssignmentExpression.
    2. Return ? GetValue(falseRef).

This says: evaluate the condition, convert to boolean, and evaluate one branch depending on the result.

Systems Analogy: This is like instruction semantics in an ISA manual—each instruction’s behavior is defined step-by-step.
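In an interpreter, the evaluation rule above becomes one case of a tree walker. The sketch below borrows the ESTree field names (test, consequent, alternate); the "Thunk" node type and the leaf helper are hypothetical, added only so we can observe which operands are evaluated.

```javascript
// Minimal tree-walking evaluator for ConditionalExpression.
function evaluate(node) {
  switch (node.type) {
    case "Thunk": // hypothetical leaf: runs a callback and returns its value
      return node.run();
    case "ConditionalExpression": {
      const lval = Boolean(evaluate(node.test)); // steps 1-2: evaluate, then ToBoolean
      return lval
        ? evaluate(node.consequent)              // step 3: only this branch runs
        : evaluate(node.alternate);              // step 4: or only this one
    }
    default:
      throw new SyntaxError(`unsupported node type: ${node.type}`);
  }
}

const evaluated = [];
const leaf = (label, value) => ({
  type: "Thunk",
  run() { evaluated.push(label); return value; },
});

const ast = {
  type: "ConditionalExpression",
  test: leaf("test", []), // an empty array is truthy
  consequent: leaf("then", "truthy branch"),
  alternate: leaf("else", "falsy branch"),
};

const result = evaluate(ast);
console.log(result, evaluated);
```

The recorded order shows that the untaken branch is never evaluated, which is exactly what the spec steps require.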

Building a Mental Model

To use the spec effectively:

  1. Start with MDN to understand high-level behavior.

  2. Trace through the spec to understand edge cases.

  3. Implement in code (a parser, interpreter, or transpiler) to solidify understanding.

  4. Run Test262 to validate your implementation.

Example workflow:

  • You’re implementing Array.prototype.map for your transpiler.

  • Read MDN: “Calls a function on every element, returns new array.”

  • Read ECMA-262 § (p. 588): See the exact algorithm (checks for callability, handles thisArg, handles sparse arrays).

  • Implement based on the spec.

  • Run Test262 tests for Array.prototype.map to catch edge cases.

The Spec Is Your Ground Truth

When JavaScript behaves unexpectedly:

  • Don’t trust intuition: Trust the spec.

  • Don’t trust Stack Overflow: Trust the spec (but SO can point you to the right section).

  • Don’t trust the browser: If it contradicts the spec, it’s a browser bug (or you misread the spec).

The spec defines JavaScript. Everything else is commentary.


End of Chapter 1

In the next chapter, we’ll dive into JavaScript’s type system from a systems perspective—how primitives and objects are represented, how type coercion works at the algorithmic level, and why typeof lies to you.


Chapter 2: JavaScript’s Type System from a Systems Perspective

The Seven Language Types and Their Internal Representation

Introduction: Types as Tagged Unions

If you’ve worked in C, you’re familiar with tagged unions—a discriminated union where a tag indicates which variant is active:

enum TypeTag {
    TYPE_INT,
    TYPE_FLOAT,
    TYPE_POINTER
};

struct Value {
    enum TypeTag tag;
    union {
        int i;
        double f;
        void* ptr;
    } data;
};

JavaScript’s type system is conceptually similar. Every JavaScript value is a tagged union with one of seven possible types. The difference is that JavaScript hides the tag from you—you can’t directly access or manipulate it, though you can query it with typeof (which sometimes lies) or internal operations.

ECMA-262 §6.1 (p. 25) defines the seven language types:

  1. Undefined – A singleton type with one value: undefined.

  2. Null – A singleton type with one value: null.

  3. Boolean – Two values: true and false.

  4. String – Sequences of UTF-16 code units.

  5. Symbol – Unique, immutable identifiers (ES6+).

  6. Number – IEEE 754 double-precision floating-point.

  7. BigInt – Arbitrary-precision integers (ES2020+).

  8. Object – Collections of properties (wait, that’s eight!)

Actually, Object is special, which is how “seven” turns into eight. The first seven are primitive types: immutable and passed by value. Object is the only reference type: mutable and accessed through references.

Systems Insight: In most JavaScript engines (V8, SpiderMonkey), values are represented using pointer tagging or NaN boxing to pack the type tag into the pointer itself. More on this in §2.2.

Undefined: The Uninitialized Sentinel

Specification: §6.1.1 (p. 25)

undefined is the value of variables that have been declared but not assigned:

let x;
console.log(x); // undefined

Internal representation: In V8, undefined is represented as a special tagged pointer (kUndefinedValue). It’s a singleton—there’s only one undefined in memory.

Where you’ll see it:

  • Uninitialized variables.

  • Missing function arguments.

  • Missing object properties.

  • Functions that don’t explicitly return (implicitly return undefined).

Systems note: undefined is not the same as an uninitialized variable in C. In JavaScript, reading a let or const variable before its declaration has executed throws a ReferenceError rather than yielding undefined:

console.log(y); // ReferenceError (V8: "Cannot access 'y' before initialization")
let y;

This is because of the Temporal Dead Zone (TDZ)—let and const variables exist in scope but are uninitialized until their declaration is executed (§8.1.1.4.9, p. 124).

Null: The Intentional Absence

Specification: §6.1.2 (p. 25)

null represents the intentional absence of a value. It’s semantically different from undefined:

  • undefined: “I haven’t been initialized yet.”

  • null: “I explicitly have no value.”

let x = null; // Explicitly no value
let y;        // Implicitly undefined

The typeof null bug: One of JavaScript’s most infamous quirks:

typeof null; // "object" (WAT?!)

This is a bug from JavaScript’s original implementation. In the first JavaScript engine, values were stored in 32-bit words whose low bits (one to three of them) held a type tag. Objects had the tag 000, and null was represented as the machine null pointer (all zero bits), so the tag check incorrectly identified null as an object.

Brendan Eich has called this “the original sin” of JavaScript. It can’t be fixed without breaking millions of websites that depend on typeof null === "object".

Systems lesson: Early implementation decisions calcify into permanent language semantics when backward compatibility is sacrosanct.

Boolean: The Simple Case

Specification: §6.1.3 (p. 25)

true and false—nothing fancy here. Two singleton values.

Internal representation: In V8, booleans are immediate values (not heap-allocated). They’re represented as tagged pointers with a specific bit pattern.

Truthy and falsy: JavaScript has a concept of truthy and falsy values for use in conditional contexts. The falsy values are:

  • false

  • 0, -0, 0n (BigInt zero)

  • "" (empty string)

  • null

  • undefined

  • NaN

Everything else is truthy, including:

  • "0" (non-empty string, even if it looks like zero)

  • [] (empty array)

  • {} (empty object)

  • function() {} (any function)

Conversion to boolean: The abstract operation ToBoolean (§7.1.2, p. 64) defines this:

ToBoolean ( argument )

  1. If argument is a Boolean, return argument.

  2. If argument is undefined or null, return false.

  3. If argument is a Number, then

    1. If argument is +0, -0, or NaN, return false.
    2. Otherwise, return true.
  4. If argument is a String, then

    1. If argument is the empty String, return false.
    2. Otherwise, return true.
  5. If argument is a BigInt, then

    1. If argument is 0n, return false.
    2. Otherwise, return true.
  6. If argument is a Symbol, return true.

  7. If argument is an Object, return true.

Systems note: This is why [] == false can be true (§1.6 in Chapter 1)—the array is coerced to a primitive (""), which is then coerced to a number (0), which equals false coerced to a number (0). But if ([]) is always true because objects are truthy.
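The ToBoolean table translates directly into a switch on typeof. This is a sketch for comparison against the built-in Boolean(); the name toBoolean is ours, and the legacy browser special case for document.all is ignored. Note that 0n is falsy, matching the list of falsy values above.

```javascript
function toBoolean(argument) {
  switch (typeof argument) {
    case "boolean":   return argument;
    case "undefined": return false;
    case "number":    return !(argument === 0 || Number.isNaN(argument)); // === 0 also matches -0
    case "string":    return argument.length > 0;
    case "bigint":    return argument !== 0n;
    case "symbol":    return true;
    default:          return argument !== null; // objects and functions are truthy
  }
}

const samples = [false, true, 0, -0, NaN, 1, "", "0", 0n, 1n, null, undefined,
                 Symbol("s"), [], {}, function () {}];

for (const value of samples) {
  console.log(typeof value, toBoolean(value) === Boolean(value));
}
```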

String: UTF-16 Code Units, Not Characters

Specification: §6.1.4 (pp. 25-27)

JavaScript strings are sequences of 16-bit unsigned integers, representing UTF-16 code units. They are immutable and primitives (not objects, despite having methods via autoboxing—see §2.8).

UTF-16 encoding: JavaScript predates Unicode’s expansion beyond the Basic Multilingual Plane (BMP). Originally, 16 bits per character was sufficient. When Unicode added code points beyond U+FFFF, UTF-16 introduced surrogate pairs—two 16-bit code units representing one character.

Example: The emoji 💩 (U+1F4A9) is represented as two code units:

const poop = "💩";
console.log(poop.length);           // 2 (not 1!)
console.log(poop.charCodeAt(0));    // 55357 (0xD83D, high surrogate)
console.log(poop.charCodeAt(1));    // 56489 (0xDCA9, low surrogate)

This is a footgun: String.prototype.length counts code units, not code points and not user-perceived characters (grapheme clusters). For ASCII text, all three counts agree. Code points outside the BMP (most emoji) take two code units, and combining sequences span several code points, so the counts diverge.
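The two code units shown above are not arbitrary: they follow from the UTF-16 encoding arithmetic. The sketch below computes the surrogate pair by hand (toSurrogatePair is an illustrative name) and compares the code-unit count with the code-point count.

```javascript
// UTF-16 encoding of a supplementary code point (U+10000..U+10FFFF):
// subtract 0x10000, then split the remaining 20 bits into two 10-bit halves.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("not a supplementary code point");
  }
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F4A9);
console.log(high.toString(16), low.toString(16)); // d83d dca9

const text = "a\u{1F4A9}";
console.log(text.length);      // 3 code units
console.log([...text].length); // 2 code points
```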

ES6 improvements:

  • String.prototype.codePointAt(): Returns the full code point (handles surrogates).

  • String.fromCodePoint(): Creates strings from code points.

  • for...of iteration: Iterates by code points, not code units.

for (const char of "💩") {
    console.log(char); // Logs "💩" (one iteration)
}

Internal representation: Engines optimize string storage:

  • Latin-1 encoding: If all characters are < 256, store as one byte per character.

  • UTF-16 encoding: Otherwise, store as two bytes per code unit.

  • Ropes: Concatenated strings can be stored as trees of substrings (lazy concatenation).

  • Slices: Substrings can reference slices of parent strings without copying.

Systems note: JavaScript’s string model is awkward for modern Unicode text processing. If you’re building a transpiler that handles source code, you’ll need to be careful with positions—does your line/column counter use code units or code points? Source maps (§13.5) use zero-based code unit offsets.

Symbol: Unique, Unforgeable Identifiers

Specification: §6.1.5 (pp. 27-28)

Symbols were introduced in ES6 to solve the property key collision problem. Before symbols, all object property keys were strings. If you wanted to add a private or special property, you’d use a string like "__myPrivateProp" and hope no one else used the same name.

Symbols are unique: Every call to Symbol() creates a new, distinct symbol:

const sym1 = Symbol("description");
const sym2 = Symbol("description");
console.log(sym1 === sym2); // false

Symbols as property keys: You can use symbols as object property keys, and they won’t collide with string keys:

const mySymbol = Symbol("my-symbol");
const obj = {
    [mySymbol]: "value",
    "my-symbol": "different value"
};

console.log(obj[mySymbol]);     // "value"
console.log(obj["my-symbol"]);  // "different value"

Well-known symbols: The spec defines well-known symbols (§6.1.5.1, p. 28) for metaprogramming:

  • Symbol.iterator: Defines the default iterator for an object (enables for...of).

  • Symbol.toStringTag: Customizes Object.prototype.toString() behavior.

  • Symbol.toPrimitive: Customizes type coercion (§7.1.1, p. 63).

  • Symbol.hasInstance: Customizes instanceof behavior.

  • And several more, such as Symbol.asyncIterator, Symbol.species, and Symbol.unscopables.
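A small class shows three of these hooks working together. The Celsius class is invented for illustration; the point is that built-in operations look up these symbol-keyed methods instead of fixed string names.

```javascript
class Celsius {
  constructor(degrees) {
    this.degrees = degrees;
  }
  // Called by ToPrimitive with hint "string", "number", or "default".
  [Symbol.toPrimitive](hint) {
    return hint === "string" ? `${this.degrees} °C` : this.degrees;
  }
  // Read by Object.prototype.toString.
  get [Symbol.toStringTag]() {
    return "Celsius";
  }
  // Used by for...of and spread.
  *[Symbol.iterator]() {
    yield this.degrees;
  }
}

const temp = new Celsius(21);
console.log(`${temp}`);                            // 21 °C    (hint "string")
console.log(temp + 1);                             // 22       (hint "default")
console.log(temp * 2);                             // 42       (hint "number")
console.log(Object.prototype.toString.call(temp)); // [object Celsius]
console.log([...temp]);                            // [ 21 ]
```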

Internal representation: Symbols are heap-allocated objects with a unique identifier. The “description” is just metadata for debugging—it doesn’t affect uniqueness.

Global symbol registry: Symbol.for(key) creates or retrieves a symbol from a global registry:

const sym1 = Symbol.for("app.id");
const sym2 = Symbol.for("app.id");
console.log(sym1 === sym2); // true (same symbol)

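Systems use case: If you’re implementing object property access in a transpiler, you need to handle symbol keys differently from string keys. Symbol-keyed properties are skipped by for...in, Object.keys(), and JSON.stringify(), even when their [[Enumerable]] attribute is true; retrieve them with Object.getOwnPropertySymbols() or Reflect.ownKeys().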
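The following sketch shows which enumeration APIs see a symbol-keyed property and which skip it. The names hidden and record are illustrative. Note that the property's enumerable attribute is still true; the string-oriented APIs skip symbol keys regardless.

```javascript
const hidden = Symbol("hidden");
const record = { visible: 1, [hidden]: 2 };

const forInKeys = [];
for (const key in record) forInKeys.push(key);

console.log(forInKeys);                            // [ 'visible' ]
console.log(Object.keys(record));                  // [ 'visible' ]
console.log(JSON.stringify(record));               // {"visible":1}
console.log(Object.getOwnPropertySymbols(record)); // [ Symbol(hidden) ]
console.log(Reflect.ownKeys(record));              // [ 'visible', Symbol(hidden) ]
console.log(Object.prototype.propertyIsEnumerable.call(record, hidden)); // true
```

A transpiler that copies objects property by property therefore needs Reflect.ownKeys (or both getOwnPropertyNames and getOwnPropertySymbols) to avoid silently dropping symbol keys.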

Number: IEEE 754 Double-Precision Floating-Point

Specification: §6.1.6 (pp. 28-31)

JavaScript’s Number type is IEEE 754-2019 binary64 (double-precision floating-point). This means:

  • 64 bits total: 1 sign bit, 11 exponent bits, 52 mantissa bits (plus 1 implicit leading bit).

  • Range: Approximately $\pm 1.7976931348623157 \times 10^{308}$.

  • Precision: 15-17 significant decimal digits.

  • Special values: +0, -0, +Infinity, -Infinity, NaN.

Why double-precision? JavaScript was designed for simple scripting, and doubles were “good enough” for most use cases. The assumption was that you wouldn’t need 64-bit integers (spoiler: you do, hence BigInt in ES2020).

The Integer Range Problem

Safe integer range: Integers can be represented exactly in the range $[-(2^{53} - 1), 2^{53} - 1]$. This is because the mantissa has 52 bits plus 1 implicit bit (53 bits total).

console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991 (2^53 - 1)
console.log(Number.MIN_SAFE_INTEGER); // -9007199254740991 (-(2^53 - 1))

Beyond this range, not all integers can be represented exactly:

console.log(9007199254740992 === 9007199254740993); // true (WAT?!)

Both values round to the same double-precision representation.

Systems impact: If you’re implementing a compiler backend targeting JavaScript, you can’t use Number for 64-bit integers. You need BigInt (§2.1.7) or a custom library (like bn.js or bignumber.js).

Signed Zero

IEEE 754 has two zeros: +0 and -0. They compare as equal, but have different bitwise representations:

console.log(+0 === -0);         // true
console.log(1 / +0);            // Infinity
console.log(1 / -0);            // -Infinity
console.log(Object.is(+0, -0)); // false (Object.is distinguishes them)

Why signed zero exists: It preserves the sign of underflow in floating-point computations. For example, $\lim_{x \to 0^-} x = -0$.

Systems note: If you’re generating JavaScript code that performs floating-point math, be aware that -0 can appear unexpectedly (e.g., -1 * 0 evaluates to -0). Because -0 === 0 is true, detect it with Object.is(x, -0) or 1 / x === -Infinity.
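The "different bitwise representations" can be inspected with a DataView, which exposes the raw IEEE 754 bytes of a Number. The helper name doubleBits is our own; it writes the double big-endian and reads the same 8 bytes back as a 64-bit integer.

```javascript
// Returns the 64-bit pattern of a double as 16 hex digits.
function doubleBits(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // big-endian by default
  return view.getBigUint64(0).toString(16).padStart(16, "0");
}

console.log(doubleBits(0));        // 0000000000000000
console.log(doubleBits(-0));       // 8000000000000000 (only the sign bit is set)
console.log(doubleBits(1));        // 3ff0000000000000 (biased exponent 0x3ff)
console.log(doubleBits(Infinity)); // 7ff0000000000000 (exponent all ones, mantissa zero)
```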

NaN: Not a Number (But Still a Number)

NaN represents the result of invalid operations:

console.log(0 / 0);         // NaN
console.log(Math.sqrt(-1)); // NaN
console.log(parseInt("foo")); // NaN

NaN is not equal to itself:

console.log(NaN === NaN); // false

This is per the IEEE 754 spec: NaN is unordered, so ==, ===, <, >, <=, and >= all return false when either operand is NaN (only != and !== return true).

To check for NaN, use:

Number.isNaN(value);     // Strict check (ES6, preferred)
isNaN(value);            // Global function (coerces to Number first, don't use)
Object.is(value, NaN);   // Also works
value !== value;         // Classic hack (only NaN is not equal to itself)

Systems note: If you’re implementing numeric operations in your transpiler, you must handle NaN propagation correctly. For example, NaN + 1 and NaN * 0 both evaluate to NaN (test the result with Number.isNaN, since comparing against NaN with === is always false).

Infinity

Infinity and -Infinity represent overflow:

console.log(1 / 0);    // Infinity
console.log(-1 / 0);   // -Infinity
console.log(1e308 * 10); // Infinity (overflow)

Operations with Infinity:

  • Infinity + 1 === Infinity

  • Infinity * 2 === Infinity

  • Infinity - Infinity → NaN (indeterminate form)

  • Infinity / Infinity → NaN

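Systems note: Infinity behaves like the IEEE 754 special value. If you’re compiling to JavaScript, be aware that integer overflow in the source language (e.g., C’s INT_MAX + 1) won’t translate directly: Number arithmetic neither wraps nor traps. The sum simply becomes 2147483648, exactness is lost beyond 2^53, and Infinity appears only past about 1.8 × 10^308. Only the bitwise operators (and Math.imul) truncate to 32 bits.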
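A code generator that needs two's-complement 32-bit semantics typically forces them explicitly. The sketch below shows the common idioms: | 0 for signed truncation, >>> 0 for unsigned, and Math.imul for multiplication, where truncating the double product afterwards gives the wrong answer. Variable names are illustrative.

```javascript
const INT_MAX = 2147483647;

const plainSum = INT_MAX + 1;         // 2147483648: no wrap, no Infinity
const wrappedSum = (INT_MAX + 1) | 0; // -2147483648: ToInt32 truncation
const asUnsigned = -1 >>> 0;          // 4294967295: ToUint32

// Multiplication needs Math.imul: the exact product (2^31 - 1)^2 exceeds 2^53,
// so the double result is already rounded before "| 0" can truncate it.
const roundedProduct = (INT_MAX * INT_MAX) | 0; // 0 (low bits lost to rounding)
const exactProduct = Math.imul(INT_MAX, INT_MAX); // 1 (true low 32 bits)

console.log(plainSum, wrappedSum, asUnsigned, roundedProduct, exactProduct);
```

This is the same convention asm.js used to annotate integer types, so engines recognize and optimize these patterns.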

BigInt: Arbitrary-Precision Integers

Specification: §6.1.6.2 (pp. 31-34)

BigInt was added in ES2020 to address the Number type’s inability to represent large integers. BigInt values are arbitrary-precision integers—they can represent integers of any size (limited only by memory).

Creating BigInts:

const big1 = 1234567890123456789012345678901234567890n; // Literal suffix 'n'
const big2 = BigInt("1234567890123456789012345678901234567890");
const big3 = BigInt(123); // Convert Number to BigInt (must be integer)

Operations:

const a = 10n;
const b = 20n;

console.log(a + b);  // 30n
console.log(a * b);  // 200n
console.log(b / a);  // 2n (integer division, truncates)
console.log(b % a);  // 0n
console.log(a ** b); // 100000000000000000000n (10^20)

No mixing with Number:

console.log(10n + 20);  // TypeError: Cannot mix BigInt and other types

You must explicitly convert:

console.log(10n + BigInt(20)); // 30n
console.log(Number(10n) + 20); // 30 (converts BigInt to Number, may lose precision)

Comparisons work across types:

console.log(10n == 10);  // true (abstract equality coerces)
console.log(10n === 10); // false (strict equality doesn't coerce)
console.log(10n < 20);   // true (relational comparison coerces)

Internal representation: BigInts are heap-allocated values with dynamically-sized digit arrays. In V8, a BigInt is stored as a sign, a length, and an array of machine-word digits (32-bit or 64-bit, depending on architecture), so even small values live on the heap.

Systems use case: If you’re compiling a language with 64-bit integers (C, Rust, Go) to JavaScript, you have two options:

  1. Use Number for the safe integer range ($[-(2^{53}-1), 2^{53}-1]$) and error/wrap on overflow.

  2. Use BigInt for all integers, accepting the performance cost (BigInt operations are slower than Number).

Emscripten (C/C++ to WebAssembly) originally used asm.js with 32-bit integers. For 64-bit integers, it used a pair of 32-bit values. With WebAssembly, 64-bit integers are native (i64 type).
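If you take the BigInt route, fixed-width wrapping has to be applied explicitly, because BigInt itself never overflows. BigInt.asIntN and BigInt.asUintN do this; the helper name wrappingAddI64 is our own and is only one possible way to model a source language's 64-bit addition.

```javascript
const I64_MAX = 2n ** 63n - 1n;
const I64_MIN = -(2n ** 63n);

// Models two's-complement 64-bit signed addition.
function wrappingAddI64(a, b) {
  return BigInt.asIntN(64, a + b);
}

console.log(I64_MAX + 1n);                // 9223372036854775808n (no overflow)
console.log(wrappingAddI64(I64_MAX, 1n)); // -9223372036854775808n (wrapped)
console.log(BigInt.asUintN(64, -1n));     // 18446744073709551615n (as u64)
```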

Object: The Reference Type

Specification: §6.1.7 (pp. 34-43)

Object is the catch-all type for everything that’s not a primitive. This includes:

  • Plain objects: { key: value }

  • Arrays: [1, 2, 3]

  • Functions: function() {}

  • Dates: new Date()

  • RegExps: /pattern/

  • Maps, Sets, WeakMaps, WeakSets

  • Promises

  • And more.

Key distinction: Primitives are immutable and passed by value. Objects are mutable and passed by reference:

let a = 5;
let b = a;
b = 10;
console.log(a); // 5 (unchanged)

let obj1 = { x: 5 };
let obj2 = obj1;
obj2.x = 10;
console.log(obj1.x); // 10 (mutated!)

Internal structure: Objects are collections of properties. Each property has:

  • Key: String or Symbol.

  • Value: Any JavaScript value.

  • Attributes: [[Writable]], [[Enumerable]], [[Configurable]].

Properties are either data properties (have a value) or accessor properties (have a getter/setter).

Property descriptors (§6.2.5, pp. 45-46):

const obj = {};
Object.defineProperty(obj, "x", {
    value: 42,
    writable: false,
    enumerable: true,
    configurable: false
});

console.log(obj.x); // 42
obj.x = 100;        // Fails silently in sloppy mode, TypeError in strict mode
console.log(obj.x); // Still 42
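
To see what a property's attributes actually are, read the descriptor back with Object.getOwnPropertyDescriptor. The sketch below uses a hypothetical helper, tryWrite, whose body is strict-mode code so that the failed write becomes observable as a TypeError.

```javascript
const point = {};
// Attributes omitted from the descriptor default to false.
Object.defineProperty(point, "x", { value: 42 });

const desc = Object.getOwnPropertyDescriptor(point, "x");
console.log(desc.writable, desc.enumerable, desc.configurable); // false false false

// Hypothetical helper: a strict-mode function makes the failed write throw.
function tryWrite(target, key, value) {
    "use strict";
    try {
        target[key] = value;
        return "ok";
    } catch (err) {
        return err instanceof TypeError ? "TypeError" : "other error";
    }
}

console.log(tryWrite(point, "x", 100)); // "TypeError"
console.log(point.x);                   // 42
console.log(Object.keys(point).length); // 0, because x is not enumerable
```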

Internal slots and methods: Objects have internal slots (data) and internal methods (behavior). For example:

  • [[Prototype]]: The object’s prototype (for inheritance).

  • [[Get]]: Invoked when accessing a property.

  • [[Set]]: Invoked when assigning to a property.

  • [[Call]]: Invoked when calling a function (functions are callable objects).

These are specification mechanisms, not directly accessible in code. You interact with them indirectly through built-in methods such as Object.getPrototypeOf() or the Reflect API.

Ordinary vs. exotic objects:

  • Ordinary objects: Follow standard property access semantics.

  • Exotic objects: Have custom internal methods. Examples:

    • Arrays: Custom [[DefineOwnProperty]] to update length.

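    • Functions: Have [[Call]] and, for constructors, [[Construct]]. Strictly speaking, plain function objects are ordinary objects with these extra internal methods; the spec classes only special cases such as bound functions as exotic.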

    • Proxy: Intercepts all internal methods.

    • Bound functions: Custom [[Call]] to bind this.
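
The difference is observable from ordinary code. The short sketch below shows the Array length behavior and a Proxy get trap; the variable names are arbitrary.

```javascript
// Array exotic object: writing an index past the end updates length.
const arr = ["a", "b"];
arr[5] = "f";
console.log(arr.length); // 6

// Shrinking length deletes the elements beyond it.
arr.length = 1;
console.log(arr.length, 5 in arr); // 1 false

// Proxy exotic object: the get trap replaces the ordinary [[Get]].
const logged = [];
const target = { x: 1 };
const proxy = new Proxy(target, {
    get(obj, key, receiver) {
        logged.push(key);
        return Reflect.get(obj, key, receiver);
    }
});

console.log(proxy.x); // 1
console.log(logged);  // ["x"]
```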

We’ll cover objects in depth in Chapter 3. For now, understand that Object is the foundation of JavaScript’s type system.

Pointer Tagging and NaN Boxing: How Engines Optimize Values

The Problem: Representing Dynamic Types Efficiently

JavaScript values have dynamic types—a variable can hold any type at runtime. The naive approach is a struct with a type tag and a union:

struct JSValue {
    enum { UNDEFINED, NULL, BOOL, NUMBER, STRING, OBJECT, SYMBOL, BIGINT } tag;
    union {
        bool boolean;
        double number;
        char* string;
        JSObject* object;
        // ...
    } data;
};

But this wastes space:

  • Size: On a 64-bit system, enum is 4 bytes (aligned to 8), union is 8 bytes (for a pointer or double). Total: 16 bytes per value.

  • Cache efficiency: Larger values mean fewer fit in CPU cache lines.

JavaScript engines need to optimize this. The two main techniques are pointer tagging and NaN boxing.

Pointer Tagging (V8’s Smi)

Pointer tagging exploits the fact that pointers are aligned—on a 64-bit system, heap-allocated objects are aligned to 8-byte boundaries, so the low 3 bits of a pointer are always 0.

V8’s Smi (Small Integer): If a value is a small integer (31 bits on 32-bit systems and on 64-bit builds with pointer compression, 32 bits on 64-bit builds without it), V8 stores it directly in the tagged word with the low bit set to 0. Heap pointers are the ones that carry a tag: V8 sets their low bit to 1, which is safe because alignment guarantees that bit is otherwise unused.

Smi (31-bit): [31-bit integer][0] (low bit is 0)
Pointer: [address bits][1] (low bit is 1; subtract 1 to recover the aligned address)

Example (31-bit Smi layout):

  • The integer 42 is stored as 42 << 1 = 84 = 0x54.

  • A pointer to an object at 0x7ffeefbff000 is stored as 0x7ffeefbff001 (low bit set to 1).

Advantages:

  • Fast integer operations: Check the low bit. If it’s 0, it’s a Smi. Because the tag is 0, addition and subtraction work directly on the tagged values; other operations shift to extract, operate, and shift back.

  • No heap allocation: Smis are immediate values.

Limitations:

  • Range: 31-bit Smis cover $[-2^{30},\ 2^{30}-1]$; 32-bit Smis (64-bit builds without pointer compression) cover $[-2^{31},\ 2^{31}-1]$.

  • Overflow: If an integer exceeds the Smi range, it must be boxed (heap-allocated as a HeapNumber).

Other tagged values: V8 also uses specific pointer values for singletons:

  • undefined: A specific tagged pointer.

  • null: Another specific tagged pointer.

  • true/false: Tagged pointers.
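
The tagging arithmetic can be simulated in plain JavaScript. The sketch below models a 32-bit tagged word with a 31-bit payload and assumes the convention used in V8’s source, where a clear low bit marks a small integer and a set low bit marks a heap pointer. All function names are hypothetical, and real engines do this in machine code, not JavaScript.

```javascript
// Simulation of a 32-bit tagged word with a 31-bit integer payload.
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

function tagSmi(n) {
    if (!Number.isInteger(n) || n < SMI_MIN || n > SMI_MAX) {
        throw new RangeError(`${n} does not fit in a 31-bit Smi`);
    }
    return n << 1; // low bit ends up 0
}

function untagSmi(word) {
    return word >> 1; // arithmetic shift preserves the sign
}

function tagPointer(address) {
    if (address % 8 !== 0) throw new RangeError("address must be 8-byte aligned");
    return address | 1; // low bit set marks a heap pointer
}

function isSmi(word) {
    return (word & 1) === 0;
}

// With a zero tag, (x << 1) + (y << 1) === (x + y) << 1, so tagged values
// can be added directly. null signals "overflow, box the result instead".
function addTagged(a, b) {
    const sum = a + b;
    return sum < -(2 ** 31) || sum > 2 ** 31 - 1 ? null : sum;
}

console.log(tagSmi(42));                                  // 84
console.log(isSmi(tagSmi(42)), isSmi(tagPointer(0x1000))); // true false
console.log(untagSmi(addTagged(tagSmi(40), tagSmi(2))));  // 42
console.log(addTagged(tagSmi(SMI_MAX), tagSmi(1)));       // null
```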

NaN Boxing (SpiderMonkey, JavaScriptCore)

NaN boxing exploits the fact that IEEE 754 doubles have many NaN representations. In IEEE 754:

  • Exponent bits all 1 (0x7FF) and mantissa non-zero → NaN.

  • There are roughly $2^{53}$ NaN bit patterns ($2^{52}-1$ non-zero mantissas for each sign), but JavaScript only needs one canonical NaN.

Idea: Use the remaining NaN bit patterns to encode other types.

JavaScriptCore’s encoding (simplified, 64-bit):

NaN patterns (exponent = 0x7FF):

  • 0x7FF8_0000_0000_0000: Canonical NaN

  • 0x7FF0_0000_0000_0001 to 0x7FF7_FFFF_FFFF_FFFF: Unused NaN patterns

Encoding:

  • Doubles: Normal IEEE 754 (if not in the reserved NaN range).

  • Integers: 32-bit integers typically get their own tag, with the value in the low 32 bits; other integers are stored as doubles (if representable exactly).

  • Pointers: Use reserved NaN patterns.

  • Special values (undefined, null, true, false): Use reserved NaN patterns.

Example encoding (illustrative only; the actual tag values differ between engines and versions):

undefined: 0xFFFF_FFFF_FFFF_FFF2
null:      0xFFFF_FFFF_FFFF_FFF3
true:      0xFFFF_FFFF_FFFF_FFF4
false:     0xFFFF_FFFF_FFFF_FFF5
Pointer:   0xFFFF_xxxx_xxxx_xxxx (where xxxx is the pointer value)
Double:    Normal IEEE 754 (0x0000_xxxx_xxxx_xxxx to 0x7FEF_xxxx_xxxx_xxxx and negatives)

Advantages:

  • Uniform 64-bit representation: Every value fits in 64 bits.

  • No extra type tag: The value itself encodes the type.

Disadvantages:

  • Pointer range limitation: On 64-bit systems, only 48 bits of address space are typically used (x86-64 canonical addresses), so pointers fit. But if the address space expands, this breaks.

  • Complexity: Bit manipulation is required for every value access.
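
The bit-level idea can be explored from JavaScript itself by viewing the same eight bytes as a double and as a 64-bit integer. In the sketch below the tag constant TAG_INT32 and all helper names are hypothetical; they do not match the layout of SpiderMonkey or JavaScriptCore. Boxed words are kept as BigInt bit patterns because reading a payload-carrying NaN back as a double may canonicalize it.

```javascript
// One 8-byte buffer, viewed both as a double and as an unsigned 64-bit integer.
const buffer = new ArrayBuffer(8);
const asDouble = new Float64Array(buffer);
const asBits = new BigUint64Array(buffer);

function bitsOf(x) {
    asDouble[0] = x;
    return asBits[0];
}

function fromBits(bits) {
    asBits[0] = bits;
    return asDouble[0];
}

// Hypothetical tag: a negative NaN pattern whose top 16 bits are 0xFFF9.
const TAG_MASK = 0xFFFF_0000_0000_0000n;
const TAG_INT32 = 0xFFF9_0000_0000_0000n;

function boxInt32(i) {
    return TAG_INT32 | BigInt(i >>> 0); // payload in the low 32 bits
}

function isBoxedInt32(bits) {
    return (bits & TAG_MASK) === TAG_INT32;
}

function unboxInt32(bits) {
    return Number(bits & 0xFFFF_FFFFn) | 0; // back to a signed 32-bit value
}

console.log(bitsOf(1.5).toString(16));                      // "3ff8000000000000"
console.log(Number.isNaN(fromBits(0x7FF8_0000_0000_0001n))); // true: NaN with a payload
console.log(unboxInt32(boxInt32(-5)));                      // -5
console.log(isBoxedInt32(bitsOf(3.14)));                    // false: an ordinary double
```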

Which Technique Does Each Engine Use?

  • V8 (Chrome, Node.js): Pointer tagging (Smi + heap objects).

  • SpiderMonkey (Firefox): NaN boxing (called “nunboxing” on 32-bit platforms and “punboxing” on 64-bit platforms).

  • JavaScriptCore (Safari): NaN boxing.

  • ChakraCore (old Edge): Pointer tagging (deprecated; Edge now uses V8).

Systems takeaway: When compiling to JavaScript, you generally don’t need to care about these details—engines handle it. But understanding them helps explain why:

  • Small integers are fast (Smi in V8).

  • Large integers or non-integer numbers are slower (heap-allocated).

  • Operations that cause boxing/unboxing (e.g., Smi overflow) have performance cliffs.

The Type() Abstract Operation: How the Spec Classifies Values

Specification: §6.1 (p. 25)

The Type(x) abstract operation returns the type tag of a value. It’s not exposed to JavaScript code, but it’s fundamental to understanding the spec.

Definition:

Type ( x )

Returns one of: Undefined, Null, Boolean, String, Symbol, Number, BigInt, Object.

Examples:

  • Type(undefined) → Undefined

  • Type(null) → Null

  • Type(true) → Boolean

  • Type("hello") → String

  • Type(Symbol()) → Symbol

  • Type(42) → Number

  • Type(42n) → BigInt

  • Type({}) → Object

  • Type([]) → Object

  • Type(function() {}) → Object

Note: Arrays and functions are Objects. There’s no separate Array or Function type at this level. To distinguish them, you check internal slots like [[Call]] (for functions).
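
Although Type() is not exposed, it can be approximated from user code. The function below, specType, is a hypothetical helper written for illustration; it ignores host oddities such as document.all.

```javascript
// Approximates the spec's Type(x) using typeof plus a null check.
function specType(x) {
    if (x === null) return "Null";
    switch (typeof x) {
        case "undefined": return "Undefined";
        case "boolean":   return "Boolean";
        case "string":    return "String";
        case "symbol":    return "Symbol";
        case "number":    return "Number";
        case "bigint":    return "BigInt";
        default:          return "Object"; // "object" and "function"
    }
}

console.log(specType(null));           // "Null"
console.log(specType([]));             // "Object"
console.log(specType(function () {})); // "Object"
console.log(specType(42n));            // "BigInt"
```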

The typeof Operator: JavaScript’s Unreliable Type Query

Specification: §13.5.3 (p. 260)

The typeof operator is JavaScript’s user-facing type query. It mostly matches Type(), but with quirks:

console.log(typeof undefined);       // "undefined"
console.log(typeof null);            // "object" (BUG!)
console.log(typeof true);            // "boolean"
console.log(typeof "hello");         // "string"
console.log(typeof Symbol());        // "symbol"
console.log(typeof 42);              // "number"
console.log(typeof 42n);             // "bigint"
console.log(typeof {});              // "object"
console.log(typeof []);              // "object"
console.log(typeof function() {});   // "function"
console.log(typeof Math.sqrt);       // "function"

Discrepancies:

  1. typeof null === "object": Bug from JavaScript’s original implementation (§2.1.2).

  2. typeof function() {} === "function": Functions are Objects internally, but typeof distinguishes them.

Why “function” is a special case: The spec explicitly checks for [[Call]] (§13.5.3, p. 260):

typeof operator

  1. If val has a [[Call]] internal method, return “function”.

  2. Return “object”.

Reliable Type Checking

To reliably check types:

For primitives:

value === undefined                 // Check for undefined
value === null                      // Check for null
typeof value === "boolean"          // Boolean
typeof value === "string"           // String
typeof value === "symbol"           // Symbol
typeof value === "number"           // Number
typeof value === "bigint"           // BigInt

For objects, use:

Array.isArray(value)                      // Array
typeof value === "function"               // Function
value instanceof Date                     // Date
value instanceof RegExp                   // RegExp
value instanceof Promise                  // Promise
Object.prototype.toString.call(value)     // Generic (returns "[object Type]")

Example:

Object.prototype.toString.call([]);        // "[object Array]"
Object.prototype.toString.call(new Date()); // "[object Date]"
Object.prototype.toString.call(/regex/);    // "[object RegExp]"

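Why Object.prototype.toString? Historically it read the [[Class]] internal property. ES2015 removed [[Class]]; the method now derives a built-in tag from the object’s internal slots (Array, Date, RegExp, and so on) and lets a Symbol.toStringTag property override it. Objects can customize this: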

const obj = {
    [Symbol.toStringTag]: "MyCustomType"
};
console.log(Object.prototype.toString.call(obj)); // "[object MyCustomType]"

Systems note: If you’re implementing type guards in a transpiler, you’ll need to emit code that handles these quirks. For example, checking for null requires value === null, not typeof value === "object".

Type Coercion: The Algorithmic Nightmare

Why Coercion Exists

JavaScript was designed for non-programmers. Brendan Eich wanted it to be forgiving—if you write "5" + 2, it shouldn’t error; it should “do something reasonable.”

The problem: “reasonable” is subjective. JavaScript’s coercion rules are Byzantine, full of edge cases, and the source of endless bugs.

Two kinds of coercion:

  1. Explicit coercion: You manually convert types (Number("5"), String(42)).

  2. Implicit coercion: JavaScript converts types automatically ("5" + 2, if (value)).

Implicit coercion is where the madness lies.

The Big Three: ToString, ToNumber, ToBoolean

The spec defines three core coercion operations:

ToString (§7.1.17, pp. 71-72)

Converts a value to a string.

Algorithm (simplified):

ToString ( argument )

  1. If argument is a String, return argument.

  2. If argument is a Symbol, throw TypeError.

  3. If argument is undefined, return “undefined”.

  4. If argument is null, return “null”.

  5. If argument is true, return “true”.

  6. If argument is false, return “false”.

  7. If argument is a Number, return NumberToString(argument).

  8. If argument is a BigInt, return BigIntToString(argument).

  9. If argument is an Object, return ? ToString(? ToPrimitive(argument, STRING)).

Examples:

String(undefined);      // "undefined"
String(null);           // "null"
String(true);           // "true"
String(42);             // "42"
String(42n);            // "42"
String({});             // "[object Object]" (calls ToPrimitive, then toString)
String([1, 2, 3]);      // "1,2,3"
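String(Symbol("x"));    // "Symbol(x)" (the String() function special-cases Symbols)
`${Symbol("x")}`;       // TypeError! (the ToString abstract operation throws)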

Key insight: Objects are converted via ToPrimitive (§7.1.1, p. 63) with hint "string", which tries toString() first, then valueOf().

ToNumber (§7.1.4, pp. 65-67)

Converts a value to a number.

Algorithm (simplified):

ToNumber ( argument )

  1. If argument is a Number, return argument.

  2. If argument is a Symbol or BigInt, throw TypeError.

  3. If argument is undefined, return NaN.

  4. If argument is null, return +0.

  5. If argument is true, return 1.

  6. If argument is false, return +0.

  7. If argument is a String, parse it as a number (details in §7.1.4.1).

  8. If argument is an Object, return ? ToNumber(? ToPrimitive(argument, NUMBER)).

Examples:

Number(undefined);      // NaN
Number(null);           // 0 (WAT?!)
Number(true);           // 1
Number(false);          // 0
Number("42");           // 42
Number("42.5");         // 42.5
Number("  42  ");       // 42 (trims whitespace)
Number("");             // 0 (empty string is 0)
Number("hello");        // NaN
Number([]);             // 0 ([] -> "" -> 0)
Number([5]);            // 5 ([5] -> "5" -> 5)
Number([1, 2]);         // NaN ([1,2] -> "1,2" -> NaN)
Number({});             // NaN ({} -> "[object Object]" -> NaN)
Number(Symbol("x"));    // TypeError!
Number(42n);            // 42 (the Number() function special-cases BigInt)
+42n;                   // TypeError! (ToNumber rejects BigInt)

Why Number(null) === 0? Historical accident. In JavaScript’s original implementation, null was treated as “nothing,” which coerced to 0. It’s a terrible decision but unchangeable due to backward compatibility.

ToBoolean (§7.1.2, p. 64)

Converts a value to a boolean (covered in §2.1.3).

Falsy values: false, 0, -0, 0n, "", null, undefined, NaN.

Everything else: Truthy.

ToPrimitive: The Object-to-Primitive Dance

Specification: §7.1.1 (pp. 63-64)

When JavaScript needs to coerce an object to a primitive (for arithmetic, string concatenation, etc.), it calls ToPrimitive:

ToPrimitive ( input [ , preferredType ] )

  1. If input is not an Object, return input.

  2. If preferredType is not present, let hint be “default”.

  3. Else if preferredType is STRING, let hint be “string”.

  4. Else, let hint be “number”.

  5. Let exoticToPrim be ? GetMethod(input, @@toPrimitive).

  6. If exoticToPrim is not undefined, then

    1. Let result be ? Call(exoticToPrim, input, « hint »).
    2. If result is not an Object, return result.
    3. Throw TypeError.
  7. If hint is “default”, set hint to “number”.

  8. Return ? OrdinaryToPrimitive(input, hint).

OrdinaryToPrimitive (§7.1.1.1, p. 64):

OrdinaryToPrimitive ( O, hint )

  1. If hint is “string”, then
    1. Let methodNames be « “toString”, “valueOf” ».
  2. Else,
    1. Let methodNames be « “valueOf”, “toString” ».
  3. For each name in methodNames, do
    1. Let method be ? Get(O, name).
    2. If IsCallable(method), then
      1. Let result be ? Call(method, O).
      2. If result is not an Object, return result.
  4. Throw TypeError.

Translation:

  • If hint is "string", try toString() first, then valueOf().

  • If hint is "number" or "default", try valueOf() first, then toString().

  • If both return objects, throw TypeError.

Examples:

const obj = {
    valueOf() { return 42; },
    toString() { return "hello"; }
};

Number(obj);  // 42 (hint is "number", tries valueOf first)
String(obj);  // "hello" (hint is "string", tries toString first)
obj + "";     // "42" (hint is "default", which becomes "number")

Custom Symbol.toPrimitive:

const obj = {
    [Symbol.toPrimitive](hint) {
        if (hint === "number") return 42;
        if (hint === "string") return "hello";
        return "default";
    }
};

Number(obj);  // 42
String(obj);  // "hello"
obj + "";     // "default"

Systems note: If you’re transpiling a language with operator overloading to JavaScript, you might emit code that defines Symbol.toPrimitive.
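
As a rough illustration of that idea, the sketch below defines a hypothetical Meters class whose Symbol.toPrimitive method answers differently depending on the hint. It is a sketch of the mechanism, not a full operator-overloading scheme: the operators still return primitives, never a new Meters.

```javascript
// Hypothetical value type; the class name and unit suffix are illustrative.
class Meters {
    constructor(value) {
        this.value = value;
    }

    [Symbol.toPrimitive](hint) {
        // "string" for template literals and String(); "number" and "default" otherwise.
        return hint === "string" ? `${this.value} m` : this.value;
    }
}

const a = new Meters(3);
const b = new Meters(4);

console.log(a + b);  // 7 (hint "default"); a plain number, not a Meters
console.log(`${a}`); // "3 m" (hint "string")
console.log(a < b);  // true (hint "number")
console.log(a * 2);  // 6
```

Because the result type is lost, a transpiler that needs `a + b` to produce another Meters would still have to emit an explicit call such as a hypothetical `add(a, b)`.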

Abstract Equality (==) vs. Strict Equality (===)

Specification: §7.2.15 (pp. 74-75) for ==, §7.2.16 (p. 75) for ===.

Strict Equality (===)

No coercion. Types must match:

x === y

  1. If Type(x) is different from Type(y), return false.

  2. If Type(x) is Number or BigInt, compare numerically (handle NaN, ±0).

  3. Return SameValueNonNumeric(x, y) (reference comparison for Objects, value comparison for primitives).

Examples:

5 === 5;            // true
5 === "5";          // false (different types)
NaN === NaN;        // false (NaN is unordered)
+0 === -0;          // true
[] === [];          // false (different object references)

Abstract Equality (==)

Coercion madness. The algorithm is 12 steps (§7.2.15, pp. 74-75). Simplified version:

x == y

  1. If Type(x) is the same as Type(y), return x === y.

  2. If x is null and y is undefined (or vice versa), return true.

  3. If x is Number and y is String, return x == ToNumber(y).

  4. If x is String and y is Number, return ToNumber(x) == y.

  5. If x is BigInt and y is String, convert String to BigInt and compare.

  6. If x is Boolean, return ToNumber(x) == y.

  7. If y is Boolean, return x == ToNumber(y).

  8. If x is Object and y is primitive, return ToPrimitive(x) == y.

  9. If y is Object and x is primitive, return x == ToPrimitive(y).

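  10. If x is BigInt and y is Number (or vice versa), compare their mathematical values; NaN and ±Infinity never equal a BigInt.

  11. Otherwise, return false.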

Examples:

null == undefined;  // true (special case)
5 == "5";           // true (coerces "5" to 5)
0 == false;         // true (false -> 0)
"" == false;        // true ("" -> 0, false -> 0)
[] == false;        // true ([] -> "" -> 0, false -> 0)
[] == ![];          // true (Chapter 1 example: [] -> 0, ![] -> false -> 0)
"0" == false;       // true ("0" -> 0, false -> 0)
"0" == 0;           // true ("0" -> 0)
0 == "0";           // true (symmetric)
false == "false";   // false ("false" -> NaN, false -> 0)

Why [] == ![] is true (§1.6 in Chapter 1):

  1. ![] → false (objects are truthy).

  2. [] == false.

  3. false → 0 (step 7).

  4. [] == 0.

  5. [] → ToPrimitive([]) → "" (step 8).

  6. "" == 0.

  7. "" → 0 (step 4).

  8. 0 == 0 → true.

Engineering lesson: Never use == unless you explicitly need coercion (rare). Always use ===.
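
The simplified steps above can be turned into a small reference implementation and checked against the real operator. The sketch below is an approximation written for illustration: typeTag, toPrimitive, and looseEquals are hypothetical names, and the code also covers the BigInt-versus-Number case that the full spec algorithm handles.

```javascript
function typeTag(v) {
    if (v === null) return "null";
    const t = typeof v;
    return t === "function" ? "object" : t;
}

// ToPrimitive with hint "default" (which behaves like "number" for ordinary objects).
function toPrimitive(obj) {
    const exotic = obj[Symbol.toPrimitive];
    if (exotic !== undefined && exotic !== null) {
        const result = exotic.call(obj, "default");
        if (typeTag(result) !== "object") return result;
        throw new TypeError("Cannot convert object to primitive value");
    }
    for (const name of ["valueOf", "toString"]) {
        const method = obj[name];
        if (typeof method === "function") {
            const result = method.call(obj);
            if (typeTag(result) !== "object") return result;
        }
    }
    throw new TypeError("Cannot convert object to primitive value");
}

function looseEquals(x, y) {
    const tx = typeTag(x);
    const ty = typeTag(y);
    if (tx === ty) return x === y;
    const nullish = (t) => t === "null" || t === "undefined";
    if (nullish(tx) && nullish(ty)) return true;
    if (nullish(tx) || nullish(ty)) return false;
    if (tx === "number" && ty === "string") return x === Number(y);
    if (tx === "string" && ty === "number") return Number(x) === y;
    if (tx === "bigint" && ty === "string") {
        // Strings that do not parse as integers compare unequal.
        try { return x === BigInt(y); } catch { return false; }
    }
    if (tx === "string" && ty === "bigint") return looseEquals(y, x);
    if (tx === "boolean") return looseEquals(Number(x), y);
    if (ty === "boolean") return looseEquals(x, Number(y));
    if (ty === "object") return looseEquals(x, toPrimitive(y));
    if (tx === "object") return looseEquals(toPrimitive(x), y);
    if ((tx === "bigint" && ty === "number") || (tx === "number" && ty === "bigint")) {
        const [big, num] = tx === "bigint" ? [x, y] : [y, x];
        return Number.isInteger(num) && BigInt(num) === big;
    }
    return false; // e.g. Symbol compared with a different primitive type
}

console.log(looseEquals([], ![]));         // true
console.log(looseEquals("0", false));      // true
console.log(looseEquals(null, 0));         // false
```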

Relational Comparisons (<, >, <=, >=)

Specification: §7.2.14 (pp. 73-74)

Relational operators coerce to primitives (with hint "number"), then compare:

x < y

  1. Let px be ? ToPrimitive(x, NUMBER).

  2. Let py be ? ToPrimitive(y, NUMBER).

  3. If px and py are Strings, compare lexicographically.

  4. Else, convert both to Number and compare numerically.

Examples:

5 < 10;             // true
"5" < 10;           // true ("5" -> 5)
"10" < "5";         // true (lexicographic: "1" < "5")
"10" < 5;           // false ("10" -> 10, 10 < 5 is false)
[1] < [2];          // true ([1] -> "1", [2] -> "2"; both strings, compared lexicographically)
({}) < ({});        // false (both -> "[object Object]"; equal strings, so < is false)

Gotcha: String comparison is lexicographic, not numeric:

"10" < "2";         // true ("1" < "2")
10 < 2;             // false

Systems note: If you’re generating comparison code, ensure operands are the same type to avoid coercion surprises.
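
One defensive pattern for generated code is to route comparisons through a helper that refuses mixed operand types. The helper below, lessThan, is a hypothetical sketch; a real code generator would more likely prove the types statically and emit a bare `<`.

```javascript
// Hypothetical guard: only compares two numbers, two strings, or two BigInts.
function lessThan(a, b) {
    const ta = typeof a;
    const tb = typeof b;
    const comparable = ta === "number" || ta === "string" || ta === "bigint";
    if (ta !== tb || !comparable) {
        throw new TypeError(`cannot compare ${ta} with ${tb}`);
    }
    return a < b;
}

console.log(lessThan(2, 10));     // true (numeric)
console.log(lessThan("10", "2")); // true (lexicographic, but at least both are strings)

try {
    lessThan("10", 2);
} catch (err) {
    console.log(err.message);     // "cannot compare string with number"
}
```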

Autoboxing: Primitives with Methods

The Illusion of Primitive Methods

Primitives (String, Number, Boolean, Symbol, BigInt) are not objects. Yet you can call methods on them:

"hello".toUpperCase();  // "HELLO"
(42).toFixed(2);        // "42.00"
true.toString();        // "true"

How? Autoboxing (also called wrapping). When you access a property on a primitive, JavaScript temporarily converts it to an object:

Algorithm (§7.1.19, p. 72):

ToObject ( argument )

  1. If argument is undefined or null, throw TypeError.

  2. If argument is a Boolean, return a Boolean object wrapping argument.

  3. If argument is a Number, return a Number object wrapping argument.

  4. If argument is a String, return a String object wrapping argument.

  5. If argument is a Symbol, return a Symbol object wrapping argument.

  6. If argument is a BigInt, return a BigInt object wrapping argument.

  7. If argument is an Object, return argument.

Example:

const str = "hello";
str.toUpperCase();  // Internally: ToObject(str).toUpperCase()

The wrapper object is created, the method is called, and the wrapper is discarded.

Wrapper constructors:

const strObj = new String("hello");
console.log(typeof strObj);       // "object"
console.log(strObj.valueOf());    // "hello" (unwrap)
console.log(strObj === "hello");  // false (object vs. primitive)

Never use wrapper constructors explicitly (with new). They’re confusing and unnecessary. If you want to convert types, call without new:

String(42);   // "42" (converts to primitive string)
Number("42"); // 42 (converts to primitive number)

Why Autoboxing Matters

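Performance: The wrapper is largely a specification device. Engines normally resolve built-in method calls on primitives without allocating a wrapper object, so the cost in optimized code is usually negligible. Explicit wrappers created with new, by contrast, are real heap objects.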

Mutation doesn’t work:

const str = "hello";
str.foo = "bar";      // Silently ignored in sloppy mode; TypeError in strict mode
console.log(str.foo); // undefined (the wrapper is discarded)

Each access creates a new wrapper, so assignments are lost.

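Systems note: If you’re compiling to JavaScript and want to attach metadata to primitives, you can’t. Use a Map keyed by the primitive instead, keeping in mind that equal primitives share one entry. A WeakMap will not work here: its keys must be objects (or, since ES2023, non-registered Symbols).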
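
A minimal sketch of the side-table approach follows. The variable names and the metadata shape are made up for illustration.

```javascript
// Side table keyed by primitive value.
const metadata = new Map();
metadata.set("hello", { origin: "line 12" });

// Primitives have no identity: any equal string reaches the same entry.
const sameText = "hel" + "lo";
console.log(metadata.get(sameText)); // { origin: "line 12" }

// WeakMap rejects primitive keys such as strings.
let weakMapError = null;
try {
    new WeakMap().set("hello", { origin: "line 12" });
} catch (err) {
    weakMapError = err;
}
console.log(weakMapError instanceof TypeError); // true
```

If distinct occurrences of the same value need distinct metadata, the compiled program has to wrap the primitive in an object of its own.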

Value vs. Reference Semantics: The Great Divide

Primitives Are Passed by Value

When you assign a primitive to a variable or pass it to a function, the value is copied:

let a = 5;
let b = a;
b = 10;
console.log(a); // 5 (unchanged)
function modify(x) {
    x = 100;
}
let num = 50;
modify(num);
console.log(num); // 50 (unchanged)

Systems analogy: Primitives behave like C’s int, double, etc.—pass-by-value.

Objects Are Passed by Reference

When you assign an object to a variable or pass it to a function, the reference is copied, not the object itself:

let obj1 = { x: 5 };
let obj2 = obj1;
obj2.x = 10;
console.log(obj1.x); // 10 (mutated!)
function modify(obj) {
    obj.x = 100;
}
let myObj = { x: 50 };
modify(myObj);
console.log(myObj.x); // 100 (mutated!)

But reassigning the variable doesn’t affect the original:

function modify(obj) {
    obj = { x: 200 };
}
let myObj = { x: 50 };
modify(myObj);
console.log(myObj.x); // 50 (unchanged, because we reassigned the local variable)

Systems analogy: Objects behave like C pointers—you’re passing the address, not the data. Mutating through the pointer affects the original. Reassigning the pointer doesn’t.

Implications for Transpiler Design

If you’re compiling a language with value semantics (like Rust or C++ structs) to JavaScript:

  • Option 1: Represent structs as objects, but deep clone on assignment (expensive).

  • Option 2: Use immutable patterns (return new objects instead of mutating).

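  • Option 3: Use Object.freeze() to prevent mutation. The freeze is shallow (nested objects stay mutable), and writes to a frozen object throw a TypeError only in strict mode; in sloppy mode they fail silently.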

Example (deep clone):

function cloneStruct(obj) {
    return JSON.parse(JSON.stringify(obj)); // Naive, doesn't handle functions/symbols
}

let struct1 = { x: 5, y: 10 };
let struct2 = cloneStruct(struct1);
struct2.x = 20;
console.log(struct1.x); // 5 (unchanged)

Better cloning (structuredClone, a Web platform API rather than part of ECMAScript; available in modern browsers and Node.js 17+):

let struct2 = structuredClone(struct1);
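
Options 2 and 3 combine naturally: freeze every struct and produce a new frozen object for each update. The sketch below assumes flat structs (Object.freeze is shallow); withField and tryMutate are hypothetical helper names.

```javascript
// Returns a new frozen struct with one field replaced; the input is untouched.
function withField(struct, key, value) {
    return Object.freeze({ ...struct, [key]: value });
}

// Strict-mode helper so that a write to a frozen object throws.
function tryMutate(struct) {
    "use strict";
    try {
        struct.x = 99;
        return "mutated";
    } catch (err) {
        return err instanceof TypeError ? "TypeError" : "other error";
    }
}

const p1 = Object.freeze({ x: 5, y: 10 });
const p2 = withField(p1, "x", 20);

console.log(p1.x, p2.x);    // 5 20
console.log(p2.y);          // 10
console.log(tryMutate(p1)); // "TypeError"
```

Copying on update costs an allocation per write, but it avoids the deep clone on every assignment that option 1 requires.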

Shallow vs. Deep Comparison

=== compares references for objects:

console.log({} === {});       // false (different references)
console.log([] === []);       // false
let obj = {};
console.log(obj === obj);     // true (same reference)

Deep comparison requires a custom function:

function deepEqual(a, b) {
    if (a === b) return true;
    if (typeof a !== "object" || typeof b !== "object") return false;
    if (a === null || b === null) return false;
    if (Array.isArray(a) !== Array.isArray(b)) return false; // [] must not equal {}
    
    const keysA = Object.keys(a);
    const keysB = Object.keys(b);
    if (keysA.length !== keysB.length) return false;
    
    for (const key of keysA) {
        if (!keysB.includes(key)) return false;
        if (!deepEqual(a[key], b[key])) return false;
    }
    return true;
}

console.log(deepEqual({ x: 1 }, { x: 1 })); // true
console.log(deepEqual([1, 2], [1, 2]));     // true

Systems note: If you’re emitting comparison code, primitives can use ===, but objects require deep comparison if you want value semantics.

Memory Management: Garbage Collection and WeakMaps

JavaScript Has Automatic Garbage Collection

Unlike C, you don’t manually allocate/free memory. JavaScript engines use tracing garbage collectors (mark-and-sweep or generational GC).

Key insight: You can’t deallocate memory explicitly. Objects are collected when unreachable.

Example:

let obj = { data: "large array" };
obj = null; // Now the object is unreachable and can be GC'd

Generational GC (V8)

V8 uses a generational garbage collector with two generations:

  1. Young generation (nursery): New objects. GC’d frequently (minor GC, fast).

  2. Old generation (tenured): Objects that survive multiple young-gen GCs. GC’d infrequently (major GC, slow).

Promotion: Objects move from young to old after surviving several collections.

Stop-the-world pauses: GC pauses JavaScript execution. V8 uses incremental marking and concurrent sweeping to minimize pauses.

Memory Leaks in JavaScript

You can’t have use-after-free or double-free bugs (GC prevents that), but you can have memory leaks—objects that are reachable but no longer needed.

Common causes:

  1. Global variables: Stay reachable for the lifetime of the program, so anything they reference is never collected.

  2. Closures: Capture variables in scope, preventing collection.

  3. Event listeners: Not removed when DOM elements are removed.

  4. Caches without eviction: Map or object holding references indefinitely.
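
For the cache case, a bound on the number of entries is often enough. The sketch below is one possible least-recently-used cache built on the insertion order that Map guarantees; the class name LruCache and its API are illustrative, not a standard library feature.

```javascript
class LruCache {
    constructor(capacity) {
        this.capacity = capacity;
        this.entries = new Map(); // iteration order = insertion order
    }

    get(key) {
        if (!this.entries.has(key)) return undefined;
        const value = this.entries.get(key);
        // Re-insert so the key becomes the most recently used.
        this.entries.delete(key);
        this.entries.set(key, value);
        return value;
    }

    set(key, value) {
        this.entries.delete(key);
        this.entries.set(key, value);
        if (this.entries.size > this.capacity) {
            // The first key in iteration order is the least recently used.
            const oldest = this.entries.keys().next().value;
            this.entries.delete(oldest);
        }
    }
}

const cache = new LruCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // touch "a", so "b" is now the oldest
cache.set("c", 3); // evicts "b"

console.log([...cache.entries.keys()]); // ["a", "c"]
```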

Example (closure leak):

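function createLeakyFunction() {
    const largeData = new Array(1000000).fill("data");
    return function() {
        // Referencing largeData forces the closure to keep it alive.
        console.log(`I capture ${largeData.length} entries in my closure!`);
    };
}

const fn = createLeakyFunction(); // largeData is retained for as long as fn is reachable

Modern engines capture only the variables that some inner function actually references, so a variable no closure touches can be collected. The catch is that closures created in the same scope typically share one context object: if any of them references largeData, all of them keep it alive.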

WeakMap and WeakSet: Weak References

Specification: §25.3 (WeakMap, pp. 651-656), §25.4 (WeakSet, pp. 657-660)

WeakMap and WeakSet hold weak references to keys. If the only reference to a key is from a WeakMap/WeakSet, the key can be garbage collected.

Use case: Associating metadata with objects without preventing their collection.

Example:

const metadata = new WeakMap();

let obj = { id: 1 };
metadata.set(obj, { timestamp: Date.now() });

console.log(metadata.get(obj)); // { timestamp: ... }

obj = null; // Now obj can be GC'd, and the WeakMap entry is removed

Contrast with Map:

const metadata = new Map();

let obj = { id: 1 };
metadata.set(obj, { timestamp: Date.now() });

obj = null; // obj is still reachable via the Map, so it's NOT GC'd

WeakRef (ES2021): For even finer control, WeakRef creates a weak reference to an object. You can check if it’s still alive:

let obj = { data: "important" };
const ref = new WeakRef(obj);

console.log(ref.deref()); // { data: "important" }

obj = null; // Object may be GC'd

setTimeout(() => {
    console.log(ref.deref()); // undefined (if GC'd)
}, 1000);

FinalizationRegistry (ES2021): Runs a callback when an object is collected:

const registry = new FinalizationRegistry((heldValue) => {
    console.log(`Object with value ${heldValue} was collected`);
});

let obj = { id: 1 };
registry.register(obj, "myObject");

obj = null; // Callback runs eventually after GC

Systems note: If you’re implementing a transpiler with manual memory management (e.g., compiling C to JavaScript), you can use WeakMap to associate metadata (like allocation info) with objects without preventing collection.

Specification Types: Internal Constructs You’ll Never Touch Directly

What Are Specification Types?

In addition to the eight language types (Undefined, Null, Boolean, String, Symbol, Number, BigInt, Object), the spec defines specification types: internal constructs used in algorithmic descriptions. These are not accessible in JavaScript code.

Specification types (§6.2, pp. 43-61):

  1. Completion Record: Represents the result of an operation (normal, return, throw, break, continue).

  2. Reference: Represents a binding (variable, property).

  3. Property Descriptor: Describes a property’s attributes.

  4. Environment Record: Stores variable bindings (lexical scope).

  5. Data Block: Raw binary data (for TypedArrays, SharedArrayBuffer).

Completion Record: How Control Flow Works Internally

Specification: §6.2.3 (pp. 44-45)

Every abstract operation returns a Completion Record:

CompletionRecord {
    [[Type]]: normal | return | throw | break | continue
    [[Value]]: any value
    [[Target]]: label (for break/continue)
}

Normal completion: Operation succeeded, [[Value]] is the result.

Abrupt completion: Operation failed or control flow changed (return, throw, break, continue).

The ? operator (§5.2.3.3, p. 15):

Let result be ? SomeOperation().

Equivalent to:

Let result be SomeOperation().
If result is an abrupt completion, return result.
Otherwise, let result be result.[[Value]].

The ! operator (§5.2.3.4, p. 15):

Let result be ! SomeOperation().

Equivalent to:

Let result be SomeOperation().
Assert: result is a normal completion.
Let result be result.[[Value]].

Systems analogy: Completion Records are like error codes in C (-1 for error, 0 for success) or Rust’s Result<T, E>. The ? operator is like Rust’s ? operator (early return on error).
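
To make the analogy concrete, the sketch below models Completion Records as plain objects and writes out the “?” shorthand by hand. Everything here (normalCompletion, throwCompletion, toNumberOp, addOp) is a hypothetical teaching model, not how any engine implements the spec.

```javascript
const normalCompletion = (value) => ({ type: "normal", value });
const throwCompletion = (value) => ({ type: "throw", value });

// A toy abstract operation that can complete abruptly.
function toNumberOp(argument) {
    if (typeof argument === "symbol" || typeof argument === "bigint") {
        return throwCompletion(new TypeError("cannot convert to number"));
    }
    return normalCompletion(Number(argument));
}

// Models: "Let a be ? ToNumber(x). Let b be ? ToNumber(y). Return a + b."
function addOp(x, y) {
    const a = toNumberOp(x);
    if (a.type !== "normal") return a; // the "?" early return
    const b = toNumberOp(y);
    if (b.type !== "normal") return b; // the "?" early return
    return normalCompletion(a.value + b.value);
}

console.log(addOp(1, "2"));             // { type: "normal", value: 3 }
console.log(addOp(Symbol("s"), 1).type); // "throw"
```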

Reference: How Variable Lookup Works

Specification: §6.2.4 (pp. 45-46)

A Reference represents a binding to a variable or property:

Reference {
    [[Base]]: environment record or object
    [[ReferencedName]]: property key or variable name
    [[Strict]]: boolean (strict mode?)
}

Example: In obj.prop, the reference is:

Reference {
    [[Base]]: obj
    [[ReferencedName]]: “prop”
}

GetValue (§6.2.4.5, p. 46) dereferences the reference:

GetValue ( V )

  1. If V is not a Reference, return V.

  2. Let base be V.[[Base]].

  3. If base is an Environment Record, return base.GetBindingValue(V.[[ReferencedName]]).

  4. Else (base is an Object),

    1. Return ? base.[[Get]](V.[[ReferencedName]], GetThisValue(V)).

Systems note: This is how a.b is evaluated. Evaluating the member expression yields a Reference, and GetValue dereferences it when the value is needed.
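
References are invisible, but their effect on `this` is not: a call made through a Reference passes the base object as `this`, while a call made on an already dereferenced value does not. The sketch below uses a hypothetical Counter class (class bodies are strict, so a missing `this` shows up as undefined).

```javascript
class Counter {
    constructor() {
        this.n = 1;
    }

    read() {
        return this === undefined ? "no this" : this.n;
    }
}

const counter = new Counter();

// Call through a Reference: the base (counter) becomes this.
console.log(counter.read());      // 1

// The assignment dereferenced the Reference; only the function value remains.
const detached = counter.read;
console.log(detached());          // "no this"

// The comma operator also dereferences its operands.
console.log((0, counter.read)()); // "no this"

// Plain parentheses do not: the Reference survives.
console.log((counter.read)());    // 1
```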

Property Descriptor: How Properties Are Stored

Specification: §6.2.5 (pp. 46-47)

A Property Descriptor describes a property’s attributes:

PropertyDescriptor {
    [[Value]]: any (for data properties)
    [[Writable]]: boolean (for data properties)
    [[Get]]: function (for accessor properties)
    [[Set]]: function (for accessor properties)
    [[Enumerable]]: boolean
    [[Configurable]]: boolean
}

Data property:

const obj = {};
Object.defineProperty(obj, "x", {
    value: 42,
    writable: true,
    enumerable: true,
    configurable: true
});

Accessor property:

const obj = {};
Object.defineProperty(obj, "x", {
    get() { return this._x; },
    set(val) { this._x = val; },
    enumerable: true,
    configurable: true
});

Systems note: If you’re implementing object property access in a transpiler, you need to handle both data and accessor properties correctly.

Environment Record: Lexical Scope

Specification: §9.1 (pp. 137-145)

An Environment Record stores variable bindings for a scope:

EnvironmentRecord {
    [[OuterEnv]]: parent environment (for scope chain)
    bindings: map of variable names to values
}

Three kinds:

  1. Declarative Environment Record: let, const, function parameters.

  2. Object Environment Record: with statement (legacy), global object.

  3. Global Environment Record: Global scope (combines declarative and object).

Lexical scope is implemented as a linked list of environment records:

Global Environment

-> Function Environment (outer function)

-> Block Environment (let/const in block)

Systems analogy: Environment Records are like stack frames in C, but heap-allocated (for closures). The [[OuterEnv]] pointer is like a static link (access link) to the lexically enclosing scope’s frame, not the dynamic link to the caller’s frame.
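
The scope chain can be modeled in a few lines. The Environment class below is a simplified teaching model with hypothetical method names; it ignores details such as the temporal dead zone and the split between declarative and object records.

```javascript
class Environment {
    constructor(outer = null) {
        this.outer = outer;        // plays the role of [[OuterEnv]]
        this.bindings = new Map(); // variable name -> value
    }

    declare(name, value) {
        this.bindings.set(name, value);
    }

    // Walk outward through the chain until the name is found.
    lookup(name) {
        for (let env = this; env !== null; env = env.outer) {
            if (env.bindings.has(name)) return env.bindings.get(name);
        }
        throw new ReferenceError(`${name} is not defined`);
    }
}

const globalEnv = new Environment();
globalEnv.declare("x", "global x");
globalEnv.declare("y", "global y");

const functionEnv = new Environment(globalEnv);
functionEnv.declare("x", "function x"); // shadows the global binding

const blockEnv = new Environment(functionEnv);

console.log(blockEnv.lookup("x")); // "function x"
console.log(blockEnv.lookup("y")); // "global y"
```

A closure is then just a function paired with the Environment in which it was created, which is why that environment has to outlive the call that created it.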


End of Chapter 2

In the next chapter, we’ll explore JavaScript’s object model in depth—prototypes, inheritance, property access, the this keyword, and how engines optimize object layout with hidden classes.


Chapter 3: The Execution Model: Event Loop, Call Stack, and Heap

JavaScript’s Single-Threaded Concurrency Model

The Fundamental Constraint: One Thread of Execution

JavaScript was designed for the browser. In 1995, Netscape wanted a simple scripting language that could manipulate HTML without the complexity of threads. Brendan Eich’s solution: a single-threaded execution model with asynchronous I/O.

What does “single-threaded” mean?

  • Only one JavaScript call stack at a time.

  • Only one statement executes at any given moment.

  • No race conditions on shared memory (in the traditional sense).

  • No need for locks, mutexes, or semaphores in JavaScript code.

Systems analogy: Contrast this with POSIX threads (pthreads) in C:

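#include <pthread.h>

int shared_counter = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread increments the shared counter under the mutex. */
void* increment(void* arg) {
    pthread_mutex_lock(&lock);
    shared_counter++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t thread1, thread2;
    pthread_create(&thread1, NULL, increment, NULL);
    pthread_create(&thread2, NULL, increment, NULL);
    pthread_join(thread1, NULL); /* wait for both threads to finish */
    pthread_join(thread2, NULL);
    return 0;
}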

In JavaScript, you never write code like this. The single-threaded model eliminates the need for explicit synchronization primitives at the language level.

But how does JavaScript handle I/O without blocking?

The answer is the event loop—a run-to-completion model where asynchronous operations are queued and executed when the stack is empty.

The Event Loop Architecture: A Conceptual Overview

The JavaScript runtime consists of three core components:

  1. Call Stack: Tracks function execution (stack frames).

  2. Heap: Stores objects and closures (garbage-collected memory).

  3. Event Loop: Coordinates asynchronous tasks (timers, I/O, promises).

Diagram (conceptual):

┌─────────────────────────────────────────────────────────┐
│                   JavaScript Runtime                    │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────────┐          ┌──────────────┐             │
│  │  Call Stack  │          │     Heap     │             │
│  │              │          │              │             │
│  │  frame_n     │          │  Object@0x1  │             │
│  │  frame_n-1   │          │  Object@0x2  │             │
│  │  …           │          │  …           │             │
│  │  frame_0     │          │              │             │
│  └──────────────┘          └──────────────┘             │
│                                                         │
│  ┌─────────────────────────────────────────────────┐    │
│  │           Event Loop (Message Queue)            │    │
│  ├─────────────────────────────────────────────────┤    │
│  │  Task Queue (Macrotasks):                       │    │
│  │    [setTimeout, setInterval, I/O, UI events]    │    │
│  │  Microtask Queue:                               │    │
│  │    [Promise callbacks, queueMicrotask]          │    │
│  └─────────────────────────────────────────────────┘    │
│                                                         │
└─────────────────────────────────────────────────────────┘
                             ▲
                             │
                  Web APIs / Node.js APIs
         (setTimeout, fetch, fs.readFile, etc.)

Execution flow:

  1. Synchronous code runs on the call stack until completion.

  2. Asynchronous operations (timers, I/O) are delegated to Web APIs (browser) or libuv (Node.js).

  3. When an async operation completes, its callback is enqueued.

  4. The event loop dequeues tasks when the call stack is empty.

Key invariant: The event loop never interrupts executing code. Each task runs to completion before the next task starts.
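The invariant is easy to observe. In this minimal sketch, a timer with a 0 ms delay cannot fire until the synchronous code that scheduled it has finished, even though that code keeps the thread busy for about 50 ms. The variable names are illustrative only.

```javascript
const events = [];

// Ask for a callback "as soon as possible".
setTimeout(() => events.push("timer fired"), 0);

// Busy-wait for ~50 ms. The timer expires during this loop,
// but nothing can interrupt the currently running code.
const start = Date.now();
while (Date.now() - start < 50) {}

events.push("synchronous work finished");

// Inspect the order once everything has run.
setTimeout(() => console.log(events), 10);
// [ 'synchronous work finished', 'timer fired' ]
```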

The Event Loop Specification (HTML Living Standard)

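JavaScript's core spec (ECMA-262) does not define the event loop. It defines only an abstract job mechanism and leaves scheduling to the host; the loops themselves are specified by the HTML Living Standard (browsers) and by Node.js together with libuv.

ECMA-262's Job Queue (§9.5, pp. 150-151):

The spec defines abstract jobs for promise reactions (microtasks), but leaves the event loop implementation-defined:

Jobs are scheduled for future execution via the host hook HostEnqueuePromiseJob (named EnqueueJob in ES2015 through ES2019). The host (browser or Node.js) determines when jobs are executed.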

HTML Living Standard’s Event Loop:

Each event loop has:

  • Task queues (plural): Multiple queues for different task sources (timers, I/O, rendering).

  • Microtask queue: Single queue for promise callbacks.

  • Rendering pipeline: Steps for updating the DOM and screen.

Node.js’s Event Loop:

Six phases, each with a queue:

  1. Timers: setTimeout, setInterval.

  2. Pending callbacks: I/O callbacks deferred from the previous cycle.

  3. Idle, prepare: Internal use.

  4. Poll: Retrieve new I/O events; execute I/O callbacks.

  5. Check: setImmediate callbacks.

  6. Close callbacks: socket.on('close', ...).

After each callback, Node.js drains the process.nextTick queue and then the microtask queue (before Node.js 11, this happened only between phases).

Systems takeaway: The event loop is host-defined, but the core principle is the same: run to completion + task queues.


The Call Stack: Execution Contexts and Stack Frames

Execution Contexts: The Spec’s Abstraction

Specification: §9.4 (pp. 146-150)

An execution context is the spec’s abstraction for “what’s currently executing.” It contains:

ExecutionContext {
    CodeEvaluationState: suspend/resume (for generators/async)
    Function: the function being executed (or null for global/eval)
    Realm: the global object and intrinsics
    ScriptOrModule: the script or module
    LexicalEnvironment: current lexical scope (let/const)
    VariableEnvironment: current variable scope (var)
    PrivateEnvironment: private fields (class fields)
}

Execution context stack: A stack of execution contexts (analogous to the call stack in C).

Example:

function outer() {
    let x = 1;
    function inner() {
        let y = 2;
        console.log(x + y);
    }
    inner();
}
outer();

Execution context stack evolution:

Initial: [GlobalExecutionContext]

outer() called: [GlobalExecutionContext, outer_ExecutionContext]

inner() called: [GlobalExecutionContext, outer_ExecutionContext, inner_ExecutionContext]

inner() returns: [GlobalExecutionContext, outer_ExecutionContext]

outer() returns: [GlobalExecutionContext]
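The same evolution can be recorded at run time. The sketch below keeps an explicit array that mimics the execution context stack; traced, contextStack, and snapshots are hypothetical names, and the wrapper only approximates what the engine does implicitly on every call.

```javascript
const contextStack = ["global"];
const snapshots = [];

// Wrap a function so that entering it pushes a named context and leaving it pops one.
function traced(name, fn) {
  return function (...args) {
    contextStack.push(name);
    snapshots.push(contextStack.join(" > "));
    try {
      return fn.apply(this, args);
    } finally {
      contextStack.pop(); // runs on normal return and on throw
      snapshots.push(contextStack.join(" > "));
    }
  };
}

const inner = traced("inner", () => 1 + 2);
const outer = traced("outer", () => inner());

outer();
console.log(snapshots);
// [ 'global > outer', 'global > outer > inner', 'global > outer', 'global' ]
```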

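Systems analogy: This is analogous to the call stack in C (minus the closure: in C, inner cannot see outer's x):

#include <stdio.h>

void inner(void) { int y = 2; printf("%d\n", y); }
void outer(void) { int x = 1; (void)x; inner(); }
int main(void) { outer(); return 0; }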

Call stack:

main’s frame

-> outer’s frame

-> inner's frame

Stack Frames in JavaScript Engines

Engines don’t literally implement the spec’s execution context stack—they use optimized stack frames (like C compilers).

V8’s stack frame (simplified):

┌─────────────────────────────────────┐
│ Return Address (caller’s PC)        │
│ Frame Pointer (FP, points to base)  │
│ Context (pointer to lexical env)    │
│ Function (pointer to JSFunction)    │
│ Receiver (this)                     │
│ Argument 0                          │
│ Argument 1                          │
│ …                                   │
│ Local Variable 0                    │
│ Local Variable 1                    │
│ …                                   │
│ Temporary Value 0 (stack operands)  │
│ Temporary Value 1                   │
│ …                                   │
└─────────────────────────────────────┘

Key elements:

  • Return address: Where to jump after the function returns.

  • Frame pointer (FP): Base of the current frame (like rbp in x86-64).

  • Context: Pointer to the LexicalEnvironment (for closures).

  • Receiver: The this value.

  • Arguments: Function parameters.

  • Locals: Local variables.

  • Temporaries: Operands for bytecode instructions.

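Systems note: The frames themselves live on the machine stack. What moves to the heap is captured state: if an inner function closes over a variable, V8 allocates that variable in a heap-allocated Context object (which must outlive the call), and the frame's Context slot points to it. Variables that no closure captures stay in the frame or in registers.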
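A small example makes the distinction concrete. Where the engine actually places each variable is an implementation detail that JavaScript code cannot observe; the comments below describe the typical strategy, not a guarantee.

```javascript
function makeCounter() {
  let count = 0;             // captured by increment: must outlive this call
  const scratch = "temp";    // captured by nothing: can die with the frame
  void scratch;

  return function increment() {
    count += 1;
    return count;
  };
}

const counterA = makeCounter();
const counterB = makeCounter(); // a separate call gets separate captured state

counterA();
console.log(counterA()); // 2
console.log(counterB()); // 1
```

Each call to makeCounter creates fresh captured state, which is why counterA and counterB do not interfere with each other.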

Stack Overflow: Recursion Depth Limits

JavaScript has a maximum call stack size (implementation-defined). Exceeding it throws an error: V8 and JavaScriptCore throw RangeError: Maximum call stack size exceeded, while SpiderMonkey throws InternalError: too much recursion.

Example:

function recurse() {
    recurse(); // Infinite recursion
}
recurse(); // RangeError

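Stack size limits (rough figures for small frames; the actual depth depends on frame size, engine version, and the configured stack size):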

  • V8 (Chrome/Node.js): ~10,000-15,000 frames (depends on platform).

  • SpiderMonkey (Firefox): ~50,000 frames.

  • JavaScriptCore (Safari): ~100,000 frames.
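Because the limit depends on frame size and engine, the practical way to learn it is to measure it. This is a minimal sketch; measureMaxDepth is a hypothetical helper, and the number it reports will differ between engines, versions, and even between runs.

```javascript
function measureMaxDepth() {
  let depth = 0;

  function recurse() {
    depth += 1;
    recurse();
  }

  try {
    recurse();
  } catch (error) {
    // RangeError in V8 and JavaScriptCore, InternalError in SpiderMonkey.
  }
  return depth;
}

const maxDepth = measureMaxDepth();
console.log(`Reached ${maxDepth} nested calls before the engine gave up`);
```

Adding parameters or local variables to recurse makes each frame larger and lowers the measured depth, which is why published frame counts are only ever approximate.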

Systems implication: If you’re compiling a recursive language (Scheme, Haskell) to JavaScript, you need tail call optimization (TCO) or trampolining.

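Tail Call Optimization (TCO): ES2015 specified proper tail calls in strict mode (§15.9.1, p. 361), but among the major engines only JavaScriptCore (Safari) ships them. V8 and SpiderMonkey declined to ship the feature, citing lost stack frames in debuggers and stack traces among other implementation concerns.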

Trampolining: Convert recursion to iteration:

function trampoline(fn) {
    let result = fn();
    while (typeof result === "function") {
        result = result();
    }
    return result;
}

function factorial(n, acc = 1) {
    if (n <= 1) return acc;
    return () => factorial(n - 1, n * acc); // Return a thunk
}

console.log(trampoline(() => factorial(100000))); // Infinity (100000! overflows a double), but no RangeError: the stack stays flat

Continuation-Passing Style (CPS): Another approach—pass the “rest of the computation” as a callback:

function factorial_cps(n, k) {
    if (n <= 1) return k(1);
    return factorial_cps(n - 1, (result) => k(n * result));
}

factorial_cps(5, (result) => console.log(result)); // 120

But this still overflows without TCO. You’d need to trampoline the CPS’d function.
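Trampolining a CPS function means that neither the recursive call nor the continuation call happens directly; both are wrapped in thunks that the trampoline runs one at a time. The sketch below uses a sum instead of a factorial so that the result stays within the range of a double. Bounce, trampoline, and sumTo are illustrative names.

```javascript
// Marker object so the trampoline can tell "more work" from a final value.
class Bounce {
  constructor(thunk) {
    this.thunk = thunk;
  }
}

function trampoline(result) {
  while (result instanceof Bounce) {
    result = result.thunk(); // each step runs on a fresh, shallow stack
  }
  return result;
}

// CPS sum of 1..n. Every call and every continuation returns immediately with a Bounce.
function sumTo(n, k) {
  if (n === 0) return new Bounce(() => k(0));
  return new Bounce(() =>
    sumTo(n - 1, (partial) => new Bounce(() => k(n + partial)))
  );
}

const total = trampoline(sumTo(100000, (result) => result));
console.log(total); // 5000050000
```

The pending continuations now form a chain of closures on the heap instead of frames on the stack, so the depth is limited by memory rather than by the engine's stack size.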

Stack Traces and Debugging

Error.stack (non-standard but universal):

function foo() { bar(); }
function bar() { baz(); }
function baz() { console.log(new Error().stack); }
foo();

Output (V8):

Error
    at baz (<anonymous>:3:29)
    at bar (<anonymous>:2:14)
    at foo (<anonymous>:1:14)
    at <anonymous>:4:1

Source maps: For transpiled/minified code, engines use source maps (.map files) to map stack traces back to original source.

Systems note: If you’re building a transpiler, generate source maps (§13.5) so stack traces are useful.


The Heap: Object Storage and Garbage Collection

Heap Layout: Young Gen, Old Gen, Large Objects

JavaScript engines use generational garbage collection with multiple heap regions:

V8’s heap structure:

┌─────────────────────────────────────────────────────┐
│                       V8 Heap                       │
├─────────────────────────────────────────────────────┤
│ Young Generation (Scavenger, Semi-Space)            │
│   - From-space (active)                             │
│   - To-space (evacuation target)                    │
│   - Typical size: 1-8 MB                            │
│   - GC frequency: Very frequent (minor GC)          │
├─────────────────────────────────────────────────────┤
│ Old Generation (Mark-Sweep-Compact)                 │
│   - Old pointer space (objects with pointers)       │
│   - Old data space (objects without pointers)       │
│   - Typical size: 100s of MB                        │
│   - GC frequency: Infrequent (major GC)             │
├─────────────────────────────────────────────────────┤
│ Large Object Space                                  │
│   - Objects > 64 KB (in V8)                         │
│   - Never moved (to avoid copying overhead)         │
├─────────────────────────────────────────────────────┤
│ Code Space                                          │
│   - Compiled machine code (JIT’d functions)         │
│   - Executable memory (W^X protection)              │
└─────────────────────────────────────────────────────┘

Young generation (nursery):

  • New objects are allocated here.

  • Minor GC (Scavenger): Runs frequently (~1-10 ms pause).

  • Copying collector: Evacuates live objects to to-space, then swaps spaces.

  • Survival: Objects that survive 2-3 minor GCs are promoted to old gen.

Old generation:

  • Long-lived objects.

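  • Major GC (Mark-Sweep-Compact): Runs infrequently. A fully stop-the-world collection of a large heap can pause for ~100-1000 ms, which is why V8 does most of the marking incrementally and concurrently (described below).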

  • Tri-color marking: Incremental marking to reduce pauses.

  • Compaction: Periodically compacts memory to reduce fragmentation.

Large object space:

  • Objects too large to fit in young/old gen (e.g., large arrays, buffers).

  • Never moved (to avoid copying cost).

Systems analogy: Similar to multi-generational GC in JVM (Hotspot), GHC (Haskell), or .NET CLR.

Garbage Collection Algorithms

Minor GC: Cheney’s Semi-Space Collector

Algorithm:

  1. Divide young gen into two equal semi-spaces: from-space and to-space.

  2. Allocate objects in from-space (bump-pointer allocation, very fast).

  3. When from-space is full, evacuate live objects to to-space:

    • Start from roots (global objects, stack, registers).

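    • Traverse the object graph. Cheney's algorithm does this breadth-first, using to-space itself as the queue (a scan pointer chases the allocation pointer).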

    • Copy each reachable object to to-space.

    • Update pointers to point to the new location.

  4. Swap from-space and to-space.

  5. Dead objects are implicitly reclaimed (left in the old from-space).

Complexity:

  • Time: O(L), where L is the number of live objects (dead objects are never visited).

  • Space: Wastes 50% of young gen (to-space is empty until GC).

Why so fast? Most objects die young (the generational hypothesis), so L is small. The collector's work is proportional to the survivors, not to the amount of garbage, and there is no sweep over dead objects.

Systems note: This is the same algorithm used in early LISP systems (Cheney 1970).
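The algorithm fits in a short toy model. In this sketch a "space" is an array, a "pointer" is an array index, and forwarding pointers live in a side table rather than in object headers; real collectors work on raw memory, so treat cheneyCollect and its data layout as assumptions made for illustration.

```javascript
// Objects look like { value, fields }, where fields holds indices into the space (or null).
function cheneyCollect(fromSpace, roots) {
  const toSpace = [];
  const forwarding = new Map(); // from-space index -> to-space index

  // Copy an object on first visit; afterwards just return its new address.
  function evacuate(ref) {
    if (ref === null) return null;
    if (!forwarding.has(ref)) {
      forwarding.set(ref, toSpace.length);
      const source = fromSpace[ref];
      // Fields still hold from-space indices until the scan pointer reaches this object.
      toSpace.push({ value: source.value, fields: [...source.fields] });
    }
    return forwarding.get(ref);
  }

  const newRoots = roots.map(evacuate);

  // The scan pointer chases the allocation pointer: to-space doubles as the BFS queue.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].fields = toSpace[scan].fields.map(evacuate);
  }

  return { toSpace, roots: newRoots };
}

const fromSpace = [
  { value: "a", fields: [2] },       // index 0: root, points to c
  { value: "garbage", fields: [0] }, // index 1: unreachable (it points to a, nothing points to it)
  { value: "c", fields: [0, null] }, // index 2: points back to a (a cycle)
];

const collected = cheneyCollect(fromSpace, [0]);
console.log(collected.toSpace.map((obj) => obj.value)); // [ 'a', 'c' ]
console.log(collected.toSpace[0].fields); // [ 1 ]
console.log(collected.toSpace[1].fields); // [ 0, null ]
```

The garbage object is never touched, and the cycle between a and c is handled by the forwarding table: the second visit to a finds its new address instead of copying it again.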

Major GC: Mark-Sweep-Compact

Mark phase:

  1. Start from roots.

  2. Mark reachable objects (set a bit in the object header or a bitmap).

  3. Use tri-color marking for incrementality:

    • White: Unvisited.

    • Gray: Visited, but children not yet scanned.

    • Black: Visited, children scanned.

Sweep phase:

  1. Scan the entire old gen.

  2. Add unmarked objects to the free list.

  3. Reset mark bits.

Compact phase (optional, periodic):

  1. Move objects to eliminate fragmentation.

  2. Update all pointers.

Incremental marking: V8 splits the mark phase into small steps (1-5 ms each), interleaved with JavaScript execution. This reduces pause times.

Concurrent marking/sweeping: V8 uses background threads to mark and sweep while JavaScript keeps running on the main thread (this requires write barriers to track pointer updates).

Systems analogy: Similar to mark-sweep in Boehm GC (C/C++) or to concurrent marking in the JVM's CMS and G1 collectors.
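Tri-color marking with a step budget can be sketched as a worklist algorithm. This toy version assumes the object graph does not change between steps, which is exactly the assumption a real incremental collector cannot make; the function names are illustrative.

```javascript
const WHITE = "white", GRAY = "gray", BLACK = "black";

function makeObject(name) {
  return { name, color: WHITE, children: [] };
}

// Returns a marker whose step(budget) processes at most `budget` gray objects.
function startMarking(roots) {
  const worklist = [];
  for (const root of roots) {
    root.color = GRAY;
    worklist.push(root);
  }
  return {
    step(budget) {
      while (budget > 0 && worklist.length > 0) {
        const obj = worklist.pop();
        for (const child of obj.children) {
          if (child.color === WHITE) {
            child.color = GRAY;
            worklist.push(child);
          }
        }
        obj.color = BLACK; // all children have been shaded
        budget -= 1;
      }
      return worklist.length === 0; // true once marking is finished
    },
  };
}

// Sweep: keep black objects (resetting them to white), drop white ones.
function sweep(heap) {
  const survivors = heap.filter((obj) => obj.color === BLACK);
  for (const obj of survivors) obj.color = WHITE;
  return survivors;
}

const a = makeObject("a"), b = makeObject("b"), c = makeObject("c"), d = makeObject("d");
a.children.push(b);
b.children.push(c, a); // cycle back to a
let heap = [a, b, c, d]; // d is unreachable

const marker = startMarking([a]);
let pauses = 0;
while (!marker.step(1)) pauses += 1; // JavaScript would run during each pause

heap = sweep(heap);
console.log(heap.map((obj) => obj.name)); // [ 'a', 'b', 'c' ]
console.log(pauses); // 2
```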

Write Barriers: Tracking Pointer Updates

Problem: If the GC runs incrementally, JavaScript code might update pointers while marking is in progress. The GC must track these updates to avoid missing live objects.

Solution: Write barriers—instrumentation inserted by the compiler to track pointer writes.

Example (pseudo-code):

// Without write barrier:
obj.field = newValue;

// With write barrier:
obj.field = newValue;
if (isMarking && isBlack(obj) && isWhite(newValue)) {
    markGray(newValue); // Prevent missing this object
}

V8’s write barrier:

; Store newValue to obj.field
mov [obj + field_offset], newValue
; Write barrier check
test [marking_flag], 1
jz skip_barrier
call WriteBarrier
skip_barrier:

Cost: Write barriers add overhead to every pointer write (~5-10% slowdown). But they enable concurrent GC, which reduces pauses.
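The failure that the barrier prevents can be reproduced in a toy model. The sketch below freezes the marker in mid-cycle, lets the "mutator" move a pointer, and then finishes marking, once without and once with a barrier of the kind shown above. All names are hypothetical and each object has a single reference field to keep the example short.

```javascript
const WHITE = "white", GRAY = "gray", BLACK = "black";

const makeNode = (name) => ({ name, color: WHITE, ref: null });

// A pointer store, optionally followed by the barrier check.
function store(obj, newValue, worklist, useBarrier) {
  obj.ref = newValue;
  if (useBarrier && newValue !== null &&
      obj.color === BLACK && newValue.color === WHITE) {
    newValue.color = GRAY; // re-shade so the marker will still visit it
    worklist.push(newValue);
  }
}

function runScenario(useBarrier) {
  const a = makeNode("a"), b = makeNode("b"), c = makeNode("c");
  b.ref = c; // roots: a and b; c is reachable only through b

  // Marking is paused here: a is already scanned, b is queued, c is unvisited.
  a.color = BLACK;
  b.color = GRAY;
  const worklist = [b];

  // The mutator runs during the pause and moves c from b to a.
  store(a, c, worklist, useBarrier);
  store(b, null, worklist, useBarrier);

  // Marking resumes and drains the worklist.
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (obj.ref !== null && obj.ref.color === WHITE) {
      obj.ref.color = GRAY;
      worklist.push(obj.ref);
    }
    obj.color = BLACK;
  }
  return c.color; // white would mean: swept although a.ref still points to it
}

console.log(runScenario(false)); // white (live object lost)
console.log(runScenario(true));  // black (barrier kept it alive)
```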

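Systems note: If you're compiling to WebAssembly and manage your own heap in linear memory, you'll need to implement the collector, including its write barriers, yourself. The WebAssembly GC proposal, now shipping in major engines, lets you allocate objects that the host's collector manages instead.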

Memory Leaks: Common Pitfalls

Even with GC, you can leak memory by keeping objects reachable when they’re no longer needed.

Pitfall 1: Accidental Global Variables

function leak() {
    leakyVariable = "This is a global!"; // Forgot 'let', becomes global
}
leak();

Fix: Use strict mode ("use strict";), in which assigning to an undeclared variable throws a ReferenceError instead of silently creating a global.
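A quick check of the strict-mode behavior, as a minimal sketch. The identifier accidentalGlobalForDemo is deliberately undeclared and exists only for this illustration.

```javascript
function assignUndeclared() {
  "use strict";
  try {
    accidentalGlobalForDemo = "oops"; // no let/const/var
    return "assigned";
  } catch (error) {
    return error.name;
  }
}

console.log(assignUndeclared());             // ReferenceError
console.log(typeof accidentalGlobalForDemo); // undefined (no global was created)
```

Modules and class bodies are always strict, so code written as ES modules gets this protection without the directive.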

Pitfall 2: Closures Capturing Large Scopes

function createLeakyFunction() {
    const largeArray = new Array(1000000).fill("data");
    const smallValue = 42;
    
    return function() {
        console.log(smallValue); // Only uses smallValue
    };
}

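const fn = createLeakyFunction(); // Is largeArray retained? It depends on the engine.

Why? In the spec's model, a closure holds a reference to its entire enclosing Environment Record, not just to the variables it references, so largeArray would stay reachable for as long as fn does. Real engines are smarter but not perfect: V8 moves only captured variables into a heap-allocated context, but that context is shared by every closure created in the same scope. If any sibling closure references largeArray, all of them keep it alive.

Fix (manual scope minimization): confine the large value to a scope that the returned closure cannot see:

function createLeanFunction() {
    const smallValue = 42;

    {
        // largeArray lives in its own block, outside the closure's scope chain
        const largeArray = new Array(1000000).fill("data");
        // Process largeArray here, then let it go out of scope
        // ...
    }

    return function() {
        console.log(smallValue); // Only smallValue is in scope to be captured
    };
}

V8 optimization: Modern engines analyze closures and capture only the variables that some closure actually uses, so the first version often does not leak in practice. The language does not guarantee this, so don't rely on it for large allocations.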

Pitfall 3: Event Listeners Not Removed

const element = document.getElementById("myButton");
const handler = () => console.log("Clicked!");

element.addEventListener("click", handler);

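// Later, remove the element from the DOM:
element.remove();
// The listener is still registered, and `element` still references the
// detached node, so neither the node nor the handler's closure can be collected.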

Fix: Remove listeners explicitly:

element.removeEventListener("click", handler);

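Or use { once: true } when the handler should fire at most once (the listener is removed automatically after its first invocation):

element.addEventListener("click", handler, { once: true });

Pitfall 4: Caches Without Eviction

const cache = {};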

function getCachedData(key) {
    if (!(key in cache)) {
        cache[key] = expensiveComputation(key);
    }
    return cache[key];
}

Problem: cache grows unbounded.

Fix: Use Map with size limits, or WeakMap (if keys are objects):

const cache = new WeakMap();

function getCachedData(obj) {
    if (!cache.has(obj)) {
        cache.set(obj, expensiveComputation(obj));
    }
    return cache.get(obj);
}

// If obj is no longer reachable, the cache entry is GC'd
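For keys that are not objects, a WeakMap is not an option and the cache needs an explicit bound. One possible sketch is a small least-recently-used cache built on Map's insertion order; LRUCache is a hypothetical class written for this example, not a built-in.

```javascript
class LRUCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order: oldest first
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    this.entries.delete(key); // re-insert to mark as most recently used
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey); // evict the least recently used entry
    }
  }
}

const lruCache = new LRUCache(2);
lruCache.set("a", 1);
lruCache.set("b", 2);
lruCache.get("a");    // touch "a", so "b" is now the oldest entry
lruCache.set("c", 3); // exceeds the limit: evicts "b"

console.log([...lruCache.entries.keys()]); // [ 'a', 'c' ]
```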

Profiling Heap Usage

Chrome DevTools:

  1. Open DevTools → Memory tab.

  2. Take a heap snapshot.

  3. Analyze object retainers (why an object is still alive).

Node.js:

const v8 = require("v8");
const heapStats = v8.getHeapStatistics();
console.log(heapStats);

Output:

{
  total_heap_size: 7376896,
  used_heap_size: 4523432,
  heap_size_limit: 2197815296,
  // ...
}

Memory snapshots:

const v8 = require("v8");

const snapshot = v8.writeHeapSnapshot();
console.log(`Snapshot written to ${snapshot}`);

Open the .heapsnapshot file in Chrome DevTools for analysis.


The Event Loop: Task Queues and Microtasks

Macrotasks vs. Microtasks

JavaScript has two types of asynchronous tasks:

  1. Macrotasks (Tasks): setTimeout, setInterval, I/O, UI events.

  2. Microtasks (Jobs): Promise callbacks (then, catch, finally), queueMicrotask(), MutationObserver.

Execution order:

  1. Execute one macrotask from the task queue.

  2. Execute ALL microtasks from the microtask queue.

  3. Render (if browser, and if necessary).

  4. Repeat.

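Key difference: Microtasks never interrupt a running macrotask (run-to-completion still holds), but they jump the queue: once the current macrotask finishes and the call stack is empty, the entire microtask queue is drained before the next macrotask starts.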

Example:

console.log("Script start");

setTimeout(() => console.log("setTimeout"), 0);

Promise.resolve()
    .then(() => console.log("Promise 1"))
    .then(() => console.log("Promise 2"));

console.log("Script end");

Output:

Script start
Script end
Promise 1
Promise 2
setTimeout

Explanation:

  1. Synchronous code: "Script start", "Script end".

  2. Microtasks: "Promise 1", "Promise 2".

  3. Macrotask: "setTimeout".

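Systems analogy: Microtasks are like deferred work that runs on the way out of the current handler (softirqs, or deferred procedure calls): they run as soon as the current task finishes, before anything else is scheduled. Macrotasks are like jobs waiting in a scheduler's run queue.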
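The ordering rules can be captured in a synchronous toy model. The sketch below is not how any host implements its loop (there is a single task queue and no rendering); runToyEventLoop and the host object are hypothetical names, and the script reproduces the setTimeout/promise example with explicit queue operations.

```javascript
function runToyEventLoop(firstTask) {
  const taskQueue = [firstTask];
  const microtaskQueue = [];
  const output = [];
  const host = {
    log: (message) => output.push(message),
    queueTask: (fn) => taskQueue.push(fn),
    queueMicrotask: (fn) => microtaskQueue.push(fn),
  };

  while (taskQueue.length > 0) {
    const task = taskQueue.shift();
    task(host); // 1. one macrotask, run to completion

    while (microtaskQueue.length > 0) {
      microtaskQueue.shift()(host); // 2. drain microtasks, including newly queued ones
    }
    // 3. a browser would get a rendering opportunity here
  }
  return output;
}

const toyOutput = runToyEventLoop((host) => {
  host.log("Script start");
  host.queueTask((h) => h.log("setTimeout"));
  host.queueMicrotask((h) => {
    h.log("Promise 1");
    h.queueMicrotask((h2) => h2.log("Promise 2"));
  });
  host.log("Script end");
});

console.log(toyOutput.join(", "));
// Script start, Script end, Promise 1, Promise 2, setTimeout
```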

Task Queue: Macrotask Sources

Browser:

  • setTimeout/setInterval: Timers.

  • I/O: fetch(), XMLHttpRequest.

  • UI events: click, keydown, etc.

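  • Rendering: requestAnimationFrame callbacks (strictly speaking, these run as part of the rendering step rather than from a task queue).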

Node.js:

  • Timers: setTimeout, setInterval.

  • I/O: fs.readFile, net.connect.

  • Immediate: setImmediate (runs in the check phase, right after the poll phase).

  • Close callbacks: socket.on('close').

Task ordering: The spec does not guarantee FIFO order for tasks from different sources. In practice, each task source has its own queue, and the event loop selects from them in implementation-defined order.

Microtask Queue: Promise Callbacks

HTML spec: After each task, the event loop performs a microtask checkpoint:

Microtask Checkpoint

  1. If the microtask queue is empty, return.

  2. Let oldestMicrotask be the first microtask in the queue.

  3. Remove oldestMicrotask from the queue.

  4. Run oldestMicrotask.

  5. Goto step 1 (repeat until queue is empty).

ECMA-262 spec: §9.5.4 (p. 151) defines HostEnqueuePromiseJob, the host hook through which promise jobs are handed to the host's microtask queue.

Example (nested promises):

Promise.resolve().then(() => {
    console.log("1");
    Promise.resolve().then(() => console.log("2"));
});
Promise.resolve().then(() => console.log("3"));

Output:

1
3
2

Explanation:

  1. First then callback: console.log("1"), enqueues "2".

  2. Second then callback: console.log("3").

  3. Nested then callback: console.log("2").

Microtask starvation: If a microtask keeps enqueuing more microtasks, the event loop never advances to the next macrotask:

function recursiveMicrotask() {
    queueMicrotask(recursiveMicrotask);
}
recursiveMicrotask();
// Infinite loop! No macrotasks (timers, I/O) will ever run.

Systems lesson: Microtasks can starve macrotasks. Use with caution.
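One common way to keep long-running work from starving the loop is to split it into chunks and yield with a macrotask between them, so that timers and I/O callbacks get a turn. This is a sketch under that assumption; processInChunks is a hypothetical helper, not a standard API.

```javascript
function processInChunks(items, handleItem, chunkSize, onDone) {
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index]);
    }
    if (index < items.length) {
      setTimeout(runChunk, 0); // yield: other macrotasks may run before the next chunk
    } else {
      onDone();
    }
  }

  runChunk();
}

const order = [];

// An unrelated timer that would be starved by a microtask loop.
setTimeout(() => order.push("timer"), 0);

processInChunks(
  [1, 2, 3, 4],
  (item) => order.push(`item ${item}`),
  2,
  () => order.push("done")
);

setTimeout(() => console.log(order), 50);
// [ 'item 1', 'item 2', 'timer', 'item 3', 'item 4', 'done' ]
```

Had runChunk rescheduled itself with queueMicrotask instead of setTimeout, all four items would have been processed before the unrelated timer got a chance to fire.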

queueMicrotask(): Manual Microtask Scheduling

Specification: HTML Living Standard, Section 8.5.2

API:

queueMicrotask(callback);

Example:

console.log("Start");

queueMicrotask(() => console.log("Microtask"));

console.log("End");

Output:

Start
End
Microtask

Use case: Batching updates:

let updates = [];

function scheduleUpdate(data) {
    updates.push(data);
    if (updates.length === 1) {
        queueMicrotask(() => {
            flushUpdates();
            updates = [];
        });
    }
}

function flushUpdates() {
    console.log("Flushing:", updates);
}

scheduleUpdate("A");
scheduleUpdate("B");
scheduleUpdate("C");
// Output: Flushing: ["A", "B", "C"]

Node.js Event Loop: Six Phases

Node.js event loop (libuv):

   ┌───────────────────────────┐
┌─>│           timers          │  (setTimeout, setInterval)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │  (I/O callbacks deferred from prev cycle)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │  (internal)
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │  (setImmediate)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │  (e.g., socket.on('close'))
   └───────────────────────────┘

Between phases (and, since Node.js 11, after each individual callback): the process.nextTick queue and then the microtask queue are drained.

Timers phase:

setTimeout(() => console.log("Timer 1"), 0);
setTimeout(() => console.log("Timer 2"), 0);
// Both run in the timers phase, in order.

Poll phase:

  • Wait for I/O events (with a timeout).

  • Execute I/O callbacks.

  • If no timers and no setImmediate, block until I/O arrives.

Check phase (setImmediate):

setImmediate(() => console.log("Immediate"));

Runs after the poll phase, even if scheduled in the same iteration.
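The phase order becomes visible when both a timer and an immediate are scheduled from inside an I/O callback: the loop is in the poll phase at that moment, so the check phase comes before the next timers phase. This sketch is Node.js-only; fs.stat on the current directory is just an arbitrary I/O operation chosen to obtain such a callback.

```javascript
const fs = require("node:fs");

const phaseOrder = [];

fs.stat(".", () => {
  // This callback runs in the poll phase.
  setTimeout(() => phaseOrder.push("timeout"), 0);
  setImmediate(() => phaseOrder.push("immediate"));
});

setTimeout(() => console.log(phaseOrder), 100);
// [ 'immediate', 'timeout' ]
```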

Close callbacks phase:

const net = require("net");

const socket = new net.Socket();
socket.on("close", () => console.log("Socket closed"));
socket.destroy();

process.nextTick(): The Hidden Microtask Queue

Node.js-specific: process.nextTick() schedules a callback to run before the event loop continues to the next phase.

Execution order:

  1. Run one callback from the current phase.

  2. Execute ALL process.nextTick callbacks.

  3. Execute ALL promise microtasks.

  4. Continue with the next callback, or move to the next phase when the queue is empty.

Example:

console.log("Start");

setTimeout(() => console.log("setTimeout"), 0);
setImmediate(() => console.log("setImmediate"));

process.nextTick(() => console.log("nextTick 1"));
Promise.resolve().then(() => console.log("Promise"));
process.nextTick(() => console.log("nextTick 2"));

console.log("End");

Output (Node.js):

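Start
End
nextTick 1
nextTick 2
Promise
setTimeout
setImmediate

The relative order of the last two lines is not guaranteed when both are scheduled from the main module: it depends on whether the 1 ms timer has already expired when the loop first reaches the timers phase. Inside an I/O callback, setImmediate always fires first.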

Why nextTick before promises? Node.js’s nextTick queue is processed before the microtask queue. This is a Node.js-specific quirk.

Warning: process.nextTick can starve the event loop:

function recursiveNextTick() {
    process.nextTick(recursiveNextTick);
}
recursiveNextTick();
// Infinite loop! Event loop phases never advance.

Systems lesson: Prefer queueMicrotask() or promises for portability. Only use process.nextTick() when you explicitly need to run before microtasks (rare).


Timers: setTimeout, setInterval, and Precision

setTimeout: Single-Shot Timer

Specification: HTML Living Standard, Section 8.6

API:

const timerId = setTimeout(callback, delay, ...args);
clearTimeout(timerId);

Example:

setTimeout(() => console.log("Hello"), 1000);

Key points:

  1. Minimum delay: Browsers clamp delays < 4ms to 4ms once timers are nested more than 5 levels deep. Node.js treats any delay below 1ms (including 0) as 1ms.

  2. Not guaranteed: The callback runs at least delay milliseconds later, but may be delayed by:

    • Long-running tasks on the call stack.

    • Other tasks in the queue.

  3. Extra arguments: Any arguments after delay are passed to the callback:

setTimeout((a, b) => console.log(a + b), 1000, 5, 10); // Logs 15 after 1s

Canceling a timer:

const id = setTimeout(() => console.log("Never runs"), 1000);
clearTimeout(id);

setInterval: Repeating Timer

API:

const intervalId = setInterval(callback, delay, ...args);
clearInterval(intervalId);

Example:

let count = 0;
const id = setInterval(() => {
    console.log(++count);
    if (count === 5) clearInterval(id);
}, 1000);

Pitfall: Interval drift: If the callback takes longer than delay, invocations cannot overlap (there is only one thread), but they run back-to-back and the schedule drifts.

Example:

setInterval(() => {
    const start = Date.now();
    while (Date.now() - start < 100) {} // Simulate 100ms work
    console.log("Done");
}, 50); // Scheduled every 50ms, but takes 100ms

Output: “Done” every ~100ms, not 50ms.

Fix: Use setTimeout recursively:

function repeat() {
    console.log("Done");
    setTimeout(repeat, 50);
}
repeat();

Now each invocation is scheduled 50 ms after the previous one completes, so the callbacks never pile up (the period becomes the work time plus 50 ms).
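If the goal is a steady cadence rather than a fixed gap, each delay can be computed from the original start time, so that lateness in one tick is not carried into the next. This is a sketch under that assumption; delayUntilTick and startSteadyInterval are hypothetical helpers, and the ticks can still arrive late when the loop is busy.

```javascript
// Delay until tick number `tickIndex` (1-based) of a schedule that began at `startTime`.
function delayUntilTick(startTime, tickIndex, intervalMs, now) {
  const target = startTime + tickIndex * intervalMs;
  return Math.max(0, target - now); // already late? fire as soon as possible
}

function startSteadyInterval(callback, intervalMs, maxTicks) {
  const startTime = performance.now();
  let tickIndex = 0;

  function tick() {
    tickIndex += 1;
    callback(tickIndex);
    if (tickIndex < maxTicks) {
      const delay = delayUntilTick(startTime, tickIndex + 1, intervalMs, performance.now());
      setTimeout(tick, delay);
    }
  }

  setTimeout(tick, intervalMs);
}

const ticks = [];
startSteadyInterval((n) => ticks.push(n), 20, 3);

console.log(delayUntilTick(1000, 2, 50, 1070)); // 30 (target is 1100)
console.log(delayUntilTick(1000, 2, 50, 1130)); // 0 (already 30 ms late)
```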

Timer Precision: The Reality Check

JavaScript timers are not precise. The delay is a minimum, not a guarantee.

Example:

const start = Date.now();
setTimeout(() => {
    console.log(`Elapsed: ${Date.now() - start}ms`);
}, 1000);

Output: Typically 1000-1005ms, but can be 1050ms or more if the system is busy.

Why?

  1. Event loop: Timers are checked at the start of each event loop iteration. If the loop is blocked, timers are delayed.

  2. OS scheduler: The OS may not wake the process exactly when the timer expires.

  3. Browser throttling: Background tabs throttle timers to at most about one wake-up per second, and sometimes far less often (to save battery).

High-precision timers (performance.now()):

const start = performance.now();
setTimeout(() => {
    const elapsed = performance.now() - start;
    console.log(`Elapsed: ${elapsed}ms`);
}, 1000);

Output: Finer-grained (sub-millisecond, although browsers deliberately coarsen the resolution to mitigate timing attacks), but still subject to event loop delays.

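Systems note: For animation, use requestAnimationFrame() (browser), which is synchronized with the display's refresh. For measuring elapsed time in Node.js, use performance.now() or process.hrtime.bigint(); these are clocks, not schedulers, so they do not make callbacks fire more punctually.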


Promises and Async/Await: Syntactic Sugar Over the Event Loop

Promises: Microtask-Based Asynchrony

Specification: §27.2 (pp. 673-706)

Promise states (§27.2.1.1, p. 674):

  • Pending: Initial state.

  • Fulfilled: Operation completed successfully.

  • Rejected: Operation failed.

State transitions:

pending → fulfilled (resolve)
pending → rejected (reject)

Once fulfilled or rejected, a promise is settled (immutable).

Creating a promise:

const promise = new Promise((resolve, reject) => {
    setTimeout(() => {
        if (Math.random() > 0.5) {
            resolve("Success!");
        } else {
            reject(new Error("Failure!"));
        }
    }, 1000);
});

promise
    .then(result => console.log(result))
    .catch(error => console.error(error));

Chaining:

Promise.resolve(5)
    .then(x => x * 2)
    .then(x => x + 3)
    .then(x => console.log(x)); // 13

Each then returns a new promise, allowing chains.
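Because every then returns a new promise, a rejection travels down the chain until some handler deals with it, skipping fulfillment handlers on the way. A minimal sketch of that propagation (the step names are illustrative):

```javascript
const steps = [];

Promise.resolve(5)
  .then((x) => {
    steps.push("double");
    return x * 2;
  })
  .then(() => {
    steps.push("throw");
    throw new Error("boom"); // rejects the promise returned by this then
  })
  .then(() => {
    steps.push("skipped"); // never runs: the previous promise was rejected
  })
  .catch((error) => {
    steps.push(`caught ${error.message}`);
    return "recovered"; // the chain continues in the fulfilled state
  })
  .then((value) => {
    steps.push(value);
  });

setTimeout(() => console.log(steps), 0);
// [ 'double', 'throw', 'caught boom', 'recovered' ]
```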

Async/Await: State Machine Compilation

Specification: §15.8 (pp. 346-354)

async/await is syntactic sugar for promises. The compiler transforms it into a state machine.

Example:

async function fetchData() {
    const response = await fetch("https://api.example.com/data");
    const data = await response.json();
    return data;
}

Desugared (conceptual):

function fetchData() {
    return new Promise((resolve, reject) => {
        fetch("https://api.example.com/data")
            .then(response => response.json())
            .then(data => resolve(data))
            .catch(error => reject(error));
    });
}

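Transpilers that target older engines generate an explicit state machine (similar to their output for generators), while engines implement the same idea natively by suspending and resuming the function's execution context. A hand-written sketch of the state machine:

function fetchData() {
    let state = 0;
    let response, data;

    // Runs the code between two awaits and reports what to wait for next
    function step(value) {
        switch (state) {
            case 0:
                state = 1;
                return { done: false, value: fetch("https://api.example.com/data") };
            case 1:
                response = value;
                state = 2;
                return { done: false, value: response.json() };
            case 2:
                data = value;
                return { done: true, value: data };
        }
    }

    return new Promise((resolve, reject) => {
        function next(value) {
            let result;
            try {
                result = step(value);
            } catch (error) {
                return reject(error); // A throw in the body rejects the promise
            }
            if (result.done) {
                resolve(result.value);
            } else {
                // await: coerce to a promise, resume when it settles
                Promise.resolve(result.value).then(next, reject);
            }
        }
        next();
    });
}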

Systems analogy: Async functions are compiled to coroutines with explicit state. The await keyword is a suspension point—execution pauses, the promise is awaited, and execution resumes when the promise settles.
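The coroutine view can be made concrete with generators, which is how async functions were emulated before engines supported them natively: yield marks the suspension point and a small driver resumes the generator when the yielded promise settles. runAsync, delayValue, and addLater are hypothetical names used only in this sketch.

```javascript
// Drive a generator as if every `yield` were an `await`.
function runAsync(generatorFn, ...args) {
  return new Promise((resolve, reject) => {
    const generator = generatorFn(...args);

    function resume(method, value) {
      let result;
      try {
        result = generator[method](value); // run until the next yield or return
      } catch (error) {
        return reject(error); // uncaught throw inside the generator
      }
      if (result.done) return resolve(result.value);
      Promise.resolve(result.value).then(
        (fulfilled) => resume("next", fulfilled),
        (rejected) => resume("throw", rejected) // surfaces as a throw at the yield
      );
    }

    resume("next", undefined);
  });
}

const delayValue = (value, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

function* addLater(a, b) {
  const x = yield delayValue(a, 10);
  const y = yield delayValue(b, 10);
  return x + y;
}

function* failLater() {
  try {
    yield Promise.reject(new Error("nope"));
  } catch (error) {
    return `handled ${error.message}`;
  }
}

let sumResult, failResult;
runAsync(addLater, 2, 3).then((value) => { sumResult = value; });
runAsync(failLater).then((value) => { failResult = value; });

setTimeout(() => console.log(sumResult, failResult), 100);
// 5 handled nope
```

Routing rejections through generator.throw is what lets an ordinary try/catch inside the function body handle asynchronous failures.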

Error Handling: try/catch in Async Functions

async function fetchData() {
    try {
        const response = await fetch("https://api.example.com/data");
        const data = await response.json();
        return data;
    } catch (error) {
        console.error("Failed to fetch:", error);
        return null;
    }
}

Desugared:

function fetchData() {
    return fetch("https://api.example.com/data")
        .then(response => response.json())
        .catch(error => {
            console.error("Failed to fetch:", error);
            return null;
        });
}

Key point: When a promise awaited inside a try block rejects, the rejection is thrown at the await and can be handled by the catch block. This is more ergonomic than .catch() chains.

Top-Level Await (ES2022)

Specification: §16.1.8 (p. 383)

ES2022 allows await at the top level of modules:

// module.js
const response = await fetch("https://api.example.com/data");
const data = await response.json(); // fetch resolves to a Response, not to the payload
export default data;

How it works: The module’s evaluation is suspended until the promise settles. Dependent modules wait for this module to finish.

Execution order:

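// a.js
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
console.log("A start");
await delay(100);
console.log("A end");

// b.js
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
console.log("B start");
await delay(50);
console.log("B end");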

// main.js
import "./a.js";
import "./b.js";
console.log("Main");

Output:

A start
B start
B end
A end
Main

Sibling modules are evaluated concurrently, not on parallel threads: each runs in import order until its first await, and main.js waits for both to settle.

Systems note: Top-level await changes module loading from synchronous to asynchronous. This affects bundlers (Webpack, Rollup) and requires careful dependency management.


Web Workers and Shared Memory: Breaking the Single-Threaded Model

Web Workers: True Parallelism in the Browser

Specification: HTML Living Standard, Section 10.2

Web Workers run JavaScript in a separate thread with a separate event loop, heap, and call stack.

Creating a worker:

// main.js
const worker = new Worker("worker.js");

worker.postMessage({ type: "compute", data: [1, 2, 3, 4, 5] });

worker.onmessage = (event) => {
    console.log("Result:", event.data);
};

// worker.js
self.onmessage = (event) => {
    const { type, data } = event.data;
    if (type === "compute") {
        const sum = data.reduce((acc, x) => acc + x, 0);
        self.postMessage(sum);
    }
};

Key points:

  1. No shared memory (by default): Data is cloned via the structured clone algorithm.

  2. No DOM access: Workers can’t access document, window, or the DOM.

  3. Separate global object: Workers have self (not window).

Structured clone: Serializes objects (including arrays, dates, maps, sets) but not functions, DOM nodes, or prototypes.

Transferable objects: For large data (TypedArrays, ArrayBuffers), use transferables to avoid copying:

const buffer = new ArrayBuffer(1024 * 1024); // 1 MB
worker.postMessage(buffer, [buffer]); // Transfer ownership
console.log(buffer.byteLength); // 0 (detached)

Systems analogy: Web Workers are like POSIX threads, but with message passing (like Erlang) instead of shared memory.
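The cloning and transfer rules can be tried without starting a worker. The sketch below uses the structuredClone() global (available in current browsers and in Node.js 17+), which exposes the same algorithm that postMessage applies to its argument; the Point class is a hypothetical example type.

```javascript
const original = {
  when: new Date(0),
  tags: new Set(["a", "b"]),
  nested: { list: [1, 2, 3] },
};

const copy = structuredClone(original);
copy.nested.list.push(4);
console.log(original.nested.list.length); // 3 (deep copy: the original is untouched)
console.log(copy.when instanceof Date, copy.tags instanceof Set); // true true

// Class instances arrive as plain objects: own data survives, the prototype does not.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  norm() { return Math.hypot(this.x, this.y); }
}
const clonedPoint = structuredClone(new Point(3, 4));
console.log(clonedPoint.x, clonedPoint instanceof Point); // 3 false

// Functions cannot be cloned at all.
let cloneErrorName = null;
try {
  structuredClone({ callback() {} });
} catch (error) {
  cloneErrorName = error.name;
}
console.log(cloneErrorName); // DataCloneError

// Transferring moves the memory instead of copying it and detaches the source.
const buffer = new ArrayBuffer(1024);
const moved = structuredClone(buffer, { transfer: [buffer] });
console.log(buffer.byteLength, moved.byteLength); // 0 1024
```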

SharedArrayBuffer: Shared Memory Concurrency

Specification: §25.2 (pp. 646-651)

SharedArrayBuffer (SAB) allows shared memory between the main thread and workers.

Creating a shared buffer:

// main.js
const sab = new SharedArrayBuffer(1024);
const view = new Int32Array(sab);
view[0] = 42;

const worker = new Worker("worker.js");
worker.postMessage(sab);

// worker.js
self.onmessage = (event) => {
    const sab = event.data;
    const view = new Int32Array(sab);
    console.log(view[0]); // 42 (shared!)
    view[0] = 100;
};

Race conditions: Without synchronization, reads/writes can race:

// main.js
view[0] = 0;
for (let i = 0; i < 10000; i++) {
    view[0]++;
}

// worker.js
for (let i = 0; i < 10000; i++) {
    view[0]++;
}

// Final value is NOT 20000! (data race)

Atomics API: Provides atomic operations (§25.4, pp. 657-670):

// main.js
Atomics.store(view, 0, 0);
for (let i = 0; i < 10000; i++) {
    Atomics.add(view, 0, 1); // Atomic increment
}

// worker.js
for (let i = 0; i < 10000; i++) {
    Atomics.add(view, 0, 1);
}

// Final value is guaranteed to be 20000

Atomic operations:

  • Atomics.load(ta, index): Atomic read.

  • Atomics.store(ta, index, value): Atomic write.

  • Atomics.add(ta, index, value): Atomic fetch-and-add.

  • Atomics.compareExchange(ta, index, expected, replacement): CAS.

  • Atomics.wait(ta, index, value, timeout): Block while ta[index] equals value, until notified or timed out (not allowed on the browser's main thread).

  • Atomics.notify(ta, index, count): Wake waiting threads.

Systems analogy: Atomics provides primitives comparable to C11's _Atomic or C++'s std::atomic, but with sequentially consistent ordering only (there is no equivalent of memory_order_relaxed).
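Operations that Atomics does not offer directly are built from compareExchange in a retry loop, just as with CAS in C or C++. The sketch below implements a lock-free "store the maximum"; atomicMax is a hypothetical helper, and the example runs on a single thread, so it shows the pattern rather than demonstrating contention.

```javascript
const shared = new Int32Array(new SharedArrayBuffer(4));

// Raise view[index] to `candidate` unless it is already at least that large.
function atomicMax(view, index, candidate) {
  let current = Atomics.load(view, index);
  while (candidate > current) {
    const witnessed = Atomics.compareExchange(view, index, current, candidate);
    if (witnessed === current) return candidate; // our CAS won
    current = witnessed; // another thread changed the value: re-check and retry
  }
  return current; // existing value was already >= candidate
}

Atomics.store(shared, 0, 10);
console.log(atomicMax(shared, 0, 7));  // 10 (unchanged)
console.log(atomicMax(shared, 0, 42)); // 42 (updated)
console.log(Atomics.load(shared, 0));  // 42
```

With several workers calling atomicMax on the same cell, a failed compareExchange simply means another thread got there first; the loop re-reads the value and decides again.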

Security note: SharedArrayBuffer was disabled in browsers in 2018 due to Spectre vulnerabilities. It’s now re-enabled with cross-origin isolation requirements (Cross-Origin-Opener-Policy: same-origin + Cross-Origin-Embedder-Policy: require-corp headers).


Node.js: libuv and the I/O Subsystem

libuv: The Cross-Platform Async I/O Library

Node.js’s event loop is implemented by libuv, a C library providing:

  • Event loop.

  • Asynchronous I/O (file system, network, etc.).

  • Thread pool (for operations with no non-blocking OS equivalent, such as the file system work behind fs.readFile).

libuv architecture:

┌─────────────────────────────────────────────────────────┐
│                     Node.js Process                     │
├─────────────────────────────────────────────────────────┤
│  JavaScript (V8 Engine)                                 │
│      ↓                                                  │
│  Node.js Bindings (C++)                                 │
│      ↓                                                  │
│  libuv (C Library)                                      │
│      ↓                                                  │
│  OS-Specific I/O APIs                                   │
│    - epoll (Linux)                                      │
│    - kqueue (BSD, macOS)                                │
│    - IOCP (Windows)                                     │
└─────────────────────────────────────────────────────────┘

Thread pool: libuv maintains a default thread pool of 4 threads (configurable via UV_THREADPOOL_SIZE environment variable) for blocking operations:

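  • Asynchronous file system calls such as fs.readFile (the *Sync variants, such as fs.readFileSync, run on the main thread and block it).

  • DNS lookups (dns.lookup()).

  • CPU-intensive asynchronous crypto operations (crypto.pbkdf2(), crypto.scrypt(), crypto.randomBytes()).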

Non-blocking I/O: Network operations (sockets) use the OS’s native non-blocking I/O (epoll, kqueue, IOCP).

Asynchronous File System Operations

Example:

const fs = require("fs");

fs.readFile("data.txt", "utf8", (err, data) => {
    if (err) throw err;
    console.log(data);
});

console.log("Reading file...");

Output:

Reading file...
[file contents]

How it works:

  1. fs.readFile() enqueues a task to the thread pool.

  2. A worker thread performs the blocking read() syscall.

  3. When done, the callback is enqueued to the event loop.

  4. The event loop dequeues and executes the callback.
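
The four steps above can be observed from JavaScript by recording when each piece of code runs. This is a small sketch for Node.js; the scratch file name and the order array are assumptions made for the demonstration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file, created only for this demonstration.
const scratchFile = path.join(os.tmpdir(), `readfile-order-${process.pid}.txt`);
fs.writeFileSync(scratchFile, "payload");

const order = [];

fs.readFile(scratchFile, "utf8", (err, data) => {
    // Steps 3-4: the callback runs on the main thread in a later loop iteration.
    if (err) throw err;
    order.push(`callback: ${data}`);
    console.log(order); // [ 'main script finished', 'callback: payload' ]
    fs.unlinkSync(scratchFile);
});

// Step 1 has only queued the request; the main script runs to completion first.
order.push("main script finished");
```

Even for a tiny file that is already in the page cache, the callback never runs synchronously inside fs.readFile().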

Synchronous alternative (blocks the event loop):

const data = fs.readFileSync("data.txt", "utf8");
console.log(data);

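Avoid *Sync methods on any path that runs while the process is serving requests: they block the event loop, so no other callback, timer, or I/O completion is processed until they return. They are acceptable in startup code and short-lived scripts.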

Streams: Backpressure and Flow Control

Node.js Streams (the stream module) are an abstraction for reading/writing data incrementally:

  • Readable: Source of data (fs.createReadStream, http.IncomingMessage).

  • Writable: Destination (fs.createWriteStream, http.ServerResponse).

  • Duplex: Both readable and writable (net.Socket).

  • Transform: Duplex stream that modifies data (zlib.createGzip).

Example:

const fs = require("fs");

const readable = fs.createReadStream("input.txt");
const writable = fs.createWriteStream("output.txt");

readable.pipe(writable);

Backpressure: If the writable stream can’t keep up with the readable stream, .pipe() automatically pauses the readable until the writable drains.

Manual backpressure handling:

readable.on("data", (chunk) => {
    const canContinue = writable.write(chunk);
    if (!canContinue) {
        readable.pause(); // Pause reading
    }
});

writable.on("drain", () => {
    readable.resume(); // Resume reading
});

Systems analogy: Backpressure is like flow control in TCP—the receiver signals the sender to slow down when its buffer is full.
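
The signal that drives backpressure is the boolean returned by write(): it turns false once the bytes pending in the writable reach its highWaterMark. The sketch below makes that visible with a deliberately tiny threshold; slowSink is a hypothetical in-memory sink, not a real destination.

```javascript
const { Writable } = require("stream");

// Hypothetical slow sink: acknowledges each chunk only after a turn of the event loop.
const slowSink = new Writable({
    highWaterMark: 4, // Buffer threshold in bytes (tiny, for illustration).
    write(chunk, encoding, callback) {
        setImmediate(callback);
    }
});

const firstOk = slowSink.write(Buffer.alloc(2));  // 2 bytes pending: below the threshold
const secondOk = slowSink.write(Buffer.alloc(8)); // 10 bytes pending: threshold reached

console.log(firstOk, secondOk); // true false

slowSink.once("drain", () => {
    console.log("drained; safe to write again");
});
```

A false return value does not reject the chunk; it is still buffered. The producer is simply expected to stop writing until "drain" fires.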


Performance Optimization: Understanding the Runtime

Avoid Blocking the Event Loop

Rule: Never perform synchronous blocking operations in the event loop.

Bad:

const result = someExpensiveComputation(); // 1 second
console.log(result);

This blocks the event loop for 1 second, freezing all I/O, timers, and user interactions.

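Deferring is not a fix on its own: setImmediate() moves the call to a later loop iteration, but the loop is still blocked for the full second once the callback runs:

setImmediate(() => {
    const result = someExpensiveComputation();  // Still blocks for 1 second
    console.log(result);
});

Good: Split the work into slices that yield between steps, or move it off the main thread with a worker:

// Browser Web Worker API; Node.js offers a similar Worker class in worker_threads
const worker = new Worker("worker.js");
worker.postMessage({ task: "compute" });
worker.onmessage = (event) => {
    console.log(event.data);
};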
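
When a worker is more machinery than the task deserves, one option is to cut the computation into slices and yield to the event loop between them. The following is a minimal sketch for Node.js (setImmediate is not available in browsers); sumInChunks and its chunk size are assumptions chosen for illustration.

```javascript
// Hypothetical CPU-bound task: sum the integers in [0, limit) in slices.
function sumInChunks(limit, chunkSize = 100000) {
    return new Promise((resolve) => {
        let total = 0;
        let next = 0;

        function runChunk() {
            const end = Math.min(next + chunkSize, limit);
            for (; next < end; next++) {
                total += next;
            }
            if (next < limit) {
                setImmediate(runChunk); // Yield so pending I/O and timers can run.
            } else {
                resolve(total);
            }
        }

        runChunk();
    });
}

sumInChunks(1000000).then((total) => {
    console.log(total); // 499999500000
});
```

The chunk size trades throughput for latency: larger slices finish sooner overall but hold the loop longer each time.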

Minimize Microtask Queue Depth

Bad:

Promise.resolve()
    .then(() => Promise.resolve())
    .then(() => Promise.resolve())
    .then(() => Promise.resolve())
    // 1000 more .then() calls

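Each .then() callback runs as a microtask, and the engine drains the entire microtask queue before returning to the event loop. A long run of promise callbacks that never waits on real I/O therefore delays timers, I/O callbacks, and rendering.

Good: Flatten chains with async/await. This improves readability but does not reduce microtask work, since each await still schedules a microtask; a long-running sequence should also yield to the event loop periodically (for example via setTimeout or setImmediate):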

async function run() {
    await step1();
    await step2();
    await step3();
}
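
The priority of microtasks over macrotasks can be observed directly. In this sketch a timer is scheduled first, yet a chain of 1000 promise callbacks still completes before it; the chain helper and the order array are hypothetical names used only here.

```javascript
const order = [];

// Scheduled first, but it is a macrotask.
setTimeout(() => {
    order.push("macrotask (timer)");
    console.log(order);
}, 0);

// Each callback schedules the next one as a new microtask.
function chain(remaining) {
    if (remaining === 0) {
        order.push("microtask chain finished");
        return;
    }
    Promise.resolve().then(() => chain(remaining - 1));
}

chain(1000);
order.push("synchronous code finished");
```

The logged order is synchronous code, then the whole microtask chain, then the timer.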

Use Object Pools for High-Frequency Allocations

Problem: Frequent allocations pressure the GC.

Solution: Reuse objects:

class ObjectPool {
    constructor(factory, size) {
        this.factory = factory;
        this.pool = Array(size).fill(null).map(() => factory());
    }

    acquire() {
        return this.pool.pop() || this.factory();
    }

    release(obj) {
        this.pool.push(obj);
    }
}

const pool = new ObjectPool(() => ({ x: 0, y: 0 }), 100);

function usePoint() {
    const point = pool.acquire();
    point.x = 10;
    point.y = 20;
    // ... use point
    pool.release(point);
}

Systems note: Object pooling is common in game engines (Unity, Unreal) to reduce GC pressure.

Profile Before Optimizing

Chrome DevTools:

  1. Performance tab → Record → Stop.

  2. Analyze flame chart for bottlenecks.

Node.js:

node --inspect app.js

Open chrome://inspect and profile.

Benchmarking:

const { performance } = require("perf_hooks");

const start = performance.now();
someFunction();
const end = performance.now();
console.log(`Elapsed: ${end - start}ms`);

End of Chapter 3

In the next chapter, we’ll turn to functions, closures, and scopes: how function objects are represented, how lexical environments back closures, and how engines optimize calls.


Chapter 4: Functions, Closures, and Scopes

Functions: First-Class Citizens and the Function Object

The Function as an Object: Callable, Constructable, and More

JavaScript functions are first-class objects. They’re not just executable code—they’re instances of the Function constructor with properties, methods, and internal slots.

Specification: §20.2 (pp. 480-502)

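A function object has the following (some entries apply only to certain kinds of function):

  • [[Call]]: Internal method making it callable.

  • [[Construct]]: Internal method making it constructable (for new). Present only on constructors; arrow functions and methods lack it.

  • prototype: Property holding the object that becomes the [[Prototype]] of instances created with new. Absent on arrow functions and methods.

  • length: Number of formal parameters before the first default or rest parameter.

  • name: The function’s name ("" for an anonymous function expression with no inferred name; "anonymous" for functions created with new Function()).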

Example:

function greet(name, greeting = "Hello") {
    console.log(`${greeting}, ${name}!`);
}

console.log(greet.length);        // 1 (only counts non-default params)
console.log(greet.name);          // "greet"
console.log(typeof greet);        // "function"
console.log(greet instanceof Object);  // true

Internal slots (§10.2.1, pp. 181-182):

FunctionObject {
    [[Call]]: executable code
    [[Construct]]: constructor behavior (if present)
    [[Environment]]: lexical environment (for closures)
    [[FormalParameters]]: parameter list
    [[ECMAScriptCode]]: parsed function body
    [[Realm]]: the realm in which the function was created
    [[HomeObject]]: for super references (methods)
    [[ThisMode]]: lexical, strict, or global
}

Systems perspective: A JavaScript function is like a C function pointer plus a struct containing its lexical environment. The closure captures variables from outer scopes, stored in [[Environment]].

Function Declaration vs. Function Expression vs. Arrow Function

Function Declaration

Syntax:

function name(params) {
    // body
}

Hoisting: Function declarations are hoisted—the entire function is available before the declaration in source order:

console.log(add(2, 3));  // 5

function add(a, b) {
    return a + b;
}

Specification: §14.1 (pp. 265-269)

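Before any statement in a script or function body runs, the engine instantiates its declarations (GlobalDeclarationInstantiation or FunctionDeclarationInstantiation): each function declaration’s function object is created and bound in the environment record at that point. This is why hoisting works.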

Systems analogy: Similar to C’s function prototypes—the compiler knows the signature before seeing the definition.

Function Expression

Syntax:

const name = function(params) {
    // body
};

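No hoisting of the function: With var, the variable is hoisted and initialized to undefined, but the function is assigned only when execution reaches the assignment. (With let or const, reading the variable before its declaration throws a ReferenceError instead.)

console.log(add);        // undefined
console.log(add(2, 3));  // TypeError: add is not a function

var add = function(a, b) {
    return a + b;
};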

Named function expressions:

const factorial = function fact(n) {
    if (n <= 1) return 1;
    return n * fact(n - 1);  // Can use 'fact' for recursion
};

console.log(factorial.name);  // "fact"
console.log(fact);            // ReferenceError: fact is not defined

The name fact is only visible inside the function body.

Arrow Functions

Syntax:

const name = (params) => expression;
const name = (params) => { /* body */ };

Key differences from regular functions:

  1. No this binding: Arrow functions inherit this from the enclosing scope (lexical this).

  2. No arguments object: Use rest parameters (...args) instead.

  3. Cannot be used as constructors: No [[Construct]] method.

  4. No prototype property.

  5. No own super or new.target (both resolve lexically, as this does), and no yield (an arrow function cannot be a generator).

Specification: §15.3 (pp. 321-326)

Arrow functions have [[ThisMode]] set to "lexical" (§10.2.1.1, p. 182).

Example:

function Timer() {
    this.seconds = 0;
    
    // Arrow function captures 'this' from Timer
    setInterval(() => {
        this.seconds++;
        console.log(this.seconds);
    }, 1000);
}

new Timer();  // 1, 2, 3, ...

Contrast with regular function:

function Timer() {
    this.seconds = 0;
    
    setInterval(function() {
        this.seconds++;  // 'this' is not the Timer instance (window in browsers, the Timeout object in Node.js)
        console.log(this.seconds);  // NaN
    }, 1000);
}

new Timer();

Fix (pre-ES6 pattern):

function Timer() {
    const self = this;  // Capture 'this'
    this.seconds = 0;
    
    setInterval(function() {
        self.seconds++;
        console.log(self.seconds);
    }, 1000);
}

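Systems insight: An arrow function behaves roughly as if this were bound at creation time. Conceptually,

const fn = () => this.value;

behaves like:

const fn = function() { return this.value; }.bind(this);

The engine performs no such rewrite, however, and there is no runtime .bind() call: the arrow function has no this binding of its own, so this resolves through the enclosing environment like any other captured variable.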
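
The differences listed in this section can each be checked at runtime. The sketch below uses hypothetical names (arrow, regular, outer, owner) and only standard language features.

```javascript
const arrow = () => {};
function regular() {}

// No prototype property on arrows.
const arrowHasPrototype = arrow.hasOwnProperty("prototype");     // false
const regularHasPrototype = regular.hasOwnProperty("prototype"); // true

// No [[Construct]]: calling an arrow with new throws.
let constructError = null;
try {
    new arrow();
} catch (error) {
    constructError = error;
}

// No own arguments object: the name resolves to the enclosing function's.
function outer() {
    const inner = () => arguments[0];
    return inner("ignored");
}
const seenArgument = outer("from outer"); // "from outer"

// Lexical this: call() cannot override it.
const owner = {
    label: "owner",
    make() {
        return () => this.label;
    }
};
const seenLabel = owner.make().call({ label: "other" }); // "owner"

console.log(arrowHasPrototype, regularHasPrototype);   // false true
console.log(constructError instanceof TypeError);      // true
console.log(seenArgument, seenLabel);                  // from outer owner
```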

The arguments Object: Array-Like and Aliased

Specification: §10.2.1.3 (pp. 183-184)

Non-arrow functions have an implicit arguments object:

function sum() {
    let total = 0;
    for (let i = 0; i < arguments.length; i++) {
        total += arguments[i];
    }
    return total;
}

console.log(sum(1, 2, 3, 4));  // 10

Array-like: arguments has a length property and numeric indices, but is not an Array:

function test() {
    console.log(Array.isArray(arguments));  // false
    console.log(arguments.length);          // number of args
}

test(1, 2, 3);

Convert to array:

function test() {
    const args = Array.from(arguments);  // ES6
    const args2 = [...arguments];        // ES6 spread
    const args3 = Array.prototype.slice.call(arguments);  // ES5
}

Aliasing (in non-strict mode):

function test(a, b) {
    console.log(arguments[0]);  // 1
    a = 10;
    console.log(arguments[0]);  // 10 (aliased!)
}

test(1, 2);

In strict mode, parameters are not aliased:

"use strict";
function test(a, b) {
    a = 10;
    console.log(arguments[0]);  // 1 (not aliased)
}

Systems note: Aliasing complicates optimization. V8 must track whether parameters are aliased, potentially preventing inlining. Avoid arguments in modern code—use rest parameters.
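
Aliasing is also switched off whenever the parameter list is not "simple" (it contains a default, rest, or destructured parameter), even in non-strict code. The sketch below builds its functions with new Function, which always produces non-strict code unless the body opts in, so the result does not depend on whether the surrounding file is a script or a module. The function names are hypothetical.

```javascript
// Sloppy mode, simple parameters: arguments[0] is aliased to `a`.
const sloppyAlias = new Function("a", "a = 10; return arguments[0];");

// Strict mode: no aliasing.
const strictNoAlias = new Function("a", "'use strict'; a = 10; return arguments[0];");

// Sloppy mode, but a default parameter makes the list non-simple: no aliasing.
const defaultsNoAlias = new Function("a, b = 0", "a = 10; return arguments[0];");

console.log(sloppyAlias(1));     // 10
console.log(strictNoAlias(1));   // 1
console.log(defaultsNoAlias(1)); // 1
```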

Rest Parameters and Spread Syntax

Rest parameters (§15.1, p. 312):

function sum(...numbers) {
    return numbers.reduce((acc, n) => acc + n, 0);
}

console.log(sum(1, 2, 3, 4));  // 10

Key differences from arguments:

  • True array: numbers is an Array instance.

  • No aliasing: Changes to numbers don’t affect named parameters.

  • Only trailing parameters: Must be the last parameter.

function fn(a, b, ...rest) {
    console.log(a, b, rest);
}

fn(1, 2, 3, 4, 5);  // 1, 2, [3, 4, 5]

Spread syntax (§13.2.4, pp. 246-247):

const arr = [1, 2, 3];
console.log(Math.max(...arr));  // 3

const arr2 = [0, ...arr, 4];  // [0, 1, 2, 3, 4]

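Systems insight: Spread is defined in terms of the iterator protocol, so it works on any iterable, not only arrays. Engines avoid the generic protocol where they can: V8 has a fast path for spreading plain arrays whose iterator has not been modified.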
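
Because spread consumes an iterator, it accepts anything that implements Symbol.iterator and rejects array-likes that do not. A short sketch, with countdown, range, and arrayLike as hypothetical examples:

```javascript
const fromSet = [...new Set([3, 1, 3, 2])]; // [3, 1, 2]
const fromString = [..."abc"];              // ["a", "b", "c"]

function* countdown(from) {
    while (from > 0) yield from--;
}
const fromGenerator = [...countdown(3)];    // [3, 2, 1]

// A custom iterable works with spread in call position too.
const range = {
    from: 1,
    to: 4,
    *[Symbol.iterator]() {
        for (let value = this.from; value <= this.to; value++) yield value;
    }
};
const largest = Math.max(...range);         // 4

// An array-like without Symbol.iterator cannot be spread into an array.
const arrayLike = { length: 2, 0: "a", 1: "b" };
let spreadError = null;
try {
    [...arrayLike];
} catch (error) {
    spreadError = error;
}

console.log(fromSet, fromString, fromGenerator, largest);
console.log(spreadError instanceof TypeError); // true
```

Array.from(arrayLike) is the usual way to convert an array-like, since it falls back to length and indices when no iterator is present.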

Default Parameters: Temporal Dead Zone and Scope

Specification: §15.1.4 (pp. 313-315)

Syntax:

function greet(name = "World", greeting = "Hello") {
    console.log(`${greeting}, ${name}!`);
}

greet();                  // Hello, World!
greet("Alice");           // Hello, Alice!
greet("Bob", "Hi");       // Hi, Bob!

Default parameters are evaluated at call time:

let counter = 0;
function fn(x = counter++) {
    console.log(x);
}

fn();  // 0
fn();  // 1
fn();  // 2

Parameters have their own scope:

function fn(a = 1, b = a + 1) {
    console.log(a, b);
}

fn();        // 1, 2
fn(5);       // 5, 6
fn(5, 10);   // 5, 10

Temporal Dead Zone (TDZ):

function fn(a = b, b = 1) {
    console.log(a, b);
}

fn();  // ReferenceError: Cannot access 'b' before initialization

Why? Parameters are evaluated left-to-right. When evaluating a = b, b hasn’t been initialized yet (it’s in the TDZ).

Parameter scope vs. function scope:

let x = 1;
function fn(a = x, x = 2) {
    console.log(a, x);
}

fn();  // ReferenceError: Cannot access 'x' before initialization

Explanation: The parameter x = 2 shadows the outer x. When evaluating a = x, the parameter x is in the TDZ.

Systems lesson: JavaScript’s parameter defaults create a separate lexical scope for parameters, distinct from the function body scope. This is similar to Scheme’s let* (sequential bindings).
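
The separation between parameter scope and body scope becomes visible when a default value closes over a parameter that the body then redeclares with var. The function below is a hypothetical example written to show this; it is a sketch of the behavior, not a pattern to use in real code.

```javascript
function scopes(read = () => x, x = "parameter") {
    // Because the parameter list has default expressions, this var creates a
    // separate binding in the body scope (initialized from the parameter).
    var x = "body";
    return [read(), x];
}

const result = scopes();
console.log(result); // [ 'parameter', 'body' ]
```

The closure stored in read was created in the parameter scope, so it keeps seeing the parameter x even after the body assigns to its own x.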


Scopes: Lexical Environments and Closure Semantics

Lexical Scoping: The Foundation

Specification: §9.1 (pp. 129-145)

JavaScript uses lexical scoping (also called static scoping): A variable’s scope is determined by its position in the source code, not by the call stack at runtime.

Example:

let x = "global";

function outer() {
    let x = "outer";
    
    function inner() {
        console.log(x);  // "outer" (lexical scope)
    }
    
    return inner;
}

const fn = outer();
fn();  // "outer"

Contrast with dynamic scoping (used in Emacs Lisp, early Perl):

// Hypothetical dynamic scoping (not JavaScript!)
let x = "global";

function outer() {
    let x = "outer";
    inner();
}

function inner() {
    console.log(x);  // Would print "outer" in dynamic scoping
}

outer();

In dynamic scoping, inner() would look up x in the call stack, finding outer’s x. In JavaScript’s lexical scoping, inner() looks up x in its lexical environment (where it was defined), finding the global x.

Environment Records: The Spec’s Abstraction

Specification: §9.1.1 (pp. 129-136)

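An Environment Record is the spec’s abstraction for “where variables live.” There are five kinds, two of which specialize the declarative record:

  1. Declarative Environment Record: For let, const, class, and function declarations in blocks, and for catch parameters.

  2. Function Environment Record: A declarative record for a function’s top-level scope (parameters and var declarations); for non-arrow functions it also holds the this binding.

  3. Module Environment Record: A declarative record for an ES6 module’s top-level scope, including import bindings.

  4. Object Environment Record: For with statements and global object properties.

  5. Global Environment Record: Hybrid of declarative + object records.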

Structure:

EnvironmentRecord {
    bindings: Map<String, Value>
    outer: EnvironmentRecord | null
}

Example:

let globalVar = "global";

function outer() {
    let outerVar = "outer";
    
    function inner() {
        let innerVar = "inner";
        console.log(globalVar, outerVar, innerVar);
    }
    
    inner();
}

outer();

Environment chain:

inner's Environment {
    bindings: { innerVar: "inner" }
    outer: outer's Environment
}

outer's Environment {
    bindings: { outerVar: "outer", inner: <function> }
    outer: Global Environment
}

Global Environment {
    bindings: { globalVar: "global", outer: <function> }
    outer: null
}

Variable lookup: Start at the current environment, walk the outer chain until found (or ReferenceError if not found).

Systems analogy: Environment records are like stack frames in C, but they’re heap-allocated if captured by closures. The outer link is analogous to the static link in ALGOL-style languages.
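
The lookup rule can be modeled in a few lines. The Environment class below is a toy written for this illustration; it mirrors the bindings/outer structure described above and is not how any engine represents scopes internally.

```javascript
// Toy model of an environment record: a bindings map plus an outer link.
class Environment {
    constructor(outer = null) {
        this.bindings = new Map();
        this.outer = outer;
    }

    declare(name, value) {
        this.bindings.set(name, value);
    }

    // Walk the outer chain until the name is found.
    lookup(name) {
        for (let env = this; env !== null; env = env.outer) {
            if (env.bindings.has(name)) {
                return env.bindings.get(name);
            }
        }
        throw new ReferenceError(`${name} is not defined`);
    }
}

const globalEnv = new Environment();
globalEnv.declare("globalVar", "global");

const outerEnv = new Environment(globalEnv);
outerEnv.declare("outerVar", "outer");

const innerEnv = new Environment(outerEnv);
innerEnv.declare("innerVar", "inner");

console.log(innerEnv.lookup("innerVar"));  // "inner"  (found immediately)
console.log(innerEnv.lookup("globalVar")); // "global" (found two links up)
```

Real engines resolve most of these lookups at compile time into a fixed (depth, slot) pair, so the walk shown here rarely happens at runtime.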

Block Scopes: let and const

Specification: §14.3.1 (pp. 276-277)

ES6 introduced block-scoped variables (let, const) that create a new environment for each block:

{
    let x = 1;
    const y = 2;
    var z = 3;  // Function-scoped
}

console.log(z);  // 3
console.log(x);  // ReferenceError

Temporal Dead Zone (TDZ): let/const variables cannot be accessed before their declaration:

console.log(x);  // ReferenceError: Cannot access 'x' before initialization
let x = 1;

Contrast with var (hoisted, initialized to undefined):

console.log(x);  // undefined
var x = 1;

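let in loops: The loop variable is scoped to the loop, not to the enclosing function:

// 'i' exists only inside the loop
for (let i = 0; i < 3; i++) {
    console.log(i);  // OK
}
console.log(i);  // ReferenceError: i is not defined

// Each iteration gets a fresh binding of 'i'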
for (let i = 0; i < 3; i++) {
    setTimeout(() => console.log(i), 100);
}
// Logs: 0, 1, 2 (each closure captures a different 'i')

Contrast with var:

for (var i = 0; i < 3; i++) {
    setTimeout(() => console.log(i), 100);
}
// Logs: 3, 3, 3 (all closures share the same 'i')

Why? let creates a new binding for each loop iteration, while var reuses the same binding.

Specification: §14.7.4.2 (pp. 299-300) describes loop iteration environment creation.

var Hoisting: Function-Scoped and Initialized

Hoisting: var declarations behave as if they were moved to the top of their function (or the global scope) and initialized to undefined there; the assignment stays where it was written:

function test() {
    console.log(x);  // undefined (not ReferenceError)
    var x = 1;
    console.log(x);  // 1
}

Desugared:

function test() {
    var x;  // Hoisted to top, initialized to undefined
    console.log(x);
    x = 1;
    console.log(x);
}

Function-scoped, not block-scoped:

function test() {
    if (true) {
        var x = 1;
    }
    console.log(x);  // 1 (x is function-scoped)
}

Systems lesson: var is a legacy of JavaScript’s hasty design. Always use let/const in modern code.


Closures: Capturing Lexical Environments

What is a Closure?

Definition: A closure is a function that captures variables from its enclosing lexical scope, even after the outer function has returned.

Example:

function makeCounter() {
    let count = 0;
    return function() {
        return ++count;
    };
}

const counter = makeCounter();
console.log(counter());  // 1
console.log(counter());  // 2
console.log(counter());  // 3

How it works:

  1. makeCounter creates a local variable count.

  2. The inner function captures count in its [[Environment]] slot.

  3. When makeCounter returns, its execution context is popped, but the environment record is not garbage-collected because the inner function still references it.

  4. Each call to counter() accesses the captured count.

Systems analogy: Closures are like upvalues in Lua or lexical bindings in Scheme. The captured variables are stored in a heap-allocated environment (not on the stack).
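
Two consequences of this model are worth seeing side by side: closures created by the same call share one environment, and each call creates a fresh one. makeAccount below is a hypothetical example.

```javascript
function makeAccount(initial) {
    let balance = initial; // Lives in an environment created per call.
    return {
        deposit(amount) {
            balance += amount;
            return balance;
        },
        getBalance() {
            return balance;
        }
    };
}

const first = makeAccount(100);
const second = makeAccount(0);

first.deposit(50);

console.log(first.getBalance());  // 150 (deposit and getBalance share one 'balance')
console.log(second.getBalance()); // 0   (a separate call, a separate environment)
```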

Closure Implementation: Hidden Classes and Contexts

V8’s implementation:

  1. Context object: A heap-allocated object storing captured variables.

  2. [[Environment]] slot: The function object points to the context.

  3. Variable access: Load from context, not from stack frame.

Pseudo-code:

// Source:
function makeCounter() {
    let count = 0;
    return function() { return ++count; };
}

// V8 internal representation:
Context {
    count: 0
}

Function {
    [[Environment]]: Context
    [[Code]]: bytecode for "return ++count"
}

Bytecode (simplified):

LdaContextSlot [0]    ; Load count from context slot 0
Inc                   ; Increment
StaContextSlot [0]    ; Store back to context slot 0
Return

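Systems note: V8 allocates a context slot only for variables that some inner function references; other locals stay in registers or on the stack. All closures created in the same scope share one context, so a variable captured by any of them stays alive as long as any of them does.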

Closures in Loops: The Classic Pitfall

Problem:

const functions = [];
for (var i = 0; i < 3; i++) {
    functions.push(function() {
        console.log(i);
    });
}

functions[0]();  // 3
functions[1]();  // 3
functions[2]();  // 3

Why? All closures capture the same i (function-scoped by var). After the loop, i is 3.

Fix 1: Use let:

for (let i = 0; i < 3; i++) {
    functions.push(function() {
        console.log(i);
    });
}

functions[0]();  // 0
functions[1]();  // 1
functions[2]();  // 2

Fix 2: IIFE (Immediately Invoked Function Expression):

for (var i = 0; i < 3; i++) {
    (function(i) {
        functions.push(function() {
            console.log(i);
        });
    })(i);
}

Fix 3: .bind():

for (var i = 0; i < 3; i++) {
    functions.push(function(n) {
        console.log(n);
    }.bind(null, i));  // The current value of i is fixed as the first argument
}

Systems lesson: Understanding closures is critical for avoiding subtle bugs. The spec’s environment model explains why these fixes work.

Closure Memory Leaks: Retaining References

Problem:

function setupHandler() {
    const largeData = new Array(1000000).fill("data");
    
    document.getElementById("button").addEventListener("click", function() {
        console.log("Button clicked!");
        // Doesn't use largeData, but may still keep it alive
    });
}

setupHandler();

Why? Whether largeData is retained depends on the engine. V8 context-allocates only variables that some inner function references, so this handler alone does not keep largeData alive. The leak appears when any other closure created in setupHandler references largeData: all closures from one scope share a single context, so the click handler then retains the array as well. The spec itself makes no guarantee either way.

Fix: Explicitly null out references once the data is no longer needed:

function setupHandler() {
    let largeData = new Array(1000000).fill("data");
    
    // Use largeData here...
    
    largeData = null;  // Drop the reference so the array can be collected
    
    document.getElementById("button").addEventListener("click", function() {
        console.log("Button clicked!");
    });
}

Or use WeakMap:

const dataMap = new WeakMap();

function setupHandler(button) {
    const largeData = new Array(1000000).fill("data");
    dataMap.set(button, largeData);
    
    button.addEventListener("click", function() {
        const data = dataMap.get(button);
        // Use data if needed
    });
}

// Once the button is unreachable (removed from the DOM, no other references), its largeData entry can be GC'd

The this Keyword: Dynamic Binding and Confusion

this Binding Rules

JavaScript’s this is dynamically bound based on how a function is called, not where it’s defined.

Four binding rules:

  1. Default binding: Global object (or undefined in strict mode).

  2. Implicit binding: The object the method is called on.

  3. Explicit binding: .call(), .apply(), .bind().

  4. new binding: The newly created object.

Rule 1: Default Binding

function fn() {
    console.log(this);
}

fn();  // Window (browser) or global (Node.js)

Strict mode:

"use strict";
function fn() {
    console.log(this);
}

fn();  // undefined

Rule 2: Implicit Binding

const obj = {
    value: 42,
    fn: function() {
        console.log(this.value);
    }
};

obj.fn();  // 42

Gotcha: Losing binding:

const fn = obj.fn;
fn();  // undefined (TypeError in strict mode, where 'this' is undefined)

Why? fn is now a standalone function, losing its context.
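
Two common ways to keep the receiver when a method is passed around are .bind() and a wrapper arrow function. A minimal sketch, with counter and callLater as hypothetical names:

```javascript
const counter = {
    count: 0,
    increment() {
        this.count++;
        return this.count;
    }
};

// Stand-in for any API that stores a callback and invokes it later.
function callLater(callback) {
    return callback();
}

const bound = counter.increment.bind(counter); // 'this' fixed to counter
const wrapped = () => counter.increment();     // Call goes through the object

console.log(callLater(bound));   // 1
console.log(callLater(wrapped)); // 2
```

The wrapper looks up counter.increment at call time, so it picks up a method that is replaced later; the bound function keeps calling the original.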

Rule 3: Explicit Binding

.call() and .apply():

function greet(greeting) {
    console.log(`${greeting}, ${this.name}!`);
}

const person = { name: "Alice" };

greet.call(person, "Hello");   // Hello, Alice!
greet.apply(person, ["Hi"]);   // Hi, Alice!

.bind(): Returns a new function with this permanently bound:

const boundGreet = greet.bind(person);
boundGreet("Hey");  // Hey, Alice!

Rule 4: new Binding

Specification: §20.2.3.1 (pp. 489-490)

When a function is called with new:

  1. A new object is created.

  2. The object’s [[Prototype]] is set to Constructor.prototype.

  3. The constructor is called with this bound to the new object.

  4. If the constructor returns an object, that object is returned; otherwise, the new object is returned.

function Person(name) {
    this.name = name;
}

const alice = new Person("Alice");
console.log(alice.name);  // "Alice"

Returning an object overrides the new object:

function Person(name) {
    this.name = name;
    return { name: "Overridden" };
}

const bob = new Person("Bob");
console.log(bob.name);  // "Overridden"
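
The four steps can be written out by hand. The construct helper below is a hypothetical approximation for ordinary function constructors only: it ignores new.target, does not work with class constructors (which refuse to be called without new), and assumes Ctor.prototype is an object. Reflect.construct is the real built-in equivalent.

```javascript
// Hypothetical helper approximating `new Ctor(...args)`.
function construct(Ctor, ...args) {
    // Steps 1-2: create an object whose [[Prototype]] is Ctor.prototype.
    const instance = Object.create(Ctor.prototype);

    // Step 3: run the constructor with 'this' bound to the new object.
    const result = Ctor.apply(instance, args);

    // Step 4: an object (or function) return value replaces the new object.
    const returnedObject =
        (typeof result === "object" && result !== null) || typeof result === "function";
    return returnedObject ? result : instance;
}

function Person(name) {
    this.name = name;
}

function Overriding(name) {
    this.name = name;
    return { name: "Overridden" };
}

const alice = construct(Person, "Alice");
console.log(alice.name, alice instanceof Person);  // Alice true
console.log(construct(Overriding, "Bob").name);    // Overridden
```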

Arrow Functions: Lexical this

Arrow functions do not bind this—they inherit it from the enclosing scope:

const obj = {
    value: 42,
    regularFn: function() {
        setTimeout(function() {
            console.log(this.value);  // undefined ('this' is window in browsers, the Timeout object in Node.js)
        }, 100);
    },
    arrowFn: function() {
        setTimeout(() => {
            console.log(this.value);  // 42
        }, 100);
    }
};

obj.regularFn();  // undefined
obj.arrowFn();    // 42

Systems insight: An arrow function has no this binding of its own. Resolving this inside one walks the environment chain to the nearest enclosing function (or script or module) that does have a binding, just like any other captured variable.

The globalThis Object (ES2020)

Specification: §19.1 (p. 447)

globalThis provides a standard way to access the global object in any environment:

// Browser: globalThis === window
// Node.js: globalThis === global
// Web Worker: globalThis === self

console.log(globalThis);

Pre-ES2020 workaround:

const globalObject = (function() {
    return this || (new Function("return this"))();
})();

Higher-Order Functions: Functional Programming Patterns

Functions as Arguments: Callbacks and Iterators

Higher-order function: A function that takes or returns other functions.

Example: Array methods:

const numbers = [1, 2, 3, 4, 5];

const doubled = numbers.map(x => x * 2);
const evens = numbers.filter(x => x % 2 === 0);
const sum = numbers.reduce((acc, x) => acc + x, 0);

console.log(doubled);  // [2, 4, 6, 8, 10]
console.log(evens);    // [2, 4]
console.log(sum);      // 15

Custom higher-order function:

function repeat(n, action) {
    for (let i = 0; i < n; i++) {
        action(i);
    }
}

repeat(3, i => console.log(`Iteration ${i}`));

Functions as Return Values: Function Factories

Currying:

function add(a) {
    return function(b) {
        return a + b;
    };
}

const add5 = add(5);
console.log(add5(3));  // 8
console.log(add5(10)); // 15

Generic currying:

function curry(fn) {
    return function curried(...args) {
        if (args.length >= fn.length) {
            return fn(...args);
        }
        return (...nextArgs) => curried(...args, ...nextArgs);
    };
}

function multiply(a, b, c) {
    return a * b * c;
}

const curriedMultiply = curry(multiply);
console.log(curriedMultiply(2)(3)(4));      // 24
console.log(curriedMultiply(2, 3)(4));      // 24
console.log(curriedMultiply(2, 3, 4));      // 24

Systems note: Currying is fundamental in Haskell/ML. In JavaScript, it’s less common but useful for partial application.

Partial Application

.bind() for partial application:

function greet(greeting, name) {
    console.log(`${greeting}, ${name}!`);
}

const sayHello = greet.bind(null, "Hello");
sayHello("Alice");  // Hello, Alice!
sayHello("Bob");    // Hello, Bob!

Custom partial application:

function partial(fn, ...presetArgs) {
    return function(...laterArgs) {
        return fn(...presetArgs, ...laterArgs);
    };
}

const add = (a, b, c) => a + b + c;
const add5and10 = partial(add, 5, 10);
console.log(add5and10(3));  // 18

Composition and Pipelining

Function composition: Combine functions left-to-right or right-to-left:

const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);
const pipe = (...fns) => x => fns.reduce((acc, fn) => fn(acc), x);

const double = x => x * 2;
const increment = x => x + 1;
const square = x => x * x;

const composed = compose(square, increment, double);
console.log(composed(3));  // square(increment(double(3))) = square(7) = 49

const piped = pipe(double, increment, square);
console.log(piped(3));  // square(increment(double(3))) = square(7) = 49

Systems analogy: Function composition is like Unix pipes: cmd1 | cmd2 | cmd3.


Function Optimization: Inline Caching and Hidden Classes

Inline Caching: Speeding Up Property Access

Problem: Property lookup in JavaScript is expensive—walk the prototype chain, check property attributes, etc.

Solution: Inline caching (IC)—cache the result of property lookups.

Example:

function getX(obj) {
    return obj.x;
}

getX({ x: 1 });  // First call: IC miss, cache { x: 1 }'s shape
getX({ x: 2 });  // Second call: IC hit (same shape), fast path

IC states:

  1. Uninitialized: Never called.

  2. Monomorphic: Sees one object shape (fast).

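  3. Polymorphic: Sees a handful of shapes (up to four in V8); slower, but still cached at the call site.

  4. Megamorphic: Sees more shapes than the polymorphic limit; the call site gives up its own cache and falls back to a shared global cache or a generic lookup (slow).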

Systems lesson: Keep object shapes consistent for performance:

// Bad: Different shapes
function Point1(x, y) {
    this.x = x;
    this.y = y;
}

function Point2(y, x) {  // Different order!
    this.y = y;
    this.x = x;
}

const p1 = new Point1(1, 2);
const p2 = new Point2(2, 1);
// p1 and p2 have different hidden classes!

Good: Same shape:

function Point(x, y) {
    this.x = x;
    this.y = y;
}

const p1 = new Point(1, 2);
const p2 = new Point(3, 4);
// p1 and p2 share the same hidden class

Hidden Classes (Maps): V8’s Optimization

V8 internal: Objects don’t store property names—they store a pointer to a hidden class (also called a map or shape):

Object {
    map: HiddenClass
    properties: [value1, value2, ...]
}

HiddenClass {
    x: offset 0
    y: offset 1
}

Property access: obj.x → look up x in obj.map → get offset → load from obj.properties[offset].

Hidden class transitions:

const obj = {};
// Hidden class: C0 (empty)

obj.x = 1;
// Transition: C0 → C1 (has 'x')

obj.y = 2;
// Transition: C1 → C2 (has 'x' and 'y')

Adding properties in different orders creates different hidden classes:

const obj1 = {};
obj1.x = 1;
obj1.y = 2;
// Hidden class: C2

const obj2 = {};
obj2.y = 2;
obj2.x = 1;
// Hidden class: C2' (different from C2!)

Systems advice: Initialize all properties in the constructor in the same order:

class Point {
    constructor(x, y) {
        this.x = x;
        this.y = y;
    }
}

// All Point instances share the same hidden class

Inlining and Deoptimization

Inlining: The JIT compiler inlines small functions for performance:

function add(a, b) {
    return a + b;
}

function compute(x) {
    return add(x, 10);
}

// After JIT:
function compute(x) {
    return x + 10;  // 'add' is inlined
}

Deoptimization: If assumptions are violated (e.g., type changes), the JIT deoptimizes back to bytecode:

function add(a, b) {
    return a + b;
}

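add(1, 2);     // Feedback: small integers; the JIT can compile an integer fast path
add(1.5, 2.5); // Doubles: feedback widens to general numbers (deopt if already optimized for integers)
add("a", "b"); // Strings: a different operation entirely; optimized numeric code must deoptimize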

Systems lesson: Avoid polymorphic code in hot paths. Keep types stable.


Practical Patterns: Memoization, Debouncing, Throttling

Memoization: Caching Function Results

Pattern: Cache expensive function calls:

function memoize(fn) {
    const cache = new Map();
    return function(...args) {
        const key = JSON.stringify(args);
        if (cache.has(key)) {
            return cache.get(key);
        }
        const result = fn(...args);
        cache.set(key, result);
        return result;
    };
}

function fibonacci(n) {
    if (n <= 1) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

const memoizedFib = memoize(fibonacci);
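console.log(memoizedFib(40));  // First call is still slow: the recursion calls the unmemoized fibonacci
console.log(memoizedFib(40));  // Repeat call: cache hit, fast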

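Systems note: For recursive functions, the recursive calls must go through the memoized wrapper. Inside a named function expression the name refers to the original function, so recurse via the outer binding instead:

const fibonacci = memoize(function(n) {
    if (n <= 1) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);  // Calls memoized version
});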
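
Using JSON.stringify(args) as the cache key is convenient but lossy: undefined, NaN, and functions all serialize to null inside an array, so distinct calls can collide. One option, sketched below, is to let the caller supply the key function; memoizeWith is a hypothetical variant written for this illustration.

```javascript
// Distinct argument lists that produce the same JSON key.
const collides = JSON.stringify([undefined]) === JSON.stringify([null]); // true
const nanKey = JSON.stringify([NaN]);                                     // "[null]"

// Hypothetical variant: the caller decides how arguments map to a cache key.
function memoizeWith(keyOf, fn) {
    const cache = new Map();
    return function(...args) {
        const key = keyOf(...args);
        if (!cache.has(key)) {
            cache.set(key, fn.apply(this, args));
        }
        return cache.get(key);
    };
}

let calls = 0;
const square = memoizeWith((n) => n, (n) => {
    calls++;
    return n * n;
});

square(4);
square(4);
console.log(collides, nanKey); // true [null]
console.log(calls);            // 1
```

Because Map compares keys with SameValueZero, using the argument itself as the key also handles NaN and object identity correctly for single-argument functions.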

Debouncing: Delay Execution Until Idle

Pattern: Execute a function only after a delay since the last call:

function debounce(fn, delay) {
    let timeoutId;
    return function(...args) {
        clearTimeout(timeoutId);
        timeoutId = setTimeout(() => fn(...args), delay);
    };
}

const search = debounce((query) => {
    console.log(`Searching for: ${query}`);
}, 300);

// User types: "h" → "he" → "hel" → "hell" → "hello"
// Only one search after 300ms of no typing

Use case: Search boxes, resize handlers.

Throttling: Limit Execution Rate

Pattern: Execute at most once per interval:

function throttle(fn, interval) {
    let lastCall = 0;
    return function(...args) {
        const now = Date.now();
        if (now - lastCall >= interval) {
            lastCall = now;
            fn(...args);
        }
    };
}

const logScroll = throttle(() => {
    console.log("Scrolled!");
}, 1000);

window.addEventListener("scroll", logScroll);
// Logs at most once per second

Use case: Scroll handlers, mouse move handlers.


Generators and Iterators: Pausable Functions

Generator Functions: function* and yield

Specification: §15.5 (pp. 332-340)

Syntax:

function* generatorFn() {
    yield 1;
    yield 2;
    yield 3;
}

const gen = generatorFn();
console.log(gen.next());  // { value: 1, done: false }
console.log(gen.next());  // { value: 2, done: false }
console.log(gen.next());  // { value: 3, done: false }
console.log(gen.next());  // { value: undefined, done: true }

Generators are iterators:

for (const value of generatorFn()) {
    console.log(value);  // 1, 2, 3
}

Infinite generators:

function* fibonacci() {
    let [a, b] = [0, 1];
    while (true) {
        yield a;
        [a, b] = [b, a + b];
    }
}

const fib = fibonacci();
console.log(fib.next().value);  // 0
console.log(fib.next().value);  // 1
console.log(fib.next().value);  // 1
console.log(fib.next().value);  // 2
console.log(fib.next().value);  // 3

yield*: Delegating to Another Generator

function* gen1() {
    yield 1;
    yield 2;
}

function* gen2() {
    yield* gen1();
    yield 3;
}

console.log([...gen2()]);  // [1, 2, 3]

Generators as Coroutines: Bidirectional Communication

next(value) passes a value back to the generator:

function* dialogue() {
    const name = yield "What's your name?";
    const age = yield `Hello, ${name}! How old are you?`;
    return `${name} is ${age} years old.`;
}

const conv = dialogue();
console.log(conv.next().value);        // "What's your name?"
console.log(conv.next("Alice").value); // "Hello, Alice! How old are you?"
console.log(conv.next(30).value);      // "Alice is 30 years old."

Systems note: This is similar to Lua coroutines or Python generators—pausable functions that can resume with input.
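
As a further illustration of the coroutine style, here is a minimal sketch of a generator that keeps state between resumptions. The name runningAverage is invented for this example; the first next() call only primes the generator, and any argument passed to it would be discarded.

```javascript
// Hypothetical coroutine: receives samples through next(x), yields the running mean.
function* runningAverage() {
    let sum = 0;
    let count = 0;
    let mean;
    while (true) {
        const x = yield mean;  // hand out the current mean, then wait for the next sample
        sum += x;
        count++;
        mean = sum / count;
    }
}

const meter = runningAverage();
meter.next();                          // prime: run to the first yield
console.log(meter.next(10).value);     // 10
console.log(meter.next(20).value);     // 15
console.log(meter.next(30).value);     // 20
const finished = meter.return("done"); // terminate the coroutine from outside
console.log(finished);                 // { value: 'done', done: true }
```

Calling return() (or throw()) resumes the generator at the paused yield and unwinds it, which is how a caller can cancel a coroutine that would otherwise loop forever.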

Async Generators (ES2018)

Specification: §27.7 (pp. 718-722)

Combine async/await with generators:

async function* fetchPages() {
    let page = 1;
    while (page <= 3) {
        const response = await fetch(`https://api.example.com/page/${page}`);
        const data = await response.json();
        yield data;
        page++;
    }
}

(async () => {
    for await (const data of fetchPages()) {
        console.log(data);
    }
})();

End of Chapter 4

In the next chapter, we’ll explore JavaScript’s object model—prototypes, inheritance, property descriptors, proxies, and how engines optimize object operations through hidden classes and inline caches.


Chapter 5: Objects, Prototypes, and the Class Syntax

Objects: Property Bags, Dictionaries, and More

The Object Type: JavaScript’s Fundamental Composite

Specification: §6.1.7 (pp. 73-80)

In JavaScript, everything except primitives is an object. An object is a collection of properties, where each property is a key-value pair. The key is always a string or Symbol; the value can be any type.

Object as a property bag:

const person = {
    name: "Alice",
    age: 30,
    greet: function() {
        console.log(`Hello, I'm ${this.name}`);
    }
};

console.log(person.name);  // "Alice"
person.greet();            // "Hello, I'm Alice"

Internal structure (§6.1.7.1, pp. 74-75):

Object {
    [[Prototype]]: Object.prototype
    [[Extensible]]: true
    properties: {
        "name":  { value: "Alice", writable: true, enumerable: true, configurable: true }
        "age":   { value: 30, writable: true, enumerable: true, configurable: true }
        "greet": { value: <function>, writable: true, enumerable: true, configurable: true }
    }
}

Each property is represented by a Property Descriptor with attributes:

  • [[Value]]: The property’s value.

  • [[Writable]]: Can the value be changed?

  • [[Enumerable]]: Does it show up in for...in loops?

  • [[Configurable]]: Can the property be deleted or its attributes changed?

For accessors (getters/setters):

  • [[Get]]: Getter function.

  • [[Set]]: Setter function.

  • [[Enumerable]] and [[Configurable]] (same as data properties).
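
The attributes listed above can be observed from ordinary code with Object.getOwnPropertyDescriptor. The sketch below is illustrative only; the object point and its norm getter are made up for the example.

```javascript
// Hypothetical object used only for illustration.
const point = {
    x: 3,
    y: 4,
    get norm() { return Math.hypot(this.x, this.y); }
};

// Data property created by a literal: all three flags default to true.
const dataDesc = Object.getOwnPropertyDescriptor(point, "x");
console.log(dataDesc);
// { value: 3, writable: true, enumerable: true, configurable: true }

// Accessor property: get/set take the place of value/writable.
const accDesc = Object.getOwnPropertyDescriptor(point, "norm");
console.log(typeof accDesc.get, accDesc.set);  // function undefined
console.log("value" in accDesc);                // false
console.log(point.norm);                        // 5
```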

Object Creation: Literals, Constructors, and Object.create

Object Literal Syntax

Specification: §13.2.5 (pp. 247-251)

const obj = {
    x: 1,
    y: 2,
    method() {
        return this.x + this.y;
    }
};

Shorthand property names (ES6):

const x = 10, y = 20;
const obj = { x, y };  // Equivalent to { x: x, y: y }

Computed property names:

const propName = "dynamicKey";
const obj = {
    [propName]: "value",
    [`${propName}_2`]: "value2"
};

console.log(obj.dynamicKey);    // "value"
console.log(obj.dynamicKey_2);  // "value2"

Spread syntax (ES2018):

const obj1 = { a: 1, b: 2 };
const obj2 = { ...obj1, c: 3 };
console.log(obj2);  // { a: 1, b: 2, c: 3 }

// Shallow copy
const copy = { ...obj1 };

Constructor Functions

function Person(name, age) {
    this.name = name;
    this.age = age;
}

Person.prototype.greet = function() {
    console.log(`Hello, I'm ${this.name}`);
};

const alice = new Person("Alice", 30);
alice.greet();  // "Hello, I'm Alice"

What new does (§20.2.3.1, pp. 489-490):

  1. Create a new empty object: const obj = {}.

  2. Set obj.[[Prototype]] to Constructor.prototype.

  3. Call Constructor.call(obj, ...args) (bind this to obj).

  4. If the constructor returns an object, return that; otherwise, return obj.

Pseudo-code:

function myNew(Constructor, ...args) {
    const obj = Object.create(Constructor.prototype);
    const result = Constructor.apply(obj, args);
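    // A returned object or function replaces obj; primitives are ignored
    const isObject = result !== null && (typeof result === 'object' || typeof result === 'function');
    return isObject ? result : obj;
}

Object.create(): Explicit Prototype Specification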

Specification: §20.1.2.2 (pp. 461-462)

const proto = {
    greet() {
        console.log(`Hello, I'm ${this.name}`);
    }
};

const alice = Object.create(proto);
alice.name = "Alice";
alice.greet();  // "Hello, I'm Alice"

Create an object with null prototype (no inherited properties):

const pureMap = Object.create(null);
pureMap.toString = "custom";  // No conflict with Object.prototype.toString
console.log(pureMap.toString);  // "custom"

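Systems insight: Object.create(null) is used for dictionaries keyed by untrusted strings. With no prototype, keys such as "toString" or "constructor" cannot collide with inherited members, which removes one avenue for prototype pollution. For new code, Map is usually the better dictionary type.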
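
To make the collision concrete, the sketch below counts words whose spellings happen to match inherited property names. The helper countWords is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: tallies each word into the supplied table object.
function countWords(words, table) {
    for (const w of words) {
        table[w] = (table[w] || 0) + 1;
    }
    return table;
}

const words = ["constructor", "toString", "apple"];

// Plain object: the inherited function is truthy, so "+ 1" concatenates onto its source text.
const plain = countWords(words, {});
console.log(typeof plain.constructor);  // string (corrupted count)
console.log(plain.apple);               // 1

// Null-prototype object: every key starts out absent.
const safe = countWords(words, Object.create(null));
console.log(safe.constructor, safe.toString, safe.apple);  // 1 1 1
console.log("hasOwnProperty" in safe);                     // false
```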

Property Access: Dot vs. Bracket Notation

Dot notation:

const obj = { name: "Alice" };
console.log(obj.name);  // "Alice"

Bracket notation:

console.log(obj["name"]);  // "Alice"

const key = "name";
console.log(obj[key]);  // "Alice"

When to use bracket notation:

  1. Dynamic keys: obj[variableKey].

  2. Invalid identifiers: obj["invalid-key"], obj["123"].

  3. Symbols: obj[Symbol.iterator].

Property access algorithm (§13.3.2, pp. 251-252):

  1. Evaluate the object reference.

  2. Evaluate the property key (convert to string or Symbol).

  3. Perform GetValue via the internal method [[Get]], which traverses the prototype chain until it finds a matching property key or returns undefined.

Prototype lookup pseudocode:

function Get(obj, key) {
    let current = obj;
    while (current !== null) {
        if (Object.hasOwn(current, key)) return current[key];
        current = Object.getPrototypeOf(current);
    }
    return undefined;
}

If the property is not found on the object itself, the lookup continues upward through each object’s [[Prototype]] link until reaching null.


The Prototype Chain and Inheritance

The prototype Property of Constructor Functions

In JavaScript, constructor functions automatically have a .prototype property that becomes the [[Prototype]] of instances created with new.

function Person(name) {
    this.name = name;
}
Person.prototype.sayHi = function() {
    console.log(`Hi, I'm ${this.name}`);
};

const alice = new Person('Alice');
alice.sayHi();  // from Person.prototype

Internally:

alice.[[Prototype]] → Person.prototype
Person.prototype.[[Prototype]] → Object.prototype
Object.prototype.[[Prototype]] → null

Thus, every property lookup climbs the chain until a match is found or termination occurs at null.


Manipulating the Prototype Directly

  • Object.getPrototypeOf(obj) — returns the current [[Prototype]].

  • Object.setPrototypeOf(obj, proto) — changes the link (discouraged at runtime for performance).

Creating objects with a specific prototype:

const proto = { greet() { console.log('Hello'); } };
const obj = Object.create(proto);
obj.greet(); // "Hello"

Changing an object's prototype after creation can invalidate hidden-class and inline-cache optimizations (see below), so set the prototype once, at creation time.


Classes: Modern Syntax for Prototype Inheritance

Basic Form

Introduced in ES2015, the class keyword provides a declarative layer over traditional prototype mechanics.

class Person {
    constructor(name) {
        this.name = name;
    }
    greet() {
        console.log(`Hello, I'm ${this.name}`);
    }
}

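Roughly equivalent pre-ES2015 code (a class additionally throws when called without new, always runs in strict mode, and makes its methods non-enumerable):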

function Person(name) { this.name = name; }
Person.prototype.greet = function() { console.log(`Hello, I'm ${this.name}`); };

Adding Inheritance with extends

class Employee extends Person {
    constructor(name, role) {
        super(name);
        this.role = role;
    }
    describe() {
        console.log(`${this.name} works as ${this.role}`);
    }
}

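The super() call invokes the parent constructor. In a derived class it must run before any use of this, because the parent constructor is what creates and initializes the instance; touching this earlier throws a ReferenceError.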

Prototype chain under extends:

Employee.prototype.[[Prototype]] → Person.prototype
Employee.[[Prototype]] → Person

Static methods on the parent class are inherited by the subclass through the second prototype link.
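
Both links can be checked directly. The following sketch re-declares minimal versions of the two classes; the static factory create is an assumption added for the example and is not part of the classes shown earlier.

```javascript
class Person {
    constructor(name) { this.name = name; }
    // Hypothetical static factory; `this` is whichever class it is called on.
    static create(name) { return new this(name); }
}
class Employee extends Person {}

console.log(Object.getPrototypeOf(Employee) === Person);                      // true
console.log(Object.getPrototypeOf(Employee.prototype) === Person.prototype);  // true

const e = Employee.create("Bob");                // found via Employee.[[Prototype]]
console.log(e instanceof Employee, e.name);      // true Bob
console.log(Object.hasOwn(Employee, "create"));  // false: inherited, not copied
```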


Static vs. Instance Methods

  • Instance methods live on the prototype and are shared by all instances.

  • Static methods belong directly to the constructor (class object itself).

class MathUtil {
    static clamp(value, min, max) {
        return Math.min(Math.max(value, min), max);
    }
}
console.log(MathUtil.clamp(10, 0, 5));  // 5

Static methods do not appear on instances.


this Binding in Methods and Arrow Functions

Within class method bodies, this refers to the instance. Arrow functions capture this lexically:

class Counter {
    constructor() {
        this.c = 0;
    }
    inc() {
        this.c++;
    }
    incAsync() {
        setTimeout(() => this.inc(), 100);  // lexical this
    }
}

Using traditional functions inside setTimeout would lose this unless bound manually (.bind(this)), so arrow functions are common for callbacks.
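
The loss of this is easy to reproduce without timers. A minimal sketch, using a cut-down Counter so that it runs synchronously:

```javascript
class Counter {
    constructor() { this.c = 0; }
    inc() { this.c++; }
}

const counter = new Counter();

const detached = counter.inc;   // method extracted without its receiver
let threw = false;
try {
    detached();                 // class bodies are strict: `this` is undefined here
} catch (err) {
    threw = err instanceof TypeError;
}
console.log(threw);             // true

const bound = counter.inc.bind(counter);  // fix 1: bind the receiver explicitly
bound();
const viaArrow = () => counter.inc();     // fix 2: call through an arrow function
viaArrow();
console.log(counter.c);         // 2
```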


Object Representation in Engines

Engines like V8, SpiderMonkey, and JavaScriptCore translate objects into internal “hidden classes” (also called shapes).

  • When an object’s property layout changes, the engine may create a new hidden class.

  • Each hidden class maps property names → memory offsets.

  • Consistent object shapes enable Inline Caching (IC)—fast access when the same property is read repeatedly on likewise-shaped objects.

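Example of shape transitions:

const obj = {};
obj.a = 1;  // transition to shape #1 ({a})
obj.b = 2;  // transition to shape #2 ({a, b})

Objects that receive the same properties in a different order end up with different hidden classes, so code that handles both becomes polymorphic and its inline caches are less effective. Initialize properties in one consistent order, ideally in the constructor.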
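
Hidden classes are not observable from JavaScript, so the sketch below uses key order as a visible stand-in for shape. The names makePoint and sumX are illustrative, and the comments about monomorphic access describe typical engine behavior rather than anything the language guarantees.

```javascript
// Same properties, different insertion order: engines typically assign different shapes.
const p1 = { x: 1, y: 2 };
const p2 = { y: 2, x: 1 };
console.log(Object.keys(p1));  // [ 'x', 'y' ]
console.log(Object.keys(p2));  // [ 'y', 'x' ]

// A single factory keeps the insertion order, and therefore the shape, uniform.
function makePoint(x, y) {
    return { x, y };
}

function sumX(points) {
    let total = 0;
    for (const p of points) total += p.x;  // one shape at this site keeps the access monomorphic
    return total;
}

const points = [makePoint(1, 2), makePoint(3, 4), makePoint(5, 6)];
console.log(sumX(points));  // 9
```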


Enumerability and Property Inspection

  • Object.keys(obj) lists own enumerable string keys.

  • Object.getOwnPropertyNames(obj) returns all own string keys, enumerable or not.

  • Object.getOwnPropertySymbols(obj) lists own Symbol properties.

  • Reflect.ownKeys(obj) combines both string and Symbol keys.

Iteration with for...in visits enumerable string-keyed properties only, both own and inherited, walking the entire prototype chain.
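
A short sketch contrasting these inspection functions on one object. The property names are arbitrary and chosen only to cover each case (enumerable, non-enumerable, Symbol-keyed, inherited).

```javascript
const sym = Symbol("id");
const base = { inherited: true };
const obj = Object.create(base);
obj.visible = 1;
obj[sym] = 2;
Object.defineProperty(obj, "hidden", { value: 3, enumerable: false });

console.log(Object.keys(obj));                 // [ 'visible' ]
console.log(Object.getOwnPropertyNames(obj));  // [ 'visible', 'hidden' ]
console.log(Object.getOwnPropertySymbols(obj).length);  // 1
console.log(Reflect.ownKeys(obj));             // [ 'visible', 'hidden', Symbol(id) ]

const seen = [];
for (const key in obj) seen.push(key);         // own keys first, then the prototype's
console.log(seen);                             // [ 'visible', 'inherited' ]
```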


Controlling Mutability: Sealing, Freezing, Extensions

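Method                       Prevent Add   Prevent Delete   Prevent Reconfigure   Prevent Write
Object.preventExtensions()   Yes           No               No                    No
Object.seal()                Yes           Yes              Yes                   No
Object.freeze()              Yes           Yes              Yes                   Yes

Each returns the same object reference after applying its restrictions, and all three are shallow: nested objects are unaffected. To check the current state, use Object.isExtensible(obj), Object.isSealed(obj), or Object.isFrozen(obj).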
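
The following sketch shows what a frozen object rejects. It runs inside a strict-mode function because violations throw a TypeError only in strict mode; in sloppy mode they fail silently. The config object is invented for the example.

```javascript
function demo() {
    "use strict";  // violations throw here instead of being ignored
    const config = Object.freeze({ port: 8080, tls: { enabled: true } });

    const errors = [];
    try { config.port = 9090; } catch (e) { errors.push(e.constructor.name); }   // write
    try { config.host = "x"; } catch (e) { errors.push(e.constructor.name); }    // add
    try { delete config.port; } catch (e) { errors.push(e.constructor.name); }   // delete

    config.tls.enabled = false;  // allowed: the nested object was never frozen
    return { config, errors };
}

const { config, errors } = demo();
console.log(errors);                           // [ 'TypeError', 'TypeError', 'TypeError' ]
console.log(config.port, config.tls.enabled);  // 8080 false
console.log(Object.isFrozen(config), Object.isFrozen(config.tls));  // true false
```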


Proxies and the Reflect API

Proxies

Proxies intercept fundamental operations such as property reads and writes, function calls, and construction (§28.2 in ECMA-262).

const target = { message: "Hi" };
const handler = {
    get(obj, prop, receiver) {
        console.log(`Accessing ${prop}`);
        return obj[prop];
    }
};
const proxy = new Proxy(target, handler);
console.log(proxy.message); // Logs "Accessing message", then prints "Hi"

Internally, the get trap intercepts the [[Get]] operation; other traps include set, has, deleteProperty, construct, etc.

Reflect API

Reflect exposes the same internal operations as plain functions, one per Proxy trap and with matching signatures, so a trap can forward to the default behavior:

Reflect.get(target, "message");
Reflect.set(target, "message", "Hello");

When designing Proxy handlers, prefer delegating to Reflect methods for consistent behavior.
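
A minimal sketch of a handler that delegates to Reflect. The wrapper numericOnly is hypothetical; it logs reads and rejects non-numeric writes, and passes receiver through so that getters and setters on the target still see the proxy as this.

```javascript
// Hypothetical validation wrapper around any target object.
function numericOnly(target, log = []) {
    const proxy = new Proxy(target, {
        get(obj, prop, receiver) {
            log.push(`get ${String(prop)}`);          // String() also copes with Symbol keys
            return Reflect.get(obj, prop, receiver);  // default [[Get]] behavior
        },
        set(obj, prop, value, receiver) {
            if (typeof value !== "number") {
                throw new TypeError(`${String(prop)} must be a number`);
            }
            return Reflect.set(obj, prop, value, receiver);  // default [[Set]] behavior
        }
    });
    return { proxy, log };
}

const { proxy, log } = numericOnly({ x: 1 });
proxy.x = 5;
console.log(proxy.x);   // 5
console.log(log);       // [ 'get x' ]

let rejected = false;
try { proxy.x = "five"; } catch (e) { rejected = e instanceof TypeError; }
console.log(rejected);  // true
```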


Accessor and Data Properties

To define or inspect property attributes:

const obj = {};
Object.defineProperty(obj, "x", {
    value: 1,
    writable: false,
    enumerable: true,
    configurable: false
});

console.log(Object.getOwnPropertyDescriptor(obj, "x"));

Accessor example:

Object.defineProperty(obj, "y", {
    get() { return this._y; },
    set(v) { this._y = v; },
    enumerable: true
});

Defining properties precisely allows you to implement computed members and validations without exposing backing storage directly.
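
One way to keep the backing storage truly hidden is to hold it in a closure rather than in a sibling property. The helper defineNumber below is an assumption made for this sketch, not an established API.

```javascript
// Hypothetical helper: defines `name` on `obj` as an accessor that accepts only finite numbers.
function defineNumber(obj, name, initial) {
    let stored = initial;  // backing storage lives in the closure, not on the object
    Object.defineProperty(obj, name, {
        get() { return stored; },
        set(v) {
            if (!Number.isFinite(v)) {
                throw new RangeError(`${name} must be a finite number`);
            }
            stored = v;
        },
        enumerable: true,
        configurable: false
    });
    return obj;
}

const sensor = defineNumber({}, "temperature", 20);
sensor.temperature = 21.5;
console.log(sensor.temperature);   // 21.5
console.log(Object.keys(sensor));  // [ 'temperature' ]

let failed = false;
try { sensor.temperature = NaN; } catch (e) { failed = e instanceof RangeError; }
console.log(failed, sensor.temperature);  // true 21.5
```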


The super Keyword and Lexical Home Objects

Inside a method using super, JavaScript references the method on the prototype of the current home object—the lexical container where the method was defined—not the runtime receiver.

class A {
    greet() { console.log("Hi from A"); }
}
class B extends A {
    greet() {
        super.greet();
        console.log("Hi from B");
    }
}
new B().greet();  // "Hi from A", then "Hi from B"

Call resolution:

  1. B.prototype.greet uses its [[HomeObject]] = B.prototype.

  2. super.greet reads from Object.getPrototypeOf(B.prototype) → A.prototype.

This ensures super behaves predictably even when functions are borrowed or rebound.
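
The difference between the home object and the receiver shows up when a method is borrowed. In this sketch the object other is invented for the demonstration; its own parent also defines greet, yet super never consults it.

```javascript
class A {
    greet() { return "A"; }
}
class B extends A {
    greet() { return `${super.greet()} -> B`; }
}

// An unrelated object whose prototype has its own greet().
const other = Object.create({ greet() { return "other's parent"; } });

// Borrow B's method: `this` becomes `other`, but super still starts at A.prototype.
const borrowed = B.prototype.greet;
console.log(borrowed.call(other));  // A -> B
console.log(new B().greet());       // A -> B
```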


Summary

  • Objects are extensible property maps managed through descriptors.

  • Prototype chains handle inheritance and influence lookup semantics.

  • Classes provide syntactic sugar around prototype linkage and constructor invocation.

  • Hidden classes and inline caches optimize property access internally.

  • Proxies and Reflect expose or intercept fundamental object operations.

  • Descriptors and freeze/seal control configurability and immutability.

  • super relies on lexical [[HomeObject]] for predictable prototype method delegation.

Together, these mechanisms define the structure and performance behavior of objects—the backbone of all JavaScript execution models.


Chapter 6: Arrays, Typed Arrays, and Buffers

Arrays: JavaScript’s Flexible Ordered Collections

The Array Type: Objects with Special Length Behavior

Specification: §23.1 (pp. 545-584)

In JavaScript, arrays are specialized objects where integer-indexed properties (0, 1, 2, …) are automatically tracked via a magic length property. Arrays inherit from Array.prototype, gaining powerful iteration and transformation methods.

Basic array creation:

const arr = [1, 2, 3];
console.log(arr.length);  // 3
console.log(arr[0]);      // 1

arr[5] = 10;
console.log(arr.length);  // 6 (automatically updated)
console.log(arr);         // [1, 2, 3, <2 empty items>, 10]

Internal representation (conceptual):

Array {
    [[Prototype]]: Array.prototype
    0: 1
    1: 2
    2: 3
    5: 10
    length: 6    // automatically maintained
}

The length property is writable: setting it truncates or extends the array.

arr.length = 3;
console.log(arr);  // [1, 2, 3] (elements beyond index 2 removed)

arr.length = 5;
console.log(arr);  // [1, 2, 3, <2 empty items>]

Array Creation Methods

Array Literals

const empty = [];
const nums = [1, 2, 3];
const mixed = [1, "two", { three: 3 }, [4, 5]];

Sparse arrays (with holes):

const sparse = [1, , 3];
console.log(sparse.length);  // 3
console.log(1 in sparse);    // false (no property at index 1)

Array Constructor

const arr1 = new Array(3);        // [<3 empty items>], length = 3
const arr2 = new Array(1, 2, 3);  // [1, 2, 3]
const arr3 = Array.of(3);         // [3] (avoids ambiguity)

Note: new Array(n) creates a sparse array of length n if n is a single number.
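
Because map skips holes, the sparse result of new Array(n) cannot be filled by mapping over it directly. The sketch below shows two common ways to obtain a dense array of a given length instead.

```javascript
const sparse = new Array(3);
const mapped = sparse.map((_, i) => i * 2);  // callback never runs: all three slots are holes
console.log(mapped);                         // [ <3 empty items> ]

const dense1 = Array.from({ length: 3 }, (_, i) => i * 2);  // treats missing indices as undefined
const dense2 = new Array(3).fill(0).map((_, i) => i * 2);   // fill() creates real elements first
console.log(dense1);                         // [ 0, 2, 4 ]
console.log(dense2);                         // [ 0, 2, 4 ]
console.log(0 in mapped, 0 in dense1);       // false true
```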

Array.from(): Converting Iterables to Arrays

Specification: §23.1.2.1 (pp. 547-548)

// From string
Array.from("hello");  // ['h', 'e', 'l', 'l', 'o']

// From Set
Array.from(new Set([1, 2, 2, 3]));  // [1, 2, 3]

// With mapping function
Array.from([1, 2, 3], x => x * 2);  // [2, 4, 6]

// From array-like object
const arrayLike = { 0: 'a', 1: 'b', length: 2 };
Array.from(arrayLike);  // ['a', 'b']

Array-like objects have a length property and indexed elements, but lack array methods. Array.from() converts them to real arrays.

Spread Syntax (ES6)

const arr1 = [1, 2];
const arr2 = [...arr1, 3, 4];  // [1, 2, 3, 4]

// Shallow copy
const copy = [...arr1];

// Concatenation
const combined = [...arr1, ...arr2];

Array Methods: Iteration and Transformation

Specification: §23.1.3 (pp. 549-584)

Mutating Methods

Method                                 Description              Returns
push(item)                             Add to end               New length
pop()                                  Remove from end          Removed item
shift()                                Remove from start        Removed item
unshift(item)                          Add to start             New length
splice(start, deleteCount, ...items)   Remove/insert at index   Removed items array
reverse()                              Reverse in place         The array itself
sort([compareFn])                      Sort in place            The array itself

const arr = [3, 1, 4, 1, 5];

arr.push(9);           // [3, 1, 4, 1, 5, 9]
arr.pop();             // [3, 1, 4, 1, 5]
arr.unshift(0);        // [0, 3, 1, 4, 1, 5]
arr.shift();           // [3, 1, 4, 1, 5]

arr.splice(2, 1, 2);   // Remove 1 item at index 2, insert 2
                       // [3, 1, 2, 1, 5]

arr.reverse();         // [5, 1, 2, 1, 3]
arr.sort();            // [1, 1, 2, 3, 5] (lexicographic by default!)

Sorting gotcha: Default sort converts elements to strings!

[10, 2, 1].sort();  // [1, 10, 2] (string comparison!)

// Numeric sort:
[10, 2, 1].sort((a, b) => a - b);  // [1, 2, 10]

Non-Mutating Methods

Method              Description            Returns
concat(...arrays)   Merge arrays           New array
slice(start, end)   Extract subarray       New array
join(separator)     Join to string         String
indexOf(item)       First index of item    Index or -1
lastIndexOf(item)   Last index of item     Index or -1
includes(item)      Check presence (ES7)   Boolean

const arr = [1, 2, 3, 4, 5];

arr.slice(1, 3);           // [2, 3]
arr.concat([6, 7]);        // [1, 2, 3, 4, 5, 6, 7]
arr.join('-');             // "1-2-3-4-5"
arr.indexOf(3);            // 2
arr.includes(4);           // true

Higher-Order Methods (Iteration)

Specification: §23.1.3.10, §23.1.3.15, §23.1.3.7, §23.1.3.8 (pp. 554-564)

Method                Description                             Returns
forEach(fn)           Execute fn for each                     undefined
map(fn)               Transform each element                  New array
filter(fn)            Keep elements where fn returns truthy   New array
reduce(fn, initial)   Accumulate values                       Single value
find(fn)              First element where fn returns truthy   Element or undefined
findIndex(fn)         Index of first match                    Index or -1
some(fn)              Any element passes test                 Boolean
every(fn)             All elements pass test                  Boolean

const nums = [1, 2, 3, 4, 5];

nums.forEach(x => console.log(x));

const doubled = nums.map(x => x * 2);         // [2, 4, 6, 8, 10]
const evens = nums.filter(x => x % 2 === 0);  // [2, 4]
const sum = nums.reduce((acc, x) => acc + x, 0);  // 15

const firstEven = nums.find(x => x % 2 === 0);       // 2
const firstEvenIdx = nums.findIndex(x => x % 2 === 0);  // 1

nums.some(x => x > 10);   // false
nums.every(x => x > 0);   // true

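Key insight: None of these methods mutates the array itself. forEach returns undefined, so it is useful only for its side effects, and a callback passed to any of them can still modify the array or its elements.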

Performance consideration: Chaining multiple array methods creates intermediate arrays. For large datasets, consider single-pass iteration or generator-based approaches.
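
A generator-based pipeline might be sketched as follows. The helpers filter, map, and take are illustrative stand-ins written for this example, not built-ins; each pulls one element at a time from the stage before it, so no intermediate array is allocated and an infinite source is fine.

```javascript
// Lazy, generator-based counterparts of Array's filter/map, plus a limiter.
function* filter(iterable, pred) {
    for (const x of iterable) if (pred(x)) yield x;
}
function* map(iterable, fn) {
    for (const x of iterable) yield fn(x);
}
function* take(iterable, n) {
    if (n <= 0) return;
    let i = 0;
    for (const x of iterable) {
        yield x;
        if (++i >= n) return;  // returning closes the upstream generators too
    }
}

let visited = 0;  // counts how many source elements were actually produced
function* naturals() {
    for (let n = 1; ; n++) { visited++; yield n; }
}

const evens = filter(naturals(), x => x % 2 === 0);
const firstThree = [...take(map(evens, x => x * 10), 3)];
console.log(firstThree);  // [ 20, 40, 60 ]
console.log(visited);     // 6
```

Only six source values are generated to produce three results, whereas an eager filter/map chain would process the whole input before slicing.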

Array Holes and Sparse Arrays

Arrays can have holes—missing indices that don’t hold a value.

const sparse = [1, , 3];
console.log(sparse.length);  // 3
console.log(1 in sparse);    // false

sparse.forEach(x => console.log(x));  // 1, 3 (skips hole)
sparse.map(x => x * 2);               // [2, <1 empty item>, 6]

Holes vs. undefined:

const withUndefined = [1, undefined, 3];
console.log(1 in withUndefined);  // true (has property)

const withHole = [1, , 3];
console.log(1 in withHole);       // false (no property)

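The callback-based methods from ES5 (forEach, map, filter, reduce, some, every) skip holes, and map preserves them in its result. Newer facilities (find, findIndex, includes, for...of, spread, Array.from) treat a hole as undefined.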
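
The differences are easiest to see side by side. A small sketch on a three-element array with one hole:

```javascript
const holey = [1, , 3];

const visitedByForEach = [];
holey.forEach(x => visitedByForEach.push(x));
console.log(visitedByForEach);           // [ 1, 3 ]  (hole skipped)

console.log(Object.keys(holey));         // [ '0', '2' ]  (no key for the hole)
console.log([...holey]);                 // [ 1, undefined, 3 ]
console.log(Array.from(holey));          // [ 1, undefined, 3 ]
console.log(holey.findIndex(x => x === undefined));  // 1  (findIndex visits holes)
console.log(holey.includes(undefined));  // true
console.log(holey.indexOf(undefined));   // -1  (indexOf skips holes)
```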


Typed Arrays: Fixed-Size Binary Data Views

Motivation: Performance and Binary Data

From wasm-defguide.pdf:

“JavaScript is a flexible and dynamic language, but it has not historically made it easy or efficient to deal with individual bytes of large data sets. This complicates the use of low-level libraries, as the data has to be copied into and out of JavaScript-native formats, which is inefficient.”

Typed Arrays were introduced for WebGL and now power:

  • Canvas 2D

  • XMLHttpRequest2

  • File API

  • WebSockets (binary)

  • WebAssembly memory

The ArrayBuffer: Raw Binary Data

Specification: §25.1 (pp. 636-638)

An ArrayBuffer represents a fixed-length raw binary data buffer—just bytes in memory.

const buffer = new ArrayBuffer(16);  // 16 bytes
console.log(buffer.byteLength);      // 16

You cannot read or write an ArrayBuffer directly; you need a view onto it, either a typed array or a DataView.

Typed Array Views: Interpreting Bytes

Specification: §23.2 (pp. 585-618)

Typed arrays are views over an ArrayBuffer, interpreting bytes according to a specific type.

Available types:

Type                Bytes per Element   C Equivalent        Range
Int8Array           1                   int8_t              -128 to 127
Uint8Array          1                   uint8_t             0 to 255
Uint8ClampedArray   1                   uint8_t (clamped)   0 to 255
Int16Array          2                   int16_t             -32768 to 32767
Uint16Array         2                   uint16_t            0 to 65535
Int32Array          4                   int32_t             $-2^{31}$ to $2^{31}-1$
Uint32Array         4                   uint32_t            0 to $2^{32}-1$
Float32Array        4                   float               IEEE 754 single
Float64Array        8                   double              IEEE 754 double
BigInt64Array       8                   int64_t             $-2^{63}$ to $2^{63}-1$
BigUint64Array      8                   uint64_t            0 to $2^{64}-1$

Creating Typed Arrays

// From length (creates new buffer)
const u8 = new Uint8Array(10);
console.log(u8.length);        // 10
console.log(u8.byteLength);    // 10
console.log(u8.buffer);        // ArrayBuffer(10)

// From array
const u32 = new Uint32Array([1, 2, 3]);

// From existing buffer
const buffer = new ArrayBuffer(16);
const view1 = new Uint32Array(buffer);     // 4 elements (16 / 4)
const view2 = new Uint8Array(buffer);      // 16 elements (16 / 1)
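
Unlike C, where some out-of-range conversions are undefined or implementation-defined, typed array stores have fully specified conversion rules. The sketch below shows wrap-around for the ordinary integer types and saturation for Uint8ClampedArray.

```javascript
const wrapped = new Uint8Array(3);
wrapped[0] = 256;   // reduced modulo 256
wrapped[1] = -1;    // wraps to 255
wrapped[2] = 3.7;   // fractional part dropped
console.log(Array.from(wrapped));  // [ 0, 255, 3 ]

const clamped = new Uint8ClampedArray(3);
clamped[0] = 256;   // saturates at 255
clamped[1] = -1;    // saturates at 0
clamped[2] = 3.7;   // rounded to nearest
console.log(Array.from(clamped));  // [ 255, 0, 4 ]

const signed = new Int8Array([127, 128, 200]);
console.log(Array.from(signed));   // [ 127, -128, -56 ]

// Writes past the end are ignored; the length never changes.
wrapped[10] = 1;
console.log(wrapped.length);       // 3
```
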

Example: Multiple Views of Same Buffer

From wasm-defguide.pdf, page 68:

var u32arr = new Uint32Array(10);
u32arr[0] = 257;

var u32buf = u32arr.buffer;
var u8arr = new Uint8Array(u32buf);

console.log(u32arr);  // Uint32Array(10) [ 257, 0, 0, 0, ... ]
console.log(u8arr);   // Uint8Array(40) [ 1, 1, 0, 0, 0, 0, ... ]

Why does 257 appear as [1, 1, 0, 0]?

The number 257 in binary: 0000 0001 0000 0001 (two bytes).

Little-endian (used by x86/x64 and by most ARM systems):

  • Least significant byte stored first

  • 257 = 0x0101 → stored as [0x01, 0x01, 0x00, 0x00]

Visual representation:

Uint32Array index 0: 257
              [ byte 0 ][ byte 1 ][ byte 2 ][ byte 3 ]
Uint8Array:   [    1   ][    1   ][    0   ][    0   ]
Binary:       [00000001][00000001][00000000][00000000]

The Uint8Array view shows the raw bytes underlying the Uint32Array.

Endianness: Little-Endian vs. Big-Endian

From wasm-defguide.pdf, page 69:

“In this case, a little endian system stores the least significant bytes first (the 1s). A big endian system would store the 0s first. In the grand scheme of things, it does not matter how they are stored, but different systems and protocols will pick one or the other.”

Little-endian (x86, ARM in little mode):

  • Number 0x12345678 stored as: [78, 56, 34, 12]

Big-endian (network byte order, some ARM modes):

  • Number 0x12345678 stored as: [12, 34, 56, 78]

Systems consideration: When interfacing with binary protocols or file formats, you must know the expected endianness. JavaScript Typed Arrays use the platform’s native endianness (usually little-endian).
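
If code needs to know the platform's byte order at run time, it can probe it with two views over one buffer. This is a common idiom rather than a dedicated API; the function name isLittleEndian is chosen for the sketch.

```javascript
// Write a known 16-bit value, then check which byte landed at the lowest address.
function isLittleEndian() {
    const probe = new Uint16Array([0x0102]);
    const bytes = new Uint8Array(probe.buffer);
    return bytes[0] === 0x02;
}

const little = isLittleEndian();
console.log(little ? "little-endian" : "big-endian");

// Reading with the matching flag recovers the value on either kind of platform.
const dv = new DataView(new Uint16Array([0x0102]).buffer);
console.log(dv.getUint16(0, little) === 0x0102);  // true
```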

DataView: Explicit Endianness Control

Specification: §25.3 (pp. 643-647)

DataView allows reading/writing values with explicit endianness.

const buffer = new ArrayBuffer(4);
const view = new DataView(buffer);

// Write 32-bit integer in big-endian
view.setUint32(0, 0x12345678, false);  // false = big-endian

// Read as bytes
const u8 = new Uint8Array(buffer);
console.log(u8);  // bytes 0x12, 0x34, 0x56, 0x78 (printed in decimal: 18, 52, 86, 120)

// Read back as little-endian
view.getUint32(0, true);   // 0x78563412
// Read back as big-endian
view.getUint32(0, false);  // 0x12345678

DataView methods:

// Getters: getInt8, getUint8, getInt16, getUint16, getInt32, getUint32,
//          getFloat32, getFloat64, getBigInt64, getBigUint64
// Setters: setInt8, setUint8, etc.

view.getInt16(byteOffset, littleEndian);
view.setFloat32(byteOffset, value, littleEndian);

Use cases:

  • Parsing binary file formats (BMP, WAV, etc.)

  • Network protocols

  • WebAssembly memory interaction
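
As a sketch of the parsing use case, the code below encodes and decodes a small fixed-size header. The 8-byte layout is invented for this example and does not correspond to any real file format; the point is that each field names its own byte order.

```javascript
// Hypothetical 8-byte header layout (made up for this sketch):
//   offset 0: uint16 magic           (big-endian)
//   offset 2: uint8  version
//   offset 3: uint8  flags
//   offset 4: uint32 payload length  (little-endian)
function writeHeader({ magic, version, flags, length }) {
    const buf = new ArrayBuffer(8);
    const dv = new DataView(buf);
    dv.setUint16(0, magic, false);
    dv.setUint8(2, version);
    dv.setUint8(3, flags);
    dv.setUint32(4, length, true);
    return buf;
}

function readHeader(buf) {
    const dv = new DataView(buf);
    return {
        magic: dv.getUint16(0, false),
        version: dv.getUint8(2),
        flags: dv.getUint8(3),
        length: dv.getUint32(4, true)
    };
}

const header = { magic: 0xCAFE, version: 1, flags: 0b101, length: 1024 };
const encoded = writeHeader(header);
console.log(Array.from(new Uint8Array(encoded)));
// [ 202, 254, 1, 5, 0, 4, 0, 0 ]
const decoded = readHeader(encoded);
console.log(decoded.magic === 0xCAFE, decoded.length);  // true 1024
```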


Typed Arrays and WebAssembly Memory

WebAssembly Linear Memory

Specification: WebAssembly Core Spec §4.2.8 (pp. 37-38)

WebAssembly modules have a linear memory—a contiguous, resizable array of bytes starting at address 0.

Accessing from JavaScript:

const memory = new WebAssembly.Memory({ initial: 1 });  // 1 page = 64KB
const buffer = memory.buffer;  // ArrayBuffer

const u8 = new Uint8Array(buffer);
u8[0] = 42;

const i32 = new Int32Array(buffer);
i32[1] = 0x12345678;

// Grow memory
memory.grow(1);  // Add 1 page (64KB)
// Note: the old buffer is now detached, so u8 and i32 above are unusable;
// re-read memory.buffer and create new views

Memory growth invalidates old buffer references:

const oldBuffer = memory.buffer;
memory.grow(1);
// oldBuffer is now detached (byteLength becomes 0, and views on it have length 0)
const newBuffer = memory.buffer;  // Must get new reference
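
One defensive pattern is to create views at the point of use instead of caching them across calls that might grow memory. A minimal sketch, where the helper bytes is an assumption of this example:

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// Hypothetical helper: always builds the view from the current buffer.
const bytes = () => new Uint8Array(memory.buffer);

const stale = bytes();
stale[0] = 42;

const previousPages = memory.grow(1);  // returns the previous size in pages
console.log(previousPages);            // 1
console.log(stale.length);             // 0  (its buffer was detached)
console.log(bytes().length);           // 131072  (2 pages of 64 KiB)
console.log(bytes()[0]);               // 42  (existing contents are preserved)
```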

Sharing Memory Between WebAssembly and JavaScript

Pattern: Allocate space in WebAssembly memory, pass offset to JS, manipulate via Typed Array.

// Assumes wasmBytes holds a compiled module that exports memory, allocate, and process
const wasmModule = await WebAssembly.instantiate(wasmBytes);
const { memory, allocate, process } = wasmModule.instance.exports;

// Allocate 1024 bytes in Wasm memory
const ptr = allocate(1024);

// View as Uint8Array (create it after allocate(), which may grow memory and detach older buffers)
const u8 = new Uint8Array(memory.buffer, ptr, 1024);

// Write data
u8.set([1, 2, 3, 4]);

// Call Wasm function to process data
process(ptr, 1024);

This pattern avoids copying data between JavaScript and WebAssembly.


SharedArrayBuffer and Atomics: True Parallelism

SharedArrayBuffer: Memory Shared Between Workers

Specification: §25.2 (pp. 638-643)

SharedArrayBuffer allows multiple workers to access the same memory—enabling true parallelism.

// Main thread
const sab = new SharedArrayBuffer(1024);
const worker = new Worker('worker.js');
worker.postMessage(sab);

// worker.js
self.onmessage = (e) => {
    const sab = e.data;
    const u32 = new Uint32Array(sab);
    u32[0] = 42;  // Visible to main thread
};

Security note: Browsers disabled SharedArrayBuffer in early 2018 in response to the Spectre and Meltdown attacks. It is available again, but only in a cross-origin isolated context, which requires:

  • HTTPS

  • Cross-Origin-Opener-Policy: same-origin

  • Cross-Origin-Embedder-Policy: require-corp

Atomics: Safe Concurrent Access

Specification: §25.4 (pp. 647-659)

Without synchronization, concurrent reads/writes cause data races.

Atomics operations:

Method                                                      Description
Atomics.load(ta, index)                                     Atomic read
Atomics.store(ta, index, value)                             Atomic write
Atomics.add(ta, index, value)                               Atomic add, return old value
Atomics.sub(ta, index, value)                               Atomic subtract
Atomics.and(ta, index, value)                               Atomic bitwise AND
Atomics.or(ta, index, value)                                Atomic bitwise OR
Atomics.xor(ta, index, value)                               Atomic bitwise XOR
Atomics.exchange(ta, index, value)                          Atomic swap
Atomics.compareExchange(ta, index, expected, replacement)   CAS operation

Wait/notify for coordination:

// Worker 1: Wait for signal
const i32 = new Int32Array(sab);
Atomics.wait(i32, 0, 0);  // If i32[0] is still 0, sleep until notified (not allowed on a browser's main thread)
console.log('Woken up!');

// Worker 2: Send signal
Atomics.store(i32, 0, 1);
Atomics.notify(i32, 0, 1);  // Wake 1 waiter

Example: Atomic counter:

const sab = new SharedArrayBuffer(4);
const counter = new Int32Array(sab);

// Multiple workers increment safely
Atomics.add(counter, 0, 1);

// Read final value
console.log(Atomics.load(counter, 0));
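
Operations that Atomics does not provide directly can be built from compareExchange with a retry loop, the same pattern used with CAS instructions in C or Rust. The function atomicMax below is a sketch written for this section; it is exercised here on a single thread, but the loop is what makes it safe when several workers call it concurrently.

```javascript
// Atomically raise ta[index] to at least `candidate` and return the resulting value.
function atomicMax(ta, index, candidate) {
    let current = Atomics.load(ta, index);
    while (candidate > current) {
        const witnessed = Atomics.compareExchange(ta, index, current, candidate);
        if (witnessed === current) return candidate;  // our write went through
        current = witnessed;                          // another writer got in first; retry
    }
    return current;  // the stored value was already at least as large
}

const shared = new Int32Array(new SharedArrayBuffer(4));
console.log(atomicMax(shared, 0, 7));  // 7
console.log(atomicMax(shared, 0, 3));  // 7  (existing value is larger)
console.log(atomicMax(shared, 0, 9));  // 9
console.log(Atomics.load(shared, 0));  // 9
```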

Array-like Objects and the Iterable Protocol

Array-like Objects

An array-like object has:

  1. A length property

  2. Indexed properties (0, 1, 2, …)

const arrayLike = {
    0: 'a',
    1: 'b',
    2: 'c',
    length: 3
};

// Convert to real array
const arr = Array.from(arrayLike);
console.log(arr);  // ['a', 'b', 'c']

// Use array methods
Array.prototype.forEach.call(arrayLike, item => console.log(item));

Common array-like objects:

  • arguments (inside functions)

  • DOM NodeLists

  • Typed Arrays

The Iterable Protocol

Specification: §27.1 (pp. 661-671)

An object is iterable if it implements Symbol.iterator, which returns an iterator.

const arr = [1, 2, 3];
const iterator = arr[Symbol.iterator]();

console.log(iterator.next());  // { value: 1, done: false }
console.log(iterator.next());  // { value: 2, done: false }
console.log(iterator.next());  // { value: 3, done: false }
console.log(iterator.next());  // { value: undefined, done: true }

Custom iterable:

const range = {
    start: 1,
    end: 5,
    [Symbol.iterator]() {
        let current = this.start;
        const end = this.end;
        return {
            next() {
                if (current <= end) {
                    return { value: current++, done: false };
                }
                return { done: true };
            }
        };
    }
};

for (const n of range) {
    console.log(n);  // 1, 2, 3, 4, 5
}
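
The same iterable can be written more compactly with a generator method, which builds the iterator object automatically. The class Range is a sketch for this section, not a built-in.

```javascript
class Range {
    constructor(start, end) {
        this.start = start;
        this.end = end;
    }
    // A generator method returns a fresh iterator on every call.
    *[Symbol.iterator]() {
        for (let n = this.start; n <= this.end; n++) yield n;
    }
}

const r = new Range(1, 5);
console.log([...r]);                       // [ 1, 2, 3, 4, 5 ]
console.log(Math.max(...r));               // 5
const [first, second] = r;                 // destructuring uses the same protocol
console.log(first, second);                // 1 2
console.log(Array.from(r, n => n * n));    // [ 1, 4, 9, 16, 25 ]
```

Because each call to Symbol.iterator starts a new generator, the range can be iterated any number of times.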

Performance Considerations

Array Method Overhead

Chaining array methods creates intermediate arrays:

// Two intermediate arrays (from filter and map), plus the final array from slice
const result = arr
    .filter(x => x > 0)
    .map(x => x * 2)
    .slice(0, 10);

Optimization: Use single-pass iteration when possible:

const result = [];
for (const x of arr) {
    if (x > 0) {
        result.push(x * 2);
        if (result.length === 10) break;
    }
}

Typed Array Benefits

Performance advantages:

  1. Fixed size: No reallocation overhead

  2. Homogeneous element type: Every element is a number of one fixed width, so per-element type checks can be skipped

  3. Memory efficient: Compact storage

  4. Cache friendly: Contiguous memory layout

  5. Native optimization: JIT can generate specialized code

// Typically slower: the backing store grows dynamically and holds generic values
const arr = [];
for (let i = 0; i < 1000000; i++) arr.push(i);

// Typically faster: preallocated, unboxed, contiguous memory
const ta = new Uint32Array(1000000);
for (let i = 0; i < 1000000; i++) ta[i] = i;

When to Use Typed Arrays

Use Typed Arrays when:

  • Working with binary data (files, network, WebAssembly)

  • Need fixed-size, homogeneous numeric data

  • Performance is critical

  • Interfacing with native APIs (WebGL, Canvas, Audio)

Use regular Arrays when:

  • Need dynamic sizing

  • Mixed types

  • Length-changing methods (push, pop, splice), which typed arrays lack

  • Readability over raw performance


Summary

Arrays are JavaScript’s flexible, dynamic ordered collections:

  • Automatic length management

  • Rich suite of transformation methods

  • Support for sparse arrays and holes

  • Based on prototype inheritance from Array.prototype

Typed Arrays provide high-performance, fixed-size binary data views:

  • Multiple views (Uint8Array, Float32Array, etc.) over ArrayBuffer

  • Essential for WebGL, Canvas, WebAssembly, and binary protocols

  • Endianness matters for multi-byte values

  • DataView offers explicit endianness control

SharedArrayBuffer and Atomics enable true multi-threaded parallelism:

  • Workers share memory via SharedArrayBuffer

  • Atomics prevent data races through synchronized operations

  • Requires a secure context and cross-origin isolation (COOP/COEP headers)

Performance insights:

  • Regular arrays optimize for flexibility

  • Typed arrays optimize for raw speed and memory efficiency

  • Choose based on your use case: dynamic vs. fixed, mixed vs. homogeneous, JS-only vs. native interop

Together, these mechanisms provide JavaScript with both high-level ergonomics and low-level control over memory—bridging the gap between scripting convenience and systems programming performance.


Chapter 7: Modules, Imports, and Code Organization

The Evolution of JavaScript Modules

Pre-Module Era: Script Tags and Global Scope

Before ES6 (2015), JavaScript had no native module system. Code organization relied on:

  1. Multiple <script> tags with global variables

  2. Immediately Invoked Function Expressions (IIFE) for encapsulation

  3. Community solutions: CommonJS (Node.js) and AMD (RequireJS)

The global namespace problem:

// file1.js
var counter = 0;
function increment() { counter++; }

// file2.js
var counter = 10;  // Collision! Overwrites file1's counter
function increment() { /* different implementation */ }  // Collision!

HTML loading order dependency:

<script src="library.js"></script>
<script src="plugin.js"></script>  <!-- Must load AFTER library -->
<script src="app.js"></script>      <!-- Must load AFTER plugin -->

Loading order errors were common and hard to debug.

IIFE Pattern: Manual Encapsulation

// Module pattern using IIFE
var myModule = (function() {
    // Private variables
    var privateVar = 'secret';
    
    function privateFunction() {
        return privateVar;
    }
    
    // Public API
    return {
        publicMethod: function() {
            return privateFunction();
        }
    };
})();

myModule.publicMethod();  // 'secret'
myModule.privateVar;      // undefined (encapsulated)

Limitations:

  • No dependency management

  • Manual dependency ordering

  • No static analysis

  • No tree-shaking

  • Verbose syntax

CommonJS: Node.js Module System

Specification: Not part of ECMA-262 (Node.js specific)

// math.js
function add(a, b) {
    return a + b;
}

module.exports = { add };
// or: exports.add = add;

// app.js
const math = require('./math');
console.log(math.add(2, 3));  // 5

Characteristics:

  • Synchronous loading: Blocks until module loaded (fine for server, bad for browser)

  • Dynamic loading: require() is an ordinary function call, so it can run conditionally or with a computed path

  • Runtime resolution: Dependencies resolved during execution

  • Single export object: module.exports or exports

Module caching:

// counter.js
let count = 0;
module.exports = {
    increment: () => ++count,
    get: () => count
};

// app.js
const counter1 = require('./counter');
const counter2 = require('./counter');

counter1.increment();
console.log(counter2.get());  // 1 (same instance!)

Modules are cached after first load; subsequent require() calls return the same object.


ES6 Modules: Native JavaScript Module System

Basic Syntax and Semantics

Specification: §16.2 (pp. 377-396)

ES6 modules use static import/export syntax with these key features:

  1. File-based: Each file is a separate module

  2. Strict mode by default: All module code runs in strict mode

  3. Top-level scope: Variables don’t leak to global

  4. Static structure: Imports/exports must be at top level (not in blocks)

  5. Asynchronous loading: Designed for browsers

Basic export:

// math.js
export function add(a, b) {
    return a + b;
}

export function subtract(a, b) {
    return a - b;
}

export const PI = 3.14159;

Basic import:

// app.js
import { add, subtract, PI } from './math.js';

console.log(add(2, 3));      // 5
console.log(subtract(5, 2)); // 3
console.log(PI);             // 3.14159

Named Exports vs. Default Export

Named Exports

// utils.js
export function helper1() { }
export function helper2() { }
export const CONFIG = { };

// app.js
import { helper1, helper2, CONFIG } from './utils.js';

Renaming on export:

function internalName() { }
export { internalName as publicName };

Renaming on import:

import { helper1 as h1, helper2 as h2 } from './utils.js';

Default Export

Each module can have one default export:

// logger.js
export default function log(message) {
    console.log(message);
}

// app.js
import log from './logger.js';  // No braces
log('Hello');

// Can use any name
import myLogger from './logger.js';
myLogger('Hello');

Default + named exports:

// module.js
export default function main() { }
export function helper() { }

// app.js
import main, { helper } from './module.js';

Default export gotchas:

// ❌ Invalid syntax
export default const x = 1;

// ✅ Valid alternatives
const x = 1;
export default x;

// or
export default 1;

Import Variations and Patterns

Namespace Import

Import all exports as a single object:

// math.js
export const PI = 3.14159;
export function add(a, b) { return a + b; }
export function multiply(a, b) { return a * b; }

// app.js
import * as math from './math.js';

console.log(math.PI);         // 3.14159
console.log(math.add(2, 3));  // 5

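Systems insight: A namespace import is still statically analyzable. As long as members are accessed by fixed names (math.add rather than math[name]), bundlers can drop the exports that are never referenced.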

Re-exporting

Create barrel files (index.js) to aggregate exports:

// components/Button.js
export default function Button() { }

// components/Input.js
export default function Input() { }

// components/index.js (barrel)
export { default as Button } from './Button.js';
export { default as Input } from './Input.js';

// app.js
import { Button, Input } from './components/index.js';

Re-export all:

// Re-export everything from another module
export * from './other-module.js';

// Re-export everything as namespace
export * as utils from './utils.js';

Side-effect Imports

Import for side effects only (no bindings):

// polyfill.js
if (!Array.prototype.includes) {
    Array.prototype.includes = function(item) {
        return this.indexOf(item) !== -1;
    };
}

// app.js
import './polyfill.js';  // Execute but don't import anything

Use cases:

  • Polyfills

  • Global CSS imports

  • Registering web components

  • Database connection initialization

Live Bindings: A Key Semantic Difference

Specification: §16.2.1.5 (pp. 381-383)

Unlike CommonJS (which exports values), ES6 modules export live bindings—references to the exported variables.

// counter.js (ES6 module)
export let count = 0;
export function increment() {
    count++;
}

// app.js
import { count, increment } from './counter.js';

console.log(count);  // 0
increment();
console.log(count);  // 1 (automatically updated!)

// ❌ Cannot mutate imported binding
count = 5;  // TypeError: Assignment to constant variable (imported bindings are read-only)

Contrast with CommonJS:

// counter.js (CommonJS)
let count = 0;
module.exports = {
    count,
    increment() { count++; }
};

// app.js
const counter = require('./counter');
console.log(counter.count);  // 0
counter.increment();
console.log(counter.count);  // 0 (NOT updated! Value was copied)

Systems insight: Live bindings enable:

  • Circular dependencies to work correctly

  • Hot module replacement (HMR)

  • Better tree-shaking (bundlers can trace usage)
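A CommonJS module can approximate this behaviour by exporting a getter instead of a copied value. The following is a minimal sketch, assuming a plain object (counterExports, a hypothetical name) stands in for module.exports so that it runs as a single file.

```javascript
// counter.js (CommonJS style) rewritten so consumers observe updates.
// The getter re-reads the variable on every access instead of copying it once.
let count = 0;
const counterExports = {
    get count() { return count; },
    increment() { count++; }
};

const before = counterExports.count;  // 0
counterExports.increment();
const after = counterExports.count;   // 1

// Destructuring still takes a snapshot, even with the getter in place.
const { count: snapshot } = counterExports;
counterExports.increment();
console.log(before, after, snapshot, counterExports.count);  // 0 1 1 2
```

The getter only helps callers that keep reading through the exports object. ES module imports need no such discipline, because the imported name itself is the binding.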


Module Loading and Resolution

Browser Module Loading

Specification: HTML spec §8.1.4 (not in ECMA-262)

<!-- Load as module -->
<script type="module" src="./app.js"></script>

<!-- Inline module -->
<script type="module">
    import { render } from './render.js';
    render();
</script>

Module script behavior:

  1. Deferred by default: Wait for HTML parsing (like defer attribute)

  2. CORS required: Cross-origin modules need CORS headers

  3. Strict mode: Always runs in strict mode

  4. Top-level await: Allowed (blocks dependent modules)

  5. No document.write(): Would break deferred loading

Module resolution algorithm:

  1. Normalize module specifier

    • Relative: ‘./module.js’, ‘../utils.js’

    • Absolute: ‘/lib/module.js’

    • URL: ‘https://cdn.example.com/lib.js’

    • Bare specifier: ‘lodash’ (requires import map)

  2. Check module cache

    • If cached, return cached module

  3. Fetch module source

    • Parse as module code

    • Recursively fetch dependencies

  4. Instantiate module

    • Create module environment

    • Bind exports

  5. Execute module code

    • Run module body once
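Step 1 of the algorithm above can be sketched as a small classifier. This is an illustration only: classifySpecifier is a hypothetical helper, and real browsers perform full URL parsing against the importing module's URL rather than string checks.

```javascript
// Rough classifier for the four specifier kinds listed in step 1.
function classifySpecifier(specifier) {
    if (specifier.startsWith('./') || specifier.startsWith('../')) {
        return 'relative';
    }
    if (specifier.startsWith('/')) {
        return 'absolute';
    }
    try {
        new URL(specifier);  // throws TypeError when there is no scheme
        return 'url';
    } catch {
        return 'bare';       // needs an import map in browsers
    }
}

console.log(classifySpecifier('./module.js'));                     // relative
console.log(classifySpecifier('/lib/module.js'));                  // absolute
console.log(classifySpecifier('https://cdn.example.com/lib.js'));  // url
console.log(classifySpecifier('lodash'));                          // bare

// Relative specifiers resolve against the importing module's URL.
const resolved = new URL('./render.js', 'https://example.com/app/main.js').href;
console.log(resolved);  // https://example.com/app/render.js
```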

Import Maps: Bare Specifier Resolution

Specification: HTML spec (WICG proposal, now baseline)

<script type="importmap">
{
    "imports": {
        "lodash": "/node_modules/lodash-es/lodash.js",
        "lib": "https://cdn.example.com/lib.js",
        "utils/": "/src/utils/"
    }
}
</script>

<script type="module">
    import _ from 'lodash';          // Resolves to /node_modules/...
    import lib from 'lib';           // Resolves to the CDN URL (which must serve an ES module)
    import { helper } from 'utils/helper.js';  // Resolves to /src/utils/helper.js
</script>

Scoped imports (different resolution per path):

{
    "imports": {
        "lodash": "/node_modules/lodash-es/lodash.js"
    },
    "scopes": {
        "/legacy/": {
            "lodash": "/node_modules/lodash/lodash.js"
        }
    }
}
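How a browser picks a mapping can be approximated in a few lines. The sketch below is deliberately simplified (no URL normalization, paths compared as plain strings), and resolveWithImportMap is a hypothetical name rather than a platform API.

```javascript
// Simplified import-map lookup: matching scopes first, then top-level imports.
function resolveWithImportMap(specifier, importMap, referrerPath) {
    const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
    const scopes = importMap.scopes || {};

    // Scopes whose prefix matches the importing file, most specific first.
    const tables = Object.keys(scopes)
        .filter((prefix) => referrerPath.startsWith(prefix))
        .sort((a, b) => b.length - a.length)
        .map((prefix) => scopes[prefix]);
    tables.push(importMap.imports || {});

    for (const table of tables) {
        if (has(table, specifier)) {
            return table[specifier];  // exact match
        }
        // Keys ending in "/" remap a whole path prefix.
        const prefixKey = Object.keys(table)
            .filter((key) => key.endsWith('/') && specifier.startsWith(key))
            .sort((a, b) => b.length - a.length)[0];
        if (prefixKey) {
            return table[prefixKey] + specifier.slice(prefixKey.length);
        }
    }
    return null;  // unmapped bare specifier: a browser would throw here
}

const importMap = {
    imports: {
        'lodash': '/node_modules/lodash-es/lodash.js',
        'utils/': '/src/utils/'
    },
    scopes: {
        '/legacy/': { 'lodash': '/node_modules/lodash/lodash.js' }
    }
};

console.log(resolveWithImportMap('lodash', importMap, '/app/main.js'));
// /node_modules/lodash-es/lodash.js
console.log(resolveWithImportMap('lodash', importMap, '/legacy/old.js'));
// /node_modules/lodash/lodash.js
console.log(resolveWithImportMap('utils/helper.js', importMap, '/legacy/old.js'));
// /src/utils/helper.js
```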

Node.js Module Resolution

Algorithm (simplified):

  1. Core modules: fs, path, http (built-in)

  2. Relative/absolute paths: Resolve directly

  3. Bare specifiers: Search node_modules/

    • Current directory’s node_modules/

    • Parent directory’s node_modules/

    • Recursively up to filesystem root
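The upward search in step 3 can be made concrete by listing the directories that would be probed. This is a sketch for POSIX-style paths only; nodeModulesPaths is a hypothetical helper written for illustration, not the function Node.js uses internally.

```javascript
// Lists the node_modules directories checked for a bare specifier,
// starting at the importing file's directory and walking up to the root.
function nodeModulesPaths(fromDir) {
    const parts = fromDir.split('/').filter(Boolean);
    const paths = [];
    for (let i = parts.length; i >= 0; i--) {
        // Directories that are themselves named node_modules are skipped
        if (parts[i - 1] === 'node_modules') continue;
        paths.push('/' + [...parts.slice(0, i), 'node_modules'].join('/'));
    }
    return paths;
}

console.log(nodeModulesPaths('/home/user/project/src'));
// [
//   '/home/user/project/src/node_modules',
//   '/home/user/project/node_modules',
//   '/home/user/node_modules',
//   '/home/node_modules',
//   '/node_modules'
// ]
```

The first directory that contains the requested package wins, which is why a nested node_modules can shadow a version installed higher up the tree.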

File/directory resolution:

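// require('./module') (CommonJS)
// Tries in order:
// 1. ./module.js
// 2. ./module.json
// 3. ./module.node
// 4. ./module/package.json (check "main" field)
// 5. ./module/index.js
// ES module imports skip this search: import './module.js' must spell out the file extension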

ES Modules in Node.js (v12.17+):

  • .mjs extension: Always treated as ES module

  • .js extension: Check nearest package.json for "type": "module"

  • .cjs extension: Always treated as CommonJS

// package.json
{
    "type": "module",  // All .js files are ES modules
    "exports": {
        ".": "./src/index.js",
        "./utils": "./src/utils.js"
    }
}

Dynamic Imports: Runtime Module Loading

import() Expression

Specification: §13.3.10 (pp. 258-259)

Dynamic imports return a Promise that resolves to the module namespace:

// Static import (top-level only)
import { helper } from './utils.js';

// Dynamic import (anywhere, returns Promise)
import('./utils.js').then(module => {
    module.helper();
});

// With async/await
async function load() {
    const module = await import('./utils.js');
    module.helper();
}

Use cases:

1. Code Splitting (Lazy Loading)

async function showAdmin() {
    const { AdminPanel } = await import('./admin-panel.js');
    // admin-panel.js only loaded when needed
    return new AdminPanel();
}

document.getElementById('admin-btn').onclick = showAdmin;

2. Conditional Loading

async function loadPolyfills() {
    if (!('IntersectionObserver' in window)) {
        await import('./intersection-observer-polyfill.js');
    }
}

3. Computed Module Paths

const locale = navigator.language;
const translations = await import(`./i18n/${locale}.js`);

4. Dynamic Feature Detection

async function getRenderer() {
    if (supportsWebGL()) {
        return import('./webgl-renderer.js');
    } else {
        return import('./canvas-renderer.js');
    }
}

Error handling:

try {
    const module = await import('./might-not-exist.js');
} catch (err) {
    console.error('Module load failed:', err);
    // Fallback logic
}

Systems insight: Bundlers (webpack, Rollup) use dynamic imports as split points to create separate chunks, enabling:

  • Faster initial load (smaller main bundle)

  • Parallel loading (chunks load independently)

  • Better caching (unchanged chunks stay cached)
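Application code often wraps a split point in a small helper so that every caller shares one in-flight load. The sketch below assumes a hypothetical lazy helper; the loader passed to it would normally be () => import('./admin-panel.js'), replaced here by a resolved Promise so the snippet runs without extra files.

```javascript
// `lazy` returns a function that starts the load on first call and hands
// every later caller the same Promise.
function lazy(loader) {
    let pending = null;
    return function load() {
        if (!pending) {
            pending = loader().catch((err) => {
                pending = null;  // forget the failure so the loader can be called again
                throw err;
            });
        }
        return pending;
    };
}

// Stand-in for a dynamic import.
let fetches = 0;
const loadAdmin = lazy(() => {
    fetches++;
    return Promise.resolve({ AdminPanel: class AdminPanel {} });
});

const first = loadAdmin();
const second = loadAdmin();
console.log(first === second, fetches);  // true 1
```

Engines already cache successfully loaded modules, so the helper is mostly about sharing one Promise and keeping the loading policy in a single place.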


Module Patterns and Best Practices

The Singleton Pattern

Modules execute once and cache their state:

// database.js
let connection = null;

export function connect(config) {
    if (!connection) {
        connection = new DatabaseConnection(config);
    }
    return connection;
}

export function query(sql) {
    if (!connection) {
        throw new Error('connect() must be called before query()');
    }
    return connection.execute(sql);
}

Multiple imports get the same instance:

// file1.js
import { connect } from './database.js';
connect({ host: 'localhost' });

// file2.js
import { query } from './database.js';
query('SELECT * FROM users');  // Uses connection from file1

The Factory Pattern

Export functions that create instances:

// logger.js
export function createLogger(name) {
    return {
        log(message) {
            console.log(`[${name}] ${message}`);
        }
    };
}

// app.js
import { createLogger } from './logger.js';

const userLogger = createLogger('User');
const authLogger = createLogger('Auth');

userLogger.log('Created');  // [User] Created
authLogger.log('Login');    // [Auth] Login

Dependency Injection via Modules

// services.js
let db = null;
let cache = null;

export function configure(dependencies) {
    db = dependencies.db;
    cache = dependencies.cache;
}

export function getUserById(id) {
    const cached = cache.get(id);
    if (cached) return cached;
    
    const user = db.query('SELECT * FROM users WHERE id = ?', id);
    cache.set(id, user);
    return user;
}

// main.js
import { configure } from './services.js';
import { createDB } from './database.js';
import { createCache } from './cache.js';

configure({
    db: createDB(),
    cache: createCache()
});

Circular Dependencies

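ES6 modules tolerate circular dependencies because imports are live bindings that are linked before any module body runs. A binding can still be read too early, though:

// a.js (entry point)
import { b } from './b.js';
export const a = 'A';
console.log(b);  // Never reached: b.js throws before a.js's body runs

// b.js
import { a } from './a.js';
export const b = 'B';
console.log(a);  // ReferenceError: a is not yet initialized (temporal dead zone)

Execution order:

  1. Start loading a.js

  2. Encounter import of b.js, start loading it

  3. b.js imports a.js (already loading, continue)

  4. b.js runs first: it exports b, then reads a. The binding exists but is not yet initialized, so the read throws a ReferenceError

  5. If b.js deferred that read (for example, inside a function called later), control would return to a.js, which exports a and reads b (now initialized)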
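Step 4 deserves a closer look: a binding that exists but is not yet initialized throws when it is read. The single-file analogy below reproduces that state with an ordinary const; simulateCycle and readA are hypothetical names, and the closure plays the role of b.js's import of a.

```javascript
// `readA` closes over the binding `a`, much like an import does.
function simulateCycle() {
    const log = [];
    const readA = () => a;

    try {
        readA();            // binding exists but is uninitialized
    } catch (err) {
        log.push(err.name);
    }

    const a = 'A';          // a.js finally evaluates its declaration
    log.push(readA());      // same binding, now initialized
    return log;
}

console.log(simulateCycle());  // [ 'ReferenceError', 'A' ]
```

The second read succeeds without re-importing anything, which is the practical meaning of a live binding: the reader holds the variable itself, not a copy of its value.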

Best practice: Avoid circular dependencies when possible. If necessary:

  • Use functions (not top-level code) to access circular imports

  • Ensure initialization order doesn’t matter

// ✅ Safe circular dependency (function declarations are initialized before either module body runs)
// a.js
import { getB } from './b.js';
export function getA() { return 'A'; }
console.log(getB());

// b.js
import { getA } from './a.js';
export function getB() { return 'B'; }
console.log(getA());

Module Bundling and Build Tools

Why Bundling?

Problems with native modules in production:

  1. Too many HTTP requests: Each import = separate request

  2. No minification: Unoptimized source code

  3. No transpilation: Can’t use newer syntax on older browsers

  4. No tree-shaking: Dead code included

Bundlers solve these by:

  1. Concatenating modules into fewer files

  2. Resolving dependencies statically

  3. Eliminating dead code (tree-shaking)

  4. Minifying output

  5. Code splitting for optimal loading

Tree-Shaking: Dead Code Elimination

Specification: Not in ECMA-262 (bundler feature)

Tree-shaking removes unused exports by analyzing static imports:

// utils.js
export function used() { }
export function unused() { }

// app.js
import { used } from './utils.js';
used();

// Bundled output (unused eliminated)
function used() { }
used();
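At its core, this elimination is a reachability problem over the graph of exports. The sketch below shows only the mark phase, under the assumption that the bundler has already reduced each module to a map from "module#export" to the exports it references; reachableExports, the graph shape, and the helper export are all illustrative.

```javascript
// Mark phase of a toy tree-shaker: everything reachable from the roots is kept.
function reachableExports(graph, roots) {
    const kept = new Set();
    const stack = [...roots];
    while (stack.length > 0) {
        const id = stack.pop();
        if (kept.has(id)) continue;
        kept.add(id);
        for (const dep of graph[id] || []) {
            stack.push(dep);
        }
    }
    return kept;
}

const graph = {
    'utils.js#used': ['utils.js#helper'],
    'utils.js#unused': ['utils.js#helper'],
    'utils.js#helper': []
};

// app.js imports only `used`, so that is the single root.
const kept = reachableExports(graph, ['utils.js#used']);
console.log([...kept].sort());  // [ 'utils.js#helper', 'utils.js#used' ]
```

Real bundlers must also decide whether evaluating a module has side effects before dropping it, which a pure reachability pass cannot see.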

Why static imports matter:

// ✅ Tree-shakeable (static)
import { func } from './module.js';

// ❌ Not tree-shakeable (dynamic)
const module = require('./module.js');
const { func } = module;

// ❌ SyntaxError: import declarations are only allowed at the top level
if (condition) {
    import { func } from './module.js';
}

Best practices for tree-shaking:

  1. Use named exports (not default)

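  2. Prefer importing specific names; a namespace import (import * as) stays tree-shakeable only while its members are accessed by static name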

  3. Mark side-effect-free in package.json:

{
    "sideEffects": false
}

Or specify files with side effects:

{
    "sideEffects": ["./src/polyfills.js", "*.css"]
}

Common Bundlers

Webpack: Full-featured, complex configuration

  • Entry/output configuration

  • Loaders for non-JS assets

  • Plugins for optimization

  • Code splitting via dynamic imports

Rollup: Optimized for libraries

  • Better tree-shaking

  • Smaller output

  • Multiple output formats (ESM, CJS, UMD)

esbuild: Extremely fast (written in Go)

  • Often cited as 10-100× faster than webpack (the figure comes from esbuild's own benchmarks)

  • Built-in TypeScript support

  • Minimal configuration

Vite: Modern dev server + Rollup for production

  • Native ES modules in development

  • Instant server start

  • Hot Module Replacement (HMR)


Module Scope and this

Top-Level this is undefined

Specification: §16.1.7 (pp. 375-376)

// module.js
console.log(this);  // undefined

// script.js (non-module)
console.log(this);  // window in a browser; module.exports in a Node.js CommonJS file

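Why? A module's top-level environment defines this as undefined, so module code cannot reach the global object through it by accident. Strict mode alone does not explain it: a strict-mode classic script still sees the global object as its top-level this.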

Variables Don’t Leak to Global

// module.js
var x = 1;
let y = 2;
const z = 3;

// None of these leak to window/global
console.log(window.x);  // undefined

Contrast with scripts:

// script.js
var x = 1;
console.log(window.x);  // 1 (leaked to global!)

Top-Level await

Specification: §16.1.7 (pp. 375-376)

Modules can use await at the top level:

// data.js
export const users = await fetch('/api/users').then(r => r.json());

// app.js
import { users } from './data.js';
console.log(users);  // Waits for data.js to finish loading

Execution model:

  1. Module starts loading

  2. Reaches await, pauses execution

  3. Dependent modules wait for this module

  4. Once resolved, continues execution

Use cases:

  • Fetch configuration before app starts

  • Load translations

  • Initialize database connections

Caution: Blocks dependent modules—use sparingly for critical resources only.


Worker Modules

Specification: HTML spec (Web Workers)

Workers can load ES modules:

// main.js
const worker = new Worker('./worker.js', { type: 'module' });

worker.postMessage({ cmd: 'process', data: [1, 2, 3] });

// worker.js (ES module)
import { process } from './processor.js';

self.onmessage = (e) => {
    const result = process(e.data.data);
    self.postMessage(result);
};

Benefits:

  • Clean imports (no importScripts())

  • Static analysis

  • Same module syntax as main thread


Summary

JavaScript modules have evolved from global scripts to a sophisticated system:

Historical progression:

  • IIFE pattern: Manual encapsulation

  • CommonJS: Synchronous, dynamic, Node.js

  • ES6 modules: Static, asynchronous, native

ES6 module characteristics:

  • Static structure: Enables tree-shaking and static analysis

  • Live bindings: Exports are references, not values

  • Strict mode: Always enabled

  • Deferred execution: In browsers, like defer attribute

  • Singleton by default: Module code runs once

Import/export patterns:

  • Named exports: Multiple exports per module

  • Default export: One primary export

  • Namespace import: Import all as object

  • Re-exports: Barrel files for aggregation

  • Side-effect imports: Execute without bindings

Dynamic imports:

  • Runtime loading: import() returns Promise

  • Code splitting: Lazy load features

  • Conditional loading: Load based on runtime conditions

Build tools:

  • Bundlers: Webpack, Rollup, esbuild, Vite

  • Tree-shaking: Remove unused code

  • Code splitting: Multiple output chunks

Best practices:

  • Use named exports for tree-shaking

  • Avoid circular dependencies

  • Mark side effects explicitly

  • Use dynamic imports for code splitting

  • Prefer static imports for core dependencies

The module system bridges JavaScript’s dynamic nature with the performance and tooling benefits of static analysis—enabling both developer productivity and runtime efficiency.


Chapter 8: The Browser Environment (Beyond ECMA-262)

The Boundary Between ECMAScript and the Web Platform

What ECMA-262 Does NOT Specify

Important distinction: ECMA-262 defines the JavaScript language, not the browser environment. The following are NOT part of the ECMAScript specification:

  • window, document, navigator

  • DOM APIs (getElementById, querySelector, etc.)

  • setTimeout, setInterval, requestAnimationFrame

  • fetch, XMLHttpRequest

  • localStorage, sessionStorage, IndexedDB

  • console (though engines implement it)

  • Browser events (click, load, resize, etc.)

  • Canvas, WebGL, Web Audio

  • alert, confirm, prompt

These are defined by:

  • HTML Standard (WHATWG)

  • W3C Web APIs

  • CSSWG (CSS Object Model)

  • Browser vendor extensions

Why this matters: Understanding the boundary helps explain:

  • Why Node.js lacks window but has global

  • Why Deno offers browser-style globals such as fetch but no document

  • Why “JavaScript” behaves differently across environments


The Global Object in Browsers

window: The Browser’s Global Object

HTML Standard §8.1.1 (not ECMA-262)

In browsers, the global object is window:

// Roughly equivalent in a classic script (var additionally makes the property non-configurable)
var x = 1;
window.x = 1;
this.x = 1;  // In non-module, non-strict code

console.log(window.x);  // 1

window properties:

window.document        // The DOM
window.location        // Current URL
window.history         // Navigation history
window.navigator       // Browser info
window.screen          // Screen dimensions
window.localStorage    // Persistent storage
window.sessionStorage  // Session storage
window.console         // Developer console
window.fetch           // HTTP requests
window.setTimeout      // Timer functions
window.addEventListener  // Event registration

globalThis: Universal Global Access

ECMA-262 §19.3 (pp. 505-506)

ES2020 introduced globalThis for platform-independent global access:

// globalThis is the global object on every platform
globalThis.setTimeout      // Available in browsers and Node.js
globalThis.global          // Node.js only: global === globalThis
globalThis.window          // Browsers only: window === globalThis

// Platform-specific checks
if (typeof window !== 'undefined') {
    // Browser environment
} else if (typeof global !== 'undefined') {
    // Node.js environment
}

// Better approach
if (typeof globalThis.document !== 'undefined') {
    // Browser (DOM available)
}

Why globalThis?

Before ES2020, accessing the global object required:

// Unreliable cross-platform hack
const globalObj = (function() {
    return this;
})() || (typeof window !== 'undefined' ? window : global);

// Now just:
const globalObj = globalThis;
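With globalThis available, environment checks can be written as feature tests against a single object. The helper below is a heuristic sketch, not a standard API: detectEnvironment is a hypothetical name, and it accepts the global object as a parameter so it can be exercised with stand-in objects.

```javascript
// Heuristic environment detection based on which globals exist.
function detectEnvironment(g = globalThis) {
    if (typeof g.document !== 'undefined') {
        return 'browser';  // DOM available
    }
    if (typeof g.importScripts === 'function') {
        return 'worker';   // worker global scope (no DOM)
    }
    if (typeof g.process !== 'undefined' && g.process.versions && g.process.versions.node) {
        return 'node';
    }
    return 'unknown';
}

console.log(detectEnvironment({ document: {} }));                          // browser
console.log(detectEnvironment({ importScripts() {} }));                    // worker
console.log(detectEnvironment({ process: { versions: { node: '20' } } })); // node
console.log(detectEnvironment({}));                                        // unknown
```

Testing for the specific capability a piece of code needs (for example typeof globalThis.fetch) is usually more robust than branching on the environment name.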

self: Worker-Compatible Global

HTML Standard (Web Workers)

self refers to the global object in both main thread and Workers:

// main.js (works)
self.addEventListener('load', () => {});

// worker.js (also works)
self.addEventListener('message', () => {});

// But window doesn't work in Workers:
// window.addEventListener('message', () => {});  // ❌ ReferenceError

Best practice: Use self in code that might run in Workers.


The Document Object Model (DOM)

DOM Structure and Representation

DOM Standard (WHATWG)

The DOM is a tree representation of HTML:

<!DOCTYPE html>
<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <div id="app">
      <h1>Hello</h1>
      <p class="intro">Text</p>
    </div>
  </body>
</html>

Tree structure:

Document
└── html (HTMLHtmlElement)
    ├── head (HTMLHeadElement)
    │   └── title (HTMLTitleElement)
    │       └── #text: "Page Title"
    └── body (HTMLBodyElement)
        └── div#app (HTMLDivElement)
            ├── h1 (HTMLHeadingElement)
            │   └── #text: "Hello"
            └── p.intro (HTMLParagraphElement)
                └── #text: "Text"

Node Types and Hierarchy

Node interface (base for all DOM nodes):

// Node types (constants)
Node.ELEMENT_NODE                // 1
Node.ATTRIBUTE_NODE              // 2 (deprecated)
Node.TEXT_NODE                   // 3
Node.CDATA_SECTION_NODE          // 4
Node.PROCESSING_INSTRUCTION_NODE // 7
Node.COMMENT_NODE                // 8
Node.DOCUMENT_NODE               // 9
Node.DOCUMENT_TYPE_NODE          // 10
Node.DOCUMENT_FRAGMENT_NODE      // 11

// Example
const div = document.createElement('div');
console.log(div.nodeType);  // 1 (ELEMENT_NODE)

Inheritance hierarchy:

EventTarget
└── Node
    ├── Element
    │   ├── HTMLElement
    │   │   ├── HTMLDivElement
    │   │   ├── HTMLSpanElement
    │   │   ├── HTMLInputElement
    │   │   └── …
    │   └── SVGElement
    ├── Text
    ├── Comment
    └── Document
        └── HTMLDocument

DOM Traversal

Properties for navigation:

const element = document.getElementById('app');

// Parent/child
element.parentNode           // Parent node
element.parentElement        // Parent element (null if parent is Document)
element.childNodes           // NodeList (includes text nodes)
element.children             // HTMLCollection (elements only)
element.firstChild           // First node (may be text)
element.firstElementChild    // First element
element.lastChild
element.lastElementChild

// Siblings
element.nextSibling          // Next node
element.nextElementSibling   // Next element
element.previousSibling
element.previousElementSibling

Example: Walking the tree:

function walkDOM(node, callback) {
    callback(node);
    node = node.firstChild;
    while (node) {
        walkDOM(node, callback);
        node = node.nextSibling;
    }
}

walkDOM(document.body, (node) => {
    if (node.nodeType === Node.ELEMENT_NODE) {
        console.log(node.tagName);
    }
});
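The recursive walk uses one call-stack frame per level of nesting. For very deep trees, an explicit stack gives the same pre-order traversal without recursion. The sketch below relies only on firstChild and nextSibling, so it is demonstrated on plain objects; walkIterative and fakeNode are hypothetical helpers.

```javascript
// Pre-order traversal with an explicit stack instead of recursion.
// Works on anything exposing firstChild / nextSibling, including DOM nodes.
function walkIterative(root, callback) {
    const stack = [root];
    while (stack.length > 0) {
        const node = stack.pop();
        callback(node);
        // Collect children, then push in reverse so the first child is visited first.
        const children = [];
        for (let child = node.firstChild; child; child = child.nextSibling) {
            children.push(child);
        }
        for (let i = children.length - 1; i >= 0; i--) {
            stack.push(children[i]);
        }
    }
}

// Minimal stand-in for DOM nodes so the sketch runs outside a browser.
function fakeNode(name, ...kids) {
    kids.forEach((kid, i) => { kid.nextSibling = kids[i + 1] || null; });
    return { name, firstChild: kids[0] || null, nextSibling: null };
}

const tree = fakeNode('body',
    fakeNode('div', fakeNode('h1'), fakeNode('p')),
    fakeNode('footer'));

const visited = [];
walkIterative(tree, (node) => visited.push(node.name));
console.log(visited.join(' '));  // body div h1 p footer
```

In a browser the same function can be called as walkIterative(document.body, callback). The platform's own document.createTreeWalker is another option when filtering by node type is needed.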

DOM Querying: Finding Elements

getElementById

const el = document.getElementById('app');
// Returns Element or null
// Only searches by ID attribute

getElementsByClassName

const elements = document.getElementsByClassName('intro');
// Returns live HTMLCollection
// Updates automatically when DOM changes

getElementsByTagName

const divs = document.getElementsByTagName('div');
// Returns live HTMLCollection

querySelector / querySelectorAll

Recommended approach (uses CSS selectors):

// Returns first match or null
const first = document.querySelector('.intro');
const firstDiv = document.querySelector('div');
const specific = document.querySelector('#app > p.intro');

// Returns static NodeList (not live)
const all = document.querySelectorAll('.intro');
const complex = document.querySelectorAll('div.container > p:not(.hidden)');

// Iterate NodeList
all.forEach(element => {
    console.log(element);
});

// Or convert to array
const array = Array.from(all);

Live vs. static collections:

// Live HTMLCollection
const liveList = document.getElementsByClassName('item');
console.log(liveList.length);  // 3

// Add new element
const newItem = document.createElement('div');
newItem.className = 'item';
document.body.appendChild(newItem);

console.log(liveList.length);  // 4 (automatically updated!)

// Static NodeList
const staticList = document.querySelectorAll('.item');
console.log(staticList.length);  // 4
const anotherItem = document.createElement('div');
anotherItem.className = 'item';
document.body.appendChild(anotherItem);
console.log(staticList.length);  // Still 4 (snapshot)

DOM Manipulation

Creating Elements

// Create element
const div = document.createElement('div');
const text = document.createTextNode('Hello');
const comment = document.createComment('This is a comment');

// Set attributes
div.id = 'myDiv';
div.className = 'container';
div.setAttribute('data-value', '123');

// Add content
div.textContent = 'Hello';  // Escapes HTML
div.innerHTML = '<b>Bold</b>';  // Parses HTML (XSS risk!)

// Add to DOM
document.body.appendChild(div);

Modifying Elements

const element = document.getElementById('app');

// Content
element.textContent = 'New text';
element.innerHTML = '<p>HTML content</p>';

// Attributes
element.setAttribute('data-id', '123');
element.getAttribute('data-id');  // '123'
element.removeAttribute('data-id');
element.hasAttribute('data-id');  // false

// Classes
element.classList.add('active');
element.classList.remove('hidden');
element.classList.toggle('expanded');
element.classList.contains('active');  // true
element.classList.replace('old', 'new');

// Styles
element.style.color = 'red';
element.style.backgroundColor = 'blue';
element.style.fontSize = '16px';

// Better: Use CSS classes instead
element.classList.add('highlighted');

Inserting Elements

const parent = document.getElementById('container');
const child = document.createElement('div');

// Classic methods
parent.appendChild(child);          // Add to end
parent.insertBefore(child, refNode);  // Insert before reference
parent.removeChild(child);          // Remove child
parent.replaceChild(newChild, oldChild);

// Modern methods (more intuitive)
parent.append(child);               // Add to end (can add multiple)
parent.prepend(child);              // Add to beginning
child.before(newElement);           // Insert before child
child.after(newElement);            // Insert after child
child.replaceWith(newElement);      // Replace child
child.remove();                     // Remove from parent

// Example: Multiple insertions
parent.append(
    document.createElement('div'),
    'Plain text',
    document.createElement('span')
);

Document Fragments (Performance)

Best practice for bulk operations:

// ❌ Slower: 1000 separate insertions into the live document
for (let i = 0; i < 1000; i++) {
    const div = document.createElement('div');
    div.textContent = i;
    document.body.appendChild(div);  // Invalidates layout on every iteration
}

// ✅ Faster: build off-document, then insert once
const fragment = document.createDocumentFragment();
for (let i = 0; i < 1000; i++) {
    const div = document.createElement('div');
    div.textContent = i;
    fragment.appendChild(div);  // Not in the document yet, so no layout work
}
document.body.appendChild(fragment);  // One insertion into the live document

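Systems insight: A document fragment is not part of the rendered document, so modifying it cannot trigger layout recalculation (reflow) or repaint. Browsers usually defer reflow until the next frame or until script reads layout (for example offsetHeight), so repeated insertions into the live document hurt most when such reads are interleaved with them.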


The Event System

Event Flow: Capturing and Bubbling

DOM Standard (WHATWG), Events

Events propagate in three phases:

  1. Capturing phase: From window down to target

  2. Target phase: Event at the target element

  3. Bubbling phase: From target back up to window

<div id="outer">
  <div id="inner">
    <button id="btn">Click</button>
  </div>
</div>

Event flow when button clicked:

Capturing: window → document → html → body → outer → inner → btn
Target:    btn
Bubbling:  btn → inner → outer → body → html → document → window

Registering listeners:

const btn = document.getElementById('btn');
const inner = document.getElementById('inner');
const outer = document.getElementById('outer');

// Bubbling phase (default)
btn.addEventListener('click', (e) => {
    console.log('Button clicked');
});

inner.addEventListener('click', (e) => {
    console.log('Inner div clicked');
});

outer.addEventListener('click', (e) => {
    console.log('Outer div clicked');
});

// Click button logs:
// "Button clicked"
// "Inner div clicked"
// "Outer div clicked"

// Capturing phase (third argument = true)
outer.addEventListener('click', (e) => {
    console.log('Outer (capturing)');
}, true);

// Click button logs:
// "Outer (capturing)"  ← Capturing phase
// "Button clicked"     ← Target phase
// "Inner div clicked"  ← Bubbling phase
// "Outer div clicked"  ← Bubbling phase
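The ordering in these logs follows mechanically from the three phases. The toy model below computes it for a given propagation path; dispatchOrder is a hypothetical function, nodes are represented by plain strings, and real dispatch involves far more (stopPropagation, passive listeners, shadow DOM retargeting).

```javascript
// `path` runs from the outermost ancestor to the target;
// each listener is { node, capture, name }.
function dispatchOrder(path, listeners) {
    const target = path[path.length - 1];
    const ancestors = path.slice(0, -1);
    const fired = [];
    const run = (node, capture) => {
        for (const listener of listeners) {
            if (listener.node === node && listener.capture === capture) {
                fired.push(listener.name);
            }
        }
    };

    for (const node of ancestors) run(node, true);              // capturing phase
    run(target, true);                                          // target: capture listeners
    run(target, false);                                         // target: bubble listeners
    for (const node of [...ancestors].reverse()) run(node, false);  // bubbling phase
    return fired;
}

const fired = dispatchOrder(['outer', 'inner', 'btn'], [
    { node: 'btn', capture: false, name: 'Button clicked' },
    { node: 'inner', capture: false, name: 'Inner div clicked' },
    { node: 'outer', capture: false, name: 'Outer div clicked' },
    { node: 'outer', capture: true, name: 'Outer (capturing)' }
]);
console.log(fired);
// [ 'Outer (capturing)', 'Button clicked', 'Inner div clicked', 'Outer div clicked' ]
```

Registration order matters only among listeners on the same node in the same phase; across nodes, position on the path decides.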

Event Object

Properties and methods:

element.addEventListener('click', (event) => {
    // Target information
    event.target          // Element that triggered event
    event.currentTarget   // Element with listener attached
    
    // Event details
    event.type            // 'click'
    event.timeStamp       // High-resolution timestamp
    event.isTrusted       // true if generated by the browser, false if dispatched by script
    
    // Mouse events
    event.clientX         // X relative to viewport
    event.clientY         // Y relative to viewport
    event.pageX           // X relative to document
    event.pageY           // Y relative to document
    event.screenX         // X relative to screen
    event.screenY         // Y relative to screen
    event.button          // Which mouse button (0=left, 1=middle, 2=right)
    
    // Keyboard events
    event.key             // Key name ('Enter', 'a', 'Shift')
    event.code            // Physical key ('KeyA', 'Enter')
    event.keyCode         // Deprecated numeric code
    event.ctrlKey         // Ctrl pressed?
    event.shiftKey        // Shift pressed?
    event.altKey          // Alt pressed?
    event.metaKey         // Meta/Cmd pressed?
    
    // Control flow
    event.preventDefault();   // Prevent default action
    event.stopPropagation();  // Stop bubbling/capturing
    event.stopImmediatePropagation();  // Stop propagation and skip remaining listeners on this element
});

Example: target vs. currentTarget:

outer.addEventListener('click', (e) => {
    console.log('target:', e.target.id);          // 'btn' (what was clicked)
    console.log('currentTarget:', e.currentTarget.id);  // 'outer' (listener owner)
});

// Click button logs:
// target: btn
// currentTarget: outer

Event Delegation

Pattern: Attach listener to parent, handle events from children.

// ❌ Inefficient: Multiple listeners
document.querySelectorAll('.item').forEach(item => {
    item.addEventListener('click', handleClick);
});

// ✅ Efficient: Single listener on parent
document.getElementById('list').addEventListener('click', (e) => {
    if (e.target.classList.contains('item')) {
        handleClick(e);
    }
});

Benefits:

  • Fewer event listeners → less memory

  • Works with dynamically added elements

  • Better performance for large lists

Advanced delegation (with closest):

document.getElementById('list').addEventListener('click', (e) => {
    // Find nearest ancestor with .item class
    const item = e.target.closest('.item');
    if (item) {
        console.log('Item clicked:', item.dataset.id);
    }
});
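The closest-based pattern can be packaged as a reusable helper. This is a sketch under stated assumptions: delegate is a hypothetical function, and it uses only standard DOM calls (closest, contains, addEventListener, removeEventListener). The extra contains check guards against closest() matching an ancestor that lies outside the delegating root.

```javascript
// Attaches one listener to `root` and calls `handler` for descendants
// matching `selector`. Returns a function that removes the listener.
function delegate(root, type, selector, handler) {
    const listener = (event) => {
        const match = event.target.closest(selector);
        // closest() can climb past root, so confirm the match is inside it
        if (match && root.contains(match)) {
            handler(event, match);
        }
    };
    root.addEventListener(type, listener);
    return () => root.removeEventListener(type, listener);
}

// Browser usage (assumes an element with id "list" containing .item children):
// const stop = delegate(document.getElementById('list'), 'click', '.item',
//     (event, item) => console.log('Item clicked:', item.dataset.id));
// stop();  // later, to unsubscribe
```

Returning the unsubscribe function keeps the listener reference private, so callers cannot accidentally pass a different function to removeEventListener.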

Preventing Default Behavior

// Prevent form submission
form.addEventListener('submit', (e) => {
    e.preventDefault();
    // Handle with fetch instead
});

// Prevent link navigation
link.addEventListener('click', (e) => {
    e.preventDefault();
    // Custom navigation logic
});

// Prevent context menu
element.addEventListener('contextmenu', (e) => {
    e.preventDefault();
    // Show custom menu
});

Custom Events

// Create custom event
const customEvent = new CustomEvent('userLogin', {
    detail: { username: 'alice', timestamp: Date.now() },
    bubbles: true,
    cancelable: true
});

// Dispatch event
element.dispatchEvent(customEvent);

// Listen for custom event
element.addEventListener('userLogin', (e) => {
    console.log('User logged in:', e.detail.username);
});
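The same dispatch mechanics can be exercised without a DOM. The following is a minimal sketch, assuming a runtime that provides the standard EventTarget and Event globals (browsers and recent Node.js). LoginEvent, bus, and the 'blocked' username are illustrative names, not part of any API. It shows that dispatchEvent() runs listeners synchronously and returns false when a cancelable event is canceled:

```javascript
// Hypothetical event class carrying a payload, similar to CustomEvent's `detail`.
class LoginEvent extends Event {
    constructor(detail) {
        super('userLogin', { cancelable: true });
        this.detail = detail;
    }
}

const bus = new EventTarget();
const seen = [];

// A listener can veto the event by calling preventDefault().
bus.addEventListener('userLogin', (e) => {
    seen.push(e.detail.username);
    if (e.detail.username === 'blocked') e.preventDefault();
});

// dispatchEvent() returns false only if a listener canceled the event.
const allowed = bus.dispatchEvent(new LoginEvent({ username: 'alice' }));
const vetoed = bus.dispatchEvent(new LoginEvent({ username: 'blocked' }));

console.log(allowed, vetoed, seen);  // true false [ 'alice', 'blocked' ]
```

Checking the return value lets the dispatching code skip its default action, which is how cancelable built-in events behave as well.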

Timers and Animation

setTimeout and setInterval

HTML Standard §8.6 (not ECMA-262)

// Execute once after delay
const timeoutId = setTimeout(() => {
    console.log('Executed after 1 second');
}, 1000);

// Cancel timer
clearTimeout(timeoutId);

// Execute repeatedly
const intervalId = setInterval(() => {
    console.log('Executed every 1 second');
}, 1000);

// Cancel interval
clearInterval(intervalId);

Important gotchas:

// ❌ Incorrect: 'this' context lost
class Timer {
    constructor() {
        this.count = 0;
    }
    start() {
        setInterval(function() {
            this.count++;  // 'this' is window, not Timer instance!
        }, 1000);
    }
}

// ✅ Correct: Use arrow function
class Timer {
    constructor() {
        this.count = 0;
    }
    start() {
        setInterval(() => {
            this.count++;  // 'this' is Timer instance
        }, 1000);
    }
}

Minimum delay:

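// HTML spec clamps the delay to a minimum of 4ms only once timers are
// nested more than 5 levels deep
setTimeout(() => {
    setTimeout(() => {
        // Nesting level 2: not clamped yet, so a 0ms request is honored
        // as closely as the browser's scheduler allows
    }, 0);
}, 0);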

requestAnimationFrame

Optimal for animations (runs once per display refresh, typically 60 FPS):

function animate() {
    // Update animation state
    element.style.left = `${position}px`;
    position += velocity;
    
    // Schedule next frame
    requestAnimationFrame(animate);
}

// Start animation
requestAnimationFrame(animate);

Advantages over setInterval:

  • Synchronized with screen refresh (60 Hz = ~16.67ms)

  • Pauses when tab inactive (saves battery)

  • Automatic throttling (prevents overload)

With timestamp:

let startTime = null;

function animate(timestamp) {
    if (!startTime) startTime = timestamp;
    const elapsed = timestamp - startTime;
    
    const progress = Math.min(elapsed / 1000, 1);  // 0 to 1 over 1 second
    element.style.left = `${progress * 100}px`;
    
    if (progress < 1) {
        requestAnimationFrame(animate);
    }
}

requestAnimationFrame(animate);
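The time-based progress calculation can be pulled out into a pure function, which makes it testable without a browser. A sketch under assumptions: easedProgress is a hypothetical helper, and the ease-out curve is just one common choice, not something requestAnimationFrame prescribes:

```javascript
// Maps a frame timestamp to eased progress in the range [0, 1].
function easedProgress(startTime, timestamp, durationMs) {
    const raw = (timestamp - startTime) / durationMs;
    const linear = Math.min(Math.max(raw, 0), 1);   // clamp to [0, 1]
    // Ease-out quadratic: fast start, slow finish.
    return 1 - (1 - linear) * (1 - linear);
}

// In a browser this would be called from the requestAnimationFrame callback:
//   element.style.left = `${easedProgress(startTime, timestamp, 1000) * 100}px`;
console.log(easedProgress(0, 500, 1000));   // 0.75
console.log(easedProgress(0, 2000, 1000));  // 1 (clamped after the animation ends)
```

Because progress depends only on elapsed time, the animation covers the same distance per second regardless of the display's refresh rate.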

Cancel animation:

const animationId = requestAnimationFrame(animate);
cancelAnimationFrame(animationId);

Web Storage APIs

localStorage and sessionStorage

HTML Standard §12.2

Both provide key-value storage (strings only):

// localStorage: Persistent across sessions
localStorage.setItem('username', 'alice');
localStorage.getItem('username');  // 'alice'
localStorage.removeItem('username');
localStorage.clear();  // Remove all
localStorage.length;   // Number of items

// sessionStorage: Cleared when tab closes
sessionStorage.setItem('tempToken', 'xyz123');

Storing objects (requires serialization):

// ❌ Incorrect: Stores "[object Object]"
localStorage.setItem('user', { name: 'alice' });

// ✅ Correct: Serialize to JSON
const user = { name: 'alice', age: 30 };
localStorage.setItem('user', JSON.stringify(user));

// Retrieve and parse
const retrieved = JSON.parse(localStorage.getItem('user'));
console.log(retrieved.name);  // 'alice'
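Stored text can be missing or corrupt, and JSON.parse throws on invalid input, so reads are usually wrapped. A minimal sketch, assuming hypothetical helpers readJSON and writeJSON that accept any object with the Storage getItem/setItem methods. createMemoryStorage is a stand-in for localStorage so the example also runs outside a browser:

```javascript
// Reads and parses a JSON value, falling back when the key is missing or corrupt.
function readJSON(storage, key, fallback = null) {
    const raw = storage.getItem(key);   // null when the key does not exist
    if (raw === null) return fallback;
    try {
        return JSON.parse(raw);
    } catch {
        return fallback;                // stored text was not valid JSON
    }
}

function writeJSON(storage, key, value) {
    storage.setItem(key, JSON.stringify(value));
}

// Stand-in for localStorage (assumption): values are coerced to strings,
// as the real Storage API does.
function createMemoryStorage() {
    const map = new Map();
    return {
        getItem: (k) => (map.has(k) ? map.get(k) : null),
        setItem: (k, v) => { map.set(k, String(v)); },
        removeItem: (k) => { map.delete(k); },
    };
}

const storage = createMemoryStorage();
writeJSON(storage, 'user', { name: 'alice', age: 30 });
console.log(readJSON(storage, 'user').name);      // 'alice'
console.log(readJSON(storage, 'missing', {}));    // {}
```

In a browser, passing localStorage or sessionStorage in place of the stand-in should work unchanged.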

Quota: Typically 5-10 MB per origin (varies by browser).

Storage events (cross-tab communication):

// Listen for storage changes from other tabs
window.addEventListener('storage', (e) => {
    console.log('Key:', e.key);
    console.log('Old value:', e.oldValue);
    console.log('New value:', e.newValue);
    console.log('URL:', e.url);
});

// In another tab
localStorage.setItem('sharedData', 'value');  // Triggers event in first tab

IndexedDB: Client-Side Database

Indexed Database API (W3C)

Key-value database with indexes and transactions:

// Open database
const request = indexedDB.open('MyDatabase', 1);

request.onupgradeneeded = (event) => {
    const db = event.target.result;
    
    // Create object store
    const store = db.createObjectStore('users', { keyPath: 'id' });
    
    // Create indexes
    store.createIndex('name', 'name', { unique: false });
    store.createIndex('email', 'email', { unique: true });
};

request.onsuccess = (event) => {
    const db = event.target.result;
    
    // Add data
    const transaction = db.transaction(['users'], 'readwrite');
    const store = transaction.objectStore('users');
    store.add({ id: 1, name: 'Alice', email: 'alice@example.com' });
    
    // Query by key
    const getRequest = store.get(1);
    getRequest.onsuccess = () => {
        console.log(getRequest.result);
    };
    
    // Query by index
    const index = store.index('email');
    const searchRequest = index.get('alice@example.com');
};

Modern Promise-based wrapper (idb library):

import { openDB } from 'idb';

const db = await openDB('MyDatabase', 1, {
    upgrade(db) {
        const store = db.createObjectStore('users', { keyPath: 'id' });
        store.createIndex('name', 'name');
    }
});

// Add
await db.add('users', { id: 1, name: 'Alice' });

// Get
const user = await db.get('users', 1);

// Query by index
const aliceRecords = await db.getAllFromIndex('users', 'name', 'Alice');

Use cases:

  • Offline applications

  • Large datasets (GBs, not just MBs)

  • Structured data with complex queries


Network: fetch API

Making HTTP Requests

Fetch Standard (WHATWG)

// Simple GET
const response = await fetch('https://api.example.com/users');
const data = await response.json();

// With options
const postResponse = await fetch('https://api.example.com/users', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer token123'
    },
    body: JSON.stringify({ name: 'Alice' })
});

// Check response (fetch rejects only on network failure, not on HTTP error status)
if (!postResponse.ok) {
    throw new Error(`HTTP error: ${postResponse.status}`);
}

const created = await postResponse.json();

Response Methods

const response = await fetch(url);

// Parse as JSON
const json = await response.json();

// Get as text
const text = await response.text();

// Get as blob (binary)
const blob = await response.blob();

// Get as ArrayBuffer
const buffer = await response.arrayBuffer();

// Get as FormData
const formData = await response.formData();

Request Configuration

fetch(url, {
    method: 'POST',           // GET, POST, PUT, DELETE, PATCH
    headers: new Headers({    // or plain object
        'Content-Type': 'application/json'
    }),
    body: JSON.stringify(data),  // string, FormData, Blob, ArrayBuffer
    mode: 'cors',             // cors, no-cors, same-origin
    credentials: 'include',   // include, same-origin, omit
    cache: 'default',         // default, no-cache, reload, force-cache
    redirect: 'follow',       // follow, error, manual
    referrer: 'about:client', // same-origin URL, 'about:client', or '' (no referrer)
    signal: abortController.signal  // Abort signal
});

Aborting Requests

const controller = new AbortController();

fetch(url, { signal: controller.signal })
    .then(response => response.json())
    .catch(err => {
        if (err.name === 'AbortError') {
            console.log('Request aborted');
        }
    });

// Abort after 5 seconds
setTimeout(() => controller.abort(), 5000);

CORS: Cross-Origin Resource Sharing

Same-Origin Policy restricts cross-origin requests:

// ❌ Response blocked by default when the server sends no CORS headers
fetch('https://other-domain.com/api/data')
    .catch(err => {
        // CORS error: No 'Access-Control-Allow-Origin' header
    });

// ✅ Allowed if server sends proper headers
// Server must respond with:
// Access-Control-Allow-Origin: https://your-domain.com
// Access-Control-Allow-Methods: GET, POST
// Access-Control-Allow-Headers: Content-Type

Preflight request for non-simple requests:

Browser sends OPTIONS request:

Origin: https://your-domain.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Content-Type

Server responds:

Access-Control-Allow-Origin: https://your-domain.com
Access-Control-Allow-Methods: POST
Access-Control-Allow-Headers: Content-Type

Then actual POST request is sent.
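Whether a request counts as "simple" can be approximated in a few lines. The sketch below is a simplified reading of the Fetch Standard's rules, not a complete implementation: it ignores details such as the Range header, header value length limits, and streaming bodies. needsPreflight and the constant names are hypothetical:

```javascript
// Simplified approximation of the "simple request" rules.
const SIMPLE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SIMPLE_HEADERS = new Set([
    'accept', 'accept-language', 'content-language', 'content-type',
]);
const SIMPLE_CONTENT_TYPES = new Set([
    'application/x-www-form-urlencoded',
    'multipart/form-data',
    'text/plain',
]);

function needsPreflight({ method = 'GET', headers = {} } = {}) {
    if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
    for (const [name, value] of Object.entries(headers)) {
        const lower = name.toLowerCase();
        if (!SIMPLE_HEADERS.has(lower)) return true;
        if (lower === 'content-type') {
            // Ignore parameters such as "; charset=utf-8".
            const mime = String(value).split(';')[0].trim().toLowerCase();
            if (!SIMPLE_CONTENT_TYPES.has(mime)) return true;
        }
    }
    return false;
}

console.log(needsPreflight({ method: 'GET' }));  // false
console.log(needsPreflight({
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
}));                                             // true
```

This explains why a JSON POST to another origin typically produces two requests in the network panel, while a plain form POST produces one.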


Browser APIs Overview

Location and Navigation

// Current URL
console.log(window.location.href);     // Full URL
console.log(window.location.protocol); // 'https:'
console.log(window.location.host);     // 'example.com:8080' (port omitted when it is the scheme default)
console.log(window.location.hostname); // 'example.com'
console.log(window.location.port);     // '8080' ('' for the default port, e.g. 443 on https)
console.log(window.location.pathname); // '/path/to/page'
console.log(window.location.search);   // '?key=value'
console.log(window.location.hash);     // '#section'

// Navigate
window.location.href = 'https://example.com';
window.location.assign('https://example.com');
window.location.replace('https://example.com');  // No history entry
window.location.reload();  // Refresh

// History API
history.pushState({ page: 1 }, 'Title', '/page1');
history.replaceState({ page: 2 }, 'Title', '/page2');
history.back();
history.forward();
history.go(-2);  // Go back 2 pages

Navigator

navigator.userAgent        // Browser identification string
navigator.language         // 'en-US'
navigator.languages        // ['en-US', 'en']
navigator.onLine           // Network connectivity
navigator.cookieEnabled    // Cookies allowed?
navigator.geolocation      // Geolocation API
navigator.mediaDevices     // Camera/microphone access
navigator.serviceWorker    // Service Worker registration

Console API

Not in ECMA-262 (but universally implemented):

console.log('Message', variable);
console.info('Info');
console.warn('Warning');
console.error('Error');

console.table([{ a: 1, b: 2 }, { a: 3, b: 4 }]);

console.group('Group');
console.log('Inside group');
console.groupEnd();

console.time('Timer');
// ... code ...
console.timeEnd('Timer');  // Logs elapsed time

console.assert(condition, 'Assertion failed');

console.trace();  // Stack trace

Performance APIs

performance.now()

High-resolution, monotonic timestamp in milliseconds (sub-millisecond precision, though browsers coarsen it to limit timing attacks):

const start = performance.now();
expensiveOperation();
const end = performance.now();
console.log(`Took ${end - start} milliseconds`);

// More precise than Date.now()
Date.now();          // ~1ms resolution
performance.now();   // ~0.001ms resolution (if high-resolution timing allowed)
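A small wrapper makes such measurements reusable. A sketch assuming a global performance object (browsers and recent Node.js). timeIt is a hypothetical helper name:

```javascript
// Runs fn once and reports how long it took, using the monotonic clock.
function timeIt(label, fn) {
    const start = performance.now();
    const result = fn();
    const durationMs = performance.now() - start;
    return { label, result, durationMs };
}

const report = timeIt('sum', () => {
    let total = 0;
    for (let i = 0; i < 1e6; i++) total += i;
    return total;
});

console.log(`${report.label}: ${report.durationMs.toFixed(3)} ms`);
```

Unlike Date.now(), performance.now() is not affected by system clock adjustments, so the measured duration is never negative.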

Performance Monitoring

// Navigation timing (performance.timing is deprecated; use the navigation entry)
const [nav] = performance.getEntriesByType('navigation');
console.log('DOM loaded:', nav.domContentLoadedEventEnd);  // ms since navigation start
console.log('Page loaded:', nav.loadEventEnd);

// Resource timing
const resources = performance.getEntriesByType('resource');
resources.forEach(resource => {
    console.log(`${resource.name}: ${resource.duration}ms`);
});

// Mark and measure
performance.mark('start-render');
renderComponents();
performance.mark('end-render');
performance.measure('render-duration', 'start-render', 'end-render');

const measure = performance.getEntriesByName('render-duration')[0];
console.log(`Rendering took ${measure.duration}ms`);

Summary

The browser environment extends JavaScript with:

Global objects:

  • window: Browser global object (HTML Standard)

  • globalThis: Universal global access (ECMA-262)

  • self: Worker-compatible global reference

DOM (Document Object Model):

  • Tree structure: Nodes with parent/child relationships

  • Querying: querySelector, getElementById, etc.

  • Manipulation: Creating, modifying, removing elements

  • Live vs. static collections

Event system:

  • Three phases: Capturing → Target → Bubbling

  • Event delegation: Efficient pattern for dynamic content

  • Custom events: Application-specific communication

Timers and animation:

  • setTimeout/setInterval: Basic timing (delay clamped to ≥4ms when deeply nested)

  • requestAnimationFrame: Synchronized with display refresh

Storage:

  • localStorage/sessionStorage: Simple key-value (5-10 MB)

  • IndexedDB: Structured database (GBs)

Networking:

  • fetch API: Modern HTTP requests

  • CORS: Cross-origin security mechanism

  • AbortController: Canceling requests

Other APIs:

  • Location/History: Navigation control

  • Navigator: Browser capabilities

  • Performance: High-resolution timing

Understanding the boundary between ECMA-262 and web standards is essential for:

  • Writing portable code (works in Node.js, Deno, browsers)

  • Debugging environment-specific issues

  • Leveraging platform-specific optimizations

The browser is a rich execution environment—JavaScript is just the language that coordinates it all.


Chapter 9: Node.js Runtime (Resident ECMA-262)

Node.js: JavaScript Beyond the Browser

What Node.js Is (and Isn’t)

Node.js is a JavaScript runtime built on Chrome’s V8 engine, designed for server-side execution. It is not:

  • A programming language (uses JavaScript/ECMA-262)

  • A framework (like Express or Nest.js)

  • A browser environment

Core architecture:

┌─────────────────────────────────────────┐
│       JavaScript Application Code       │
├─────────────────────────────────────────┤
│       Node.js APIs (C++ bindings)       │
│    fs, http, crypto, path, os, etc.     │
├─────────────────────────────────────────┤
│                V8 Engine                │
│     (ECMA-262 implementation + JIT)     │
├─────────────────────────────────────────┤
│                  libuv                  │
│  (Event loop, async I/O, thread pool)   │
├─────────────────────────────────────────┤
│            Operating System             │
└─────────────────────────────────────────┘

Key components:

  • V8: Executes JavaScript (same engine as Chrome)

  • libuv: Cross-platform asynchronous I/O library (written in C)

  • Native modules: C++ bindings to OS functionality

  • Node.js APIs: JavaScript APIs wrapping native functionality

Global Object: global vs. globalThis

Node.js specifics:

// Node.js global object (pre-ES2020)
global.setTimeout    // Available
global.Buffer        // Node.js-specific
global.process       // Node.js-specific

// Module-scoped in CommonJS (wrapper parameters, not properties of global)
__dirname            // In CommonJS modules only
__filename           // In CommonJS modules only

// ES2020 universal global
globalThis.setTimeout    // Works everywhere
globalThis === global    // true in Node.js

// Browser comparison
// Browser: globalThis === window === self
// Node.js: globalThis === global

Top-level this differences:

// CommonJS module (default in Node.js)
console.log(this);  // {} (empty object, module.exports)

// ES Module (.mjs or "type": "module")
console.log(this);  // undefined (same as browsers)

// Browser (non-module)
console.log(this);  // window

The Node.js Event Loop

Architecture: Single-Threaded with Thread Pool

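Important distinction: Node.js is single-threaded for JavaScript execution, but libuv uses a thread pool for file system, DNS lookup, crypto, and compression work. Network sockets use the operating system's non-blocking I/O instead of the pool.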

// JavaScript code runs on single thread
console.log('1');
setTimeout(() => console.log('2'), 0);
console.log('3');
// Output: 1, 3, 2

// But I/O operations use libuv's thread pool
const fs = require('fs');
fs.readFile('large-file.txt', (err, data) => {
    // This callback runs on main thread,
    // but file reading happened on thread pool
});

Event Loop Phases (Node.js-Specific)

Six phases in each iteration:

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

Phase descriptions:

  1. Timers: Execute setTimeout and setInterval callbacks

  2. Pending callbacks: Execute I/O callbacks deferred from previous iteration

  3. Idle, prepare: Internal use only

  4. Poll: Retrieve new I/O events; execute I/O callbacks (except close, timers, setImmediate)

  5. Check: Execute setImmediate callbacks

  6. Close callbacks: Execute close event callbacks (e.g., socket.on('close', ...))

After each callback (Node.js 11+; between phases in earlier versions): Process the process.nextTick() queue, then microtasks (Promises)

process.nextTick() vs. Microtasks vs. Macrotasks

Critical ordering:

console.log('1');

setTimeout(() => console.log('2'), 0);      // Macrotask (timers phase)

Promise.resolve().then(() => console.log('3'));  // Microtask

process.nextTick(() => console.log('4'));   // nextTick queue

console.log('5');

// Output: 1, 5, 4, 3, 2

Execution order:

  1. Synchronous code (1, 5)

  2. process.nextTick() queue (4)

  3. Microtask queue (Promises) (3)

  4. Macrotask queue (setTimeout) (2)
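The ordering above can be checked programmatically. In this sketch recordSchedulingOrder is a hypothetical helper. The work is started from inside a setImmediate callback, which is an assumption made so that the result does not depend on whether the surrounding file is loaded as CommonJS or as an ES module:

```javascript
// Records the order in which differently scheduled callbacks run.
function recordSchedulingOrder() {
    return new Promise((resolve) => {
        setImmediate(() => {
            const order = [];
            // Timer callback: runs in a later timers phase, so it finishes last.
            setTimeout(() => { order.push('timeout'); resolve(order); }, 0);
            // Microtask: runs after the nextTick queue has drained.
            Promise.resolve().then(() => order.push('promise'));
            // nextTick queue: runs right after the current callback returns.
            process.nextTick(() => order.push('nextTick'));
            order.push('sync');
        });
    });
}

recordSchedulingOrder().then((order) => console.log(order.join(', ')));
// sync, nextTick, promise, timeout
```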

Example showing all phases:

const fs = require('fs');

console.log('Start');

// Timers phase
setTimeout(() => console.log('setTimeout'), 0);

// Check phase
setImmediate(() => console.log('setImmediate'));

// Poll phase
fs.readFile(__filename, () => {
    console.log('readFile callback');
    
    setTimeout(() => console.log('setTimeout in readFile'), 0);
    setImmediate(() => console.log('setImmediate in readFile'));
    process.nextTick(() => console.log('nextTick in readFile'));
});

// nextTick queue
process.nextTick(() => console.log('nextTick'));

// Microtask queue
Promise.resolve().then(() => console.log('Promise'));

console.log('End');

// Output:
// Start
// End
// nextTick
// Promise
// setTimeout  (or setImmediate - order not guaranteed initially)
// setImmediate  (or setTimeout)
// readFile callback
// nextTick in readFile
// setImmediate in readFile
// setTimeout in readFile

Why setImmediate in I/O callback runs before setTimeout:

After an I/O callback (poll phase), the event loop immediately moves to the check phase (where setImmediate runs), then circles back to timers phase (where setTimeout runs).

setImmediate vs. setTimeout(..., 0)

Outside I/O cycle (order not guaranteed):

setTimeout(() => console.log('setTimeout'), 0);
setImmediate(() => console.log('setImmediate'));

// Output varies:
// Could be: setTimeout, setImmediate
// Could be: setImmediate, setTimeout

Inside I/O cycle (setImmediate always first):

fs.readFile(__filename, () => {
    setTimeout(() => console.log('setTimeout'), 0);
    setImmediate(() => console.log('setImmediate'));
});

// Output (guaranteed):
// setImmediate
// setTimeout

Systems insight: When event loop enters poll phase and finishes I/O callbacks, it checks for setImmediate callbacks before returning to timers.


Module Systems in Node.js

CommonJS: The Original Node.js Module System

Default in Node.js (files with .js extension without "type": "module"):

// math.js (CommonJS module)
function add(a, b) {
    return a + b;
}

function multiply(a, b) {
    return a * b;
}

// Export individual functions
exports.add = add;
exports.multiply = multiply;

// Or export object
module.exports = { add, multiply };

// Or export single function
module.exports = add;

Importing CommonJS modules:

// Import entire module
const math = require('./math');
console.log(math.add(2, 3));

// Destructure imports
const { add, multiply } = require('./math');
console.log(add(2, 3));

// Built-in modules
const fs = require('fs');
const path = require('path');
const http = require('http');

CommonJS characteristics:

// 1. Synchronous loading
const data = require('./data.json');  // Blocks until loaded

// 2. Cached after first load
const math1 = require('./math');
const math2 = require('./math');
console.log(math1 === math2);  // true (same object)

// 3. Dynamic imports possible
const moduleName = './math';
const math = require(moduleName);  // Works

// 4. Module wrapper function
// Node.js wraps every module in:
(function(exports, require, module, __filename, __dirname) {
    // Your module code here
});

// 5. Available variables
console.log(__filename);  // Absolute path to current file
console.log(__dirname);   // Absolute path to directory
console.log(module);      // Module object
console.log(exports);     // Reference to module.exports
console.log(require);     // Function to load modules
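The wrapper and the cache can be imitated in a few lines, which helps clarify where __filename, exports, and the caching behavior come from. This is a toy model under stated assumptions, not how Node.js actually loads files: createLoader and load are hypothetical names, the source is passed in as a string instead of being read from disk, and new Function stands in for Node's internal compilation step:

```javascript
const path = require('path');

// Hypothetical loader: evaluates CommonJS-style source text, with a cache.
function createLoader() {
    const cache = new Map();

    return function load(filename, source) {
        // Second and later loads return the cached exports object.
        if (cache.has(filename)) return cache.get(filename).exports;

        const module = { exports: {} };
        cache.set(filename, module);   // cached before running the module body

        // Same parameter list as the Node.js module wrapper.
        const wrapper = new Function(
            'exports', 'require', 'module', '__filename', '__dirname', source
        );
        // `this` inside the module body is module.exports.
        wrapper.call(module.exports, module.exports, require, module,
            filename, path.dirname(filename));

        return module.exports;
    };
}

const load = createLoader();
const source = 'exports.add = (a, b) => a + b; exports.file = __filename;';
const math1 = load('/virtual/math.js', source);
const math2 = load('/virtual/math.js', source);

console.log(math1.add(2, 3), math1 === math2);  // 5 true
```

The sketch also shows why reassigning exports alone has no effect: only module.exports is returned to the caller.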

ES Modules in Node.js

Enable ES modules:

Option 1: Use .mjs extension:

// math.mjs
export function add(a, b) {
    return a + b;
}

export function multiply(a, b) {
    return a * b;
}

// main.mjs
import { add, multiply } from './math.mjs';

Option 2: Set "type": "module" in package.json:

{
  "type": "module"
}

// Now .js files are ES modules
// math.js
export function add(a, b) {
    return a + b;
}

// main.js
import { add } from './math.js';  // Must include .js extension!

ES Module characteristics in Node.js:

// 1. Asynchronous loading (top-level await supported)
const data = await fetch('https://api.example.com/data');

// 2. Static imports (must be at top level)
import { add } from './math.js';  // ✅
if (condition) {
    import { add } from './math.js';  // ❌ Syntax error
}

// 3. Dynamic imports (returns Promise)
const modulePath = './math.js';
const math = await import(modulePath);  // ✅

// 4. No __filename, __dirname
// Use import.meta instead
console.log(import.meta.url);  // file:///path/to/module.js

// Get __dirname equivalent
import { fileURLToPath } from 'url';
import { dirname } from 'path';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// 5. File extensions required
import { add } from './math';     // ❌ Error
import { add } from './math.js';  // ✅ Required

Interoperability: CommonJS ↔︎ ES Modules

Importing CommonJS from ES Module:

// math.cjs (CommonJS)
module.exports = { add: (a, b) => a + b };

// main.mjs (ES Module)
import math from './math.cjs';  // Default import works
console.log(math.add(2, 3));

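// Named imports work only when Node.js can statically detect the export
// names in the CommonJS source; the default import is the reliable form
import { add } from './math.cjs';  // ⚠️ May fail with a SyntaxError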

Importing ES Module from CommonJS:

// math.mjs (ES Module)
export function add(a, b) {
    return a + b;
}

// main.cjs (CommonJS)
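// require() of an ES module throws ERR_REQUIRE_ESM in older Node.js versions
// (recent releases allow it for modules without top-level await)
const math = require('./math.mjs');  // ❌ Error in older Node.js versions

// Portable alternative: dynamic import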
(async () => {
    const math = await import('./math.mjs');
    console.log(math.add(2, 3));
})();

Package exports field (best practice):

{
  "name": "my-package",
  "exports": {
    ".": {
      "import": "./esm/index.js",
      "require": "./cjs/index.js"
    }
  }
}

Built-in Node.js Modules

File System: fs and fs/promises

Callback-based API (traditional):

const fs = require('fs');

// Read file
fs.readFile('file.txt', 'utf8', (err, data) => {
    if (err) {
        console.error(err);
        return;
    }
    console.log(data);
});

// Write file
fs.writeFile('output.txt', 'Hello', (err) => {
    if (err) throw err;
    console.log('File written');
});

// Check if file exists
fs.access('file.txt', fs.constants.F_OK, (err) => {
    console.log(err ? 'Does not exist' : 'Exists');
});

Promise-based API (modern, cleaner):

const fs = require('fs').promises;
// or: import fs from 'fs/promises';
// (the await calls below assume an ES module or an enclosing async function)

// Read file
try {
    const data = await fs.readFile('file.txt', 'utf8');
    console.log(data);
} catch (err) {
    console.error(err);
}

// Write file
await fs.writeFile('output.txt', 'Hello');

// Read directory
const files = await fs.readdir('.');
console.log(files);

// File stats
const stats = await fs.stat('file.txt');
console.log(stats.size);
console.log(stats.isFile());
console.log(stats.isDirectory());

// Create directory
await fs.mkdir('new-dir', { recursive: true });

// Delete file
await fs.unlink('file.txt');

// Rename/move file
await fs.rename('old.txt', 'new.txt');

Synchronous API (blocks event loop, use sparingly):

const fs = require('fs');

// Read file synchronously
const data = fs.readFileSync('file.txt', 'utf8');

// Write file synchronously
fs.writeFileSync('output.txt', 'Hello');

// Use case: Loading config at startup
const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));

Streams (for large files):

const fs = require('fs');

// Read stream
const readStream = fs.createReadStream('large-file.txt', 'utf8');
readStream.on('data', (chunk) => {
    console.log('Chunk:', chunk.length);
});
readStream.on('end', () => {
    console.log('Done reading');
});

// Write stream
const writeStream = fs.createWriteStream('output.txt');
writeStream.write('Hello\n');
writeStream.write('World\n');
writeStream.end();

// Pipe (copy file efficiently)
fs.createReadStream('input.txt').pipe(fs.createWriteStream('output.txt'));

Path: Cross-Platform Path Handling

const path = require('path');

// Join paths (cross-platform)
const filePath = path.join(__dirname, 'data', 'file.txt');
// Windows: C:\project\data\file.txt
// Unix: /project/data/file.txt

// Resolve to absolute path
const absolute = path.resolve('data', 'file.txt');
// /current/working/directory/data/file.txt

// Get directory name
path.dirname('/path/to/file.txt');  // '/path/to'

// Get base name
path.basename('/path/to/file.txt');  // 'file.txt'
path.basename('/path/to/file.txt', '.txt');  // 'file'

// Get extension
path.extname('file.txt');  // '.txt'

// Parse path
const parsed = path.parse('/path/to/file.txt');
// {
//   root: '/',
//   dir: '/path/to',
//   base: 'file.txt',
//   ext: '.txt',
//   name: 'file'
// }

// Build path from object
path.format(parsed);  // '/path/to/file.txt'

// Normalize path
path.normalize('/path//to/../file.txt');  // '/path/file.txt'

// Platform-specific separator
path.sep;  // '/' on Unix, '\' on Windows
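A frequent use of resolve and relative together is confining user-supplied paths to a base directory. A minimal sketch: resolveInside is a hypothetical helper, and the impl parameter is an assumption added so that path.posix or path.win32 can be selected explicitly instead of the host platform's rules:

```javascript
const path = require('path');

// Resolves `userPath` under `baseDir`, refusing results that escape baseDir.
function resolveInside(baseDir, userPath, impl = path) {
    const resolved = impl.resolve(baseDir, userPath);
    const relative = impl.relative(baseDir, resolved);
    const escapes =
        relative === '..' ||
        relative.startsWith('..' + impl.sep) ||
        impl.isAbsolute(relative);   // e.g. a different drive on Windows
    if (escapes) {
        throw new Error(`Path escapes base directory: ${userPath}`);
    }
    return resolved;
}

console.log(resolveInside('/srv/data', 'reports/q1.txt', path.posix));
// '/srv/data/reports/q1.txt'

try {
    resolveInside('/srv/data', '../etc/passwd', path.posix);
} catch (err) {
    console.log(err.message);  // 'Path escapes base directory: ../etc/passwd'
}
```

This check works on path strings only; it does not follow symbolic links, so it is a first line of defense, not a complete sandbox.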

Process: Runtime Information and Control

// Command-line arguments
console.log(process.argv);
// ['/path/to/node', '/path/to/script.js', 'arg1', 'arg2']

// Environment variables
console.log(process.env.NODE_ENV);
console.log(process.env.PATH);

// Set environment variable
process.env.MY_VAR = 'value';

// Current working directory
console.log(process.cwd());

// Change directory
process.chdir('/new/directory');

// Platform information
console.log(process.platform);  // 'linux', 'darwin', 'win32'
console.log(process.arch);      // 'x64', 'arm64'

// Process ID
console.log(process.pid);

// Exit process
process.exit(0);  // Success
process.exit(1);  // Error

// Exit handlers
process.on('exit', (code) => {
    console.log(`Exiting with code ${code}`);
});

// Uncaught exception handler
process.on('uncaughtException', (err) => {
    console.error('Uncaught exception:', err);
    process.exit(1);
});

// Unhandled promise rejection
process.on('unhandledRejection', (reason, promise) => {
    console.error('Unhandled rejection:', reason);
});

// Memory usage
console.log(process.memoryUsage());
// {
//   rss: 4935680,        // Resident set size
//   heapTotal: 1826816,  // Total heap size
//   heapUsed: 650472,    // Used heap size
//   external: 49879      // C++ objects bound to JS
// }

// CPU usage (values in microseconds)
console.log(process.cpuUsage());
// { user: 38579, system: 6986 }

// Uptime (seconds)
console.log(process.uptime());

HTTP: Creating Servers

Basic HTTP server:

const http = require('http');

const server = http.createServer((req, res) => {
    console.log(`${req.method} ${req.url}`);
    
    // Set response headers
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    
    // Send response
    res.end('Hello World\n');
});

server.listen(3000, () => {
    console.log('Server running at http://localhost:3000/');
});

JSON API server:

const http = require('http');

const server = http.createServer((req, res) => {
    if (req.url === '/api/users' && req.method === 'GET') {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ users: ['Alice', 'Bob'] }));
    } else if (req.url === '/api/users' && req.method === 'POST') {
        let body = '';
        req.on('data', chunk => {
            body += chunk.toString();
        });
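        req.on('end', () => {
            let data;
            try {
                data = JSON.parse(body);
            } catch {
                // Malformed JSON would otherwise throw and crash the process
                res.writeHead(400, { 'Content-Type': 'application/json' });
                res.end(JSON.stringify({ error: 'Invalid JSON' }));
                return;
            }
            res.writeHead(201, { 'Content-Type': 'application/json' });
            res.end(JSON.stringify({ created: data }));
        });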
    } else {
        res.writeHead(404);
        res.end('Not Found');
    }
});

server.listen(3000);

Making HTTP requests:

const http = require('http');

http.get('http://api.example.com/data', (res) => {
    let data = '';
    
    res.on('data', (chunk) => {
        data += chunk;
    });
    
    res.on('end', () => {
        console.log(JSON.parse(data));
    });
}).on('error', (err) => {
    console.error(err);
});

Crypto: Hashing and Encryption

const crypto = require('crypto');

// Hash (SHA-256)
const hash = crypto.createHash('sha256')
    .update('password')
    .digest('hex');
console.log(hash);

// HMAC (keyed hash)
const hmac = crypto.createHmac('sha256', 'secret-key')
    .update('message')
    .digest('hex');

// Random bytes
const randomBytes = crypto.randomBytes(16).toString('hex');

// Password hashing (with salt)
const password = 'user-password';
const salt = crypto.randomBytes(16).toString('hex');
const passwordHash = crypto.pbkdf2Sync(password, salt, 100000, 64, 'sha512').toString('hex');

// Encryption (AES-256-GCM)
const algorithm = 'aes-256-gcm';
const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(12);  // 96-bit IV is the recommended size for GCM; never reuse one with the same key

const cipher = crypto.createCipheriv(algorithm, key, iv);
let encrypted = cipher.update('secret message', 'utf8', 'hex');
encrypted += cipher.final('hex');
const authTag = cipher.getAuthTag();

// Decryption
const decipher = crypto.createDecipheriv(algorithm, key, iv);
decipher.setAuthTag(authTag);
let decrypted = decipher.update(encrypted, 'hex', 'utf8');
decrypted += decipher.final('utf8');
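Hashing a password is only half of the workflow; verification has to recompute the hash with the stored salt and compare the results. A sketch with the same PBKDF2 parameters as above. hashPassword and verifyPassword are hypothetical helper names, and the iteration count is carried over from the example, not a recommendation:

```javascript
const crypto = require('crypto');

const ITERATIONS = 100000;
const KEY_LENGTH = 64;
const DIGEST = 'sha512';

// Returns the values to store alongside the user record.
function hashPassword(password) {
    const salt = crypto.randomBytes(16).toString('hex');
    const hash = crypto
        .pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST)
        .toString('hex');
    return { salt, hash };
}

// Recomputes the hash and compares it without leaking timing information.
function verifyPassword(password, salt, expectedHash) {
    const actual = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
    const expected = Buffer.from(expectedHash, 'hex');
    // timingSafeEqual throws on length mismatch, so check lengths first.
    return actual.length === expected.length &&
        crypto.timingSafeEqual(actual, expected);
}

const record = hashPassword('user-password');
console.log(verifyPassword('user-password', record.salt, record.hash));  // true
console.log(verifyPassword('wrong-guess', record.salt, record.hash));    // false
```

In a server, the asynchronous crypto.pbkdf2 would usually be preferable, since the synchronous variant blocks the event loop while it runs.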

URL: Parsing and Formatting URLs

const { URL } = require('url');

// Parse URL
const myURL = new URL('https://user:pass@example.com:8080/path?query=value#hash');

console.log(myURL.protocol);  // 'https:'
console.log(myURL.hostname);  // 'example.com'
console.log(myURL.port);      // '8080'
console.log(myURL.pathname);  // '/path'
console.log(myURL.search);    // '?query=value'
console.log(myURL.hash);      // '#hash'
console.log(myURL.username);  // 'user'
console.log(myURL.password);  // 'pass'

// Query parameters
console.log(myURL.searchParams.get('query'));  // 'value'
myURL.searchParams.append('key', 'new');
myURL.searchParams.delete('query');

// Convert back to string
console.log(myURL.href);
console.log(myURL.toString());
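Two behaviors of the URL class are easy to miss: a port equal to the scheme's default is dropped during parsing, and relative references resolve against a base URL. A short sketch with placeholder example.com addresses:

```javascript
const { URL } = require('url');

const explicit = new URL('https://example.com:8080/path');
const implicit = new URL('https://example.com:443/path');

console.log(explicit.host, explicit.port);  // 'example.com:8080' '8080'
console.log(implicit.host, implicit.port === '');  // 'example.com' true

// Relative references resolve against the second argument.
const resolved = new URL('../other?x=1', 'https://example.com/a/b/c');
console.log(resolved.href);  // 'https://example.com/a/other?x=1'
```

The same rules apply to window.location in browsers, because both follow the WHATWG URL Standard.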

Buffer: Binary Data Handling

Creating Buffers

Node.js-specific (not in ECMA-262):

// Create buffer from string
const buf1 = Buffer.from('Hello');
console.log(buf1);  // <Buffer 48 65 6c 6c 6f>

// Create buffer from array
const buf2 = Buffer.from([72, 101, 108, 108, 111]);

// Create buffer with a given size
const buf3 = Buffer.alloc(10);        // Zero-filled
const buf4 = Buffer.allocUnsafe(10);  // Uninitialized: may contain old data (faster)

// Create from hexadecimal
const buf5 = Buffer.from('48656c6c6f', 'hex');

// Create from base64
const buf6 = Buffer.from('SGVsbG8=', 'base64');

Reading and Writing Buffers

const buf = Buffer.alloc(12);  // 12 bytes: the 32-bit write below ends at index 11

// Write string
buf.write('Hello', 0, 'utf8');

// Write integers
buf.writeUInt8(255, 5);       // 1 byte at index 5
buf.writeUInt16BE(1000, 6);   // 2 bytes, big-endian
buf.writeUInt32LE(1000000, 8);  // 4 bytes, little-endian

// Read integers
const byte = buf.readUInt8(5);
const short = buf.readUInt16BE(6);
const int = buf.readUInt32LE(8);

// Read string
const str = buf.toString('utf8', 0, 5);  // 'Hello'
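The BE and LE suffixes determine byte order in memory, which matters when parsing binary formats or network protocols. A small sketch that makes the layout visible. The name layout is illustrative:

```javascript
const layout = Buffer.alloc(6);

layout.writeUInt16BE(1000, 0);     // 1000 = 0x03E8, most significant byte first
layout.writeUInt32LE(1000000, 2);  // 1000000 = 0x000F4240, least significant byte first

console.log(layout.toString('hex'));  // '03e840420f00'

// Reading with the wrong byte order silently yields a different number.
console.log(layout.readUInt16BE(0));  // 1000
console.log(layout.readUInt16LE(0));  // 59395 (0xE803)
```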

Buffer Encoding

const buf = Buffer.from('Hello');

// Various encodings
buf.toString('utf8');    // 'Hello'
buf.toString('hex');     // '48656c6c6f'
buf.toString('base64');  // 'SGVsbG8='
buf.toString('binary');  // Legacy alias for 'latin1'

// Supported encodings:
// 'utf8', 'utf16le', 'latin1', 'base64', 'base64url', 'hex', 'ascii', 'binary'

Buffer vs. TypedArray

Similarity: Both provide views over binary data

// Buffer (Node.js-specific)
const buf = Buffer.from([1, 2, 3, 4]);

// TypedArray (standard JavaScript)
const arr = new Uint8Array([1, 2, 3, 4]);

// Convert Buffer to TypedArray
const typedArray = new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);

// Convert TypedArray to Buffer
const buffer = Buffer.from(arr.buffer, arr.byteOffset, arr.byteLength);

Difference: Buffer is more convenient for Node.js I/O operations:

const fs = require('fs');

// Read file as Buffer
const buf = fs.readFileSync('file.bin');
console.log(buf instanceof Buffer);  // true

// Write Buffer to file
fs.writeFileSync('output.bin', buf);
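
One detail worth knowing when converting between the two: passing an ArrayBuffer to Buffer.from() creates a view over the same memory, while passing the typed array itself copies the bytes. A short sketch of the difference (variable names are illustrative):

```javascript
const bytes = new Uint8Array([1, 2, 3, 4]);

// Shares memory: a Buffer view over the same underlying ArrayBuffer
const view = Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);

// Copies: Buffer.from(typedArray) allocates new storage
const copy = Buffer.from(bytes);

bytes[0] = 99;

console.log(view[0]);  // 99 (same memory)
console.log(copy[0]);  // 1  (independent copy)

// Buffer is a subclass of Uint8Array
console.log(view instanceof Uint8Array);  // true
```

Choosing the view avoids a copy but means later writes through either object are visible in both.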

Streams: Efficient Data Processing

Stream Types

Four types:

  1. Readable: Source of data (e.g., fs.createReadStream)

  2. Writable: Destination (e.g., fs.createWriteStream)

  3. Duplex: Both readable and writable (e.g., TCP socket)

  4. Transform: Duplex that modifies data (e.g., compression)

Readable Streams

const fs = require('fs');

const readable = fs.createReadStream('large-file.txt', {
    encoding: 'utf8',
    highWaterMark: 16 * 1024  // 16 KB chunks
});

// Event-based consumption
readable.on('data', (chunk) => {
    console.log(`Received ${chunk.length} characters`);  // A string, since encoding is set
});

readable.on('end', () => {
    console.log('No more data');
});

readable.on('error', (err) => {
    console.error(err);
});

// Pause and resume
readable.pause();
setTimeout(() => readable.resume(), 1000);

Async iteration (modern approach):

const fs = require('fs');

async function processFile() {
    const readable = fs.createReadStream('file.txt', 'utf8');
    
    for await (const chunk of readable) {
        console.log(chunk);
    }
}
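
Because async iteration works on any readable stream, processing logic can be written once and exercised against in-memory data. The sketch below counts newlines; countNewlines is a hypothetical helper, and Readable.from() stands in for a file stream:

```javascript
const { Readable } = require('stream');

// Counts newline characters across all chunks of a readable stream.
async function countNewlines(readable) {
    let count = 0;
    for await (const chunk of readable) {
        // Chunks are strings when an encoding is set, Buffers otherwise
        const text = typeof chunk === 'string' ? chunk : chunk.toString('utf8');
        for (const ch of text) {
            if (ch === '\n') count++;
        }
    }
    return count;
}

// Readable.from() builds a stream from any iterable
countNewlines(Readable.from(['a\nb\n', 'c\n'])).then((n) => {
    console.log(n);  // 3
});
```

A stream error surfaces as a rejected promise here, so an ordinary try/catch around the loop (or .catch() on the call) replaces the separate 'error' listener.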

Writable Streams

const fs = require('fs');

const writable = fs.createWriteStream('output.txt');

writable.write('Hello\n');
writable.write('World\n');
writable.end('Done\n');  // Closes stream

writable.on('finish', () => {
    console.log('All writes completed');
});

writable.on('error', (err) => {
    console.error(err);
});

Backpressure handling:

const fs = require('fs');

const writable = fs.createWriteStream('output.txt');

function writeMillionTimes(writer, data, callback) {
    let i = 1000000;
    write();
    
    function write() {
        let ok = true;
        while (i > 0 && ok) {
            i--;
            if (i === 0) {
                writer.write(data, callback);
            } else {
                // Returns false if internal buffer is full
                ok = writer.write(data);
            }
        }
        if (i > 0) {
            // Wait for 'drain' event before continuing
            writer.once('drain', write);
        }
    }
}

writeMillionTimes(writable, 'Hello\n', () => {
    console.log('Done');
});
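
The same drain-waiting logic can be expressed with async/await using events.once(). This is a sketch rather than a drop-in replacement; writeAll and the in-memory sink are illustrative names, and the tiny highWaterMark exists only to force backpressure:

```javascript
const { Writable } = require('stream');
const { once } = require('events');

// Writes every chunk, pausing whenever write() reports a full buffer.
async function writeAll(writer, chunks) {
    for (const chunk of chunks) {
        if (!writer.write(chunk)) {
            await once(writer, 'drain');
        }
    }
    writer.end();
    await once(writer, 'finish');
}

// In-memory sink with a 4-byte buffer so backpressure triggers immediately
const received = [];
const sink = new Writable({
    highWaterMark: 4,
    write(chunk, encoding, callback) {
        received.push(chunk.toString());
        setImmediate(callback);  // Simulate a slow destination
    }
});

const done = writeAll(sink, ['Hello\n', 'World\n', 'Done\n']).then(() => {
    console.log(received.join(''));
});
```

events.once() also rejects if the stream emits 'error' while waiting, so failures propagate to the caller.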

Piping Streams

pipe() connects a readable stream to a writable one and manages backpressure automatically:

const fs = require('fs');
const zlib = require('zlib');

// Copy file
fs.createReadStream('input.txt')
  .pipe(fs.createWriteStream('output.txt'));

// Compress file
fs.createReadStream('input.txt')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('input.txt.gz'));

// Chain multiple transforms
// (transform() stands in for any function that returns a Transform stream)
fs.createReadStream('input.txt.gz')
  .pipe(zlib.createGunzip())
  .pipe(transform())
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('output.txt.gz'));

Pipeline (better error handling: an error in any stage reaches the callback, and all streams are destroyed):

const { pipeline } = require('stream');
const fs = require('fs');
const zlib = require('zlib');

pipeline(
    fs.createReadStream('input.txt'),
    zlib.createGzip(),
    fs.createWriteStream('input.txt.gz'),
    (err) => {
        if (err) {
            console.error('Pipeline failed:', err);
        } else {
            console.log('Pipeline succeeded');
        }
    }
);
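
Node.js also ships a promise-returning pipeline in the stream/promises module (Node.js 15+). A minimal sketch that compresses and decompresses a string entirely in memory; gzipRoundTrip is a hypothetical helper used only to demonstrate the call:

```javascript
const { pipeline } = require('stream/promises');
const { Readable, Writable } = require('stream');
const zlib = require('zlib');

// Pushes text through gzip and gunzip, then returns the recovered string.
async function gzipRoundTrip(text) {
    const chunks = [];
    const sink = new Writable({
        write(chunk, encoding, callback) {
            chunks.push(chunk);
            callback();
        }
    });

    // Rejects if any stage fails, after destroying every stream in the chain
    await pipeline(
        Readable.from([text]),
        zlib.createGzip(),
        zlib.createGunzip(),
        sink
    );
    return Buffer.concat(chunks).toString('utf8');
}

gzipRoundTrip('hello streams').then((out) => {
    console.log(out);  // 'hello streams'
});
```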

Transform Streams

Custom transform:

const fs = require('fs');
const { Transform } = require('stream');

// Uppercase transform
const uppercaseTransform = new Transform({
    transform(chunk, encoding, callback) {
        this.push(chunk.toString().toUpperCase());
        callback();
    }
});

fs.createReadStream('input.txt')
  .pipe(uppercaseTransform)
  .pipe(fs.createWriteStream('output.txt'));
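
Transforms that must respect record boundaries need to carry state between chunks and emit any remainder in flush(). The following sketch splits input into lines regardless of how it was chunked; createLineSplitter and collectLines are illustrative names, not part of the Node.js API:

```javascript
const { Transform, Readable } = require('stream');
const { StringDecoder } = require('string_decoder');

// Emits one string per line, regardless of how the input was chunked.
function createLineSplitter() {
    const decoder = new StringDecoder('utf8');  // Handles split multi-byte characters
    let tail = '';
    return new Transform({
        readableObjectMode: true,
        transform(chunk, encoding, callback) {
            const lines = (tail + decoder.write(chunk)).split('\n');
            tail = lines.pop();  // Last piece may be an incomplete line
            for (const line of lines) this.push(line);
            callback();
        },
        flush(callback) {
            // Called once the input ends: emit whatever is left over
            tail += decoder.end();
            if (tail.length > 0) this.push(tail);
            callback();
        }
    });
}

async function collectLines(source) {
    const lines = [];
    for await (const line of source.pipe(createLineSplitter())) {
        lines.push(line);
    }
    return lines;
}

collectLines(Readable.from(['ab\ncd', 'ef\n', 'gh'])).then((lines) => {
    console.log(lines);  // [ 'ab', 'cdef', 'gh' ]
});
```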

Child Processes: Running External Commands

exec: Simple Command Execution

const { exec } = require('child_process');

exec('ls -la', (error, stdout, stderr) => {
    if (error) {
        console.error(`Error: ${error.message}`);
        return;
    }
    if (stderr) {
        // Non-empty stderr does not necessarily mean the command failed
        console.error(`stderr: ${stderr}`);
    }
    console.log(`stdout: ${stdout}`);
});

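// Promise version
const { promisify } = require('util');
const execPromise = promisify(exec);

// In CommonJS modules, await must be inside an async function
async function listFiles() {
    const { stdout } = await execPromise('ls -la');
    console.log(stdout);
}

listFiles().catch(console.error);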
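
Note that exec() runs its command string through a shell, so interpolating untrusted input into that string invites command injection. When no shell features are needed, execFile() takes the executable and an argument array and skips the shell. A small sketch; runNode is a hypothetical helper that invokes the current Node.js binary so the example stays portable:

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Runs the current Node.js binary with the given arguments and returns stdout.
async function runNode(args) {
    const { stdout } = await execFileAsync(process.execPath, args);
    return stdout.trim();
}

// The argument is passed verbatim; no shell interprets the semicolon
runNode(['-p', '"a; b".length']).then((out) => {
    console.log(out);  // 4
});
```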

spawn: Streaming Output

const { spawn } = require('child_process');

const ls = spawn('ls', ['-la']);

ls.stdout.on('data', (data) => {
    console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
    console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
    console.log(`Process exited with code ${code}`);
});

Pipe data to child process:

const child = spawn('wc', ['-w']);

child.stdout.on('data', (data) => {
    console.log(`Word count: ${data}`);
});

child.stdin.write('Hello world\n');
child.stdin.write('This is a test\n');
child.stdin.end();

fork: Node.js Child Processes

Communication via IPC:

// parent.js
const { fork } = require('child_process');

const child = fork('child.js');

child.on('message', (msg) => {
    console.log('Message from child:', msg);
});

child.send({ hello: 'world' });

// child.js
process.on('message', (msg) => {
    console.log('Message from parent:', msg);
    process.send({ received: true });
});

Worker Threads: True Parallelism

Creating Worker Threads

Node.js 10.5.0+ (experimental in 10.x, stable in 12+):

// main.js
const { Worker } = require('worker_threads');

const worker = new Worker('./worker.js');

worker.on('message', (msg) => {
    console.log('Message from worker:', msg);
});

worker.on('error', (err) => {
    console.error('Worker error:', err);
});

worker.on('exit', (code) => {
    console.log(`Worker exited with code ${code}`);
});

worker.postMessage({ task: 'compute' });

// worker.js
const { parentPort } = require('worker_threads');

parentPort.on('message', (msg) => {
    console.log('Message from main:', msg);
    
    // Perform computation (heavyComputation stands in for any CPU-bound function)
    const result = heavyComputation();
    
    parentPort.postMessage({ result });
});
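
For experiments it can be convenient to keep the worker in the same file and wrap the message exchange in a promise. The sketch below uses the eval: true option to pass the worker source as a string; sumInWorker and the summation task are illustrative only:

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline so the example fits in one file
const workerSource = `
    const { parentPort, workerData } = require('worker_threads');
    let sum = 0;
    for (let i = 1; i <= workerData.n; i++) sum += i;
    parentPort.postMessage(sum);
`;

// Resolves with the worker's result, rejects on error or abnormal exit.
function sumInWorker(n) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true, workerData: { n } });
        worker.once('message', resolve);
        worker.once('error', reject);
        worker.once('exit', (code) => {
            if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
        });
    });
}

sumInWorker(10).then((sum) => {
    console.log(sum);  // 55
});
```

In a real project the worker would normally live in its own file, as in the main.js/worker.js pair above.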

SharedArrayBuffer in Workers

Shared memory between threads:

// main.js
const { Worker } = require('worker_threads');

const sharedBuffer = new SharedArrayBuffer(4);
const sharedArray = new Int32Array(sharedBuffer);

const worker = new Worker('./worker.js', {
    workerData: { sharedBuffer }
});

// Atomically increment
Atomics.add(sharedArray, 0, 1);
console.log(sharedArray[0]);  // Usually 1 (2 if the worker has already incremented)

worker.on('message', () => {
    console.log(sharedArray[0]);  // 2 (incremented by worker)
});

// worker.js
const { parentPort, workerData } = require('worker_threads');

const sharedArray = new Int32Array(workerData.sharedBuffer);

// Atomically increment
Atomics.add(sharedArray, 0, 1);

parentPort.postMessage('done');
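
To see why the atomic operation matters, several workers can hammer the same counter concurrently. With Atomics.add no increments are lost. This is a sketch with inline worker source; countInParallel and runIncrementer are hypothetical names:

```javascript
const { Worker } = require('worker_threads');

const incrementerSource = `
    const { workerData } = require('worker_threads');
    const counter = new Int32Array(workerData.sharedBuffer);
    for (let i = 0; i < workerData.iterations; i++) {
        Atomics.add(counter, 0, 1);
    }
`;

// Starts one worker and resolves when it exits.
function runIncrementer(sharedBuffer, iterations) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(incrementerSource, {
            eval: true,
            workerData: { sharedBuffer, iterations }
        });
        worker.once('error', reject);
        worker.once('exit', resolve);
    });
}

async function countInParallel(workerCount, iterations) {
    const sharedBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
    const jobs = [];
    for (let i = 0; i < workerCount; i++) {
        jobs.push(runIncrementer(sharedBuffer, iterations));
    }
    await Promise.all(jobs);
    return Atomics.load(new Int32Array(sharedBuffer), 0);
}

countInParallel(4, 1000).then((total) => {
    console.log(total);  // 4000
});
```

Replacing Atomics.add with a plain counter[0]++ would make the result nondeterministic, since the read and the write could interleave across threads.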

Package Management: npm and package.json

package.json Structure

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "My application",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "start": "node index.js",
    "test": "jest",
    "build": "webpack"
  },
  "dependencies": {
    "express": "^4.18.0",
    "lodash": "~4.17.21"
  },
  "devDependencies": {
    "jest": "^29.0.0",
    "webpack": "^5.75.0"
  },
  "engines": {
    "node": ">=16.0.0"
  },
  "keywords": ["example"],
  "author": "Your Name",
  "license": "MIT"
}

Semantic versioning:

  • ^4.18.0 → >=4.18.0 <5.0.0 (compatible changes)

  • ~4.18.0 → >=4.18.0 <4.19.0 (bug fixes only)

  • 4.18.0 → Exact version

  • * → Latest version (dangerous!)
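
The caret and tilde rules can be made concrete with a small function that expands a range into explicit bounds. This is a simplified sketch, not npm's actual resolver: expandRange is a hypothetical helper that handles only plain MAJOR.MINOR.PATCH versions. It also shows that the caret is stricter for 0.x versions, where a minor bump may break compatibility:

```javascript
// Simplified: plain MAJOR.MINOR.PATCH only (no pre-release tags, no x-ranges)
function expandRange(range) {
    const match = /^([\^~]?)(\d+)\.(\d+)\.(\d+)$/.exec(range);
    if (!match) throw new Error(`Unsupported range: ${range}`);

    const operator = match[1];
    const [major, minor, patch] = match.slice(2).map(Number);
    const lower = `${major}.${minor}.${patch}`;

    if (operator === '') return `=${lower}`;

    let upper;
    if (operator === '~') {
        upper = `${major}.${minor + 1}.0`;   // ~1.2.3: only patch may change
    } else if (major > 0) {
        upper = `${major + 1}.0.0`;          // ^1.2.3: minor and patch may change
    } else if (minor > 0) {
        upper = `0.${minor + 1}.0`;          // ^0.2.3: only patch may change
    } else {
        upper = `0.0.${patch + 1}`;          // ^0.0.3: nothing may change
    }
    return `>=${lower} <${upper}`;
}

console.log(expandRange('^4.18.0'));   // '>=4.18.0 <5.0.0'
console.log(expandRange('~4.17.21'));  // '>=4.17.21 <4.18.0'
console.log(expandRange('^0.2.3'));    // '>=0.2.3 <0.3.0'
```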

npm Commands

# Install dependencies
npm install
npm install express        # Add to dependencies
npm install --save-dev jest  # Add to devDependencies
npm install -g nodemon     # Install globally

# Update packages
npm update
npm outdated               # Check for updates

# Remove package
npm uninstall express

# Run scripts
npm start                  # Runs "start" script
npm test                   # Runs "test" script
npm run build              # Runs custom script

# View package info
npm info express
npm list                   # Installed packages
npm list --depth=0         # Top-level packages only

# Audit security
npm audit
npm audit fix              # Auto-fix vulnerabilities

Summary

Node.js runtime provides:

Global environment:

  • global (pre-ES2020) / globalThis (universal)

  • process: Runtime information, exit handling

  • Buffer: Binary data handling

Event loop architecture:

  • Six phases: timers → pending callbacks → idle/prepare → poll → check → close callbacks

  • process.nextTick(): Runs after the current operation completes, before the event loop continues

  • setImmediate(): Executed in check phase

Module systems:

  • CommonJS: require(), module.exports, synchronous

  • ES Modules: import/export, asynchronous, .mjs or "type": "module"

Built-in modules:

  • fs: File system operations (callback, promise, sync APIs)

  • path: Cross-platform path handling

  • http: HTTP server and client

  • crypto: Hashing, encryption

  • stream: Efficient data processing

Concurrency:

  • Child processes: Run external commands (exec, spawn, fork)

  • Worker threads: True parallelism with SharedArrayBuffer

Package management:

  • npm: Install, update, manage dependencies

  • package.json: Project configuration, scripts, versioning

Node.js extends ECMA-262 with system-level capabilities while preserving the single-threaded execution model that makes JavaScript reasoning tractable. Understanding the event loop phases and module interoperability is critical for building robust server-side applications.


Chapter 10: Browser Extensions and Userscripts

Introduction: Extending Browser Functionality

Browser extensions and userscripts allow developers to modify web page behavior, inject custom JavaScript, and extend browser capabilities beyond what standard web applications can achieve. They operate with elevated privileges compared to regular web pages, accessing APIs unavailable to standard JavaScript.

Key differences:

┌─────────────────────────────────────────────────────────┐
│ Privilege Levels                                        │
├─────────────────────────────────────────────────────────┤
│ Web Page JavaScript                                     │
│   • Sandboxed execution                                 │
│   • Cross-origin requests limited by CORS               │
│   • Limited browser APIs                                │
├─────────────────────────────────────────────────────────┤
│ Userscripts (Tampermonkey/Greasemonkey)                 │
│   • Runs in page context                                │
│   • Cross-origin XHR with GM_xmlhttpRequest             │
│   • Can modify page DOM                                 │
│   • Limited storage (GM_setValue/getValue)              │
├─────────────────────────────────────────────────────────┤
│ Browser Extensions (WebExtensions API)                  │
│   • Isolated world execution                            │
│   • Full browser.* API access                           │
│   • Background service workers                          │
│   • Cross-origin requests without CORS                  │
│   • Persistent storage                                  │
│   • Tab/window management                               │
└─────────────────────────────────────────────────────────┘


Userscripts: Quick Page Modifications

What Are Userscripts?

Userscripts are JavaScript programs that modify web pages in the browser. They require a userscript manager extension:

  • Tampermonkey (Chrome, Firefox, Safari, Edge)

  • Greasemonkey (Firefox only, original)

  • Violentmonkey (Chrome, Firefox, open-source)

Use cases:

  • Remove ads or annoying elements

  • Add missing features to websites

  • Customize page appearance

  • Automate repetitive tasks

  • Enhance privacy

Basic Userscript Structure

Metadata block (required):

// ==UserScript==
// @name         Example Userscript
// @namespace    http://example.com/
// @version      1.0.0
// @description  Demonstrates userscript basics
// @author       Your Name
// @match        https://www.example.com/*
// @grant        GM_xmlhttpRequest
// @grant        GM_setValue
// @grant        GM_getValue
// @require      https://code.jquery.com/jquery-3.6.0.min.js
// @run-at       document-end
// ==/UserScript==

(function() {
    'use strict';
    
    // Your code here
    console.log('Userscript loaded!');
    
    // Modify page
    document.body.style.backgroundColor = '#f0f0f0';
    
    // Add button
    const button = document.createElement('button');
    button.textContent = 'Click me';
    button.onclick = () => alert('Userscript button clicked!');
    document.body.prepend(button);
})();

Metadata directives:

Directive      Purpose                   Example
@name          Script name               My Script
@namespace     Unique identifier         http://example.com/
@version       Version number            1.0.0
@description   What the script does      Removes ads
@author        Script creator            Your Name
@match         URL pattern to run on     https://example.com/*
@include       Alternative URL pattern   *://example.com/*
@exclude       URLs to skip              https://example.com/admin/*
@grant         API permissions           GM_xmlhttpRequest
@require       External libraries        jQuery URL
@run-at        Execution timing          document-start, document-end, document-idle

URL Matching Patterns

Match patterns:

// Exact domain
// @match https://www.example.com/*

// Any subdomain
// @match https://*.example.com/*

// Any protocol
// @match *://example.com/*

// Multiple patterns
// @match https://example.com/*
// @match https://example.org/*

// Include (globs or regular expressions allowed)
// @include /^https?://example\.com/.*/

// Exclude specific pages
// @exclude https://example.com/login
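
To make the matching rules concrete, the sketch below converts a @match pattern into a regular expression. It approximates the Chrome-style match-pattern rules (a * scheme means http or https, and *.example.com also matches example.com itself); real userscript managers implement more cases, and matchPatternToRegExp is a hypothetical helper:

```javascript
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Simplified: supports <scheme>://<host>/<path> with http, https, or * as scheme
function matchPatternToRegExp(pattern) {
    const parts = /^(\*|https?):\/\/(\*|\*\.[^/*]+|[^/*]+)(\/.*)$/.exec(pattern);
    if (!parts) throw new Error(`Unsupported match pattern: ${pattern}`);
    const [, scheme, host, path] = parts;

    const schemeRe = scheme === '*' ? 'https?' : scheme;

    let hostRe;
    if (host === '*') {
        hostRe = '[^/]+';
    } else if (host.startsWith('*.')) {
        // *.example.com matches example.com and any subdomain
        hostRe = '(?:[^/]+\\.)?' + escapeRegExp(host.slice(2));
    } else {
        hostRe = escapeRegExp(host);
    }

    // In the path, * matches any run of characters
    const pathRe = path.split('*').map(escapeRegExp).join('.*');

    return new RegExp(`^${schemeRe}://${hostRe}${pathRe}$`);
}

const re = matchPatternToRegExp('https://*.example.com/*');
console.log(re.test('https://www.example.com/page'));  // true
console.log(re.test('https://example.com/'));          // true
console.log(re.test('https://example.org/'));          // false
```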

Run timing:

// @run-at document-start
// Runs as soon as HTML is available (before DOM ready)

// @run-at document-end
// Runs after DOM is ready (default in Greasemonkey and Violentmonkey; Tampermonkey defaults to document-idle)

// @run-at document-idle
// Runs after page load event

Greasemonkey API (GM_* Functions)

Cross-origin requests (bypass CORS):

// @grant GM_xmlhttpRequest

GM_xmlhttpRequest({
    method: 'GET',
    url: 'https://api.example.com/data',
    headers: {
        'User-Agent': 'MyUserscript/1.0'
    },
    onload: function(response) {
        console.log(response.responseText);
        const data = JSON.parse(response.responseText);
        // Use data
    },
    onerror: function(error) {
        console.error('Request failed:', error);
    }
});

// Modern async/await wrapper
function gmXHR(config) {
    return new Promise((resolve, reject) => {
        GM_xmlhttpRequest({
            ...config,
            onload: resolve,
            onerror: reject
        });
    });
}

// Usage
const response = await gmXHR({
    method: 'GET',
    url: 'https://api.example.com/data'
});
const data = JSON.parse(response.responseText);
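
One caveat with the wrapper above: as with XMLHttpRequest, onload also fires for 4xx and 5xx responses, so the promise resolves even when the server reports an error. A slightly more defensive sketch follows. requestJSON is a hypothetical helper, and the request function is passed in as a parameter so the logic can be exercised outside a userscript; in a real script, pass GM_xmlhttpRequest:

```javascript
// gmRequest is injected; in a userscript, pass GM_xmlhttpRequest.
function requestJSON(gmRequest, url, timeoutMs = 10000) {
    return new Promise((resolve, reject) => {
        gmRequest({
            method: 'GET',
            url,
            timeout: timeoutMs,
            onload(response) {
                // onload also fires for 4xx/5xx responses
                if (response.status < 200 || response.status >= 300) {
                    reject(new Error(`HTTP ${response.status} for ${url}`));
                    return;
                }
                try {
                    resolve(JSON.parse(response.responseText));
                } catch (err) {
                    reject(new Error(`Invalid JSON from ${url}: ${err.message}`));
                }
            },
            onerror: () => reject(new Error(`Network error for ${url}`)),
            ontimeout: () => reject(new Error(`Timed out after ${timeoutMs} ms`))
        });
    });
}

// Stand-in for GM_xmlhttpRequest, for running outside a userscript manager
const fakeRequest = (options) =>
    options.onload({ status: 200, responseText: '{"ok":true}' });

requestJSON(fakeRequest, 'https://api.example.com/data').then((data) => {
    console.log(data.ok);  // true
});
```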

Persistent storage:

// @grant GM_setValue
// @grant GM_getValue
// @grant GM_deleteValue
// @grant GM_listValues

// Save data
GM_setValue('username', 'Alice');
GM_setValue('settings', JSON.stringify({ theme: 'dark' }));

// Read data
const username = GM_getValue('username', 'default');  // with default
const settingsJSON = GM_getValue('settings');
const settings = JSON.parse(settingsJSON);

// Delete data
GM_deleteValue('username');

// List all keys (the GM_* variant returns the array directly)
const keys = GM_listValues();

Other GM functions:

// Open new tab
GM_openInTab('https://example.com', { active: true });

// Get resource URL
// @resource logo https://example.com/logo.png
const logoURL = GM_getResourceURL('logo');

// Get resource text
const cssText = GM_getResourceText('customCSS');

// Add CSS
GM_addStyle(`
    .annoying-ad {
        display: none !important;
    }
`);

// Script info
const info = GM_info;
console.log(info.script.name);
console.log(info.script.version);

Practical Userscript Examples

Example 1: Remove elements:

// ==UserScript==
// @name         Remove Ads
// @match        https://example.com/*
// @run-at       document-end
// ==/UserScript==

(function() {
    'use strict';
    
    // Remove by class
    document.querySelectorAll('.ad, .advertisement').forEach(el => el.remove());
    
    // Remove by ID
    const banner = document.getElementById('annoying-banner');
    if (banner) banner.remove();
    
    // Observe for dynamically added ads
    const observer = new MutationObserver((mutations) => {
        mutations.forEach((mutation) => {
            mutation.addedNodes.forEach((node) => {
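                if (node.nodeType === Node.ELEMENT_NODE) {
                    // Check the node itself, then any ads nested inside it
                    if (node.matches('.ad, .advertisement')) {
                        node.remove();
                    } else {
                        node.querySelectorAll('.ad, .advertisement')
                            .forEach(el => el.remove());
                    }
                }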
            });
        });
    });
    
    observer.observe(document.body, {
        childList: true,
        subtree: true
    });
})();

Example 2: Auto-fill form:

// ==UserScript==
// @name         Auto-fill Login
// @match        https://example.com/login
// @grant        GM_getValue
// @grant        GM_setValue
// ==/UserScript==

(function() {
    'use strict';
    
    const username = GM_getValue('saved_username', '');
    const password = GM_getValue('saved_password', '');
    
    const usernameInput = document.querySelector('input[name="username"]');
    const passwordInput = document.querySelector('input[name="password"]');
    
    if (usernameInput && username) {
        usernameInput.value = username;
    }
    
    if (passwordInput && password) {
        passwordInput.value = password;
    }
    
    // Add save button (only if the form and both inputs exist)
    const form = document.querySelector('form');
    if (form && usernameInput && passwordInput) {
        const saveButton = document.createElement('button');
        saveButton.textContent = 'Save Credentials';
        saveButton.type = 'button';
        saveButton.onclick = () => {
            // Caution: GM storage is not encrypted; values are kept in plain text
            GM_setValue('saved_username', usernameInput.value);
            GM_setValue('saved_password', passwordInput.value);
            alert('Credentials saved!');
        };
        form.appendChild(saveButton);
    }
})();

Example 3: Fetch external data:

// ==UserScript==
// @name         Weather Widget
// @match        https://example.com/*
// @grant        GM_xmlhttpRequest
// @grant        GM_addStyle
// ==/UserScript==

(function() {
    'use strict';
    
    GM_addStyle(`
        #weather-widget {
            position: fixed;
            top: 10px;
            right: 10px;
            padding: 10px;
            background: white;
            border: 1px solid #ccc;
            border-radius: 5px;
            box-shadow: 0 2px 5px rgba(0,0,0,0.2);
            z-index: 9999;
        }
    `);
    
    const widget = document.createElement('div');
    widget.id = 'weather-widget';
    widget.textContent = 'Loading weather...';
    document.body.appendChild(widget);
    
    GM_xmlhttpRequest({
        method: 'GET',
        url: 'https://api.weatherapi.com/v1/current.json?key=YOUR_KEY&q=London',
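        onload: function(response) {
            const data = JSON.parse(response.responseText);
            // Build nodes with textContent so remote strings are never parsed as HTML
            const name = document.createElement('strong');
            name.textContent = data.location.name;
            widget.replaceChildren(
                name,
                document.createElement('br'),
                `${data.current.temp_c}°C, ${data.current.condition.text}`
            );
        },
        onerror: function() {
            widget.textContent = 'Weather unavailable';
        }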
    });
})();

Browser Extensions: Full Browser Integration

WebExtensions API

WebExtensions is a cross-browser API standard supported by:

  • Chrome/Chromium (Manifest V3)

  • Firefox

  • Edge

  • Safari (partial support)

Manifest V2 vs V3:

Feature               Manifest V2              Manifest V3
Background scripts    Persistent pages         Service workers
Host permissions      permissions              host_permissions
Blocking webRequest   Synchronous              Declarative (limited)
Remote code           eval(), remote scripts   Forbidden
Status                Deprecated (2024+)       Current standard

Extension Structure

Directory layout:

my-extension/
├── manifest.json         # Required: metadata and configuration
├── background.js         # Service worker (MV3) or background script (MV2)
├── content-script.js     # Injected into pages (runs in an isolated world)
├── popup.html            # Extension popup UI
├── popup.js              # Popup logic
├── options.html          # Settings page
├── options.js            # Settings logic
├── icons/
│   ├── icon16.png
│   ├── icon48.png
│   └── icon128.png
└── lib/
    └── external-lib.js

manifest.json (Manifest V3)

Complete example:

{
  "manifest_version": 3,
  "name": "Example Extension",
  "version": "1.0.0",
  "description": "Demonstrates browser extension capabilities",
  "icons": {
    "16": "icons/icon16.png",
    "48": "icons/icon48.png",
    "128": "icons/icon128.png"
  },
  "action": {
    "default_popup": "popup.html",
    "default_icon": {
      "16": "icons/icon16.png",
      "48": "icons/icon48.png"
    },
    "default_title": "Click to open"
  },
  "background": {
    "service_worker": "background.js"
  },
  "content_scripts": [
    {
      "matches": ["https://example.com/*"],
      "js": ["content-script.js"],
      "css": ["styles.css"],
      "run_at": "document_end"
    }
  ],
  "permissions": [
    "storage",
    "tabs",
    "activeTab",
    "notifications",
    "scripting",
    "declarativeNetRequest"
  ],
  "host_permissions": [
    "https://api.example.com/*"
  ],
  "options_page": "options.html",
  "web_accessible_resources": [
    {
      "resources": ["images/*"],
      "matches": ["https://example.com/*"]
    }
  ]
}

Key fields:

Field                        Purpose
manifest_version             Must be 3 (or 2 for legacy)
name, version, description   Extension metadata
icons                        Extension icons (various sizes)
action                       Browser toolbar button (popup UI)
background                   Background service worker
content_scripts              Scripts injected into pages
permissions                  API permissions required
host_permissions             Cross-origin request permissions
options_page                 Settings page

Content Scripts: Isolated Worlds

Content scripts run in an isolated world:

┌─────────────────────────────────────────┐
│ Web Page Context                        │
│   • Page’s JavaScript                   │
│   • Page’s variables/functions          │
│   • Shared DOM                          │
└─────────────────────────────────────────┘
                ↕ (DOM only)
┌─────────────────────────────────────────┐
│ Content Script Context                  │
│   • Extension’s JavaScript              │
│   • Cannot access page variables        │
│   • Shared DOM                          │
│   • Limited chrome.* APIs               │
└─────────────────────────────────────────┘
                ↕ (message passing)
┌─────────────────────────────────────────┐
│ Background Script Context               │
│   • Full chrome.* API access            │
│   • Persistent state                    │
│   • No DOM access                       │
└─────────────────────────────────────────┘

content-script.js:

// This runs in isolated world
console.log('Content script loaded');

// Can access DOM
const title = document.querySelector('h1')?.textContent ?? '';  // Page may have no <h1>

// Can modify DOM
const banner = document.createElement('div');
banner.textContent = 'Extension is active!';
banner.style.cssText = 'position:fixed; top:0; left:0; right:0; background:yellow; padding:10px; z-index:99999;';
document.body.prepend(banner);

// CANNOT access page variables
// console.log(window.somePageVariable);  // undefined

// Can use limited chrome APIs
chrome.runtime.sendMessage({ title: title }, (response) => {
    console.log('Background responded:', response);
});

// Listen for messages from background
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.action === 'highlightText') {
        document.querySelectorAll('p').forEach(p => {
            p.style.backgroundColor = 'yellow';
        });
        sendResponse({ done: true });
    }
});

Injecting into page context (to access page variables):

// content-script.js
// Caveat: a page with a strict Content-Security-Policy may block inline scripts
function injectScript() {
    const script = document.createElement('script');
    script.textContent = `
        // This code runs in page context
        console.log('Injected script running');
        console.log(window.somePageVariable);  // Now accessible
        
        // Send data back to content script via window.postMessage
        window.postMessage({ type: 'FROM_PAGE', data: somePageVariable }, '*');
    `;
    document.documentElement.appendChild(script);
    script.remove();
}

// Listen for messages from page
window.addEventListener('message', (event) => {
    if (event.source === window && event.data.type === 'FROM_PAGE') {
        console.log('Data from page:', event.data.data);
    }
});

injectScript();
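
Because any script running on the page can call window.postMessage, a content script should treat incoming messages as untrusted input. The following is a minimal sketch of a validation helper; isTrustedPageMessage and the message shape it expects are assumptions made for illustration:

```javascript
// Hypothetical message shape: { type: 'FROM_PAGE', data: <string> }
function isTrustedPageMessage(event, currentWindow, expectedOrigin) {
    if (event.source !== currentWindow) return false;   // Ignore iframes and other windows
    if (event.origin !== expectedOrigin) return false;  // Ignore unexpected origins
    const msg = event.data;
    if (typeof msg !== 'object' || msg === null) return false;
    return msg.type === 'FROM_PAGE' && typeof msg.data === 'string';
}

// Stand-ins for window and a MessageEvent, for running outside a browser
const fakeWindow = {};
const goodEvent = {
    source: fakeWindow,
    origin: 'https://example.com',
    data: { type: 'FROM_PAGE', data: 'hello' }
};

console.log(isTrustedPageMessage(goodEvent, fakeWindow, 'https://example.com'));  // true
```

In a content script the check would sit at the top of the listener, for example: if (!isTrustedPageMessage(event, window, location.origin)) return;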

Background Scripts: Service Workers (MV3)

background.js (service worker in MV3):

// Service workers must be event-driven (no persistent state)

// Installation
chrome.runtime.onInstalled.addListener((details) => {
    if (details.reason === 'install') {
        console.log('Extension installed');
        // Set default settings
        chrome.storage.sync.set({ enabled: true });
    } else if (details.reason === 'update') {
        console.log('Extension updated');
    }
});

// Listen for messages from content scripts
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    console.log('Message from content script:', message);
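    console.log('Sender tab:', sender.tab?.id);  // sender.tab is undefined for non-tab senders (e.g., the popup)
    
    // Async operation (fetchData stands in for any promise-returning function)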
    (async () => {
        const data = await fetchData();
        sendResponse({ data: data });
    })();
    
    return true;  // Required for async sendResponse
});

// Toolbar button clicked (fires only when the action has no default_popup)
chrome.action.onClicked.addListener((tab) => {
    console.log('Extension icon clicked on tab:', tab.id);
    
    // Send message to content script
    chrome.tabs.sendMessage(tab.id, { action: 'highlightText' });
});

// Tab updated
chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
    if (changeInfo.status === 'complete' && tab.url?.includes('example.com')) {
        console.log('Tab loaded:', tab.url);
        
        // Inject content script dynamically (requires the "scripting" permission)
        chrome.scripting.executeScript({
            target: { tabId: tabId },
            files: ['content-script.js']
        });
    }
});

// Web request interception (declarative in MV3)
chrome.declarativeNetRequest.updateDynamicRules({
    addRules: [{
        id: 1,
        priority: 1,
        action: { type: 'block' },
        condition: {
            urlFilter: '*://ads.example.com/*',
            resourceTypes: ['script', 'image']
        }
    }],
    removeRuleIds: [1]  // Remove any earlier copy first; rule IDs must be unique
});

Service worker lifecycle:

// Service workers can be terminated after inactivity
// Must re-establish state on each wake-up

// BAD: This won't persist
let counter = 0;

chrome.runtime.onMessage.addListener((msg) => {
    counter++;  // Lost when service worker terminates
});

// GOOD: Use chrome.storage
chrome.runtime.onMessage.addListener(async (msg) => {
    const { counter = 0 } = await chrome.storage.local.get('counter');
    await chrome.storage.local.set({ counter: counter + 1 });
});
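
The storage-based version still performs a read-modify-write, so two messages arriving close together can both read the same value. One way to reduce that risk within a single service worker lifetime is to serialize updates through a promise chain. This is a sketch under that assumption; createPersistentCounter is a hypothetical helper, and the in-memory storage object merely imitates the promise-based get/set of chrome.storage.local:

```javascript
// storageArea is any object with promise-returning get/set,
// such as chrome.storage.local in an MV3 extension.
function createPersistentCounter(storageArea, key = 'counter') {
    let queue = Promise.resolve();  // Serializes read-modify-write cycles

    return function increment() {
        const next = queue.then(async () => {
            const stored = await storageArea.get(key);
            const value = (stored[key] ?? 0) + 1;
            await storageArea.set({ [key]: value });
            return value;
        });
        queue = next.catch(() => {});  // Keep the queue usable after a failure
        return next;
    };
}

// In-memory stand-in for chrome.storage.local, for running outside a browser
function createMemoryStorage() {
    const data = {};
    return {
        async get(key) { return key in data ? { [key]: data[key] } : {}; },
        async set(items) { Object.assign(data, items); }
    };
}

const increment = createPersistentCounter(createMemoryStorage());
const results = Promise.all([increment(), increment(), increment()]);
results.then((values) => console.log(values));  // [ 1, 2, 3 ]
```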

Message Passing

Content script ↔︎ Background:

// content-script.js → background.js
chrome.runtime.sendMessage(
    { type: 'GET_DATA', url: window.location.href },
    (response) => {
        console.log('Response:', response);
    }
);

// background.js listening
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.type === 'GET_DATA') {
        fetchData(message.url).then(data => {
            sendResponse({ data: data });
        });
        return true;  // Keep message channel open for async
    }
});
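
As the number of message types grows, the if-chain and the "return true" convention become easy to get wrong. One possible structure is a small router that dispatches on message.type and always answers through sendResponse. createMessageRouter and the { ok, data, error } reply shape are assumptions for illustration:

```javascript
// handlers maps a message type to a function (sync or async) that returns the reply
function createMessageRouter(handlers) {
    return function onMessage(message, sender, sendResponse) {
        const type = message?.type;
        const known = typeof type === 'string' &&
            Object.prototype.hasOwnProperty.call(handlers, type);
        if (!known) return false;  // Not ours: let other listeners respond

        Promise.resolve()
            .then(() => handlers[type](message, sender))
            .then(
                (data) => sendResponse({ ok: true, data }),
                (err) => sendResponse({ ok: false, error: String(err?.message ?? err) })
            );
        return true;  // Keep the channel open until sendResponse is called
    };
}

// Exercised here with plain function calls instead of chrome.runtime
const router = createMessageRouter({
    PING: async (message) => message.value * 2
});

router({ type: 'PING', value: 21 }, {}, (reply) => {
    console.log(reply);  // { ok: true, data: 42 }
});
```

In an extension the router would be registered once, for example chrome.runtime.onMessage.addListener(createMessageRouter({ GET_DATA: (message) => fetchData(message.url) })).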

Long-lived connections:

// content-script.js
const port = chrome.runtime.connect({ name: 'my-channel' });

port.postMessage({ type: 'INIT' });

port.onMessage.addListener((msg) => {
    console.log('Received:', msg);
});

// background.js
chrome.runtime.onConnect.addListener((port) => {
    console.log('Connected:', port.name);
    
    port.onMessage.addListener((msg) => {
        console.log('Message:', msg);
        port.postMessage({ response: 'acknowledged' });
    });
});

Cross-extension messaging:

// Send to another extension
chrome.runtime.sendMessage(
    'other-extension-id',
    { type: 'HELLO' },
    (response) => {
        console.log('Response from other extension:', response);
    }
);

Storage API

Types of storage:

// 1. chrome.storage.local (local to device, ~10 MB; ~5 MB before Chrome 114)
await chrome.storage.local.set({ key: 'value' });
const { key } = await chrome.storage.local.get('key');

// 2. chrome.storage.sync (synced across devices, ~100KB)
await chrome.storage.sync.set({ theme: 'dark' });
const { theme } = await chrome.storage.sync.get('theme');

// 3. chrome.storage.session (session-only, MV3)
await chrome.storage.session.set({ tempData: '...' });

// Save multiple items
await chrome.storage.local.set({
    username: 'Alice',
    settings: { theme: 'dark', lang: 'en' }
});

// Get multiple items
const items = await chrome.storage.local.get(['username', 'settings']);
console.log(items.username);
console.log(items.settings.theme);

// Get all items
const allItems = await chrome.storage.local.get(null);

// Remove item
await chrome.storage.local.remove('username');

// Clear all
await chrome.storage.local.clear();

// Listen for changes
chrome.storage.onChanged.addListener((changes, areaName) => {
    for (let [key, { oldValue, newValue }] of Object.entries(changes)) {
        console.log(`${key} changed from ${oldValue} to ${newValue} in ${areaName}`);
    }
});

Tabs API

// Get current tab
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
console.log('Current URL:', tab.url);

// Create new tab
const newTab = await chrome.tabs.create({
    url: 'https://example.com',
    active: true  // Switch to new tab
});

// Update tab
await chrome.tabs.update(tab.id, {
    url: 'https://example.org'
});

// Close tab
await chrome.tabs.remove(tab.id);

// Get all tabs
const tabs = await chrome.tabs.query({});

// Query tabs by URL pattern
const exampleTabs = await chrome.tabs.query({
    url: '*://example.com/*'
});

// Reload tab
await chrome.tabs.reload(tab.id);

// Execute script in tab
await chrome.scripting.executeScript({
    target: { tabId: tab.id },
    func: () => {
        document.body.style.backgroundColor = 'red';
    }
});

// Inject CSS
await chrome.scripting.insertCSS({
    target: { tabId: tab.id },
    css: 'body { background: blue !important; }'
});

Other Useful APIs

Notifications:

chrome.notifications.create({
    type: 'basic',
    iconUrl: 'icons/icon48.png',
    title: 'Extension Notification',
    message: 'Something happened!',
    priority: 2
});

Context menus (right-click menu):

// background.js
chrome.runtime.onInstalled.addListener(() => {
    chrome.contextMenus.create({
        id: 'search-selection',
        title: 'Search "%s" on Example',
        contexts: ['selection']
    });
});

chrome.contextMenus.onClicked.addListener((info, tab) => {
    if (info.menuItemId === 'search-selection') {
        const query = encodeURIComponent(info.selectionText);
        chrome.tabs.create({
            url: `https://example.com/search?q=${query}`
        });
    }
});

Alarms (scheduled tasks):

// Create alarm
chrome.alarms.create('fetchData', {
    periodInMinutes: 30
});

// Listen for alarm
chrome.alarms.onAlarm.addListener((alarm) => {
    if (alarm.name === 'fetchData') {
        console.log('Time to fetch data!');
        fetchData();
    }
});

Cookies:

// Get cookies
const cookies = await chrome.cookies.getAll({
    url: 'https://example.com'
});

// Set cookie
await chrome.cookies.set({
    url: 'https://example.com',
    name: 'session',
    value: 'abc123'
});

// Remove cookie
await chrome.cookies.remove({
    url: 'https://example.com',
    name: 'session'
});

Web request (Manifest V3 - declarative only):

// Block ads
chrome.declarativeNetRequest.updateDynamicRules({
    addRules: [
        {
            id: 1,
            priority: 1,
            action: { type: 'block' },
            condition: {
                urlFilter: '*://ads.example.com/*',
                resourceTypes: ['script', 'image', 'sub_frame']
            }
        }
    ],
    removeRuleIds: [1]  // Replace any earlier rule with the same ID
});

// Redirect
chrome.declarativeNetRequest.updateDynamicRules({
    addRules: [
        {
            id: 2,
            priority: 1,
            action: {
                type: 'redirect',
                redirect: { url: 'https://example.org/alternative.js' }
            },
            condition: {
                urlFilter: 'https://example.com/script.js',
                resourceTypes: ['script']
            }
        }
    ],
    removeRuleIds: [2]  // Replace any earlier rule with the same ID
});
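
Since every dynamic rule needs a unique ID, it helps to generate rules from data rather than writing them by hand. The sketch below builds one block rule per host; buildBlockRules is a hypothetical helper, and the host names are placeholders:

```javascript
// Builds one block rule per host, with sequential unique IDs
function buildBlockRules(hosts, firstId = 1) {
    return hosts.map((host, index) => ({
        id: firstId + index,
        priority: 1,
        action: { type: 'block' },
        condition: {
            urlFilter: `||${host}^`,  // || anchors to a domain, ^ matches a separator
            resourceTypes: ['script', 'image', 'sub_frame']
        }
    }));
}

const rules = buildBlockRules(['ads.example.com', 'tracker.example.net']);
console.log(rules.map((rule) => rule.id));  // [ 1, 2 ]

// In an extension's service worker, replace any earlier copies of the same IDs:
// chrome.declarativeNetRequest.updateDynamicRules({
//     removeRuleIds: rules.map((rule) => rule.id),
//     addRules: rules
// });
```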

popup.html:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <style>
        body {
            width: 300px;
            padding: 10px;
            font-family: Arial, sans-serif;
        }
        button {
            width: 100%;
            padding: 10px;
            margin: 5px 0;
        }
    </style>
</head>
<body>
    <h3>Extension Popup</h3>
    <button id="highlight">Highlight Page</button>
    <button id="screenshot">Take Screenshot</button>
    <div id="status"></div>
    <script src="popup.js"></script>
</body>
</html>

popup.js:

document.getElementById('highlight').addEventListener('click', async () => {
    const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
    
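    chrome.tabs.sendMessage(tab.id, { action: 'highlight' }, (response) => {
        // chrome.runtime.lastError is set when no content script is listening in the tab
        document.getElementById('status').textContent = chrome.runtime.lastError
            ? 'Could not reach the page'
            : 'Highlighted!';
    });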
});

document.getElementById('screenshot').addEventListener('click', async () => {
    const dataUrl = await chrome.tabs.captureVisibleTab(null, {
        format: 'png'
    });
    
    // Download screenshot
    chrome.downloads.download({
        url: dataUrl,
        filename: 'screenshot.png'
    });
});

Complete Example: YouTube Enhancer

manifest.json:

{
  "manifest_version": 3,
  "name": "YouTube Enhancer",
  "version": "1.0.0",
  "permissions": ["storage"],
  "content_scripts": [
    {
      "matches": ["*://*.youtube.com/*"],
      "js": ["content.js"],
      "run_at": "document_end"
    }
  ],
  "action": {
    "default_popup": "popup.html"
  }
}

content.js:

(async function() {
    'use strict';
    
    // Get settings
    const { autoHD = true, hideComments = false } = 
        await chrome.storage.sync.get(['autoHD', 'hideComments']);
    
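    // Auto-switch to HD
    // NOTE: getAvailableQualityLevels/setPlaybackQualityRange are undocumented
    // methods that YouTube's page script attaches to the player element
    // (#movie_player), not to <video>. Page-defined methods are not visible
    // from an isolated-world content script, so reaching them requires code
    // that runs in the page's main world. Treat this block as a sketch.
    if (autoHD) {
        const observer = new MutationObserver(() => {
            const player = document.querySelector('#movie_player');
            if (player && player.getAvailableQualityLevels) {
                const levels = player.getAvailableQualityLevels();
                if (levels.includes('hd1080')) {
                    player.setPlaybackQualityRange('hd1080');
                    observer.disconnect();  // Stop observing once applied
                }
            }
        });
        
        observer.observe(document.body, { childList: true, subtree: true });
    }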
    
    // Hide comments
    if (hideComments) {
        const style = document.createElement('style');
        style.textContent = '#comments { display: none !important; }';
        document.head.appendChild(style);
    }
    
    // Add custom button
    function addCustomButton() {
        const controls = document.querySelector('.ytp-right-controls');
        if (!controls || document.getElementById('custom-btn')) return;
        
        const button = document.createElement('button');
        button.id = 'custom-btn';
        button.textContent = '⚡';
        button.className = 'ytp-button';
        button.title = 'Custom action';
        button.onclick = () => {
            alert('Custom button clicked!');
        };
        
        controls.prepend(button);
    }
    
    // Wait for player to load
    const playerObserver = new MutationObserver(() => {
        if (document.querySelector('.ytp-right-controls')) {
            addCustomButton();
            playerObserver.disconnect();
        }
    });
    
    playerObserver.observe(document.body, { childList: true, subtree: true });
})();

Security Considerations

Content Security Policy (CSP)

Extensions have strict CSP (Manifest V3):

// ❌ FORBIDDEN in MV3:
eval('console.log("test")');
new Function('return 1')();

// ❌ FORBIDDEN: Inline scripts
<script>alert('test')</script>

// ❌ FORBIDDEN: Remote scripts
<script src="https://example.com/script.js"></script>

// ✅ ALLOWED: External scripts in extension package
<script src="local-script.js"></script>

Workarounds:

// Instead of eval, use safe alternatives
const data = JSON.parse(jsonString);

// Instead of new Function, use pre-defined functions
const operations = {
    add: (a, b) => a + b,
    multiply: (a, b) => a * b
};
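// Look up own keys only, so names like "constructor" cannot reach inherited functions
const result = Object.hasOwn(operations, operation)
    ? operations[operation](x, y)
    : undefined;  // Unknown operation name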

Permissions

Principle of least privilege:

{
  "permissions": [
    "activeTab"  //  Only current tab
    // "tabs"    //  Avoid: All tabs' URLs
  ],
  "host_permissions": [
    "https://api.example.com/*"  //  Specific domain
    // "*://*/*"                 //  Avoid: All websites
  ]
}

Optional permissions (request at runtime):

{
  "optional_permissions": ["downloads"],
  "optional_host_permissions": ["*://*/*"]
}

// Request permission when needed
document.getElementById('enable').addEventListener('click', async () => {
    const granted = await chrome.permissions.request({
        permissions: ['downloads']
    });
    
    if (granted) {
        console.log('Permission granted!');
    }
});

// Check if permission is granted
const hasPermission = await chrome.permissions.contains({
    permissions: ['downloads']
});

XSS Prevention

Never inject unsanitized content:

// ❌ DANGEROUS
const userInput = message.text;
document.body.innerHTML = userInput;  // XSS vulnerability!

// ✅ SAFE
const userInput = message.text;
document.body.textContent = userInput;  // Text only, no HTML

// ✅ SAFE (if HTML needed)
const sanitized = DOMPurify.sanitize(userInput);
document.body.innerHTML = sanitized;
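
When a sanitizer library is not bundled and the untrusted value only needs to appear as text inside an HTML template string, escaping is a third option. The sketch below is one possible approach under that assumption; `escapeHtml` and `renderGreeting` are illustrative names, not extension APIs. Escaping of this kind is only appropriate for text content and quoted attribute values, not for script, style, or URL contexts.

```javascript
// Minimal sketch: escape the five HTML-significant characters so untrusted
// text can be interpolated into an HTML string as text or a quoted attribute.
const HTML_ESCAPES = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
};

function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Hypothetical template helper that interpolates untrusted input
function renderGreeting(userInput) {
    return `<p class="greeting">Hello, ${escapeHtml(userInput)}!</p>`;
}

const payload = '<img src=x onerror="alert(1)">';
console.log(renderGreeting(payload));
// <p class="greeting">Hello, &lt;img src=x onerror=&quot;alert(1)&quot;&gt;!</p>
```

The markup produced by the template stays under the extension's control; only the interpolated value is neutralized.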

Publishing Extensions

Chrome Web Store

  1. Create developer account ($5 one-time fee)

  2. Prepare assets:

    • Icon (128×128 PNG)

    • Screenshots (1280×800 or 640×400)

    • Promotional images (optional)

    • Description

  3. Zip extension:

zip -r extension.zip * -x "*.git*" -x "*node_modules*"

  4. Upload to dashboard

  5. Fill metadata

  6. Submit for review (1-3 days typically)

Firefox Add-ons

  1. Create Mozilla account (free)

  2. Sign extension:

web-ext sign --api-key=$API_KEY --api-secret=$API_SECRET

  3. Upload .xpi file

  4. Review (automated + manual for some extensions)


Summary

Userscripts:

  • Quick page modifications

  • Require userscript manager (Tampermonkey)

  • Limited API (GM_* functions)

  • Easy to share and install

Browser extensions:

  • Full browser integration

  • WebExtensions API (cross-browser)

  • Manifest V3 (service workers, declarative)

  • Isolated worlds (content scripts)

  • Message passing architecture

  • Rich APIs (tabs, storage, notifications, etc.)

Key concepts:

  • Content scripts: Run in isolated world, access DOM

  • Background scripts: Service workers, full API access

  • Message passing: Communication between contexts

  • Storage: chrome.storage for persistence

  • Security: CSP, permissions, XSS prevention

Extensions and userscripts extend ECMA-262 JavaScript with browser-specific capabilities, allowing developers to enhance web experiences beyond what standard web pages can achieve.


Chapter 11: JavaScript as a Compilation Target

Introduction: The Shift in Perspective

For most of its history, JavaScript was written directly by developers. However, as web applications grew in complexity and developers sought to use other languages, JavaScript increasingly became a compilation target—the output of compilers rather than hand-written code.

Why compile to JavaScript?

  1. Universal runtime: Every browser executes JavaScript

  2. No installation: Zero-friction deployment

  3. Performance: Modern engines (V8, SpiderMonkey) are highly optimized

  4. Ecosystem: Rich library and tooling support

  5. Portability: Write once, run everywhere

Evolution timeline:

1995-2005: Hand-written JavaScript
2006-2010: JavaScript libraries (jQuery, Prototype)
2009: CoffeeScript (first major compile-to-JS language)
2011: Dart
2012: TypeScript
2013: React JSX
2013: asm.js (C/C++ → optimizable JS subset)
2015: Babel (ES6+ → ES5 transpilation)
2017: WebAssembly MVP (binary compilation target)
2020+: WASM + JavaScript interop dominance


Categories of Compilation to JavaScript

1. Transpilation (Source-to-Source)

Definition: Converting code from one high-level language to another high-level language (JavaScript).

Examples:

TypeScript → JavaScript
ES6+ → ES5 (Babel)
JSX → JavaScript
CoffeeScript → JavaScript
Dart → JavaScript

Characteristics:

  • Readable output: Generated JS often resembles input

  • Similar abstraction level: Both input and output are high-level

  • Debugging: Source maps map output back to input

  • Type erasure: Type information (if present) is stripped

2. Compilation (Low-level to High-level)

Definition: Converting code from lower-level languages (C, C++, Rust) to JavaScript.

Examples:

C/C++ → asm.js → JavaScript
C/C++ → WASM → (JS glue code)
Rust → WASM → (JS interop)

Characteristics:

  • Less readable output: Heavily optimized, machine-like code

  • Performance focus: Targets asm.js subset or WebAssembly

  • Memory management: Manual memory via ArrayBuffer/TypedArrays

  • FFI required: Foreign function interface for host APIs


Transpilers: TypeScript, Babel, and Beyond

TypeScript: Static Typing for JavaScript

TypeScript adds static type checking to JavaScript while remaining a superset of valid JavaScript.

Type system features:

// Basic types
let name: string = "Alice";
let age: number = 30;
let active: boolean = true;

// Arrays and tuples
let numbers: number[] = [1, 2, 3];
let tuple: [string, number] = ["Alice", 30];

// Objects
interface User {
    name: string;
    age: number;
    email?: string;  // Optional property
}

const user: User = {
    name: "Bob",
    age: 25
};

// Functions
function greet(name: string): string {
    return `Hello, ${name}!`;
}

// Generics
function identity<T>(arg: T): T {
    return arg;
}

const result = identity<string>("test");

// Union types
function process(value: string | number): void {
    if (typeof value === "string") {
        console.log(value.toUpperCase());
    } else {
        console.log(value.toFixed(2));
    }
}

// Type aliases
type Point = { x: number; y: number };
type Callback = (data: string) => void;

// Enums
enum Direction {
    Up,
    Down,
    Left,
    Right
}

// Classes with access modifiers
class Animal {
    private name: string;
    protected age: number;
    
    constructor(name: string, age: number) {
        this.name = name;
        this.age = age;
    }
    
    public speak(): void {
        console.log(`${this.name} makes a sound`);
    }
}

Compilation process:

# Install TypeScript
npm install -g typescript

# Compile single file
tsc example.ts
# Output: example.js

# Compile with config
tsc --project tsconfig.json

tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "lib": ["ES2020", "DOM"],
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "sourceMap": true,
    "declaration": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}

Input vs. Output:

// Input: example.ts
interface Person {
    name: string;
    age: number;
}

function greet(person: Person): string {
    return `Hello, ${person.name}!`;
}

const user: Person = { name: "Alice", age: 30 };
console.log(greet(user));

// Output: example.js (types erased)
function greet(person) {
    return `Hello, ${person.name}!`;
}

const user = { name: "Alice", age: 30 };
console.log(greet(user));

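Type erasure: All type annotations are removed during compilation, so no runtime type checking occurs. (A few TypeScript-only constructs, such as enums and constructor parameter properties, do emit JavaScript code; plain annotations and interfaces do not.)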
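
Because the emitted JavaScript carries no type information, data that arrives at runtime (parsed JSON, message payloads) has to be checked by hand if it matters. The following is a minimal sketch of such a check for the Person shape used above; `isPerson` is a hypothetical helper written for this illustration, not something the TypeScript compiler generates.

```javascript
// Hypothetical runtime guard for the { name: string, age: number } shape.
function isPerson(value) {
    return typeof value === 'object' && value !== null &&
        typeof value.name === 'string' &&
        typeof value.age === 'number';
}

function greet(person) {
    if (!isPerson(person)) {
        throw new TypeError('Expected { name: string, age: number }');
    }
    return `Hello, ${person.name}!`;
}

// Data arriving at runtime was never seen by the type checker.
const parsed = JSON.parse('{"name": "Alice", "age": "30"}');  // age is a string

console.log(isPerson(parsed));                    // false
console.log(greet({ name: 'Alice', age: 30 }));   // Hello, Alice!
```

In a TypeScript source file the same function would typically be declared as a type guard, so the compile-time and runtime views agree.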

Babel: Modern JavaScript to Legacy JavaScript

Babel transpiles modern JavaScript (ES6+) to older versions (ES5) for browser compatibility.

Common transformations:

// Input: ES6+
const greet = (name) => `Hello, ${name}!`;

class Counter {
    count = 0;
    
    increment() {
        this.count++;
    }
}

const [first, ...rest] = [1, 2, 3, 4];
const obj = { x: 1, y: 2, ...other };

async function fetchData() {
    const response = await fetch('/api/data');
    return response.json();
}

// Output: ES5
"use strict";

var greet = function greet(name) {
    return "Hello, ".concat(name, "!");
};

var Counter = function Counter() {
    _classCallCheck(this, Counter);
    this.count = 0;
};

Counter.prototype.increment = function increment() {
    this.count++;
};

var first = [1, 2, 3, 4][0];
var rest = [1, 2, 3, 4].slice(1);

var obj = Object.assign({ x: 1, y: 2 }, other);

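// Simplified regenerator-style output; helper definitions omitted
function fetchData() {
    var response;  // Hoisted local from the original async function
    return regeneratorRuntime.async(function fetchData$(_context) {
        while (1) {
            switch (_context.prev = _context.next) {
                case 0:
                    _context.next = 2;
                    return regeneratorRuntime.awrap(fetch('/api/data'));
                case 2:
                    response = _context.sent;
                    return _context.abrupt("return", response.json());
                case 4:
                case "end":
                    return _context.stop();
            }
        }
    });
}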

Configuration (.babelrc):

{
  "presets": [
    ["@babel/preset-env", {
      "targets": "> 0.25%, not dead",
      "useBuiltIns": "usage",
      "corejs": 3
    }]
  ],
  "plugins": [
    "@babel/plugin-proposal-class-properties",
    "@babel/plugin-transform-runtime"
  ]
}

Polyfills: Babel can inject polyfills for missing APIs:

// Input
const arr = [1, 2, 3];
const doubled = arr.map(x => x * 2);
const hasTwo = arr.includes(2);  // ES2016 method

// Output with polyfill injection
require("core-js/modules/es.array.includes");

const arr = [1, 2, 3];
const doubled = arr.map(x => x * 2);
const hasTwo = arr.includes(2);  // Polyfill loaded
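
What an injected polyfill does at load time can be sketched in a few lines: detect whether the engine already provides the method and install a fallback only if it does not. The code below is a simplified stand-in written for illustration, not the core-js implementation; `includesFallback` is a hypothetical name, and details such as non-integer `fromIndex` values are not handled.

```javascript
// Simplified fallback for Array.prototype.includes (sketch, not core-js).
function includesFallback(array, searchElement, fromIndex = 0) {
    const length = array.length >>> 0;
    // Negative fromIndex counts back from the end, clamped at 0
    let i = fromIndex >= 0 ? fromIndex : Math.max(length + fromIndex, 0);
    for (; i < length; i++) {
        const element = array[i];
        // SameValueZero comparison: like ===, except NaN matches NaN
        if (element === searchElement ||
            (element !== element && searchElement !== searchElement)) {
            return true;
        }
    }
    return false;
}

// Feature detection: install only when the engine lacks the method
if (typeof Array.prototype.includes !== 'function') {
    Object.defineProperty(Array.prototype, 'includes', {
        value: function (searchElement, fromIndex) {
            return includesFallback(this, searchElement, fromIndex);
        },
        writable: true,
        configurable: true
    });
}

console.log(includesFallback([1, 2, 3], 2));      // true
console.log(includesFallback([1, NaN], NaN));     // true
console.log(includesFallback([1, 2, 3], 1, 1));   // false
```

On a modern engine the detection branch is skipped entirely, which is why polyfills cost little where they are not needed.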

JSX: Declarative UI Syntax

JSX (JavaScript XML) embeds XML-like syntax in JavaScript, primarily used by React.

Input (JSX):

const App = ({ name, count }) => {
    const [state, setState] = React.useState(0);
    
    const handleClick = () => {
        setState(state + 1);
    };
    
    return (
        <div className="container">
            <h1>Hello, {name}!</h1>
            <p>Count: {count}</p>
            <button onClick={handleClick}>
                Increment
            </button>
            {state > 5 && <p>State is greater than 5</p>}
            <List items={[1, 2, 3]} />
        </div>
    );
};

Output (JavaScript):

const App = ({ name, count }) => {
    const [state, setState] = React.useState(0);
    
    const handleClick = () => {
        setState(state + 1);
    };
    
    return React.createElement(
        "div",
        { className: "container" },
        React.createElement("h1", null, "Hello, ", name, "!"),
        React.createElement("p", null, "Count: ", count),
        React.createElement(
            "button",
            { onClick: handleClick },
            "Increment"
        ),
        state > 5 && React.createElement("p", null, "State is greater than 5"),
        React.createElement(List, { items: [1, 2, 3] })
    );
};

Transformation details:

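  • Tags → React.createElement(type, props, ...children) (classic runtime; the automatic runtime introduced with React 17 emits jsx()/jsxs() calls imported from react/jsx-runtime instead)

  • Attributes → Props object

  • Self-closing tags → Element with no children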

  • Expressions in {} → Evaluated JavaScript
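
To make the mapping concrete, the sketch below replaces React.createElement with a toy function that returns plain objects, then writes by hand what a JSX compiler would emit for a small component. The `createElement` and `view` functions here are illustrative only; React's real element objects carry additional fields.

```javascript
// Toy stand-in for React.createElement: builds a plain object tree.
function createElement(type, props, ...children) {
    return {
        type,
        props: { ...(props || {}), children }
    };
}

// Hand-written equivalent of:
//   <div className="container">
//       <h1>Hello, {name}!</h1>
//       {count > 5 && <p>Big</p>}
//   </div>
function view({ name, count }) {
    return createElement(
        'div',
        { className: 'container' },
        createElement('h1', null, 'Hello, ', name, '!'),
        count > 5 && createElement('p', null, 'Big')
    );
}

const tree = view({ name: 'Alice', count: 2 });
console.log(tree.type);                                        // div
console.log(tree.props.children[0].props.children.join(''));   // Hello, Alice!
console.log(tree.props.children[1]);                           // false
```

Note that the text and the `{name}` expression become separate children, and that a false `&&` condition leaves a literal `false` in the child list, which renderers skip.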

Other Notable Transpilers

CoffeeScript (2009):

# Input: CoffeeScript
square = (x) -> x * x
numbers = [1..10]
squares = (square num for num in numbers)

// Output: JavaScript
var square, numbers, squares, num;  // num is declared too, so it does not leak as a global

square = function(x) {
    return x * x;
};

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

squares = (function() {
    var i, len, results;
    results = [];
    for (i = 0, len = numbers.length; i < len; i++) {
        num = numbers[i];
        results.push(square(num));
    }
    return results;
})();

Dart (with dart2js):

// Input: Dart
void main() {
    var greeting = 'Hello, World!';
    print(greeting);
    
    var numbers = [1, 2, 3, 4, 5];
    var doubled = numbers.map((n) => n * 2).toList();
    print(doubled);
}

Output is heavily optimized JavaScript (several thousand lines for even simple programs due to runtime library).


asm.js: The Optimizable Subset

What is asm.js?

asm.js is a strict subset of JavaScript designed to be optimized ahead-of-time (AOT) by JavaScript engines.

Key principles:

  1. Statically typed (through type annotations via bitwise ops)

  2. No garbage collection (manual memory management)

  3. Predictable performance (no dynamic behavior)

  4. Validation (can be verified as valid asm.js)

Example asm.js module:

function MyModule(stdlib, foreign, heap) {
    "use asm";
    
    // Imports
    var sqrt = stdlib.Math.sqrt;
    var log = foreign.log;
    
    // Heap (typed array view)
    var HEAP32 = new stdlib.Int32Array(heap);
    var HEAPF64 = new stdlib.Float64Array(heap);
    
    // Functions
    function add(x, y) {
        x = x | 0;      // x is int32
        y = y | 0;      // y is int32
        return (x + y) | 0;
    }
    
    function distance(x1, y1, x2, y2) {
        x1 = +x1;       // x1 is double
        y1 = +y1;       // y1 is double
        x2 = +x2;       // x2 is double
        y2 = +y2;       // y2 is double
        
        var dx = 0.0;
        var dy = 0.0;
        
        dx = x2 - x1;
        dy = y2 - y1;
        
        return +sqrt(dx * dx + dy * dy);
    }
    
    function processArray(start, length) {
        start = start | 0;
        length = length | 0;
        
        var i = 0;
        var sum = 0;
        
        for (i = 0; (i | 0) < (length | 0); i = (i + 1) | 0) {
            // start is a byte offset; each int32 element occupies 4 bytes
            sum = (sum + (HEAP32[(start + (i << 2)) >> 2] | 0)) | 0;
        }
        
        return sum | 0;
    }
    
    return {
        add: add,
        distance: distance,
        processArray: processArray
    };
}

// Usage
const stdlib = {
    Math: Math,
    Int32Array: Int32Array,
    Float64Array: Float64Array
};

const foreign = {
    log: console.log
};

const heap = new ArrayBuffer(1024 * 1024);  // 1MB

const module = MyModule(stdlib, foreign, heap);

console.log(module.add(5, 3));                    // 8
console.log(module.distance(0, 0, 3, 4));         // 5

Type Annotations via Coercion

asm.js uses bitwise/unary operations for type annotations:

function typed(x, y) {
    "use asm";
    
    x = x | 0;          // x: int32
    y = +y;             // y: double
    
    var i = 0;          // i: int32 (inferred)
    var f = 0.0;        // f: double (inferred)
    
    i = (x + 10) | 0;   // Ensure int32
    f = y + 3.14;       // Already double
    
    return +f;          // Return double
}

Type coercion operators:

Operator     Type      Example
| 0          int32     x = x | 0
+ (unary)    double    y = +y
~~           int32     x = ~~x
>>> 0        uint32    x = x >>> 0
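
These operators are ordinary JavaScript, so their effect can be observed outside any asm.js module. The short sketch below shows the value each coercion produces; the helper names are illustrative only.

```javascript
// Each operator forces its operand into a specific numeric range.
const toInt32  = (x) => x | 0;     // signed 32-bit, wraps on overflow
const toUint32 = (x) => x >>> 0;   // unsigned 32-bit
const toDouble = (x) => +x;        // IEEE 754 double
const truncate = (x) => ~~x;       // signed 32-bit, fraction dropped

console.log(toInt32(3.9));           // 3
console.log(toInt32(2 ** 31));       // -2147483648
console.log(toUint32(-1));           // 4294967295
console.log(toDouble('1.5'));        // 1.5
console.log(truncate(-3.7));         // -3
console.log((0x7fffffff + 1) | 0);   // -2147483648
```

The wrap-around in the last line is the reason asm.js arithmetic such as `(x + y) | 0` behaves like C integer arithmetic rather than like JavaScript doubles.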

Memory Model

asm.js uses typed array views over ArrayBuffer:

function MemoryModule(stdlib, foreign, heap) {
    "use asm";
    
    var HEAP8 = new stdlib.Int8Array(heap);
    var HEAP16 = new stdlib.Int16Array(heap);
    var HEAP32 = new stdlib.Int32Array(heap);
    var HEAPF32 = new stdlib.Float32Array(heap);
    var HEAPF64 = new stdlib.Float64Array(heap);
    
    function writeInt32(offset, value) {
        offset = offset | 0;
        value = value | 0;
        
        HEAP32[offset >> 2] = value;
    }
    
    function readInt32(offset) {
        offset = offset | 0;
        return HEAP32[offset >> 2] | 0;
    }
    
    function writeFloat64(offset, value) {
        offset = offset | 0;
        value = +value;
        
        HEAPF64[offset >> 3] = value;
    }
    
    return {
        writeInt32: writeInt32,
        readInt32: readInt32,
        writeFloat64: writeFloat64
    };
}

Byte offset calculations:

  • Int32: offset >> 2 (divide by 4)

  • Float64: offset >> 3 (divide by 8)

  • Int16: offset >> 1 (divide by 2)
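
The shift-based indexing can be tried directly with typed arrays. In this sketch several views share one ArrayBuffer, values are written through the shifted index, and each is read back through a separate view created at the same byte offset. The buffer size and offsets are arbitrary choices for illustration.

```javascript
// Several typed views over one buffer, as in an asm.js heap
const heap = new ArrayBuffer(64);
const HEAP16 = new Int16Array(heap);
const HEAP32 = new Int32Array(heap);
const HEAPF64 = new Float64Array(heap);

// Byte offset → element index via right shift
HEAP32[8 >> 2] = 0x12345678;   // byte offset 8  → index 2
HEAPF64[16 >> 3] = 2.5;        // byte offset 16 → index 2
HEAP16[24 >> 1] = -2;          // byte offset 24 → index 12

// Cross-check: one-element views positioned at the same byte offsets
console.log(new Int32Array(heap, 8, 1)[0].toString(16));   // 12345678
console.log(new Float64Array(heap, 16, 1)[0]);             // 2.5
console.log(new Int16Array(heap, 24, 1)[0]);               // -2

// An unaligned byte offset is rounded down to the containing element
console.log(13 >> 2);   // 3 (bytes 12..15)
```

The last line shows why compiled code keeps pointers aligned: the shift silently discards the low bits of the offset.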

Emscripten: C/C++ to asm.js

Emscripten compiles C/C++ to asm.js using LLVM.

Compilation pipeline:

C/C++ source
     ↓
Clang (frontend)
     ↓
LLVM IR (intermediate representation)
     ↓
LLVM optimization passes
     ↓
Emscripten backend
     ↓
asm.js output

Example C code:

// example.c
#include <stdio.h>
#include <emscripten.h>

EMSCRIPTEN_KEEPALIVE
int fibonacci(int n) {
    if (n <= 1) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main() {
    printf("fib(10) = %d\n", fibonacci(10));
    return 0;
}

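Compile (current Emscripten releases emit WebAssembly by default; -s WASM=0 requests JavaScript output instead):

emcc example.c -o example.js -s WASM=0 -s EXPORTED_FUNCTIONS='["_fibonacci"]' -s EXPORTED_RUNTIME_METHODS='["ccall"]'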

Generated output (simplified):

// example.js (thousands of lines)
var Module = {
    // ... runtime setup ...
};

function _fibonacci(n) {
    n = n | 0;
    var $0 = 0, $1 = 0;
    
    if ((n | 0) <= 1) {
        return n | 0;
    }
    
    $0 = _fibonacci((n - 1) | 0) | 0;
    $1 = _fibonacci((n - 2) | 0) | 0;
    
    return ($0 + $1) | 0;
}

// Exported interface
Module.ccall = function(name, returnType, argTypes, args) {
    // ... calling convention ...
};

Usage from JavaScript:

// Load generated module
const result = Module.ccall('fibonacci', 'number', ['number'], [10]);
console.log(result);  // 55

Performance characteristics:

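  • Roughly 2-4x slower than native C/C++ (workload- and engine-dependent)

  • Often much faster than hand-written JavaScript for numeric code (the 2-10x range is indicative, not a guarantee)

  • Predictable: no JIT warm-up in engines that validate and compile asm.js ahead of time (e.g. Firefox); other engines run it as ordinary JavaScript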


WebAssembly: The Binary Compilation Target

WebAssembly vs. asm.js

WebAssembly (WASM) evolved from the asm.js concept as a true binary format rather than a JavaScript subset.

Key differences:

asm.js vs WebAssembly (speed figures are approximate)

┌────────────────────────────┬────────────────────────────┐
│ asm.js                     │ WebAssembly                │
├────────────────────────────┼────────────────────────────┤
│ Text format (JavaScript)   │ Binary format (.wasm)      │
│ Large file size            │ Compact (~50% smaller)     │
│ Parsing overhead           │ Fast decode/validate       │
│ ~2x slower than native     │ ~1.5x slower than native   │
│ JIT compilation            │ AOT compilation            │
│ JavaScript subset          │ Independent bytecode       │
└────────────────────────────┴────────────────────────────┘

WASM Module Structure

Binary format (.wasm) consists of sections:

┌────────────────────────────────────────┐
│ WebAssembly Module                     │
├────────────────────────────────────────┤
│ Magic number: 0x00 0x61 0x73 0x6d      │
│ Version:      0x01 0x00 0x00 0x00      │
├────────────────────────────────────────┤
│ 1. Type Section (id 1)                 │
│    Function signatures                 │
├────────────────────────────────────────┤
│ 2. Import Section (id 2)               │
│    External functions/memory           │
├────────────────────────────────────────┤
│ 3. Function Section (id 3)             │
│    Function type indices               │
├────────────────────────────────────────┤
│ 4. Memory Section (id 5)               │
│    Linear memory definition            │
├────────────────────────────────────────┤
│ 5. Export Section (id 7)               │
│    Exported functions/memory           │
├────────────────────────────────────────┤
│ 6. Code Section (id 10)                │
│    Function bodies (bytecode)          │
├────────────────────────────────────────┤
│ 7. Data Section (id 11)                │
│    Initial memory contents             │
└────────────────────────────────────────┘

The numbers 1-7 are positions in this diagram; the id in parentheses is the section's identifier byte in the binary. The Table (4), Global (6), Start (8), Element (9), and custom (0) sections are omitted here.
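
The layout can be checked against a real binary. The sketch below hand-assembles a minimal module that exports a single `add` function, walks its sections by reading each identifier byte and LEB128-encoded size, and then instantiates it. The byte sequence and the `listSections` and `readU32Leb` helpers are written for this illustration; a production tool would rely on an existing parser.

```javascript
// Minimal module: (func (export "add") (param i32 i32) (result i32) ...)
const wasmBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d,                                 // magic "\0asm"
    0x01, 0x00, 0x00, 0x00,                                 // version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,   // type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                 // function: uses type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,   // export: "add" = func 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                           // code: 1 body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                      // local.get 0/1, i32.add, end
]);

// Decode an unsigned LEB128 integer starting at pos
function readU32Leb(buf, pos) {
    let result = 0, shift = 0, byte;
    do {
        byte = buf[pos++];
        result |= (byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return { value: result >>> 0, next: pos };
}

// Walk the section headers that follow the 8-byte preamble
function listSections(buf) {
    const magic = [0x00, 0x61, 0x73, 0x6d];
    if (!magic.every((b, i) => buf[i] === b)) throw new Error('Not a WASM binary');
    const sections = [];
    let pos = 8;
    while (pos < buf.length) {
        const id = buf[pos++];
        const { value: size, next } = readU32Leb(buf, pos);
        sections.push({ id, size });
        pos = next + size;   // Skip the section payload
    }
    return sections;
}

console.log(listSections(wasmBytes).map((s) => s.id));   // [ 1, 3, 7, 10 ]

// Synchronous compilation is acceptable for a module this small
const tiny = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
console.log(tiny.exports.add(5, 3));   // 8
```

Sections a module does not need, such as Import and Memory here, are simply absent; those present appear in increasing id order.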

Text Format (WAT)

WebAssembly Text format for human readability:

;; example.wat
(module
  ;; Import console.log from JavaScript
  (import "env" "log" (func $log (param i32)))
  
  ;; Define memory (1 page = 64KB)
  (memory 1)
  
  ;; Export memory to JavaScript
  (export "memory" (memory 0))
  
  ;; Function: add two numbers
  (func $add (param $x i32) (param $y i32) (result i32)
    local.get $x
    local.get $y
    i32.add
  )
  
  ;; Export add function
  (export "add" (func $add))
  
  ;; Function: fibonacci
  (func $fib (param $n i32) (result i32)
    (local $a i32)
    (local $b i32)
    (local $temp i32)
    (local $i i32)
    
    ;; Base cases
    (if (i32.le_s (local.get $n) (i32.const 1))
      (then
        (return (local.get $n))
      )
    )
    
    ;; Initialize
    (local.set $a (i32.const 0))
    (local.set $b (i32.const 1))
    (local.set $i (i32.const 2))
    
    ;; Loop
    (block $break
      (loop $continue
        ;; temp = a + b
        (local.set $temp
          (i32.add (local.get $a) (local.get $b))
        )
        
        ;; a = b
        (local.set $a (local.get $b))
        
        ;; b = temp
        (local.set $b (local.get $temp))
        
        ;; i++
        (local.set $i
          (i32.add (local.get $i) (i32.const 1))
        )
        
        ;; if i <= n, continue
        (br_if $continue
          (i32.le_s (local.get $i) (local.get $n))
        )
      )
    )
    
    (local.get $b)
  )
  
  (export "fib" (func $fib))
)

Compile WAT to WASM:

wat2wasm example.wat -o example.wasm

Loading and Using WASM in JavaScript

Basic loading:

// Fetch and instantiate
const response = await fetch('example.wasm');
const buffer = await response.arrayBuffer();
const module = await WebAssembly.instantiate(buffer, {
    env: {
        log: (x) => console.log('WASM log:', x)
    }
});

const { add, fib, memory } = module.instance.exports;

console.log(add(5, 3));     // 8
console.log(fib(10));       // 55

// Access linear memory
const view = new Uint8Array(memory.buffer);
console.log(view[0]);       // First byte of memory

Streaming compilation (more efficient):

const { instance } = await WebAssembly.instantiateStreaming(
    fetch('example.wasm'),
    {
        env: {
            log: console.log
        }
    }
);

const { add, fib } = instance.exports;

Memory Sharing Between JS and WASM

Linear memory is shared via ArrayBuffer:

// JavaScript side
const memory = new WebAssembly.Memory({ initial: 1 });  // 1 page = 64KB

const { writeString } = await WebAssembly.instantiateStreaming(
    fetch('string.wasm'),
    {
        env: { memory }
    }
).then(m => m.instance.exports);

// Write string to WASM memory
const encoder = new TextEncoder();
const text = "Hello, WASM!";
const bytes = encoder.encode(text);

const view = new Uint8Array(memory.buffer);
view.set(bytes, 0);  // Write at offset 0

// Call WASM function (uppercases ASCII letters in place in linear memory)
const length = bytes.length;
writeString(0, length);

// Read back result
const resultBytes = view.slice(0, length);
const decoder = new TextDecoder();
console.log(decoder.decode(resultBytes));  // "HELLO, WASM!"

WAT code for string processing:

(module
  (memory (import "env" "memory") 1)
  
  (func $writeString (param $offset i32) (param $length i32)
    (local $i i32)
    (local $byte i32)
    
    (local.set $i (i32.const 0))
    
    (block $break
      (loop $continue
        ;; Exit once i >= length (also covers length = 0)
        (br_if $break
          (i32.ge_u (local.get $i) (local.get $length))
        )
        
        ;; Get byte
        (local.set $byte
          (i32.load8_u (i32.add (local.get $offset) (local.get $i)))
        )
        
        ;; Convert to uppercase (if lowercase letter)
        (if (i32.and
              (i32.ge_u (local.get $byte) (i32.const 97))
              (i32.le_u (local.get $byte) (i32.const 122))
            )
          (then
            (local.set $byte
              (i32.sub (local.get $byte) (i32.const 32))
            )
            (i32.store8
              (i32.add (local.get $offset) (local.get $i))
              (local.get $byte)
            )
          )
        )
        
        ;; Increment
        (local.set $i (i32.add (local.get $i) (i32.const 1)))
        
        ;; Next iteration
        (br $continue)
      )
    )
  )
  
  (export "writeString" (func $writeString))
)
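
One caveat applies to any typed array view over linear memory: when the memory grows, the old ArrayBuffer is detached and existing views become empty. The sketch below demonstrates this with a standalone WebAssembly.Memory; the page counts are arbitrary values chosen for illustration.

```javascript
const PAGE_SIZE = 64 * 1024;   // One WebAssembly page
const growable = new WebAssembly.Memory({ initial: 1, maximum: 4 });

let bytesView = new Uint8Array(growable.buffer);
bytesView[0] = 42;

// grow() returns the previous size in pages and detaches the old buffer
const previousPages = growable.grow(1);
console.log(previousPages);          // 1
console.log(bytesView.byteLength);   // 0 (stale view)

// Re-create the view over the new buffer; contents are preserved
bytesView = new Uint8Array(growable.buffer);
console.log(bytesView.byteLength === 2 * PAGE_SIZE);   // true
console.log(bytesView[0]);                             // 42
```

Code that calls into a module which may allocate (and therefore grow memory) should re-create its views after the call rather than caching them.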

Compiling C/C++ to WASM with Emscripten

Modern Emscripten targets WASM instead of asm.js:

// example.c
#include <emscripten.h>
#include <math.h>
#include <stdlib.h>  // malloc, free

EMSCRIPTEN_KEEPALIVE
double calculate(double x, double y) {
    return sqrt(x * x + y * y);
}

EMSCRIPTEN_KEEPALIVE
int* createArray(int size) {
    int* arr = (int*)malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) {
        arr[i] = i * i;
    }
    return arr;
}

EMSCRIPTEN_KEEPALIVE
void freeArray(int* arr) {
    free(arr);
}

Compile to WASM:

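# MODULARIZE and EXPORT_NAME make example.js expose the createModule() factory used below
emcc example.c -o example.js \
  -s WASM=1 \
  -s MODULARIZE=1 \
  -s EXPORT_NAME=createModule \
  -s EXPORTED_FUNCTIONS='["_calculate","_createArray","_freeArray"]' \
  -s EXPORTED_RUNTIME_METHODS='["ccall","cwrap"]' \
  -s ALLOW_MEMORY_GROWTH=1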

Generated files:

  • example.wasm: Binary module

  • example.js: JavaScript glue code (loading, memory management, exports)

Usage:

// Load generated module
const Module = await createModule();

// Call C function directly
const result = Module._calculate(3.0, 4.0);
console.log(result);  // 5.0

// Or use ccall wrapper
const result2 = Module.ccall(
    'calculate',    // Function name
    'number',       // Return type
    ['number', 'number'],  // Argument types
    [3.0, 4.0]      // Arguments
);

// Create array in WASM memory
const ptr = Module._createArray(10);

// Access array through the Int32Array view over linear memory
// (re-read Module.HEAP32 after any call that may grow memory)
const HEAP32 = Module.HEAP32;
const offset = ptr >> 2;  // Convert byte pointer to int32 index

for (let i = 0; i < 10; i++) {
    console.log(HEAP32[offset + i]);  // 0, 1, 4, 9, 16, ...
}

// Free memory
Module._freeArray(ptr);

Compiling Rust to WASM

Rust has excellent WASM support:

// src/lib.rs
use wasm_bindgen::prelude::*;

// Export to JavaScript
#[wasm_bindgen]
pub fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

#[wasm_bindgen]
pub fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

#[wasm_bindgen]
pub struct Point {
    x: f64,
    y: f64,
}

#[wasm_bindgen]
impl Point {
    #[wasm_bindgen(constructor)]
    pub fn new(x: f64, y: f64) -> Point {
        Point { x, y }
    }
    
    pub fn distance(&self, other: &Point) -> f64 {
        let dx = self.x - other.x;
        let dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

Build:

# Install wasm-pack
cargo install wasm-pack

# Build for web
wasm-pack build --target web

Generated files (in pkg/):

  • example_bg.wasm: WASM module

  • example.js: JavaScript bindings

  • example.d.ts: TypeScript definitions

Usage:

import init, { greet, fibonacci, Point } from './pkg/example.js';

// Initialize WASM module
await init();

// Call functions
console.log(greet("Alice"));      // "Hello, Alice!"
console.log(fibonacci(10));       // 55

// Use exported class
const p1 = new Point(0, 0);
const p2 = new Point(3, 4);
console.log(p1.distance(p2));     // 5.0

Performance Considerations

When to Use WASM

Good use cases:

  • CPU-intensive computations (image processing, cryptography, compression)

  • Existing C/C++/Rust codebases

  • Predictable performance requirements

  • Games and simulations

  • Audio/video processing

Poor use cases:

  • DOM manipulation (WASM has no direct DOM access; every DOM call goes through JavaScript anyway)

  • Simple logic (overhead of WASM call not justified)

  • Frequent string operations (encoding/decoding overhead)

  • Heavy JS interop (boundary crossing has cost)

Benchmark: JS vs. WASM

Fibonacci (recursive):

// JavaScript
function fib(n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

console.time('JS fib(40)');
console.log(fib(40));
console.timeEnd('JS fib(40)');
// ~800ms (illustrative; varies by engine and hardware)

// C compiled to WASM
int fib(int n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

// From JavaScript
console.time('WASM fib(40)');
console.log(Module._fib(40));
console.timeEnd('WASM fib(40)');
// ~400ms (illustrative; about 2x faster)

Image processing (pixel manipulation):

// JavaScript
function grayscale(imageData) {
    const data = imageData.data;
    for (let i = 0; i < data.length; i += 4) {
        const avg = (data[i] + data[i+1] + data[i+2]) / 3;
        data[i] = data[i+1] = data[i+2] = avg;
    }
}
// 1920×1080 image: ~15ms (illustrative)

// C compiled to WASM
void grayscale(unsigned char* data, int length) {
    for (int i = 0; i < length; i += 4) {
        unsigned char avg = (data[i] + data[i+1] + data[i+2]) / 3;
        data[i] = data[i+1] = data[i+2] = avg;
    }
}
// 1920×1080 image: ~5ms (illustrative; about 3x faster)

Call Overhead

Crossing JS/WASM boundary has cost:

// Many small calls: SLOWER
for (let i = 0; i < 1000000; i++) {
    Module._add(i, i);  // Call overhead × 1M
}

// One large call: FASTER
const result = Module._processArray(ptr, 1000000);

Rule of thumb: Minimize boundary crossings. Do bulk work in WASM.


Source Maps and Debugging

Source Maps

Source maps map generated code back to original source.

TypeScript example:

tsc --sourceMap example.ts

Generated files:

  • example.js: Compiled JavaScript

  • example.js.map: Source map

Source map structure (JSON):

{
  "version": 3,
  "file": "example.js",
  "sourceRoot": "",
  "sources": ["example.ts"],
  "names": [],
  "mappings": "AAAA,IAAM,KAAK,GAAG,CAAC,CAAC,EAAE,CAAC..."
}

Browser support: Modern browsers automatically load source maps and show original TypeScript in DevTools.
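
The "mappings" string is less opaque than it looks. Lines of generated code are separated by semicolons, segments within a line by commas, and each segment is a run of Base64 VLQ numbers: generated column, source file index, source line, source column, and optionally a name index, each stored as a delta from the previous segment. A minimal decoder for one segment might look like the sketch below; `decodeVlqSegment` is a hypothetical helper, and real tooling would use an existing source-map library.

```javascript
const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one comma-separated segment of a source map "mappings" string
function decodeVlqSegment(segment) {
    const values = [];
    let value = 0;
    let shift = 0;
    for (const ch of segment) {
        const digit = BASE64.indexOf(ch);
        if (digit === -1) throw new Error(`Invalid base64 character: ${ch}`);
        value += (digit & 31) << shift;     // Low 5 bits carry data
        if (digit & 32) {                   // Continuation bit: more digits follow
            shift += 5;
        } else {
            const negative = value & 1;     // Lowest bit of the value is the sign
            value >>>= 1;
            values.push(negative ? -value : value);
            value = 0;
            shift = 0;
        }
    }
    return values;
}

console.log(decodeVlqSegment('AAAA'));   // [ 0, 0, 0, 0 ]
console.log(decodeVlqSegment('IAAM'));   // [ 4, 0, 0, 6 ]
console.log(decodeVlqSegment('gB'));     // [ 16 ]
console.log(decodeVlqSegment('D'));      // [ -1 ]
```

Read this way, the segment IAAM from the example above says: 4 columns further along in the generated file, same source file, same source line, 6 columns further along in the source.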

Debugging WASM

DWARF debug info (C/C++):

emcc example.c -o example.js -g
# Includes source-level debug information

Chrome DevTools supports WASM debugging (source-level C/C++ debugging additionally needs the C/C++ DevTools Support (DWARF) extension):

  1. Set breakpoints in WASM code

  2. Inspect local variables

  3. Step through instructions

  4. View call stack

Console:

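// List the module's exports (name and kind of each); this is not a text disassembly
WebAssembly.Module.exports(module);

// Exported functions print as native code, not as WASM instructions;
// use the DevTools Sources panel to view the disassembled text format
console.log(instance.exports.add.toString());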
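
To see what these reflection calls return, the sketch below instantiates a hand-assembled 41-byte module that exports a single add function. The byte array was written out for this illustration and is not Emscripten output; the variable names are likewise illustrative. Synchronous compilation is used only because the module is tiny.

```javascript
// Minimal module: one function (i32, i32) -> i32 exported as "add".
const wasmBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d,             // magic: "\0asm"
    0x01, 0x00, 0x00, 0x00,             // binary format version 1
    // Type section: one signature, (i32, i32) -> i32
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
    // Function section: function 0 uses signature 0
    0x03, 0x02, 0x01, 0x00,
    // Export section: export function 0 under the name "add"
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
    // Code section: local.get 0, local.get 1, i32.add, end
    0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b
]);

const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Reflection works on the compiled module, before or after instantiation.
console.log(JSON.stringify(WebAssembly.Module.exports(wasmModule)));
// → [{"name":"add","kind":"function"}]

console.log(JSON.stringify(WebAssembly.Module.imports(wasmModule)));
// → []

// Exported functions are ordinary callable JavaScript functions.
console.log(wasmInstance.exports.add(2, 3));
// → 5
```

The export list gives names and kinds only. Reading the actual instructions requires a disassembler such as the one built into the DevTools Sources panel.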

Tooling Ecosystem

Build Tools

Webpack:

// webpack.config.js
module.exports = {
    entry: './src/index.js',
    output: {
        filename: 'bundle.js'
    },
    module: {
        rules: [
            {
                test: /\.tsx?$/,
                use: 'ts-loader',
                exclude: /node_modules/
            },
            {
                test: /\.jsx?$/,
                use: 'babel-loader'
            }
        ]
    },
    experiments: {
        asyncWebAssembly: true
    }
};

Vite (modern, faster):

// vite.config.js
export default {
    build: {
        target: 'es2020'
    },
    optimizeDeps: {
        // Takes package names: keep the WASM package out of dependency pre-bundling
        exclude: ['@example/wasm-package']
    }
};

Package Managers

npm/yarn support WASM packages:

npm install @example/wasm-package

import init, { process } from '@example/wasm-package';

await init();
const result = process(data);

Summary

JavaScript as a compilation target has transformed web development:

Transpilers (source-to-source):

  • TypeScript: Static typing, compile-time checks

  • Babel: Modern JS → Legacy JS, polyfills

  • JSX: Declarative UI → React.createElement

  • Output is readable, high-level JavaScript

Compilers (low-level to JS/WASM):

  • asm.js: Optimizable JS subset, manual memory management

  • WebAssembly: Binary format, near-native performance

  • Emscripten: C/C++ → WASM/asm.js

  • Rust: First-class WASM support via wasm-bindgen

Key insights:

  • WASM is not a replacement for JavaScript, but a complement

  • Use WASM for CPU-intensive tasks

  • Use JS for DOM manipulation and high-level logic

  • Minimize JS ↔︎ WASM boundary crossings

  • Source maps enable debugging of generated code

Evolution: Hand-written JS → Transpiled JS → asm.js → WebAssembly

The future is polyglot: write in any language, compile to WASM, interoperate seamlessly with JavaScript in the browser runtime defined by ECMA-262 and extended by Web APIs.


Chapter 12: Building a Source-to-Source Compiler

Introduction: Anatomy of a Transpiler

A source-to-source compiler (transpiler) transforms code from one high-level language to another. Unlike traditional compilers that target machine code, transpilers maintain the abstraction level.

This chapter builds a complete mini-transpiler that converts a simple ML-like language to JavaScript.

Source language features:

let x = 5
let y = x + 3
let square = fn(n) => n * n
print(square(y))

Target (JavaScript):

const x = 5;
const y = x + 3;
const square = (n) => n * n;
console.log(square(y));

Compiler pipeline:

┌─────────────────────────────────────────────────────┐
│                 Compilation Phases                  │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Source Code                                        │
│      ↓                                              │
│  1. Lexical Analysis (Lexer/Tokenizer)              │
│      └→ Token stream                                │
│      ↓                                              │
│  2. Syntax Analysis (Parser)                        │
│      └→ Abstract Syntax Tree (AST)                  │
│      ↓                                              │
│  3. Semantic Analysis (optional)                    │
│      └→ Annotated AST / Symbol tables               │
│      ↓                                              │
│  4. Optimization (optional)                         │
│      └→ Transformed AST                             │
│      ↓                                              │
│  5. Code Generation (Emitter)                       │
│      └→ Target code                                 │
│                                                     │
└─────────────────────────────────────────────────────┘


Phase 1: Lexical Analysis (Tokenization)

Token Definition

A token is the smallest meaningful unit of source code.

Token types:

const TokenType = {
    // Literals
    NUMBER: 'NUMBER',
    STRING: 'STRING',
    IDENTIFIER: 'IDENTIFIER',
    
    // Keywords
    LET: 'LET',
    FN: 'FN',
    IF: 'IF',
    ELSE: 'ELSE',
    RETURN: 'RETURN',
    PRINT: 'PRINT',
    
    // Operators
    PLUS: 'PLUS',
    MINUS: 'MINUS',
    STAR: 'STAR',
    SLASH: 'SLASH',
    PERCENT: 'PERCENT',
    
    EQUAL: 'EQUAL',
    EQUAL_EQUAL: 'EQUAL_EQUAL',
    BANG_EQUAL: 'BANG_EQUAL',
    LESS: 'LESS',
    LESS_EQUAL: 'LESS_EQUAL',
    GREATER: 'GREATER',
    GREATER_EQUAL: 'GREATER_EQUAL',
    
    // Delimiters
    LPAREN: 'LPAREN',
    RPAREN: 'RPAREN',
    LBRACE: 'LBRACE',
    RBRACE: 'RBRACE',
    COMMA: 'COMMA',
    ARROW: 'ARROW',
    
    // Special
    NEWLINE: 'NEWLINE',
    EOF: 'EOF'
};

class Token {
    constructor(type, value, line, column) {
        this.type = type;
        this.value = value;
        this.line = line;
        this.column = column;
    }
    
    toString() {
        return `Token(${this.type}, ${JSON.stringify(this.value)}, ${this.line}:${this.column})`;
    }
}

Lexer Implementation

The lexer scans source code character-by-character and produces tokens:

class Lexer {
    constructor(source) {
        this.source = source;
        this.pos = 0;
        this.line = 1;
        this.column = 1;
        this.current = this.source[0] || null;
    }
    
    // Advance to next character
    advance() {
        if (this.current === '\n') {
            this.line++;
            this.column = 1;
        } else {
            this.column++;
        }
        
        this.pos++;
        this.current = this.pos < this.source.length ? this.source[this.pos] : null;
    }
    
    // Peek ahead without consuming
    peek(offset = 1) {
        const pos = this.pos + offset;
        return pos < this.source.length ? this.source[pos] : null;
    }
    
    // Skip whitespace (except newlines, which are significant)
    skipWhitespace() {
        while (this.current && /[ \t\r]/.test(this.current)) {
            this.advance();
        }
    }
    
    // Skip comments
    skipComment() {
        if (this.current === '#') {
            while (this.current && this.current !== '\n') {
                this.advance();
            }
        }
    }
    
    // Read number literal
    readNumber() {
        const startLine = this.line;
        const startColumn = this.column;
        let numStr = '';
        
        while (this.current && /[0-9]/.test(this.current)) {
            numStr += this.current;
            this.advance();
        }
        
        // Handle decimal point
        if (this.current === '.' && /[0-9]/.test(this.peek())) {
            numStr += this.current;
            this.advance();
            
            while (this.current && /[0-9]/.test(this.current)) {
                numStr += this.current;
                this.advance();
            }
        }
        
        return new Token(
            TokenType.NUMBER,
            parseFloat(numStr),
            startLine,
            startColumn
        );
    }
    
    // Read string literal
    readString() {
        const startLine = this.line;
        const startColumn = this.column;
        const quote = this.current;
        let str = '';
        
        this.advance(); // Skip opening quote
        
        while (this.current && this.current !== quote) {
            if (this.current === '\\') {
                this.advance();
                // Handle escape sequences
                switch (this.current) {
                    case 'n': str += '\n'; break;
                    case 't': str += '\t'; break;
                    case 'r': str += '\r'; break;
                    case '\\': str += '\\'; break;
                    case quote: str += quote; break;
                    default:
                        throw new Error(
                            `Invalid escape sequence \\${this.current} at ${startLine}:${startColumn}`
                        );
                }
                this.advance();
            } else {
                str += this.current;
                this.advance();
            }
        }
        
        if (!this.current) {
            throw new Error(`Unterminated string at ${startLine}:${startColumn}`);
        }
        
        this.advance(); // Skip closing quote
        
        return new Token(TokenType.STRING, str, startLine, startColumn);
    }
    
    // Read identifier or keyword
    readIdentifier() {
        const startLine = this.line;
        const startColumn = this.column;
        let ident = '';
        
        while (this.current && /[a-zA-Z0-9_]/.test(this.current)) {
            ident += this.current;
            this.advance();
        }
        
        // Check if keyword
        const keywords = {
            'let': TokenType.LET,
            'fn': TokenType.FN,
            'if': TokenType.IF,
            'else': TokenType.ELSE,
            'return': TokenType.RETURN,
            'print': TokenType.PRINT
        };
        
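        // Own-property check: a bare keywords[ident] lookup would also match
        // inherited names such as "constructor" or "toString"
        const type = Object.prototype.hasOwnProperty.call(keywords, ident)
            ? keywords[ident]
            : TokenType.IDENTIFIER;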
        
        return new Token(type, ident, startLine, startColumn);
    }
    
    // Get next token
    nextToken() {
        while (this.current) {
            // Skip whitespace
            if (/[ \t\r]/.test(this.current)) {
                this.skipWhitespace();
                continue;
            }
            
            // Skip comments
            if (this.current === '#') {
                this.skipComment();
                continue;
            }
            
            const line = this.line;
            const column = this.column;
            
            // Newline
            if (this.current === '\n') {
                this.advance();
                return new Token(TokenType.NEWLINE, '\\n', line, column);
            }
            
            // Numbers
            if (/[0-9]/.test(this.current)) {
                return this.readNumber();
            }
            
            // Strings
            if (this.current === '"' || this.current === "'") {
                return this.readString();
            }
            
            // Identifiers and keywords
            if (/[a-zA-Z_]/.test(this.current)) {
                return this.readIdentifier();
            }
            
            // Two-character operators
            if (this.current === '=' && this.peek() === '>') {
                this.advance();
                this.advance();
                return new Token(TokenType.ARROW, '=>', line, column);
            }
            
            if (this.current === '=' && this.peek() === '=') {
                this.advance();
                this.advance();
                return new Token(TokenType.EQUAL_EQUAL, '==', line, column);
            }
            
            if (this.current === '!' && this.peek() === '=') {
                this.advance();
                this.advance();
                return new Token(TokenType.BANG_EQUAL, '!=', line, column);
            }
            
            if (this.current === '<' && this.peek() === '=') {
                this.advance();
                this.advance();
                return new Token(TokenType.LESS_EQUAL, '<=', line, column);
            }
            
            if (this.current === '>' && this.peek() === '=') {
                this.advance();
                this.advance();
                return new Token(TokenType.GREATER_EQUAL, '>=', line, column);
            }
            
            // Single-character tokens
            const singleChar = {
                '+': TokenType.PLUS,
                '-': TokenType.MINUS,
                '*': TokenType.STAR,
                '/': TokenType.SLASH,
                '%': TokenType.PERCENT,
                '=': TokenType.EQUAL,
                '<': TokenType.LESS,
                '>': TokenType.GREATER,
                '(': TokenType.LPAREN,
                ')': TokenType.RPAREN,
                '{': TokenType.LBRACE,
                '}': TokenType.RBRACE,
                ',': TokenType.COMMA
            };
            
            if (this.current in singleChar) {
                const token = new Token(
                    singleChar[this.current],
                    this.current,
                    line,
                    column
                );
                this.advance();
                return token;
            }
            
            throw new Error(
                `Unexpected character '${this.current}' at ${line}:${column}`
            );
        }
        
        return new Token(TokenType.EOF, null, this.line, this.column);
    }
    
    // Tokenize entire source
    tokenize() {
        const tokens = [];
        let token;
        
        do {
            token = this.nextToken();
            // Skip newlines for simplicity (could be kept for semicolon inference)
            if (token.type !== TokenType.NEWLINE) {
                tokens.push(token);
            }
        } while (token.type !== TokenType.EOF);
        
        return tokens;
    }
}

Example usage:

const source = `
let x = 42
let greeting = "Hello, World!"
let add = fn(a, b) => a + b
print(add(x, 8))
`;

const lexer = new Lexer(source);
const tokens = lexer.tokenize();

tokens.forEach(token => console.log(token.toString()));

// Output:
// Token(LET, "let", 2:1)
// Token(IDENTIFIER, "x", 2:5)
// Token(EQUAL, "=", 2:7)
// Token(NUMBER, 42, 2:9)
// Token(LET, "let", 3:1)
// Token(IDENTIFIER, "greeting", 3:5)
// Token(EQUAL, "=", 3:14)
// Token(STRING, "Hello, World!", 3:16)
// Token(LET, "let", 4:1)
// Token(IDENTIFIER, "add", 4:5)
// Token(EQUAL, "=", 4:9)
// Token(FN, "fn", 4:11)
// ...
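
One subtlety in keyword classification deserves a closer look. When the keyword table is a plain object literal, a bare property lookup also finds names inherited from Object.prototype, so a source identifier spelled constructor or toString can be misclassified. A minimal sketch of a lookup that avoids this by using a Map (KEYWORDS and classifyWord are illustrative names, not part of the lexer above):

```javascript
// A Map has no inherited keys, so only the listed words count as keywords.
const KEYWORDS = new Map([
    ['let', 'LET'],
    ['fn', 'FN'],
    ['if', 'IF'],
    ['else', 'ELSE'],
    ['return', 'RETURN'],
    ['print', 'PRINT']
]);

// Hypothetical helper: returns the token type name for a scanned word.
function classifyWord(word) {
    return KEYWORDS.get(word) ?? 'IDENTIFIER';
}

// The pitfall, for comparison: a plain object inherits from Object.prototype.
const plainTable = { let: 'LET' };
console.log(typeof plainTable['constructor']);  // → function
console.log(typeof plainTable['counter']);      // → undefined

console.log(classifyWord('let'));          // → LET
console.log(classifyWord('constructor'));  // → IDENTIFIER
console.log(classifyWord('toString'));     // → IDENTIFIER
```

An own-property check on a plain object (Object.prototype.hasOwnProperty.call) gives the same protection without switching data structures.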

Phase 2: Syntax Analysis (Parsing)

Abstract Syntax Tree (AST)

The AST represents the hierarchical structure of the program.

AST node types:

// Base class
class ASTNode {
    constructor(type) {
        this.type = type;
    }
}

// Program (root node)
class Program extends ASTNode {
    constructor(statements) {
        super('Program');
        this.statements = statements;
    }
}

// Variable declaration: let x = 5
class LetStatement extends ASTNode {
    constructor(name, value) {
        super('LetStatement');
        this.name = name;      // Identifier node
        this.value = value;    // Expression node
    }
}

// Identifier: x
class Identifier extends ASTNode {
    constructor(name) {
        super('Identifier');
        this.name = name;
    }
}

// Number literal: 42
class NumberLiteral extends ASTNode {
    constructor(value) {
        super('NumberLiteral');
        this.value = value;
    }
}

// String literal: "hello"
class StringLiteral extends ASTNode {
    constructor(value) {
        super('StringLiteral');
        this.value = value;
    }
}

// Binary operation: a + b
class BinaryExpression extends ASTNode {
    constructor(left, operator, right) {
        super('BinaryExpression');
        this.left = left;
        this.operator = operator;
        this.right = right;
    }
}

// Function call: foo(a, b)
class CallExpression extends ASTNode {
    constructor(callee, args) {
        super('CallExpression');
        this.callee = callee;
        this.args = args;
    }
}

// Function expression: fn(x) => x * 2
class FunctionExpression extends ASTNode {
    constructor(params, body) {
        super('FunctionExpression');
        this.params = params;  // Array of Identifier
        this.body = body;      // Expression or BlockStatement
    }
}

// Block statement: { ... }
class BlockStatement extends ASTNode {
    constructor(statements) {
        super('BlockStatement');
        this.statements = statements;
    }
}

// If statement: if (cond) { ... } else { ... }
class IfStatement extends ASTNode {
    constructor(condition, consequent, alternate) {
        super('IfStatement');
        this.condition = condition;
        this.consequent = consequent;
        this.alternate = alternate;  // Can be null
    }
}

// Return statement: return expr
class ReturnStatement extends ASTNode {
    constructor(value) {
        super('ReturnStatement');
        this.value = value;
    }
}

// Print statement: print(expr)
class PrintStatement extends ASTNode {
    constructor(expression) {
        super('PrintStatement');
        this.expression = expression;
    }
}

Recursive Descent Parser

The parser uses recursive descent to build the AST:

class Parser {
    constructor(tokens) {
        this.tokens = tokens;
        this.pos = 0;
        this.current = this.tokens[0];
    }
    
    // Advance to next token
    advance() {
        this.pos++;
        this.current = this.pos < this.tokens.length ? this.tokens[this.pos] : null;
    }
    
    // Check current token type
    check(type) {
        return this.current && this.current.type === type;
    }
    
    // Consume expected token or throw error
    expect(type, message) {
        if (!this.check(type)) {
            throw new Error(
                message || `Expected ${type} but got ${this.current?.type} at ${this.current?.line}:${this.current?.column}`
            );
        }
        const token = this.current;
        this.advance();
        return token;
    }
    
    // Parse entire program
    parse() {
        const statements = [];
        
        while (!this.check(TokenType.EOF)) {
            statements.push(this.parseStatement());
        }
        
        return new Program(statements);
    }
    
    // Parse statement
    parseStatement() {
        if (this.check(TokenType.LET)) {
            return this.parseLetStatement();
        }
        
        if (this.check(TokenType.PRINT)) {
            return this.parsePrintStatement();
        }
        
        if (this.check(TokenType.IF)) {
            return this.parseIfStatement();
        }
        
        if (this.check(TokenType.RETURN)) {
            return this.parseReturnStatement();
        }
        
        if (this.check(TokenType.LBRACE)) {
            return this.parseBlockStatement();
        }
        
        // Expression statement (for side effects)
        const expr = this.parseExpression();
        return expr;
    }
    
    // Parse: let x = expr
    parseLetStatement() {
        this.expect(TokenType.LET);
        
        const name = this.expect(TokenType.IDENTIFIER);
        this.expect(TokenType.EQUAL);
        
        const value = this.parseExpression();
        
        return new LetStatement(
            new Identifier(name.value),
            value
        );
    }
    
    // Parse: print(expr)
    parsePrintStatement() {
        this.expect(TokenType.PRINT);
        this.expect(TokenType.LPAREN);
        
        const expr = this.parseExpression();
        
        this.expect(TokenType.RPAREN);
        
        return new PrintStatement(expr);
    }
    
    // Parse: if (cond) { ... } else { ... }
    parseIfStatement() {
        this.expect(TokenType.IF);
        this.expect(TokenType.LPAREN);
        
        const condition = this.parseExpression();
        
        this.expect(TokenType.RPAREN);
        
        const consequent = this.parseBlockStatement();
        
        let alternate = null;
        if (this.check(TokenType.ELSE)) {
            this.advance();
            alternate = this.check(TokenType.IF) 
                ? this.parseIfStatement()
                : this.parseBlockStatement();
        }
        
        return new IfStatement(condition, consequent, alternate);
    }
    
    // Parse: return expr
    parseReturnStatement() {
        this.expect(TokenType.RETURN);
        
        const value = this.parseExpression();
        
        return new ReturnStatement(value);
    }
    
    // Parse: { stmt1 stmt2 ... }
    parseBlockStatement() {
        this.expect(TokenType.LBRACE);
        
        const statements = [];
        
        while (!this.check(TokenType.RBRACE) && !this.check(TokenType.EOF)) {
            statements.push(this.parseStatement());
        }
        
        this.expect(TokenType.RBRACE);
        
        return new BlockStatement(statements);
    }
    
    // Parse expression (entry point)
    parseExpression() {
        return this.parseComparison();
    }
    
    // Parse: expr == expr, expr != expr, etc.
    parseComparison() {
        let left = this.parseAddition();
        
        while (this.check(TokenType.EQUAL_EQUAL) ||
               this.check(TokenType.BANG_EQUAL) ||
               this.check(TokenType.LESS) ||
               this.check(TokenType.LESS_EQUAL) ||
               this.check(TokenType.GREATER) ||
               this.check(TokenType.GREATER_EQUAL)) {
            const operator = this.current.value;
            this.advance();
            const right = this.parseAddition();
            left = new BinaryExpression(left, operator, right);
        }
        
        return left;
    }
    
    // Parse: expr + expr, expr - expr
    parseAddition() {
        let left = this.parseMultiplication();
        
        while (this.check(TokenType.PLUS) || this.check(TokenType.MINUS)) {
            const operator = this.current.value;
            this.advance();
            const right = this.parseMultiplication();
            left = new BinaryExpression(left, operator, right);
        }
        
        return left;
    }
    
    // Parse: expr * expr, expr / expr, expr % expr
    parseMultiplication() {
        let left = this.parseCall();
        
        while (this.check(TokenType.STAR) || 
               this.check(TokenType.SLASH) ||
               this.check(TokenType.PERCENT)) {
            const operator = this.current.value;
            this.advance();
            const right = this.parseCall();
            left = new BinaryExpression(left, operator, right);
        }
        
        return left;
    }
    
    // Parse: primary(arg1, arg2, ...)
    parseCall() {
        let expr = this.parsePrimary();
        
        while (this.check(TokenType.LPAREN)) {
            this.advance();
            
            const args = [];
            
            if (!this.check(TokenType.RPAREN)) {
                args.push(this.parseExpression());
                
                while (this.check(TokenType.COMMA)) {
                    this.advance();
                    args.push(this.parseExpression());
                }
            }
            
            this.expect(TokenType.RPAREN);
            
            expr = new CallExpression(expr, args);
        }
        
        return expr;
    }
    
    // Parse primary expressions
    parsePrimary() {
        // Number literal
        if (this.check(TokenType.NUMBER)) {
            const value = this.current.value;
            this.advance();
            return new NumberLiteral(value);
        }
        
        // String literal
        if (this.check(TokenType.STRING)) {
            const value = this.current.value;
            this.advance();
            return new StringLiteral(value);
        }
        
        // Identifier
        if (this.check(TokenType.IDENTIFIER)) {
            const name = this.current.value;
            this.advance();
            return new Identifier(name);
        }
        
        // Function expression: fn(params) => body
        if (this.check(TokenType.FN)) {
            return this.parseFunctionExpression();
        }
        
        // Parenthesized expression
        if (this.check(TokenType.LPAREN)) {
            this.advance();
            const expr = this.parseExpression();
            this.expect(TokenType.RPAREN);
            return expr;
        }
        
        throw new Error(
            `Unexpected token ${this.current?.type} at ${this.current?.line}:${this.current?.column}`
        );
    }
    
    // Parse: fn(x, y) => expr  or  fn(x, y) => { ... }
    parseFunctionExpression() {
        this.expect(TokenType.FN);
        this.expect(TokenType.LPAREN);
        
        const params = [];
        
        if (!this.check(TokenType.RPAREN)) {
            params.push(new Identifier(this.expect(TokenType.IDENTIFIER).value));
            
            while (this.check(TokenType.COMMA)) {
                this.advance();
                params.push(new Identifier(this.expect(TokenType.IDENTIFIER).value));
            }
        }
        
        this.expect(TokenType.RPAREN);
        this.expect(TokenType.ARROW);
        
        // Body is either a block ({ ... }) or a single expression
        const body = this.check(TokenType.LBRACE)
            ? this.parseBlockStatement()
            : this.parseExpression();
        
        return new FunctionExpression(params, body);
    }
}

Example usage:

const source = `
let x = 10
let double = fn(n) => n * 2
let result = double(x)
print(result)
`;

const lexer = new Lexer(source);
const tokens = lexer.tokenize();

const parser = new Parser(tokens);
const ast = parser.parse();

console.log(JSON.stringify(ast, null, 2));

Output AST (simplified):

{
  "type": "Program",
  "statements": [
    {
      "type": "LetStatement",
      "name": { "type": "Identifier", "name": "x" },
      "value": { "type": "NumberLiteral", "value": 10 }
    },
    {
      "type": "LetStatement",
      "name": { "type": "Identifier", "name": "double" },
      "value": {
        "type": "FunctionExpression",
        "params": [{ "type": "Identifier", "name": "n" }],
        "body": {
          "type": "BinaryExpression",
          "left": { "type": "Identifier", "name": "n" },
          "operator": "*",
          "right": { "type": "NumberLiteral", "value": 2 }
        }
      }
    },
    {
      "type": "LetStatement",
      "name": { "type": "Identifier", "name": "result" },
      "value": {
        "type": "CallExpression",
        "callee": { "type": "Identifier", "name": "double" },
        "args": [{ "type": "Identifier", "name": "x" }]
      }
    },
    {
      "type": "PrintStatement",
      "expression": { "type": "Identifier", "name": "result" }
    }
  ]
}
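
Operator precedence in a recursive descent parser comes from the call chain itself: each level parses its operands by calling the next tighter-binding level, and the while loop at each level makes the operators left-associative. The following standalone toy illustrates the idea for numbers, + - * / and parentheses only. It is a reduced sketch with illustrative function names, not the Parser class above, and it prints the tree as an S-expression so the grouping is visible.

```javascript
// Toy tokenizer: numbers, + - * / and parentheses. Any other character is
// silently skipped, which is acceptable only because this is a demo.
function tokenizeArithmetic(text) {
    return text.match(/\d+(?:\.\d+)?|[-+*/()]/g) ?? [];
}

function parseArithmetic(text) {
    const tokens = tokenizeArithmetic(text);
    let pos = 0;

    // Lowest precedence level: + and -
    function parseAdditive() {
        let left = parseMultiplicative();
        while (tokens[pos] === '+' || tokens[pos] === '-') {
            const operator = tokens[pos++];
            const right = parseMultiplicative();
            left = { type: 'BinaryExpression', left, operator, right };
        }
        return left;
    }

    // Tighter level: * and /
    function parseMultiplicative() {
        let left = parsePrimary();
        while (tokens[pos] === '*' || tokens[pos] === '/') {
            const operator = tokens[pos++];
            const right = parsePrimary();
            left = { type: 'BinaryExpression', left, operator, right };
        }
        return left;
    }

    // Tightest level: number literals and parenthesized expressions
    function parsePrimary() {
        const token = tokens[pos++];
        if (token === '(') {
            const inner = parseAdditive();
            if (tokens[pos++] !== ')') {
                throw new Error("Expected ')'");
            }
            return inner;
        }
        if (token !== undefined && /^\d/.test(token)) {
            return { type: 'NumberLiteral', value: parseFloat(token) };
        }
        throw new Error(`Unexpected token ${token}`);
    }

    const ast = parseAdditive();
    if (pos < tokens.length) {
        throw new Error(`Unexpected token ${tokens[pos]}`);
    }
    return ast;
}

// Render the tree in prefix form: (operator left right)
function toSExpression(node) {
    if (node.type === 'NumberLiteral') {
        return String(node.value);
    }
    return `(${node.operator} ${toSExpression(node.left)} ${toSExpression(node.right)})`;
}

console.log(toSExpression(parseArithmetic('1 + 2 * 3')));    // → (+ 1 (* 2 3))
console.log(toSExpression(parseArithmetic('10 - 4 - 3')));   // → (- (- 10 4) 3)
console.log(toSExpression(parseArithmetic('(1 + 2) * 3')));  // → (* (+ 1 2) 3)
```

Adding a precedence level means adding one more function to the chain, which is the same pattern parseComparison, parseAddition, and parseMultiplication follow.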

Phase 3: Semantic Analysis (Optional)

Semantic analysis checks for logical errors that syntax alone cannot catch.

Symbol Table

Track variable declarations and scopes:

class SymbolTable {
    constructor(parent = null) {
        this.parent = parent;
        this.symbols = new Map();
    }
    
    define(name, info) {
        if (this.symbols.has(name)) {
            throw new Error(`Variable '${name}' already declared in this scope`);
        }
        this.symbols.set(name, info);
    }
    
    resolve(name) {
        if (this.symbols.has(name)) {
            return this.symbols.get(name);
        }
        
        if (this.parent) {
            return this.parent.resolve(name);
        }
        
        return null;
    }
    
    enterScope() {
        return new SymbolTable(this);
    }
}
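
To see how the parent link produces lexical scoping and shadowing, here is a condensed standalone variant of the same idea. The class is renamed Scope and trimmed so the snippet runs on its own; it is a sketch for illustration, not a replacement for SymbolTable, and the info objects are made-up sample data.

```javascript
class Scope {
    constructor(parent = null) {
        this.parent = parent;
        this.symbols = new Map();
    }

    // Declare a name in this scope only; redeclaration is an error.
    define(name, info) {
        if (this.symbols.has(name)) {
            throw new Error(`'${name}' already declared in this scope`);
        }
        this.symbols.set(name, info);
    }

    // Look in this scope first, then walk outward through the parents.
    resolve(name) {
        for (let scope = this; scope !== null; scope = scope.parent) {
            if (scope.symbols.has(name)) {
                return scope.symbols.get(name);
            }
        }
        return null;
    }
}

const globalScope = new Scope();
globalScope.define('x', { kind: 'variable' });
globalScope.define('y', { kind: 'variable' });

// A function body gets a child scope; its parameter x shadows the global x.
const functionScope = new Scope(globalScope);
functionScope.define('x', { kind: 'parameter' });

console.log(functionScope.resolve('x').kind);  // → parameter
console.log(functionScope.resolve('y').kind);  // → variable
console.log(globalScope.resolve('x').kind);    // → variable
console.log(functionScope.resolve('z'));       // → null
```

Shadowing works because define only inspects the current scope, while resolve stops at the innermost scope that knows the name.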

Semantic Analyzer

class SemanticAnalyzer {
    constructor() {
        this.globalScope = new SymbolTable();
        this.currentScope = this.globalScope;
        this.errors = [];
    }
    
    analyze(ast) {
        this.visit(ast);
        return this.errors;
    }
    
    visit(node) {
        const methodName = `visit${node.type}`;
        if (this[methodName]) {
            return this[methodName](node);
        }
        throw new Error(`No visit method for ${node.type}`);
    }
    
    visitProgram(node) {
        node.statements.forEach(stmt => this.visit(stmt));
    }
    
    visitLetStatement(node) {
        const name = node.name.name;
        
        if (this.currentScope.symbols.has(name)) {
            // Record the error here; calling define() again would throw instead
            this.errors.push(
                `Variable '${name}' already declared in this scope`
            );
        } else {
            // Define before visiting the value so a function can refer to itself
            this.currentScope.define(name, {
                type: 'variable',
                node: node
            });
        }
        
        // Analyze value expression
        this.visit(node.value);
    }
    
    visitIdentifier(node) {
        // Check if variable is declared
        const symbol = this.currentScope.resolve(node.name);
        
        if (!symbol) {
            this.errors.push(`Undefined variable '${node.name}'`);
        }
    }
    
    visitBinaryExpression(node) {
        this.visit(node.left);
        this.visit(node.right);
    }
    
    visitCallExpression(node) {
        this.visit(node.callee);
        node.args.forEach(arg => this.visit(arg));
    }
    
    visitFunctionExpression(node) {
        // Enter new scope for function
        const previousScope = this.currentScope;
        this.currentScope = this.currentScope.enterScope();
        
        // Define parameters
        node.params.forEach(param => {
            this.currentScope.define(param.name, {
                type: 'parameter',
                node: param
            });
        });
        
        // Analyze body
        this.visit(node.body);
        
        // Restore scope
        this.currentScope = previousScope;
    }
    
    visitBlockStatement(node) {
        const previousScope = this.currentScope;
        this.currentScope = this.currentScope.enterScope();
        
        node.statements.forEach(stmt => this.visit(stmt));
        
        this.currentScope = previousScope;
    }
    
    visitIfStatement(node) {
        this.visit(node.condition);
        this.visit(node.consequent);
        if (node.alternate) {
            this.visit(node.alternate);
        }
    }
    
    visitReturnStatement(node) {
        this.visit(node.value);
    }
    
    visitPrintStatement(node) {
        this.visit(node.expression);
    }
    
    visitNumberLiteral(node) {
        // Nothing to check
    }
    
    visitStringLiteral(node) {
        // Nothing to check
    }
}

Usage:

const analyzer = new SemanticAnalyzer();
const errors = analyzer.analyze(ast);

if (errors.length > 0) {
    console.error('Semantic errors:');
    errors.forEach(err => console.error('  -', err));
} else {
    console.log('✓ Semantic analysis passed');
}

Phase 4: Code Generation

JavaScript Code Emitter

Generate JavaScript from AST:

class JavaScriptEmitter {
    constructor() {
        this.indent = 0;
    }
    
    emit(ast) {
        return this.visit(ast);
    }
    
    visit(node) {
        const methodName = `visit${node.type}`;
        if (this[methodName]) {
            return this[methodName](node);
        }
        throw new Error(`No emit method for ${node.type}`);
    }
    
    getIndent() {
        return '  '.repeat(this.indent);
    }
    
    visitProgram(node) {
        return node.statements
            .map(stmt => this.visit(stmt))
            .join('\n') + '\n';
    }
    
    visitLetStatement(node) {
        const name = this.visit(node.name);
        const value = this.visit(node.value);
        return `${this.getIndent()}const ${name} = ${value};`;
    }
    
    visitIdentifier(node) {
        return node.name;
    }
    
    visitNumberLiteral(node) {
        return String(node.value);
    }
    
    visitStringLiteral(node) {
        // Escape special characters (covers every escape the lexer accepts)
        const escaped = node.value
            .replace(/\\/g, '\\\\')
            .replace(/"/g, '\\"')
            .replace(/\n/g, '\\n')
            .replace(/\r/g, '\\r')
            .replace(/\t/g, '\\t');
        return `"${escaped}"`;
    }
    
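    visitBinaryExpression(node) {
        const left = this.visit(node.left);
        const right = this.visit(node.right);
        
        // Emit strict comparisons so the output avoids JavaScript's coercing == and !=
        const strictOperators = { '==': '===', '!=': '!==' };
        const operator = strictOperators[node.operator] || node.operator;
        
        // Add parentheses for clarity
        return `(${left} ${operator} ${right})`;
    }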
    
    visitCallExpression(node) {
        const callee = this.visit(node.callee);
        const args = node.args.map(arg => this.visit(arg)).join(', ');
        return `${callee}(${args})`;
    }
    
    visitFunctionExpression(node) {
        const params = node.params.map(p => this.visit(p)).join(', ');
        
        // Single expression body
        if (node.body.type !== 'BlockStatement') {
            const body = this.visit(node.body);
            return `(${params}) => ${body}`;
        }
        
        // Block body
        this.indent++;
        const body = node.body.statements
            .map(stmt => this.visit(stmt))
            .join('\n');
        this.indent--;
        
        return `(${params}) => {\n${body}\n${this.getIndent()}}`;
    }
    
    visitBlockStatement(node) {
        this.indent++;
        const statements = node.statements
            .map(stmt => this.visit(stmt))
            .join('\n');
        this.indent--;
        
        return `${this.getIndent()}{\n${statements}\n${this.getIndent()}}`;
    }
    
    visitIfStatement(node) {
        const condition = this.visit(node.condition);
        const consequent = this.visit(node.consequent);
        
        // Blocks emit their own leading indent; trim it when they follow if/else
        let code = `${this.getIndent()}if (${condition}) ${consequent.trimStart()}`;
        
        if (node.alternate) {
            // Covers both else { ... } and else if (...) { ... }
            const alternate = this.visit(node.alternate);
            code += ` else ${alternate.trimStart()}`;
        }
        
        return code;
    }
    
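    visitReturnStatement(node) {
        // A bare `return` (if the parser allows one) has no value node to visit
        if (!node.value) {
            return `${this.getIndent()}return;`;
        }
        const value = this.visit(node.value);
        return `${this.getIndent()}return ${value};`;
    }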
    
    visitPrintStatement(node) {
        const expr = this.visit(node.expression);
        return `${this.getIndent()}console.log(${expr});`;
    }
}

Usage:

const emitter = new JavaScriptEmitter();
const jsCode = emitter.emit(ast);

console.log(jsCode);

Output:

const x = 10;
const double = (n) => (n * 2);
const result = double(x);
console.log(result);
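The emitter wraps every binary expression in parentheses, which is always safe but noisy (note the `(n * 2)` in the output). A minimal sketch of a precedence-aware alternative follows. The `PRECEDENCE` table, the `emitExpr` function, and the `lit`/`id`/`bin` node helpers are hypothetical names invented for this example, not part of JavaScriptEmitter; the sketch assumes every operator is left-associative and handles only three expression node types.

```javascript
// Hypothetical precedence table (higher binds tighter), matching JavaScript's ordering
const PRECEDENCE = {
    '==': 1, '!=': 1,
    '<': 2, '<=': 2, '>': 2, '>=': 2,
    '+': 3, '-': 3,
    '*': 4, '/': 4, '%': 4
};

// Illustrative node constructors using the same shapes as the chapter's AST
const lit = (value) => ({ type: 'NumberLiteral', value });
const id = (name) => ({ type: 'Identifier', name });
const bin = (operator, left, right) => ({ type: 'BinaryExpression', operator, left, right });

function emitExpr(node, parentPrec = 0, isRightOperand = false) {
    switch (node.type) {
        case 'NumberLiteral':
            return String(node.value);
        case 'Identifier':
            return node.name;
        case 'BinaryExpression': {
            const prec = PRECEDENCE[node.operator];
            const left = emitExpr(node.left, prec, false);
            const right = emitExpr(node.right, prec, true);
            const text = `${left} ${node.operator} ${right}`;
            // Parenthesize when the parent binds tighter, or equally tight
            // on the right-hand side (operators are treated as left-associative)
            const needsParens = prec < parentPrec || (prec === parentPrec && isRightOperand);
            return needsParens ? `(${text})` : text;
        }
        default:
            throw new Error(`Unsupported node type: ${node.type}`);
    }
}

console.log(emitExpr(bin('*', id('n'), lit(2))));                   // n * 2
console.log(emitExpr(bin('*', bin('+', lit(2), lit(3)), lit(4))));  // (2 + 3) * 4
console.log(emitExpr(bin('-', id('a'), bin('-', id('b'), id('c'))))); // a - (b - c)
```

Always-parenthesized output remains the simpler choice when readability of the generated code does not matter, since it cannot get precedence wrong.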

Complete Compiler Pipeline

Putting it all together:

class Compiler {
    constructor() {
        this.lexer = null;
        this.parser = null;
        this.analyzer = null;
        this.emitter = null;
    }
    
    compile(source, options = {}) {
        const {
            skipSemanticAnalysis = false,
            outputAST = false
        } = options;
        
        try {
            // Phase 1: Lexical analysis
            this.lexer = new Lexer(source);
            const tokens = this.lexer.tokenize();
            
            // Phase 2: Parsing
            this.parser = new Parser(tokens);
            const ast = this.parser.parse();
            
            if (outputAST) {
                console.log('AST:', JSON.stringify(ast, null, 2));
            }
            
            // Phase 3: Semantic analysis (optional)
            if (!skipSemanticAnalysis) {
                this.analyzer = new SemanticAnalyzer();
                const errors = this.analyzer.analyze(ast);
                
                if (errors.length > 0) {
                    throw new Error(
                        'Semantic errors:\n' + errors.map(e => '  - ' + e).join('\n')
                    );
                }
            }
            
            // Phase 4: Code generation
            this.emitter = new JavaScriptEmitter();
            const jsCode = this.emitter.emit(ast);
            
            return {
                success: true,
                code: jsCode,
                ast: ast
            };
            
        } catch (error) {
            return {
                success: false,
                error: error.message,
                stack: error.stack
            };
        }
    }
    
    compileAndRun(source) {
        const result = this.compile(source);
        
        if (!result.success) {
            console.error('Compilation failed:');
            console.error(result.error);
            return;
        }
        
        console.log('Generated JavaScript:');
        console.log(result.code);
        console.log('\nExecution:');
        
        try {
            eval(result.code);
        } catch (error) {
            console.error('Runtime error:', error.message);
        }
    }
}

Example usage:

const compiler = new Compiler();

const source = `
# Factorial example
let factorial = fn(n) => {
  if (n <= 1) {
    return 1
  }
  return n * factorial(n - 1)
}

let result = factorial(5)
print(result)
`;

compiler.compileAndRun(source);

Output:

Generated JavaScript:
const factorial = (n) => {
    if ((n <= 1)) {
        return 1;
    }
    return (n * factorial((n - 1)));
};
const result = factorial(5);
console.log(result);

Execution:
120
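compileAndRun hands the generated code to eval, which runs it in the caller's scope and prints straight to the real console. A hedged alternative, assuming the generated program only ever touches `console.log`, is to run it through the Function constructor with a stand-in console so that tests can inspect the output. `runGenerated` is a hypothetical helper, not part of the Compiler class, and this is not a security sandbox: the code can still reach every global.

```javascript
function runGenerated(code) {
    const lines = [];
    const captureConsole = {
        log: (...args) => lines.push(args.map(String).join(' '))
    };
    // Code created by the Function constructor sees only globals and its own
    // parameters, so the parameter named `console` shadows the global one
    const run = new Function('console', `"use strict";\n${code}`);
    run(captureConsole);
    return lines;
}

const generated = [
    'const x = 10;',
    'const double = (n) => (n * 2);',
    'const result = double(x);',
    'console.log(result);'
].join('\n');

console.log(runGenerated(generated)); // [ '20' ]
```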


Optimization Techniques

Constant Folding

Evaluate constant expressions at compile time:

class ConstantFolder {
    visit(node) {
        if (node.type === 'BinaryExpression') {
            return this.foldBinaryExpression(node);
        }
        
        // Recursively fold children
        for (const key in node) {
            if (node[key] && typeof node[key] === 'object') {
                if (Array.isArray(node[key])) {
                    node[key] = node[key].map(child => this.visit(child));
                } else if (node[key].type) {
                    node[key] = this.visit(node[key]);
                }
            }
        }
        
        return node;
    }
    
    foldBinaryExpression(node) {
        // Fold operands first
        node.left = this.visit(node.left);
        node.right = this.visit(node.right);
        
        // Check if both operands are literals
        if (node.left.type === 'NumberLiteral' && node.right.type === 'NumberLiteral') {
            const left = node.left.value;
            const right = node.right.value;
            let result;
            
            switch (node.operator) {
                case '+': result = left + right; break;
                case '-': result = left - right; break;
                case '*': result = left * right; break;
                case '/': result = left / right; break;
                case '%': result = left % right; break;
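                // Note: these fold comparisons to 1/0, whereas the unfolded
                // JavaScript evaluates them to true/false. Keep them only if
                // the source language defines comparison results as numbers.
                case '==': result = left === right ? 1 : 0; break;
                case '!=': result = left !== right ? 1 : 0; break;
                case '<': result = left < right ? 1 : 0; break;
                case '<=': result = left <= right ? 1 : 0; break;
                case '>': result = left > right ? 1 : 0; break;
                case '>=': result = left >= right ? 1 : 0; break;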
                default: return node;
            }
            
            return new NumberLiteral(result);
        }
        
        return node;
    }
}

Usage:

// Before: let x = 2 + 3 * 4
// After:  let x = 14

const folder = new ConstantFolder();
const optimizedAST = folder.visit(ast);
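To make the before/after comment concrete, here is a self-contained sketch of the same bottom-up folding on plain object nodes. The `num`, `ident`, `bin`, and `fold` helpers are stand-ins invented for this example (the chapter's NumberLiteral class is not reproduced), only arithmetic operators are folded, and unlike ConstantFolder the sketch returns new nodes instead of mutating the tree.

```javascript
const num = (value) => ({ type: 'NumberLiteral', value });
const ident = (name) => ({ type: 'Identifier', name });
const bin = (operator, left, right) => ({ type: 'BinaryExpression', operator, left, right });

const ARITHMETIC = {
    '+': (a, b) => a + b,
    '-': (a, b) => a - b,
    '*': (a, b) => a * b,
    '/': (a, b) => a / b,
    '%': (a, b) => a % b
};

function fold(node) {
    if (node.type !== 'BinaryExpression') return node;

    // Fold the operands first, so inner constants collapse before outer ones
    const left = fold(node.left);
    const right = fold(node.right);

    const op = ARITHMETIC[node.operator];
    if (op && left.type === 'NumberLiteral' && right.type === 'NumberLiteral') {
        return num(op(left.value, right.value));
    }
    return bin(node.operator, left, right);
}

// 2 + 3 * 4  ->  2 + 12  ->  14
console.log(fold(bin('+', num(2), bin('*', num(3), num(4)))));
// { type: 'NumberLiteral', value: 14 }

// x + 2 * 3  ->  x + 6 (the identifier blocks further folding)
console.log(fold(bin('+', ident('x'), bin('*', num(2), num(3)))).right.value); // 6
```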

Dead Code Elimination

Remove unreachable code:

class DeadCodeEliminator {
    visit(node) {
        if (node.type === 'BlockStatement') {
            return this.eliminateDeadCode(node);
        }
        
        // Recursively process children
        for (const key in node) {
            if (node[key] && typeof node[key] === 'object') {
                if (Array.isArray(node[key])) {
                    node[key] = node[key].map(child => this.visit(child));
                } else if (node[key].type) {
                    node[key] = this.visit(node[key]);
                }
            }
        }
        
        return node;
    }
    
    eliminateDeadCode(node) {
        const statements = [];
        let reachable = true;
        
        for (const stmt of node.statements) {
            if (!reachable) {
                // Skip unreachable code
                continue;
            }
            
            statements.push(this.visit(stmt));
            
            // Return makes subsequent code unreachable
            if (stmt.type === 'ReturnStatement') {
                reachable = false;
            }
        }
        
        node.statements = statements;
        return node;
    }
}

Error Handling and Reporting

Better Error Messages

class CompilerError extends Error {
    constructor(message, line, column, source) {
        super(message);
        this.line = line;
        this.column = column;
        this.source = source;
    }
    
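    format() {
        const lines = this.source.split('\n');
        const errorLine = lines[this.line - 1] ?? '';
        
        // Offset the caret by the gutter width ("N | ") so it lands under the column
        const gutter = `${this.line} | `;
        const pointer = ' '.repeat(gutter.length + this.column - 1) + '^';
        
        return `
Error at ${this.line}:${this.column}
${this.message}

${gutter}${errorLine}
${pointer}
`;
    }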
}

Enhanced lexer with better errors:

throw new CompilerError(
    `Unexpected character '${this.current}'`,
    this.line,
    this.column,
    this.source
);
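A sketch of the caret alignment behind such a report: the pointer has to be shifted by the width of the `N | ` gutter as well as by the column. `formatError` is a hypothetical free function that takes the same fields as CompilerError; 1-based line and column numbers are assumed.

```javascript
function formatError(message, line, column, source) {
    const errorLine = source.split('\n')[line - 1] ?? '';
    const gutter = `${line} | `;
    // Gutter width plus (column - 1) spaces puts '^' under the offending character
    const pointer = ' '.repeat(gutter.length + column - 1) + '^';
    return [
        `Error at ${line}:${column}`,
        message,
        '',
        gutter + errorLine,
        pointer
    ].join('\n');
}

const report = formatError("Unexpected character '@'", 2, 9, 'let x = 1\nlet y = @\n');
console.log(report);
// Error at 2:9
// Unexpected character '@'
//
// 2 | let y = @
//             ^
```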

Testing the Compiler

Unit Tests

function test(name, fn) {
    try {
        fn();
        console.log(`✓ ${name}`);
    } catch (error) {
        console.error(`✗ ${name}`);
        console.error(`  ${error.message}`);
    }
}

function assertEquals(actual, expected) {
    if (JSON.stringify(actual) !== JSON.stringify(expected)) {
        throw new Error(`Expected ${JSON.stringify(expected)} but got ${JSON.stringify(actual)}`);
    }
}

// Lexer tests
test('Lexer tokenizes numbers', () => {
    const lexer = new Lexer('42 3.14');
    const tokens = lexer.tokenize();
    assertEquals(tokens[0].type, TokenType.NUMBER);
    assertEquals(tokens[0].value, 42);
    assertEquals(tokens[1].type, TokenType.NUMBER);
    assertEquals(tokens[1].value, 3.14);
});

test('Lexer tokenizes strings', () => {
    const lexer = new Lexer('"hello world"');
    const tokens = lexer.tokenize();
    assertEquals(tokens[0].type, TokenType.STRING);
    assertEquals(tokens[0].value, 'hello world');
});

// Parser tests
test('Parser parses let statement', () => {
    const tokens = new Lexer('let x = 42').tokenize();
    const parser = new Parser(tokens);
    const ast = parser.parse();
    assertEquals(ast.statements[0].type, 'LetStatement');
    assertEquals(ast.statements[0].name.name, 'x');
    assertEquals(ast.statements[0].value.value, 42);
});

// Emitter tests
test('Emitter generates correct JavaScript', () => {
    const tokens = new Lexer('let x = 42').tokenize();
    const parser = new Parser(tokens);
    const ast = parser.parse();
    const emitter = new JavaScriptEmitter();
    const code = emitter.emit(ast);
    assertEquals(code.trim(), 'const x = 42;');
});

// Integration tests
test('Compile and execute factorial', () => {
    const source = `
let fac = fn(n) => {
  if (n <= 1) {
    return 1
  }
  return n * fac(n - 1)
}
let result = fac(5)
`;
    
    const compiler = new Compiler();
    const { success, code } = compiler.compile(source);
    
    assertEquals(success, true);
    
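    // const/let declared inside eval stay scoped to the eval'd code,
    // so read the value back as the completion value of the script
    const value = eval(code + '\nresult;');
    assertEquals(value, 120);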
});

Summary

We built a complete source-to-source compiler from scratch:

Phase 1: Lexical Analysis

  • Tokenize source into tokens

  • Handle numbers, strings, identifiers, keywords, operators

  • Track line/column for error messages

Phase 2: Syntax Analysis

  • Build Abstract Syntax Tree (AST) via recursive descent parsing

  • Define AST node types for all language constructs

  • Handle operator precedence correctly

Phase 3: Semantic Analysis (Optional)

  • Symbol table tracks variable declarations

  • Check for undefined variables

  • Enforce scoping rules

Phase 4: Code Generation

  • Traverse AST and emit JavaScript

  • Handle indentation and formatting

  • Map source constructs to target equivalents

Optimization:

  • Constant folding: Evaluate constants at compile-time

  • Dead code elimination: Remove unreachable code

Error handling:

  • Provide helpful error messages with source location

  • Format errors with context and pointer

This mini-compiler demonstrates the core principles used by production transpilers like TypeScript, Babel, and others. The same architecture scales to more complex languages with additional features like type systems, generics, and advanced optimizations.


Chapter 13: WebAssembly Fundamentals

Introduction: A New Compilation Target for the Web

WebAssembly (Wasm) is a binary instruction format designed as a portable compilation target for high-level languages. It represents a fundamental shift in web development capabilities.

Key characteristics:

┌─────────────────────────────────────────────────────┐
│  WebAssembly Core Properties                        │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ✓ Binary format (fast to parse/decode)             │
│  ✓ Stack-based virtual machine                      │
│  ✓ Strongly typed (validated before execution)      │
│  ✓ Memory-safe (sandboxed execution)                │
│  ✓ Near-native performance                          │
│  ✓ Language-agnostic compilation target             │
│  ✓ Designed to work alongside JavaScript            │
│                                                     │
└─────────────────────────────────────────────────────┘

Design goals (from specification):

  • Safe: Sandboxed execution with memory safety

  • Fast: Near-native speed with efficient validation

  • Portable: Architecture-independent binary format

  • Compact: Small binary size for fast network transfer


WebAssembly vs. JavaScript

Complementary Technologies

WebAssembly does NOT replace JavaScript – they work together:

// JavaScript: High-level, dynamic, great for DOM/APIs
document.addEventListener('click', async (e) => {
    // Load WebAssembly module
    const wasm = await loadWasmModule();
    
    // JavaScript handles UI logic
    const imageData = getImageFromCanvas();
    
    // WebAssembly handles computation
    const processed = wasm.processImage(imageData);
    
    // JavaScript updates DOM
    displayResult(processed);
});

When to Use Each

Use JavaScript for:

  • DOM manipulation

  • Event handling

  • Rapid prototyping

  • String/text processing

  • Small computations

  • Async I/O operations

Use WebAssembly for:

  • CPU-intensive computations

  • Image/video processing

  • Physics simulations

  • Cryptography

  • Game engines

  • Porting existing C/C++/Rust code

Performance Comparison

Parsing/Load Time:
  JavaScript:  ████████████████░░░░ (slower - must parse text)
  WebAssembly: ████░░░░░░░░░░░░░░░░ (faster - binary format)

Execution Speed:
  JavaScript:  ████████████░░░░░░░░ (JIT-compiled, optimized)
  WebAssembly: ██████████████████░░ (near-native, predictable)

Startup Cost:
  JavaScript:  ████░░░░░░░░░░░░░░░░ (low - runs immediately)
  WebAssembly: ████████░░░░░░░░░░░░ (higher - compile + instantiate)

Memory Usage:
  JavaScript:  ████████████████████ (GC overhead)
  WebAssembly: ████████░░░░░░░░░░░░ (manual control, compact)

Rule of thumb: If a task fits comfortably within one frame at 60 fps (about 16 ms), JavaScript is fine. For longer computations, consider WebAssembly.


WebAssembly Architecture

Stack-Based Virtual Machine

WebAssembly uses a stack machine model:

Traditional Register Machine:
    ADD r1, r2, r3    ; r1 = r2 + r3

WebAssembly Stack Machine:
    local.get 0       ;; push local[0] onto stack
    local.get 1       ;; push local[1] onto stack
    i32.add           ;; pop two values, push sum
    local.set 2       ;; pop result, store in local[2]

Stack operations visualization:

Instruction          Stack State
─────────────────    ─────────────
                     [ ]
i32.const 5          [ 5 ]
i32.const 3          [ 5, 3 ]
i32.add              [ 8 ]
i32.const 2          [ 8, 2 ]
i32.mul              [ 16 ]
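The stack trace for this instruction sequence can be reproduced with a few lines of JavaScript. This is a toy interpreter for three instructions only, written to illustrate the operand stack; the array-based instruction encoding and the `runStack` name are inventions for this sketch, and `| 0` and `Math.imul` stand in for 32-bit wraparound.

```javascript
function runStack(program) {
    const stack = [];
    const trace = [];
    for (const [op, arg] of program) {
        switch (op) {
            case 'i32.const':
                stack.push(arg | 0);
                break;
            case 'i32.add': {
                const b = stack.pop();
                const a = stack.pop();
                stack.push((a + b) | 0);
                break;
            }
            case 'i32.mul': {
                const b = stack.pop();
                const a = stack.pop();
                stack.push(Math.imul(a, b));
                break;
            }
            default:
                throw new Error(`Unsupported instruction: ${op}`);
        }
        // Record the instruction and the stack contents after it ran
        const text = arg === undefined ? op : `${op} ${arg}`;
        trace.push(text.padEnd(14) + `[ ${stack.join(', ')} ]`);
    }
    return { result: stack.pop(), trace };
}

const { result, trace } = runStack([
    ['i32.const', 5],
    ['i32.const', 3],
    ['i32.add'],
    ['i32.const', 2],
    ['i32.mul']
]);
console.log(trace.join('\n'));
console.log(result); // 16
```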

Module Structure

A WebAssembly module is organized into sections:

┌──────────────────────────────────────────┐
│  WebAssembly Module                      │
├──────────────────────────────────────────┤
│                                          │
│  Section 1: Type                         │
│    - Function signatures                 │
│                                          │
│  Section 2: Import                       │
│    - Imported functions/memory/tables    │
│                                          │
│  Section 3: Function                     │
│    - Function type indices               │
│                                          │
│  Section 4: Table                        │
│    - Indirect function call tables       │
│                                          │
│  Section 5: Memory                       │
│    - Linear memory definitions           │
│                                          │
│  Section 6: Global                       │
│    - Global variables                    │
│                                          │
│  Section 7: Export                       │
│    - Exported functions/memory           │
│                                          │
│  Section 8: Start                        │
│    - Initialization function             │
│                                          │
│  Section 9: Element                      │
│    - Table initialization                │
│                                          │
│  Section 10: Code                        │
│    - Function bodies (bytecode)          │
│                                          │
│  Section 11: Data                        │
│    - Memory initialization               │
│                                          │
└──────────────────────────────────────────┘
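In the binary format, each section starts with a one-byte id followed by a LEB128-encoded size, so a module's layout can be listed without decoding the section contents. The sketch below assumes a hand-assembled module that contains only a two-parameter add function; `listSections`, `readU32Leb`, and the `addModule` byte array are illustrative names and data, not taken from any toolkit.

```javascript
const SECTION_NAMES = ['Custom', 'Type', 'Import', 'Function', 'Table', 'Memory',
    'Global', 'Export', 'Start', 'Element', 'Code', 'Data', 'DataCount'];

// Decode an unsigned LEB128 integer starting at `offset`
function readU32Leb(bytes, offset) {
    let value = 0;
    let shift = 0;
    let byte;
    do {
        byte = bytes[offset++];
        value |= (byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return { value: value >>> 0, next: offset };
}

function listSections(bytes) {
    const magic = String.fromCharCode(...bytes.subarray(0, 4));
    if (magic !== '\0asm') throw new Error('Not a WebAssembly binary');

    const sections = [];
    let offset = 8; // skip the 4-byte magic and the 4-byte version
    while (offset < bytes.length) {
        const id = bytes[offset];
        const { value: size, next } = readU32Leb(bytes, offset + 1);
        sections.push({ id, name: SECTION_NAMES[id], size });
        offset = next + size; // jump over the section contents
    }
    return sections;
}

const addModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm", version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,  // Type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                // Function: one function of type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  // Export: "add" = function 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                          // Code: one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                     // local.get 0, local.get 1, i32.add, end
]);

console.log(listSections(addModule).map(s => `${s.id}: ${s.name} (${s.size} bytes)`));
```

Sections that a module does not need (here Import, Table, Memory, and others) are simply absent from the binary.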


WebAssembly Type System

Value Types

WebAssembly supports four numeric types:

i32  ;; 32-bit integer
i64  ;; 64-bit integer
f32  ;; 32-bit floating-point (IEEE 754)
f64  ;; 64-bit floating-point (IEEE 754)

And reference types (newer specification):

funcref    ;; Reference to function
externref  ;; Reference to external (JS) object

Function Signatures

Functions have typed parameters and results:

;; Function signature: (param i32 i32) (result i32)
(func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add
)

Multiple return values (multi-value proposal, added after the MVP):

;; Returns two values
(func $divmod (param $a i32) (param $b i32) (result i32 i32)
    local.get $a
    local.get $b
    i32.div_u
    local.get $a
    local.get $b
    i32.rem_u
)
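From JavaScript, a function with several results returns them as an array. The sketch below hand-assembles a binary for an unsigned divmod function like the one described here and calls it; the byte array and the `divmodModule` name are illustrative, and an engine with multi-value support is assumed. The synchronous `WebAssembly.Module` constructor is used only because the module is tiny.

```javascript
const divmodModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,              // "\0asm", version 1
    0x01, 0x08, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x02, 0x7f, 0x7f,  // Type: (i32, i32) -> (i32, i32)
    0x03, 0x02, 0x01, 0x00,                                      // Function: one function of type 0
    0x07, 0x0a, 0x01, 0x06,                                      // Export: one entry, 6-byte name
    0x64, 0x69, 0x76, 0x6d, 0x6f, 0x64, 0x00, 0x00,              //   "divmod" = function 0
    0x0a, 0x0e, 0x01, 0x0c, 0x00,                                // Code: one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6e,                                //   local.get 0, local.get 1, i32.div_u
    0x20, 0x00, 0x20, 0x01, 0x70,                                //   local.get 0, local.get 1, i32.rem_u
    0x0b                                                         //   end
]);

const { divmod } = new WebAssembly.Instance(new WebAssembly.Module(divmodModule)).exports;

const [quotient, remainder] = divmod(17, 5);
console.log(quotient, remainder); // 3 2

// The division is unsigned: -1 is treated as 4294967295
console.log(divmod(-1, 2)); // [ 2147483647, 1 ]
```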

WAT: WebAssembly Text Format

S-Expression Syntax

WAT (WebAssembly Text) is the human-readable representation:

(module
  ;; Import JavaScript function
  (import "env" "log" (func $log (param i32)))
  
  ;; Define memory
  (memory 1)
  
  ;; Export memory
  (export "memory" (memory 0))
  
  ;; Add function
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add
  )
  
  ;; Export function
  (export "add" (func $add))
  
  ;; Factorial function (recursive)
  (func $factorial (param $n i32) (result i32)
    (if (result i32)
      (i32.le_s (local.get $n) (i32.const 1))
      (then
        (i32.const 1)
      )
      (else
        (i32.mul
          (local.get $n)
          (call $factorial
            (i32.sub (local.get $n) (i32.const 1))
          )
        )
      )
    )
  )
  
  (export "factorial" (func $factorial))
)

Folded vs. Linear Format

Folded (S-expression):

(i32.add
  (i32.const 2)
  (i32.mul
    (i32.const 3)
    (i32.const 4)
  )
)

Linear (instruction sequence):

i32.const 2
i32.const 3
i32.const 4
i32.mul
i32.add

Both represent: 2 + (3 × 4) = 14
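The relationship between the two forms is a post-order traversal: operands are emitted first, then the instruction that consumes them. A minimal sketch, assuming folded expressions are represented as nested arrays (a representation and a `linearize` name invented for this example):

```javascript
// A folded expression as nested arrays: [instruction, ...operands]
// A non-array operand is an immediate; an array operand is a nested expression
function linearize(expr) {
    const [op, ...operands] = expr;
    const immediates = operands.filter(o => !Array.isArray(o));
    const nested = operands.filter(o => Array.isArray(o));
    // Nested operands come first (post-order), then the instruction itself
    return [
        ...nested.flatMap(child => linearize(child)),
        [op, ...immediates].join(' ')
    ];
}

const folded = [
    'i32.add',
    ['i32.const', 2],
    ['i32.mul', ['i32.const', 3], ['i32.const', 4]]
];

console.log(linearize(folded).join('\n'));
// i32.const 2
// i32.const 3
// i32.const 4
// i32.mul
// i32.add
```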


Core Instructions

Arithmetic Operations

Integer operations:

i32.add      ;; Addition
i32.sub      ;; Subtraction
i32.mul      ;; Multiplication
i32.div_s    ;; Signed division
i32.div_u    ;; Unsigned division
i32.rem_s    ;; Signed remainder
i32.rem_u    ;; Unsigned remainder

i32.and      ;; Bitwise AND
i32.or       ;; Bitwise OR
i32.xor      ;; Bitwise XOR
i32.shl      ;; Shift left
i32.shr_s    ;; Arithmetic shift right
i32.shr_u    ;; Logical shift right
i32.rotl     ;; Rotate left
i32.rotr     ;; Rotate right

Floating-point operations:

f64.add      ;; Addition
f64.sub      ;; Subtraction
f64.mul      ;; Multiplication
f64.div      ;; Division
f64.sqrt     ;; Square root
f64.min      ;; Minimum
f64.max      ;; Maximum
f64.ceil     ;; Ceiling
f64.floor    ;; Floor
f64.abs      ;; Absolute value
f64.neg      ;; Negation

Comparison Operations

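i32.eq       ;; Equal
i32.ne       ;; Not equal
i32.lt_s     ;; Less than (signed)
i32.lt_u     ;; Less than (unsigned)
i32.le_s     ;; Less or equal (signed)
i32.le_u     ;; Less or equal (unsigned)
i32.gt_s     ;; Greater than (signed)
i32.gt_u     ;; Greater than (unsigned)
i32.ge_s     ;; Greater or equal (signed)
i32.ge_u     ;; Greater or equal (unsigned)
i32.eqz      ;; Equal to zero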

Control Flow

Structured control flow (no goto):

;; If-then-else
(if (result i32)
  (i32.lt_s (local.get $x) (i32.const 0))
  (then
    (i32.const -1)
  )
  (else
    (i32.const 1)
  )
)

;; Block (labeled)
(block $my_block
  ;; code
  br $my_block  ;; branch to end of block
  ;; unreachable
)

;; Loop
(loop $continue
  ;; code
  (br_if $continue (i32.const 1))  ;; conditional branch
)

Example: Loop to sum 1..n:

(func $sum (param $n i32) (result i32)
  (local $i i32)
  (local $sum i32)
  
  ;; i = 0, sum = 0
  (local.set $i (i32.const 0))
  (local.set $sum (i32.const 0))
  
  (loop $continue
    ;; sum += i
    (local.set $sum
      (i32.add (local.get $sum) (local.get $i))
    )
    
    ;; i++
    (local.set $i
      (i32.add (local.get $i) (i32.const 1))
    )
    
    ;; if (i <= n) continue loop
    (br_if $continue
      (i32.le_s (local.get $i) (local.get $n))
    )
  )
  
  (local.get $sum)
)
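For comparison, the same control flow in JavaScript is a do...while loop: the body runs once before the condition is checked, which is what a `loop` with a trailing `br_if` amounts to. This is a sketch for illustration; `sumTo` is a hypothetical name and `| 0` mimics i32 wraparound.

```javascript
function sumTo(n) {
    let i = 0;
    let sum = 0;
    do {
        sum = (sum + i) | 0; // i32.add wraps at 32 bits
        i = (i + 1) | 0;
    } while (i <= n);        // br_if $continue (i32.le_s i n)
    return sum;
}

console.log(sumTo(10)); // 55
console.log(sumTo(0));  // 0
```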

Linear Memory

Memory Model

WebAssembly uses linear memory: a contiguous, resizable byte array.

┌────────────────────────────────────────────────────┐
│  WebAssembly Linear Memory                         │
├────────────────────────────────────────────────────┤
│                                                    │
│  Address 0x0000: [byte][byte][byte][byte] …        │
│  Address 0x0004: [byte][byte][byte][byte] …        │
│  Address 0x0008: [byte][byte][byte][byte] …        │
│  …                                                 │
│                                                    │
│  Each page = 64 KiB (65,536 bytes)                 │
│  Maximum size = 4 GiB (65,536 pages)               │
│                                                    │
└────────────────────────────────────────────────────┘

Memory Operations

Define memory:

;; Define 1 page (64 KiB) of memory, max 10 pages
(memory 1 10)

;; Export memory to JavaScript
(export "memory" (memory 0))

Load/store operations:

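;; Load a 32-bit integer: pops an address, pushes the value
;; (optional immediates: offset=N align=N)
i32.load

;; Store a 32-bit integer: pops the value, then the address
i32.store

;; Load/store with different sizes
i32.load8_s   ;; Load signed 8-bit, sign-extend to 32-bit
i32.load8_u   ;; Load unsigned 8-bit, zero-extend to 32-bit
i32.load16_s  ;; Load signed 16-bit, sign-extend to 32-bit
i32.load16_u  ;; Load unsigned 16-bit, zero-extend to 32-bit
i32.store8    ;; Store low 8 bits
i32.store16   ;; Store low 16 bits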

Example: Read/write memory:

(func $writeInt (param $addr i32) (param $value i32)
  (i32.store
    (local.get $addr)
    (local.get $value)
  )
)

(func $readInt (param $addr i32) (result i32)
  (i32.load (local.get $addr))
)
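Linear memory is little-endian regardless of the host CPU, which matters as soon as JavaScript reads the same bytes. The sketch below mimics an `i32.store` followed by an `i32.load` at address 0 using a DataView over a standalone `WebAssembly.Memory`; no Wasm module is involved, and the variable names are illustrative.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
const view = new DataView(memory.buffer);

// Equivalent of (i32.store (i32.const 0) (i32.const 0x12345678))
view.setInt32(0, 0x12345678, true); // true = little-endian

// The least significant byte is stored at the lowest address
const bytes = Array.from(new Uint8Array(memory.buffer, 0, 4), b => b.toString(16));
console.log(bytes); // [ '78', '56', '34', '12' ]

// Equivalent of (i32.load (i32.const 0))
console.log(view.getInt32(0, true).toString(16)); // 12345678
```

A DataView with an explicit little-endian flag is the portable choice; typed arrays such as Int32Array use the host's byte order, which happens to match on the common little-endian platforms.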

Memory Growth

Dynamic memory allocation:

;; Grow memory by n pages, returns previous size or -1
(memory.grow (i32.const 1))

;; Get current memory size in pages
(memory.size)
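The JavaScript side of memory growth has two details worth seeing in code: growing detaches the old ArrayBuffer, so existing views become empty, and `Memory.prototype.grow` throws a RangeError on failure rather than returning -1 as the instruction does. A sketch using a standalone `WebAssembly.Memory` with illustrative variable names:

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 3 });
const staleView = new Uint8Array(memory.buffer);
staleView[0] = 42;

const previousPages = memory.grow(1);  // returns the size before growing, in pages
console.log(previousPages);            // 1
console.log(memory.buffer.byteLength); // 131072 (2 pages)
console.log(staleView.length);         // 0: the old ArrayBuffer was detached

// Contents are preserved; only the view has to be recreated
const freshView = new Uint8Array(memory.buffer);
console.log(freshView[0]);             // 42

// Growing past the declared maximum throws from JavaScript
let growFailed = false;
try {
    memory.grow(5);
} catch (e) {
    growFailed = e instanceof RangeError;
}
console.log(growFailed);               // true
```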

JavaScript Interoperability

Loading WebAssembly

Streaming compilation (recommended):

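async function loadWasm(url, importObject = {}) {
    // instantiateStreaming accepts a Response or a Promise<Response>;
    // the server must send the Content-Type application/wasm
    const { instance } = await WebAssembly.instantiateStreaming(
        fetch(url),
        importObject
    );
    return instance;
}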

const wasmInstance = await loadWasm('module.wasm');

From buffer (for Node.js or non-streaming):

// ES module (e.g. a .mjs file), so top-level await is available
import { readFile } from 'node:fs/promises';

const wasmBuffer = await readFile('module.wasm');
const wasmModule = await WebAssembly.compile(wasmBuffer);
const wasmInstance = await WebAssembly.instantiate(wasmModule, importObject);

Calling Wasm from JavaScript

// Call exported Wasm function
const result = wasmInstance.exports.add(5, 3);
console.log(result); // 8

// Access exported memory
const memory = wasmInstance.exports.memory;
const view = new Uint8Array(memory.buffer);

// Read from memory
const value = new Int32Array(memory.buffer, 0, 1)[0];

// Write to memory
new Int32Array(memory.buffer, 4, 1)[0] = 42;
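Arguments and results cross the boundary with numeric conversion: i32 parameters go through the same ToInt32 conversion as `x | 0`, and i32 results come back as ordinary numbers. The sketch below runs without any .wasm file by hand-assembling a module with a single exported add function; the byte array and the `addModule` name are illustrative.

```javascript
const addModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm", version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,  // Type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                // Function: one function of type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  // Export: "add" = function 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                          // Code: one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                     // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is acceptable here only because the module is tiny
const instance = new WebAssembly.Instance(new WebAssembly.Module(addModule));
const { add } = instance.exports;

console.log(add(5, 3));          // 8
console.log(add(2147483647, 1)); // -2147483648 (i32 wraparound)
console.log(add(1.9, 2.9));      // 3 (arguments are truncated to 1 and 2)
```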

Importing JavaScript Functions

Import object:

const importObject = {
    env: {
        // Import JavaScript function
        log: (x) => console.log('Wasm says:', x),
        
        // Import global
        globalValue: new WebAssembly.Global({
            value: 'i32',
            mutable: true
        }, 42)
    }
};

Use in Wasm:

(module
  ;; Import JS function
  (import "env" "log" (func $log (param i32)))
  
  ;; Import JS global
  ;; Import JS global (declared mutable to match `mutable: true` on the JS side)
  (import "env" "globalValue" (global $g (mut i32)))
  
  (func $test
    ;; Call imported function
    (call $log (global.get $g))
  )
)
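Imports are resolved at instantiation time, so a missing or mismatched import fails before any Wasm code runs. The sketch below hand-assembles a small module that imports `env.log` and exports a `test` function calling it with the constant 42 (a simplification of the module shown here, without the global); the byte array and the variable names are illustrative.

```javascript
const logModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,              // "\0asm", version 1
    0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,  // Types: (i32) -> (), () -> ()
    0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                    // Import: module "env"
    0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                          //   field "log", function of type 0
    0x03, 0x02, 0x01, 0x01,                                      // Function: one function of type 1
    0x07, 0x08, 0x01, 0x04, 0x74, 0x65, 0x73, 0x74, 0x00, 0x01,  // Export: "test" = function 1
    0x0a, 0x08, 0x01, 0x06, 0x00,                                // Code: one body, no locals
    0x41, 0x2a, 0x10, 0x00, 0x0b                                 //   i32.const 42, call 0, end
]);

const received = [];
const importObject = { env: { log: (x) => received.push(x) } };

const instance = new WebAssembly.Instance(new WebAssembly.Module(logModule), importObject);
instance.exports.test();
console.log(received); // [ 42 ]

// An import object without a callable env.log is rejected with a LinkError
let linkError = null;
try {
    new WebAssembly.Instance(new WebAssembly.Module(logModule), { env: {} });
} catch (e) {
    linkError = e;
}
console.log(linkError instanceof WebAssembly.LinkError); // true
```

Imported functions occupy the first function indices, which is why the exported function has index 1 and `call 0` reaches the import.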

Passing Complex Data

Strings (no native string type):

// JavaScript → Wasm: Write UTF-8 string to memory
function writeString(instance, str, addr) {
    const memory = new Uint8Array(instance.exports.memory.buffer);
    const encoded = new TextEncoder().encode(str);
    memory.set(encoded, addr);
    return encoded.length;
}

// Wasm → JavaScript: Read UTF-8 string from memory
function readString(instance, addr, length) {
    const memory = new Uint8Array(instance.exports.memory.buffer);
    const bytes = memory.slice(addr, addr + length);
    return new TextDecoder().decode(bytes);
}

Arrays:

// Pass array to Wasm
function passArray(instance, array) {
    const memory = new Float32Array(instance.exports.memory.buffer);
    memory.set(array, 0); // Write at address 0
    
    // Call Wasm function with array address and length
    instance.exports.processArray(0, array.length);
}

// Get array from Wasm
function getArray(instance, addr, length) {
    const memory = new Float32Array(instance.exports.memory.buffer);
    return Array.from(memory.slice(addr / 4, addr / 4 + length));
}
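One detail these helpers depend on is that lengths are counted in UTF-8 bytes, not in JavaScript string length. A round-trip sketch, using a standalone `WebAssembly.Memory` instead of a module's exported memory; `writeUtf8` and `readUtf8` are hypothetical names for this example.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const decoder = new TextDecoder();

function writeUtf8(str, addr) {
    const bytes = encoder.encode(str);
    new Uint8Array(memory.buffer).set(bytes, addr);
    return bytes.length; // byte length, which can exceed str.length
}

function readUtf8(addr, length) {
    // subarray() is a view (no copy); decode() produces the JS string
    const bytes = new Uint8Array(memory.buffer).subarray(addr, addr + length);
    return decoder.decode(bytes);
}

const text = 'na\u00efve \u2713';     // "naïve ✓"
const byteLength = writeUtf8(text, 16);
console.log(text.length, byteLength); // 7 10
console.log(readUtf8(16, byteLength) === text); // true
```

Creating the Uint8Array inside each call, rather than caching it, keeps the helpers valid after the memory grows.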

Building WebAssembly

From WAT to WASM

Using WABT (WebAssembly Binary Toolkit):

# Install wabt
npm install -g wabt

# Compile WAT to WASM
wat2wasm module.wat -o module.wasm

# Disassemble WASM to WAT
wasm2wat module.wasm -o module.wat

# Validate WASM module
wasm-validate module.wasm

Example module (add.wat):

(module
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add
  )
  (export "add" (func $add))
)

Compile and use:

wat2wasm add.wat -o add.wasm

Then, from JavaScript:

const wasm = await loadWasm('add.wasm');
console.log(wasm.exports.add(10, 20)); // 30

Compilation from High-Level Languages

From C/C++ (covered in detail in a later chapter):

emcc source.c -o output.wasm \
  -s EXPORTED_FUNCTIONS='["_myFunction"]' \
  -s STANDALONE_WASM

From Rust (covered in detail in Chapter 15):

cargo build --target wasm32-unknown-unknown --release

From AssemblyScript (TypeScript-like):

// module.ts
export function add(a: i32, b: i32): i32 {
  return a + b;
}

Compile with:

asc module.ts -o module.wasm --optimize

Performance Characteristics

Advantages

1. Fast parsing/loading:

WebAssembly: Binary format → Direct decoding → No parsing overhead
JavaScript:  Text → Lexing → Parsing → AST → Bytecode

2. Predictable performance:

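  • Little JIT warmup: types are static, so engines need no type speculation

  • Compiled before execution (engines typically use a fast baseline tier, then an optimizing tier)

  • More consistent execution times (no deoptimization)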

3. Compact size:

Illustrative size comparison:
  JavaScript (minified): 100 KB
  WebAssembly:           30-50 KB (50-70% smaller)

4. Memory efficiency:

  • Manual memory management

  • No garbage collection overhead

  • Precise control over layout

Limitations

1. Startup cost:

// Compile + instantiate time
const t0 = performance.now();
const wasm = await WebAssembly.instantiateStreaming(fetch('module.wasm'));
const t1 = performance.now();
console.log(`Startup: ${t1 - t0}ms`);

Often in the range of 10-100 ms, depending on module size, engine, and hardware.

2. JS ↔︎ Wasm boundary cost:

// Expensive if called millions of times
for (let i = 0; i < 1_000_000; i++) {
    wasm.exports.smallFunction(i);  // Boundary crossing
}

// Better: Do work in Wasm
wasm.exports.processMillionItems();  // One crossing

3. No direct DOM access:

WebAssembly cannot directly manipulate DOM – must call JavaScript.

4. Limited debugging:

Binary format makes debugging harder (though improving with source maps).


Practical Example: Image Processing

Grayscale Filter

Wasm module (filter.wat):

(module
  (memory (export "memory") 1)
  
  ;; Convert RGBA image to grayscale
  ;; addr: start address of image data
  ;; length: number of pixels
  (func $grayscale (param $addr i32) (param $length i32)
    (local $i i32)
    (local $r i32)
    (local $g i32)
    (local $b i32)
    (local $gray i32)
    (local $offset i32)
    
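    ;; Nothing to do for an empty image (the loop below always runs at least once)
    (if (i32.eqz (local.get $length)) (then (return)))
    
    (loop $continue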
      ;; Calculate byte offset (4 bytes per pixel)
      (local.set $offset
        (i32.mul (local.get $i) (i32.const 4))
      )
      
      ;; Load R, G, B
      (local.set $r (i32.load8_u (i32.add (local.get $addr) (local.get $offset))))
      (local.set $g (i32.load8_u (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 1)))))
      (local.set $b (i32.load8_u (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 2)))))
      
      ;; Calculate grayscale: 0.299*R + 0.587*G + 0.114*B
      ;; Using integer approximation: (77*R + 150*G + 29*B) / 256
      (local.set $gray
        (i32.div_u
          (i32.add
            (i32.add
              (i32.mul (local.get $r) (i32.const 77))
              (i32.mul (local.get $g) (i32.const 150))
            )
            (i32.mul (local.get $b) (i32.const 29))
          )
          (i32.const 256)
        )
      )
      
      ;; Write grayscale value to R, G, B
      (i32.store8 (i32.add (local.get $addr) (local.get $offset)) (local.get $gray))
      (i32.store8 (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 1))) (local.get $gray))
      (i32.store8 (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 2))) (local.get $gray))
      
      ;; i++
      (local.set $i (i32.add (local.get $i) (i32.const 1)))
      
      ;; Continue if i < length
      (br_if $continue (i32.lt_u (local.get $i) (local.get $length)))
    )
  )
  
  (export "grayscale" (func $grayscale))
)
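The integer weights in the filter can be checked from JavaScript before trusting them in the Wasm code. The sketch below compares the fixed-point formula with the floating-point one over a grid of colours; `exactLuma` and `fixedLuma` are names chosen for this example, and `>> 8` is the shift equivalent of the unsigned division by 256.

```javascript
const exactLuma = (r, g, b) => 0.299 * r + 0.587 * g + 0.114 * b;
const fixedLuma = (r, g, b) => (77 * r + 150 * g + 29 * b) >> 8; // >> 8 divides by 256

// Sample the colour cube and track the largest deviation from the exact value
let maxError = 0;
for (let r = 0; r < 256; r += 5) {
    for (let g = 0; g < 256; g += 5) {
        for (let b = 0; b < 256; b += 5) {
            const error = Math.abs(exactLuma(r, g, b) - fixedLuma(r, g, b));
            maxError = Math.max(maxError, error);
        }
    }
}

console.log(fixedLuma(255, 255, 255)); // 255: the weights sum to 256, so white stays white
console.log(fixedLuma(255, 0, 0));     // 76
console.log(maxError < 1.5);           // true: truncation plus rounded weights stay within 1.5 levels
```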

JavaScript integration:

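async function applyGrayscaleFilter(imageData) {
    // Load Wasm module (in real code, load once and reuse the instance)
    const wasm = await loadWasm('filter.wasm');
    
    // The module starts with 1 page (64 KiB); grow it if the image does not fit
    const PAGE_SIZE = 65536;
    const needed = imageData.data.length;
    const available = wasm.exports.memory.buffer.byteLength;
    if (needed > available) {
        wasm.exports.memory.grow(Math.ceil((needed - available) / PAGE_SIZE));
    }
    
    // Get memory view (after growing, because grow() detaches the old buffer)
    const memory = new Uint8Array(wasm.exports.memory.buffer);
    
    // Copy image data to Wasm memory
    memory.set(imageData.data, 0);
    
    // Process in Wasm
    const pixelCount = imageData.width * imageData.height;
    wasm.exports.grayscale(0, pixelCount);
    
    // Copy result back
    imageData.data.set(memory.subarray(0, imageData.data.length));
    
    return imageData;
}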

// Usage with Canvas
const canvas = document.getElementById('myCanvas');
const ctx = canvas.getContext('2d');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

await applyGrayscaleFilter(imageData);

ctx.putImageData(imageData, 0, 0);

Performance comparison:

// Pure JavaScript version
function grayscaleJS(imageData) {
    const data = imageData.data;
    for (let i = 0; i < data.length; i += 4) {
        const gray = 0.299 * data[i] + 0.587 * data[i+1] + 0.114 * data[i+2];
        data[i] = data[i+1] = data[i+2] = gray;
    }
}

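// Benchmark (timings are illustrative; measure on your own hardware)
const imageData = ctx.getImageData(0, 0, 1920, 1080);

console.time('JavaScript');
grayscaleJS(imageData);
console.timeEnd('JavaScript');
// JavaScript: ~12ms

// Note: applyGrayscaleFilter() as written also fetches and compiles the
// module, so load it once up front before timing in a real benchmark
console.time('WebAssembly');
await applyGrayscaleFilter(imageData);
console.timeEnd('WebAssembly');
// WebAssembly: ~4ms

// Roughly 3x faster in this run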

Summary

WebAssembly fundamentals:

1. Architecture:

  • Stack-based virtual machine

  • Strongly typed instruction set

  • Linear memory model

  • Module-based structure

2. Type system:

  • Four numeric types: i32, i64, f32, f64

  • Reference types: funcref, externref

  • Function signatures with typed params/results

3. Text format (WAT):

  • S-expression syntax

  • Human-readable representation

  • Converts to binary .wasm format

4. Core features:

  • Arithmetic/logic operations

  • Structured control flow (no goto)

  • Memory load/store operations

  • Function calls (direct and indirect)

5. JavaScript interop:

  • Load modules with WebAssembly.instantiateStreaming()

  • Call exported functions

  • Share linear memory

  • Import JavaScript functions

6. Performance:

  • Fast parsing (binary format)

  • Near-native execution speed

  • Predictable performance

  • Small binary size

  • Boundary crossing overhead

7. Use cases:

  • CPU-intensive computations

  • Image/video processing

  • Physics simulations

  • Porting existing C/C++/Rust code

  • Performance-critical algorithms

WebAssembly provides a safe, fast, portable compilation target that complements JavaScript, enabling new classes of web applications with near-native performance. The next chapter takes a closer look at the text format; later chapters cover compiling from high-level languages like C/C++ and Rust to WebAssembly.


Chapter 14: WebAssembly Text Format (WAT)

Introduction: Understanding WAT

WebAssembly Text Format (WAT) is the human-readable representation of WebAssembly modules, using S-expression syntax similar to Lisp. While most developers compile to WebAssembly from high-level languages, understanding WAT is crucial for:

  • Debugging: Reading compiled output

  • Learning: Understanding WebAssembly’s instruction set

  • Manual optimization: Fine-tuning critical sections

  • Tool development: Building compilers and analyzers

┌─────────────────────────────────────────────────────┐
│  WebAssembly Representations                        │
├─────────────────────────────────────────────────────┤
│                                                     │
│  High-level code (C/Rust/etc.)                      │
│          ↓                                          │
│  WebAssembly Text (.wat) ←→ Binary (.wasm)          │
│          ↓                                          │
│  JavaScript integration                             │
│                                                     │
└─────────────────────────────────────────────────────┘

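Note: Early tooling used the name “wast” (and the .wast extension) for the text format. The community standardized on “WAT” (.wat) for module text; .wast now refers to the script format used by the specification test suite.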


S-Expression Syntax

Basic Structure

WAT uses S-expressions (symbolic expressions):

;; S-expression format
(keyword arguments...)

;; Examples
(i32.add (i32.const 1) (i32.const 2))

;; Nested
(i32.mul
  (i32.add (i32.const 2) (i32.const 3))
  (i32.const 4)
)
;; Evaluates to: (2 + 3) * 4 = 20
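Because the syntax is this regular, a reader for it fits in a few lines. The sketch below parses a folded expression into nested arrays and evaluates three i32 instructions; `parseSExpr` and `evalI32` are hypothetical names, only `;;` line comments are stripped, and nothing else of WAT (strings, block comments, names) is handled.

```javascript
function parseSExpr(text) {
    // Drop ;; line comments, then split into parentheses and atoms
    const tokens = text.replace(/;;.*$/gm, '').match(/[()]|[^\s()]+/g) ?? [];
    let pos = 0;

    function read() {
        const token = tokens[pos++];
        if (token === undefined) throw new Error('Unexpected end of input');
        if (token === ')') throw new Error('Unexpected )');
        if (token !== '(') return token; // atom
        const list = [];
        while (tokens[pos] !== ')') {
            if (pos >= tokens.length) throw new Error('Missing )');
            list.push(read());
        }
        pos++; // consume ')'
        return list;
    }

    return read();
}

function evalI32(expr) {
    const [op, ...args] = expr;
    switch (op) {
        case 'i32.const': return Number(args[0]) | 0;
        case 'i32.add': return (evalI32(args[0]) + evalI32(args[1])) | 0;
        case 'i32.mul': return Math.imul(evalI32(args[0]), evalI32(args[1]));
        default: throw new Error(`Unsupported instruction: ${op}`);
    }
}

const source = `
(i32.mul
  (i32.add (i32.const 2) (i32.const 3))  ;; 5
  (i32.const 4)
)`;

const tree = parseSExpr(source);
console.log(JSON.stringify(tree));
// ["i32.mul",["i32.add",["i32.const","2"],["i32.const","3"]],["i32.const","4"]]
console.log(evalI32(tree)); // 20
```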

Parentheses rules:

  • Opening ( starts an operation

  • Closing ) completes it

  • Everything is explicitly nested

Comments

;; Single-line comment

(; 
   Multi-line comment
   spanning multiple lines
;)

;;; Documentation comment (convention)
;;; Used to document functions/modules

Module Structure

Minimal Module

Every WAT file defines a module:

(module
  ;; Module contents go here
)

Complete Module Template

(module
  ;; 1. Type definitions
  (type $functype (func (param i32) (result i32)))
  
  ;; 2. Imports
  (import "env" "log" (func $log (param i32)))
  ;; (import "js" "memory" (memory 1))  ;; alternative to step 5: without the multi-memory extension a module has at most one memory
  
  ;; 3. Function declarations
  (func $internal ...)
  (func $exported ...)
  
  ;; 4. Table definitions
  (table 10 funcref)
  
  ;; 5. Memory definitions
  (memory 1)
  
  ;; 6. Global variables
  (global $counter (mut i32) (i32.const 0))
  
  ;; 7. Exports
  (export "exported" (func $exported))
  (export "memory" (memory 0))
  
  ;; 8. Start function (initialization)
  (start $init)
  
  ;; 9. Element segments (table initialization)
  (elem (i32.const 0) $func1 $func2)
  
  ;; 10. Data segments (memory initialization)
  (data (i32.const 0) "Hello, World!")
)

Value Types

Numeric Types

i32  ;; 32-bit integer
i64  ;; 64-bit integer
f32  ;; 32-bit floating-point (IEEE 754)
f64  ;; 64-bit floating-point (IEEE 754)

Reference Types

funcref    ;; Reference to a function
externref  ;; Reference to external (JavaScript) object

Type Usage

;; Local variables
(local $x i32)
(local $y f64)
(local $fn funcref)

;; Parameters
(param $a i32)
(param $b f64)

;; Results
(result i32)
(result f64 f64)  ;; Multiple results

Constants and Literals

Integer Constants

;; Decimal
(i32.const 42)
(i32.const -17)

;; Hexadecimal
(i32.const 0x2A)
(i32.const 0xFF)

;; No binary literals: WAT accepts only decimal and hexadecimal,
;; so 0b101010 must be written as 42 or 0x2A

;; Underscores for readability
(i32.const 1_000_000)
(i32.const 0xFF_FF_FF_FF)

Floating-Point Constants

;; Standard notation
(f32.const 3.14159)
(f32.const -2.5)

;; Scientific notation
(f64.const 1.23e-4)
(f64.const 6.022e23)

;; Special values
(f32.const nan)          ;; NaN
(f32.const inf)          ;; Infinity
(f32.const -inf)         ;; Negative infinity

;; Hexadecimal float
(f64.const 0x1.921fb54442d18p+1)  ;; π
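
To see how a hexadecimal float literal maps to a value, the following sketch decodes one in JavaScript. `parseHexFloat` is a hypothetical helper written for this example; it ignores `nan`, `inf`, underscores, and correct rounding of over-long mantissas.

```javascript
// Decode a WAT-style hexadecimal float: hex mantissa times 2^exponent.
function parseHexFloat(text) {
  const match = /^([+-])?0x([0-9a-f]+)(?:\.([0-9a-f]*))?(?:p([+-]?\d+))?$/i.exec(text);
  if (!match) throw new SyntaxError(`not a hex float: ${text}`);
  const [, sign, intDigits, fracDigits = '', exponent = '0'] = match;

  // Each fractional hex digit contributes 4 bits after the point
  const mantissa = parseInt(intDigits, 16)
    + (fracDigits ? parseInt(fracDigits, 16) / 16 ** fracDigits.length : 0);

  const value = mantissa * 2 ** Number(exponent);
  return sign === '-' ? -value : value;
}

console.log(parseHexFloat('0x1.8p+1'));                         // 3
console.log(parseHexFloat('0x1.921fb54442d18p+1') === Math.PI); // true
```

The second line works out exactly because the literal carries 13 fractional hex digits, which is the full 52-bit fraction of an f64.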

Functions

Basic Function Definition

;; Unnamed function
(func (param i32 i32) (result i32)
  local.get 0
  local.get 1
  i32.add
)

;; Named function with named parameters
(func $add (param $a i32) (param $b i32) (result i32)
  local.get $a
  local.get $b
  i32.add
)

Local Variables

(func $compute (param $x i32) (result i32)
  ;; Declare local variables
  (local $temp i32)
  (local $result i32)
  
  ;; Use locals
  (local.set $temp (i32.mul (local.get $x) (i32.const 2)))
  (local.set $result (i32.add (local.get $temp) (i32.const 1)))
  
  (local.get $result)
)

Shorthand for multiple locals:

;; Named locals are declared one per clause
(local $a i32)
(local $b i32)
(local $c i32)

;; The shorthand exists only for unnamed locals (referenced by index)
(local i32 i32 i32)

Multiple Return Values

;; Function returning two values
(func $swap (param $a i32) (param $b i32) (result i32 i32)
  local.get $b
  local.get $a
)

;; Usage
(func $test
  (local $x i32)
  (local $y i32)
  
  (i32.const 10)
  (i32.const 20)
  (call $swap)
  (local.set $y)  ;; Pop second result
  (local.set $x)  ;; Pop first result
)
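
When a multi-value function is exported, the JavaScript API hands the results back as an array. The sketch below assumes an engine with multi-value support (current browsers and Node.js) and uses a hand-encoded binary of a swap function equivalent to the one above; the export name "swap" and the variable names are assumptions of this example.

```javascript
// Hand-encoded binary for:
//   (func (export "swap") (param i32 i32) (result i32 i32)
//     local.get 1
//     local.get 0)
const swapBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x08, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x02, 0x7f, 0x7f, // type: (i32 i32) -> (i32 i32)
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x73, 0x77, 0x61, 0x70, 0x00, 0x00, // export "swap" = func 0
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x01, 0x20, 0x00, 0x0b, // body: local.get 1, local.get 0
]);

const swapInstance = new WebAssembly.Instance(new WebAssembly.Module(swapBytes));

// Two results arrive as a two-element array, first result first
const [first, second] = swapInstance.exports.swap(10, 20);
console.log(first, second); // 20 10
```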

Instructions

Stack Manipulation

;; Push constant onto stack
(i32.const 42)

;; Get local variable (push to stack)
(local.get $varname)
(local.get 0)  ;; By index

;; Set local variable (pop from stack)
(local.set $varname)

;; Tee: set local AND keep value on stack
(local.tee $varname)

;; Get global variable
(global.get $globalname)

;; Set global variable
(global.set $globalname)

Arithmetic Instructions

;; Integer arithmetic
i32.add       ;; Addition
i32.sub       ;; Subtraction
i32.mul       ;; Multiplication
i32.div_s     ;; Signed division
i32.div_u     ;; Unsigned division
i32.rem_s     ;; Signed remainder
i32.rem_u     ;; Unsigned remainder

;; Example: (a * b) + c
(func $formula (param $a i32) (param $b i32) (param $c i32) (result i32)
  (i32.add
    (i32.mul (local.get $a) (local.get $b))
    (local.get $c)
  )
)

Bitwise Operations

i32.and       ;; Bitwise AND
i32.or        ;; Bitwise OR
i32.xor       ;; Bitwise XOR
i32.shl       ;; Shift left
i32.shr_s     ;; Arithmetic shift right
i32.shr_u     ;; Logical shift right
i32.rotl      ;; Rotate left
i32.rotr      ;; Rotate right
i32.clz       ;; Count leading zeros
i32.ctz       ;; Count trailing zeros
i32.popcnt    ;; Count set bits

Example: Check if power of 2:

(func $isPowerOf2 (param $n i32) (result i32)
  ;; n != 0 && (n & (n-1)) == 0
  (i32.and
    (i32.ne (local.get $n) (i32.const 0))
    (i32.eqz
      (i32.and
        (local.get $n)
        (i32.sub (local.get $n) (i32.const 1))
      )
    )
  )
)

Comparison Instructions

i32.eq        ;; Equal
i32.ne        ;; Not equal
i32.lt_s      ;; Less than (signed)
i32.lt_u      ;; Less than (unsigned)
i32.le_s      ;; Less or equal (signed)
i32.le_u      ;; Less or equal (unsigned)
i32.gt_s      ;; Greater than (signed)
i32.gt_u      ;; Greater than (unsigned)
i32.ge_s      ;; Greater or equal (signed)
i32.ge_u      ;; Greater or equal (unsigned)
i32.eqz       ;; Equal to zero
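
WebAssembly integers carry no sign; the `_s` and `_u` suffixes choose how the same 32 bits are interpreted. A small JavaScript model of the two less-than variants makes the difference visible. The helper names `ltS` and `ltU` are assumptions of this sketch.

```javascript
// Model i32.lt_s and i32.lt_u on the same bit patterns.
// `| 0` reinterprets a value as a signed 32-bit integer,
// `>>> 0` reinterprets it as an unsigned 32-bit integer.
const ltS = (a, b) => ((a | 0) < (b | 0) ? 1 : 0);
const ltU = (a, b) => ((a >>> 0) < (b >>> 0) ? 1 : 0);

const allOnes = 0xFFFFFFFF; // -1 when signed, 4294967295 when unsigned

console.log(ltS(allOnes, 1)); // 1: -1 < 1
console.log(ltU(allOnes, 1)); // 0: 4294967295 is not < 1
```

Like the Wasm instructions, both helpers return 1 or 0 rather than a boolean.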

Conversion Instructions

;; Wrap (truncate)
i32.wrap_i64          ;; i64 → i32 (truncate)

;; Extend
i64.extend_i32_s      ;; i32 → i64 (sign extend)
i64.extend_i32_u      ;; i32 → i64 (zero extend)

;; Truncate float to int
i32.trunc_f32_s       ;; f32 → i32 (signed)
i32.trunc_f32_u       ;; f32 → i32 (unsigned)
i32.trunc_f64_s       ;; f64 → i32 (signed)

;; Convert int to float
f32.convert_i32_s     ;; i32 → f32 (signed)
f32.convert_i32_u     ;; i32 → f32 (unsigned)

;; Promote/Demote
f64.promote_f32       ;; f32 → f64
f32.demote_f64        ;; f64 → f32

;; Reinterpret bits
i32.reinterpret_f32   ;; Reinterpret f32 bits as i32
f32.reinterpret_i32   ;; Reinterpret i32 bits as f32
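
The difference between converting a value and reinterpreting its bits can be mirrored in JavaScript with a `DataView`. This is only a model of three of the instructions above; the function names are assumptions chosen to echo the Wasm mnemonics.

```javascript
// A 4-byte scratch buffer used to view the same bits as float or integer
const scratch = new DataView(new ArrayBuffer(4));

// i32.reinterpret_f32: same bits, read back as a signed integer
function i32ReinterpretF32(x) {
  scratch.setFloat32(0, x);
  return scratch.getInt32(0);
}

// f32.reinterpret_i32: same bits, read back as a float
function f32ReinterpretI32(bits) {
  scratch.setInt32(0, bits);
  return scratch.getFloat32(0);
}

// i32.wrap_i64: keep the low 32 bits of a 64-bit value
const i32WrapI64 = (big) => Number(BigInt.asIntN(32, big));

console.log(i32ReinterpretF32(1.0).toString(16));                       // 3f800000
console.log(f32ReinterpretI32(0x40490fdb) === Math.fround(Math.PI));    // true
console.log(i32WrapI64(0x100000005n));                                  // 5
console.log(i32WrapI64(0xFFFFFFFFn));                                   // -1
```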

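Example: Float to int with rounding (halves round up):

(func $roundToInt (param $x f64) (result i32)
  ;; floor(x + 0.5) also rounds negative inputs correctly;
  ;; truncating x + 0.5 directly would turn -2.7 into -2.
  ;; i32.trunc_f64_s traps on NaN and on values outside the i32 range.
  (i32.trunc_f64_s
    (f64.floor (f64.add (local.get $x) (f64.const 0.5)))
  )
)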

Control Flow

Blocks

Block: Creates a label for branching to the end:

(block $label (result i32)
  ;; code
  (br $label (i32.const 42))  ;; Jump to end of block, carrying the block result
  ;; code after an unconditional br is unreachable
)

Example: Early exit:

(func $absoluteValue (param $x i32) (result i32)
  (block $done (result i32)
    ;; If x >= 0, leave the block with x as its result.
    ;; When the branch is not taken, br_if leaves x on the stack, so drop it.
    (drop
      (br_if $done
        (local.get $x)
        (i32.ge_s (local.get $x) (i32.const 0))
      )
    )
    
    ;; Otherwise return -x
    (i32.sub (i32.const 0) (local.get $x))
  )
)

Loops

Loop: Creates a label for branching to the beginning:

(loop $continue
  ;; code
  (br $continue)  ;; Jump to start of loop
)

Example: Sum 1 to n:

(func $sumToN (param $n i32) (result i32)
  (local $i i32)
  (local $sum i32)
  
  (local.set $i (i32.const 1))
  (local.set $sum (i32.const 0))
  
  (block $done
    (loop $continue
      ;; if (i > n) exit, so n < 1 yields 0
      (br_if $done
        (i32.gt_s (local.get $i) (local.get $n))
      )
      
      ;; sum += i
      (local.set $sum
        (i32.add (local.get $sum) (local.get $i))
      )
      
      ;; i++
      (local.set $i
        (i32.add (local.get $i) (i32.const 1))
      )
      
      (br $continue)
    )
  )
  
  (local.get $sum)
)
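
Readers coming from JavaScript can map the two label kinds onto labeled statements: a branch to a block label behaves like `break label`, and a branch to a loop label behaves like `continue label`. The sketch below is a JavaScript rendering of a sum-to-n loop under that analogy; it checks the exit condition before the first iteration, and the label names are assumptions of this example.

```javascript
function sumToN(n) {
  let i = 1;
  let sum = 0;

  done: {                        // (block $done
    loop: while (true) {         //   (loop $continue
      if (i > n) break done;     //     br_if $done: jump to the END of the block
      sum += i;
      i += 1;
      continue loop;             //     br $continue: jump to the START of the loop
    }
  }

  return sum;
}

console.log(sumToN(10)); // 55
console.log(sumToN(0));  // 0
```

One difference worth keeping in mind: a Wasm loop does not repeat on its own. Without the explicit branch back to its label, control simply falls out of the loop at its end.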

If-Then-Else

;; Without result
(if (i32.gt_s (local.get $x) (i32.const 0))
  (then
    ;; code when true
  )
  (else
    ;; code when false
  )
)

;; With result
(if (result i32)
  (i32.lt_s (local.get $x) (i32.const 0))
  (then
    (i32.const -1)
  )
  (else
    (i32.const 1)
  )
)

Example: Max function:

(func $max (param $a i32) (param $b i32) (result i32)
  (if (result i32)
    (i32.gt_s (local.get $a) (local.get $b))
    (then (local.get $a))
    (else (local.get $b))
  )
)

Select

Select: Ternary operator equivalent:

;; select(value_if_true, value_if_false, condition)
(select
  (i32.const 100)  ;; value if true
  (i32.const 0)    ;; value if false
  (i32.gt_s (local.get $x) (i32.const 0))  ;; condition
)

;; Equivalent to: x > 0 ? 100 : 0

Branch Instructions

br $label           ;; Unconditional branch
br_if $label        ;; Conditional branch (pops condition)
br_table $l1 $l2 $default  ;; Switch-case style

return              ;; Return from function
unreachable         ;; Trap execution

Example: Switch statement:

(func $getDayName (param $day i32) (result i32)
  (block $default
    (block $case6
      (block $case5
        (block $case4
          (block $case3
            (block $case2
              (block $case1
                (block $case0
                  ;; Jump table: 0 → $case0 ... 6 → $case6, anything else → $default
                  (br_table $case0 $case1 $case2 $case3 
                            $case4 $case5 $case6 $default
                    (local.get $day)
                  )
                )
                ;; Case 0: Sunday
                (return (i32.const 0))
              )
              ;; Case 1: Monday
              (return (i32.const 1))
            )
            ;; Case 2: Tuesday
            (return (i32.const 2))
          )
          ;; ... etc
        )
      )
    )
  )
  ;; Default case (the elided cases above fall through to here as written)
  (i32.const -1)
)
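
The selection rule of `br_table` is small enough to model directly: the operand is treated as an unsigned index into the label list, and any index past the end selects the last (default) label. The `brTable` function below is a hypothetical model of that rule, not part of any API.

```javascript
// Model of br_table target selection.
// labels: the listed targets; defaultLabel: the final target in the instruction.
function brTable(labels, defaultLabel, index) {
  const unsignedIndex = index >>> 0; // the i32 operand is interpreted as unsigned
  return unsignedIndex < labels.length ? labels[unsignedIndex] : defaultLabel;
}

const cases = ['$case0', '$case1', '$case2'];

console.log(brTable(cases, '$default', 1));  // $case1
console.log(brTable(cases, '$default', 7));  // $default
console.log(brTable(cases, '$default', -1)); // $default (-1 is 4294967295 unsigned)
```

Because negative inputs become very large unsigned values, a switch built on `br_table` needs no separate lower-bound check.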

Memory Operations

Memory Definition

;; Define 1 page (64KB) minimum, 10 pages maximum
(memory 1 10)

;; Export memory
(export "memory" (memory 0))

;; Import memory from JavaScript
(import "js" "mem" (memory 1))

Load Instructions

;; Load 32-bit integer (optional immediates: offset=N align=N)
i32.load

;; Load with specific size
i32.load8_s   ;; Load signed 8-bit, extend to 32-bit
i32.load8_u   ;; Load unsigned 8-bit
i32.load16_s  ;; Load signed 16-bit
i32.load16_u  ;; Load unsigned 16-bit

;; Offset and alignment
(i32.load offset=4 align=4)

Example: Read integer from address:

(func $readInt (param $addr i32) (result i32)
  (i32.load (local.get $addr))
)

Store Instructions

;; Store 32-bit integer (optional immediates: offset=N align=N)
i32.store

;; Store with specific size
i32.store8    ;; Store low 8 bits
i32.store16   ;; Store low 16 bits

;; Example with offset
(i32.store offset=8 align=4)

Example: Write integer to address:

(func $writeInt (param $addr i32) (param $value i32)
  (i32.store
    (local.get $addr)
    (local.get $value)
  )
)
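
Two details of loads and stores are easy to check from JavaScript: the effective address is the address operand plus the static offset, and multi-byte values are stored little-endian. The sketch below models `i32.store` and `i32.load` on a standalone `WebAssembly.Memory`; the helper names are assumptions of this example.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64KB page
const view = new DataView(memory.buffer);

// Model of (i32.store offset=N (addr) (value)); `true` selects little-endian
function i32Store(addr, value, offset = 0) {
  view.setInt32(addr + offset, value, true);
}

// Model of (i32.load offset=N (addr))
function i32Load(addr, offset = 0) {
  return view.getInt32(addr + offset, true);
}

i32Store(8, 0x12345678, 4); // effective address: 8 + 4 = 12

const stored = Array.from(new Uint8Array(memory.buffer, 12, 4));
console.log(stored.map((byte) => byte.toString(16)).join(' ')); // 78 56 34 12
console.log(i32Load(12).toString(16));                          // 12345678
```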

Memory Size and Growth

;; Get current memory size in pages
(memory.size)

;; Grow memory by n pages (returns previous size or -1)
(memory.grow (i32.const 1))

Example: Allocate memory:

(func $malloc (param $size i32) (result i32)
  (local $oldSize i32)
  (local $pagesNeeded i32)
  
  ;; Calculate pages needed (round up to whole 64KB pages)
  (local.set $pagesNeeded
    (i32.div_u
      (i32.add (local.get $size) (i32.const 65535))
      (i32.const 65536)
    )
  )
  
  ;; Grow memory; memory.grow returns the previous size in pages, or -1 on failure
  (local.set $oldSize (memory.grow (local.get $pagesNeeded)))
  
  ;; Propagate failure instead of returning a bogus address
  (if (i32.eq (local.get $oldSize) (i32.const -1))
    (then (return (i32.const -1)))
  )
  
  ;; Return address (start of new memory)
  (i32.mul (local.get $oldSize) (i32.const 65536))
)
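
The same growth operation is available from JavaScript through `WebAssembly.Memory.prototype.grow`, with one difference in failure handling: the JavaScript method throws a `RangeError` where the instruction returns -1. A minimal sketch, with `pagesFor` as a hypothetical helper for the round-up arithmetic:

```javascript
const PAGE_SIZE = 65536; // bytes per WebAssembly page

// Round a byte count up to whole pages
const pagesFor = (byteCount) => Math.ceil(byteCount / PAGE_SIZE);

const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

// grow() returns the previous size in pages, like memory.grow
const previousPages = memory.grow(pagesFor(1000));
console.log(previousPages);            // 1
console.log(memory.buffer.byteLength); // 131072

// Past the maximum, the JavaScript API throws instead of returning -1
let growError = null;
try {
  memory.grow(1);
} catch (error) {
  growError = error;
}
console.log(growError instanceof RangeError); // true
```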

Data Segments

Initialize memory with data:

;; Active segment (written at instantiation)
(data (i32.const 0) "Hello, World!\00")

;; With offset expression
(data (offset (i32.const 1024)) "\01\02\03\04")

;; Passive segment (copied manually)
(data $mydata "Some data")

;; Copy passive segment
(memory.init $mydata
  (i32.const 0)      ;; Destination address
  (i32.const 0)      ;; Source offset
  (i32.const 9)      ;; Length
)

Tables and Indirect Calls

Table Definition

;; Define table of function references
(table 10 20 funcref)  ;; min 10, max 20 entries

;; Export table
(export "table" (table 0))

Element Segments

Initialize table with functions:

(func $func1 ...)
(func $func2 ...)
(func $func3 ...)

;; Active element (written at instantiation)
(elem (i32.const 0) $func1 $func2 $func3)

;; With offset expression
(elem (offset (i32.const 5)) $func1 $func2)

Indirect Function Calls

;; Call function by table index
call_indirect (type $signature)

;; Example
(type $binop (func (param i32 i32) (result i32)))

(func $add (param i32 i32) (result i32)
  (i32.add (local.get 0) (local.get 1))
)

(func $sub (param i32 i32) (result i32)
  (i32.sub (local.get 0) (local.get 1))
)

(table 2 funcref)
(elem (i32.const 0) $add $sub)

(func $compute (param $op i32) (param $a i32) (param $b i32) (result i32)
  (call_indirect (type $binop)
    (local.get $a)
    (local.get $b)
    (local.get $op)  ;; Table index
  )
)

Global Variables

Immutable Globals

;; Constant global
(global $pi f64 (f64.const 3.14159265359))

;; Get global value
(func $circumference (param $r f64) (result f64)
  (f64.mul
    (f64.mul (local.get $r) (f64.const 2.0))
    (global.get $pi)
  )
)

Mutable Globals

;; Mutable global
(global $counter (mut i32) (i32.const 0))

;; Increment counter
(func $increment
  (global.set $counter
    (i32.add (global.get $counter) (i32.const 1))
  )
)

;; Get counter value
(func $getCount (result i32)
  (global.get $counter)
)

Imported/Exported Globals

;; Import global from JavaScript
(import "env" "maxValue" (global $max i32))

;; Export global to JavaScript
(global $status (mut i32) (i32.const 0))
(export "status" (global $status))
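
On the JavaScript side, imported and exported globals are `WebAssembly.Global` objects. The sketch below creates the kind of objects an import object would carry and shows that mutability is enforced by the API. The variable names reuse those from the WAT above purely for illustration.

```javascript
// A mutable global, as would back (global $status (mut i32) ...)
const status = new WebAssembly.Global({ value: 'i32', mutable: true }, 0);
status.value = 3;
console.log(status.value); // 3

// An immutable global, as would satisfy (import "env" "maxValue" (global $max i32))
const maxValue = new WebAssembly.Global({ value: 'i32', mutable: false }, 100);

let writeRejected = false;
try {
  maxValue.value = 5; // immutable: the setter throws
} catch (error) {
  writeRejected = error instanceof TypeError;
}
console.log(writeRejected);  // true
console.log(maxValue.value); // 100

// Shape of the import object a module with that import would expect
const importObject = { env: { maxValue } };
```

Plain numbers are also accepted for immutable global imports, but passing a `WebAssembly.Global` is required when the module declares the import as mutable.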

Imports and Exports

Importing Functions

;; Import function from JavaScript
(import "env" "log" (func $log (param i32)))
(import "console" "error" (func $error (param i32 i32)))

;; Use imported function
(func $debug (param $x i32)
  (call $log (local.get $x))
)

Exporting Functions

;; Export function to JavaScript
(func $add (param $a i32) (param $b i32) (result i32)
  (i32.add (local.get $a) (local.get $b))
)
(export "add" (func $add))

;; Export with different name
(func $internalName ...)
(export "externalName" (func $internalName))

Importing Memory/Tables

;; Import memory
(import "js" "memory" (memory 1))

;; Import table
(import "env" "table" (table 10 funcref))

;; Import global
(import "env" "timestamp" (global $timestamp i64))

Exporting Memory/Tables

;; Export memory
(memory 1)
(export "memory" (memory 0))

;; Export table
(table 10 funcref)
(export "table" (table 0))

;; Export global
(global $version i32 (i32.const 1))
(export "version" (global $version))

Practical Examples

Example 1: Fibonacci

(module
  ;; Recursive Fibonacci
  (func $fib (param $n i32) (result i32)
    (if (result i32)
      (i32.le_s (local.get $n) (i32.const 1))
      (then
        (local.get $n)
      )
      (else
        (i32.add
          (call $fib (i32.sub (local.get $n) (i32.const 1)))
          (call $fib (i32.sub (local.get $n) (i32.const 2)))
        )
      )
    )
  )
  
  (export "fib" (func $fib))
)

Example 2: String Length

(module
  (memory 1)
  (export "memory" (memory 0))
  
  ;; Calculate length of null-terminated string
  (func $strlen (param $addr i32) (result i32)
    (local $len i32)
    
    (local.set $len (i32.const 0))
    
    (loop $continue
      ;; Load byte at current position
      (if (i32.load8_u
            (i32.add (local.get $addr) (local.get $len)))
        (then
          ;; Not null, increment length
          (local.set $len (i32.add (local.get $len) (i32.const 1)))
          (br $continue)
        )
      )
    )
    
    (local.get $len)
  )
  
  (export "strlen" (func $strlen))
  
  ;; Initialize with test string
  (data (i32.const 0) "Hello, WAT!\00")
)

Example 3: Array Sum

(module
  (memory 1)
  (export "memory" (memory 0))
  
  ;; Sum array of i32 values
  ;; @param addr: start address
  ;; @param length: number of elements
  (func $sumArray (param $addr i32) (param $length i32) (result i32)
    (local $i i32)
    (local $sum i32)
    
    (local.set $i (i32.const 0))
    (local.set $sum (i32.const 0))
    
    (block $done
      (loop $continue
        ;; Exit when i >= length (also handles length == 0)
        (br_if $done
          (i32.ge_u (local.get $i) (local.get $length))
        )
        
        ;; Load element at index i
        (local.set $sum
          (i32.add
            (local.get $sum)
            (i32.load
              (i32.add
                (local.get $addr)
                (i32.mul (local.get $i) (i32.const 4))
              )
            )
          )
        )
        
        ;; i++
        (local.set $i (i32.add (local.get $i) (i32.const 1)))
        
        (br $continue)
      )
    )
    
    (local.get $sum)
  )
  
  (export "sumArray" (func $sumArray))
)

Example 4: Pointer-Based Data Structure

(module
  (memory 1)
  (export "memory" (memory 0))
  
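  ;; Linked list node: [value: i32, next: i32]
  ;; Node size: 8 bytes
  
  ;; Start the heap at 8 so that no node lives at address 0,
  ;; which this module uses as the null pointer
  (global $heapPtr (mut i32) (i32.const 8))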
  
  ;; Allocate node
  (func $allocNode (result i32)
    (local $ptr i32)
    
    (local.set $ptr (global.get $heapPtr))
    (global.set $heapPtr
      (i32.add (global.get $heapPtr) (i32.const 8))
    )
    
    (local.get $ptr)
  )
  
  ;; Create node with value
  (func $createNode (param $value i32) (result i32)
    (local $node i32)
    
    (local.set $node (call $allocNode))
    
    ;; Set value
    (i32.store (local.get $node) (local.get $value))
    
    ;; Set next to null (0)
    (i32.store offset=4 (local.get $node) (i32.const 0))
    
    (local.get $node)
  )
  
  ;; Get value from node
  (func $getValue (param $node i32) (result i32)
    (i32.load (local.get $node))
  )
  
  ;; Get next pointer
  (func $getNext (param $node i32) (result i32)
    (i32.load offset=4 (local.get $node))
  )
  
  ;; Set next pointer
  (func $setNext (param $node i32) (param $next i32)
    (i32.store offset=4 (local.get $node) (local.get $next))
  )
  
  ;; Sum linked list
  (func $sumList (param $head i32) (result i32)
    (local $sum i32)
    (local $current i32)
    
    (local.set $sum (i32.const 0))
    (local.set $current (local.get $head))
    
    (loop $continue
      (if (local.get $current)
        (then
          ;; Add value
          (local.set $sum
            (i32.add (local.get $sum) (call $getValue (local.get $current)))
          )
          
          ;; Move to next
          (local.set $current (call $getNext (local.get $current)))
          
          (br $continue)
        )
      )
    )
    
    (local.get $sum)
  )
  
  (export "createNode" (func $createNode))
  (export "setNext" (func $setNext))
  (export "sumList" (func $sumList))
)

Converting WAT to WASM

Using WABT Tools

# Install WABT (WebAssembly Binary Toolkit)
npm install -g wabt

# Compile WAT to WASM
wat2wasm module.wat -o module.wasm

# Disassemble WASM to WAT
wasm2wat module.wasm -o module.wat

# Validate WASM
wasm-validate module.wasm

# View binary structure
wasm-objdump -x module.wasm

# View disassembly
wasm-objdump -d module.wasm

Using in JavaScript

// Load and instantiate
const response = await fetch('module.wasm');
const { instance } = await WebAssembly.instantiateStreaming(response);

// Call exported function
const result = instance.exports.fib(10);
console.log(result); // 55

Debugging WAT

Adding Debug Information

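(module $MyModule
  ;; The $identifiers are the debug names. Assembling with
  ;; `wat2wasm --debug-names` copies them into the binary's custom
  ;; "name" section, where disassemblers and debuggers can read them.
  
  (func $add (param $a i32) (param $b i32) (result i32)
    ;; Function and local names come from $add, $a and $b
    local.get $a
    local.get $b
    i32.add
  )
)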

Common Debugging Techniques

1. Import console.log:

(import "console" "log" (func $log (param i32)))

(func $debug
  (call $log (i32.const 42))  ;; Log value
)

2. Use unreachable to force a trap at a known point (execution stops there and cannot resume):

(func $test
  ;; code
  unreachable  ;; Trap here
  ;; more code
)

3. Return intermediate values:

(func $compute (param $x i32) (result i32 i32)
  (local $intermediate i32)
  
  (local.set $intermediate (i32.mul (local.get $x) (i32.const 2)))
  
  ;; Return both intermediate and final
  (local.get $intermediate)
  (i32.add (local.get $intermediate) (i32.const 1))
)

Best Practices

1. Use Named Parameters and Locals

;; Good: Named and clear
(func $calculateArea (param $width f64) (param $height f64) (result f64)
  (f64.mul (local.get $width) (local.get $height))
)

;; Bad: Unnamed
(func (param f64) (param f64) (result f64)
  (f64.mul (local.get 0) (local.get 1))
)

2. Add Comments

;; Calculate compound interest
;; Formula: A = P(1 + r)^t
(func $compoundInterest 
  (param $principal f64)  ;; Initial amount
  (param $rate f64)       ;; Annual interest rate
  (param $time i32)       ;; Years
  (result f64)            ;; Final amount
  
  ;; Implementation...
)

3. Use Local Variables for Clarity

;; Good: Clear intent
(func $pythagorean (param $a f64) (param $b f64) (result f64)
  (local $aSquared f64)
  (local $bSquared f64)
  
  (local.set $aSquared (f64.mul (local.get $a) (local.get $a)))
  (local.set $bSquared (f64.mul (local.get $b) (local.get $b)))
  
  (f64.sqrt (f64.add (local.get $aSquared) (local.get $bSquared)))
)

4. Prefer Folded Format for Readability

;; Folded (more readable)
(i32.add
  (i32.mul (local.get $x) (i32.const 2))
  (i32.const 1)
)

;; Linear (harder to read)
local.get $x
i32.const 2
i32.mul
i32.const 1
i32.add

5. Validate Early

# Always validate before deploying (wat2wasm also validates while assembling)
wat2wasm module.wat -o module.wasm
wasm-validate module.wasm

Summary

WebAssembly Text Format (WAT) provides:

1. S-expression syntax:

  • Lisp-like parentheses

  • Explicit nesting

  • Human-readable representation

2. Module structure:

  • Types, imports, functions

  • Memory, tables, globals

  • Exports, initialization

3. Type system:

  • Four numeric types: i32, i64, f32, f64

  • Reference types: funcref, externref

4. Instructions:

  • Stack-based operations

  • Arithmetic, logic, comparison

  • Memory load/store

  • Control flow (block, loop, if)

5. Memory model:

  • Linear memory array

  • Load/store with offsets

  • Data segments for initialization

6. Functions:

  • Named parameters/locals

  • Multiple return values

  • Direct and indirect calls

7. Tooling:

  • WABT for conversion/validation

  • Browser DevTools support

  • Source maps for debugging

Understanding WAT is essential for working with WebAssembly at a low level, debugging compiled output, and building tools that generate WebAssembly code. In the next chapter, we’ll explore how JavaScript and WebAssembly interoperate.


Chapter 15: JavaScript and WebAssembly Interop

Introduction: Bridging Two Worlds

JavaScript-WebAssembly interoperability is the foundation of practical WebAssembly applications. While WebAssembly excels at CPU-intensive computations, it relies on JavaScript for:

  • I/O operations (network, file system, DOM)

  • Complex data structures (objects, arrays, strings)

  • Browser APIs (Canvas, WebGL, Audio)

  • User interaction and event handling

┌────────────────────────────────────────────────────┐
│            JavaScript Host Environment             │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌──────────────┐         ┌──────────────┐         │
│  │  JavaScript  │ ←─────→ │ WebAssembly  │         │
│  │    Engine    │ Interop │    Module    │         │
│  └──────────────┘         └──────────────┘         │
│         ↓                        ↓                 │
│  ┌──────────────┐         ┌──────────────┐         │
│  │  JS Objects  │         │ Linear Memory│         │
│  │  DOM APIs    │         │ (Typed Data) │         │
│  └──────────────┘         └──────────────┘         │
│                                                    │
└────────────────────────────────────────────────────┘

Key Challenge: The fundamental mismatch between JavaScript’s dynamic, garbage-collected objects and WebAssembly’s static, linear memory model.


Loading WebAssembly Modules

Basic Loading Pattern

// Modern async approach (recommended)
async function loadWasm(url) {
  try {
    const response = await fetch(url);
    const { instance, module } = await WebAssembly.instantiateStreaming(response);
    return instance;
  } catch (error) {
    console.error('Failed to load WebAssembly:', error);
    throw error;
  }
}

// Usage
const wasmInstance = await loadWasm('module.wasm');
const result = wasmInstance.exports.add(5, 3);
console.log(result); // 8

Synchronous Loading (Node.js or with ArrayBuffer)

// Node.js
const fs = require('fs');

function loadWasmSync(filepath) {
  const buffer = fs.readFileSync(filepath);
  const module = new WebAssembly.Module(buffer);
  const instance = new WebAssembly.Instance(module);
  return instance;
}

// Browser (after fetching)
async function loadWithArrayBuffer(url) {
  const response = await fetch(url);
  const buffer = await response.arrayBuffer();
  
  // Prefer the async API in browsers: some engines refuse to compile
  // larger modules synchronously on the main thread
  const { instance } = await WebAssembly.instantiate(buffer);
  return instance;
}

With Import Object

async function loadWithImports(url, importObject) {
  const response = await fetch(url);
  const { instance } = await WebAssembly.instantiateStreaming(
    response,
    importObject
  );
  return instance;
}

// Define imports
const importObject = {
  env: {
    log: (value) => console.log('WASM says:', value),
    abort: (msg, file, line, column) => {
      console.error(`Abort at ${file}:${line}:${column} - ${msg}`);
    }
  },
  js: {
    memory: new WebAssembly.Memory({ initial: 1 })
  }
};

const instance = await loadWithImports('module.wasm', importObject);
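
To try the import mechanism without a server or build step, a module can be supplied as inline bytes. The sketch below hand-encodes a tiny module that imports "env" "log" and exports a function that calls it with 42; the export name "run" and all variable names are assumptions of this example, and the synchronous constructors are used only to keep it short.

```javascript
// Hand-encoded binary for:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") (call $log (i32.const 42))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                         // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,             // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01,                                                 // one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,                   // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,             // body: i32.const 42, call 0
]);

const wasmModule = new WebAssembly.Module(bytes);

// Inspect what the module expects before instantiating it
const expectedImports = WebAssembly.Module.imports(wasmModule);
console.log(expectedImports[0].module, expectedImports[0].name, expectedImports[0].kind);
// env log function

// Supply the import: the object is keyed by module name, then field name
const logged = [];
const imports = { env: { log: (value) => logged.push(value) } };
const wasmInstance = new WebAssembly.Instance(wasmModule, imports);

wasmInstance.exports.run();
console.log(logged); // [ 42 ]

// A missing import makes instantiation fail
let instantiationFailed = false;
try {
  new WebAssembly.Instance(wasmModule, {});
} catch (error) {
  instantiationFailed = true;
}
console.log(instantiationFailed); // true
```

`WebAssembly.Module.imports` is a convenient way to discover what an unfamiliar module needs before building its import object.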

Data Type Mapping

Primitive Types

WebAssembly ↔︎ JavaScript type correspondence:

// WebAssembly types → JavaScript types
/*
  i32 → Number (32-bit integer)
  i64 → BigInt
  f32 → Number (32-bit float)
  f64 → Number (64-bit float)

*/

// Example module
const wasmCode = `
(module
  (func $testTypes 
    (param $int i32) 
    (param $bigint i64) 
    (param $float f32) 
    (param $double f64)
    (result f64)
    
    ;; Convert and add all values
    (f64.add
      (f64.add
        (f64.convert_i32_s (local.get $int))
        (f64.convert_i64_s (local.get $bigint))
      )
      (f64.add
        (f64.promote_f32 (local.get $float))
        (local.get $double)
      )
    )
  )
  
  (export "testTypes" (func $testTypes))
)
`;

// JavaScript usage
const instance = await loadWasmFromText(wasmCode);

const result = instance.exports.testTypes(
  10,              // i32
  BigInt(20),      // i64 (must use BigInt!)
  3.14,            // f32
  2.71828          // f64
);

console.log(result); // ≈ 35.85828 (not exact: 3.14 is first rounded to the nearest f32)

Important: i64 parameters and returns require BigInt in JavaScript:

// Correct
wasmFunc(42, BigInt(100));

// TypeError: a Number is not implicitly converted to BigInt
wasmFunc(42, 100);

Type Coercion Table

Wasm Type   JS Input        JS Output    Notes
---------   -------------   ----------   -------------------------
i32         Number          Number       Truncated to 32-bit
i64         BigInt          BigInt       Must use BigInt
f32         Number          Number       Precision loss possible
f64         Number          Number       Full precision
funcref     Function/null   Function     Reference types (MVP+)
externref   Any JS value    Same value   Opaque reference (MVP+)
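
The i32 and f32 rows can be previewed in plain JavaScript, because the boundary conversions match the language's own ToInt32 and `Math.fround` operations. The helper names `toI32` and `toF32` below are assumptions of this sketch.

```javascript
// What an i32 parameter receives: ToInt32 (drop the fraction, wrap modulo 2^32)
const toI32 = (x) => x | 0;

// What an f32 parameter receives: the nearest single-precision value
const toF32 = (x) => Math.fround(x);

console.log(toI32(2 ** 32 + 5)); // 5
console.log(toI32(-1.9));        // -1
console.log(toI32(NaN));         // 0

console.log(toF32(0.5) === 0.5);   // true: exactly representable in f32
console.log(toF32(3.14) === 3.14); // false: rounded to the nearest f32
```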

Memory Management

Accessing Linear Memory

// WAT module with memory
const wasmCode = `
(module
  (memory 1)  ;; 1 page = 64KB
  (export "memory" (memory 0))
  
  ;; Store integer at address
  (func $writeInt (param $addr i32) (param $value i32)
    (i32.store (local.get $addr) (local.get $value))
  )
  
  ;; Read integer from address
  (func $readInt (param $addr i32) (result i32)
    (i32.load (local.get $addr))
  )
  
  (export "writeInt" (func $writeInt))
  (export "readInt" (func $readInt))
)
`;

const instance = await loadWasmFromText(wasmCode);

// Access memory from JavaScript
const memory = instance.exports.memory;
const buffer = memory.buffer;

// Create typed array views
const int32View = new Int32Array(buffer);
const uint8View = new Uint8Array(buffer);
const float64View = new Float64Array(buffer);

// Write via JavaScript
int32View[0] = 42;
int32View[1] = 100;

// Read via WebAssembly
console.log(instance.exports.readInt(0));  // 42
console.log(instance.exports.readInt(4));  // 100

// Write via WebAssembly
instance.exports.writeInt(8, 999);

// Read via JavaScript
console.log(int32View[2]); // 999
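
A frequent source of bugs in this pattern is mixing up byte addresses and typed-array indices: WebAssembly functions take byte addresses, while `Int32Array` is indexed in 4-byte elements. The sketch below uses a standalone `WebAssembly.Memory` and assumes a little-endian host, which covers practically every platform in use.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const int32View = new Int32Array(memory.buffer);
const uint8View = new Uint8Array(memory.buffer);

const byteAddress = 8;                 // what a Wasm function would receive
int32View[byteAddress >> 2] = 999;     // element index = byte address / 4

// 999 is 0x000003E7, stored least significant byte first
console.log(uint8View[8].toString(16), uint8View[9].toString(16)); // e7 3

// Indexing by the byte address reads a different cell (bytes 32..35)
console.log(int32View[byteAddress]);   // 0
```

All views created over `memory.buffer` alias the same bytes, so a write through one view is immediately visible through the others.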

Shared Memory Creation

// JavaScript creates memory, shares with Wasm
const memory = new WebAssembly.Memory({
  initial: 1,    // 1 page (64KB)
  maximum: 10    // Max 10 pages (640KB)
});

const importObject = {
  js: { memory }
};

const wasmCode = `
(module
  (import "js" "memory" (memory 1))
  
  (func $init
    ;; Initialize memory
    (i32.store (i32.const 0) (i32.const 42))
  )
  
  (export "init" (func $init))
)
`;

const instance = await loadWasmFromText(wasmCode, importObject);

// Both JS and Wasm share the same memory
instance.exports.init();

const view = new Int32Array(memory.buffer);
console.log(view[0]); // 42

Memory Growth

const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  ;; Grow memory by n pages
  (func $growMemory (param $pages i32) (result i32)
    (memory.grow (local.get $pages))
  )
  
  (export "growMemory" (func $growMemory))
)
`;

const instance = await loadWasmFromText(wasmCode);
const memory = instance.exports.memory;

console.log('Initial size:', memory.buffer.byteLength); // 65536 (64KB)

// Grow by 2 pages
const oldSize = instance.exports.growMemory(2);
console.log('Previous size (pages):', oldSize); // 1
console.log('New size:', memory.buffer.byteLength); // 196608 (192KB)

// IMPORTANT: buffer reference is now stale!
// Must get new buffer reference after growth
const newBuffer = memory.buffer;
const newView = new Uint8Array(newBuffer);

Critical Warning: After memory.grow(), the old buffer reference becomes detached. Always re-obtain the buffer:

// ❌ Wrong: Stale buffer reference
const oldBuffer = memory.buffer;
const oldView = new Uint8Array(oldBuffer);

instance.exports.growMemory(1);

oldView[0] = 42; // Silently ignored: the buffer is detached and oldView.length is now 0

// ✅ Correct: Get new buffer
function getMemoryView(memory) {
  return new Uint8Array(memory.buffer);
}

let view = getMemoryView(memory);
instance.exports.growMemory(1);
view = getMemoryView(memory); // Refresh reference
view[0] = 42; // OK
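
The detachment can be observed directly with a standalone `WebAssembly.Memory`, no module required. This sketch grows the memory from JavaScript, which has the same effect on existing views as a `memory.grow` executed inside WebAssembly.

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

const staleBuffer = memory.buffer;
const staleView = new Uint8Array(staleBuffer);
console.log(staleView.length);       // 65536

memory.grow(1);

// The old ArrayBuffer is detached: it reports zero length
console.log(staleBuffer.byteLength); // 0
console.log(staleView.length);       // 0

// A fresh view sees the grown memory
const freshView = new Uint8Array(memory.buffer);
console.log(freshView.length);       // 131072
freshView[70000] = 42;               // an address in the new page
console.log(freshView[70000]);       // 42
```

Because a stale view fails quietly rather than loudly, creating views on demand from `memory.buffer` is usually safer than caching them across calls that may allocate.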

String Handling

The String Problem

WebAssembly has no native string type. Strings must be:

  1. Encoded as byte sequences in linear memory

  2. Shared via memory addresses (pointers)

  3. Decoded on the receiving end

JavaScript → WebAssembly (Passing Strings)

// JavaScript side: String encoding utilities
class WasmStringHelper {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.memory = wasmInstance.exports.memory;
    this.encoder = new TextEncoder();
    this.decoder = new TextDecoder('utf-8');
  }
  
  // Get current memory view (refresh after growth)
  getMemoryView() {
    return new Uint8Array(this.memory.buffer);
  }
  
  // Allocate space for string in Wasm memory
  allocateString(str) {
    const bytes = this.encoder.encode(str);
    const len = bytes.length;
    
    // Allocate memory in Wasm (assumes malloc export)
    const ptr = this.instance.exports.malloc(len + 1); // +1 for null terminator
    
    // Copy string bytes
    const view = this.getMemoryView();
    view.set(bytes, ptr);
    view[ptr + len] = 0; // Null terminator
    
    return { ptr, len };
  }
  
  // Read C-style string from memory (null-terminated)
  readCString(ptr) {
    const view = this.getMemoryView();
    let end = ptr;
    
    // Find null terminator (stop at the end of memory if there is none)
    while (end < view.length && view[end] !== 0) end++;
    
    // Decode bytes
    return this.decoder.decode(view.subarray(ptr, end));
  }
  
  // Read string with known length
  readString(ptr, length) {
    const view = this.getMemoryView();
    return this.decoder.decode(view.subarray(ptr, ptr + length));
  }
}

// Example usage
const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  (global $heapPtr (mut i32) (i32.const 0))
  
  ;; Simple malloc
  (func $malloc (param $size i32) (result i32)
    (local $ptr i32)
    (local.set $ptr (global.get $heapPtr))
    (global.set $heapPtr
      (i32.add (global.get $heapPtr) (local.get $size))
    )
    (local.get $ptr)
  )
  
  ;; String length (C-style)
  (func $strlen (param $ptr i32) (result i32)
    (local $len i32)
    (loop $continue
      (if (i32.load8_u (i32.add (local.get $ptr) (local.get $len)))
        (then
          (local.set $len (i32.add (local.get $len) (i32.const 1)))
          (br $continue)
        )
      )
    )
    (local.get $len)
  )
  
  ;; Reverse string in place
  (func $reverseString (param $ptr i32) (param $len i32)
    (local $i i32)
    (local $j i32)
    (local $temp i32)
    
    (local.set $i (i32.const 0))
    (local.set $j (i32.sub (local.get $len) (i32.const 1)))
    
    (loop $continue
      (if (i32.lt_s (local.get $i) (local.get $j))
        (then
          ;; Swap bytes
          (local.set $temp
            (i32.load8_u (i32.add (local.get $ptr) (local.get $i)))
          )
          (i32.store8
            (i32.add (local.get $ptr) (local.get $i))
            (i32.load8_u (i32.add (local.get $ptr) (local.get $j)))
          )
          (i32.store8
            (i32.add (local.get $ptr) (local.get $j))
            (local.get $temp)
          )
          
          (local.set $i (i32.add (local.get $i) (i32.const 1)))
          (local.set $j (i32.sub (local.get $j) (i32.const 1)))
          (br $continue)
        )
      )
    )
  )
  
  (export "malloc" (func $malloc))
  (export "strlen" (func $strlen))
  (export "reverseString" (func $reverseString))
)
`;

const instance = await loadWasmFromText(wasmCode);
const helper = new WasmStringHelper(instance);

// Pass string to WebAssembly
const inputStr = "Hello, WebAssembly!";
const { ptr, len } = helper.allocateString(inputStr);

console.log('String at address:', ptr);
console.log('Length:', len);

// Reverse string in Wasm memory
instance.exports.reverseString(ptr, len);

// Read result
const reversed = helper.readCString(ptr);
console.log('Reversed:', reversed); // "!ylbmessAbeW ,olleH"

WebAssembly → JavaScript (Returning Strings)

// Pattern 1: Return pointer, JS reads from memory
const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  ;; Store greeting in memory
  (data (i32.const 0) "Hello from Wasm!")
  
  ;; Return pointer to string
  (func $getGreeting (result i32)
    (i32.const 0)
  )
  
  (export "getGreeting" (func $getGreeting))
)
`;

const instance = await loadWasmFromText(wasmCode);
const helper = new WasmStringHelper(instance);

const ptr = instance.exports.getGreeting();
const greeting = helper.readCString(ptr);
console.log(greeting); // "Hello from Wasm!"

// Pattern 2: Return pointer + length
const wasmCode2 = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  (global $strPtr i32 (i32.const 0))
  (global $strLen i32 (i32.const 16))
  
  (data (i32.const 0) "Hello from Wasm!")
  
  (func $getStringPtr (result i32)
    (global.get $strPtr)
  )
  
  (func $getStringLen (result i32)
    (global.get $strLen)
  )
  
  (export "getStringPtr" (func $getStringPtr))
  (export "getStringLen" (func $getStringLen))
)
`;

const instance2 = await loadWasmFromText(wasmCode2);
const helper2 = new WasmStringHelper(instance2);

const strPtr = instance2.exports.getStringPtr();
const strLen = instance2.exports.getStringLen();
const message = helper2.readString(strPtr, strLen);
console.log(message); // "Hello from Wasm!"

Advanced: String Builder Pattern

// JavaScript wrapper for string building
class WasmStringBuilder {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.helper = new WasmStringHelper(wasmInstance);
    this.bufferPtr = null;
    this.capacity = 0;
  }
  
  // Initialize buffer
  init(initialCapacity = 256) {
    this.capacity = initialCapacity;
    this.bufferPtr = this.instance.exports.malloc(this.capacity);
  }
  
  // Append string
  append(str) {
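    // Use the UTF-8 byte length returned by allocateString; str.length counts
    // UTF-16 code units and is wrong for non-ASCII text
    const { ptr, len } = this.helper.allocateString(str);
    
    // Call Wasm function to append (stringBuilderAppend and stringBuilderLength
    // are assumed to be exported by the module; they are not defined here)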
    this.instance.exports.stringBuilderAppend(
      this.bufferPtr,
      this.capacity,
      ptr,
      len
    );
  }
  
  // Get result
  toString() {
    const len = this.instance.exports.stringBuilderLength(this.bufferPtr);
    return this.helper.readString(this.bufferPtr, len);
  }
}
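
Every length that crosses into Wasm in these string helpers is a byte count of the UTF-8 encoding, which differs from `str.length` for any non-ASCII text. The following is a minimal sketch of that difference, using a standalone `WebAssembly.Memory` rather than a compiled module; `writeUtf8` is a hypothetical helper introduced only for illustration.

```javascript
// Standalone memory: one 64 KiB page, no compiled module required.
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const decoder = new TextDecoder('utf-8');

// Hypothetical helper: copy a JS string into linear memory at `ptr`
// and return the number of BYTES written.
function writeUtf8(str, ptr) {
  const bytes = encoder.encode(str);
  new Uint8Array(memory.buffer).set(bytes, ptr);
  return bytes.length;
}

const text = 'na\u00efve \u2615'; // contains a 2-byte and a 3-byte character
const byteLen = writeUtf8(text, 0);

console.log(text.length); // 7 UTF-16 code units
console.log(byteLen);     // 10 UTF-8 bytes

// Reading back with the byte length round-trips the whole string.
const view = new Uint8Array(memory.buffer);
const roundTrip = decoder.decode(view.subarray(0, byteLen));
```

Passing `text.length` instead of `byteLen` here would drop the last three bytes, which is why the byte length should travel together with the pointer.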

Complex Data Structures

Arrays

// Passing JavaScript arrays to WebAssembly
class WasmArrayHelper {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.memory = wasmInstance.exports.memory;
  }
  
  // Allocate and copy i32 array
  allocateInt32Array(jsArray) {
    const length = jsArray.length;
    const bytes = length * 4; // 4 bytes per i32
    
    // Allocate
    const ptr = this.instance.exports.malloc(bytes);
    
    // Copy data
    const view = new Int32Array(this.memory.buffer, ptr, length);
    view.set(jsArray);
    
    return { ptr, length };
  }
  
  // Read i32 array from memory
  readInt32Array(ptr, length) {
    const view = new Int32Array(this.memory.buffer, ptr, length);
    return Array.from(view);
  }
  
  // Allocate and copy f64 array
  allocateFloat64Array(jsArray) {
    const length = jsArray.length;
    const bytes = length * 8; // 8 bytes per f64
    
    const ptr = this.instance.exports.malloc(bytes);
    const view = new Float64Array(this.memory.buffer, ptr, length);
    view.set(jsArray);
    
    return { ptr, length };
  }
  
  readFloat64Array(ptr, length) {
    const view = new Float64Array(this.memory.buffer, ptr, length);
    return Array.from(view);
  }
}

// Example: Vector operations
const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  (global $heapPtr (mut i32) (i32.const 0))
  
  (func $malloc (param $size i32) (result i32)
    (local $ptr i32)
    (local.set $ptr (global.get $heapPtr))
    (global.set $heapPtr
      (i32.add (global.get $heapPtr) (local.get $size))
    )
    (local.get $ptr)
  )
  
  ;; Add two f64 arrays element-wise
  (func $addVectors 
    (param $a i32)      ;; Pointer to array A
    (param $b i32)      ;; Pointer to array B
    (param $result i32) ;; Pointer to result array
    (param $length i32) ;; Array length
    
    (local $i i32)
    (local $offset i32)
    
    (loop $continue
      (if (i32.lt_u (local.get $i) (local.get $length))
        (then
          ;; offset = i * 8 (8 bytes per f64)
          (local.set $offset (i32.mul (local.get $i) (i32.const 8)))
          
          ;; result[i] = a[i] + b[i]
          (f64.store
            (i32.add (local.get $result) (local.get $offset))
            (f64.add
              (f64.load (i32.add (local.get $a) (local.get $offset)))
              (f64.load (i32.add (local.get $b) (local.get $offset)))
            )
          )
          
          (local.set $i (i32.add (local.get $i) (i32.const 1)))
          (br $continue)
        )
      )
    )
  )
  
  ;; Dot product of two vectors
  (func $dotProduct
    (param $a i32)
    (param $b i32)
    (param $length i32)
    (result f64)
    
    (local $i i32)
    (local $sum f64)
    (local $offset i32)
    
    (local.set $sum (f64.const 0))
    
    (loop $continue
      (if (i32.lt_u (local.get $i) (local.get $length))
        (then
          (local.set $offset (i32.mul (local.get $i) (i32.const 8)))
          
          ;; sum += a[i] * b[i]
          (local.set $sum
            (f64.add
              (local.get $sum)
              (f64.mul
                (f64.load (i32.add (local.get $a) (local.get $offset)))
                (f64.load (i32.add (local.get $b) (local.get $offset)))
              )
            )
          )
          
          (local.set $i (i32.add (local.get $i) (i32.const 1)))
          (br $continue)
        )
      )
    )
    
    (local.get $sum)
  )
  
  (export "malloc" (func $malloc))
  (export "addVectors" (func $addVectors))
  (export "dotProduct" (func $dotProduct))
)
`;

const instance = await loadWasmFromText(wasmCode);
const arrayHelper = new WasmArrayHelper(instance);

// JavaScript arrays
const vecA = [1.0, 2.0, 3.0, 4.0];
const vecB = [5.0, 6.0, 7.0, 8.0];

// Allocate in Wasm memory
const { ptr: ptrA } = arrayHelper.allocateFloat64Array(vecA);
const { ptr: ptrB } = arrayHelper.allocateFloat64Array(vecB);
const { ptr: ptrResult } = arrayHelper.allocateFloat64Array(new Array(4).fill(0));

// Vector addition
instance.exports.addVectors(ptrA, ptrB, ptrResult, 4);
const sum = arrayHelper.readFloat64Array(ptrResult, 4);
console.log('A + B =', sum); // [6, 8, 10, 12]

// Dot product
const dot = instance.exports.dotProduct(ptrA, ptrB, 4);
console.log('A · B =', dot); // 70
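
The vector example works because every allocation happens to start at a multiple of 8. Typed-array views require a `byteOffset` aligned to the element size, and a simple bump allocator does not guarantee that once odd-sized blocks are mixed in. A hedged sketch of one way to handle it on the JavaScript side; `alignUp` and `allocAligned` are hypothetical names, not part of the helpers above.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });

// Round `ptr` up to the next multiple of `alignment` (a power of two).
function alignUp(ptr, alignment) {
  return (ptr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical JS-side bump allocator that keeps every block aligned.
let heapTop = 0;
function allocAligned(bytes, alignment) {
  const ptr = alignUp(heapTop, alignment);
  heapTop = ptr + bytes;
  return ptr;
}

const small = allocAligned(3, 1);        // 3 bytes at offset 0
const doubles = allocAligned(4 * 8, 8);  // four f64 values, 8-byte aligned

// An unaligned byteOffset is rejected by the typed-array constructor.
let unalignedError = null;
try {
  new Float64Array(memory.buffer, 3, 4);
} catch (err) {
  unalignedError = err; // RangeError
}

const view = new Float64Array(memory.buffer, doubles, 4);
view.set([1.5, 2.5, 3.5, 4.5]);
```

The same rounding could live inside the Wasm-side `malloc` instead; the point is only that one side has to own the alignment rule.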

Structures (Records)

// JavaScript representation of C-like struct
class WasmStructHelper {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.memory = wasmInstance.exports.memory;
  }
  
  getMemoryView() {
    return new DataView(this.memory.buffer);
  }
  
  // Example: Person struct
  // struct Person {
  //   i32 id;       // offset 0
  //   i32 age;      // offset 4
  //   f64 salary;   // offset 8
  // };
  // Total size: 16 bytes
  
  writePerson(ptr, person) {
    const view = this.getMemoryView();
    view.setInt32(ptr, person.id, true);        // Little-endian
    view.setInt32(ptr + 4, person.age, true);
    view.setFloat64(ptr + 8, person.salary, true);
  }
  
  readPerson(ptr) {
    const view = this.getMemoryView();
    return {
      id: view.getInt32(ptr, true),
      age: view.getInt32(ptr + 4, true),
      salary: view.getFloat64(ptr + 8, true)
    };
  }
  
  // Allocate person
  allocatePerson(person) {
    const ptr = this.instance.exports.malloc(16);
    this.writePerson(ptr, person);
    return ptr;
  }
}

// Example usage
const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  (global $heapPtr (mut i32) (i32.const 0))
  
  (func $malloc (param $size i32) (result i32)
    (local $ptr i32)
    (local.set $ptr (global.get $heapPtr))
    (global.set $heapPtr
      (i32.add (global.get $heapPtr) (local.get $size))
    )
    (local.get $ptr)
  )
  
  ;; Calculate bonus (10% of salary)
  (func $calculateBonus (param $personPtr i32) (result f64)
    (f64.mul
      (f64.load offset=8 (local.get $personPtr))  ;; Load salary
      (f64.const 0.1)
    )
  )
  
  ;; Increment age
  (func $incrementAge (param $personPtr i32)
    (i32.store offset=4
      (local.get $personPtr)
      (i32.add
        (i32.load offset=4 (local.get $personPtr))
        (i32.const 1)
      )
    )
  )
  
  (export "malloc" (func $malloc))
  (export "calculateBonus" (func $calculateBonus))
  (export "incrementAge" (func $incrementAge))
)
`;

const instance = await loadWasmFromText(wasmCode);
const structHelper = new WasmStructHelper(instance);

// Create person
const person = {
  id: 101,
  age: 30,
  salary: 75000.0
};

// Allocate in Wasm
const personPtr = structHelper.allocatePerson(person);

// Calculate bonus
const bonus = instance.exports.calculateBonus(personPtr);
console.log('Bonus:', bonus); // 7500

// Increment age
instance.exports.incrementAge(personPtr);

// Read updated person
const updated = structHelper.readPerson(personPtr);
console.log('Updated:', updated); // { id: 101, age: 31, salary: 75000 }
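
The `Person` offsets (0, 4, 8) were written by hand. For larger records it can help to compute them. The sketch below assumes the common C rule of natural alignment (each field aligned to its own size, total size padded to the largest alignment); real compilers and targets may differ, and `layoutStruct` is a hypothetical helper.

```javascript
// Size and natural alignment (in bytes) of each supported field type.
const FIELD_TYPES = {
  u8:  { size: 1, align: 1 },
  i32: { size: 4, align: 4 },
  f64: { size: 8, align: 8 },
};

// Compute C-style offsets: each field is aligned to its natural alignment,
// and the total size is padded to the largest alignment.
function layoutStruct(fields) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const [name, type] of fields) {
    const { size, align } = FIELD_TYPES[type];
    offset = Math.ceil(offset / align) * align; // insert padding if needed
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, align);
  }
  const size = Math.ceil(offset / maxAlign) * maxAlign;
  return { offsets, size };
}

const personLayout = layoutStruct([['id', 'i32'], ['age', 'i32'], ['salary', 'f64']]);
// personLayout.offsets -> { id: 0, age: 4, salary: 8 }, size -> 16

const paddedLayout = layoutStruct([['flag', 'u8'], ['salary', 'f64'], ['id', 'i32']]);
// paddedLayout.offsets -> { flag: 0, salary: 8, id: 16 }, size -> 24
```

The second layout shows why field order matters: a one-byte flag in front of an `f64` costs seven bytes of padding.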

Importing JavaScript Functions

Basic Function Import

const importObject = {
  env: {
    // Simple logging
    log: (value) => {
      console.log('Wasm log:', value);
    },
    
    // Math operations
    randomFloat: () => Math.random(),
    
    getCurrentTime: () => Date.now(),
    
    // Assertions
    assert: (condition) => {
      if (!condition) {
        throw new Error('Assertion failed');
      }
    }
  }
};

const wasmCode = `
(module
  (import "env" "log" (func $log (param i32)))
  (import "env" "randomFloat" (func $randomFloat (result f64)))
  (import "env" "getCurrentTime" (func $getCurrentTime (result f64)))
  (import "env" "assert" (func $assert (param i32)))
  
  (func $test
    ;; Locals must be declared before the first instruction
    (local $rand f64)
    
    ;; Log a value
    (call $log (i32.const 42))
    
    ;; Get random number
    (local.set $rand (call $randomFloat))
    
    ;; Assert it's in range [0, 1)
    (call $assert
      (f64.lt (local.get $rand) (f64.const 1.0))
    )
    
    ;; Get timestamp
    (drop (call $getCurrentTime))
  )
  
  (export "test" (func $test))
)
`;

const instance = await loadWasmFromText(wasmCode, importObject);
instance.exports.test();
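
Imports are resolved when the module is instantiated, not when the imported function is first called. The sketch below shows this with a hand-assembled binary so that it runs without a WAT toolchain; the byte array is an assumption-laden stand-in for a one-import version of the module above, and the synchronous constructors are only appropriate for tiny modules like this one.

```javascript
// Hand-assembled binary equivalent to:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "test") (call $log (i32.const 42))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic, version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func type 0
  0x03, 0x02, 0x01, 0x01,                                     // one function, type 1
  0x07, 0x08, 0x01, 0x04, 0x74, 0x65, 0x73, 0x74, 0x00, 0x01, // export "test" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0
]);

const wasmModule = new WebAssembly.Module(bytes);

// Collect what Wasm logs instead of printing it.
const logged = [];
const instance = new WebAssembly.Instance(wasmModule, {
  env: { log: (value) => logged.push(value) },
});
instance.exports.test();
console.log(logged); // [ 42 ]

// A missing function fails at instantiation time with a LinkError.
let linkError = null;
try {
  new WebAssembly.Instance(wasmModule, { env: {} });
} catch (err) {
  linkError = err;
}
```

Failing early is convenient in practice: a misspelled import name surfaces as a `WebAssembly.LinkError` during startup rather than as a trap deep inside a computation.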

Callback Pattern

// JavaScript provides callback, Wasm calls it
class WasmWithCallbacks {
  constructor() {
    this.callbacks = new Map();
    this.nextId = 0;
  }
  
  createImportObject() {
    return {
      env: {
        // Register callback
        registerCallback: (callbackId) => {
          console.log('Callback registered:', callbackId);
        },
        
        // Invoke callback
        invokeCallback: (callbackId, value) => {
          const callback = this.callbacks.get(callbackId);
          if (callback) {
            return callback(value);
          }
          return 0;
        }
      }
    };
  }
  
  // JavaScript registers callback
  onEvent(callback) {
    const id = this.nextId++;
    this.callbacks.set(id, callback);
    return id;
  }
}

const manager = new WasmWithCallbacks();

const wasmCode = `
(module
  (import "env" "registerCallback" (func $registerCallback (param i32)))
  (import "env" "invokeCallback" (func $invokeCallback (param i32 i32) (result i32)))
  
  ;; Process data with callback
  (func $processWithCallback (param $callbackId i32) (param $data i32) (result i32)
    ;; Do some processing
    (local $processed i32)
    (local.set $processed (i32.mul (local.get $data) (i32.const 2)))
    
    ;; Invoke JavaScript callback
    (call $invokeCallback 
      (local.get $callbackId)
      (local.get $processed)
    )
  )
  
  (export "processWithCallback" (func $processWithCallback))
)
`;

const instance = await loadWasmFromText(
  wasmCode,
  manager.createImportObject()
);

// Register callback
const callbackId = manager.onEvent((value) => {
  console.log('Callback invoked with:', value);
  return value + 10;
});

// Process data
const result = instance.exports.processWithCallback(callbackId, 5);
console.log('Final result:', result); // logs "Callback invoked with: 10", then "Final result: 20"

Exporting to JavaScript

Exporting Functions

const wasmCode = `
(module
  ;; Math utilities
  (func $add (param $a i32) (param $b i32) (result i32)
    (i32.add (local.get $a) (local.get $b))
  )
  
  (func $multiply (param $a f64) (param $b f64) (result f64)
    (f64.mul (local.get $a) (local.get $b))
  )
  
  ;; Export with same name
  (export "add" (func $add))
  
  ;; Export with different name
  (export "mul" (func $multiply))
)
`;

const instance = await loadWasmFromText(wasmCode);

// Access exports
console.log(instance.exports.add(5, 3));      // 8
console.log(instance.exports.mul(2.5, 4.0));  // 10

Exporting Memory

const wasmCode = `
(module
  (memory 2)
  (export "memory" (memory 0))
  
  ;; Initialize with data
  (data (i32.const 0) "WebAssembly")
  
  (func $getDataPtr (result i32)
    (i32.const 0)
  )
  
  (export "getDataPtr" (func $getDataPtr))
)
`;

const instance = await loadWasmFromText(wasmCode);

// Access exported memory
const memory = instance.exports.memory;
const view = new Uint8Array(memory.buffer);

const ptr = instance.exports.getDataPtr();
const decoder = new TextDecoder();
const text = decoder.decode(view.subarray(ptr, ptr + 11));
console.log(text); // "WebAssembly"

Exporting Globals

const wasmCode = `
(module
  ;; Immutable global (constant)
  (global $VERSION i32 (i32.const 100))
  (export "VERSION" (global $VERSION))
  
  ;; Mutable global (state)
  (global $counter (mut i32) (i32.const 0))
  (export "counter" (global $counter))
  
  (func $increment
    (global.set $counter
      (i32.add (global.get $counter) (i32.const 1))
    )
  )
  
  (export "increment" (func $increment))
)
`;

const instance = await loadWasmFromText(wasmCode);

// Read constant
console.log('Version:', instance.exports.VERSION.value); // 100

// Read/write mutable global
console.log('Counter:', instance.exports.counter.value); // 0

instance.exports.increment();
console.log('Counter:', instance.exports.counter.value); // 1

// Set from JavaScript
instance.exports.counter.value = 42;
console.log('Counter:', instance.exports.counter.value); // 42
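
Exported globals are instances of `WebAssembly.Global`, and the same objects can be created directly from JavaScript. A small sketch of their behavior, independent of any module (the variable names are illustrative only):

```javascript
// Globals can be created on the JavaScript side as well.
const version = new WebAssembly.Global({ value: 'i32', mutable: false }, 100);
const counter = new WebAssembly.Global({ value: 'i32', mutable: true }, 0);

counter.value = 41;
counter.value += 1;
const afterIncrement = counter.value; // 42

// Values are converted to the global's type: i32 wraps around.
counter.value = 2 ** 31;
const wrapped = counter.value; // -2147483648

// Writing to an immutable global throws instead of failing silently.
let immutableError = null;
try {
  version.value = 101;
} catch (err) {
  immutableError = err; // TypeError
}
```

The wrap-around is worth remembering when a counter is shared between JavaScript and Wasm: the JavaScript side sees the value after conversion to `i32`, not the number that was assigned.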

Exporting Tables

const wasmCode = `
(module
  (type $binop (func (param i32 i32) (result i32)))
  
  (func $add (param i32 i32) (result i32)
    (i32.add (local.get 0) (local.get 1))
  )
  
  (func $subtract (param i32 i32) (result i32)
    (i32.sub (local.get 0) (local.get 1))
  )
  
  (func $multiply (param i32 i32) (result i32)
    (i32.mul (local.get 0) (local.get 1))
  )
  
  ;; Function table
  (table $ops 3 funcref)
  (elem (i32.const 0) $add $subtract $multiply)
  
  (export "ops" (table $ops))
  
  ;; Indirect call wrapper
  (func $calculate (param $op i32) (param $a i32) (param $b i32) (result i32)
    (call_indirect (type $binop)
      (local.get $a)
      (local.get $b)
      (local.get $op)
    )
  )
  
  (export "calculate" (func $calculate))
)
`;

const instance = await loadWasmFromText(wasmCode);

// Call via table index
console.log(instance.exports.calculate(0, 10, 5)); // 15 (add)
console.log(instance.exports.calculate(1, 10, 5)); // 5  (subtract)
console.log(instance.exports.calculate(2, 10, 5)); // 50 (multiply)

// Access table from JavaScript
const table = instance.exports.ops;
console.log('Table length:', table.length); // 3

// Get function from table
const addFunc = table.get(0);
console.log('Direct call:', addFunc(10, 5)); // 15
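
A table can also be created from JavaScript, which makes two of its rules easy to observe: empty slots read as `null`, and only functions that originate from Wasm may be stored. This is a minimal sketch using the `'anyfunc'` element kind of the JavaScript API; nothing here depends on the module above.

```javascript
const table = new WebAssembly.Table({ element: 'anyfunc', initial: 2 });

const emptySlot = table.get(0);   // null: uninitialized slots hold no function
const oldLength = table.grow(1);  // returns the previous length (2)

// Plain JavaScript functions are not valid table entries.
let plainFunctionError = null;
try {
  table.set(0, (a, b) => a + b);
} catch (err) {
  plainFunctionError = err; // TypeError
}
```

To place JavaScript behavior in a table, the usual route is to import the function into a module and let the module (or its exports) supply the table entry.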

Performance Considerations

Minimizing Boundary Crossings

❌ Bad: Frequent JS ↔︎ Wasm calls:

// Inefficient: Call Wasm for each element
for (let i = 0; i < 10000; i++) {
  result[i] = wasmInstance.exports.process(data[i]);
}

✅ Good: Batch processing:

// Efficient: Single call with array
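// (arrayHelper is a WasmArrayHelper instance, as in the Arrays section)
const { ptr: dataPtr } = arrayHelper.allocateInt32Array(data);
const { ptr: resultPtr } = arrayHelper.allocateInt32Array(new Array(10000).fill(0));

wasmInstance.exports.processArray(dataPtr, resultPtr, 10000);

const result = arrayHelper.readInt32Array(resultPtr, 10000);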

Memory Access Patterns

✅ Use TypedArrays for bulk operations:

// Fast: Direct memory access
const view = new Float64Array(memory.buffer, ptr, length);
for (let i = 0; i < length; i++) {
  view[i] *= 2.0;
}

// Slower: Individual loads/stores through Wasm
for (let i = 0; i < length; i++) {
  const value = instance.exports.load(ptr + i * 8);
  instance.exports.store(ptr + i * 8, value * 2.0);
}

String Encoding Optimization

✅ Cache encoder/decoder instances:

// Good: Reuse encoder/decoder
class StringHelper {
  constructor() {
    this.encoder = new TextEncoder();
    this.decoder = new TextDecoder('utf-8');
  }
}

// Bad: Create new instances each time
function encodeString(str) {
  return new TextEncoder().encode(str); // Wasteful
}
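
Beyond caching the encoder, `TextEncoder.prototype.encodeInto` can write UTF-8 directly into linear memory and skip the intermediate `Uint8Array` that `encode()` allocates. A minimal sketch under the assumption that the caller already owns a buffer of known capacity; `writeStringInto` is a hypothetical helper.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const decoder = new TextDecoder('utf-8');

// Encode `str` straight into linear memory at `ptr`, writing at most
// `capacity` bytes. Returns the number of bytes actually written.
function writeStringInto(str, ptr, capacity) {
  const target = new Uint8Array(memory.buffer, ptr, capacity);
  // `read` counts UTF-16 code units consumed, `written` counts bytes produced.
  const { read, written } = encoder.encodeInto(str, target);
  if (read < str.length) {
    throw new RangeError(`buffer too small: only ${written} bytes fit`);
  }
  return written;
}

const written = writeStringInto('h\u00e9llo', 16, 64);
const back = decoder.decode(new Uint8Array(memory.buffer, 16, written));

// A buffer that is too small is reported instead of silently truncating.
let tooSmall = null;
try {
  writeStringInto('hello', 0, 3);
} catch (err) {
  tooSmall = err;
}
```

A safe upper bound for the capacity is three bytes per UTF-16 code unit, so `str.length * 3` is a common choice when the exact size is not known in advance.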

Avoid Memory Growth During Hot Paths

// Pre-allocate sufficient memory
const memory = new WebAssembly.Memory({
  initial: 100,    // 100 pages x 64 KiB = 6.25 MiB
  maximum: 1000    // 1000 pages x 64 KiB = 62.5 MiB
});

// Growth invalidates all TypedArray views!
// Avoid growing during performance-critical operations
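
The invalidation is easy to observe with a standalone (non-shared) memory. The sketch below is illustrative only: it shows that a view created before `grow()` ends up with length 0, while a view created afterwards sees the old contents plus the new pages.

```javascript
const PAGE_SIZE = 64 * 1024; // bytes per WebAssembly page

const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const staleView = new Uint8Array(memory.buffer);
staleView[0] = 7;

const previousPages = memory.grow(1); // returns the old size in pages

// The old ArrayBuffer is detached: existing views now have length 0.
const staleLength = staleView.length;

// Re-create views from memory.buffer after any call that may grow memory.
const freshView = new Uint8Array(memory.buffer);
const freshLength = freshView.length;
const preserved = freshView[0];
```

This is why the helper classes in this chapter build a new view from `this.memory.buffer` on every call instead of storing one in the constructor.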

Error Handling

Catching Wasm Traps

try {
  // This might trap (divide by zero, out of bounds, etc.)
  const result = instance.exports.divide(10, 0);
} catch (error) {
  if (error instanceof WebAssembly.RuntimeError) {
    console.error('Wasm runtime error:', error.message);
    // Handle trap
  } else {
    throw error;
  }
}
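
To make the trap behavior concrete without a WAT toolchain, the sketch below hand-assembles a module whose only export is `divide`, a thin wrapper around `i32.div_s`. The byte array and the `tryDivide` wrapper are assumptions for illustration; the `divide` export in the snippet above is not shown in this chapter.

```javascript
// Hand-assembled binary equivalent to:
//   (module (func (export "divide") (param i32 i32) (result i32)
//     (i32.div_s (local.get 0) (local.get 1))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic, version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                           // one function, type 0
  0x07, 0x0a, 0x01, 0x06, 0x64, 0x69, 0x76, 0x69, 0x64, 0x65, 0x00, 0x00, // export "divide"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6d, 0x0b, // local.get 0; local.get 1; i32.div_s
]);

const { divide } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;

// Convert traps into result objects; rethrow anything that is not a trap.
function tryDivide(a, b) {
  try {
    return { ok: true, value: divide(a, b) };
  } catch (err) {
    if (err instanceof WebAssembly.RuntimeError) {
      return { ok: false, message: err.message };
    }
    throw err;
  }
}

const fine = tryDivide(10, 2);               // { ok: true, value: 5 }
const byZero = tryDivide(10, 0);             // trap: division by zero
const overflow = tryDivide(-(2 ** 31), -1);  // trap: result does not fit in i32
```

The second trap is easy to overlook: signed division of the minimum `i32` by -1 overflows, so guarding only against a zero divisor is not sufficient. The exact error messages are engine-specific and should not be matched on.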

Custom Error Handling

// Import error handler
const importObject = {
  env: {
    throwError: (code) => {
      const errors = {
        1: 'Invalid input',
        2: 'Out of bounds',
        3: 'Division by zero'
      };
      throw new Error(errors[code] || 'Unknown error');
    }
  }
};

const wasmCode = `
(module
  (import "env" "throwError" (func $throwError (param i32)))
  
  (func $safeDivide (param $a i32) (param $b i32) (result i32)
    ;; Check for division by zero
    (if (i32.eqz (local.get $b))
      (then
        (call $throwError (i32.const 3))
        (unreachable)
      )
    )
    
    (i32.div_s (local.get $a) (local.get $b))
  )
  
  (export "safeDivide" (func $safeDivide))
)
`;

const instance = await loadWasmFromText(wasmCode, importObject);

try {
  instance.exports.safeDivide(10, 0);
} catch (error) {
  console.error('Error:', error.message); // "Division by zero"
}

Advanced Patterns

Async Wasm Operations

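// Wasm calls are synchronous. This wrapper queues jobs and yields to the
// event loop between them; each call still blocks the thread while it runs.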
class AsyncWasmWorker {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.queue = [];
    this.processing = false;
  }
  
  async compute(data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ data, resolve, reject });
      this.processQueue();
    });
  }
  
  async processQueue() {
    if (this.processing || this.queue.length === 0) return;
    
    this.processing = true;
    
    while (this.queue.length > 0) {
      const { data, resolve, reject } = this.queue.shift();
      
      try {
        // Yield to event loop
        await new Promise(r => setTimeout(r, 0));
        
        // Compute in Wasm
        const result = this.instance.exports.heavyComputation(data);
        resolve(result);
      } catch (error) {
        reject(error);
      }
    }
    
    this.processing = false;
  }
}

Web Workers with Wasm

// main.js
const worker = new Worker('wasm-worker.js');

worker.postMessage({
  type: 'init',
  wasmUrl: 'module.wasm'
});

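worker.onmessage = (event) => {
  if (event.data.type === 'ready') {
    // Send work only after the module is instantiated; a message posted
    // earlier would be handled while wasmInstance is still undefined
    worker.postMessage({
      type: 'compute',
      data: [1, 2, 3, 4, 5]
    });
  } else if (event.data.type === 'result') {
    console.log('Result:', event.data.value);
  }
};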

// wasm-worker.js
let wasmInstance;

self.onmessage = async (event) => {
  if (event.data.type === 'init') {
    const response = await fetch(event.data.wasmUrl);
    const { instance } = await WebAssembly.instantiateStreaming(response);
    wasmInstance = instance;
    
    self.postMessage({ type: 'ready' });
  }
  
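  if (event.data.type === 'compute') {
    // Wasm exports accept only numbers, so copy the array into linear
    // memory and pass a pointer plus length. Assumes the module exports
    // memory, malloc, and process(ptr, length) over i32 elements.
    const { memory, malloc, process } = wasmInstance.exports;
    const input = event.data.data;
    const ptr = malloc(input.length * 4);
    new Int32Array(memory.buffer, ptr, input.length).set(input);
    
    const result = process(ptr, input.length);
    
    self.postMessage({
      type: 'result',
      value: result
    });
  }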
};

SIMD Operations (Bonus)

// WebAssembly SIMD support (post-MVP feature)
const wasmCode = `
(module
  (memory 1)
  (export "memory" (memory 0))
  
  ;; SIMD vector addition (4x f32)
  (func $addVec4 (param $a i32) (param $b i32) (param $result i32)
    (v128.store
      (local.get $result)
      (f32x4.add
        (v128.load (local.get $a))
        (v128.load (local.get $b))
      )
    )
  )
  
  (export "addVec4" (func $addVec4))
)
`;

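// Check for SIMD support: WebAssembly has no SIMD property, so validate a tiny
// probe module, () -> v128 with body i32.const 0; i8x16.splat; i8x16.popcnt
const simdProbe = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0,
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11
]);
if (WebAssembly.validate(simdProbe)) {
  // Use SIMD version
} else {
  // Fallback to scalar version
}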

Complete Example: Image Processing

// Image blur using WebAssembly
class ImageProcessor {
  constructor(wasmInstance) {
    this.instance = wasmInstance;
    this.memory = wasmInstance.exports.memory;
  }
  
  // Load image data into Wasm memory
  loadImageData(imageData) {
    const { width, height, data } = imageData;
    const length = data.length;
    
    // Allocate memory
    const ptr = this.instance.exports.malloc(length);
    
    // Copy pixel data
    const view = new Uint8ClampedArray(this.memory.buffer, ptr, length);
    view.set(data);
    
    return { ptr, width, height };
  }
  
  // Blur image
  blur(imageData, radius = 5) {
    const { ptr, width, height } = this.loadImageData(imageData);
    
    // Allocate output buffer
    const outputPtr = this.instance.exports.malloc(imageData.data.length);
    
    // Apply blur in Wasm
    this.instance.exports.boxBlur(
      ptr,
      outputPtr,
      width,
      height,
      radius
    );
    
    // Read result
    const resultView = new Uint8ClampedArray(
      this.memory.buffer,
      outputPtr,
      imageData.data.length
    );
    
    // Create new ImageData
    const result = new ImageData(
      new Uint8ClampedArray(resultView),
      width,
      height
    );
    
    // Free memory
    this.instance.exports.free(ptr);
    this.instance.exports.free(outputPtr);
    
    return result;
  }
}

// Usage with Canvas
const canvas = document.getElementById('myCanvas');
const ctx = canvas.getContext('2d');

// Load Wasm module
const instance = await loadWasm('image-processor.wasm');
const processor = new ImageProcessor(instance);

// Get image data
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

// Process
const blurred = processor.blur(imageData, 5);

// Draw result
ctx.putImageData(blurred, 0, 0);

Summary

JavaScript-WebAssembly interop enables powerful hybrid applications:

Key Concepts:

  1. Loading: WebAssembly.instantiateStreaming() compiles and instantiates a module while it downloads

  2. Type mapping: i32, f32, and f64 map to Number; i64 maps to BigInt

  3. Memory: Shared linear memory via TypedArrays

  4. Strings: Encode/decode via UTF-8 in memory

  5. Arrays: Pass via memory pointers

  6. Structures: Lay out data according to C conventions

  7. Imports: JavaScript functions callable from Wasm

  8. Exports: Functions, memory, globals, tables

Performance Tips:

  • Minimize boundary crossings

  • Batch operations

  • Pre-allocate memory

  • Use TypedArrays for bulk access

  • Cache encoder/decoder instances

Best Practices:

  • Clear ownership of memory allocation

  • Proper error handling

  • Memory growth awareness

  • Use Web Workers for parallelism

The interop layer is where WebAssembly’s computational power meets JavaScript’s rich ecosystem, enabling applications that leverage the strengths of both platforms.


Chapter 16: Building a WebAssembly Compiler

Introduction: Compiling to WebAssembly

Building a compiler that targets WebAssembly involves transforming a high-level source language into the low-level WebAssembly binary format. This chapter demonstrates building a complete compiler for a simple language that generates valid .wasm modules.

Compiler Pipeline Overview:

Source Code
     ↓
┌──────────────────┐
│ Lexical Analysis │ → Tokens
└──────────────────┘
     ↓
┌──────────────────┐
│ Syntax Analysis  │ → AST
└──────────────────┘
     ↓
┌──────────────────┐
│ Semantic Check   │ → Typed AST
└──────────────────┘
     ↓
┌──────────────────┐
│ Code Generation  │ → WAT (Text)
└──────────────────┘
     ↓
┌──────────────────┐
│ WAT → Binary     │ → .wasm file
└──────────────────┘


Target Language: SimpleScript

We’ll compile a simplified language with these features:

Syntax:

// Variables and types
let x: i32 = 42;
let y: f64 = 3.14;

// Functions
fn add(a: i32, b: i32): i32 {
  return a + b;
}

// Control flow
if (x > 10) {
  return 1;
} else {
  return 0;
}

// Loops
while (x < 100) {
  x = x + 1;
}

// Export directive
@export
fn main(): i32 {
  return add(5, 3);
}

Supported Types: i32, i64, f32, f64

Operations: +, -, *, /, %, ==, !=, <, >, <=, >=
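
Each of these operators eventually has to become a typed Wasm instruction, and the choice depends on the operand type. The lookup below is a hypothetical sketch of how a code generator might make that choice; it assumes SimpleScript integers are signed and that `%` is rejected for floats, since Wasm has no floating-point remainder instruction.

```javascript
// Hypothetical lookup for a later code-generation phase:
// maps a SimpleScript operator plus operand type to a Wasm instruction.
const INT_OPS = {
  '+': 'add', '-': 'sub', '*': 'mul', '/': 'div_s', '%': 'rem_s',
  '==': 'eq', '!=': 'ne', '<': 'lt_s', '>': 'gt_s', '<=': 'le_s', '>=': 'ge_s',
};
const FLOAT_OPS = {
  '+': 'add', '-': 'sub', '*': 'mul', '/': 'div',
  '==': 'eq', '!=': 'ne', '<': 'lt', '>': 'gt', '<=': 'le', '>=': 'ge',
};

function wasmInstructionFor(operator, type) {
  const table = type === 'i32' || type === 'i64' ? INT_OPS : FLOAT_OPS;
  // Own-property check so inherited names never count as operators.
  if (!Object.prototype.hasOwnProperty.call(table, operator)) {
    throw new Error(`Operator '${operator}' is not defined for ${type}`);
  }
  return `${type}.${table[operator]}`;
}

console.log(wasmInstructionFor('+', 'i32')); // i32.add
console.log(wasmInstructionFor('<', 'f64')); // f64.lt
console.log(wasmInstructionFor('%', 'i64')); // i64.rem_s
```

Integer division and comparison come in signed and unsigned variants (`div_s`/`div_u`, `lt_s`/`lt_u`), which is why the source language has to commit to one interpretation of its integer types.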


Phase 1: Lexical Analysis

// Token types
const TokenType = {
  // Literals
  NUMBER: 'NUMBER',
  IDENTIFIER: 'IDENTIFIER',
  
  // Keywords
  LET: 'LET',
  FN: 'FN',
  RETURN: 'RETURN',
  IF: 'IF',
  ELSE: 'ELSE',
  WHILE: 'WHILE',
  
  // Types
  I32: 'I32',
  I64: 'I64',
  F32: 'F32',
  F64: 'F64',
  
  // Symbols
  COLON: 'COLON',
  SEMICOLON: 'SEMICOLON',
  COMMA: 'COMMA',
  LPAREN: 'LPAREN',
  RPAREN: 'RPAREN',
  LBRACE: 'LBRACE',
  RBRACE: 'RBRACE',
  EQUALS: 'EQUALS',
  PLUS: 'PLUS',
  MINUS: 'MINUS',
  STAR: 'STAR',
  SLASH: 'SLASH',
  PERCENT: 'PERCENT',
  
  // Comparisons
  EQ_EQ: 'EQ_EQ',
  NOT_EQ: 'NOT_EQ',
  LT: 'LT',
  GT: 'GT',
  LT_EQ: 'LT_EQ',
  GT_EQ: 'GT_EQ',
  
  // Special
  AT: 'AT',
  EXPORT: 'EXPORT',
  
  EOF: 'EOF'
};

class Token {
  constructor(type, value, line, column) {
    this.type = type;
    this.value = value;
    this.line = line;
    this.column = column;
  }
}

class Lexer {
  constructor(source) {
    this.source = source;
    this.pos = 0;
    this.line = 1;
    this.column = 1;
    
    this.keywords = {
      'let': TokenType.LET,
      'fn': TokenType.FN,
      'return': TokenType.RETURN,
      'if': TokenType.IF,
      'else': TokenType.ELSE,
      'while': TokenType.WHILE,
      'i32': TokenType.I32,
      'i64': TokenType.I64,
      'f32': TokenType.F32,
      'f64': TokenType.F64,
      'export': TokenType.EXPORT
    };
  }
  
  current() {
    return this.source[this.pos];
  }
  
  peek(offset = 1) {
    return this.source[this.pos + offset];
  }
  
  advance() {
    const ch = this.current();
    this.pos++;
    
    if (ch === '\n') {
      this.line++;
      this.column = 1;
    } else {
      this.column++;
    }
    
    return ch;
  }
  
  skipWhitespace() {
    while (this.pos < this.source.length) {
      const ch = this.current();
      
      if (ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r') {
        this.advance();
      } else if (ch === '/' && this.peek() === '/') {
        // Skip line comment
        while (this.current() !== '\n' && this.pos < this.source.length) {
          this.advance();
        }
      } else if (ch === '/' && this.peek() === '*') {
        // Skip block comment
        this.advance(); // /
        this.advance(); // *
        
        while (this.pos < this.source.length) {
          if (this.current() === '*' && this.peek() === '/') {
            this.advance(); // *
            this.advance(); // /
            break;
          }
          this.advance();
        }
      } else {
        break;
      }
    }
  }
  
  readNumber() {
    const start = this.pos;
    const startColumn = this.column;
    let isFloat = false;
    
    while (this.pos < this.source.length) {
      const ch = this.current();
      
      if (ch >= '0' && ch <= '9') {
        this.advance();
      } else if (ch === '.' && !isFloat) {
        isFloat = true;
        this.advance();
      } else {
        break;
      }
    }
    
    const value = this.source.substring(start, this.pos);
    return new Token(
      TokenType.NUMBER,
      isFloat ? parseFloat(value) : parseInt(value, 10),
      this.line,
      startColumn
    );
  }
  
  readIdentifier() {
    const start = this.pos;
    const startColumn = this.column;
    
    while (this.pos < this.source.length) {
      const ch = this.current();
      
      if ((ch >= 'a' && ch <= 'z') ||
          (ch >= 'A' && ch <= 'Z') ||
          (ch >= '0' && ch <= '9') ||
          ch === '_') {
        this.advance();
      } else {
        break;
      }
    }
    
    const value = this.source.substring(start, this.pos);
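    // Own-property check: a bare lookup would treat names such as
    // 'constructor' or 'toString' as keywords via Object.prototype
    const type = Object.hasOwn(this.keywords, value)
      ? this.keywords[value]
      : TokenType.IDENTIFIER;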
    
    return new Token(type, value, this.line, startColumn);
  }
  
  nextToken() {
    this.skipWhitespace();
    
    if (this.pos >= this.source.length) {
      return new Token(TokenType.EOF, null, this.line, this.column);
    }
    
    const ch = this.current();
    const column = this.column;
    
    // Numbers
    if (ch >= '0' && ch <= '9') {
      return this.readNumber();
    }
    
    // Identifiers and keywords
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch === '_') {
      return this.readIdentifier();
    }
    
    // Two-character operators
    if (ch === '=' && this.peek() === '=') {
      this.advance();
      this.advance();
      return new Token(TokenType.EQ_EQ, '==', this.line, column);
    }
    
    if (ch === '!' && this.peek() === '=') {
      this.advance();
      this.advance();
      return new Token(TokenType.NOT_EQ, '!=', this.line, column);
    }
    
    if (ch === '<' && this.peek() === '=') {
      this.advance();
      this.advance();
      return new Token(TokenType.LT_EQ, '<=', this.line, column);
    }
    
    if (ch === '>' && this.peek() === '=') {
      this.advance();
      this.advance();
      return new Token(TokenType.GT_EQ, '>=', this.line, column);
    }
    
    // Single-character tokens
    const single = {
      ':': TokenType.COLON,
      ';': TokenType.SEMICOLON,
      ',': TokenType.COMMA,
      '(': TokenType.LPAREN,
      ')': TokenType.RPAREN,
      '{': TokenType.LBRACE,
      '}': TokenType.RBRACE,
      '=': TokenType.EQUALS,
      '+': TokenType.PLUS,
      '-': TokenType.MINUS,
      '*': TokenType.STAR,
      '/': TokenType.SLASH,
      '%': TokenType.PERCENT,
      '<': TokenType.LT,
      '>': TokenType.GT,
      '@': TokenType.AT
    };
    
    if (ch in single) {
      this.advance();
      return new Token(single[ch], ch, this.line, column);
    }
    
    throw new Error(`Unexpected character '${ch}' at ${this.line}:${column}`);
  }
  
  tokenize() {
    const tokens = [];
    
    while (true) {
      const token = this.nextToken();
      tokens.push(token);
      
      if (token.type === TokenType.EOF) break;
    }
    
    return tokens;
  }
}
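
Keyword tables stored in plain objects carry a subtle hazard: property lookups also consult `Object.prototype`, so an identifier that happens to match an inherited name can be misclassified. The sketch below illustrates the problem and one alternative, a `Map`; the three-entry table and `classify` are simplified stand-ins, not the lexer's actual code.

```javascript
// Keyword table as a plain object, in the style of Lexer.keywords.
const keywordObject = { let: 'LET', fn: 'FN', return: 'RETURN' };

// Inherited properties leak through a bare lookup:
const leaked = keywordObject['constructor']; // the Object function, which is truthy

// A Map only ever contains the keys that were put into it.
const keywordMap = new Map(Object.entries(keywordObject));

function classify(word) {
  return keywordMap.get(word) ?? 'IDENTIFIER';
}

console.log(typeof leaked);           // function
console.log(classify('fn'));          // FN
console.log(classify('constructor')); // IDENTIFIER
```

Either a `Map` or an explicit own-property check avoids the problem; which one to use is mostly a matter of taste.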

Phase 2: Syntax Analysis (Parser)

// AST Node types
class ASTNode {
  constructor(type, line, column) {
    this.type = type;
    this.line = line;
    this.column = column;
  }
}

class Program extends ASTNode {
  constructor(declarations) {
    super('Program', 1, 1);
    this.declarations = declarations;
  }
}

class FunctionDeclaration extends ASTNode {
  constructor(name, params, returnType, body, isExport, line, column) {
    super('FunctionDeclaration', line, column);
    this.name = name;
    this.params = params; // [{ name, type }]
    this.returnType = returnType;
    this.body = body;
    this.isExport = isExport;
  }
}

class VariableDeclaration extends ASTNode {
  constructor(name, varType, initializer, line, column) {
    super('VariableDeclaration', line, column);
    this.name = name;
    this.varType = varType;
    this.initializer = initializer;
  }
}

class ReturnStatement extends ASTNode {
  constructor(expression, line, column) {
    super('ReturnStatement', line, column);
    this.expression = expression;
  }
}

class IfStatement extends ASTNode {
  constructor(condition, thenBranch, elseBranch, line, column) {
    super('IfStatement', line, column);
    this.condition = condition;
    this.thenBranch = thenBranch;
    this.elseBranch = elseBranch;
  }
}

class WhileStatement extends ASTNode {
  constructor(condition, body, line, column) {
    super('WhileStatement', line, column);
    this.condition = condition;
    this.body = body;
  }
}

class BlockStatement extends ASTNode {
  constructor(statements, line, column) {
    super('BlockStatement', line, column);
    this.statements = statements;
  }
}

class ExpressionStatement extends ASTNode {
  constructor(expression, line, column) {
    super('ExpressionStatement', line, column);
    this.expression = expression;
  }
}

class BinaryExpression extends ASTNode {
  constructor(operator, left, right, line, column) {
    super('BinaryExpression', line, column);
    this.operator = operator;
    this.left = left;
    this.right = right;
  }
}

class AssignmentExpression extends ASTNode {
  constructor(name, value, line, column) {
    super('AssignmentExpression', line, column);
    this.name = name;
    this.value = value;
  }
}

class CallExpression extends ASTNode {
  constructor(callee, args, line, column) {
    super('CallExpression', line, column);
    this.callee = callee;
    this.args = args;
  }
}

class Identifier extends ASTNode {
  constructor(name, line, column) {
    super('Identifier', line, column);
    this.name = name;
  }
}

class Literal extends ASTNode {
  constructor(value, valueType, line, column) {
    super('Literal', line, column);
    this.value = value;
    this.valueType = valueType; // 'i32', 'f64', etc.
  }
}

class Parser {
  constructor(tokens) {
    this.tokens = tokens;
    this.pos = 0;
  }
  
  current() {
    return this.tokens[this.pos];
  }
  
  peek(offset = 1) {
    return this.tokens[this.pos + offset];
  }
  
  advance() {
    return this.tokens[this.pos++];
  }
  
  expect(type) {
    const token = this.current();
    
    if (token.type !== type) {
      throw new Error(
        `Expected ${type} but got ${token.type} at ${token.line}:${token.column}`
      );
    }
    
    return this.advance();
  }
  
  match(...types) {
    return types.includes(this.current().type);
  }
  
  // Parse entry point
  parse() {
    const declarations = [];
    
    while (this.current().type !== TokenType.EOF) {
      declarations.push(this.parseDeclaration());
    }
    
    return new Program(declarations);
  }
  
  parseDeclaration() {
    // Check for @export annotation
    let isExport = false;
    
    if (this.match(TokenType.AT)) {
      this.advance();
      this.expect(TokenType.EXPORT);
      isExport = true;
    }
    
    if (this.match(TokenType.FN)) {
      return this.parseFunctionDeclaration(isExport);
    }
    
    throw new Error(`Unexpected token at ${this.current().line}:${this.current().column}`);
  }
  
  parseFunctionDeclaration(isExport) {
    const fnToken = this.expect(TokenType.FN);
    const name = this.expect(TokenType.IDENTIFIER).value;
    
    this.expect(TokenType.LPAREN);
    const params = this.parseParameterList();
    this.expect(TokenType.RPAREN);
    
    this.expect(TokenType.COLON);
    const returnType = this.parseType();
    
    const body = this.parseBlockStatement();
    
    return new FunctionDeclaration(
      name,
      params,
      returnType,
      body,
      isExport,
      fnToken.line,
      fnToken.column
    );
  }
  
  parseParameterList() {
    const params = [];
    
    if (this.match(TokenType.RPAREN)) {
      return params;
    }
    
    do {
      const name = this.expect(TokenType.IDENTIFIER).value;
      this.expect(TokenType.COLON);
      const type = this.parseType();
      
      params.push({ name, type });
      
      if (this.match(TokenType.COMMA)) {
        this.advance();
      } else {
        break;
      }
    } while (true);
    
    return params;
  }
  
  parseType() {
    const token = this.advance();
    
    if ([TokenType.I32, TokenType.I64, TokenType.F32, TokenType.F64].includes(token.type)) {
      return token.value;
    }
    
    throw new Error(`Invalid type at ${token.line}:${token.column}`);
  }
  
  parseBlockStatement() {
    const lbrace = this.expect(TokenType.LBRACE);
    const statements = [];
    
    while (!this.match(TokenType.RBRACE) && !this.match(TokenType.EOF)) {
      statements.push(this.parseStatement());
    }
    
    this.expect(TokenType.RBRACE);
    
    return new BlockStatement(statements, lbrace.line, lbrace.column);
  }
  
  parseStatement() {
    if (this.match(TokenType.LET)) {
      return this.parseVariableDeclaration();
    }
    
    if (this.match(TokenType.RETURN)) {
      return this.parseReturnStatement();
    }
    
    if (this.match(TokenType.IF)) {
      return this.parseIfStatement();
    }
    
    if (this.match(TokenType.WHILE)) {
      return this.parseWhileStatement();
    }
    
    if (this.match(TokenType.LBRACE)) {
      return this.parseBlockStatement();
    }
    
    return this.parseExpressionStatement();
  }
  
  parseVariableDeclaration() {
    const letToken = this.expect(TokenType.LET);
    const name = this.expect(TokenType.IDENTIFIER).value;
    this.expect(TokenType.COLON);
    const varType = this.parseType();
    
    let initializer = null;
    
    if (this.match(TokenType.EQUALS)) {
      this.advance();
      initializer = this.parseExpression();
    }
    
    this.expect(TokenType.SEMICOLON);
    
    return new VariableDeclaration(
      name,
      varType,
      initializer,
      letToken.line,
      letToken.column
    );
  }
  
  parseReturnStatement() {
    const returnToken = this.expect(TokenType.RETURN);
    
    let expression = null;
    
    if (!this.match(TokenType.SEMICOLON)) {
      expression = this.parseExpression();
    }
    
    this.expect(TokenType.SEMICOLON);
    
    return new ReturnStatement(expression, returnToken.line, returnToken.column);
  }
  
  parseIfStatement() {
    const ifToken = this.expect(TokenType.IF);
    this.expect(TokenType.LPAREN);
    const condition = this.parseExpression();
    this.expect(TokenType.RPAREN);
    
    const thenBranch = this.parseStatement();
    
    let elseBranch = null;
    
    if (this.match(TokenType.ELSE)) {
      this.advance();
      elseBranch = this.parseStatement();
    }
    
    return new IfStatement(
      condition,
      thenBranch,
      elseBranch,
      ifToken.line,
      ifToken.column
    );
  }
  
  parseWhileStatement() {
    const whileToken = this.expect(TokenType.WHILE);
    this.expect(TokenType.LPAREN);
    const condition = this.parseExpression();
    this.expect(TokenType.RPAREN);
    
    const body = this.parseStatement();
    
    return new WhileStatement(condition, body, whileToken.line, whileToken.column);
  }
  
  parseExpressionStatement() {
    const expression = this.parseExpression();
    this.expect(TokenType.SEMICOLON);
    
    return new ExpressionStatement(
      expression,
      expression.line,
      expression.column
    );
  }
  
  parseExpression() {
    return this.parseAssignment();
  }
  
  parseAssignment() {
    const expr = this.parseComparison();
    
    if (this.match(TokenType.EQUALS)) {
      this.advance();
      const value = this.parseAssignment();
      
      if (expr instanceof Identifier) {
        return new AssignmentExpression(
          expr.name,
          value,
          expr.line,
          expr.column
        );
      }
      
      throw new Error('Invalid assignment target');
    }
    
    return expr;
  }
  
  parseComparison() {
    let left = this.parseAdditive();
    
    while (this.match(TokenType.EQ_EQ, TokenType.NOT_EQ, 
                       TokenType.LT, TokenType.GT,
                       TokenType.LT_EQ, TokenType.GT_EQ)) {
      const operator = this.advance().type;
      const right = this.parseAdditive();
      
      left = new BinaryExpression(
        operator,
        left,
        right,
        left.line,
        left.column
      );
    }
    
    return left;
  }
  
  parseAdditive() {
    let left = this.parseMultiplicative();
    
    while (this.match(TokenType.PLUS, TokenType.MINUS)) {
      const operator = this.advance().type;
      const right = this.parseMultiplicative();
      
      left = new BinaryExpression(
        operator,
        left,
        right,
        left.line,
        left.column
      );
    }
    
    return left;
  }
  
  parseMultiplicative() {
    let left = this.parsePrimary();
    
    while (this.match(TokenType.STAR, TokenType.SLASH, TokenType.PERCENT)) {
      const operator = this.advance().type;
      const right = this.parsePrimary();
      
      left = new BinaryExpression(
        operator,
        left,
        right,
        left.line,
        left.column
      );
    }
    
    return left;
  }
  
  parsePrimary() {
    // Number literal
    if (this.match(TokenType.NUMBER)) {
      const token = this.advance();
      const valueType = Number.isInteger(token.value) ? 'i32' : 'f64';
      
      return new Literal(token.value, valueType, token.line, token.column);
    }
    
    // Identifier or function call
    if (this.match(TokenType.IDENTIFIER)) {
      const token = this.advance();
      
      if (this.match(TokenType.LPAREN)) {
        return this.parseCallExpression(token);
      }
      
      return new Identifier(token.value, token.line, token.column);
    }
    
    // Parenthesized expression
    if (this.match(TokenType.LPAREN)) {
      this.advance();
      const expr = this.parseExpression();
      this.expect(TokenType.RPAREN);
      return expr;
    }
    
    throw new Error(`Unexpected token at ${this.current().line}:${this.current().column}`);
  }
  
  parseCallExpression(nameToken) {
    this.expect(TokenType.LPAREN);
    
    const args = [];
    
    if (!this.match(TokenType.RPAREN)) {
      do {
        args.push(this.parseExpression());
        
        if (this.match(TokenType.COMMA)) {
          this.advance();
        } else {
          break;
        }
      } while (true);
    }
    
    this.expect(TokenType.RPAREN);
    
    return new CallExpression(
      nameToken.value,
      args,
      nameToken.line,
      nameToken.column
    );
  }
}
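
The call chain parseComparison, parseAdditive, parseMultiplicative, parsePrimary is what encodes operator precedence: each level parses the tighter-binding level first, then loops over its own operators, which also makes them left-associative. A reduced sketch of the same idea, with two levels and a parenthesized string in place of AST nodes, might look like this. `parseToString` and its string-array input are assumptions for the example only.

```javascript
// Illustrative two-level precedence parser over an array of string tokens.
// It returns a fully parenthesized string so the grouping is visible.
function parseToString(tokens) {
  let pos = 0;

  function primary() {
    const tok = tokens[pos++];
    if (tok === '(') {
      const inner = additive();
      if (tokens[pos++] !== ')') throw new Error('Expected )');
      return inner;
    }
    if (tok === undefined || !/^[0-9a-z]+$/i.test(tok)) {
      throw new Error(`Unexpected token ${tok}`);
    }
    return tok;
  }

  // Higher precedence: parsed first, so it binds tighter
  function multiplicative() {
    let left = primary();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const right = primary();
      left = `(${left} ${op} ${right})`;
    }
    return left;
  }

  // Lower precedence: its operands are whole multiplicative expressions
  function additive() {
    let left = multiplicative();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const right = multiplicative();
      left = `(${left} ${op} ${right})`;
    }
    return left;
  }

  const result = additive();
  if (pos !== tokens.length) throw new Error('Trailing tokens');
  return result;
}

console.log(parseToString(['1', '+', '2', '*', '3'])); // prints "(1 + (2 * 3))"
console.log(parseToString(['8', '-', '3', '-', '2'])); // prints "((8 - 3) - 2)"
```

The while loop, rather than recursion on the right operand, is what yields left associativity for `8 - 3 - 2`.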

Phase 3: Semantic Analysis

class SemanticAnalyzer {
  constructor(ast) {
    this.ast = ast;
    this.scopes = [new Map()]; // Stack of scopes
    this.currentFunction = null;
    this.errors = [];
  }
  
  analyze() {
    this.visitProgram(this.ast);
    
    if (this.errors.length > 0) {
      throw new Error('Semantic errors:\n' + this.errors.join('\n'));
    }
    
    return this.ast;
  }
  
  error(message, node) {
    this.errors.push(`${message} at ${node.line}:${node.column}`);
  }
  
  pushScope() {
    this.scopes.push(new Map());
  }
  
  popScope() {
    this.scopes.pop();
  }
  
  defineSymbol(name, info, node) {
    const scope = this.scopes[this.scopes.length - 1];
    
    if (scope.has(name)) {
      this.error(`Redeclaration of '${name}'`, node);
    }
    
    scope.set(name, info);
  }
  
  lookupSymbol(name) {
    for (let i = this.scopes.length - 1; i >= 0; i--) {
      if (this.scopes[i].has(name)) {
        return this.scopes[i].get(name);
      }
    }
    
    return null;
  }
  
  visitProgram(node) {
    // First pass: collect all function signatures
    for (const decl of node.declarations) {
      if (decl instanceof FunctionDeclaration) {
        const paramTypes = decl.params.map(p => p.type);
        
        this.defineSymbol(decl.name, {
          kind: 'function',
          paramTypes,
          returnType: decl.returnType
        }, decl);
      }
    }
    
    // Second pass: visit function bodies
    for (const decl of node.declarations) {
      if (decl instanceof FunctionDeclaration) {
        this.visitFunctionDeclaration(decl);
      }
    }
  }
  
  visitFunctionDeclaration(node) {
    this.currentFunction = node;
    this.pushScope();
    
    // Define parameters
    for (const param of node.params) {
      this.defineSymbol(param.name, {
        kind: 'variable',
        type: param.type
      }, node);
    }
    
    this.visitBlockStatement(node.body);
    
    this.popScope();
    this.currentFunction = null;
  }
  
  visitBlockStatement(node) {
    this.pushScope();
    
    for (const stmt of node.statements) {
      this.visitStatement(stmt);
    }
    
    this.popScope();
  }
  
  visitStatement(node) {
    if (node instanceof VariableDeclaration) {
      return this.visitVariableDeclaration(node);
    }
    
    if (node instanceof ReturnStatement) {
      return this.visitReturnStatement(node);
    }
    
    if (node instanceof IfStatement) {
      return this.visitIfStatement(node);
    }
    
    if (node instanceof WhileStatement) {
      return this.visitWhileStatement(node);
    }
    
    if (node instanceof BlockStatement) {
      return this.visitBlockStatement(node);
    }
    
    if (node instanceof ExpressionStatement) {
      return this.visitExpression(node.expression);
    }
  }
  
  visitVariableDeclaration(node) {
    if (node.initializer) {
      const initType = this.visitExpression(node.initializer);
      
      // Check type compatibility
      if (initType !== node.varType) {
        this.error(
          `Type mismatch: expected ${node.varType}, got ${initType}`,
          node
        );
      }
    }
    
    this.defineSymbol(node.name, {
      kind: 'variable',
      type: node.varType
    }, node);
  }
  
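  visitReturnStatement(node) {
    if (node.expression) {
      const exprType = this.visitExpression(node.expression);
      
      if (exprType !== this.currentFunction.returnType) {
        this.error(
          `Return type mismatch: expected ${this.currentFunction.returnType}, got ${exprType}`,
          node
        );
      }
    } else {
      // Every function declares a result type, so a bare `return;` is an error
      this.error(
        `Missing return value: expected ${this.currentFunction.returnType}`,
        node
      );
    }
  }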
  
  visitIfStatement(node) {
    this.visitExpression(node.condition);
    this.visitStatement(node.thenBranch);
    
    if (node.elseBranch) {
      this.visitStatement(node.elseBranch);
    }
  }
  
  visitWhileStatement(node) {
    this.visitExpression(node.condition);
    this.visitStatement(node.body);
  }
  
  visitExpression(node) {
    if (node instanceof Literal) {
      return node.valueType;
    }
    
    if (node instanceof Identifier) {
      const symbol = this.lookupSymbol(node.name);
      
      if (!symbol || symbol.kind !== 'variable') {
        this.error(`Undefined variable '${node.name}'`, node);
        return 'i32'; // Default for error recovery
      }
      
      return symbol.type;
    }
    
    if (node instanceof BinaryExpression) {
      const leftType = this.visitExpression(node.left);
      const rightType = this.visitExpression(node.right);
      
      if (leftType !== rightType) {
        this.error(
          `Type mismatch in binary operation: ${leftType} vs ${rightType}`,
          node
        );
      }
      
      // Comparison operators return i32 (boolean)
      if ([TokenType.EQ_EQ, TokenType.NOT_EQ, TokenType.LT, 
           TokenType.GT, TokenType.LT_EQ, TokenType.GT_EQ].includes(node.operator)) {
        return 'i32';
      }
      
      return leftType;
    }
    
    if (node instanceof AssignmentExpression) {
      const symbol = this.lookupSymbol(node.name);
      
      if (!symbol || symbol.kind !== 'variable') {
        this.error(`Undefined variable '${node.name}'`, node);
        return 'i32';
      }
      
      const valueType = this.visitExpression(node.value);
      
      if (valueType !== symbol.type) {
        this.error(
          `Type mismatch in assignment: expected ${symbol.type}, got ${valueType}`,
          node
        );
      }
      
      return symbol.type;
    }
    
    if (node instanceof CallExpression) {
      const funcSymbol = this.lookupSymbol(node.callee);
      
      if (!funcSymbol || funcSymbol.kind !== 'function') {
        this.error(`Undefined function '${node.callee}'`, node);
        return 'i32';
      }
      
      // Check argument count
      if (node.args.length !== funcSymbol.paramTypes.length) {
        this.error(
          `Expected ${funcSymbol.paramTypes.length} arguments, got ${node.args.length}`,
          node
        );
      }
      
      // Check argument types
      for (let i = 0; i < node.args.length; i++) {
        const argType = this.visitExpression(node.args[i]);
        const expectedType = funcSymbol.paramTypes[i];
        
        if (argType !== expectedType) {
          this.error(
            `Argument ${i + 1} type mismatch: expected ${expectedType}, got ${argType}`,
            node
          );
        }
      }
      
      return funcSymbol.returnType;
    }
    
    return 'i32';
  }
}
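
The scope handling in the analyzer is a stack of maps: definitions go into the innermost map, and lookups walk outward until a match is found. The following stand-alone sketch shows how that produces shadowing. The `ScopeStack` class is a hypothetical reduction of pushScope, popScope, defineSymbol, and lookupSymbol, not a replacement for them.

```javascript
// Minimal scope stack mirroring the analyzer's push/pop/define/lookup pattern.
class ScopeStack {
  constructor() {
    this.scopes = [new Map()]; // index 0 is the global scope
  }

  push() {
    this.scopes.push(new Map());
  }

  pop() {
    this.scopes.pop();
  }

  // Returns false when the name already exists in the innermost scope
  define(name, info) {
    const scope = this.scopes[this.scopes.length - 1];
    if (scope.has(name)) return false;
    scope.set(name, info);
    return true;
  }

  // Searches from the innermost scope outward
  lookup(name) {
    for (let i = this.scopes.length - 1; i >= 0; i--) {
      if (this.scopes[i].has(name)) return this.scopes[i].get(name);
    }
    return null;
  }
}

const scopes = new ScopeStack();
scopes.define('x', { type: 'i32' });

scopes.push();
scopes.define('x', { type: 'f64' }); // shadows the outer x
const innerType = scopes.lookup('x').type;
scopes.pop();

const outerType = scopes.lookup('x').type;
console.log(innerType, outerType); // prints "f64 i32"
```

Redeclaration is only detected within a single map, which is why an inner block may reuse a name from an enclosing scope.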

Phase 4: Code Generation (WAT)

class WATGenerator {
  constructor(ast) {
    this.ast = ast;
    this.output = [];
    this.indent = 0;
    this.localIndex = 0;
    this.locals = new Map();
  }
  
  generate() {
    this.emit('(module');
    this.indent++;
    
    for (const decl of this.ast.declarations) {
      if (decl instanceof FunctionDeclaration) {
        this.generateFunction(decl);
      }
    }
    
    this.indent--;
    this.emit(')');
    
    return this.output.join('\n');
  }
  
  emit(code) {
    const spaces = '  '.repeat(this.indent);
    this.output.push(spaces + code);
  }
  
  generateFunction(node) {
    this.locals = new Map();
    this.localIndex = 0;
    
    // Allocate parameter indices
    for (const param of node.params) {
      this.locals.set(param.name, this.localIndex++);
    }
    
    // Start function
    this.emit(`(func $${node.name}`);
    this.indent++;
    
    // Parameters
    for (const param of node.params) {
      this.emit(`(param $${param.name} ${param.type})`);
    }
    
    // Return type
    if (node.returnType !== 'void') {
      this.emit(`(result ${node.returnType})`);
    }
    
    // Collect local variables
    const localVars = this.collectLocals(node.body);
    
    for (const [name, type] of localVars) {
      this.emit(`(local $${name} ${type})`);
      this.locals.set(name, this.localIndex++);
    }
    
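    // Function body
    this.generateBlockStatement(node.body, false);
    
    // Every path is expected to leave through an explicit (return). The
    // trailing (unreachable) lets the validator accept bodies whose last
    // statement is an if/else or a loop rather than a return.
    this.emit('(unreachable)');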
    
    this.indent--;
    this.emit(')');
    
    // Export if needed
    if (node.isExport) {
      this.emit(`(export "${node.name}" (func $${node.name}))`);
    }
  }
  
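  collectLocals(node) {
    // WebAssembly locals are function-scoped, so gather declarations from
    // every nested statement, including if/else branches and loop bodies
    const locals = [];
    
    if (node instanceof VariableDeclaration) {
      locals.push([node.name, node.varType]);
    } else if (node instanceof BlockStatement) {
      for (const stmt of node.statements) {
        locals.push(...this.collectLocals(stmt));
      }
    } else if (node instanceof IfStatement) {
      locals.push(...this.collectLocals(node.thenBranch));
      if (node.elseBranch) {
        locals.push(...this.collectLocals(node.elseBranch));
      }
    } else if (node instanceof WhileStatement) {
      locals.push(...this.collectLocals(node.body));
    }
    
    return locals;
  }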
  
  generateBlockStatement(node, emitBlock = true) {
    if (emitBlock) {
      this.emit('(block');
      this.indent++;
    }
    
    for (const stmt of node.statements) {
      this.generateStatement(stmt);
    }
    
    if (emitBlock) {
      this.indent--;
      this.emit(')');
    }
  }
  
  generateStatement(node) {
    if (node instanceof VariableDeclaration) {
      return this.generateVariableDeclaration(node);
    }
    
    if (node instanceof ReturnStatement) {
      return this.generateReturnStatement(node);
    }
    
    if (node instanceof IfStatement) {
      return this.generateIfStatement(node);
    }
    
    if (node instanceof WhileStatement) {
      return this.generateWhileStatement(node);
    }
    
    if (node instanceof BlockStatement) {
      return this.generateBlockStatement(node, true);
    }
    
    if (node instanceof ExpressionStatement) {
      this.generateExpression(node.expression);
      this.emit('(drop)'); // Discard expression result
    }
  }
  
  generateVariableDeclaration(node) {
    if (node.initializer) {
      this.generateExpression(node.initializer);
      this.emit(`(local.set $${node.name})`);
    }
  }
  
  generateReturnStatement(node) {
    if (node.expression) {
      this.generateExpression(node.expression);
    }
    
    this.emit('(return)');
  }
  
  generateIfStatement(node) {
    this.emit('(if');
    this.indent++;
    
    // Condition
    this.generateExpression(node.condition);
    
    // Then branch
    this.emit('(then');
    this.indent++;
    this.generateStatement(node.thenBranch);
    this.indent--;
    this.emit(')');
    
    // Else branch
    if (node.elseBranch) {
      this.emit('(else');
      this.indent++;
      this.generateStatement(node.elseBranch);
      this.indent--;
      this.emit(')');
    }
    
    this.indent--;
    this.emit(')');
  }
  
  generateWhileStatement(node) {
    this.emit('(block $break');
    this.indent++;
    
    this.emit('(loop $continue');
    this.indent++;
    
    // Check condition
    this.generateExpression(node.condition);
    this.emit('(i32.eqz)');
    this.emit('(br_if $break)');
    
    // Body
    this.generateStatement(node.body);
    
    // Continue loop
    this.emit('(br $continue)');
    
    this.indent--;
    this.emit(')');
    
    this.indent--;
    this.emit(')');
  }
  
  generateExpression(node) {
    if (node instanceof Literal) {
      const instruction = node.valueType === 'i32' ? 'i32.const' : 
                         node.valueType === 'i64' ? 'i64.const' :
                         node.valueType === 'f32' ? 'f32.const' : 'f64.const';
      
      this.emit(`(${instruction} ${node.value})`);
    } else if (node instanceof Identifier) {
      this.emit(`(local.get $${node.name})`);
    } else if (node instanceof BinaryExpression) {
      this.generateBinaryExpression(node);
    } else if (node instanceof AssignmentExpression) {
      this.generateExpression(node.value);
      this.emit(`(local.set $${node.name})`);
      this.emit(`(local.get $${node.name})`); // Assignment is an expression
    } else if (node instanceof CallExpression) {
      // Push arguments
      for (const arg of node.args) {
        this.generateExpression(arg);
      }
      
      this.emit(`(call $${node.callee})`);
    }
  }
  
  generateBinaryExpression(node) {
    this.generateExpression(node.left);
    this.generateExpression(node.right);
    
    // Determine type (assume left operand type)
    const type = this.getExpressionType(node.left);
    
    const opMap = {
      [TokenType.PLUS]: `${type}.add`,
      [TokenType.MINUS]: `${type}.sub`,
      [TokenType.STAR]: `${type}.mul`,
      [TokenType.SLASH]: type.startsWith('i') ? `${type}.div_s` : `${type}.div`,
      [TokenType.PERCENT]: `${type}.rem_s`, // integer types only: WebAssembly has no float remainder
      [TokenType.EQ_EQ]: `${type}.eq`,
      [TokenType.NOT_EQ]: `${type}.ne`,
      [TokenType.LT]: type.startsWith('i') ? `${type}.lt_s` : `${type}.lt`,
      [TokenType.GT]: type.startsWith('i') ? `${type}.gt_s` : `${type}.gt`,
      [TokenType.LT_EQ]: type.startsWith('i') ? `${type}.le_s` : `${type}.le`,
      [TokenType.GT_EQ]: type.startsWith('i') ? `${type}.ge_s` : `${type}.ge`
    };
    
    this.emit(`(${opMap[node.operator]})`);
  }
  
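  getExpressionType(node) {
    if (node instanceof Literal) {
      return node.valueType;
    }
    
    if (node instanceof BinaryExpression) {
      // Comparisons always produce an i32; arithmetic keeps its operand type
      const isComparison = [TokenType.EQ_EQ, TokenType.NOT_EQ, TokenType.LT,
                            TokenType.GT, TokenType.LT_EQ, TokenType.GT_EQ].includes(node.operator);
      return isComparison ? 'i32' : this.getExpressionType(node.left);
    }
    
    // Identifiers, assignments, and calls fall back to i32 here. Supporting
    // other types needs the types resolved during semantic analysis.
    return 'i32';
  }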
}
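
Choosing the right instruction prefix (i32.add versus f64.add) requires the operand type of every expression, not only of literals. One way to get it, sketched here under stated assumptions, is to resolve types from a map of declared locals and a map of function return types. The nodes are plain objects with a `kind` field and the `inferType` name is invented for this example; the chapter's AST uses classes instead.

```javascript
// Sketch: resolve the WebAssembly value type that an expression produces.
// `localTypes` maps local/parameter names to types; `functionTypes` maps
// function names to their return types. Both are assumptions of this sketch.
const COMPARISONS = new Set(['==', '!=', '<', '>', '<=', '>=']);

function inferType(node, localTypes, functionTypes = new Map()) {
  switch (node.kind) {
    case 'Literal':
      return node.valueType;

    case 'Identifier':
    case 'Assignment': {
      const type = localTypes.get(node.name);
      if (!type) throw new Error(`Unknown local '${node.name}'`);
      return type;
    }

    case 'Binary':
      // A comparison yields an i32 flag; arithmetic keeps its operand type
      return COMPARISONS.has(node.operator)
        ? 'i32'
        : inferType(node.left, localTypes, functionTypes);

    case 'Call': {
      const type = functionTypes.get(node.callee);
      if (!type) throw new Error(`Unknown function '${node.callee}'`);
      return type;
    }

    default:
      throw new Error(`Unsupported node kind '${node.kind}'`);
  }
}

const locals = new Map([['x', 'f64'], ['n', 'i32']]);
const product = {
  kind: 'Binary',
  operator: '*',
  left: { kind: 'Identifier', name: 'x' },
  right: { kind: 'Literal', valueType: 'f64' }
};

console.log(inferType(product, locals)); // prints "f64"
```

For a comparison, the instruction prefix comes from the operands rather than the result, so a generator would call this on the left operand to pick between, say, f64.lt and i32.lt_s.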

Phase 5: Binary Generation

const wabt = require('wabt')(); // WebAssembly Binary Toolkit

class Compiler {
  compile(source) {
    try {
      // Lexical analysis
      const lexer = new Lexer(source);
      const tokens = lexer.tokenize();
      
      console.log('✓ Lexical analysis complete');
      
      // Syntax analysis
      const parser = new Parser(tokens);
      const ast = parser.parse();
      
      console.log('✓ Syntax analysis complete');
      
      // Semantic analysis
      const analyzer = new SemanticAnalyzer(ast);
      analyzer.analyze();
      
      console.log('✓ Semantic analysis complete');
      
      // Code generation
      const generator = new WATGenerator(ast);
      const wat = generator.generate();
      
      console.log('✓ Code generation complete');
      console.log('\nGenerated WAT:\n');
      console.log(wat);
      
      return wat;
    } catch (error) {
      console.error('Compilation error:', error.message);
      throw error;
    }
  }
  
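  async compileToWasm(source) {
    const wat = this.compile(source);
    
    // Recent wabt.js releases return a Promise from require('wabt')();
    // awaiting a value that is not a Promise is harmless
    const wabtApi = await wabt;
    
    // Convert WAT to WASM binary
    const wasmModule = wabtApi.parseWat('module.wat', wat);
    const { buffer } = wasmModule.toBinary({});
    
    return buffer;
  }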
}

Complete Example

// Example source code
const source = `
@export
fn fibonacci(n: i32): i32 {
  if (n <= 1) {
    return n;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

@export
fn factorial(n: i32): i32 {
  let result: i32 = 1;
  let i: i32 = 2;
  
  while (i <= n) {
    result = result * i;
    i = i + 1;
  }
  
  return result;
}

@export
fn sumSquares(a: i32, b: i32): i32 {
  return (a * a) + (b * b);
}
`;

// Compile
const compiler = new Compiler();

async function main() {
  try {
    const wasmBuffer = await compiler.compileToWasm(source);
    
    // Save to file (Node.js)
    const fs = require('fs');
    fs.writeFileSync('output.wasm', wasmBuffer);
    
    console.log('\n✓ Binary written to output.wasm');
    
    // Load and test
    const wasmModule = await WebAssembly.compile(wasmBuffer);
    const instance = await WebAssembly.instantiate(wasmModule);
    
    console.log('\nTesting compiled functions:');
    console.log('fibonacci(10) =', instance.exports.fibonacci(10));
    console.log('factorial(5) =', instance.exports.factorial(5));
    console.log('sumSquares(3, 4) =', instance.exports.sumSquares(3, 4));
    
  } catch (error) {
    console.error('Error:', error);
  }
}

main();

Output:

✓ Lexical analysis complete
✓ Syntax analysis complete
✓ Semantic analysis complete
✓ Code generation complete

Generated WAT:

(module
  (func $fibonacci (param $n i32) (result i32)
    (if
      (i32.le_s
        (local.get $n)
        (i32.const 1)
      )
      (then
        (local.get $n)
        (return)
      )
      (else
        (i32.add
          (call $fibonacci
            (i32.sub
              (local.get $n)
              (i32.const 1)
            )
          )
          (call $fibonacci
            (i32.sub
              (local.get $n)
              (i32.const 2)
            )
          )
        )
        (return)
      )
    )
  )
  (export "fibonacci" (func $fibonacci))
  …
)

✓ Binary written to output.wasm

Testing compiled functions:
fibonacci(10) = 55
factorial(5) = 120
sumSquares(3, 4) = 25


Optimization Passes (Bonus)

class Optimizer {
  optimize(ast) {
    ast = this.constantFolding(ast);
    ast = this.deadCodeElimination(ast);
    return ast;
  }
  
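  constantFolding(node) {
    if (node instanceof BinaryExpression) {
      const left = this.constantFolding(node.left);
      const right = this.constantFolding(node.right);
      
      // Both operands are constants of the same type
      if (left instanceof Literal && right instanceof Literal &&
          left.valueType === right.valueType) {
        const result = this.evaluateBinaryOp(
          node.operator,
          left.value,
          right.value,
          left.valueType
        );
        
        // Fold only when the operator is supported and the result is defined
        if (result !== undefined) {
          return new Literal(result, left.valueType, node.line, node.column);
        }
      }
      
      node.left = left;
      node.right = right;
    }
    
    // Recursively optimize children...
    
    return node;
  }
  
  evaluateBinaryOp(operator, left, right, valueType = 'i32') {
    // i32 results wrap to 32 bits, matching what the generated code would do
    const isI32 = valueType === 'i32';
    
    switch (operator) {
      case TokenType.PLUS: return isI32 ? (left + right) | 0 : left + right;
      case TokenType.MINUS: return isI32 ? (left - right) | 0 : left - right;
      case TokenType.STAR: return isI32 ? Math.imul(left, right) : left * right;
      case TokenType.SLASH:
        if (!isI32) return left / right;
        // i32.div_s truncates toward zero and traps on a zero divisor,
        // so division by zero is left unfolded
        return right === 0 ? undefined : Math.trunc(left / right) | 0;
      // ... other operators
      default: return undefined; // not folded
    }
  }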
  
  deadCodeElimination(node) {
    // Remove unreachable code after return statements
    // ...
    return node;
  }
}
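
The deadCodeElimination stub above leaves its body open. As one possible direction, and only as a sketch, statements that follow a return in the same block can be dropped because control never reaches them. The example uses plain objects with a `type` field and a hypothetical `removeUnreachable` helper, so it runs without the AST classes.

```javascript
// Sketch: drop statements that follow a return within the same block.
// Nested blocks are cleaned recursively; if/while bodies are left alone here.
function removeUnreachable(statements) {
  const result = [];

  for (const stmt of statements) {
    if (stmt.type === 'BlockStatement') {
      result.push({ ...stmt, statements: removeUnreachable(stmt.statements) });
    } else {
      result.push(stmt);
    }

    // Nothing after a return in this block can execute
    if (stmt.type === 'ReturnStatement') break;
  }

  return result;
}

const body = [
  { type: 'VariableDeclaration', name: 'x' },
  { type: 'ReturnStatement' },
  { type: 'ExpressionStatement' } // unreachable
];

console.log(removeUnreachable(body).map(s => s.type).join(', '));
// prints "VariableDeclaration, ReturnStatement"
```

A fuller pass would also treat a nested block or an if/else whose branches all return as terminating, which this sketch deliberately does not attempt.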

Summary

Building a WebAssembly compiler involves:

  1. Lexical Analysis: Tokenize source code

  2. Syntax Analysis: Build AST from tokens

  3. Semantic Analysis: Type checking, symbol resolution

  4. Code Generation: Emit WAT (WebAssembly Text)

  5. Binary Generation: Convert WAT to .wasm binary

Key Concepts:

  • S-expressions: WAT uses Lisp-like syntax

  • Type system: Explicit types (i32, i64, f32, f64)

  • Stack machine: Expressions leave results on stack

  • Structured control flow: block, loop, if

  • Local variables: Indexed, typed locals
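
To make the stack-machine point concrete, here is a small interpreter for four i32 instructions. It is a teaching sketch with an invented `runStack` function and an array-of-pairs instruction format, and it covers none of WebAssembly's validation rules.

```javascript
// Tiny stack-machine interpreter for a handful of i32 instructions.
// Each instruction is [opcode] or [opcode, immediate].
function runStack(instructions) {
  const stack = [];

  for (const [op, arg] of instructions) {
    if (op === 'i32.const') {
      stack.push(arg | 0);
      continue;
    }

    // Binary operators pop the right operand first, then the left
    const right = stack.pop();
    const left = stack.pop();
    if (left === undefined || right === undefined) {
      throw new Error(`Stack underflow at ${op}`);
    }

    switch (op) {
      case 'i32.add': stack.push((left + right) | 0); break;
      case 'i32.sub': stack.push((left - right) | 0); break;
      case 'i32.mul': stack.push(Math.imul(left, right)); break;
      default: throw new Error(`Unknown instruction ${op}`);
    }
  }

  return stack;
}

// (a * a) + (b * b) with a = 3, b = 4, in evaluation order
const program = [
  ['i32.const', 3], ['i32.const', 3], ['i32.mul'],
  ['i32.const', 4], ['i32.const', 4], ['i32.mul'],
  ['i32.add']
];

console.log(runStack(program)); // prints [ 25 ]
```

Each operator consumes its operands from the stack and leaves exactly one result behind, which is the same discipline the generated WAT relies on.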

Tools:

  • WABT (WebAssembly Binary Toolkit): wat2wasm, wasm2wat

  • Binaryen: Optimization and validation

  • Emscripten: C/C++ to WebAssembly

This foundation enables creating domain-specific languages, transpilers, or optimizing compilers targeting WebAssembly!


Chapter 17: WebAssembly System Interface (WASI)

Introduction: Beyond the Browser

WebAssembly was designed for the web, but its portability and safety make it attractive for standalone applications outside browsers. However, the WebAssembly specification deliberately avoids defining system APIs (file I/O, networking, environment access) to remain platform-agnostic.

The Problem:

  • Browser: Rich APIs (DOM, fetch, WebGL) but sandboxed

  • Server/CLI: No standard APIs—each runtime invents its own

  • Result: Portability broken outside browsers

WASI (WebAssembly System Interface) solves this by providing:

  1. Standardized system calls (POSIX-like APIs)

  2. Capability-based security (explicit permissions)

  3. Cross-platform compatibility (Windows, Linux, macOS)

  4. Language-agnostic (works from C, Rust, Go, etc.)


WASI Design Principles

1. Capability-Based Security

Traditional POSIX: Ambient authority (any code can access any file if OS permits)

// Traditional C - can access ANY file
FILE* f = fopen("/etc/passwd", "r");  // OS decides access

WASI: Explicit capabilities (must be granted by host)

// WASI - file descriptor must be pre-opened by host
// Application can ONLY access what was granted
__wasi_fd_t fd;
__wasi_errno_t err = __wasi_path_open(preopened_dir, ..., "data.txt", ..., &fd);

Key Concept: The host grants specific directories/resources. The WASM module cannot escape the sandbox.
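
From the host's side, the rule can be pictured as a path check: every guest path is resolved relative to a directory the host pre-opened, and anything that would climb above it is refused. The sketch below is a simplified illustration in JavaScript. `resolveInPreopen` is a hypothetical helper, and real runtimes must additionally deal with symlinks and platform-specific path rules.

```javascript
// Sketch of a host-side check: resolve a guest path against a pre-opened
// directory and refuse anything that would escape it.
function resolveInPreopen(preopenRoot, guestPath) {
  if (guestPath.startsWith('/')) {
    throw new Error('Absolute paths are not permitted');
  }

  const parts = [];
  for (const segment of guestPath.split('/')) {
    if (segment === '' || segment === '.') continue;

    if (segment === '..') {
      // Climbing above the granted directory is the escape we must block
      if (parts.length === 0) {
        throw new Error('Path escapes the pre-opened directory');
      }
      parts.pop();
    } else {
      parts.push(segment);
    }
  }

  return [preopenRoot.replace(/\/+$/, ''), ...parts].join('/');
}

console.log(resolveInPreopen('/sandbox', 'logs/../data.txt')); // prints "/sandbox/data.txt"

try {
  resolveInPreopen('/sandbox', '../etc/passwd');
} catch (err) {
  console.log(err.message); // prints "Path escapes the pre-opened directory"
}
```

The guest never names an absolute location; it can only describe a path inside something it was handed, which is the essence of the capability model.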

2. Virtualization

WASI virtualizes OS concepts:

  • File descriptors (stdin, stdout, stderr, files, sockets)

  • Clocks (monotonic, realtime)

  • Random data (secure entropy)

  • Environment variables

  • Command-line arguments

This allows WASM modules to be portable across different host environments.


WASI Preview 1 (Current Stable)

Core API Functions

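In the C bindings, WASI functions follow the naming pattern __wasi_* and return errno-style error codes. At the WebAssembly level the same functions are imported from the wasi_snapshot_preview1 module under unprefixed names such as fd_write.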

File Descriptors

// Read from a file descriptor
__wasi_errno_t __wasi_fd_read(
    __wasi_fd_t fd,               // File descriptor
    const __wasi_iovec_t *iovs,   // I/O vectors (scatter-gather)
    size_t iovs_len,              // Number of vectors
    __wasi_size_t *nread          // Bytes actually read
);

// Write to a file descriptor
__wasi_errno_t __wasi_fd_write(
    __wasi_fd_t fd,
    const __wasi_ciovec_t *iovs,
    size_t iovs_len,
    __wasi_size_t *nwritten
);

// Close a file descriptor
__wasi_errno_t __wasi_fd_close(__wasi_fd_t fd);

// Seek within a file
__wasi_errno_t __wasi_fd_seek(
    __wasi_fd_t fd,
    __wasi_filedelta_t offset,
    __wasi_whence_t whence,
    __wasi_filesize_t *newoffset
);
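
On the host side, a call such as fd_write arrives as four 32-bit integers, and the I/O vectors have to be decoded from the module's linear memory. The sketch below shows that decoding in JavaScript, assuming the wasm32 layout in which each iovec is a 4-byte buffer pointer followed by a 4-byte length. `makeFdWrite` and its `sink` callback are names invented for this example.

```javascript
// Sketch of a host-side fd_write for a wasm32 guest.
// getMemory returns the guest's WebAssembly.Memory; sink receives each chunk.
function makeFdWrite(getMemory, sink) {
  return function fd_write(fd, iovsPtr, iovsLen, nwrittenPtr) {
    const memory = getMemory();
    const view = new DataView(memory.buffer);
    let written = 0;

    for (let i = 0; i < iovsLen; i++) {
      // Each iovec: little-endian u32 buffer pointer, then u32 buffer length
      const base = view.getUint32(iovsPtr + i * 8, true);
      const len = view.getUint32(iovsPtr + i * 8 + 4, true);

      sink(fd, new Uint8Array(memory.buffer, base, len));
      written += len;
    }

    // Report the byte count through the out-pointer; 0 means success
    view.setUint32(nwrittenPtr, written, true);
    return 0;
  };
}

// Usage: lay out one iovec by hand and call the host function directly
const memory = new WebAssembly.Memory({ initial: 1 });
const text = new TextEncoder().encode('Hi!\n');
new Uint8Array(memory.buffer).set(text, 8);   // string at offset 8

const setup = new DataView(memory.buffer);
setup.setUint32(0, 8, true);                  // iov.buf = 8
setup.setUint32(4, text.length, true);        // iov.buf_len = 4

let captured = '';
const fdWrite = makeFdWrite(() => memory, (fd, chunk) => {
  captured += new TextDecoder().decode(chunk);
});

const errno = fdWrite(1, 0, 1, 24);
console.log(errno, JSON.stringify(captured)); // prints 0 "Hi!\n"
```

Passing the memory through a getter rather than a fixed reference matters in practice, because the underlying buffer is replaced whenever the guest grows its memory.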

Path Operations

// Open a file relative to a directory
__wasi_errno_t __wasi_path_open(
    __wasi_fd_t dirfd,                    // Pre-opened directory
    __wasi_lookupflags_t dirflags,
    const char *path,
    size_t path_len,
    __wasi_oflags_t oflags,               // O_CREAT, O_TRUNC, etc.
    __wasi_rights_t fs_rights_base,
    __wasi_rights_t fs_rights_inheriting,
    __wasi_fdflags_t fs_flags,
    __wasi_fd_t *fd
);

// Create a directory
__wasi_errno_t __wasi_path_create_directory(
    __wasi_fd_t fd,
    const char *path,
    size_t path_len
);

// Remove a file
__wasi_errno_t __wasi_path_unlink_file(
    __wasi_fd_t fd,
    const char *path,
    size_t path_len
);

// Get file metadata
__wasi_errno_t __wasi_path_filestat_get(
    __wasi_fd_t fd,
    __wasi_lookupflags_t flags,
    const char *path,
    size_t path_len,
    __wasi_filestat_t *buf
);

Environment & Args

// Get size of environment variables
__wasi_errno_t __wasi_environ_sizes_get(
    __wasi_size_t *environ_count,
    __wasi_size_t *environ_buf_size
);

// Get environment variables
__wasi_errno_t __wasi_environ_get(
    uint8_t **environ,
    uint8_t *environ_buf
);

// Get command-line argument sizes
__wasi_errno_t __wasi_args_sizes_get(
    __wasi_size_t *argc,
    __wasi_size_t *argv_buf_size
);

// Get command-line arguments
__wasi_errno_t __wasi_args_get(
    uint8_t **argv,
    uint8_t *argv_buf
);
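
The size/get pairs above imply a two-call protocol: the guest first asks how much space it needs, then passes buffers of that size. The following is a minimal host-side sketch in JavaScript, assuming a 32-bit guest and a standalone WebAssembly.Memory; the argument list, offsets, and variable names are illustrative, not part of any runtime's API.

```javascript
// Hypothetical host state: one page of guest memory and three arguments.
const memory = new WebAssembly.Memory({ initial: 1 });
const args = ["env.wasm", "arg1", "arg2"];
const encoded = args.map((a) => new TextEncoder().encode(a + "\0"));

// First call: report the argument count and the total bytes needed.
function args_sizes_get(argcPtr, bufSizePtr) {
    const view = new DataView(memory.buffer);
    const total = encoded.reduce((n, bytes) => n + bytes.length, 0);
    view.setUint32(argcPtr, encoded.length, true);
    view.setUint32(bufSizePtr, total, true);
    return 0; // errno 0 = success
}

// Second call: fill the pointer array (argv) and the string buffer (argv_buf).
function args_get(argvPtr, argvBufPtr) {
    const view = new DataView(memory.buffer);
    const bytes = new Uint8Array(memory.buffer);
    let cursor = argvBufPtr;
    encoded.forEach((arg, i) => {
        view.setUint32(argvPtr + i * 4, cursor, true); // 32-bit guest pointer
        bytes.set(arg, cursor);
        cursor += arg.length;
    });
    return 0;
}

// Simulate the guest: query sizes, then pass buffers at offsets 16 and 64.
args_sizes_get(0, 4);
const sizes = new DataView(memory.buffer);
const argc = sizes.getUint32(0, true);
const bufSize = sizes.getUint32(4, true);
args_get(16, 64);
```

environ_sizes_get and environ_get follow the same shape, with "KEY=value" strings in place of arguments.
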
Clock & Time
// Get current time
__wasi_errno_t __wasi_clock_time_get(
    __wasi_clockid_t clock_id,    // REALTIME or MONOTONIC
    __wasi_timestamp_t precision,
    __wasi_timestamp_t *time
);

// Wait for events (clock timeouts, fd readiness); also used to implement sleep
__wasi_errno_t __wasi_poll_oneoff(
    const __wasi_subscription_t *in,
    __wasi_event_t *out,
    __wasi_size_t nsubscriptions,
    __wasi_size_t *nevents
);
Random Data
// Get cryptographically secure random bytes
__wasi_errno_t __wasi_random_get(
    uint8_t *buf,
    __wasi_size_t buf_len
);
Process Control
// Exit the process
_Noreturn void __wasi_proc_exit(__wasi_exitcode_t rval);

// Raise a signal (limited support)
__wasi_errno_t __wasi_proc_raise(__wasi_signal_t sig);
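
Because every call reports failure through a numeric errno, host code and tests usually want a readable name. A small JavaScript sketch follows, assuming only a subset of the preview1 errno values; the helper names errnoName and checked are hypothetical.

```javascript
// Subset of preview1 errno values; the full list lives in the WASI headers.
const WASI_ERRNO = {
    0: "success",
    2: "acces",
    8: "badf",
    28: "inval",
    44: "noent",
    52: "nosys",
    76: "notcapable",
};

// Translate a numeric errno into a short name, falling back to the number.
function errnoName(code) {
    return WASI_ERRNO[code] ?? `errno ${code}`;
}

// Hypothetical wrapper: run a WASI-style call and throw on a nonzero errno.
function checked(label, call) {
    const errno = call();
    if (errno !== 0) {
        throw new Error(`${label} failed: ${errnoName(errno)}`);
    }
}
```

A call site would then read checked("fd_close", () => wasi.fd_close(fd)), where wasi stands for whatever object holds the host's implementations.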

Example 1: Hello World in WASI

WAT Implementation

(module
  ;; Import fd_write from WASI
  (import "wasi_snapshot_preview1" "fd_write"
    (func $fd_write (param i32 i32 i32 i32) (result i32)))
  
  (memory 1)
  (export "memory" (memory 0))
  
  ;; Write "Hello, WASI!\n" at offset 8
  (data (i32.const 8) "Hello, WASI!\n")
  
  (func $main (export "_start")
    ;; Create I/O vector in memory at offset 0
    ;; iov_base = 8 (pointer to string)
    (i32.store (i32.const 0) (i32.const 8))
    
    ;; iov_len = 13 (length of string)
    (i32.store (i32.const 4) (i32.const 13))
    
    ;; Call fd_write(stdout=1, iovs=0, iovs_len=1, nwritten=24)
    ;; nwritten is stored at offset 24, past the string (bytes 8..20)
    (call $fd_write
      (i32.const 1)   ;; stdout
      (i32.const 0)   ;; pointer to iovs
      (i32.const 1)   ;; number of iovs
      (i32.const 24)  ;; pointer to store nwritten
    )
    drop  ;; Ignore return value
  )
)

Running with WASI Runtime

# Compile WAT to WASM
wat2wasm hello.wat -o hello.wasm

# Run with wasmtime
wasmtime hello.wasm
# Output: Hello, WASI!

# Run with wasmer
wasmer run hello.wasm
# Output: Hello, WASI!

Example 2: File I/O in Rust

Rust Source

// Compile with: rustc --target wasm32-wasi -O main.rs

use std::fs;
use std::io::{self, Write};

fn main() -> io::Result<()> {
    // Write to a file
    fs::write("output.txt", "Hello from WASI!\n")?;
    
    // Read from the file
    let contents = fs::read_to_string("output.txt")?;
    
    // Write to stdout
    io::stdout().write_all(contents.as_bytes())?;
    
    // List directory contents
    for entry in fs::read_dir(".")? {
        let entry = entry?;
        println!("{}", entry.path().display());
    }
    
    Ok(())
}

Compiling and Running

# Compile to WASI (newer Rust toolchains name this target wasm32-wasip1)
rustc --target wasm32-wasi -O main.rs -o app.wasm

# Run with directory access
wasmtime --dir=. app.wasm
# Output:
# Hello from WASI!
# ./app.wasm
# ./output.txt
# ...

# Try WITHOUT directory permission (will fail)
wasmtime app.wasm
# Fails: no pre-opened directory grants access to output.txt

Key Point: The --dir=. flag grants the WASM module access to the current directory. Without it, file operations fail (capability-based security in action).
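
To make the capability idea concrete, here is a toy JavaScript model of how a host might map guest paths onto pre-opened directories and refuse everything else. It is a sketch under simplifying assumptions (absolute guest paths, no symlinks, a single hypothetical preopen); real runtimes resolve paths relative to directory file descriptors.

```javascript
// Hypothetical preopen table: guest-visible prefix -> host directory.
const preopens = new Map([["/sandbox", "/real/path"]]);

// Normalize a guest path purely lexically, without touching the filesystem.
function normalize(path) {
    const out = [];
    for (const part of path.split("/")) {
        if (part === "" || part === ".") continue;
        if (part === "..") {
            if (out.length === 0) return null; // would climb above the root
            out.pop();
        } else {
            out.push(part);
        }
    }
    return "/" + out.join("/");
}

// Return the host path for a guest path, or null if no capability covers it.
function resolveGuestPath(guestPath) {
    const clean = normalize(guestPath);
    if (clean === null) return null;
    for (const [guestDir, hostDir] of preopens) {
        if (clean === guestDir || clean.startsWith(guestDir + "/")) {
            return hostDir + clean.slice(guestDir.length);
        }
    }
    return null;
}
```

The important property is the default: a path that no preopen covers resolves to nothing, instead of falling through to the host's ambient filesystem.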


Example 3: Environment Variables & Arguments

C Source

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    // Print arguments
    printf("Arguments (%d):\n", argc);
    for (int i = 0; i < argc; i++) {
        printf("  argv[%d] = %s\n", i, argv[i]);
    }
    
    // Print environment variables
    printf("\nEnvironment:\n");
    char *home = getenv("HOME");
    if (home) {
        printf("  HOME = %s\n", home);
    }
    
    char *path = getenv("PATH");
    if (path) {
        printf("  PATH = %s\n", path);
    }
    
    return 0;
}

Running

# Compile
clang --target=wasm32-wasi -O2 env.c -o env.wasm

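# Run with environment variables (wasmtime options go before the module;
# everything after the module is passed to the program as arguments)
wasmtime --env HOME=/home/user --env PATH=/bin env.wasm arg1 arg2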
# Output:
# Arguments (3):
#   argv[0] = env.wasm
#   argv[1] = arg1
#   argv[2] = arg2
#
# Environment:
#   HOME = /home/user
#   PATH = /bin

WASI I/O Vectors (Scatter-Gather)

WASI uses I/O vectors for efficient batched reads/writes:

typedef struct __wasi_iovec_t {
    uint8_t *buf;      // Pointer to buffer
    __wasi_size_t buf_len;  // Buffer length
} __wasi_iovec_t;

JavaScript Implementation

// Implementing fd_write in JavaScript
function fd_write(fd, iovs_ptr, iovs_len, nwritten_ptr) {
    const memory = wasmInstance.exports.memory.buffer;
    const view = new DataView(memory);
    
    let totalWritten = 0;
    
    // Read each iovec
    for (let i = 0; i < iovs_len; i++) {
        const iov_base = view.getUint32(iovs_ptr + (i * 8), true);
        const iov_len = view.getUint32(iovs_ptr + (i * 8) + 4, true);
        
        // Extract bytes
        const bytes = new Uint8Array(memory, iov_base, iov_len);
        
        // Write to appropriate output
        if (fd === 1) {  // stdout
            console.log(new TextDecoder().decode(bytes));
        } else if (fd === 2) {  // stderr
            console.error(new TextDecoder().decode(bytes));
        }
        
        totalWritten += iov_len;
    }
    
    // Write total bytes written
    view.setUint32(nwritten_ptr, totalWritten, true);
    
    return 0;  // Success
}

// Import object
const importObject = {
    wasi_snapshot_preview1: {
        fd_write: fd_write,
        proc_exit: (code) => {
            console.log(`Process exited with code ${code}`);
        }
        // ... other WASI functions
    }
};

// instantiate() resolves to { module, instance } when given raw bytes
const { instance: wasmInstance } = await WebAssembly.instantiate(wasmBytes, importObject);
wasmInstance.exports._start();
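
One practical wrinkle with routing fd_write to console.log is that console.log appends a newline to every call, while a guest may emit one line across several writes. A possible remedy is to buffer until a newline arrives. The sketch below assumes a hypothetical createLineWriter helper and is independent of any particular runtime.

```javascript
// Hypothetical helper: buffers decoded text and emits only complete lines.
function createLineWriter(emit) {
    const decoder = new TextDecoder("utf-8");
    let pending = "";
    return {
        write(bytes) {
            // stream: true keeps multi-byte characters split across writes intact
            pending += decoder.decode(bytes, { stream: true });
            let newline;
            while ((newline = pending.indexOf("\n")) !== -1) {
                emit(pending.slice(0, newline));
                pending = pending.slice(newline + 1);
            }
        },
        flush() {
            // Emit whatever is left, for output that does not end in a newline
            if (pending.length > 0) emit(pending);
            pending = "";
        },
    };
}

// Two partial writes that together form one full line plus a remainder.
const lines = [];
const stdout = createLineWriter((line) => lines.push(line));
const enc = new TextEncoder();
stdout.write(enc.encode("Hello, "));
stdout.write(enc.encode("WASI!\nsecond"));
stdout.flush();
```

Inside fd_write, the stdout branch would call stdout.write(bytes) with emit set to console.log, and proc_exit would call flush().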

WASI Runtimes

  1. Wasmtime (Bytecode Alliance)

    # Install
    curl https://wasmtime.dev/install.sh -sSf | bash
    
    # Run with capabilities
    wasmtime --dir=. --env KEY=value app.wasm arg1 arg2
  2. Wasmer (Wasmer Inc.)

    # Install
    curl https://get.wasmer.io -sSf | sh
    
    # Run
    wasmer run app.wasm --dir=. -- arg1 arg2
  3. WasmEdge (CNCF Project)

    # Install
    curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash
    
    # Run
    wasmedge --dir=.:. app.wasm
  4. Node.js (Experimental)

    const { WASI } = require('wasi');
    const fs = require('fs');
    
    const wasi = new WASI({
        version: 'preview1',  // required by recent Node.js releases
        args: process.argv,
        env: process.env,
        preopens: {
            '/sandbox': '/real/path'
        }
    });
    
    const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
    
    (async () => {
        const wasm = await WebAssembly.compile(fs.readFileSync('./app.wasm'));
        const instance = await WebAssembly.instantiate(wasm, importObject);
    
        wasi.start(instance);
    })();

Advanced WASI Features

1. Pre-opened Directories

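# Map a host directory to a guest path (GUEST::HOST)
# Newer wasmtime releases express this as --dir=/home/user/project::/app
wasmtime --mapdir=/app::/home/user/project app.wasm

# Inside WASM, "/app" refers to /home/user/project on the host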

2. Network Sockets (WASI Preview 2)

// Preview 2 adds socket support; Preview 1 cannot create sockets on its own
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080")?;

    for stream in listener.incoming() {
        let _stream = stream?;  // Handle connection
    }

    Ok(())
}

3. Asynchronous I/O

#include <wasi/api.h>

// Poll for events (zero-initialize so unused fields are well defined)
__wasi_subscription_t subscriptions[2] = {0};

// Subscribe to stdin readability
subscriptions[0].u.tag = __WASI_EVENTTYPE_FD_READ;
subscriptions[0].u.u.fd_read.file_descriptor = 0;  // stdin

// Subscribe to timer
subscriptions[1].u.tag = __WASI_EVENTTYPE_CLOCK;
subscriptions[1].u.u.clock.id = __WASI_CLOCKID_MONOTONIC;
subscriptions[1].u.u.clock.timeout = 1000000000;  // 1 second, in nanoseconds

__wasi_event_t events[2];
__wasi_size_t nevents;

// Blocks until at least one subscription fires
__wasi_errno_t err = __wasi_poll_oneoff(subscriptions, events, 2, &nevents);


WASI vs. Emscripten

| Feature | WASI | Emscripten |
|---------|------|------------|
| Target | Standalone/Server | Browser |
| APIs | POSIX-like system calls | Browser APIs (DOM, WebGL) |
| Binary Size | Small (~100KB runtime) | Large (>1MB runtime) |
| Startup | Fast | Slower (async compilation) |
| Security | Capability-based | Browser sandbox |
| Portability | High (cross-runtime) | Primarily browsers (relies on JavaScript glue) |

When to use:

  • WASI: CLI tools, servers, plugins, edge computing

  • Emscripten: Games, graphics, browser apps


WASI Preview 2 & Component Model

Current Evolution

WASI is transitioning to a component model with:

  1. Interface Types: High-level types (strings, lists, records)

  2. Components: Composable WASM modules

  3. WIT (Wasm Interface Type): IDL for components

// Define an interface in WIT
interface filesystem {
    record file-stat {
        size: u64,
        modified: u64
    }
    
    read-file: func(path: string) -> result<list<u8>, error>
    write-file: func(path: string, data: list<u8>) -> result<_, error>
    stat: func(path: string) -> result<file-stat, error>
}

Component Composition

# Compose components
wasm-tools compose producer.wasm consumer.wasm -o app.wasm

# Producer exports an interface
# Consumer imports and uses it
# No JavaScript glue needed!


Summary

WASI provides:

  1. Standard System API: POSIX-like functions portable across runtimes

  2. Capability Security: Explicit permission model (no ambient authority)

  3. Language Neutral: Works from C, Rust, Go, Zig, etc.

  4. Cross-Platform: Windows, Linux, macOS, embedded systems

Key Functions:

  • File I/O: fd_read, fd_write, path_open

  • Environment: environ_get, args_get

  • Time: clock_time_get, poll_oneoff

  • Random: random_get

Future (Preview 2):

  • Networking (sockets)

  • Component model (high-level composition)

  • Interface types (no manual serialization)

WASI enables WebAssembly to become a universal runtime for portable, secure applications beyond the browser!


Chapter 18: JavaScript Engines and WebAssembly Implementations

Introduction: The Engine Landscape

JavaScript engines are the heart of modern web browsers and runtimes. They parse, compile, and execute JavaScript code—and increasingly, WebAssembly modules. Understanding how these engines work internally reveals why WebAssembly is fast and how it integrates with JavaScript.

Major JavaScript Engines:

  1. V8 (Google) - Chrome, Node.js, Deno, Edge

  2. SpiderMonkey (Mozilla) - Firefox

  3. JavaScriptCore (JSC) (Apple) - Safari, Bun

  4. ChakraCore (Microsoft) - Legacy Edge (archived)

Each engine implements both the ECMAScript specification and the WebAssembly specification, but with different architectural approaches.


JavaScript Engine Architecture

Traditional JavaScript Pipeline

Source Code → Lexer → Parser → AST → Bytecode → Interpreter
    ↓
Profiler (Hot Path Detection)
    ↓
JIT Compiler (Optimized Machine Code)

V8 Architecture (Modern Multi-Tier)

JavaScript Source
    ↓
Parser → AST
    ↓
Ignition (Bytecode Interpreter)
    ↓ (if hot)
TurboFan (Optimizing Compiler)
    ↓
Machine Code (x64, ARM64, etc.)

Key Insight: JavaScript engines use tiered compilation:

  1. Interpreter (fast startup, slower execution)

  2. Baseline JIT (quick compilation, moderate performance)

  3. Optimizing JIT (slow compilation, peak performance)

This trades off startup time vs. steady-state performance.
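
The trade-off can be illustrated with a toy model in JavaScript: a wrapper counts calls and switches from a slow implementation to a fast one once a threshold is crossed. This is only an analogy with made-up names (makeTiered, the threshold value); real engines swap machine code, not closures, and base the decision on richer profiling data.

```javascript
// Toy model only: real engines tier up machine code, not JavaScript closures.
function makeTiered(slowImpl, fastImpl, threshold) {
    let calls = 0;
    let tier = "interpreter";
    function run(...args) {
        calls++;
        // Promote once the function has proven itself hot
        if (tier === "interpreter" && calls > threshold) tier = "optimized";
        return tier === "interpreter" ? slowImpl(...args) : fastImpl(...args);
    }
    run.tier = () => tier;
    return run;
}

// Slow tier: add one number at a time. Fast tier: closed-form sum.
const sumTo = makeTiered(
    (n) => { let s = 0; for (let i = 1; i <= n; i++) s += i; return s; },
    (n) => (n * (n + 1)) / 2,
    3
);

const results = [];
for (let i = 0; i < 5; i++) results.push(sumTo(10));
```

Both tiers must return identical results; only their cost differs, which is the invariant real tiered compilers also have to preserve.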


WebAssembly in JavaScript Engines

WebAssembly Pipeline

.wasm binary
    ↓
Validation (type checking, structure verification)
    ↓
Baseline Compiler (fast, unoptimized machine code)
    ↓ (background thread)
Optimizing Compiler (TurboFan/Ion/OMG)
    ↓
Optimized Machine Code

Key Differences from JavaScript:

| Feature | JavaScript | WebAssembly |
|---------|------------|-------------|
| Parsing | Complex (JS syntax) | Simple (binary format) |
| Validation | Runtime type checks | Ahead-of-time type checking |
| Compilation | Deferred (lazy) | Eager (immediate) |
| Optimization | Speculative (can deoptimize) | Stable (types known) |
| Predictability | Variable (depends on profile) | Consistent (no surprises) |
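
Ahead-of-time type checking is observable from JavaScript through WebAssembly.validate. The sketch below hand-assembles a tiny module exporting add(i32, i32) -> i32, then corrupts a single opcode so that the body no longer type-checks. The byte layout is written out by hand for illustration; in practice a toolchain produces it.

```javascript
// Hand-assembled module exporting add(i32, i32) -> i32.
const addModule = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                               // function section: func 0 has type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = func 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Same bytes with i32.add (0x6a) replaced by f32.add (0x92): a type error.
const brokenModule = addModule.slice();
brokenModule[brokenModule.length - 2] = 0x92;

// Validation happens before any code runs.
const validOk = WebAssembly.validate(addModule);
const validBroken = WebAssembly.validate(brokenModule);

// The valid module can be compiled and called synchronously (fine for tiny modules).
const { add } = new WebAssembly.Instance(new WebAssembly.Module(addModule)).exports;
const sum = add(2, 3);
```

The broken module is rejected as a whole before execution, whereas the equivalent mistake in JavaScript would only surface, if at all, when the offending line runs.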

V8 Engine Deep Dive

V8 Components

  1. Parser: Converts JavaScript source to AST

  2. Ignition: Bytecode interpreter

  3. TurboFan: Optimizing compiler

  4. Liftoff: WebAssembly baseline compiler

  5. TurboFan (Wasm): WebAssembly optimizing compiler

V8 WebAssembly Compilation Strategy

// Streaming compilation
const response = await fetch('module.wasm');
const { instance } = await WebAssembly.instantiateStreaming(response);

// Behind the scenes:
// 1. Download starts
// 2. Liftoff compiles incrementally as bytes arrive
// 3. TurboFan compiles in background
// 4. Hot functions switch to TurboFan code
Liftoff (Baseline Compiler)

Goal: Fast compilation, reasonable performance

Wasm Function → Liftoff
    ↓
  1. Simple register allocation
  2. One-pass code generation
  3. No optimizations
  4. ~10x faster than TurboFan compilation
    ↓
Machine Code (adequate performance)

Example: Add Function

(func $add (param $a i32) (param $b i32) (result i32)
  local.get $a
  local.get $b
  i32.add
)

Liftoff Output (pseudo-assembly):

; Load parameters from stack/registers
mov eax, [param_a]
mov ebx, [param_b]

; Add
add eax, ebx

; Return (result in eax)
ret
TurboFan (Optimizing Compiler)

Goal: Maximum performance, slower compilation

Wasm Function → TurboFan
    ↓
  1. Build intermediate representation (Sea of Nodes)
  2. Type analysis
  3. Inlining
  4. Loop optimizations
  5. Register allocation
  6. Code generation
    ↓
Optimized Machine Code

Optimizations:

  • Constant Folding: i32.const 2 i32.const 3 i32.add → i32.const 5

  • Dead Code Elimination: Remove unreachable code

  • Loop Invariant Code Motion: Hoist constant calculations out of loops

  • Inlining: Replace function calls with function bodies

  • SIMD Vectorization: Use SIMD instructions when available
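
Constant folding, the first item in the list above, is simple enough to sketch as a peephole pass over a flat instruction list. The representation below (plain objects with op and value fields) is an assumption made for illustration and has nothing to do with TurboFan's actual Sea of Nodes IR.

```javascript
// Toy peephole pass: fold "i32.const a, i32.const b, i32.add" into one constant.
function foldConstants(instructions) {
    const out = [];
    for (const ins of instructions) {
        const a = out[out.length - 2];
        const b = out[out.length - 1];
        const bothConst = a && b && a.op === "i32.const" && b.op === "i32.const";
        if (ins.op === "i32.add" && bothConst) {
            // "| 0" wraps the sum to 32 bits, matching i32.add semantics
            out.splice(-2, 2, { op: "i32.const", value: (a.value + b.value) | 0 });
        } else {
            out.push(ins);
        }
    }
    return out;
}

const folded = foldConstants([
    { op: "i32.const", value: 2 },
    { op: "i32.const", value: 3 },
    { op: "i32.add" },
    { op: "local.get", index: 0 },
    { op: "i32.add" },
]);

// Overflow wraps instead of producing a value outside the i32 range.
const wrapped = foldConstants([
    { op: "i32.const", value: 0x7fffffff },
    { op: "i32.const", value: 1 },
    { op: "i32.add" },
]);
```

The second i32.add survives because one of its operands is only known at run time.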

Example: Optimized Loop

(func $sum (param $n i32) (result i32)
  (local $i i32)
  (local $sum i32)
  
  (loop $continue
    ;; sum += i
    local.get $sum
    local.get $i
    i32.add
    local.set $sum
    
    ;; i++
    local.get $i
    i32.const 1
    i32.add
    local.tee $i
    
    ;; if (i < n) continue
    local.get $n
    i32.lt_s
    br_if $continue
  )
  
  local.get $sum
)

TurboFan Optimizations:

  1. Strength Reduction: Convert i32.mul by powers of 2 to shifts

  2. Loop Unrolling: Process multiple iterations per loop

  3. Register Allocation: Keep $sum, $i, $n in registers


SpiderMonkey Engine (Firefox)

Architecture

JavaScript Source
    ↓
Parser → Bytecode
    ↓
Baseline Interpreter
    ↓ (warm)
Baseline JIT Compiler
    ↓ (hot)
Ion (Optimizing Compiler)

SpiderMonkey WebAssembly

  1. Baseline Compiler: Fast, simple code generation

  2. Ion (Wasm): Optimizing compiler with aggressive optimizations

Unique Features:

  • Cranelift Backend (optional): Rust-based code generator

  • Streaming Compilation: Compile as download progresses

  • Tier-up Strategy: Automatically promote hot code

Ion Optimizations

(func $matrix_multiply (param $n i32)
  ;; ... matrix multiplication loop ...
)

Ion Optimizations:

  • Loop Vectorization: Use SIMD for parallel operations

  • Bounds Check Elimination: Remove redundant array bounds checks

  • Type Specialization: Optimize for specific numeric types


JavaScriptCore (Safari)

Architecture

JavaScript Source
    ↓
Parser → Bytecode
    ↓
LLInt (Low-Level Interpreter)
    ↓ (warm)
Baseline JIT
    ↓ (hot)
DFG (Data Flow Graph JIT)
    ↓ (very hot)
FTL (Faster Than Light) - uses B3/Air backend

JavaScriptCore WebAssembly

  1. BBQ (Baseline): Fast baseline compiler

  2. OMG (Optimizing): Advanced optimizing compiler using B3 backend

B3 (Bare Bones Backend):

  • Intermediate representation for low-level code

  • Shared between JavaScript (FTL) and WebAssembly (OMG)

  • Aggressive optimizations:

    • Instruction selection

    • Register allocation

    • Code layout optimization

Example: SIMD Optimization

(func $add_vectors (param $a i32) (param $b i32) (param $result i32) (param $len i32)
  (local $i i32)
  
  (loop $loop
    ;; Load both vectors, add them lane-wise, and store the result.
    ;; v128.store takes the address first, then the value.
    (v128.store
      (local.get $result)
      (i32x4.add
        (v128.load (local.get $a))
        (v128.load (local.get $b))))
    
    ;; Increment pointers
    (local.set $a (i32.add (local.get $a) (i32.const 16)))
    (local.set $b (i32.add (local.get $b) (i32.const 16)))
    (local.set $result (i32.add (local.get $result) (i32.const 16)))
    (local.set $i (i32.add (local.get $i) (i32.const 4)))
    
    ;; Loop condition
    (local.get $i)
    (local.get $len)
    i32.lt_u
    br_if $loop
  )
)

OMG Optimization: Generates native SIMD instructions (SSE/AVX on x86, NEON on ARM)


Memory Management

Linear Memory in Engines

WebAssembly linear memory is separate from JavaScript heap:

┌─────────────────────────────────────┐
│ JavaScript Engine Memory            │
├─────────────────────────────────────┤
│ JavaScript Objects (GC-managed)     │
│ - Strings, Arrays, Objects          │
├─────────────────────────────────────┤
│ WebAssembly Linear Memory           │
│ - ArrayBuffer (unmanaged)           │
│ - Fixed layout, no GC               │
└─────────────────────────────────────┘

V8 Memory Representation

// JavaScript side
const memory = new WebAssembly.Memory({ initial: 10, maximum: 100 });
const buffer = memory.buffer;  // ArrayBuffer

// Behind the scenes in V8:
// 1. Allocate ArrayBuffer (10 pages = 640KB)
// 2. Create backing store (native memory)
// 3. Associate ArrayBuffer with Wasm instance

Memory Growth:

(func $grow_memory
  ;; Grow by 1 page (64KB)
  i32.const 1
  memory.grow
  drop  ;; Ignore old page count
)

Engine Behavior:

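  1. Grow the backing store (in place if address space was reserved, otherwise allocate a larger one)

  2. Copy existing data if the store moved

  3. Detach the old ArrayBuffer, so existing TypedArray views become zero-length

  4. Expose a new ArrayBuffer through memory.buffer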

let view = new Uint8Array(memory.buffer);
instance.exports.grow_memory();
// view is now DETACHED - must recreate
view = new Uint8Array(memory.buffer);
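
The detachment can be observed without any Wasm code, because WebAssembly.Memory also exposes grow() to JavaScript. The following sketch uses a standalone, non-shared memory; the sizes are arbitrary.

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
let view = new Uint8Array(memory.buffer);
view[0] = 42;

// Growing from JavaScript affects existing views the same way memory.grow in Wasm does.
const oldPages = memory.grow(1);   // returns the previous size in pages
const staleLength = view.length;   // old buffer is detached: the view is now empty

// Re-create views after any call that might have grown memory.
view = new Uint8Array(memory.buffer);
const preserved = view[0];         // contents survive the growth
const newLength = view.length;     // two pages of 64 KiB each
```

A common defensive pattern is to never cache views across calls into the module and to build them from memory.buffer at the point of use.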

Garbage Collection Integration

Current State (MVP)

WebAssembly cannot directly reference JavaScript objects (no GC integration).

Workaround: Reference Types

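// Table of function references ("anyfunc" is the legacy spelling of "funcref")
const table = new WebAssembly.Table({
    element: "anyfunc",
    initial: 10
});

// Entries must be WebAssembly functions; a plain JavaScript closure is rejected
// with a TypeError. Import the JavaScript function into a module and store the
// module's re-export instead (instance.exports.callback is a hypothetical name).
table.set(0, instance.exports.callback);

;; Wasm side: call the entry via its table index
(call_indirect (type $void_to_void) (i32.const 0))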

Future: GC Proposal

Goal: Allow WebAssembly to create and manipulate GC’d objects

;; Future syntax (GC proposal)
(type $point (struct
  (field $x (mut i32))
  (field $y (mut i32))
))

(func $create_point (param $x i32) (param $y i32) (result (ref $point))
  ;; Allocate GC'd struct
  (struct.new $point (local.get $x) (local.get $y))
)

(func $get_x (param $p (ref $point)) (result i32)
  (struct.get $point $x (local.get $p))
)

Engine Implementation:

  • Wasm structs live in JavaScript GC heap

  • Same GC algorithms (generational, incremental)

  • Efficient cross-language references


Optimization Challenges

Deoptimization (JavaScript-specific)

JavaScript engines use speculative optimization:

function add(a, b) {
    return a + b;
}

// First 1000 calls: a and b are numbers
// Engine optimizes: add = (int a, int b) => a + b

add(1, 2);  // Fast path

// 1001st call: a is a string
add("hello", " world");  // DEOPTIMIZATION!

// Engine reverts to unoptimized code

WebAssembly Advantage: Types are static → no deoptimization

Hidden Classes (JavaScript)

function Point(x, y) {
    // A new object starts with the empty hidden class C0
    this.x = x;  // Transition C0 → C1 (adds x)
    this.y = y;  // Transition C1 → C2 (adds y)
}

// Same property order → same hidden class → fast property access
const p1 = new Point(1, 2);  // Uses C2
const p2 = new Point(3, 4);  // Uses C2

WebAssembly: No hidden classes needed (fixed memory layout)

Inline Caching

JavaScript engines cache property lookups:

function getName(obj) {
    return obj.name;  // IC: cache offset of 'name' for seen object shapes
}

WebAssembly: Direct memory access (no IC needed)


Performance Comparison

Benchmark: Fibonacci (Recursive)

// JavaScript
function fib(n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}
;; WebAssembly
(func $fib (param $n i32) (result i32)
  (if (result i32) (i32.le_s (local.get $n) (i32.const 1))
    (then (local.get $n))
    (else
      (i32.add
        (call $fib (i32.sub (local.get $n) (i32.const 1)))
        (call $fib (i32.sub (local.get $n) (i32.const 2)))
      )
    )
  )
)

Illustrative results (fib(40), V8; exact timings vary by hardware and engine version):

  • JavaScript: ~800ms

  • WebAssembly: ~600ms

  • Speedup: 1.3x

Why WebAssembly is faster:

  1. No type checks (static typing)

  2. No deoptimization risk

  3. Better register allocation

  4. Predictable performance
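
Numbers like these are easy to get wrong, so it helps to measure with a harness that warms up the JIT and reports a median instead of a single run. The sketch below is one possible shape for such a harness; the benchmark helper, its option names, and the choice of fib(25) are assumptions for illustration, and it measures only the JavaScript side.

```javascript
function fib(n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

// Hypothetical harness: median of several timed runs after a warmup phase.
function benchmark(fn, arg, { warmup = 3, runs = 5 } = {}) {
    for (let i = 0; i < warmup; i++) fn(arg); // give the engine a chance to tier up
    const samples = [];
    let result;
    for (let i = 0; i < runs; i++) {
        const start = performance.now();
        result = fn(arg);
        samples.push(performance.now() - start);
    }
    samples.sort((a, b) => a - b);
    return { result, medianMs: samples[Math.floor(samples.length / 2)] };
}

const report = benchmark(fib, 25);
console.log(`fib(25) = ${report.result}, median ${report.medianMs.toFixed(2)} ms`);
```

To compare against WebAssembly, the same harness can be pointed at an exported function, for example benchmark(instance.exports.fib, 25), provided the module exports fib.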

Benchmark: Matrix Multiplication

// Compiled to WebAssembly
void matmul(float *a, float *b, float *c, int n) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0;
            for (int k = 0; k < n; k++) {
                sum += a[i*n + k] * b[k*n + j];
            }
            c[i*n + j] = sum;
        }
    }
}

Illustrative results (1024×1024 matrices; exact timings vary by hardware and engine version):

  • JavaScript (typed arrays): ~3000ms

  • WebAssembly (scalar): ~1200ms

  • WebAssembly (SIMD): ~400ms

  • Speedup: 2.5x (scalar), 7.5x (SIMD)

Why SIMD is faster:

  • Process 4 floats per instruction

  • Better instruction-level parallelism

  • Reduced loop overhead


Tooling and Debugging

Chrome DevTools

WebAssembly Debugging:

  1. Source Maps: Map .wasm to original C/Rust/etc.

  2. Breakpoints: Set breakpoints in WebAssembly code

  3. Call Stack: Mixed JavaScript/WebAssembly stack traces

  4. Memory Inspector: View linear memory as hex/typed arrays

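// Source maps are not passed at instantiation time. The toolchain embeds a
// "sourceMappingURL" custom section in the .wasm binary (for example via
// Emscripten's -gsource-map flag), and DevTools loads the map from there.
const { instance } = await WebAssembly.instantiateStreaming(
    fetch('module.wasm'),
    imports
);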

V8 Flags for Profiling

# Run Node.js with profiling
node --prof --no-logfile-per-isolate app.js

# Profile WebAssembly compilation
node --trace-wasm-compiler app.js

# Detailed TurboFan output
node --trace-turbo --trace-turbo-graph app.js

SpiderMonkey Profiling

# Start Firefox with the profiler recording from startup
MOZ_PROFILER_STARTUP=1 firefox

# Capture profile
# Tools → Web Developer → Performance
# Record → Analyze WebAssembly execution

Future Developments

1. WebAssembly Tail Calls

Problem: Deep recursion causes stack overflow

(func $factorial_tail (param $n i64) (param $acc i64) (result i64)
  (if (result i64) (i64.eqz (local.get $n))
    (then (local.get $acc))
    (else
      ;; Tail call (doesn't grow stack)
      (return_call $factorial_tail
        (i64.sub (local.get $n) (i64.const 1))
        (i64.mul (local.get $n) (local.get $acc))
      )
    )
  )
)

Engine Support: V8, SpiderMonkey (implemented)

2. WebAssembly Threads

Shared Memory:

const memory = new WebAssembly.Memory({
    initial: 10,
    maximum: 100,
    shared: true  // Shared between workers
});

const worker = new Worker('worker.js');
worker.postMessage({ memory });

Atomic Operations:

;; Atomic increment
(i32.atomic.rmw.add (i32.const 0) (i32.const 1))
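
The same atomic read-modify-write is available from the JavaScript side through the Atomics object, operating on a typed array view of the shared memory. This sketch assumes an environment where shared memories are enabled (browsers additionally require cross-origin isolation for SharedArrayBuffer).

```javascript
// A shared memory must declare a maximum size.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });
const counters = new Int32Array(memory.buffer); // backed by a SharedArrayBuffer

// Equivalent of i32.atomic.rmw.add at address 0 with operand 1.
const previous = Atomics.add(counters, 0, 1);   // returns the value before the add
const current = Atomics.load(counters, 0);
const isShared = memory.buffer instanceof SharedArrayBuffer;
```

Because the buffer is shared, a worker that receives the memory through postMessage sees the same counter and can update it with the same calls.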

3. Exception Handling

(try (result i32)
  (do
    (call $may_throw)
  )
  (catch $exception
    (i32.const -1)  ;; Error code
  )
)

Engine Integration: Unified exception handling across JS/Wasm


Summary

JavaScript Engines use tiered compilation:

  • JavaScript: Interpreter → Baseline JIT → Optimizing JIT (speculative)

  • WebAssembly: Baseline Compiler → Optimizing Compiler (stable)

Key Engine Components:

| Engine | Wasm Baseline | Wasm Optimizing | Backend |
|--------|---------------|-----------------|---------|
| V8 | Liftoff | TurboFan | Custom |
| SpiderMonkey | Baseline | Ion | Cranelift (optional) |
| JavaScriptCore | BBQ | OMG | B3/Air |

WebAssembly Advantages:

  1. Static typing: No runtime type checks

  2. Predictable performance: No deoptimization

  3. Efficient compilation: Simple binary format

  4. SIMD support: Vectorized operations

  5. Low overhead: Direct memory access

Performance:

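  • Scalar speedup in the benchmarks above: 1.3x–2.5x over JavaScript

  • SIMD workloads: larger gains (about 7.5x in the matrix benchmark)

  • More consistent performance (baseline code runs immediately; tier-up happens in the background)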

Future:

  • GC integration (direct object references)

  • Tail calls (efficient recursion)

  • Threads (parallel execution)

  • Exception handling (unified with JavaScript)

JavaScript engines are rapidly evolving to make WebAssembly a first-class citizen alongside JavaScript!


Appendix A: Quick Reference – JavaScript Syntax for C Programmers

This appendix provides a practical reference for C programmers learning JavaScript, particularly in the context of WebAssembly interop. It maps common C constructs to their JavaScript equivalents.


A.1 Variables and Data Types

Variable Declaration

| C | JavaScript | Notes |
|---|------------|-------|
| int x; | let x; | Block-scoped, mutable |
| int x = 42; | let x = 42; | Type inferred dynamically |
| const int MAX = 100; | const MAX = 100; | Immutable binding |
| static int count = 0; | let count = 0; (module scope) | No static keyword for variables |

Key Difference: JavaScript is dynamically typed — types are determined at runtime.

let x = 42;      // Number
x = "hello";     // Now a String (allowed!)
x = [1, 2, 3];   // Now an Array (allowed!)

Primitive Types

| C Type | JavaScript Type | Size | Range |
|--------|-----------------|------|-------|
| int | Number | 64-bit float | ±(2^53 - 1) (safe integers) |
| long long | BigInt | Arbitrary | Unlimited (use 42n) |
| float | Number | 64-bit float | IEEE 754 double |
| double | Number | 64-bit float | IEEE 754 double |
| char | String | 16-bit UTF-16 | Single character string |
| bool | Boolean | N/A | true or false |
| void* | N/A | Use indices | Pointers → array indices |

Important: JavaScript Number is always a 64-bit IEEE 754 float. For WebAssembly i64, use BigInt:

// WASM i32 → JavaScript Number
const x = 42;

// WASM i64 → JavaScript BigInt
const y = 42n;  // Note the 'n' suffix

// BigInt operations
const sum = 100n + 200n;  // 300n
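
Two details matter when BigInt stands in for i64: BigInt arithmetic never overflows on its own, and it does not mix with Number. The sketch below shows how to reproduce 64-bit wraparound with the standard BigInt.asIntN and BigInt.asUintN functions; the variable names are illustrative.

```javascript
// i64 arithmetic wraps modulo 2^64; BigInt does not unless asked to.
const I64_MAX = 2n ** 63n - 1n;
const wrapped = BigInt.asIntN(64, I64_MAX + 1n);  // what an i64 add would yield
const asUnsigned = BigInt.asUintN(64, -1n);       // reinterpret -1 as an unsigned 64-bit value

// Number and BigInt never mix implicitly.
let mixError = null;
try {
    1n + 1;
} catch (err) {
    mixError = err;
}

// Explicit conversion is required; Number() loses precision above 2^53 - 1.
const small = Number(42n);
const backToBigInt = BigInt(small);
```

When passing values to an exported function that takes an i64, the argument must already be a BigInt; a plain Number is rejected with a TypeError.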

Type Checking

// C: compile-time type checking
// JavaScript: runtime type checking

typeof 42;              // "number"
typeof 42n;             // "bigint"
typeof "hello";         // "string"
typeof true;            // "boolean"
typeof undefined;       // "undefined"
typeof null;            // "object" (historical quirk!)
typeof [1, 2, 3];       // "object"
typeof {x: 1};          // "object"

A.2 Arrays and Memory

Arrays

| C | JavaScript | Notes |
|---|------------|-------|
| int arr[10]; | let arr = new Array(10); | Creates sparse array |
| int arr[] = {1, 2, 3}; | let arr = [1, 2, 3]; | Array literal |
| arr[i] | arr[i] | Zero-indexed |
| sizeof(arr)/sizeof(arr[0]) | arr.length | Dynamic property |

Typed Arrays (for WebAssembly interop):

// C: int buffer[1024];
// JavaScript: Fixed-type, efficient arrays

const buffer = new Int32Array(1024);      // 32-bit signed integers
buffer[0] = 42;

const floats = new Float32Array(256);     // 32-bit floats
const bytes = new Uint8Array(1024);       // 8-bit unsigned

// Backed by ArrayBuffer (WebAssembly linear memory)
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new Uint8Array(memory.buffer);

Common Typed Arrays

| JavaScript Type | C Equivalent | Bytes per Element |
|-----------------|--------------|-------------------|
| Int8Array | int8_t[] | 1 |
| Uint8Array | uint8_t[] | 1 |
| Int16Array | int16_t[] | 2 |
| Uint16Array | uint16_t[] | 2 |
| Int32Array | int32_t[] | 4 |
| Uint32Array | uint32_t[] | 4 |
| Float32Array | float[] | 4 |
| Float64Array | double[] | 8 |
| BigInt64Array | int64_t[] | 8 |
| BigUint64Array | uint64_t[] | 8 |

Pointers and Memory Access

// C: Pointer arithmetic
int *ptr = array;
int value = *(ptr + 5);  // array[5]
ptr++;                    // Move to next element
// JavaScript: No pointers, use indices
const array = new Int32Array(buffer);
const value = array[5];   // Direct indexing
// No pointer arithmetic needed

WebAssembly Memory Model:

// Linear memory is a giant ArrayBuffer
const memory = instance.exports.memory;
const bytes = new Uint8Array(memory.buffer);

// Read 32-bit int at offset 100
const view32 = new Int32Array(memory.buffer);
const value = view32[25];  // offset 100 ÷ 4 bytes = index 25

// Or use DataView for mixed types and unaligned offsets
const dataView = new DataView(memory.buffer);
const sameValue = dataView.getInt32(100, true);  // true = little-endian
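
The difference between the two access styles shows up with unaligned offsets and byte order. The following sketch uses a plain ArrayBuffer in place of linear memory, with arbitrary offsets and values.

```javascript
const buffer = new ArrayBuffer(16);
const dv = new DataView(buffer);

// Write 0x11223344 at byte offset 5 (not a multiple of 4), little-endian.
dv.setInt32(5, 0x11223344, true);

const bytes = new Uint8Array(buffer);
const firstByte = bytes[5];               // least significant byte is stored first
const littleEndian = dv.getInt32(5, true);
const bigEndian = dv.getInt32(5, false);  // same bytes read in the opposite order

// Typed array views, unlike DataView, require aligned byte offsets.
let alignmentError = null;
try {
    new Int32Array(buffer, 5, 1);
} catch (err) {
    alignmentError = err;
}
```

WebAssembly linear memory is always little-endian, so passing true to the DataView accessors is the right default for interop code.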

A.3 Strings

String Basics

| C | JavaScript | Notes |
|---|------------|-------|
| char str[] = "hello"; | let str = "hello"; | Immutable in JS |
| char *str = "hello"; | const str = "hello"; | String literal |
| strlen(str) | str.length | Property, not function; counts UTF-16 code units |
| strcmp(a, b) == 0 | a === b | Direct comparison |
| strcat(dest, src) | dest + src | Concatenation |

Key Difference: JavaScript strings are immutable and UTF-16 encoded.

const str = "hello";
str[0] = "H";      // No effect (throws TypeError in strict mode)
const upper = str.toUpperCase();  // Returns new string "HELLO"

String Operations

// Length
"hello".length;  // 5

// Indexing (read-only)
"hello"[0];      // "h"
"hello".charAt(0);  // "h"

// Concatenation
"hello" + " " + "world";  // "hello world"

// Substring
"hello".substring(1, 4);  // "ell" (start, end)
"hello".slice(1, 4);      // "ell"

// Search
"hello".indexOf("ll");    // 2
"hello".includes("ll");   // true

// Case conversion
"hello".toUpperCase();    // "HELLO"
"HELLO".toLowerCase();    // "hello"

// Split
"a,b,c".split(",");       // ["a", "b", "c"]

C String ↔︎ JavaScript String (WebAssembly)

// Read C string from WebAssembly memory
function readCString(memory, offset) {
    const bytes = new Uint8Array(memory.buffer);
    let end = offset;
    
    // Find null terminator
    while (bytes[end] !== 0) end++;
    
    // Decode UTF-8 bytes
    const decoder = new TextDecoder();
    return decoder.decode(bytes.subarray(offset, end));
}

// Write JavaScript string to WebAssembly memory
function writeCString(memory, offset, str) {
    const bytes = new Uint8Array(memory.buffer);
    const encoder = new TextEncoder();
    const encoded = encoder.encode(str);
    
    bytes.set(encoded, offset);
    bytes[offset + encoded.length] = 0;  // Null terminator
    
    return offset;
}

// Usage (assumes the module exports malloc and printf, and that
// memory is instance.exports.memory)
const ptr = instance.exports.malloc(256);
writeCString(memory, ptr, "Hello from JavaScript!");
instance.exports.printf(ptr);  // Calls C printf
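
The helpers above trust their inputs: a missing terminator or an undersized buffer would read or write past the intended region. A more defensive variant might look like the following sketch, where the capacity and maxLen parameters and both function names are assumptions added for illustration.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical variant of writeCString that refuses to overrun `capacity` bytes.
function writeCStringChecked(memory, offset, capacity, str) {
    const encoded = new TextEncoder().encode(str);
    if (encoded.length + 1 > capacity) return -1; // no room for text plus NUL
    const bytes = new Uint8Array(memory.buffer);
    bytes.set(encoded, offset);
    bytes[offset + encoded.length] = 0;
    return encoded.length; // UTF-8 byte count, which can exceed str.length
}

// Hypothetical variant of readCString that stops at `maxLen` or the end of memory.
function readCStringChecked(memory, offset, maxLen) {
    const bytes = new Uint8Array(memory.buffer);
    const limit = Math.min(offset + maxLen, bytes.length);
    let end = offset;
    while (end < limit && bytes[end] !== 0) end++;
    return new TextDecoder().decode(bytes.subarray(offset, end));
}

const written = writeCStringChecked(memory, 100, 16, "h\u00e9llo");
const roundTrip = readCStringChecked(memory, 100, 16);
const rejected = writeCStringChecked(memory, 200, 4, "too long");
```

Note that the byte count is what matters for buffer sizing: the five-character string here occupies six bytes in UTF-8.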

A.4 Operators

Arithmetic Operators

| C | JavaScript | Notes |
|---|------------|-------|
| a + b | a + b | Addition (or string concatenation) |
| a - b | a - b | Subtraction |
| a * b | a * b | Multiplication |
| a / b | a / b | Always float division |
| a % b | a % b | Remainder |
| ++a | ++a | Pre-increment |
| a++ | a++ | Post-increment |
| pow(a, b) | a ** b | Exponentiation (ES2016); C has no ** operator |

Critical Difference: Integer division

// C: Integer division
int a = 7 / 2;  // 3
// JavaScript: Always float division
let a = 7 / 2;           // 3.5
let b = Math.trunc(7/2); // 3 (truncates toward zero, like C)
let c = (7/2) | 0;       // 3 (bitwise trick, 32-bit range only)
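
The choice of rounding function matters once negative or large values are involved. The sketch below compares the common options against C's behavior of truncating toward zero; the specific operands are arbitrary.

```javascript
// C truncates toward zero: -7 / 2 == -3.
const truncated = Math.trunc(-7 / 2);  // matches C for any finite result
const floored = Math.floor(-7 / 2);    // rounds toward negative infinity instead
const viaOr = (-7 / 2) | 0;            // truncates, but only within the 32-bit range

// Beyond 32 bits the bitwise trick wraps, while Math.trunc keeps the value.
const big = 2 ** 32 + 7;
const bigTrunc = Math.trunc(big / 2);
const bigOr = (big / 2) | 0;

// The % operator keeps the sign of the dividend, as in C.
const remainder = -7 % 2;
```

For values known to fit in an int32_t, the bitwise form is a common idiom; Math.trunc is the safer general-purpose choice.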

Comparison Operators

| C | JavaScript | Notes |
|---|------------|-------|
| a == b | a === b | Use strict equality |
| a != b | a !== b | Strict inequality |
| a < b | a < b | Less than |
| a > b | a > b | Greater than |
| a <= b | a <= b | Less or equal |
| a >= b | a >= b | Greater or equal |

Important: Use === (strict) not == (loose):

// Loose equality (type coercion - avoid!)
42 == "42";       // true (bad!)
0 == false;       // true (bad!)
null == undefined; // true (confusing!)

// Strict equality (recommended)
42 === "42";      // false (good!)
42 === 42;        // true
0 === false;      // false

Logical Operators

| C | JavaScript | Notes |
|---|------------|-------|
| a && b | a && b | Returns a or b (not boolean!) |
| a \|\| b | a \|\| b | Returns a or b |
| !a | !a | Boolean NOT |
| N/A | a ?? b | Nullish coalescing (ES2020) |

Short-circuit evaluation:

// && returns first falsy value or last value
5 && 10;         // 10
0 && 10;         // 0

// || returns first truthy value or last value
5 || 10;         // 5
0 || 10;         // 10
null || "default";  // "default"

// ?? returns right side only if left is null/undefined
0 ?? 10;         // 0 (0 is not null/undefined)
null ?? 10;      // 10
undefined ?? 10; // 10

Bitwise Operators

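| C | JavaScript | Notes |
|---|------------|-------|
| a & b | a & b | AND |
| a \| b | a \| b | OR |
| a ^ b | a ^ b | XOR |
| ~a | ~a | NOT |
| a << b | a << b | Left shift |
| a >> b | a >> b | Signed right shift |
| (unsigned)a >> b | a >>> b | Unsigned right shift (C has no >>> operator) |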

Important: Bitwise ops convert to 32-bit signed integers:

const a = 0b1010;  // 10
const b = 0b1100;  // 12

a & b;   // 0b1000 = 8
a | b;   // 0b1110 = 14
a ^ b;   // 0b0110 = 6
~a;      // -11 (two's complement)

// Shifts
a << 2;  // 40
a >> 1;  // 5

-1 >>> 1; // 2147483647 (unsigned)
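
Since bitwise results are signed, code that mirrors C's uint32_t arithmetic needs a couple of extra idioms: `>>> 0` to reinterpret a result as unsigned, and Math.imul for wrapping 32-bit multiplication. A brief sketch with arbitrary values:

```javascript
// Reinterpret a signed 32-bit result as unsigned (like a cast to uint32_t).
const asUnsigned = -1 >>> 0;
const highBit = 1 << 31;                  // signed: the top bit is the sign bit
const highBitUnsigned = (1 << 31) >>> 0;

// Shift counts are taken modulo 32.
const wrappedShift = 1 << 32;

// 32-bit wrapping multiplication, as C performs on uint32_t.
const product = Math.imul(0x7fffffff, 2);
const productUnsigned = Math.imul(0xffffffff, 0xffffffff) >>> 0;
```

Plain `*` would not wrap here: it multiplies as 64-bit floats and can lose low-order bits once the product exceeds 2^53.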

A.5 Control Flow

Conditionals

// C
if (condition) {
    // ...
} else if (other) {
    // ...
} else {
    // ...
}
// JavaScript (identical syntax)
if (condition) {
    // ...
} else if (other) {
    // ...
} else {
    // ...
}

// Ternary operator
const result = condition ? value1 : value2;

Truthy/Falsy:

// Falsy values (convert to false)
false, 0, 0n, "", null, undefined, NaN

// Everything else is truthy
if (42) { }        // true
if ("hello") { }   // true
if ([]) { }        // true (empty array is truthy!)
if ({}) { }        // true (empty object is truthy!)

Switch Statement

// C
switch (value) {
    case 1:
        // ...
        break;
    case 2:
    case 3:
        // ...
        break;
    default:
        // ...
}

// JavaScript (identical syntax)
switch (value) {
    case 1:
        // ...
        break;
    case 2:
    case 3:
        // ...
        break;
    default:
        // ...
}

// Supports any type (not just integers); cases are matched with ===
switch (str) {
    case "hello":
        console.log("Greeting");
        break;
    case "bye":
        console.log("Farewell");
        break;
}

Loops

// C: while
while (condition) {
    // ...
}

// C: do-while
do {
    // ...
} while (condition);

// C: for
for (int i = 0; i < n; i++) {
    // ...
}

// JavaScript: while (identical)
while (condition) {
    // ...
}

// JavaScript: do-while (identical)
do {
    // ...
} while (condition);

// JavaScript: for (identical)
for (let i = 0; i < n; i++) {
    // ...
}

// JavaScript: for-of (iterate values)
for (const value of array) {
    console.log(value);
}

// JavaScript: for-in (iterate keys - avoid for arrays!)
for (const key in object) {
    console.log(key, object[key]);
}
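
The warning about for-in on arrays comes down to two behaviors: keys are strings, and any extra enumerable property is visited too. A small sketch with illustrative variable names:

```javascript
const samples = [10, 20, 30];
samples.label = "extra";       // arrays are objects; this adds an enumerable key

const inKeys = [];
for (const key in samples) {
    inKeys.push(key);          // keys are strings, and "label" is included
}
// inKeys: ["0", "1", "2", "label"]

const ofValues = [];
for (const value of samples) {
    ofValues.push(value);      // only the indexed elements
}
// ofValues: [10, 20, 30]

console.log(inKeys, ofValues);
```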

Loop Control

C             JavaScript   Notes
break;        break;       Exit loop
continue;     continue;    Next iteration
goto label;   N/A          No goto; labeled break/continue can exit nested loops
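
The most common use of goto in C loops, jumping out of nested loops, can be written with a labeled break. A minimal sketch; the function name findCell and the label search are illustrative choices:

```javascript
// Find the first (row, col) holding `target`. The labeled break exits both
// loops at once, a job a C programmer might give to `goto found;`.
function findCell(grid, target) {
    let position = null;
    search:
    for (let row = 0; row < grid.length; row++) {
        for (let col = 0; col < grid[row].length; col++) {
            if (grid[row][col] === target) {
                position = { row, col };
                break search;
            }
        }
    }
    return position;
}

const found = findCell([[1, 2], [3, 4]], 3);   // { row: 1, col: 0 }
const missing = findCell([[1, 2], [3, 4]], 9); // null
console.log(found, missing);
```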

A.6 Functions

Function Declaration

// C
int add(int a, int b) {
    return a + b;
}

void print_message(const char *msg) {
    printf("%s\n", msg);
}

// JavaScript: function declaration
function add(a, b) {
    return a + b;
}

function printMessage(msg) {
    console.log(msg);
}

// JavaScript: arrow function (ES2015)
const add = (a, b) => a + b;
const square = x => x * x;  // Single param, no parens
const greet = () => "Hello"; // No params

// Arrow with block body
const complex = (a, b) => {
    const sum = a + b;
    return sum * 2;
};

Function Parameters

// Default parameters (ES2015)
function greet(name = "World") {
    return `Hello, ${name}!`;
}

greet();         // "Hello, World!"
greet("Alice");  // "Hello, Alice!"

// Rest parameters (variadic)
function sum(...numbers) {
    return numbers.reduce((a, b) => a + b, 0);
}

sum(1, 2, 3, 4);  // 10

// Destructuring parameters
function drawPoint({x, y}) {
    console.log(`Point at (${x}, ${y})`);
}

drawPoint({x: 10, y: 20});

Function Pointers (WebAssembly Tables)

// C: function pointer
int (*operation)(int, int);
operation = add;
int result = operation(5, 3);  // Calls add(5, 3)

// JavaScript: functions are first-class
let operation = add;
let result = operation(5, 3);

// WebAssembly: use Table for function references
const table = new WebAssembly.Table({
    element: "anyfunc",
    initial: 10
});

// Store function at index 0
table.set(0, instance.exports.add);

// Call via call_indirect in WebAssembly
// (call_indirect (type $binary_op) (i32.const 0))
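
To make the table snippet above concrete without a toolchain, the sketch below hand-assembles a tiny module that exports add and stores that export in a table. The byte array and the names addModuleBytes, addInstance, and fnTable are illustrative; a real project would load a compiled .wasm file instead.

```javascript
// Hand-assembled module exporting add(i32, i32) -> i32.
const addModuleBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm", version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                               // function 0 uses type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                    // local.get 0, local.get 1, i32.add, end
]);

const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));

const fnTable = new WebAssembly.Table({ element: "anyfunc", initial: 2 });
fnTable.set(0, addInstance.exports.add);  // slot 0 now refers to add

const viaTable = fnTable.get(0)(5, 3);    // 8, called through the table slot
const emptySlot = fnTable.get(1);         // null, slot never assigned

console.log(viaTable, emptySlot);
```

From JavaScript the table behaves like an array of function references; inside WebAssembly the same slots are reached by index through call_indirect.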

A.7 Structures and Objects

Structures

// C
struct Point {
    int x;
    int y;
};

struct Point p = {10, 20};
p.x = 30;

// JavaScript: object literal
const p = {
    x: 10,
    y: 20
};

p.x = 30;

// Constructor function (old style)
function Point(x, y) {
    this.x = x;
    this.y = y;
}

const p1 = new Point(10, 20);

// Class (ES2015)
class Point {
    constructor(x, y) {
        this.x = x;
        this.y = y;
    }
    
    distance() {
        return Math.sqrt(this.x ** 2 + this.y ** 2);
    }
}

const p2 = new Point(10, 20);

Memory Layout (WebAssembly Interop)

// C struct
struct Point {
    float x;  // Offset 0
    float y;  // Offset 4
};  // Total size: 8 bytes

void move_point(struct Point *p, float dx, float dy) {
    p->x += dx;
    p->y += dy;
}

// JavaScript: manual memory layout
const memory = instance.exports.memory;
const floats = new Float32Array(memory.buffer);

// Point at offset 100 (byte offset 100 = float index 25)
const pointIndex = 25;

// Read point
const x = floats[pointIndex];
const y = floats[pointIndex + 1];

// Write point
floats[pointIndex] = 10.5;
floats[pointIndex + 1] = 20.3;

// Call C function
const byteOffset = pointIndex * 4;  // Convert to byte offset
instance.exports.move_point(byteOffset, 5.0, 3.0);
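
An alternative to index arithmetic on a Float32Array is a DataView with named field offsets, which keeps the byte offsets from the C struct visible. This is a sketch under assumptions: linearMemory stands in for instance.exports.memory, and writePoint and readPoint are hypothetical helpers.

```javascript
// Stand-in for instance.exports.memory: one 64 KiB page of linear memory.
const linearMemory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(linearMemory.buffer);

const POINT_X = 0;  // field offsets taken from the C struct
const POINT_Y = 4;

function writePoint(ptr, x, y) {
    view.setFloat32(ptr + POINT_X, x, true);  // true = little-endian
    view.setFloat32(ptr + POINT_Y, y, true);
}

function readPoint(ptr) {
    return {
        x: view.getFloat32(ptr + POINT_X, true),
        y: view.getFloat32(ptr + POINT_Y, true),
    };
}

writePoint(100, 10.5, 20.25);
const point = readPoint(100);  // { x: 10.5, y: 20.25 }
console.log(point);
```

Views over memory.buffer become unusable after the memory grows, so long-lived code typically recreates them after any call that may allocate.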

A.8 Common Patterns

Memory Allocation

// C
int *buffer = malloc(1024 * sizeof(int));
// ... use buffer ...
free(buffer);

// JavaScript: automatic garbage collection
const buffer = new Int32Array(1024);
// ... use buffer ...
// Automatically freed when no longer referenced

// WebAssembly: manual allocation
const malloc = instance.exports.malloc;
const free = instance.exports.free;

const ptr = malloc(1024 * 4);  // 1024 ints
// ... use memory at ptr ...
free(ptr);

Error Handling

// C: return codes
int divide(int a, int b, int *result) {
    if (b == 0) return -1;  // Error

    *result = a / b;
    return 0;  // Success
}

int result;
if (divide(10, 2, &result) < 0) {
    fprintf(stderr, "Error!\n");
}

// JavaScript: exceptions
function divide(a, b) {
    if (b === 0) {
        throw new Error("Division by zero");
    }
    return a / b;
}

try {
    const result = divide(10, 0);
} catch (error) {
    console.error("Error:", error.message);
} finally {
    console.log("Cleanup");
}

File I/O (WASI)

// C
FILE *f = fopen("data.txt", "r");
char buffer[256];
fgets(buffer, sizeof(buffer), f);
fclose(f);

// JavaScript (Node.js)
import fs from 'fs';

const data = fs.readFileSync('data.txt', 'utf8');

// WebAssembly with WASI
import { WASI } from 'wasi';

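const wasi = new WASI({
    version: 'preview1',  // Required by recent Node.js releases
    args: process.argv,
    env: process.env,
    preopens: {
        '/sandbox': '.'
    }
});

// wasmModule must be a compiled WebAssembly.Module; instantiate() then resolves to an Instance
const instance = await WebAssembly.instantiate(wasmModule, {
    wasi_snapshot_preview1: wasi.wasiImport
});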

wasi.start(instance);

A.9 Common Pitfalls

1. Equality Comparison

// Bad: loose equality
if (x == 42) { }  // Avoid!

// Good: strict equality
if (x === 42) { }

2. Integer Division

// Bad: float division
const pages = total / pageSize;  // 10.5

// Good: integer division (Math.trunc matches C; Math.floor differs for negative values)
const wholePages = Math.trunc(total / pageSize);  // 10

3. Array Bounds

// C: undefined behavior
// JavaScript: returns undefined (no crash)
const arr = [1, 2, 3];
arr[10];  // undefined (not an error!)
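
Out-of-bounds writes are just as quiet as reads, and plain arrays and typed arrays react differently. The sketch below shows both, plus a hypothetical checkedGet helper for code that wants a hard failure instead:

```javascript
const plain = [1, 2, 3];
plain[10] = 99;              // no error: the array grows, leaving holes
// plain.length === 11, plain[5] === undefined

const typed = new Int32Array(3);
typed[10] = 99;              // silently ignored: typed arrays have a fixed length
// typed.length === 3, typed[10] === undefined

// Hypothetical checked accessor that throws instead of returning undefined.
function checkedGet(arr, index) {
    if (!Number.isInteger(index) || index < 0 || index >= arr.length) {
        throw new RangeError(`index ${index} out of bounds (length ${arr.length})`);
    }
    return arr[index];
}

console.log(plain.length, typed.length, checkedGet(typed, 2));
```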

4. Type Coercion

// Surprising behavior
"5" + 3;   // "53" (string concatenation)
"5" - 3;   // 2 (numeric subtraction)
"5" * "3"; // 15 (both converted to numbers)

// Be explicit
Number("5") + 3;  // 8

5. this Binding

const obj = {
    value: 42,
    getValue: function() {
        return this.value;
    }
};

obj.getValue();  // 42

const fn = obj.getValue;
fn();  // undefined in sloppy mode; TypeError in strict mode and ES modules (this is not obj!)

// Use arrow functions or bind
const boundFn = obj.getValue.bind(obj);
boundFn();  // 42
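
The arrow-function alternative works because an arrow function has no this of its own and captures it from the enclosing scope. A minimal sketch with an illustrative Counter class:

```javascript
class Counter {
    constructor() {
        this.count = 0;
        // Arrow function: `this` is captured from the constructor, not the call site.
        this.increment = () => {
            this.count += 1;
            return this.count;
        };
    }
}

const counter = new Counter();
const detached = counter.increment;  // no bind needed
const first = detached();            // 1
const second = detached();           // 2
console.log(first, second, counter.count);
```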

A.10 Quick Reference Table

C → JavaScript Conversion

Category     C                            JavaScript
Variable     int x = 42;                  let x = 42;
Constant     const int X = 42;            const X = 42;
Array        int arr[10];                 new Int32Array(10)
String       char *str = "hi";            let str = "hi";
Function     int add(int a, int b)        function add(a, b)
Struct       struct Point {int x, y;};    {x: 0, y: 0}
Pointer      int *ptr                     Index into TypedArray
malloc       malloc(size)                 new Uint8Array(size)
free         free(ptr)                    Garbage collected
printf       printf("%d", x);             console.log(x);
NULL         NULL                         null or undefined
true/false   1/0                          true/false

This appendix provides the essential mappings for C programmers working with JavaScript and WebAssembly. For deeper learning, consult the ECMAScript specification and WebAssembly documentation!


Appendix B: WebAssembly Instruction Reference


Overview

WebAssembly uses a stack machine architecture with a small, well-defined set of instructions. Instructions manipulate values on an implicit operand stack and can be categorized by their functionality.
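
The stack discipline can be illustrated with a toy evaluator. This is only a sketch of the idea, not how a real engine works: the function runStackProgram and the tuple-based program format are invented for this example, and only four instructions are handled.

```javascript
// Toy evaluator for a handful of i32 instructions (hypothetical, not a real runtime).
function runStackProgram(program) {
    const stack = [];
    for (const [op, immediate] of program) {
        switch (op) {
            case "i32.const":
                stack.push(immediate | 0);
                break;
            case "i32.add": {
                const rhs = stack.pop();
                const lhs = stack.pop();
                stack.push((lhs + rhs) | 0);      // wrap to 32 bits
                break;
            }
            case "i32.mul": {
                const rhs = stack.pop();
                const lhs = stack.pop();
                stack.push(Math.imul(lhs, rhs));  // 32-bit multiply
                break;
            }
            case "drop":
                stack.pop();
                break;
            default:
                throw new Error(`unsupported instruction: ${op}`);
        }
    }
    return stack;
}

// (2 + 3) * 4
const finalStack = runStackProgram([
    ["i32.const", 2],
    ["i32.const", 3],
    ["i32.add"],
    ["i32.const", 4],
    ["i32.mul"],
]);
console.log(finalStack);  // [ 20 ]
```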

Instruction Categories

1. Control Flow Instructions

These manage program flow and block structures:

;; Block structures
block [blocktype]     ;; Begin a block
loop [blocktype]      ;; Begin a loop
if [blocktype]        ;; Conditional execution
else                  ;; Alternative branch
end                   ;; End block/loop/if

;; Branching
br [labelidx]         ;; Unconditional branch
br_if [labelidx]      ;; Conditional branch
br_table [vec(labelidx)] [labelidx]  ;; Table branch
return                ;; Return from function

;; Function calls
call [funcidx]        ;; Direct function call
call_indirect [tableidx] [typeidx]  ;; Indirect call via table

2. Parametric Instructions

Stack manipulation operations:

drop                  ;; Remove top stack value
select                ;; Conditional selection
select [vec(valtype)] ;; Typed conditional selection

3. Variable Instructions

Access local and global variables:

;; Local variables
local.get [localidx]  ;; Get local variable
local.set [localidx]  ;; Set local variable
local.tee [localidx]  ;; Set local and keep value on stack

;; Global variables
global.get [globalidx] ;; Get global variable
global.set [globalidx] ;; Set global variable

4. Numeric Instructions

Integer Operations (i32/i64)

Constants:

i32.const [i32]       ;; Push 32-bit integer constant
i64.const [i64]       ;; Push 64-bit integer constant

Arithmetic:

i32.add / i64.add     ;; Addition
i32.sub / i64.sub     ;; Subtraction
i32.mul / i64.mul     ;; Multiplication
i32.div_s / i64.div_s ;; Signed division
i32.div_u / i64.div_u ;; Unsigned division
i32.rem_s / i64.rem_s ;; Signed remainder
i32.rem_u / i64.rem_u ;; Unsigned remainder

Bitwise:

i32.and / i64.and     ;; Bitwise AND
i32.or / i64.or       ;; Bitwise OR
i32.xor / i64.xor     ;; Bitwise XOR
i32.shl / i64.shl     ;; Shift left
i32.shr_s / i64.shr_s ;; Arithmetic shift right
i32.shr_u / i64.shr_u ;; Logical shift right
i32.rotl / i64.rotl   ;; Rotate left
i32.rotr / i64.rotr   ;; Rotate right

Comparison:

i32.eqz / i64.eqz     ;; Equal to zero
i32.eq / i64.eq       ;; Equal
i32.ne / i64.ne       ;; Not equal
i32.lt_s / i64.lt_s   ;; Less than (signed)
i32.lt_u / i64.lt_u   ;; Less than (unsigned)
i32.gt_s / i64.gt_s   ;; Greater than (signed)
i32.gt_u / i64.gt_u   ;; Greater than (unsigned)
i32.le_s / i64.le_s   ;; Less or equal (signed)
i32.le_u / i64.le_u   ;; Less or equal (unsigned)
i32.ge_s / i64.ge_s   ;; Greater or equal (signed)
i32.ge_u / i64.ge_u   ;; Greater or equal (unsigned)

Unary:

i32.clz / i64.clz     ;; Count leading zeros
i32.ctz / i64.ctz     ;; Count trailing zeros
i32.popcnt / i64.popcnt ;; Count set bits

Floating-Point Operations (f32/f64)

Constants:

f32.const [f32]       ;; Push 32-bit float constant
f64.const [f64]       ;; Push 64-bit float constant

Arithmetic:

f32.add / f64.add     ;; Addition
f32.sub / f64.sub     ;; Subtraction
f32.mul / f64.mul     ;; Multiplication
f32.div / f64.div     ;; Division
f32.min / f64.min     ;; Minimum
f32.max / f64.max     ;; Maximum
f32.copysign / f64.copysign ;; Copy sign

Unary:

f32.abs / f64.abs     ;; Absolute value
f32.neg / f64.neg     ;; Negation
f32.sqrt / f64.sqrt   ;; Square root
f32.ceil / f64.ceil   ;; Ceiling
f32.floor / f64.floor ;; Floor
f32.trunc / f64.trunc ;; Truncate
f32.nearest / f64.nearest ;; Round to nearest

Comparison:

f32.eq / f64.eq       ;; Equal
f32.ne / f64.ne       ;; Not equal
f32.lt / f64.lt       ;; Less than
f32.gt / f64.gt       ;; Greater than
f32.le / f64.le       ;; Less or equal
f32.ge / f64.ge       ;; Greater or equal

5. Conversion Instructions

Type conversions:

;; Integer wrapping/extension
i32.wrap_i64          ;; Wrap i64 to i32
i64.extend_i32_s      ;; Extend i32 to i64 (signed)
i64.extend_i32_u      ;; Extend i32 to i64 (unsigned)

;; Float truncation to integer
i32.trunc_f32_s       ;; Truncate f32 to i32 (signed)
i32.trunc_f32_u       ;; Truncate f32 to i32 (unsigned)
i32.trunc_f64_s       ;; Truncate f64 to i32 (signed)
i32.trunc_f64_u       ;; Truncate f64 to i32 (unsigned)
i64.trunc_f32_s       ;; Truncate f32 to i64 (signed)
i64.trunc_f32_u       ;; Truncate f32 to i64 (unsigned)
i64.trunc_f64_s       ;; Truncate f64 to i64 (signed)
i64.trunc_f64_u       ;; Truncate f64 to i64 (unsigned)

;; Integer to float conversion
f32.convert_i32_s     ;; Convert i32 to f32 (signed)
f32.convert_i32_u     ;; Convert i32 to f32 (unsigned)
f32.convert_i64_s     ;; Convert i64 to f32 (signed)
f32.convert_i64_u     ;; Convert i64 to f32 (unsigned)
f64.convert_i32_s     ;; Convert i32 to f64 (signed)
f64.convert_i32_u     ;; Convert i32 to f64 (unsigned)
f64.convert_i64_s     ;; Convert i64 to f64 (signed)
f64.convert_i64_u     ;; Convert i64 to f64 (unsigned)

;; Float promotion/demotion
f32.demote_f64        ;; Demote f64 to f32
f64.promote_f32       ;; Promote f32 to f64

;; Reinterpretation
i32.reinterpret_f32   ;; Reinterpret f32 as i32
i64.reinterpret_f64   ;; Reinterpret f64 as i64
f32.reinterpret_i32   ;; Reinterpret i32 as f32
f64.reinterpret_i64   ;; Reinterpret i64 as f64
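
Several of these conversions have close JavaScript analogues, which can help when checking values on the host side. The sketch below is an approximation for illustration: the helper reinterpretF32AsI32 is hypothetical, and unlike the trapping trunc instructions, the JavaScript forms never trap.

```javascript
// Reinterpretation: share one 4-byte buffer between a float view and an int view.
const scratch = new ArrayBuffer(4);
const asF32 = new Float32Array(scratch);
const asI32 = new Int32Array(scratch);

function reinterpretF32AsI32(value) {   // analogous to i32.reinterpret_f32
    asF32[0] = value;
    return asI32[0];
}

const bits = reinterpretF32AsI32(1.0);                      // 0x3F800000
const wrappedLow = Number(BigInt.asIntN(32, 0x100000001n)); // 1, like i32.wrap_i64
const demoted = Math.fround(0.1);                           // nearest f32, like f32.demote_f64
const signExt = BigInt.asIntN(64, BigInt(-1 | 0));          // -1n, like i64.extend_i32_s
const zeroExt = BigInt(-1 >>> 0);                           // 4294967295n, like i64.extend_i32_u

console.log(bits.toString(16), wrappedLow, demoted, signExt, zeroExt);
```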

6. Memory Instructions

Linear memory access operations:

;; Memory size and growth
memory.size           ;; Get memory size (in pages)
memory.grow           ;; Grow memory by delta pages

;; Load operations (i32)
i32.load [memarg]     ;; Load 32-bit integer
i32.load8_s [memarg]  ;; Load 8-bit signed, extend to 32
i32.load8_u [memarg]  ;; Load 8-bit unsigned, extend to 32
i32.load16_s [memarg] ;; Load 16-bit signed, extend to 32
i32.load16_u [memarg] ;; Load 16-bit unsigned, extend to 32

;; Load operations (i64)
i64.load [memarg]     ;; Load 64-bit integer
i64.load8_s [memarg]  ;; Load 8-bit signed, extend to 64
i64.load8_u [memarg]  ;; Load 8-bit unsigned, extend to 64
i64.load16_s [memarg] ;; Load 16-bit signed, extend to 64
i64.load16_u [memarg] ;; Load 16-bit unsigned, extend to 64
i64.load32_s [memarg] ;; Load 32-bit signed, extend to 64
i64.load32_u [memarg] ;; Load 32-bit unsigned, extend to 64

;; Load operations (float)
f32.load [memarg]     ;; Load 32-bit float
f64.load [memarg]     ;; Load 64-bit float

;; Store operations
i32.store [memarg]    ;; Store 32-bit integer
i32.store8 [memarg]   ;; Store lower 8 bits
i32.store16 [memarg]  ;; Store lower 16 bits
i64.store [memarg]    ;; Store 64-bit integer
i64.store8 [memarg]   ;; Store lower 8 bits
i64.store16 [memarg]  ;; Store lower 16 bits
i64.store32 [memarg]  ;; Store lower 32 bits
f32.store [memarg]    ;; Store 32-bit float
f64.store [memarg]    ;; Store 64-bit float

7. Table Instructions

Table manipulation for indirect function calls:

table.get [tableidx]  ;; Get table element
table.set [tableidx]  ;; Set table element
table.size [tableidx] ;; Get table size
table.grow [tableidx] ;; Grow table
table.fill [tableidx] ;; Fill table range
table.copy [tableidx] [tableidx] ;; Copy table elements
table.init [tableidx] [elemidx]  ;; Initialize table from element
elem.drop [elemidx]   ;; Drop element segment

8. Reference Instructions

Reference type operations:

ref.null [heaptype]   ;; Create null reference
ref.is_null           ;; Test if reference is null
ref.func [funcidx]    ;; Create function reference

9. Vector (SIMD) Instructions

128-bit vector operations (when SIMD extension is enabled):

Load/Store:

v128.load [memarg]    ;; Load 128-bit vector
v128.store [memarg]   ;; Store 128-bit vector
v128.const [v128]     ;; Vector constant

Lane operations (examples for i8x16, i16x8, i32x4, i64x2, f32x4, f64x2):

i8x16.splat           ;; Splat value to all lanes
i8x16.extract_lane_s [laneidx] ;; Extract signed lane
i8x16.extract_lane_u [laneidx] ;; Extract unsigned lane
i8x16.replace_lane [laneidx]   ;; Replace lane value

Arithmetic (vectorized):

i32x4.add             ;; Vector addition
i32x4.sub             ;; Vector subtraction
i32x4.mul             ;; Vector multiplication
f32x4.add             ;; Float vector addition
f32x4.sqrt            ;; Float vector square root

Comparison and selection:

i32x4.eq              ;; Vector equality comparison
i32x4.lt_s            ;; Vector less than (signed)
v128.bitselect        ;; Bitwise selection

Relaxed SIMD operations (0xFD prefix with secondary opcodes):

i8x16.relaxed_swizzle         ;; Relaxed swizzle
i32x4.relaxed_trunc_f32x4_s   ;; Relaxed truncation
f32x4.relaxed_madd            ;; Relaxed multiply-add
f32x4.relaxed_min/max         ;; Relaxed min/max

10. Exception Handling Instructions

Try-catch block operations:

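;; Standard instructions (WebAssembly 3.0)
try_table [blocktype] [vec(catch)] ;; Begin try block with catch clauses
throw [tagidx]        ;; Throw exception
throw_ref             ;; Rethrow the exception held in an exnref

;; Legacy proposal instructions (deprecated, still emitted by older toolchains)
try [blocktype]       ;; Begin try block
catch [tagidx]        ;; Catch specific exception
catch_all             ;; Catch any exception
rethrow [labelidx]    ;; Rethrow exception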

11. Atomic Instructions (Threads extension)

Thread-safe memory operations:

memory.atomic.notify [memarg]  ;; Notify waiting threads
memory.atomic.wait32 [memarg]  ;; Wait on 32-bit value
memory.atomic.wait64 [memarg]  ;; Wait on 64-bit value

;; Atomic load/store
i32.atomic.load [memarg]
i32.atomic.store [memarg]
i64.atomic.load [memarg]
i64.atomic.store [memarg]

;; Atomic read-modify-write
i32.atomic.rmw.add [memarg]    ;; Atomic add
i32.atomic.rmw.sub [memarg]    ;; Atomic subtract
i32.atomic.rmw.and [memarg]    ;; Atomic AND
i32.atomic.rmw.or [memarg]     ;; Atomic OR
i32.atomic.rmw.xor [memarg]    ;; Atomic XOR
i32.atomic.rmw.xchg [memarg]   ;; Atomic exchange
i32.atomic.rmw.cmpxchg [memarg] ;; Atomic compare-exchange
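
On the JavaScript side, the Atomics object offers matching operations on shared memory. The sketch below runs on a single thread purely to show return values; the buffer size and variable names are illustrative.

```javascript
// Shared memory that could be handed to workers (only one thread is used here).
const shared = new SharedArrayBuffer(8);
const cells = new Int32Array(shared);

const before = Atomics.add(cells, 0, 5);                // returns the old value: 0
const afterAdd = Atomics.load(cells, 0);                // 5
const seen = Atomics.compareExchange(cells, 0, 5, 9);   // returns 5 and stores 9
const failed = Atomics.compareExchange(cells, 0, 5, 1); // returns 9; nothing stored
const finalValue = Atomics.load(cells, 0);              // 9
const previous = Atomics.exchange(cells, 1, 42);        // returns the old value: 0

console.log(before, afterAdd, seen, failed, finalValue, previous);
```

Like the rmw instructions, these calls return the value that was in memory before the operation.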

12. Bulk Memory Instructions

Efficient memory and table operations:

memory.copy           ;; Copy memory region
memory.fill           ;; Fill memory region
memory.init [dataidx] ;; Initialize memory from data segment
data.drop [dataidx]   ;; Drop data segment
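
When the host manipulates linear memory directly, typed-array methods give roughly the same effects. A sketch, assuming a fresh one-page memory; the offsets are arbitrary:

```javascript
const bytes = new Uint8Array(new WebAssembly.Memory({ initial: 1 }).buffer);

bytes.fill(0xab, 16, 24);      // like memory.fill: 8 bytes of 0xAB starting at 16
bytes.copyWithin(32, 16, 24);  // like memory.copy: copy bytes 16..23 to offset 32
bytes.set([1, 2, 3, 4], 64);   // roughly what memory.init does with a data segment

console.log(bytes[16], bytes[32], bytes[64]);
```

Both memory.copy and copyWithin handle overlapping ranges correctly.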

13. GC Instructions (Garbage Collection proposal)

Structure and array operations:

struct.new [typeidx]        ;; Create structure
struct.new_default [typeidx] ;; Create with defaults
struct.get [typeidx] [fieldidx] ;; Get field
struct.set [typeidx] [fieldidx] ;; Set field

array.new [typeidx]         ;; Create array
array.new_default [typeidx] ;; Create with defaults
array.get [typeidx]         ;; Get element
array.set [typeidx]         ;; Set element
array.len                   ;; Get array length

Instruction Encoding

Instructions are encoded in binary format with:

  • Single-byte opcodes for most core instructions

  • Multi-byte opcodes with prefixes:

    • 0xFB - GC operations

    • 0xFC - Saturating truncation, bulk memory, and table operations

    • 0xFD - SIMD operations

    • 0xFE - Atomic operations (threads proposal)

Example encodings (memory instructions are followed by their memarg immediates):

i32.add         → 0x6A
f64.sqrt        → 0x9F
v128.load       → 0xFD 0x00
i32.atomic.load → 0xFE 0x10
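
Opcodes are only part of the encoding: integer immediates are written in LEB128, a variable-length format. The sketch below encodes the immediates of i32.const (opcode 0x41) by hand; the function name encodeSLEB128 is illustrative, and it assumes the value fits in a signed 32-bit integer.

```javascript
// Signed LEB128 encoder for values in the int32 range.
function encodeSLEB128(value) {
    const out = [];
    let more = true;
    while (more) {
        let byte = value & 0x7f;
        value >>= 7;                           // arithmetic shift keeps the sign
        const signBitSet = (byte & 0x40) !== 0;
        if ((value === 0 && !signBitSet) || (value === -1 && signBitSet)) {
            more = false;                      // remaining bits are pure sign extension
        } else {
            byte |= 0x80;                      // continuation bit
        }
        out.push(byte);
    }
    return out;
}

const I32_CONST = 0x41;
const I32_ADD = 0x6a;

// i32.const 2, i32.const -123456, i32.add
const encoded = [
    I32_CONST, ...encodeSLEB128(2),
    I32_CONST, ...encodeSLEB128(-123456),
    I32_ADD,
];
console.log(encoded.map((b) => b.toString(16).padStart(2, "0")).join(" "));
```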

Instruction Properties

Type Signatures

Each instruction has a specific type signature showing stack behavior:

[input_types] → [output_types]

Examples:

i32.add: [i32 i32] → [i32]
i32.eqz: [i32] → [i32]
call x:  [t1*] → [t2*] (where function x has type [t1*] → [t2*])

Validation Rules

Instructions must satisfy:

  1. Type correctness - Stack has required input types

  2. Label validity - Branch targets exist within scope

  3. Index bounds - All indices reference valid definitions

  4. Mutability - global.set is only valid on globals declared mutable
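
Validation can be observed from JavaScript through WebAssembly.validate, which returns a boolean rather than throwing. The sketch below wraps a function body in a minimal hand-assembled module; moduleWithBody is a hypothetical helper that assumes every section stays under 128 bytes so each size fits in one byte.

```javascript
// Wrap a function body in a minimal module whose only function has type (i32, i32) -> i32.
function moduleWithBody(body) {
    const func = [0x00, ...body, 0x0b];                       // no locals, body, end
    return new Uint8Array([
        0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
        0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section
        0x03, 0x02, 0x01, 0x00,                               // function section
        0x0a, func.length + 2, 0x01, func.length, ...func,    // code section
    ]);
}

// local.get 0, local.get 1, i32.add: two operands in, one i32 out
const wellTyped = WebAssembly.validate(moduleWithBody([0x20, 0x00, 0x20, 0x01, 0x6a]));

// local.get 0, i32.add: i32.add needs two operands, so the stack underflows
const underflow = WebAssembly.validate(moduleWithBody([0x20, 0x00, 0x6a]));

// local.get 5: the function only has locals 0 and 1
const badIndex = WebAssembly.validate(moduleWithBody([0x20, 0x05]));

console.log(wellTyped, underflow, badIndex);
```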

Execution Semantics

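  • Largely deterministic - Same inputs produce the same outputs, apart from NaN bit patterns, resource exhaustion (for example, a failed memory.grow), and the threads and relaxed SIMD extensions

  • Trapping - Invalid operations trap (integer division by zero, out-of-bounds memory access)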

  • Stack-based - All operations via implicit stack manipulation
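
Traps surface in JavaScript as WebAssembly.RuntimeError exceptions. The sketch below hand-assembles a module exporting a signed 32-bit division and provokes two traps; the byte array and the helper name traps are illustrative.

```javascript
// Hand-assembled module exporting div(i32, i32) -> i32 using i32.div_s.
const divModuleBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                               // function 0 uses type 0
    0x07, 0x07, 0x01, 0x03, 0x64, 0x69, 0x76, 0x00, 0x00, // export "div" = func 0
    0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6d, 0x0b                    // local.get 0, local.get 1, i32.div_s, end
]);

const { div } = new WebAssembly.Instance(new WebAssembly.Module(divModuleBytes)).exports;

// Hypothetical helper: reports whether calling fn raised a WebAssembly trap.
function traps(fn) {
    try {
        fn();
        return false;
    } catch (err) {
        return err instanceof WebAssembly.RuntimeError;
    }
}

const quotient = div(-7, 2);                             // -3 (truncates toward zero)
const divByZeroTraps = traps(() => div(1, 0));           // true
const overflowTraps = traps(() => div(-2147483648, -1)); // true (result not representable)
const jsDivision = 1 / 0;                                // Infinity: JS numbers do not trap

console.log(quotient, divByZeroTraps, overflowTraps, jsDivision);
```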

Usage Notes

  1. No strings - Only numeric types in MVP; strings require memory encoding

  2. 32-bit addressing - Memories use 32-bit addresses by default, which limits them to 4 GiB; the memory64 extension adds 64-bit addresses

  3. Little-endian - Memory layout is always little-endian

  4. IEEE 754 - Floating-point follows IEEE 754-2019 standard

  5. Structured control - All control flow via blocks, no arbitrary goto
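
The first and third notes can be seen from the host side: a string has to be encoded into bytes before a module can read it, and multi-byte values land in memory least significant byte first. A sketch, assuming UTF-8 as the agreed encoding; writeString and readString are hypothetical helpers and the offsets are arbitrary.

```javascript
const stringMemory = new WebAssembly.Memory({ initial: 1 });

// Copy a JS string into linear memory as UTF-8; returns the byte length written.
function writeString(memory, ptr, text) {
    const encodedText = new TextEncoder().encode(text);
    new Uint8Array(memory.buffer, ptr, encodedText.length).set(encodedText);
    return encodedText.length;
}

// Decode `length` bytes of UTF-8 starting at `ptr`.
function readString(memory, ptr, length) {
    return new TextDecoder("utf-8").decode(new Uint8Array(memory.buffer, ptr, length));
}

const written = writeString(stringMemory, 256, "h\u00e9llo");  // 6 bytes: the accented e takes two
const roundTrip = readString(stringMemory, 256, written);

// Little-endian layout: the low byte of 0x11223344 is stored first.
new DataView(stringMemory.buffer).setUint32(0, 0x11223344, true);
const firstByte = new Uint8Array(stringMemory.buffer)[0];      // 0x44

console.log(written, roundTrip, firstByte.toString(16));
```

Passing both a pointer and a length, as here, avoids relying on a terminator byte inside the string data.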

Summary Statistics

From the specification index:

  • 400+ instructions including all extensions

  • 4 numeric types (i32, i64, f32, f64)

  • 1 vector type (v128 with SIMD)

  • Reference types (funcref, externref, custom heap types)

  • Multi-byte encoding for extended instruction sets


This reference covers the core WebAssembly instruction set as defined in the official specification (Release 3.0, 2025-10-06), including both MVP features and newer extensions such as SIMD, exception handling, and garbage collection.


Appendix C: Tools, Libraries, and Resources

This appendix summarizes the tools, libraries, and resources appendix of WebAssembly: The Definitive Guide.


Overview

This appendix provides installation guidance and references for the essential WebAssembly ecosystem tools. As Steve Jobs noted: “Technology is nothing. What’s important is that you have a faith in people, that they’re basically good and smart, and if you give them tools, they’ll do wonderful things with them.”

The tools covered are foundational for WebAssembly development and work across various platforms, though some are easier to install on Linux or macOS than Windows.


1. Emscripten

Description

Emscripten is a complete compiler toolchain that translates C and C++ code to WebAssembly. It provides comprehensive support for running compiled code in both browser and Node.js environments.

Key Features

  • Full C/C++ to WebAssembly compilation

  • Support for widely used dependencies:

    • Standard C/C++ libraries

    • OpenGL

    • Other common libraries

  • Browser and Node.js runtime support

  • Extensive ecosystem compatibility

Installation

The official Getting Started guide provides detailed instructions for multiple operating systems:

Resource: https://emscripten.org/docs/getting_started/index.html

Platform Support

  • Linux

  • macOS

  • Windows


2. WebAssembly Binary Toolkit (WABT)

Description

WABT (pronounced “wabbit”) is a suite of tools for working with WebAssembly binary and text formats. It’s essential for debugging, validation, and format conversion.

Key Features

  • Format conversion between all WebAssembly formats:

    • .wasm (binary)

    • .wat (text)

    • And several others

  • Module inspection tools:

    • Dumping module details

    • Structure validation

    • Disassembly

  • Online tools available (browser-based)

Installation

Build instructions for all three major operating systems are available on GitHub:

Repository: https://github.com/WebAssembly/wabt

Online Demo

Try the tools without installation:

Demo: https://webassembly.github.io/wabt/demo

Notable Tools in WABT

  • wasm2wat - Binary to text format converter

  • wat2wasm - Text to binary format converter

  • wasm-objdump - Display information about wasm files

  • wasm-validate - Validate wasm files

  • wasm-strip - Remove debugging information


3. Wasm3

Description

Wasm3 is a high-performance WebAssembly interpreter that claims to be “the fastest WebAssembly interpreter, and the most universal runtime.” It’s designed for embedded systems and resource-constrained environments.

Key Features

  • Extremely portable - runs on diverse platforms

  • High performance interpretation

  • Tracks new proposals actively

  • Small footprint suitable for embedded systems

Platform Support

Desktop/Server/Mobile:

  • Linux

  • Windows

  • macOS

  • FreeBSD

  • Android

  • iOS

Embedded/IoT:

  • OpenWrt

  • Yocto

  • Buildroot (network equipment)

  • Raspberry Pi and other single-board computers

  • Various microcontrollers

Web:

  • Most modern browsers

Installation

Multiple installation methods documented:

Installation Guide: https://github.com/wasm3/wasm3/blob/main/docs/Installation.md

Additional Resources

Cookbook: https://github.com/wasm3/wasm3/blob/main/docs/Cookbook.md

Use Cases

  • Embedded systems

  • IoT devices

  • Edge computing

  • Mobile applications

  • Testing and development


4. Wasmtime

Description

Wasmtime is described as a “fast, secure, and standards-compliant runtime for WebAssembly and WASI.” It’s one of the most actively developed and feature-complete runtimes.

Key Features

  • Optimizing runtime - JIT compilation for performance

  • WASI support - Full WASI (WebAssembly System Interface) implementation

  • Up-to-date proposals - Tracks and implements latest WebAssembly proposals

  • Extensive programmatic libraries - APIs for multiple languages

  • Production-ready - Used in real-world applications

Language Bindings

Wasmtime provides APIs for:

  • Rust (native)

  • C/C++

  • Python

  • .NET

  • Go

  • And others

Installation

Documentation: https://docs.wasmtime.dev

Installation Instructions: https://docs.wasmtime.dev/cli-install.html

Use Cases

  • Server-side WebAssembly execution

  • Plugin systems

  • Sandboxed code execution

  • Command-line tools

  • Embedded runtime in applications

Command-Line Tool

# Example usage
wasmtime run module.wasm
wasmtime compile module.wasm
wasmtime wast test.wast

5. SwiftWasm

Description

SwiftWasm is a toolchain for compiling Swift code to WebAssembly, enabling Swift developers to target WebAssembly platforms.

Key Features

  • Swift to WebAssembly compilation

  • Swift standard library support

  • Integration with existing Swift ecosystem

  • Modern Swift language features

Installation

Multiple installation options available:

Setup Guide: https://book.swiftwasm.org/getting-started/setup.html

Use Cases

  • iOS/macOS developers targeting WebAssembly

  • Cross-platform Swift applications

  • Web applications written in Swift

  • Shared codebases between native and web


Additional Tools and Utilities

Common C/C++ Development Tools

nm - Symbol inspection tool

  • Lists the symbols in object files and libraries

  • Shows each symbol's name, type, and address

  • Useful for debugging compiled modules

Online Resources

From the book’s references:

  1. C Programming Tutorials:

    • Learn-C.org - Interactive C tutorial (with ads)

    • Practical C Programming by Steve Oualline (O’Reilly)

    • The C Programming Language by Kernighan & Ritchie

  2. LLVM Resources:

    • LLVM Project

    • Note: LLVM used to stand for “Low-Level Virtual Machine,” but now it’s just “LLVM”

  3. Algorithm Resources:


Installation Best Practices

General Guidelines

  1. Check documentation first - Each tool has comprehensive installation guides

  2. Platform-specific instructions - Follow OS-specific steps carefully

  3. Use package managers when available (Homebrew, apt, etc.)

  4. Build from source if needed - Most tools provide clear build instructions

  5. Test installations - Verify tools work before starting projects

Troubleshooting Tips

  1. Dependency issues - Ensure all prerequisites are installed

  2. Path configuration - Add tools to system PATH

  3. Version compatibility - Check that tool versions work together

  4. Online communities - GitHub issues and forums are helpful


Summary Table

Tool        Primary Use                     Platform Support        Installation Difficulty
Emscripten  C/C++ to Wasm compilation       Linux, macOS, Windows   Moderate
WABT        Format conversion, inspection   Linux, macOS, Windows   Easy-Moderate
Wasm3       Universal interpreter           Extremely broad         Easy
Wasmtime    Optimizing runtime, WASI        Linux, macOS, Windows   Easy
SwiftWasm   Swift to Wasm compilation       Linux, macOS            Moderate

Development Workflow

Typical Tool Chain

  1. Write code in your preferred language (C/C++, Rust, Swift, etc.)

  2. Compile to WebAssembly using appropriate compiler:

    • Emscripten for C/C++

    • SwiftWasm for Swift

    • rustc for Rust

  3. Inspect/debug using WABT tools:

    • Convert to WAT for readability

    • Validate structure

    • Examine symbols

  4. Test execution with runtime:

    • Wasmtime for WASI modules

    • Wasm3 for embedded targets

    • Browser for web applications

  5. Optimize and deploy


Additional Notes

Quote from the Book

“It is unsurprising, given all of the languages, tools, and frameworks that we discuss in this book, that there is a fair amount to install.”

Important Considerations

  1. No comprehensive list - The appendix acknowledges it’s not exhaustive

  2. Platform variations - Some tools easier on Linux/macOS than Windows

  3. Active development - WebAssembly ecosystem evolves rapidly

  4. Community support - GitHub repos are primary resource locations

  5. Online alternatives - Many tools offer browser-based versions

Historical Context

The book references several foundational concepts:

  • C’s history is integral to modern operating systems

  • Security issues have been a major concern in C/C++

  • LLVM evolution from “Low-Level Virtual Machine” to just “LLVM”

  • Common idioms like “ten pounds of manure in a five-pound bag” for size constraints


Conclusion

This appendix provides essential starting points for WebAssembly development. While not comprehensive, it covers the most important tools in the ecosystem. The linked documentation for each tool provides detailed, up-to-date installation and usage information.

The WebAssembly tooling landscape continues to evolve, so check the official documentation and GitHub repositories for the latest updates and features.


Note: The content is based on the extracted pages from WebAssembly: The Definitive Guide (wasm-defguide.pdf, pages 380-387). For the most current information, always refer to the official documentation links provided above.