Dossier - ecmascript-dossier
Preface: Why a Systems Programmer Should Care About JavaScript
The Unlikely Marriage of Low-Level Thinking and High-Level Chaos
The Cognitive Dissonance
You’ve spent years—perhaps decades—thinking in terms of registers, stack frames, and memory layouts. You understand that a program is ultimately a sequence of machine instructions, that data structures are arrangements of bytes in memory, and that performance comes from understanding what the hardware actually does. JavaScript, at first glance, seems to be the antithesis of everything you value.
JavaScript doesn’t care about memory alignment. It has garbage
collection instead of manual memory management. It converts types
implicitly, sometimes in ways that defy logic. The ==
operator can make [] == ![] evaluate to
true. The language was designed in ten days, and it
shows. For someone who appreciates the elegance of C’s simplicity or
the brutal honesty of assembly language, JavaScript appears to be a
cosmic joke.
The Uncomfortable Truth
Yet here we are. JavaScript is the most deployed runtime environment in human history. It runs on billions of devices—every smartphone, every laptop, every desktop computer with a web browser. It executes in environments you wouldn’t expect: embedded systems, IoT devices, databases (MongoDB), serverless functions, and even spacecraft. The V8 engine alone represents one of the most sophisticated JIT compiler infrastructures ever built, comparable in complexity to LLVM.
More importantly for you: JavaScript and WebAssembly are becoming the universal compilation targets. When you need your code to run anywhere—truly anywhere—you increasingly have two choices: target native machine code for each platform (x86-64, ARM, RISC-V, etc.) or target JavaScript/WebAssembly. The latter is often more practical.
The Strategic Position
JavaScript occupies a strategic position in computing infrastructure that cannot be ignored. It is:
The assembly language of the web: Just as you might not write assembly directly but need to understand it, JavaScript is the bytecode that runs everywhere.
A compilation target: TypeScript, CoffeeScript, ClojureScript, Elm, PureScript, and hundreds of other languages compile to JavaScript. WebAssembly is designed to run alongside it.
A plugin mechanism: Many applications use JavaScript as an embedded scripting language (Photoshop, Unity, game engines, CAD software).
An optimization target: Modern JavaScript engines perform sophisticated optimizations—type inference, inline caching, hidden classes—that rival static compilers.
If you’re building a transpiler, you need to understand JavaScript as a target. If you’re building a compiler, WebAssembly is increasingly the deployment mechanism. If you’re building tools, browser extensions are the distribution channel. Like it or not, this is the infrastructure layer of modern computing.
The Systems Perspective
This book takes a different approach than typical JavaScript tutorials. We’re not here to teach you how to build a todo list app or manipulate the DOM to create animated buttons. We’re treating JavaScript as a systems-level technology:
The runtime as a virtual machine: Understanding the event loop, call stack, and heap allocation strategies.
The language as a compilation target: Studying the AST, code generation, and optimization pipelines.
The ecosystem as infrastructure: Package management, module systems, and build toolchains as analogous to linkers and loaders.
The specification as documentation: Reading ECMA-262 the way you’d read the Intel SDM or the ARM Architecture Reference Manual.
This perspective reveals that JavaScript, despite its surface chaos, has underlying mechanisms that are comprehensible and even elegant when viewed through the right lens.
The WebAssembly Revolution
WebAssembly changes the equation fundamentally. It provides:
A binary format: Compact, fast to parse, fast to validate.
A stack machine model: Simple, deterministic, and easy to target from a compiler.
Near-native performance: JIT-compiled to machine code with predictable performance characteristics.
Language agnosticism: C, C++, Rust, Go, and dozens of other languages can compile to WebAssembly.
Security: Sandboxed execution with capability-based security (via WASI).
WebAssembly is the first serious attempt at a portable, safe, fast binary format for code distribution since Java bytecode. Unlike Java, it’s succeeding because it doesn’t try to be a complete platform—it’s a compilation target that integrates with existing ecosystems.
For a systems programmer, WebAssembly is fascinating because it brings low-level concepts back to the web:
Manual memory management: You allocate from linear memory, just like malloc.
Static typing: i32, i64, f32, f64, with no implicit conversions.
Deterministic execution: No garbage collection pauses (unless you implement your own GC).
Low-level control flow: Structured branches to enclosing labels, plus computed jumps via branch tables (br_table).
It’s almost like someone took the good parts of a systems language and made them run in a browser.
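To make the binary format and the stack machine concrete, here is a minimal sketch of a hand-assembled module loaded from JavaScript. The byte sequence, the export name "double", and the variable names are my own illustrative assumptions rather than output from any toolchain; in practice an assembler such as WABT would produce these bytes.

```javascript
// Hand-assembled binary (an illustrative sketch) for:
//   (module (func (export "double") (param i32) (result i32)
//     local.get 0  i32.const 2  i32.mul))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                         // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00,                         // binary format version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type section: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: func 0 has type 0
  0x07, 0x0a, 0x01, 0x06,                         // export section: one export, name length 6
  0x64, 0x6f, 0x75, 0x62, 0x6c, 0x65,             // "double"
  0x00, 0x00,                                     // export kind: function, index 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section: one body, 7 bytes, no locals
  0x20, 0x00,                                     // local.get 0   (push the parameter)
  0x41, 0x02,                                     // i32.const 2   (push 2)
  0x6c,                                           // i32.mul       (pop two, push product)
  0x0b,                                           // end
]);

// Synchronous compilation is acceptable for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);
const doubleIt = wasmInstance.exports.double;

console.log(WebAssembly.validate(wasmBytes)); // true
console.log(doubleIt(21));                    // 42
console.log(doubleIt(0x7fffffff));            // -2: i32 arithmetic wraps around
```

The last call shows the static typing at work: the result is a 32-bit integer, so the multiplication wraps instead of promoting to a double as plain JavaScript arithmetic would.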
What This Book Is (and Emphatically Isn’t)
What This Book Is
This is a technical manual for understanding JavaScript and WebAssembly from a compiler writer’s and systems programmer’s perspective. It assumes you have strong fundamentals in:
Systems programming: You’ve written substantial code in C, C++, or similar languages.
Computer architecture: You understand what machine code is, how CPUs execute instructions, and what assembly language looks like.
Compilers: You’ve at least read about lexing, parsing, AST construction, and code generation, even if you haven’t built a complete compiler.
Operating systems concepts: You know what virtual memory is, how processes work, and what system calls do.
With that foundation, we’ll explore:
The JavaScript language specification (ECMA-262) in detail, focusing on semantics rather than syntax.
The execution model: How JavaScript engines actually work—the JIT compilation pipeline, inline caching, hidden classes, garbage collection strategies.
The module systems: From global scope chaos to ES6 modules, CommonJS, and the Node.js runtime.
JavaScript as a compilation target: How to parse JavaScript, manipulate ASTs, and generate code.
WebAssembly’s design: The binary format, text format, type system, and execution model.
Building a WebAssembly compiler: How to translate from an intermediate representation to the Wasm stack machine.
Interoperability: How JavaScript and WebAssembly communicate, share memory, and call each other’s functions.
Practical applications: Browser extensions, userscripts, Node.js tooling, and WASI for server-side WebAssembly.
What This Book Isn’t
This is not:
A web development tutorial: We won’t build React apps or discuss CSS frameworks.
A beginner’s programming book: We assume you can program and understand algorithmic complexity.
A JavaScript best practices guide: We care more about how things work than about style guides or linting rules.
Framework-focused: No Angular, React, Vue, or any other frontend framework unless it illustrates a fundamental concept.
A replacement for the specifications: We reference ECMA-262 and the WebAssembly spec extensively, but we can’t replace them. You should have both documents available.
The Tone and Approach
I’m going to be honest with you: JavaScript has warts. Many of them. We’ll point them out, explain why they exist (historical context matters), and show you how to work around them. We won’t pretend that 0.1 + 0.2 !== 0.3 is elegant. It’s an IEEE 754 consequence that every language using binary floating point faces, but JavaScript’s single double-precision Number type makes it more visible.
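The floating-point point is easy to check for yourself. The sketch below shows the rounding error and one common workaround, a relative-tolerance comparison; the helper name nearlyEqual and its default tolerance are my own assumptions, not a standard API.

```javascript
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false
console.log(sum);         // 0.30000000000000004

// Hypothetical helper: compare with a tolerance scaled to the operands.
function nearlyEqual(a, b, tolerance = Number.EPSILON) {
  const scale = Math.max(1, Math.abs(a), Math.abs(b));
  return Math.abs(a - b) <= tolerance * scale;
}
console.log(nearlyEqual(sum, 0.3)); // true

// Every Number is a double, so integers are only exact up to 2^53 - 1.
const beyondSafe = Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2;
console.log(beyondSafe); // true
```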
At the same time, we’ll acknowledge the genuine engineering achievements: V8’s TurboFan compiler is a marvel of optimization technology. The ES6 module system is actually well-designed. WebAssembly’s security model is sophisticated and practical.
This book has opinions, but they’re engineering opinions backed by technical reasoning. When we say something is poorly designed, we’ll explain why and what the alternatives would have been. When we say something is clever, we’ll show you the mechanism.
Practical Outcomes
By the end of this book, you should be able to:
Write a source-to-source compiler that targets JavaScript, handling scoping, closures, and the module system correctly.
Build browser extensions for Firefox and Chrome that do non-trivial work (not just DOM manipulation).
Create userscripts for Greasemonkey/Tampermonkey that can intercept and modify web traffic at the JavaScript level.
Understand the Node.js runtime well enough to build command-line tools, understand the event loop, and integrate with native C/C++ addons.
Read and understand ECMA-262: Navigate the specification, understand abstract operations, and predict behavior in edge cases.
Write WebAssembly by hand in the text format (WAT) for educational purposes.
Build a compiler backend that targets WebAssembly, handling function calls, linear memory, and the import/export system.
Debug WebAssembly in browser DevTools and understand the generated binary format.
Use WASI to run WebAssembly outside the browser with system interface access.
Understand JavaScript engine internals at a high level: what V8’s TurboFan does, how SpiderMonkey’s IonMonkey optimizes, and how these relate to traditional compiler technology.
These are practical skills that enable you to work at the intersection of high-level and low-level programming.
Your Background Assessment: C, Assembly, and the Unix Philosophy
What I’m Assuming You Know
Looking at your GitHub portfolio (https://github.com/Chubek), I can see projects involving:
Compiler construction: Lexers, parsers, and code generation.
Assembly language programming: x86, ARM, and potentially other architectures.
Systems-level C/C++: Memory management, pointer manipulation, bit operations.
Unix/Linux tools: Shell scripting, text processing, build systems.
Language implementation: Interpreters, VMs, and runtime systems.
This tells me you think in terms of:
Memory as a flat array of bytes: Addresses, pointers, and manual layout.
Explicit control flow: Jumps, branches, and call/return mechanisms.
Minimal abstractions: You prefer mechanisms you can see through.
Tools as composable primitives: The Unix philosophy of small, focused utilities.
The Conceptual Bridge
JavaScript violates many of your intuitions:
| Systems Intuition | JavaScript Reality | The Truth Underneath |
|---|---|---|
| Variables have types | Variables hold references to values | Values have types; variables are just names |
| Memory is explicit | Garbage collection | Generational GC with compaction, hidden from you |
| Functions are code | Functions are objects | Functions are closures with [[Environment]] slots |
| == tests equality | == converts types | Abstract Equality uses a complex coercion algorithm |
| Performance is predictable | JIT compilation is non-deterministic | Hidden classes and inline caches create performance cliffs |
The key to understanding JavaScript is realizing that it’s a high-level language with low-level implementation details that matter. The spec describes abstract operations, but V8 implements them with sophisticated compiler techniques. You need to understand both layers.
The Translation Guide
Throughout this book, I’ll provide “translation guides” that map JavaScript concepts to systems-level equivalents:
Closures ↔︎ Stack frames captured on the heap
Prototypes ↔︎ Vtable pointers with delegation
The event loop ↔︎ select()/epoll() with callback queues
Promises ↔︎ Continuation-passing style with state machines
Typed Arrays ↔︎ Pointers to raw memory with typed access
WebAssembly linear memory ↔︎ mmap()-allocated region
These analogies aren’t perfect, but they provide mental models that map to what you already know.
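Two of these analogies can be tried directly. The following sketch treats an ArrayBuffer as raw memory viewed through differently typed "pointers", then shows a WebAssembly linear memory growing in 64 KiB pages. The variable names and the chosen values are illustrative assumptions.

```javascript
// One 8-byte allocation, viewed two ways (similar to casting a pointer in C).
const buffer = new ArrayBuffer(8);
const bytes = new Uint8Array(buffer);
const view = new DataView(buffer);

// DataView makes byte order explicit; `true` requests little-endian.
view.setUint32(0, 0xdeadbeef, true);
const layout = Array.from(bytes.subarray(0, 4), (b) => b.toString(16)).join(" ");
console.log(layout); // ef be ad de

// Reading the same four bytes as big-endian yields the byte-swapped value.
const swapped = view.getUint32(0, false);
console.log(swapped === 0xefbeadde); // true

// Linear memory is a resizable ArrayBuffer measured in 64 KiB pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });
const oldBuffer = memory.buffer;
const previousPages = memory.grow(1);     // returns the old size in pages
console.log(previousPages);               // 1
console.log(oldBuffer.byteLength);        // 0: the old buffer is detached
console.log(memory.buffer.byteLength);    // 131072
```

The detached buffer after grow() is roughly the realloc hazard in disguise: any view created before the growth no longer points at live memory and has to be recreated.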
The Systems Programmer’s Advantage
You have advantages that web developers often lack:
You understand the cost of abstractions: When we discuss engine internals, you’ll grasp why certain patterns perform well.
You can read specifications: ECMA-262 uses algorithmic pseudocode similar to compiler textbooks.
You know binary formats: WebAssembly’s binary encoding will make immediate sense.
You understand ISAs: The WebAssembly instruction set is simpler than x86 or ARM.
You’re comfortable with low-level debugging: Understanding stack traces and heap dumps comes naturally.
Where web developers struggle with “why is this slow?”, you’ll be able to profile, understand deoptimization, and fix the root cause.
How to Use This Book
Reading Strategies
This book is designed to support multiple reading strategies:
Strategy 1: Linear Read (Recommended for First Pass)
Read chapters 1-18 in order. This builds concepts progressively:
Chapters 1-3: Foundation—language semantics and execution model.
Chapters 4-7: Core JavaScript—functions, objects, arrays, modules.
Chapters 8-10: Runtime environments—browser, Node.js, extensions.
Chapters 11-12: Transpilation—building compilers that target JavaScript.
Chapters 13-17: WebAssembly—from fundamentals to building a compiler.
Chapter 18: Engine internals—how JavaScript and WebAssembly are implemented.
Strategy 2: Goal-Oriented Read
Jump to the chapters that support your immediate goals:
Goal: Build a transpiler → Chapters 1-2 (foundation), 7 (modules), 11-12 (transpilation)
Goal: Browser extensions → Chapters 1-2 (foundation), 8 (browser environment), 10 (extensions)
Goal: WebAssembly compiler → Chapters 1-2 (foundation), 13-16 (WebAssembly)
Goal: Node.js tools → Chapters 1-2 (foundation), 7 (modules), 9 (Node.js)
Goal: Understand engines → Chapters 1-3 (foundation), 18 (engines)
Strategy 3: Reference Use
Keep the book nearby while working on projects. Use the appendices for quick lookups:
Appendix A: Syntax quick reference for when you forget the syntax.
Appendix B: Feature compatibility for when you need to support older environments.
Appendix C: WebAssembly instruction reference for when you’re hand-coding WAT.
Appendix D: Tools and libraries for when you need to find the right library.
Working with the Specifications
Throughout the book, I reference:
ECMA-262 (the JavaScript specification): ecma-262-std.pdf (847 pages)
WebAssembly Specification: wasm-spec.pdf (321 pages)
These documents are dense but invaluable. I’ll teach you how to read them:
ECMA-262 uses abstract operations (named algorithms, whose invocations are often prefixed with ? or ! to show how errors are handled) and algorithmic steps. It’s actually quite readable once you understand the notation.
WebAssembly Spec uses formal semantics (mathematical notation), which is more challenging but precise.
I’ll provide page references so you can cross-reference the specifications. For example:
The [[Environment]] internal slot of a function object (ECMA-262, §10.2.3, p. 203) stores the lexical environment in which the function was created, enabling closures.
This means you can turn to page 203 of
ecma-262-std.pdf to see the actual specification
text.
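As a quick illustration of what the [[Environment]] slot buys you, here is a minimal sketch in which two function objects keep separate captured environments alive after their creator has returned. The names makeCounter, first, and second are hypothetical.

```javascript
// Each call to makeCounter creates a fresh environment holding `count`.
// The returned methods capture that environment, so `count` outlives the call.
function makeCounter(start) {
  let count = start;
  return {
    increment() {
      count += 1;
      return count;
    },
    current() {
      return count;
    },
  };
}

const first = makeCounter(0);
const second = makeCounter(100);
first.increment();
first.increment();
second.increment();

console.log(first.current());  // 2
console.log(second.current()); // 101
```

In systems terms, `count` behaves like a heap-allocated cell that both methods point to, which is why the two counters never interfere.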
Code Examples and Conventions
All code examples follow these conventions:
JavaScript Code
// Comments explain what's happening
function example(parameter) {
  // We use modern ES6+ syntax by default
  const result = parameter * 2;
  return result;
}
WebAssembly Text Format (WAT)
;; Comments in WAT use semicolons
(module
  (func $example (param $parameter i32) (result i32)
    ;; Stack-based operations
    local.get $parameter
    i32.const 2
    i32.mul
  )
)
C/C++ (for comparison)
// C code for comparison when useful
int example(int parameter) {
  int result = parameter * 2;
  return result;
}
Shell Commands
# Commands you run in a terminal
npm install some-package
node script.js
Exercises and Experiments
Each chapter includes “Experiments” sections suggesting things to try. These are not traditional exercises with solutions—they’re explorations:
Run this code in the browser console and observe the output.
Modify this example and see what breaks.
Use the debugger to step through execution.
Read this section of the spec and see if you can understand it now.
I don’t provide solutions because the point is exploration and building intuition. When you modify code and it breaks in an unexpected way, that’s when you learn.
Online Resources
While I’ve tried to make this book comprehensive, you’ll need external resources:
MDN Web Docs (https://developer.mozilla.org): The best JavaScript reference.
Node.js Documentation (https://nodejs.org/docs): For Node.js-specific APIs.
WebAssembly.org (https://webassembly.org): Official WebAssembly documentation.
WABT (https://github.com/WebAssembly/wabt): WebAssembly Binary Toolkit for working with Wasm.
V8 Blog (https://v8.dev/blog): Deep dives into engine internals.
I’ll reference these throughout the book.
Feedback and Errata
This is a technical book about rapidly evolving technologies. JavaScript evolves yearly (ES2025, ES2026, etc.), and WebAssembly is still adding features. I’ve focused on:
Stable features: Things unlikely to change (closures, prototypes, the event loop).
Current standards: ES2025 and WebAssembly 3.0 as of publication.
Timeless concepts: Compilation strategies, runtime models, and engineering trade-offs.
When new features arrive, the fundamentals we cover will help you understand them.
Acknowledgments
Standing on the Shoulders of Giants
This book wouldn’t exist without:
Dr. Axel Rauschmayer’s “JavaScript for Impatient Programmers”: A clear, well-organized introduction that inspired parts of our structure.
The ECMA TC39 Committee: For stewarding JavaScript’s evolution with care.
The WebAssembly Community Group: For designing a remarkably elegant bytecode format.
Engine Developers: The teams behind V8, SpiderMonkey, JavaScriptCore, and ChakraCore who push the boundaries of JIT compilation.
The Open Source Community: For creating tools like Babel, Acorn, and WABT that make working with these technologies practical.
Personal Acknowledgments
To Chubek (the intended reader): I’ve studied your GitHub projects and tried to write the book I think you need. Your background in compilers, assembly, and systems programming is evident, and I’ve assumed that level of sophistication throughout. If I’ve misjudged, let me know—this book is for you.
To the broader systems programming community: Many of you are skeptical of JavaScript, and rightfully so. I hope this book convinces you that underneath the chaos, there’s interesting technology worth understanding. Not because you’ll love JavaScript (you might not), but because it’s infrastructure you can’t avoid, and infrastructure should be understood.
Tools and Environment
This book was written using:
ECMA-262 16th Edition (June 2025): The authoritative JavaScript specification.
WebAssembly Specification 3.0: The latest WebAssembly standard.
Node.js v20+: For testing all code examples.
Firefox Developer Edition and Chrome/Chromium: For browser-based examples.
WABT (WebAssembly Binary Toolkit): For assembling and disassembling WebAssembly.
All code examples have been tested and work as shown (or fail in the instructive way described).
A Note on Attitude
If you’ve gotten this far, you’re committed to understanding JavaScript and WebAssembly despite your misgivings. That’s good. You don’t have to love these technologies—I’m not asking you to. But you should understand them with the same rigor you’d apply to any other systems-level technology.
JavaScript is messy, yes. It has more warts than most languages. But it’s also the result of decades of evolution, billions of dollars of investment in engine optimization, and the collective work of thousands of talented engineers. Dismissing it as a “toy language for web developers” is intellectually lazy.
WebAssembly, on the other hand, is a genuinely well-designed system. It learned from Java’s mistakes, avoided JavaScript’s legacy baggage, and provides a clean compilation target with strong security properties. It deserves your respect.
Let’s begin.
Today’s date: 1404/07/21 (Jalali) / 2025/10/13 (Gregorian)
Chapter 1: The ECMAScript Standard and Its Discontents
From Mocha to ES2025: A Brief, Opinionated History
The Ten-Day Language
In April 1995, Brendan Eich joined Netscape Communications Corporation with a specific mandate: create a scripting language for the web browser. The timeline was absurd: the first prototype had to be finished in ten days that May. The result was Mocha, later renamed LiveScript, and finally JavaScript to capitalize on Java’s marketing momentum (despite having almost nothing in common with Java beyond some syntactic similarities).
This rushed creation explains many of JavaScript’s quirks. When you have ten days to design a language, you don’t have time to think through every edge case. You borrow liberally from existing languages—Scheme for first-class functions and closures, Self for prototypal inheritance, Java for syntax—and you ship it. The consequences of those hasty decisions compound over decades when backward compatibility becomes sacrosanct.
Systems Programmer’s Note: Imagine designing a processor ISA in ten days. You’d probably copy x86’s instruction encoding, ARM’s conditional execution, and MIPS’s delayed branches. Then imagine that ISA being frozen for 30 years with billions of devices depending on every quirk. That’s JavaScript’s predicament.
The Standardization Process (1997-1999)
By 1996, JavaScript was too important to remain a proprietary Netscape technology. Microsoft had reverse-engineered it as JScript for Internet Explorer, introducing subtle incompatibilities. The Browser Wars had begun, and developers were caught in the crossfire.
Netscape submitted JavaScript to Ecma International (a European standards body) in 1996. The Technical Committee 39 (TC39) was formed to standardize the language. The first edition of ECMA-262 was published in June 1997, codifying what would become known as ECMAScript 1.
Why “ECMAScript” and not “JavaScript”? Sun Microsystems (later acquired by Oracle) owned the trademark “JavaScript.” To avoid legal issues, the standard uses “ECMAScript” as the formal name. In practice, everyone still says “JavaScript.”
The timeline of early editions:
ES1 (June 1997): The baseline. Standardized what Netscape Navigator 4 and IE 4 mostly agreed on.
ES2 (June 1998): Editorial changes to align with ISO/IEC 16262.
ES3 (December 1999): Added regular expressions, better string handling, try/catch, and other improvements. This became the stable base for years.
Then came ES4, the edition that never was.
The ES4 Debacle (2000-2008)
ES4 was ambitious—perhaps too ambitious. The committee wanted to add:
Classes and classical inheritance: Moving away from prototypes.
Modules: A proper module system instead of global scope pollution.
Namespaces: Organizing code into hierarchical structures.
Type annotations: Optional static typing.
Generators and iterators: Lazy evaluation and custom iteration.
Operator overloading: Custom behavior for +, *, etc.
This was a complete overhaul of the language. Microsoft, Yahoo, and others opposed it, arguing it was too complex and strayed too far from JavaScript’s roots. Adobe had ActionScript 3 (for Flash), which implemented many ES4 proposals, but the broader JavaScript community was divided.
The conflict became political. Two factions emerged:
ES4 maximalists: Led by Mozilla and Adobe, wanting a comprehensive upgrade.
ES3.1 minimalists: Led by Microsoft and Yahoo, wanting incremental improvements.
By 2008, it was clear ES4 wouldn’t pass. In a compromise, ES4 was abandoned, and ES3.1 became ES5. Some ES4 ideas (generators, modules, classes) would later appear in ES6, but in different forms.
Engineering Lesson: Standards by committee are slow, but that’s a feature, not a bug. Backward compatibility and consensus prevent one vendor from fragmenting the ecosystem. The price is conservatism and occasionally absurd compromises.
The Modern Era (ES5 to ES2025)
ES5 (December 2009)
ES5 brought modest but important improvements:
Strict mode ("use strict";): Opts into stricter error checking and removes dangerous features.
JSON: Native JSON.parse() and JSON.stringify().
Array methods: map, filter, reduce, forEach, etc.
Property descriptors: Fine-grained control over object properties (Object.defineProperty()).
Getters and setters: Accessor properties.
ES5 was conservative, but it stabilized the language. It’s the last version universally supported without transpilation (though even old IE versions had spotty support).
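A short sketch of the ES5 additions listed above, written with later syntax (const, arrow functions) for brevity. The helper tryWrite and the sample objects are hypothetical names chosen for illustration.

```javascript
// Strict mode can be enabled per function. This helper reports whether an
// assignment throws instead of failing silently.
function tryWrite(target, key, value) {
  "use strict";
  try {
    target[key] = value;
    return null;
  } catch (error) {
    return error;
  }
}

// Property descriptor: a read-only, non-configurable, enumerable property.
const point = {};
Object.defineProperty(point, "x", {
  value: 3,
  writable: false,
  enumerable: true,
  configurable: false,
});
const writeError = tryWrite(point, "x", 10); // a TypeError under strict mode

// Accessor properties: computed on read, validated or converted on write.
const temperature = {
  celsius: 25,
  get fahrenheit() {
    return (this.celsius * 9) / 5 + 32;
  },
  set fahrenheit(value) {
    this.celsius = ((value - 32) * 5) / 9;
  },
};
const initialFahrenheit = temperature.fahrenheit; // 77
temperature.fahrenheit = 212;                     // celsius becomes 100

// Array methods and native JSON.
const sumOfEvenSquares = [1, 2, 3, 4, 5, 6]
  .filter((n) => n % 2 === 0)
  .map((n) => n * n)
  .reduce((total, n) => total + n, 0);            // 4 + 16 + 36
const roundTrip = JSON.parse(JSON.stringify(point));

console.log(writeError instanceof TypeError, point.x, temperature.celsius);
console.log(initialFahrenheit, sumOfEvenSquares, roundTrip);
```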
ES6/ES2015: The Turning Point
ES6, officially named ES2015, was transformative. After years of stagnation, the committee delivered a massive upgrade:
let and const: Block-scoped variables, finally!
Arrow functions: Lexical this binding and concise syntax.
Classes: Syntactic sugar over prototypes, but cleaner.
Modules: import and export for proper modularity.
Promises: Standardized asynchronous programming.
Generators: Functions that can pause and resume (function*).
Template literals: String interpolation with `Hello, ${name}`.
Destructuring: Pattern matching for assignment.
Rest/spread operators: ...args for functions and arrays.
Symbols: A new primitive type for unique identifiers.
Iterators and for...of: Standardized iteration protocol.
Map, Set, WeakMap, WeakSet: New collection types.
Typed arrays: Uint8Array, Int32Array, etc., for binary data.
ES6 was so large it took years for engines to fully implement. It marked the transition from “JavaScript is a toy language” to “JavaScript is a serious programming platform.”
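Several of the ES6 features listed above interlock through the iteration protocol. The sketch below, using a hypothetical Vector class, shows a class, a generator method, destructuring, rest/spread, for...of, template literals, and Map working together.

```javascript
// A small class that opts into the iteration protocol via a generator method.
class Vector {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  *[Symbol.iterator]() {
    yield this.x;
    yield this.y;
  }
  toString() {
    return `(${this.x}, ${this.y})`; // template literal
  }
}

const v = new Vector(3, 4);
const [x, y] = v;                      // destructuring consumes the iterator
const sum = (...values) => values.reduce((a, b) => a + b, 0); // arrow + rest
const total = sum(...v);               // spread consumes it as well

// An infinite generator, consumed lazily with for...of.
function* naturals() {
  let n = 0;
  while (true) yield n++;
}
const firstThree = [];
for (const n of naturals()) {
  if (n >= 3) break;
  firstThree.push(n);
}

// Map accepts any value as a key, including objects.
const labels = new Map([[v, "velocity"]]);

console.log(String(v), x, y, total, firstThree, labels.get(v));
```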
The Yearly Release Cycle (ES2016-ES2025)
After ES6, TC39 adopted a yearly release cycle. Each edition is named by year: ES2016, ES2017, etc. Features are smaller and more focused:
ES2016 (ES7): Array.prototype.includes(), exponentiation operator (**).
ES2017 (ES8): async/await, Object.entries(), Object.values(), shared memory and atomics.
ES2018 (ES9): Asynchronous iteration, rest/spread for objects, Promise.prototype.finally().
ES2019 (ES10): Array.prototype.flat(), Object.fromEntries(), optional catch binding.
ES2020 (ES11): Optional chaining (?.), nullish coalescing (??), BigInt, Promise.allSettled(), globalThis.
ES2021 (ES12): Logical assignment operators (&&=, ||=, ??=), numeric separators (1_000_000), String.prototype.replaceAll().
ES2022 (ES13): Class fields, private methods (#private), top-level await, Array.prototype.at().
ES2023 (ES14): Array.prototype.findLast(), Array.prototype.toSorted(), hashbang grammar (#!/usr/bin/env node).
ES2024 (ES15): Promise.withResolvers(), Object.groupBy(), ArrayBuffer transfer, well-formed Unicode strings.
ES2025 (ES16): Regular expression modifiers, Set methods (.union(), .intersection()), duplicate named capture groups in regex.
Key Insight: The yearly cycle allows incremental evolution. TC39 uses a staged proposal process that runs from Stage 0 to Stage 4 (Stage 0: Strawperson, Stage 1: Proposal, Stage 2: Draft, Stage 3: Candidate, Stage 4: Finished), with an intermediate Stage 2.7 added in 2023. Only Stage 4 proposals make it into the spec.
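To give a feel for how small these yearly additions are, here is a sketch that uses a handful of features from ES2019 through ES2022, all of which run on the Node.js versions this book targets. The config object and the Stack class are hypothetical examples.

```javascript
const config = { server: { port: 8080 }, retries: 0 };

// ES2020: optional chaining and nullish coalescing.
const port = config.server?.port ?? 3000;          // 8080
const host = config.server?.host ?? "localhost";   // missing, so the default
const retries = config.retries ?? 5;               // 0 is kept (|| would give 5)

// ES2021: numeric separators and logical assignment.
const timeoutMs = config.client?.timeout ?? 1_000; // 1000
config.retries ??= 3;                              // unchanged: 0 is not nullish

// ES2020: arbitrary-precision integers.
const twoToThe64 = 2n ** 64n;

// ES2022: private fields and Array.prototype.at().
class Stack {
  #items = [];
  push(value) {
    this.#items.push(value);
    return this;
  }
  peek() {
    return this.#items.at(-1);
  }
}
const top = new Stack().push("a").push("b").peek();

// ES2019: Object.fromEntries() and Array.prototype.flat().
const lookup = Object.fromEntries([["a", 1], ["b", 2]]);
const flattened = [1, [2, [3]]].flat(Infinity);

console.log(port, host, retries, timeoutMs, twoToThe64, top, lookup, flattened);
```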
The Shadow History: Engines Drive Innovation
While TC39 standardizes, browser vendors innovate. V8 (Chrome/Node.js), SpiderMonkey (Firefox), and JavaScriptCore (Safari) compete on performance. Several features and implementation techniques emerged from engine experiments:
V8’s hidden classes: Optimization technique that became foundational to modern JS performance.
asm.js: Mozilla’s subset of JavaScript designed for AOT compilation, precursor to WebAssembly.
Typed Arrays: Originally a WebGL requirement (need fast binary data), later standardized.
The relationship is symbiotic: engines push boundaries, TC39 standardizes what works, and the spec constrains engine behavior to ensure interoperability.
The ECMA-262 Document: Structure and Navigation
Obtaining the Specification
ECMA-262 is freely available at https://ecma-international.org/publications-and-standards/standards/ecma-262/. The document is published annually, with the most recent being the 16th Edition (June 2025)—referred to as ES2025.
Physical Properties: The PDF
(ecma-262-std.pdf) is 847 pages. It’s dense, technical,
and uses specialized notation. Don’t try to read it cover-to-cover;
treat it as a reference manual.
High-Level Structure
The specification is organized into major sections:
Front Matter (Pages 1-24)
Scope (§1, p. 1): One-paragraph overview of what ECMAScript is.
Conformance (§2, pp. 1-2): What it means for an implementation to be conformant.
Normative References (§3, p. 2): References to Unicode, ISO 8601 (dates), etc.
Overview (§4, pp. 2-11): High-level description of the language, hosts (browser vs. Node.js), and terms.
Notational Conventions (§5, pp. 11-24): How to read the spec’s algorithmic notation.
Systems Programmer’s Note: Think of this as the “Instruction Set Architecture” manual’s introductory chapters—defining notation, conventions, and scope before diving into opcodes.
Core Semantics (Pages 25-168)
ECMAScript Data Types and Values (§6, pp. 25-62): Language types (the primitives Undefined, Null, Boolean, String, Symbol, Number, and BigInt, plus Object) and specification types (internal constructs like Reference, Completion Record).
Abstract Operations (§7, pp. 63-92): Reusable algorithms for type conversion, comparisons, and object operations. Think of these as the “standard library” of the spec itself.
Syntax-Directed Operations (§8, pp. 93-136): How syntax maps to semantics (evaluation rules, scope analysis).
Executable Code and Execution Contexts (§9, pp. 137-168): The runtime environment—how code executes, what an execution context is, the job queue, agents (threads), and realms.
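The language types of §6 do not map one-to-one onto what typeof reports. As a rough sketch, the hypothetical helper languageType below recovers the spec-level type tag from a value, working around the two well-known mismatches (null and functions).

```javascript
// Hypothetical helper: map a value to the name of its ECMAScript language type.
function languageType(value) {
  if (value === null) return "Null"; // typeof null is "object", a historical quirk
  switch (typeof value) {
    case "undefined": return "Undefined";
    case "boolean":   return "Boolean";
    case "string":    return "String";
    case "symbol":    return "Symbol";
    case "number":    return "Number";
    case "bigint":    return "BigInt";
    default:          return "Object"; // covers typeof "object" and "function"
  }
}

const samples = [undefined, null, true, "text", Symbol("id"), 42, 10n, {}, () => {}];
const tags = samples.map(languageType);
console.log(tags.join(" "));
// Undefined Null Boolean String Symbol Number BigInt Object Object
```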
Object Model (Pages 169-213)
- Ordinary and Exotic Objects Behaviours (§10, pp. 169-213): How objects work internally. Ordinary objects follow standard property access rules. Exotic objects (Arrays, bound functions, Proxies) have custom internal methods.
Key Concept: JavaScript objects have
internal slots (like [[Prototype]],
[[Extensible]]) and internal methods
(like [[Get]], [[Set]]). These are
specification mechanisms, not directly accessible in code.
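Although the slots themselves are invisible, several built-in functions are thin wrappers around them. The sketch below, with hypothetical objects base and derived, observes [[Prototype]], [[Extensible]], [[Get]], and [[Set]] through Object and Reflect.

```javascript
const base = {
  greet() {
    return "hi";
  },
};
const derived = Object.create(base); // sets derived's [[Prototype]] to base

// [[Prototype]] is readable through Object.getPrototypeOf().
const protoIsBase = Object.getPrototypeOf(derived) === base;

// [[Get]] walks the prototype chain; Reflect.get() invokes it directly.
const inheritedGreet = Reflect.get(derived, "greet") === base.greet;
const ownGreet = Object.prototype.hasOwnProperty.call(derived, "greet");

// [[Extensible]] is readable and can be switched off, but never back on.
const extensibleBefore = Object.isExtensible(derived);
Object.preventExtensions(derived);
const extensibleAfter = Object.isExtensible(derived);

// [[Set]] reports failure as `false` when the object cannot take new properties.
const setSucceeded = Reflect.set(derived, "extra", 1);

console.log(protoIsBase, inheritedGreet, ownGreet);          // true true false
console.log(extensibleBefore, extensibleAfter, setSucceeded); // true false false
```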
Language Syntax and Semantics (Pages 214-699)
This is the bulk of the spec:
Source Text (§11, pp. 214-217): How source code is interpreted (UTF-16, BOM handling).
Lexical Grammar (§12, pp. 218-240): Tokens—identifiers, keywords, literals, operators.
Expressions (§13, pp. 241-356): How expressions are parsed and evaluated.
Statements and Declarations (§14, pp. 357-446): Control flow, loops, variable declarations.
Functions and Classes (§15, pp. 447-522): Function definitions, arrow functions, class syntax.
Scripts and Modules (§16, pp. 523-542): Top-level code execution, module imports/exports.
Error Handling (§17, pp. 543-550): Native error types and throw/catch semantics.
Built-in Objects (Pages 551-826)
Every standard object (Object, Array,
Function, Promise, Map,
Set, etc.) is specified here:
Fundamental Objects (§18-20): Object, Function, Boolean, etc.
Numbers and Dates (§21): Number, BigInt, Math, Date.
Text Processing (§22): String, RegExp.
Indexed Collections (§23): Array, TypedArray.
Keyed Collections (§24): Map, Set, WeakMap, WeakSet.
Structured Data (§25): ArrayBuffer, DataView, Atomics, JSON.
Managing Memory (§26): WeakRef, FinalizationRegistry.
Control Abstraction Objects (§27): Iterators, generators, Promise, async/await.
Reflection (§28): Reflect, Proxy, Module introspection.
Appendices (Pages 827-847)
Grammar Summary (Annex A, pp. 827-835): Complete lexical and syntactic grammar in one place.
Strict Mode (Annex C, p. 836): Differences between strict and sloppy mode.
Corrections and Clarifications (Annex D, pp. 837-842): Changes from previous editions.
Bibliography: References to Unicode, ISO standards, etc.
How to Navigate the Spec
Reading Algorithmic Steps
The spec writes its algorithms as nested pseudocode steps; the named, reusable algorithms are called abstract operations. Example from §7.1.1 (ToPrimitive, p. 63):
ToPrimitive ( input [ , preferredType ] )
- If Type(input) is Object, then
  - If preferredType is not present, let hint be “default”.
  - Else if preferredType is STRING, let hint be “string”.
  - Else,
    - Assert: preferredType is NUMBER.
    - Let hint be “number”.
  - Let exoticToPrim be ? GetMethod(input, @@toPrimitive).
  - If exoticToPrim is not undefined, then
    - Let result be ? Call(exoticToPrim, input, « hint »).
    - If Type(result) is not Object, return result.
    - Throw a TypeError exception.
  - If hint is “default”, set hint to “number”.
  - Return ? OrdinaryToPrimitive(input, hint).
- Return input.
Notation guide:
?: Propagates exceptions. If the called operation returns an abrupt completion (error), this step returns immediately with that error.
!: Asserts the operation never fails. Used when the spec guarantees success.
[ , optional ]: Optional parameters.
« »: A list (like an array).
Type(x): Returns the type tag of x (Object, Number, etc.).
@@toPrimitive: A well-known symbol (Symbol.toPrimitive).
Systems Analogy: This is like reading CPU microcode or a state machine definition. Each step is deterministic; follow them sequentially.
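The steps above can be approximated in plain JavaScript. This is a rough sketch rather than the spec text: `isObject`, `toPrimitiveSketch`, and `ordinaryToPrimitiveSketch` are hypothetical names, and `Symbol.toPrimitive` stands in for @@toPrimitive.

```javascript
// Hypothetical helper: true for anything whose language type is Object.
function isObject(v) {
  return (typeof v === "object" && v !== null) || typeof v === "function";
}

// Sketch of OrdinaryToPrimitive: the hint decides which method is tried first.
function ordinaryToPrimitiveSketch(obj, hint) {
  const order = hint === "string" ? ["toString", "valueOf"] : ["valueOf", "toString"];
  for (const name of order) {
    const method = obj[name];
    if (typeof method === "function") {
      const result = method.call(obj);
      if (!isObject(result)) return result;
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

// Sketch of ToPrimitive, following the listed steps in order.
function toPrimitiveSketch(input, preferredType) {
  if (!isObject(input)) return input; // last step: primitives pass through
  let hint = preferredType === undefined ? "default" : preferredType;
  const exoticToPrim = input[Symbol.toPrimitive]; // GetMethod(input, @@toPrimitive)
  if (exoticToPrim !== undefined && exoticToPrim !== null) {
    if (typeof exoticToPrim !== "function") {
      throw new TypeError("@@toPrimitive is not callable");
    }
    const result = exoticToPrim.call(input, hint);
    if (!isObject(result)) return result;
    throw new TypeError("@@toPrimitive returned an object");
  }
  if (hint === "default") hint = "number";
  return ordinaryToPrimitiveSketch(input, hint);
}

console.log(JSON.stringify(toPrimitiveSketch([])));   // ""
console.log(toPrimitiveSketch({}));                   // [object Object]
console.log(toPrimitiveSketch(new Date(0), "number")); // 0
```

The early `return input` corresponds to the final "Return input" step; every `throw` models an abrupt completion that a `?` step would propagate.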
Finding Information Quickly
Use the table of contents (pp. i-x). It’s
detailed. For example, to find how Array.prototype.map
works:
Look up “Indexed Collections” → Section 24.
Navigate to § → “Array.prototype.map” (p. 588).
The spec has internal hyperlinks in the PDF. Click on a reference like “§7.1.1” to jump there.
Use the index (pages 827+). Search for “ToPrimitive,” “Completion Record,” “Lexical Environment,” etc.
Cross-Referencing with MDN
While ECMA-262 is authoritative, MDN Web Docs (https://developer.mozilla.org) is more readable for learning. MDN explains what and why; the spec explains how and edge cases.
Workflow:
Learn from MDN.
Verify behavior in the spec.
Understand nuances (like why [] == ![] is true) by tracing through abstract operations.
Internal Slots and Methods
JavaScript objects have internal slots (data) and internal methods (behavior). These are spec constructs, not accessible in code.
Example: Every function has:
[[Environment]] (internal slot): The lexical environment where it was created (enables closures).
[[Call]] (internal method): What happens when you invoke it.
Ordinary objects implement [[Get]] and
[[Set]] for property access. Exotic objects (like
Proxies) override these.
Systems Analogy: Internal slots are like private fields in a C++ class. Internal methods are virtual functions—exotic objects provide custom implementations.
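Although [[Get]] itself is not reachable from code, a Proxy's `get` trap shows what overriding it looks like. A minimal illustration, with `ordinary`, `exotic`, and `logged` as made-up names:

```javascript
// An ordinary object uses the default [[Get]]; the Proxy supplies its own
// behavior through the "get" trap.
const ordinary = { x: 1 };
const logged = [];
const exotic = new Proxy(ordinary, {
  get(target, key, receiver) {
    logged.push(String(key)); // record every property read
    return key in target ? Reflect.get(target, key, receiver) : "missing";
  },
});

const viaOrdinary = ordinary.y; // default [[Get]]
const viaExotic = exotic.y;     // custom [[Get]]
const viaExoticX = exotic.x;    // falls through to the target

console.log(viaOrdinary, viaExotic, viaExoticX); // undefined missing 1
console.log(logged);                             // [ 'y', 'x' ]
```

`Reflect.get` is the programmatic way to invoke the default behavior from inside the trap.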
Completion Records
Most abstract operations return a Completion Record, which is either:
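Normal completion: { [[Type]]: normal, [[Value]]: v }
Abrupt completion: Errors (throw), return, break, continue.
The ? prefix is shorthand for "if this completion is abrupt, return it from the current algorithm immediately; otherwise unwrap its [[Value]]". Repeated at every level, that is how an error propagates up the call stack.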
Why this matters: Understanding Completion Records explains how exceptions work internally and why certain operations can fail.
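One way to internalize this is to model Completion Records as plain objects. The helper names below (`normalCompletion`, `throwCompletion`, `toNumberSketch`, `doubleIt`) are invented for illustration; engines do not literally allocate such records.

```javascript
// Hypothetical model of Completion Records as plain objects.
const normalCompletion = (value) => ({ type: "normal", value });
const throwCompletion = (value) => ({ type: "throw", value });

// Models an abstract operation that can fail: ToNumber throws on a Symbol.
function toNumberSketch(v) {
  if (typeof v === "symbol") {
    return throwCompletion(new TypeError("Cannot convert a Symbol to a number"));
  }
  return normalCompletion(Number(v));
}

// "Let n be ? ToNumber(v)" expands to: if abrupt, return it; else unwrap.
function doubleIt(v) {
  const completion = toNumberSketch(v);
  if (completion.type !== "normal") return completion; // the ? shorthand
  const n = completion.value;
  return normalCompletion(n * 2);
}

console.log(doubleIt("21").value);    // 42
console.log(doubleIt(Symbol()).type); // throw
```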
TC39 and the Standards Process
Who is TC39?
Technical Committee 39 is the Ecma International committee responsible for ECMAScript. Members include:
Browser vendors: Google (V8), Mozilla (SpiderMonkey), Apple (JavaScriptCore), Microsoft (formerly Chakra, now contributes to V8).
Large tech companies: Facebook/Meta, Netflix, PayPal, Airbnb, Bloomberg.
Individual experts: Academics, language designers, and community representatives.
TC39 meets every two months (in-person or remote) to discuss proposals.
The Proposal Process
New features follow a stage-based process:
Stage 0: Strawperson
Anyone can submit an idea. It’s just a concept, not a formal proposal. Many ideas die here.
Example: “What if JavaScript had pattern matching?”
Stage 1: Proposal
The committee agrees the problem is worth solving. A champion (a TC39 member or invited expert) takes ownership. The proposal needs:
A clear problem statement.
High-level API design.
Potential challenges identified.
Example: Pattern matching reaches Stage 1 with a champion outlining use cases and syntax options.
Stage 2: Draft
The proposal has a formal specification text (written in ECMA-262 notation). The committee believes the feature will eventually be standardized, but details may change.
Engines may implement experimental versions (behind flags) for testing.
Example (hypothetical): Pattern matching gets draft spec text. An engine such as V8 could then ship it behind a flag, for instance a made-up --harmony-pattern-matching.
Stage 3: Candidate
The spec text is complete. Engines are expected to implement it for real-world testing. Only critical issues will cause changes.
Example: Pattern matching is implemented in Firefox Nightly and Chrome Canary. Feedback from developers refines edge cases.
Stage 4: Finished
The feature is ready for inclusion in the next annual release. Two independent implementations must exist, and significant real-world usage (or test262 tests) must validate it.
Example (hypothetical): Pattern matching ships in ES2026 after successful Stage 3 testing.
Consensus-Based Decision Making
TC39 operates by consensus, not majority vote. This means:
Everyone must agree (or at least not object strongly enough to block).
Compromises are common: Features are diluted or redesigned to satisfy objections.
Progress is slow but stable: Hasty decisions (like ES4) don’t repeat.
Engineering Trade-off: Consensus prevents fragmentation but can lead to “design by committee” where elegant solutions are compromised for political reasons.
Test262: The Conformance Test Suite
Test262 (https://github.com/tc39/test262) is the official conformance test suite. It has over 70,000 tests covering every feature in ECMA-262.
Engines run Test262 to ensure compliance. If your JavaScript engine fails Test262, it’s not conformant.
For implementers: Test262 is your ground truth. If you’re building a JavaScript compiler or interpreter, you must pass Test262.
Strict vs. Sloppy Mode: Why It Matters to You
The Historical Accident
JavaScript was designed to be forgiving. If you forgot
var, the variable became global. If you used reserved
words as identifiers, it often worked. If you assigned to
undefined, it silently failed.
This forgiveness was a mistake. It led to bugs, performance cliffs, and security issues. But backward compatibility meant these mistakes couldn’t be fixed without breaking the web.
Solution: ES5 introduced strict mode in 2009.
Enabling Strict Mode
Add "use strict"; at the top of a file or
function:
"use strict";
function example() {
// This function runs in strict mode
x = 10; // ReferenceError: x is not defined
}
In ES6 modules, strict mode is implicit. You don’t need "use strict"; because modules are always strict.
// Inside an ES6 module file (imported via <script type="module">)
x = 10; // ReferenceError, even without "use strict"
Key Differences
| Sloppy Mode (Default) | Strict Mode | Why It Matters |
|---|---|---|
| Undeclared variables become global | ReferenceError | Prevents accidental globals |
| Assignment to read-only properties fails silently | TypeError | Catches bugs earlier |
| delete on non-configurable properties fails silently | TypeError | Avoids confusion |
| Octal literals (0123) allowed | SyntaxError | Octal is error-prone |
| with statement allowed | SyntaxError | with breaks optimizations |
| this in functions is global object | this is undefined | Safer default |
| Duplicate parameter names allowed | SyntaxError | Prevents ambiguity |
| arguments is aliased to parameters | arguments is independent | Simplifies semantics |
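A few rows of the table can be observed side by side. The sketch below builds its functions with `new Function`, whose bodies are sloppy unless they carry their own "use strict" directive, so the comparison does not depend on the mode of the surrounding file. All variable names are illustrative.

```javascript
// `this` in a plain function call
const sloppyThis = new Function("return this;");
const strictThis = new Function('"use strict"; return this;');

// arguments aliasing
const sloppyArgs = new Function("a", "arguments[0] = 99; return a;");
const strictArgs = new Function("a", '"use strict"; arguments[0] = 99; return a;');

// assignment to a read-only property
const frozen = Object.freeze({ x: 1 });
const sloppyWrite = new Function("o", "o.x = 2; return o.x;");
const strictWrite = new Function("o", '"use strict"; o.x = 2; return o.x;');

console.log(sloppyThis() === globalThis);  // true
console.log(strictThis());                 // undefined
console.log(sloppyArgs(1), strictArgs(1)); // 99 1
console.log(sloppyWrite(frozen));          // 1 (write silently ignored)
try {
  strictWrite(frozen);
} catch (err) {
  console.log(err.name);                   // TypeError
}
```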
Performance Implications
Strict mode enables optimizations:
Hidden classes: Engines can assume properties don’t change unexpectedly.
Inline caching: this in strict mode is more predictable.
Elimination of arguments aliasing: Simplifies stack frame layout.
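Strict mode is sometimes reported to be 10-20% faster for certain workloads because the engine doesn’t have to handle edge cases; the gain depends on the engine and the workload, so measure your own code before relying on it.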
When to Use Strict Mode
Always. There’s no reason to use sloppy mode in new code. ES6 modules enforce strict mode automatically, and modern tooling (ESLint, TypeScript) defaults to strict.
Exception: You’re maintaining legacy code that depends on sloppy behavior. In that case, incrementally migrate to strict mode.
The Two Faces of JavaScript: Browser (Mobile) vs. Node.js (Resident)
Mobile ECMA-262: The Browser Environment
“Mobile” refers to JavaScript running in a browser, where it’s transient—code is downloaded, executed, and discarded. The environment provides:
DOM APIs: document, window, Element, etc.
Browser APIs: fetch, localStorage, WebSocket, WebRTC.
Event-driven model: User clicks, network responses, timers.
Security sandboxing: Same-origin policy, CSP, no file system access.
The browser is a hostile environment—your code runs alongside untrusted code from other origins, so security is paramount.
Key characteristics:
No file system: You can’t open files directly.
Limited persistence: localStorage is capped at ~5-10MB.
Asynchronous everything: Network, user input, timers are all callback-based.
Resident ECMA-262: The Node.js Environment
“Resident” refers to JavaScript running on a server (or desktop) via Node.js, where it’s long-lived—the process stays up, handling requests or tasks continuously. Node.js provides:
File system APIs: fs.readFile, fs.writeFile, etc.
Network APIs: http, https, net, dgram (UDP).
Process control: child_process, cluster, process.exit().
Streams: Efficient I/O with backpressure.
Node.js is a trusted environment—you control the process, and it has full system access (with OS permissions).
Key characteristics:
Full file system access: Read/write/delete files.
Long-lived state: Variables persist across requests.
Synchronous file operations: fs.readFileSync() blocks, unlike the browser.
The Common Core
Both environments implement ECMA-262, so core JavaScript is identical:
Primitives, objects, functions, closures.
Promises, async/await, generators.
ES6 modules (with caveats).
Differences are in the host environment (DOM vs. Node.js APIs), not the language.
The Module System Divide
Browser: ES6 modules
(import/export) are the standard, but
CommonJS isn’t supported natively. Bundlers (Webpack, Rollup)
transform modules for the browser.
Node.js: Originally used CommonJS (require()/module.exports). ES6 modules are now supported (unflagged since Node 12.17 and 13.2), but the ecosystem is split:
.mjs files: ES6 modules.
.cjs files: CommonJS modules.
.js files: Depends on the "type" field in package.json ("type": "module" or "type": "commonjs").
Systems Programmer’s Note: This is like the mess
of linking C libraries—static vs. dynamic, .a
vs. .so, name mangling. Node.js is trying to unify, but
legacy code remains.
Tooling Overlap
Many tools work in both environments:
Babel: Transpiles modern JavaScript to older versions.
ESLint: Lints code for errors and style.
Jest: Testing framework.
Webpack: Bundles modules (primarily for browsers, but can target Node.js).
Performance Differences
Chrome and Node.js share the same engine (V8), but the two hosts have different priorities:
Browser: Optimizes for quick startup (website loads must be fast).
Node.js: Optimizes for throughput (servers run continuously).
Garbage collection tuning differs:
Browser: Frequent small GC pauses (don’t block rendering).
Node.js: Fewer, larger GC pauses (throughput matters more than latency).
Reading the Spec Like a Compiler Writer
The Mindset Shift
If you’ve written compilers, you’re familiar with:
Formal grammars: BNF, EBNF, or similar notation.
Attribute grammars: Syntax with semantic rules attached.
Operational semantics: Step-by-step execution rules.
ECMA-262 uses all three:
Grammar: §12-16 define lexical and syntactic grammar (similar to BNF).
Semantic rules: §8 defines syntax-directed operations (like attribute grammars).
Abstract operations: §7 defines operational semantics (step-by-step algorithms).
Your advantage: You already think this way. The spec is just documentation of the abstract machine.
Example: Tracing Type Coercion
Let’s trace why [] == ![] is true:
Parse: [] == ![]
Left: [] (empty array)
Right: ![] (logical NOT applied to an empty array)
Evaluate right side:
![] → Is [] truthy? Yes (all objects are truthy). → !true → false.
Now we have: [] == false
Apply Abstract Equality (§7.2.15, p. 74):
Step 8: If one operand is Boolean, convert it to Number.
false → ToNumber(false) → 0.
Now: [] == 0.
Apply Abstract Equality again:
Step 10: If one operand is Object and the other is Number, convert Object to primitive.
[] → ToPrimitive([]) (§7.1.1, p. 63).
ToPrimitive([]) calls OrdinaryToPrimitive([], "number").
Tries valueOf(): [].valueOf() → [] (still an object).
Tries toString(): [].toString() → "" (empty string).
Now: "" == 0.
Apply Abstract Equality again:
Step 6: If one operand is String and the other is Number, convert String to Number.
"" → ToNumber("") → 0.
Now: 0 == 0.
Result: true.
Takeaway: By following the spec’s algorithmic steps, you can predict any JavaScript behavior. This is essential when writing a transpiler or debugging generated code.
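The same trace can be replayed with explicit conversions, one line per step. The variable names are arbitrary; `Number()` and `toString()` stand in for the ToNumber and OrdinaryToPrimitive steps.

```javascript
const emptyArray = [];
const rhs = ![];                                      // false: objects are truthy
const rhsAsNumber = Number(rhs);                      // 0: ToNumber(false)
const valueOfResult = emptyArray.valueOf();           // still the array, so valueOf() is rejected
const lhsPrimitive = emptyArray.toString();           // "": toString() is tried next
const lhsAsNumber = Number(lhsPrimitive);             // 0: ToNumber("")

console.log(valueOfResult === emptyArray);            // true
console.log(rhsAsNumber, JSON.stringify(lhsPrimitive), lhsAsNumber); // 0 "" 0
console.log(lhsAsNumber === rhsAsNumber);             // true
console.log([] == ![]);                               // true
```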
Grammars in the Spec
The spec uses two grammars:
Lexical Grammar (§12)
Defines tokens: identifiers, keywords, literals, punctuation.
Example (§12.6.1, p. 221):
IdentifierName ::
    IdentifierStart
    IdentifierName IdentifierPart
IdentifierStart ::
    UnicodeIDStart
    $
    _
    \ UnicodeEscapeSequence
IdentifierPart ::
    UnicodeIDContinue
    $
    _
    \ UnicodeEscapeSequence
This says identifiers start with a Unicode ID start character,
$, or _, and can contain those plus
Unicode ID continue characters.
Systems Analogy: This is like the lexer rules in
lex or flex.
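As a rough lexer-style counterpart, the production can be approximated with a Unicode-aware regular expression. This is a simplification: `\u` escape sequences inside identifiers are not handled, and `isIdentifierName` is a hypothetical helper. The spec also admits ZWNJ and ZWJ in IdentifierPart, which the pattern includes.

```javascript
// \p{ID_Start} and \p{ID_Continue} require the "u" flag.
const identifierName = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u;

function isIdentifierName(text) {
  return identifierName.test(text);
}

console.log(isIdentifierName("$foo_1")); // true
console.log(isIdentifierName("π"));      // true
console.log(isIdentifierName("1abc"));   // false
console.log(isIdentifierName("a-b"));    // false
```

IdentifierName deliberately includes reserved words such as `if`; excluding them is the job of a later rule in the grammar, not of the lexical production.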
Syntactic Grammar (§13-16)
Defines how tokens combine into statements and expressions.
Example (§13.5, p. 256):
ConditionalExpression[In, Yield, Await] :
    ShortCircuitExpression[?In, ?Yield, ?Await]
    ShortCircuitExpression[?In, ?Yield, ?Await] ? AssignmentExpression[+In, ?Yield, ?Await] : AssignmentExpression[?In, ?Yield, ?Await]
This defines the ternary operator:
condition ? trueExpr : falseExpr.
The [In, Yield, Await] are grammar
parameters (context-sensitive flags). For example,
[+In] means “in” is allowed, [?In] means
“inherit from parent.”
Systems Analogy: This is like yacc
or bison grammar rules.
Semantic Rules: RS: Evaluation
Runtime Semantics: Evaluation (§8.1, p. 93) defines what each syntactic construct does.
Example (§13.5.1, p. 257):
ConditionalExpression : ShortCircuitExpression ? AssignmentExpression : AssignmentExpression
Let lref be ? Evaluation of ShortCircuitExpression.
Let lval be ToBoolean(? GetValue(lref)).
If lval is true, then
- Let trueRef be ? Evaluation of the first AssignmentExpression.
- Return ? GetValue(trueRef).
Else,
- Let falseRef be ? Evaluation of the second AssignmentExpression.
- Return ? GetValue(falseRef).
This says: evaluate the condition, convert to boolean, and evaluate one branch depending on the result.
Systems Analogy: This is like instruction semantics in an ISA manual—each instruction’s behavior is defined step-by-step.
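A tiny tree-walking evaluator makes the rule concrete. The AST shape used here (`Literal` and `Conditional` nodes) and the `trace` parameter are assumptions made for this sketch, not part of any real parser's output.

```javascript
// Hypothetical mini-AST:
//   { type: "Literal", value }
//   { type: "Conditional", test, consequent, alternate }
function evaluate(node, trace = []) {
  switch (node.type) {
    case "Literal":
      trace.push(node.value); // record which nodes were actually evaluated
      return node.value;
    case "Conditional": {
      const lval = Boolean(evaluate(node.test, trace)); // ToBoolean(GetValue(lref))
      // Only the selected branch is evaluated; the other is never visited.
      return lval
        ? evaluate(node.consequent, trace)
        : evaluate(node.alternate, trace);
    }
    default:
      throw new SyntaxError(`Unknown node type: ${node.type}`);
  }
}

const lit = (value) => ({ type: "Literal", value });
const visited = [];
const result = evaluate(
  { type: "Conditional", test: lit(""), consequent: lit("yes"), alternate: lit("no") },
  visited
);
console.log(result);  // no
console.log(visited); // [ '', 'no' ]
```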
Building a Mental Model
To use the spec effectively:
Start with MDN to understand high-level behavior.
Trace through the spec to understand edge cases.
Implement in code (a parser, interpreter, or transpiler) to solidify understanding.
Run Test262 to validate your implementation.
Example workflow:
You’re implementing Array.prototype.map for your transpiler.
Read MDN: “Calls a function on every element, returns new array.”
Read ECMA-262 § (p. 588): See the exact algorithm (checks for callability, handles thisArg, handles sparse arrays).
Implement based on the spec.
Run Test262 tests for Array.prototype.map to catch edge cases.
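The "implement based on the spec" step might look roughly like this. `mapSketch` is a hypothetical standalone function, and it simplifies in two places: it always creates a plain Array instead of calling ArraySpeciesCreate, and it uses ordinary assignment where the spec uses CreateDataPropertyOrThrow.

```javascript
function mapSketch(O, callbackfn, thisArg) {
  if (O === null || O === undefined) {
    throw new TypeError("Cannot convert undefined or null to object");
  }
  const obj = Object(O); // ToObject
  // LengthOfArrayLike: clamp ToIntegerOrInfinity(length) to [0, 2^53 - 1]
  const rawLength = Math.trunc(Number(obj.length)) || 0;
  const len = Math.min(Math.max(rawLength, 0), Number.MAX_SAFE_INTEGER);
  if (typeof callbackfn !== "function") {
    throw new TypeError("callbackfn is not callable");
  }
  const A = new Array(len);
  for (let k = 0; k < len; k++) {
    if (k in obj) { // HasProperty: holes in sparse arrays are skipped
      A[k] = callbackfn.call(thisArg, obj[k], k, obj);
    }
  }
  return A;
}

const doubled = mapSketch([1, , 3], (x) => x * 2);
console.log(doubled.length, 1 in doubled); // 3 false
```

The hole at index 1 survives in the result, which is exactly the kind of edge case an MDN summary tends to omit and Test262 checks.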
The Spec Is Your Ground Truth
When JavaScript behaves unexpectedly:
Don’t trust intuition: Trust the spec.
Don’t trust Stack Overflow: Trust the spec (but SO can point you to the right section).
Don’t trust the browser: If it contradicts the spec, it’s a browser bug (or you misread the spec).
The spec defines JavaScript. Everything else is commentary.
End of Chapter 1
In the next chapter, we’ll dive into JavaScript’s type system
from a systems perspective—how primitives and objects are
represented, how type coercion works at the algorithmic level, and
why typeof lies to you.
Chapter 2: JavaScript’s Type System from a Systems Perspective
The Seven Language Types and Their Internal Representation
Introduction: Types as Tagged Unions
If you’ve worked in C, you’re familiar with tagged unions—a discriminated union where a tag indicates which variant is active:
enum TypeTag {
    TYPE_INT,
    TYPE_FLOAT,
    TYPE_POINTER
};
struct Value {
    enum TypeTag tag;
    union {
        int i;
        double f;
        void* ptr;
    } data;
};
JavaScript’s type system is conceptually similar. Every JavaScript value is a tagged union with one of seven possible types. The difference is that JavaScript hides the tag from you: you can’t directly access or manipulate it, though you can query it with typeof (which sometimes lies) or internal operations.
ECMA-262 §6.1 (p. 25) defines the seven language types:
Undefined – A singleton type with one value: undefined.
Null – A singleton type with one value: null.
Boolean – Two values: true and false.
String – Sequences of UTF-16 code units.
Symbol – Unique, immutable identifiers (ES6+).
Number – IEEE 754 double-precision floating-point.
BigInt – Arbitrary-precision integers (ES2020+).
Object – Collections of properties (wait, that’s eight!)
Actually, Object is special. The first seven are primitive types: immutable and passed by value. Object, the eighth, is the only reference type: mutable and passed by reference.
Systems Insight: In most JavaScript engines (V8, SpiderMonkey), values are represented using pointer tagging or NaN boxing to pack the type tag into the pointer itself. More on this in §2.2.
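From user code, the closest thing to reading the tag is `typeof` plus a null check. A small sketch, with `languageType` as a hypothetical helper that returns the spec-level type name:

```javascript
function languageType(value) {
  if (value === null) return "Null"; // typeof null reports "object"
  switch (typeof value) {
    case "undefined": return "Undefined";
    case "boolean":   return "Boolean";
    case "string":    return "String";
    case "symbol":    return "Symbol";
    case "number":    return "Number";
    case "bigint":    return "BigInt";
    default:          return "Object"; // typeof gives "object" or "function"
  }
}

console.log(languageType(null));      // Null
console.log(languageType(() => {}));  // Object
console.log(languageType(10n));       // BigInt
```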
Undefined: The Uninitialized Sentinel
Specification: §6.1.1 (p. 25)
undefined is the type of variables that have been
declared but not assigned:
let x;
console.log(x); // undefined
Internal representation: In V8, undefined is represented as a special tagged pointer (kUndefinedValue). It’s a singleton: there’s only one undefined in memory.
Where you’ll see it:
Uninitialized variables.
Missing function arguments.
Missing object properties.
Functions that don’t explicitly return (implicitly return undefined).
Systems note: undefined is not the same as an uninitialized variable in C. In JavaScript, reading a let or const variable before its declaration has executed throws a ReferenceError instead of yielding undefined:
console.log(y); // ReferenceError: Cannot access 'y' before initialization (V8 wording)
let y;
This is because of the Temporal Dead Zone (TDZ): let and const variables exist in scope but are uninitialized until their declaration is executed (§8.1.1.4.9, p. 124).
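The contrast with `var` is easiest to see inside functions, where the early read can be caught. Both function names are illustrative.

```javascript
function readVarEarly() {
  const seen = v; // var is hoisted and initialized to undefined
  var v = 1;
  return seen;
}

function readLetEarly() {
  try {
    return l;     // l is in scope but still in its temporal dead zone
  } catch (err) {
    return err.name;
  }
  let l = 1;      // never reached; the declaration alone creates the binding
}

console.log(readVarEarly()); // undefined
console.log(readLetEarly()); // ReferenceError
```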
Null: The Intentional Absence
Specification: §6.1.2 (p. 25)
null represents the intentional absence of a
value. It’s semantically different from
undefined:
undefined: “I haven’t been initialized yet.”
null: “I explicitly have no value.”
let x = null; // Explicitly no value
let y; // Implicitly undefined
The typeof null bug: One of JavaScript’s most infamous quirks:
typeof null; // "object" (WAT?!)
This is a bug from JavaScript’s original
implementation. In the first JavaScript engine, values were
represented as a type tag (3 bits) plus a value (32 bits on 32-bit
systems). Objects had a type tag of 000, and
null was represented as a null pointer (all zeros), so
the type check incorrectly identified null as an
object.
Brendan Eich has called this “the original sin” of JavaScript. It
can’t be fixed without breaking millions of websites that depend on
typeof null === "object".
Systems lesson: Early implementation decisions calcify into permanent language semantics when backward compatibility is sacrosanct.
Boolean: The Simple Case
Specification: §6.1.3 (p. 25)
true and false—nothing fancy here. Two
singleton values.
Internal representation: In V8, true and false are preallocated singletons, so producing a boolean never allocates. A boolean value is simply a tagged pointer to one of those two singletons.
Truthy and falsy: JavaScript has a concept of truthy and falsy values for use in conditional contexts. The falsy values are:
false
0, -0, 0n (BigInt zero)
"" (empty string)
null
undefined
NaN
Everything else is truthy, including:
"0" (non-empty string, even if it looks like zero)
[] (empty array)
{} (empty object)
function() {} (any function)
Conversion to boolean: The abstract operation ToBoolean (§7.1.2, p. 64) defines this:
ToBoolean ( argument )
If argument is a Boolean, return argument.
If argument is undefined or null, return false.
If argument is a Number, then
- If argument is +0, -0, or NaN, return false.
- Otherwise, return true.
If argument is a String, then
- If argument is the empty String, return false.
- Otherwise, return true.
If argument is a BigInt, then
- If argument is 0n, return false.
- Otherwise, return true.
If argument is a Symbol, return true.
If argument is an Object, return true.
Systems note: This is why
[] == false can be true (§1.6 in Chapter 1)—the array
is coerced to a primitive (""), which is then coerced
to a number (0), which equals false
coerced to a number (0). But if ([]) is
always true because objects are truthy.
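ToBoolean is small enough to write out in full. In this sketch (`toBooleanSketch` is a made-up name) the BigInt case treats 0n as falsy, matching the list of falsy values above, and the result can be checked against the built-in `Boolean()`.

```javascript
function toBooleanSketch(argument) {
  switch (typeof argument) {
    case "boolean":   return argument;
    case "undefined": return false;
    case "number":    return !(argument === 0 || Number.isNaN(argument)); // === 0 covers +0 and -0
    case "bigint":    return argument !== 0n;
    case "string":    return argument.length > 0;
    case "symbol":    return true;
    default:          return argument !== null; // objects and functions are truthy; null is not
  }
}

const samples = [false, 0, -0, 0n, "", null, undefined, NaN, "0", [], {}, () => {}, 1n];
console.log(samples.every((v) => toBooleanSketch(v) === Boolean(v))); // true
```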
String: UTF-16 Code Units, Not Characters
Specification: §6.1.4 (pp. 25-27)
JavaScript strings are sequences of 16-bit unsigned integers, representing UTF-16 code units. They are immutable and primitives (not objects, despite having methods via autoboxing—see §2.8).
UTF-16 encoding: JavaScript predates Unicode’s
expansion beyond the Basic Multilingual Plane (BMP). Originally, 16
bits per character was sufficient. When Unicode added code points
beyond U+FFFF, UTF-16 introduced surrogate
pairs—two 16-bit code units representing one character.
Example: The emoji 💩 (U+1F4A9) is represented as two code units:
const poop = "💩";
console.log(poop.length); // 2 (not 1!)
console.log(poop.charCodeAt(0)); // 55357 (0xD83D, high surrogate)
console.log(poop.charCodeAt(1)); // 56489 (0xDCA9, low surrogate)
This is a footgun: String.prototype.length counts code units, not code points and not user-perceived characters (grapheme clusters). For ASCII text, all three counts coincide. For emoji and other code points outside the BMP, or for sequences built from combining characters, they differ.
ES6 improvements:
String.prototype.codePointAt(): Returns the full code point (handles surrogates).
String.fromCodePoint(): Creates strings from code points.
for...of iteration: Iterates by code points, not code units.
for (const char of "💩") {
    console.log(char); // Logs "💩" (one iteration)
}
Internal representation: Engines optimize string storage:
Latin-1 encoding: If all characters are < 256, store as one byte per character.
UTF-16 encoding: Otherwise, store as two bytes per code unit.
Ropes: Concatenated strings can be stored as trees of substrings (lazy concatenation).
Slices: Substrings can reference slices of parent strings without copying.
Systems note: JavaScript’s string model is awkward for modern Unicode text processing. If you’re building a transpiler that handles source code, you’ll need to be careful with positions—does your line/column counter use code units or code points? Source maps (§13.5) use zero-based code unit offsets.
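For position bookkeeping, it helps to have both counts at hand and a way to convert between them. A small sketch; `codeUnitToCodePointOffset` is a hypothetical helper, not a standard API.

```javascript
const text = "a💩b";
const codeUnits = text.length;       // counts UTF-16 code units
const codePoints = [...text].length; // the string iterator walks code points

// Convert a code-unit offset (what source maps use) to a code-point offset.
function codeUnitToCodePointOffset(str, unitOffset) {
  return [...str.slice(0, unitOffset)].length;
}

console.log(codeUnits, codePoints);              // 4 3
console.log(codeUnitToCodePointOffset(text, 3)); // 2 (the "b" is the third code point)
```

If the offset lands in the middle of a surrogate pair, the lone high surrogate is counted as one code point; callers that care should validate offsets first.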
Symbol: Unique, Unforgeable Identifiers
Specification: §6.1.5 (pp. 27-28)
Symbols were introduced in ES6 to solve the property key
collision problem. Before symbols, all object property keys
were strings. If you wanted to add a private or special property,
you’d use a string like "__myPrivateProp" and hope no
one else used the same name.
Symbols are unique: Every call to
Symbol() creates a new, distinct symbol:
const sym1 = Symbol("description");
const sym2 = Symbol("description");
console.log(sym1 === sym2); // false
Symbols as property keys: You can use symbols as object property keys, and they won’t collide with string keys:
const mySymbol = Symbol("my-symbol");
const obj = {
[mySymbol]: "value",
"my-symbol": "different value"
};
console.log(obj[mySymbol]); // "value"
console.log(obj["my-symbol"]); // "different value"
Well-known symbols: The spec defines well-known symbols (§6.1.5.1, p. 28) for metaprogramming:
Symbol.iterator: Defines the default iterator for an object (enables for...of).
Symbol.toStringTag: Customizes Object.prototype.toString() behavior.
Symbol.toPrimitive: Customizes type coercion (§7.1.1, p. 63).
Symbol.hasInstance: Customizes instanceof behavior.
And 10+ more.
Internal representation: Symbols are heap-allocated objects with a unique identifier. The “description” is just metadata for debugging—it doesn’t affect uniqueness.
Global symbol registry:
Symbol.for(key) creates or retrieves a symbol from a
global registry:
const sym1 = Symbol.for("app.id");
const sym2 = Symbol.for("app.id");
console.log(sym1 === sym2); // true (same symbol)
Systems use case: If you’re implementing object property access in a transpiler, you need to handle symbol keys differently from string keys. Symbol-keyed properties are skipped by for...in loops and by Object.keys(), even when they are enumerable; Object.getOwnPropertySymbols() and Reflect.ownKeys() do return them.
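The different enumeration APIs can be compared directly. The object and symbol below are invented for the demonstration.

```javascript
const hidden = Symbol("hidden");
const record = { visible: 1, [hidden]: 2 };

const forInKeys = [];
for (const key in record) forInKeys.push(key);

const stringKeys = Object.keys(record);
const symbolKeys = Object.getOwnPropertySymbols(record);
const allKeys = Reflect.ownKeys(record);

console.log(forInKeys);                 // [ 'visible' ]
console.log(stringKeys);                // [ 'visible' ]
console.log(symbolKeys.length);         // 1
console.log(allKeys.length);            // 2 (string keys first, then symbols)
```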
Number: IEEE 754 Double-Precision Floating-Point
Specification: §6.1.6 (pp. 28-31)
JavaScript’s Number type is IEEE 754-2019
binary64 (double-precision floating-point). This means:
64 bits total: 1 sign bit, 11 exponent bits, 52 mantissa bits (plus 1 implicit leading bit).
Range: Approximately $\pm 1.8 \times 10^{308}$.
Precision: 15-17 significant decimal digits.
Special values: +0, -0, +Infinity, -Infinity, NaN.
Why double-precision? JavaScript was designed
for simple scripting, and doubles were “good enough” for most use
cases. The assumption was that you wouldn’t need 64-bit integers
(spoiler: you do, hence BigInt in ES2020).
The Integer Range Problem
Safe integer range: Integers can be represented exactly in the range $[-(2^{53} - 1),\ 2^{53} - 1]$. This is because the mantissa has 52 bits plus 1 implicit bit (53 bits total).
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991 (2^53 - 1)
console.log(Number.MIN_SAFE_INTEGER); // -9007199254740991 (-(2^53 - 1))
Beyond this range, not all integers can be represented exactly:
console.log(9007199254740992 === 9007199254740993); // true (WAT?!)
Both values round to the same double-precision representation.
Systems impact: If you’re implementing a
compiler backend targeting JavaScript, you can’t use
Number for 64-bit integers. You need
BigInt (§2.1.7) or a custom library (like
bn.js or bignumber.js).
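If generated code does stay on Number, a guard around arithmetic can at least detect when a result leaves the exactly representable range. `checkedAdd` is a hypothetical helper sketching that idea with the standard `Number.isSafeInteger`.

```javascript
function checkedAdd(a, b) {
  const sum = a + b;
  if (!Number.isSafeInteger(a) || !Number.isSafeInteger(b) || !Number.isSafeInteger(sum)) {
    throw new RangeError("result is outside the safe integer range");
  }
  return sum;
}

console.log(checkedAdd(1, 2)); // 3
try {
  checkedAdd(Number.MAX_SAFE_INTEGER, 1);
} catch (err) {
  console.log(err.name);       // RangeError
}
```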
Signed Zero
IEEE 754 has two zeros: +0 and
-0. They compare as equal, but have different bitwise
representations:
console.log(+0 === -0); // true
console.log(1 / +0); // Infinity
console.log(1 / -0); // -Infinity
console.log(Object.is(+0, -0)); // false (Object.is distinguishes them)
Why signed zero exists: It preserves the sign of underflow in floating-point computations. For example, -1e-200 * 1e-200 is too small to represent and underflows to -0 rather than +0.
Systems note: If you’re generating JavaScript code that performs floating-point math, be aware that -0 can appear unexpectedly (e.g., -1 * 0 evaluates to -0).
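Because `===` cannot tell the two zeros apart, detection needs `Object.is` or the division trick. The helper name is illustrative.

```javascript
function isNegativeZero(x) {
  return Object.is(x, -0); // equivalent: x === 0 && 1 / x === -Infinity
}

console.log(isNegativeZero(-1 * 0)); // true
console.log(isNegativeZero(0));      // false
console.log(String(-0));             // 0 (the sign is dropped when converting to a string)
```

The last line matters for code generators: emitting a constant through string conversion silently turns -0 into 0.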
NaN: Not a Number (But Still a Number)
NaN represents the result of invalid operations:
console.log(0 / 0); // NaN
console.log(Math.sqrt(-1)); // NaN
console.log(parseInt("foo")); // NaN
NaN is not equal to itself:
console.log(NaN === NaN); // false
This is per the IEEE 754 spec: NaN is unordered, so every <, >, <=, >=, == or === comparison involving NaN (including NaN == NaN) returns false, and only != and !== return true.
To check for NaN, use:
Number.isNaN(value); // Strict check (ES6, preferred)
isNaN(value); // Global function (coerces to Number first, don't use)
Object.is(value, NaN); // Also works
value !== value; // Classic hack (only NaN is not equal to itself)
Systems note: If you’re implementing numeric operations in your transpiler, you must handle NaN propagation correctly. For example, NaN + 1 and NaN * 0 both evaluate to NaN (a result you cannot confirm with ===, since NaN never equals itself).
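The difference between the checks, and the propagation behavior, shows up in a few lines. The sample values are arbitrary.

```javascript
const samples = [NaN, "foo", undefined, 42];

// Global isNaN coerces first, so non-numeric inputs look like NaN.
const globalIsNaN = samples.map((v) => isNaN(v));
// Number.isNaN performs no coercion.
const strictIsNaN = samples.map((v) => Number.isNaN(v));

// NaN propagates through arithmetic and Math functions.
const propagated = [NaN + 1, NaN * 0, Math.max(1, NaN)].every((v) => Number.isNaN(v));

console.log(globalIsNaN);          // [ true, true, true, false ]
console.log(strictIsNaN);          // [ true, false, false, false ]
console.log(propagated);           // true
console.log([NaN].indexOf(NaN));   // -1 (uses ===)
console.log([NaN].includes(NaN));  // true (uses SameValueZero)
```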
Infinity
Infinity and -Infinity represent
overflow:
console.log(1 / 0); // Infinity
console.log(-1 / 0); // -Infinity
console.log(1e308 * 10); // Infinity (overflow)
Operations with Infinity:
Infinity + 1 → Infinity
Infinity * 2 → Infinity
Infinity - Infinity → NaN (indeterminate form)
Infinity / Infinity → NaN
Systems note: Infinity behaves like the IEEE 754 special value. If you’re compiling to JavaScript, be aware that integer overflow in the source language (e.g., C’s INT_MAX + 1) won’t translate directly: as Numbers, 2147483647 + 1 is simply 2147483648, and only the bitwise operators (such as | 0) and Math.imul wrap to 32 bits. Infinity appears only when a result exceeds the double range.
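A compiler targeting JavaScript typically emulates 32-bit wraparound explicitly. The sketch below uses only standard operators and `Math.imul`; the constant and variable names are illustrative.

```javascript
const INT_MAX = 2147483647;

const asNumber = INT_MAX + 1;       // plain Number addition: no wraparound
const wrapped = (INT_MAX + 1) | 0;  // ToInt32 wraps modulo 2^32
const asUnsigned = -1 >>> 0;        // reinterpret as an unsigned 32-bit value

// Multiplication needs Math.imul: the double product loses low bits
// before | 0 can wrap it.
const naiveMul = (INT_MAX * INT_MAX) | 0;
const exactMul = Math.imul(INT_MAX, INT_MAX);

console.log(asNumber);   // 2147483648
console.log(wrapped);    // -2147483648
console.log(asUnsigned); // 4294967295
console.log(naiveMul);   // 0 (precision already lost)
console.log(exactMul);   // 1 (correct low 32 bits)
```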
BigInt: Arbitrary-Precision Integers
Specification: §6.1.6.2 (pp. 31-34)
BigInt was added in ES2020 to address the
Number type’s inability to represent large integers.
BigInt values are arbitrary-precision
integers—they can represent integers of any size (limited
only by memory).
Creating BigInts:
const big1 = 1234567890123456789012345678901234567890n; // Literal suffix 'n'
const big2 = BigInt("1234567890123456789012345678901234567890");
const big3 = BigInt(123); // Convert Number to BigInt (must be integer)
Operations:
const a = 10n;
const b = 20n;
console.log(a + b); // 30n
console.log(a * b); // 200n
console.log(b / a); // 2n (integer division, truncates)
console.log(b % a); // 0n
console.log(a ** b); // 100000000000000000000n (10^20)
No mixing with Number:
console.log(10n + 20); // TypeError: Cannot mix BigInt and other types
You must explicitly convert:
console.log(10n + BigInt(20)); // 30n
console.log(Number(10n) + 20); // 30 (converts BigInt to Number, may lose precision)
Comparisons work across types:
console.log(10n == 10); // true (abstract equality coerces)
console.log(10n === 10); // false (strict equality doesn't coerce)
console.log(10n < 20); // true (relational comparison coerces)
Internal representation: BigInts are heap-allocated objects with dynamically-sized digit arrays. In V8, the digits are machine words (32-bit or 64-bit, depending on architecture), so even a small BigInt is a heap object rather than an immediate value.
Systems use case: If you’re compiling a language with 64-bit integers (C, Rust, Go) to JavaScript, you have two options:
Use Number for the safe integer range ($\pm(2^{53} - 1)$) and error/wrap on overflow.
Use BigInt for all integers, accepting the performance cost (BigInt operations are slower than Number).
Emscripten (C/C++ to WebAssembly) originally used asm.js with
32-bit integers. For 64-bit integers, it used a pair of 32-bit
values. With WebAssembly, 64-bit integers are native
(i64 type).
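For the BigInt route, the standard `BigInt.asIntN` and `BigInt.asUintN` functions provide fixed-width wraparound. A minimal sketch, with `addI64` and `mulU64` as hypothetical helper names:

```javascript
const I64_MAX = 2n ** 63n - 1n;
const I64_MIN = -(2n ** 63n);

// Signed 64-bit addition with two's-complement wraparound.
function addI64(a, b) {
  return BigInt.asIntN(64, a + b);
}

// Unsigned 64-bit multiplication, keeping the low 64 bits.
function mulU64(a, b) {
  return BigInt.asUintN(64, a * b);
}

console.log(addI64(I64_MAX, 1n) === I64_MIN); // true
console.log(mulU64(2n ** 63n, 2n));           // 0n
```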
Object: The Reference Type
Specification: §6.1.7 (pp. 34-43)
Object is the catch-all type for everything that’s not a primitive. This includes:
Plain objects: { key: value }
Arrays: [1, 2, 3]
Functions: function() {}
Dates: new Date()
RegExps: /pattern/
Maps, Sets, WeakMaps, WeakSets
Promises
And more.
Key distinction: Primitives are immutable and passed by value. Objects are mutable and passed by reference:
let a = 5;
let b = a;
b = 10;
console.log(a); // 5 (unchanged)
let obj1 = { x: 5 };
let obj2 = obj1;
obj2.x = 10;
console.log(obj1.x); // 10 (mutated!)
Internal structure: Objects are collections of properties. Each property has:
Key: String or Symbol.
Value: Any JavaScript value.
Attributes: [[Writable]], [[Enumerable]], [[Configurable]].
Properties are either data properties (have a value) or accessor properties (have a getter/setter).
Property descriptors (§6.2.5, pp. 45-46):
const obj = {};
Object.defineProperty(obj, "x", {
value: 42,
writable: false,
enumerable: true,
configurable: false
});
console.log(obj.x); // 42
obj.x = 100; // Fails silently in sloppy mode, TypeError in strict mode
console.log(obj.x); // Still 42
Internal slots and methods: Objects have internal slots (data) and internal methods (behavior). For example:
[[Prototype]]: The object’s prototype (for inheritance).
[[Get]]: Invoked when accessing a property.
[[Set]]: Invoked when assigning to a property.
[[Call]]: Invoked when calling a function (functions are callable objects).
These are specification mechanisms, not directly
accessible in code. You interact with them via built-in methods like
Object.getPrototypeOf() or Reflect API.
Ordinary vs. exotic objects:
Ordinary objects: Follow standard property access semantics.
Exotic objects: Have custom internal methods. Examples:
Arrays: Custom [[DefineOwnProperty]] to update length.
Functions: Have [[Call]] and [[Construct]].
Proxy: Intercepts all internal methods.
Bound functions: Custom [[Call]] to bind this.
We’ll cover objects in depth in Chapter 3. For now, understand that Object is the foundation of JavaScript’s type system.
Pointer Tagging and NaN Boxing: How Engines Optimize Values
The Problem: Representing Dynamic Types Efficiently
JavaScript values have dynamic types—a variable can hold any type at runtime. The naive approach is a struct with a type tag and a union:
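#include <stdbool.h>

typedef struct JSObject JSObject; /* opaque heap object */

struct JSValue {
    /* TAG_ prefix avoids clashing with the NULL macro */
    enum { TAG_UNDEFINED, TAG_NULL, TAG_BOOL, TAG_NUMBER,
           TAG_STRING, TAG_OBJECT, TAG_SYMBOL, TAG_BIGINT } tag;
    union {
        bool boolean;
        double number;
        char *string;
        JSObject *object;
        /* ... */
    } data;
};
But this wastes space:
Size: On a 64-bit system, the enum is 4 bytes (padded to 8 for alignment) and the union is 8 bytes (for a pointer or double). Total: 16 bytes per value.
Cache efficiency: Larger values mean fewer fit in CPU cache lines.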
JavaScript engines need to optimize this. The two main techniques are pointer tagging and NaN boxing.
Pointer Tagging (V8’s Smi)
Pointer tagging exploits the fact that pointers
are aligned—on a 64-bit system, heap-allocated objects are aligned
to 8-byte boundaries, so the low 3 bits of a pointer are always
0.
V8’s Smi (Small Integer): If a value is a small integer, V8 stores it directly in the tagged word instead of allocating it on the heap. The low bit distinguishes the two cases: Smis have a low bit of 0, and heap object pointers have their low bit set to 1 (V8 adds 1 to the aligned address). A Smi carries 31 bits of payload on 32-bit builds and on 64-bit builds with pointer compression, and 32 bits on 64-bit builds without pointer compression.
Smi (31-bit): [31-bit integer][0] (low bit is 0)
Heap pointer: [aligned address | 1] (low bit is 1)
Example (31-bit Smi):
The integer 42 is stored as 42 << 1 = 84 = 0x54.
A heap object at address 0x7ffeefbff000 is stored as the tagged pointer 0x7ffeefbff001.
Advantages:
Fast integer operations: Check the low bit. If it’s 0, it’s a Smi. Addition and subtraction work directly on the tagged values; other operations shift to extract, operate, and shift back.
No heap allocation: Smis are immediate values.
Limitations:
Range: 31-bit Smis can only represent −2^30 to 2^30 − 1 (one bit is used for the tag, one for the sign); 32-bit Smis cover −2^31 to 2^31 − 1.
Overflow: If an integer exceeds the Smi range, it must be boxed (heap-allocated as a HeapNumber).
Other tagged values: V8 also uses specific pointer values for singletons:
undefined: A specific tagged pointer.
null: Another specific tagged pointer.
true/false: Tagged pointers.
NaN Boxing (SpiderMonkey, JavaScriptCore)
NaN boxing exploits the fact that IEEE 754
doubles have many NaN representations.
In IEEE 754:
Exponent bits all 1 (0x7FF) and mantissa non-zero → NaN.
There are 2^53 − 2 possible NaN bit patterns, but JavaScript only needs one canonical NaN.
Idea: Use the remaining NaN bit
patterns to encode other types.
An illustrative encoding (simplified, 64-bit; the exact tag values used by JavaScriptCore and SpiderMonkey differ):
NaN patterns (exponent = 0x7FF):
0x7FF8_0000_0000_0000: Canonical NaN
All other NaN patterns (for example 0x7FF0_0000_0000_0001 to 0x7FF7_FFFF_FFFF_FFFF, and the sign-bit-set range 0xFFF0_0000_0000_0001 to 0xFFFF_FFFF_FFFF_FFFF): Unused, available for tagging
Encoding:
Doubles: Normal IEEE 754 (if not in the reserved NaN range).
Integers: Store as a double (if representable exactly).
Pointers: Use reserved NaN patterns.
Special values (undefined, null, true, false): Use reserved NaN patterns.
Example encoding:
undefined: 0xFFFF_FFFF_FFFF_FFF2
null: 0xFFFF_FFFF_FFFF_FFF3
true: 0xFFFF_FFFF_FFFF_FFF4
false: 0xFFFF_FFFF_FFFF_FFF5
Pointer: 0xFFFF_xxxx_xxxx_xxxx (where xxxx is the pointer value)
Double: Normal IEEE 754 (0x0000_xxxx_xxxx_xxxx to 0x7FEF_xxxx_xxxx_xxxx and negatives)
Advantages:
Uniform 64-bit representation: Every value fits in 64 bits.
No extra type tag: The value itself encodes the type.
Disadvantages:
Pointer range limitation: On 64-bit systems, only 48 bits of address space are typically used (x86-64 canonical addresses), so pointers fit. But if the address space expands, this breaks.
Complexity: Bit manipulation is required for every value access.
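To make the bit manipulation concrete, here is a minimal sketch of NaN boxing written in JavaScript itself, using BigInt for the 64-bit words. The tag value, the masks, and all function names are assumptions made for this example and do not match any real engine’s layout.

```javascript
// Illustrative layout (an assumption of this sketch, not a real engine's):
// the top 16 bits are the tag, the low 48 bits are the payload.
const TAG_MASK      = 0xFFFF_0000_0000_0000n;
const PAYLOAD_MASK  = 0x0000_FFFF_FFFF_FFFFn;
const TAG_POINTER   = 0xFFFC_0000_0000_0000n; // hypothetical tag inside the NaN space
const CANONICAL_NAN = 0x7FF8_0000_0000_0000n;

const scratch = new DataView(new ArrayBuffer(8));

// A double is its own encoding; NaN is canonicalized so that no
// arithmetic result can collide with a tagged pattern.
function boxDouble(d) {
  if (Number.isNaN(d)) return CANONICAL_NAN;
  scratch.setFloat64(0, d);
  return scratch.getBigUint64(0);
}

// A BigInt stands in for a 48-bit address.
function boxPointer(address) {
  return TAG_POINTER | (address & PAYLOAD_MASK);
}

const isPointer = (bits) => (bits & TAG_MASK) === TAG_POINTER;
const unboxPointer = (bits) => bits & PAYLOAD_MASK;

function unboxDouble(bits) {
  scratch.setBigUint64(0, bits);
  return scratch.getFloat64(0);
}

const boxedPi = boxDouble(3.14);
const boxedPtr = boxPointer(0x7ffe_efbf_f000n);
console.log(isPointer(boxedPi), unboxDouble(boxedPi));                 // false 3.14
console.log(isPointer(boxedPtr), unboxPointer(boxedPtr).toString(16)); // true 7ffeefbff000
console.log(Number.isNaN(unboxDouble(boxedPtr)));                      // true
```

The last line shows why the scheme works: a tagged pointer, read back as a double, is just another NaN, so it can never be mistaken for a real number.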
Which Technique Does Each Engine Use?
V8 (Chrome, Node.js): Pointer tagging (Smi + heap objects).
SpiderMonkey (Firefox): NaN boxing (called “nunboxing” on 32-bit platforms and “punboxing” on 64-bit platforms).
JavaScriptCore (Safari): NaN boxing.
ChakraCore (old Edge): Pointer tagging (deprecated; Edge now uses V8).
Systems takeaway: When compiling to JavaScript, you generally don’t need to care about these details—engines handle it. But understanding them helps explain why:
Small integers are fast (Smi in V8).
Large integers or non-integer numbers can be slower in V8 (they may be heap-allocated as HeapNumbers).
Operations that cause boxing/unboxing (e.g., Smi overflow) have performance cliffs.
The Type() Abstract Operation: How the Spec Classifies Values
Specification: §6.1 (p. 25)
The Type(x) abstract operation returns the type tag of a value. It’s not exposed to JavaScript code, but it’s fundamental to understanding the spec.
Definition:
Type ( x )
Returns one of: Undefined, Null, Boolean, String, Symbol, Number, BigInt, Object.
Examples:
Type(undefined) → Undefined
Type(null) → Null
Type(true) → Boolean
Type("hello") → String
Type(Symbol()) → Symbol
Type(42) → Number
Type(42n) → BigInt
Type({}) → Object
Type([]) → Object
Type(function() {}) → Object
Note: Arrays and functions are Objects. There’s no separate Array or Function type at this level. To distinguish them, the spec checks for internal methods such as [[Call]] (for functions) or uses operations such as IsArray (for arrays).
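Although Type() is not exposed, it can be approximated from user code. The helper below, specType, is a hypothetical name introduced for this sketch; it relies on the typeof operator and special-cases null.

```javascript
// Hypothetical helper: approximates the spec's Type(x) from user code.
function specType(x) {
  if (x === null) return "Null";         // typeof null is "object", so test it first
  switch (typeof x) {
    case "undefined": return "Undefined";
    case "boolean":   return "Boolean";
    case "string":    return "String";
    case "symbol":    return "Symbol";
    case "number":    return "Number";
    case "bigint":    return "BigInt";
    default:          return "Object";   // "object" and "function" both map to Object
  }
}

console.log(specType(null));            // Null
console.log(specType([]));              // Object
console.log(specType(function () {}));  // Object
console.log(specType(42n));             // BigInt
```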
The typeof Operator: JavaScript’s Unreliable Type Query
Specification: §13.5.3 (p. 260)
The typeof operator is JavaScript’s user-facing type
query. It mostly matches Type(), but
with quirks:
console.log(typeof undefined); // "undefined"
console.log(typeof null); // "object" (BUG!)
console.log(typeof true); // "boolean"
console.log(typeof "hello"); // "string"
console.log(typeof Symbol()); // "symbol"
console.log(typeof 42); // "number"
console.log(typeof 42n); // "bigint"
console.log(typeof {}); // "object"
console.log(typeof []); // "object"
console.log(typeof function() {}); // "function"
console.log(typeof Math.sqrt); // "function"
Discrepancies:
typeof null === "object": Bug from JavaScript’s original implementation (§2.1.2).
typeof function() {} === "function": Functions are Objects internally, but typeof distinguishes them.
Why “function” is a special case: The spec
explicitly checks for [[Call]] (§13.5.3, p. 260):
typeof operator
If val has a [[Call]] internal method, return “function”.
Return “object”.
Reliable Type Checking
To reliably check types:
For primitives:
value === undefined // Check for undefined
value === null // Check for null
typeof value === "boolean" // Boolean
typeof value === "string" // String
typeof value === "symbol" // Symbol
typeof value === "number" // Number
typeof value === "bigint" // BigInt
For objects, use:
Array.isArray(value) // Array
typeof value === "function" // Function
value instanceof Date // Date
value instanceof RegExp // RegExp
value instanceof Promise // Promise
Object.prototype.toString.call(value) // Generic (returns "[object Type]")
Example:
Object.prototype.toString.call([]); // "[object Array]"
Object.prototype.toString.call(new Date()); // "[object Date]"
Object.prototype.toString.call(/regex/); // "[object RegExp]"
Why Object.prototype.toString? Before ES2015 it reported the [[Class]] internal property. Today it derives a built-in tag from the object’s internal slots and lets the Symbol.toStringTag property override it. Objects can customize this:
const obj = {
[Symbol.toStringTag]: "MyCustomType"
};
console.log(Object.prototype.toString.call(obj)); // "[object MyCustomType]"
Systems note: If you’re implementing type guards in a transpiler, you’ll need to emit code that handles these quirks. For example, checking for null requires value === null, not typeof value === "object".
Type Coercion: The Algorithmic Nightmare
Why Coercion Exists
JavaScript was designed for non-programmers. Brendan Eich wanted
it to be forgiving—if you write "5" + 2, it shouldn’t
error; it should “do something reasonable.”
The problem: “reasonable” is subjective. JavaScript’s coercion rules are Byzantine, full of edge cases, and the source of endless bugs.
Two kinds of coercion:
Explicit coercion: You manually convert types (Number("5"), String(42)).
Implicit coercion: JavaScript converts types automatically ("5" + 2, if (value)).
Implicit coercion is where the madness lies.
The Big Three: ToString, ToNumber, ToBoolean
The spec defines three core coercion operations:
ToString (§7.1.17, pp. 71-72)
Converts a value to a string.
Algorithm (simplified):
ToString ( argument )
If argument is a String, return argument.
If argument is a Symbol, throw TypeError.
If argument is undefined, return “undefined”.
If argument is null, return “null”.
If argument is true, return “true”.
If argument is false, return “false”.
If argument is a Number, return NumberToString(argument).
If argument is a BigInt, return BigIntToString(argument).
If argument is an Object, return ? ToString(? ToPrimitive(argument, STRING)).
Examples:
String(undefined); // "undefined"
String(null); // "null"
String(true); // "true"
String(42); // "42"
String(42n); // "42"
String({}); // "[object Object]" (calls ToPrimitive, then toString)
String([1, 2, 3]); // "1,2,3"
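String(Symbol("x")); // "Symbol(x)" (the String() function special-cases Symbols)
`${Symbol("x")}`; // TypeError! (implicit ToString on a Symbol)
Key insight: Objects are converted via ToPrimitive (§7.1.1, p. 63). With the "string" hint used here, it tries toString() first, then valueOf().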
ToNumber (§7.1.4, pp. 65-67)
Converts a value to a number.
Algorithm (simplified):
ToNumber ( argument )
If argument is a Number, return argument.
If argument is a Symbol or BigInt, throw TypeError.
If argument is undefined, return NaN.
If argument is null, return +0.
If argument is true, return 1.
If argument is false, return +0.
If argument is a String, parse it as a number (details in §7.1.4.1).
If argument is an Object, return ? ToNumber(? ToPrimitive(argument, NUMBER)).
Examples:
Number(undefined); // NaN
Number(null); // 0 (WAT?!)
Number(true); // 1
Number(false); // 0
Number("42"); // 42
Number("42.5"); // 42.5
Number(" 42 "); // 42 (trims whitespace)
Number(""); // 0 (empty string is 0)
Number("hello"); // NaN
Number([]); // 0 ([] -> "" -> 0)
Number([5]); // 5 ([5] -> "5" -> 5)
Number([1, 2]); // NaN ([1,2] -> "1,2" -> NaN)
Number({}); // NaN ({} -> "[object Object]" -> NaN)
Number(Symbol("x")); // TypeError!
Number(42n); // 42 (the Number() function converts BigInts; ToNumber itself, as in +42n, throws TypeError)
Why Number(null) === 0? Historical accident. In JavaScript’s original implementation, null was treated as “nothing,” which coerced to 0. It’s a terrible decision but unchangeable due to backward compatibility.
ToBoolean (§7.1.2, p. 64)
Converts a value to a boolean (covered in §2.1.3).
Falsy values: false,
0, -0, 0n, "",
null, undefined, NaN.
Everything else: Truthy.
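A quick check of that list, plus a few values that are truthy despite looking “empty” (the array names are illustrative only):

```javascript
// The complete falsy list from above.
const falsyValues = [false, 0, -0, 0n, "", null, undefined, NaN];

// Values that often surprise: non-empty strings and all objects are truthy.
const surprisinglyTruthy = ["0", "false", [], {}, new Boolean(false), -1, Infinity];

console.log(falsyValues.every((v) => Boolean(v) === false));       // true
console.log(surprisinglyTruthy.every((v) => Boolean(v) === true)); // true
```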
ToPrimitive: The Object-to-Primitive Dance
Specification: §7.1.1 (pp. 63-64)
When JavaScript needs to coerce an object to a primitive (for arithmetic, string concatenation, etc.), it calls ToPrimitive:
ToPrimitive ( input [ , preferredType ] )
If input is not an Object, return input.
If preferredType is not present, let hint be “default”.
Else if preferredType is STRING, let hint be “string”.
Else, let hint be “number”.
Let exoticToPrim be ? GetMethod(input, @@toPrimitive).
If exoticToPrim is not undefined, then
- Let result be ? Call(exoticToPrim, input, « hint »).
- If result is not an Object, return result.
- Throw TypeError.
If hint is “default”, set hint to “number”.
Return ? OrdinaryToPrimitive(input, hint).
OrdinaryToPrimitive (§7.1.1.1, p. 64):
OrdinaryToPrimitive ( O, hint )
If hint is “string”, then
- Let methodNames be « “toString”, “valueOf” ».
Else,
- Let methodNames be « “valueOf”, “toString” ».
For each name in methodNames, do
- Let method be ? Get(O, name).
- If IsCallable(method), then
  - Let result be ? Call(method, O).
  - If result is not an Object, return result.
Throw TypeError.
Translation:
If hint is "string", try toString() first, then valueOf().
If hint is "number" or "default", try valueOf() first, then toString().
If both return objects, throw TypeError.
Examples:
const obj = {
valueOf() { return 42; },
toString() { return "hello"; }
};
Number(obj); // 42 (hint is "number", tries valueOf first)
String(obj); // "hello" (hint is "string", tries toString first)
obj + ""; // "42" (hint is "default", which becomes "number")
Custom Symbol.toPrimitive:
const obj = {
[Symbol.toPrimitive](hint) {
if (hint === "number") return 42;
if (hint === "string") return "hello";
return "default";
}
};
Number(obj); // 42
String(obj); // "hello"
obj + ""; // "default"
Systems note: If you’re transpiling a language with operator overloading to JavaScript, you might emit code that defines Symbol.toPrimitive.
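The ToPrimitive and OrdinaryToPrimitive steps can be modeled in user code, which is a useful way to check one’s reading of the algorithm. The function names below (toPrimitive, ordinaryToPrimitive, isObject) are hypothetical; this is a sketch of the spec steps, not the engine’s implementation.

```javascript
// Hypothetical helper: true for anything the spec calls an Object.
function isObject(v) {
  return v !== null && (typeof v === "object" || typeof v === "function");
}

// Models OrdinaryToPrimitive(O, hint).
function ordinaryToPrimitive(obj, hint) {
  const methodNames = hint === "string" ? ["toString", "valueOf"] : ["valueOf", "toString"];
  for (const name of methodNames) {
    const method = obj[name];
    if (typeof method === "function") {
      const result = method.call(obj);
      if (!isObject(result)) return result;
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

// Models ToPrimitive(input, preferredType); preferredType is "string", "number", or omitted.
function toPrimitive(input, preferredType) {
  if (!isObject(input)) return input;
  let hint = preferredType === undefined ? "default" : preferredType;
  const exotic = input[Symbol.toPrimitive];
  if (exotic !== undefined && exotic !== null) {
    if (typeof exotic !== "function") throw new TypeError("Symbol.toPrimitive is not callable");
    const result = exotic.call(input, hint);
    if (!isObject(result)) return result;
    throw new TypeError("Symbol.toPrimitive returned an object");
  }
  if (hint === "default") hint = "number";
  return ordinaryToPrimitive(input, hint);
}

const sample = { valueOf() { return 42; }, toString() { return "hello"; } };
console.log(toPrimitive(sample, "number")); // 42
console.log(toPrimitive(sample, "string")); // hello
console.log(toPrimitive([1, 2]));           // 1,2
```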
Abstract Equality (==) vs. Strict Equality (===)
Specification: §7.2.15 (pp. 74-75) for
==, §7.2.16 (p. 75) for ===.
Strict Equality (===)
No coercion. Types must match:
x === y
If Type(x) is different from Type(y), return false.
If Type(x) is Number or BigInt, compare numerically (handle NaN, ±0).
Return SameValueNonNumeric(x, y) (reference comparison for Objects, value comparison for primitives).
Examples:
5 === 5; // true
5 === "5"; // false (different types)
NaN === NaN; // false (NaN is unordered)
+0 === -0; // true
[] === []; // false (different object references)
Abstract Equality (==)
Coercion madness. The algorithm is 12 steps (§7.2.15, pp. 74-75). Simplified version:
x == y
If Type(x) is the same as Type(y), return x === y.
If x is null and y is undefined (or vice versa), return true.
If x is Number and y is String, return x == ToNumber(y).
If x is String and y is Number, return ToNumber(x) == y.
If x is BigInt and y is String, convert String to BigInt and compare.
If x is Boolean, return ToNumber(x) == y.
If y is Boolean, return x == ToNumber(y).
If x is Object and y is primitive, return ToPrimitive(x) == y.
If y is Object and x is primitive, return x == ToPrimitive(y).
Otherwise, return false.
Examples:
null == undefined; // true (special case)
5 == "5"; // true (coerces "5" to 5)
0 == false; // true (false -> 0)
"" == false; // true ("" -> 0, false -> 0)
[] == false; // true ([] -> "" -> 0, false -> 0)
[] == ![]; // true (Chapter 1 example: [] -> 0, ![] -> false -> 0)
"0" == false; // true ("0" -> 0, false -> 0)
"0" == 0; // true ("0" -> 0)
0 == "0"; // true (symmetric)
false == "false"; // false ("false" -> NaN, false -> 0)
Why [] == ![] is true (§1.6 in Chapter 1):
![] → false (objects are truthy).
[] == false.
false → 0 (step 7).
[] == 0.
[] → ToPrimitive([]) → "" (step 8).
"" == 0.
"" → 0 (step 4).
0 == 0 → true.
Engineering lesson: Never use
== unless you explicitly need coercion (rare).
Always use ===.
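For readers who want to trace the algorithm by hand, the simplified steps listed earlier can be written as a small userland model. The names looseEquals, typeTag, and toPrim are hypothetical, and BigInt operands are deliberately left out of this sketch.

```javascript
// Hypothetical helper: a coarse type tag where functions count as objects.
function typeTag(v) {
  if (v === null) return "null";
  const t = typeof v;
  return t === "function" ? "object" : t;
}

// ToPrimitive with the default hint (valueOf first, then toString).
function toPrim(obj) {
  const exotic = obj[Symbol.toPrimitive];
  if (typeof exotic === "function") return exotic.call(obj, "default");
  for (const name of ["valueOf", "toString"]) {
    const method = obj[name];
    if (typeof method === "function") {
      const result = method.call(obj);
      if (typeTag(result) !== "object") return result;
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

// Model of the simplified == steps; step numbers refer to the list above.
function looseEquals(x, y) {
  const tx = typeTag(x);
  const ty = typeTag(y);
  if (tx === "bigint" || ty === "bigint") throw new Error("BigInt is not modeled in this sketch");
  if (tx === ty) return x === y;                                              // step 1
  const xNullish = tx === "null" || tx === "undefined";
  const yNullish = ty === "null" || ty === "undefined";
  if (xNullish && yNullish) return true;                                      // step 2
  if (tx === "number" && ty === "string") return looseEquals(x, Number(y));   // step 3
  if (tx === "string" && ty === "number") return looseEquals(Number(x), y);   // step 4
  if (tx === "boolean") return looseEquals(Number(x), y);                     // step 6
  if (ty === "boolean") return looseEquals(x, Number(y));                     // step 7
  if (tx === "object" && !yNullish) return looseEquals(toPrim(x), y);         // step 8
  if (ty === "object" && !xNullish) return looseEquals(x, toPrim(y));         // step 9
  return false;                                                               // step 10
}

console.log(looseEquals([], ![]));          // true
console.log(looseEquals("0", false));       // true
console.log(looseEquals(false, "false"));   // false
console.log(looseEquals(null, 0));          // false
```

Comparing the model against the native operator on a table of inputs is a cheap way to confirm that a transpiler’s equality lowering matches the language.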
Relational Comparisons (<, >, <=, >=)
Specification: §7.2.14 (pp. 73-74)
Relational operators coerce to primitives (with hint
"number"), then compare:
x < y
Let px be ? ToPrimitive(x, NUMBER).
Let py be ? ToPrimitive(y, NUMBER).
If px and py are Strings, compare lexicographically.
Else, convert both to Number and compare numerically.
Examples:
5 < 10; // true
"5" < 10; // true ("5" -> 5)
"10" < "5"; // true (lexicographic: "1" < "5")
"10" < 5; // false ("10" -> 10, 10 < 5 is false)
[1] < [2]; // true ([1] -> "1", [2] -> "2"; both are Strings, so "1" < "2" lexicographically)
({}) < ({}); // false (both -> "[object Object]"; equal Strings, so < is false)
Gotcha: String comparison is lexicographic, not numeric:
"10" < "2"; // true ("1" < "2")
10 < 2; // false
Systems note: If you’re generating comparison code, ensure both operands are Numbers (or both Strings, if lexicographic order is intended) to avoid coercion surprises.
Autoboxing: Primitives with Methods
The Illusion of Primitive Methods
Primitives (String, Number, Boolean, Symbol, BigInt) are not objects. Yet you can call methods on them:
"hello".toUpperCase(); // "HELLO"
(42).toFixed(2); // "42.00"
true.toString(); // "true"
How? Autoboxing (also called wrapping). When you access a property on a primitive, JavaScript temporarily converts it to an object:
Algorithm (§7.1.19, p. 72):
ToObject ( argument )
If argument is undefined or null, throw TypeError.
If argument is a Boolean, return a Boolean object wrapping argument.
If argument is a Number, return a Number object wrapping argument.
If argument is a String, return a String object wrapping argument.
If argument is a Symbol, return a Symbol object wrapping argument.
If argument is a BigInt, return a BigInt object wrapping argument.
If argument is an Object, return argument.
Example:
const str = "hello";
str.toUpperCase(); // Internally: ToObject(str).toUpperCase()
The wrapper object is created, the method is called, and the wrapper is discarded.
Wrapper constructors:
const strObj = new String("hello");
console.log(typeof strObj); // "object"
console.log(strObj.valueOf()); // "hello" (unwrap)
console.log(strObj === "hello"); // false (object vs. primitive)
Never use wrapper constructors explicitly (with new). They’re confusing and unnecessary. If you want to convert types, call without new:
String(42); // "42" (converts to primitive string)
Number("42"); // 42 (converts to primitive number)
Why Autoboxing Matters
Performance: Autoboxing has overhead in principle. Engines optimize it (optimizing compilers typically avoid allocating the wrapper at all), but unoptimized code paths can still be slower than operating on primitives directly.
Mutation doesn’t work:
const str = "hello";
str.foo = "bar";
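console.log(str.foo); // undefined (the wrapper is discarded)
Each access creates a new wrapper, so assignments are lost. (In strict mode, the assignment itself throws a TypeError instead.)
Systems note: If you’re compiling to JavaScript and want to attach metadata to primitives, you can’t. You’ll need to use a Map to associate data with primitives; a WeakMap will not work for strings or numbers, because its keys must be objects (or, since ES2023, non-registered symbols).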
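A minimal sketch of the side-table approach follows. The variable names and the metadata shape are made up for illustration; the point is only that a Map accepts primitive keys while a WeakMap rejects strings and numbers.

```javascript
// Side table keyed by primitive values (Map compares keys with SameValueZero).
const primitiveMetadata = new Map();
primitiveMetadata.set("hello", { origin: "string literal" });
primitiveMetadata.set(42, { origin: "numeric literal" });

// Any equal string finds the same entry, regardless of how it was produced.
const rebuilt = ["hel", "lo"].join("");
console.log(primitiveMetadata.get(rebuilt).origin); // string literal

// WeakMap refuses string keys, so it cannot serve as the side table here.
let weakMapRejected = false;
try {
  new WeakMap().set("hello", {});
} catch (err) {
  weakMapRejected = err instanceof TypeError;
}
console.log(weakMapRejected); // true
```

Because a Map holds its keys strongly, entries stay alive until they are deleted, so a long-running program needs its own eviction policy.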
Value vs. Reference Semantics: The Great Divide
Primitives Are Passed by Value
When you assign a primitive to a variable or pass it to a function, the value is copied:
let a = 5;
let b = a;
b = 10;
console.log(a); // 5 (unchanged)
function modify(x) {
x = 100;
}
let num = 50;
modify(num);
console.log(num); // 50 (unchanged)
Systems analogy: Primitives behave like C’s int and double: they are passed by value.
Objects Are Passed by Reference
When you assign an object to a variable or pass it to a function, the reference is copied, not the object itself:
let obj1 = { x: 5 };
let obj2 = obj1;
obj2.x = 10;
console.log(obj1.x); // 10 (mutated!)
function modify(obj) {
obj.x = 100;
}
let myObj = { x: 50 };
modify(myObj);
console.log(myObj.x); // 100 (mutated!)
But reassigning the variable doesn’t affect the original:
function modify(obj) {
obj = { x: 200 };
}
let myObj = { x: 50 };
modify(myObj);
console.log(myObj.x); // 50 (unchanged, because we reassigned the local variable)
Systems analogy: Objects behave like C pointers: you’re passing the address, not the data. Mutating through the pointer affects the original. Reassigning the pointer doesn’t.
Implications for Transpiler Design
If you’re compiling a language with value semantics (like Rust or C++ structs) to JavaScript:
Option 1: Represent structs as objects, but deep clone on assignment (expensive).
Option 2: Use immutable patterns (return new objects instead of mutating).
Option 3: Use Object.freeze() to prevent mutation (writes throw a TypeError in strict mode and are silently ignored otherwise; the freeze is shallow).
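Options 2 and 3 can be sketched together. The helper names withX and tryWriteX are hypothetical, and the struct shape is only an example.

```javascript
// Option 2: "mutation" returns a fresh object and leaves the input alone.
function withX(point, x) {
  return { ...point, x };
}

// Option 3: freeze so that accidental writes are rejected.
// Object.freeze is shallow: nested objects would need freezing too.
function tryWriteX(point, x) {
  "use strict";                 // writes to frozen objects throw only in strict mode
  try {
    point.x = x;
    return "written";
  } catch (err) {
    return err instanceof TypeError ? "rejected" : "unexpected error";
  }
}

const p1 = Object.freeze({ x: 5, y: 10 });
const p2 = withX(p1, 20);       // a new, unfrozen object

console.log(p1.x, p2.x);        // 5 20
console.log(tryWriteX(p1, 99)); // rejected
console.log(tryWriteX(p2, 99)); // written
```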
Example (deep clone):
function cloneStruct(obj) {
return JSON.parse(JSON.stringify(obj)); // Naive, doesn't handle functions/symbols
}
let struct1 = { x: 5, y: 10 };
let struct2 = cloneStruct(struct1);
struct2.x = 20;
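console.log(struct1.x); // 5 (unchanged)
Better cloning (structuredClone, a host API from the HTML Standard that is available in modern browsers and Node.js 17+; it is not part of ECMAScript):
let struct2 = structuredClone(struct1);
Shallow vs. Deep Comparison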
=== compares references for
objects:
console.log({} === {}); // false (different references)
console.log([] === []); // false
let obj = {};
console.log(obj === obj); // true (same reference)
Deep comparison requires a custom function:
function deepEqual(a, b) {
if (a === b) return true;
if (typeof a !== "object" || typeof b !== "object") return false;
if (a === null || b === null) return false;
const keysA = Object.keys(a);
const keysB = Object.keys(b);
if (keysA.length !== keysB.length) return false;
for (const key of keysA) {
if (!keysB.includes(key)) return false;
if (!deepEqual(a[key], b[key])) return false;
}
return true;
}
console.log(deepEqual({ x: 1 }, { x: 1 })); // true
console.log(deepEqual([1, 2], [1, 2])); // true
Systems note: If you’re emitting comparison code, primitives can use ===, but objects require deep comparison if you want value semantics.
Memory Management: Garbage Collection and WeakMaps
JavaScript Has Automatic Garbage Collection
Unlike C, you don’t manually allocate/free memory. JavaScript engines use tracing garbage collectors (mark-and-sweep or generational GC).
Key insight: You can’t deallocate memory explicitly. Objects are collected when unreachable.
Example:
let obj = { data: "large array" };
obj = null; // Now the object is unreachable and can be GC'd
Generational GC (V8)
V8 uses a generational garbage collector with two generations:
Young generation (nursery): New objects. GC’d frequently (minor GC, fast).
Old generation (tenured): Objects that survive multiple young-gen GCs. GC’d infrequently (major GC, slow).
Promotion: Objects move from young to old after surviving several collections.
Stop-the-world pauses: GC pauses JavaScript execution. V8 uses incremental marking and concurrent sweeping to minimize pauses.
Memory Leaks in JavaScript
You can’t have use-after-free or double-free bugs (GC prevents that), but you can have memory leaks—objects that are reachable but no longer needed.
Common causes:
Global variables: Never collected.
Closures: Capture variables in scope, preventing collection.
Event listeners: Not removed when DOM elements are removed.
Caches without eviction: Map or object holding references indefinitely.
Example (closure leak):
function createLeakyFunction() {
const largeData = new Array(1000000).fill("data");
return function() {
console.log("I capture largeData in my closure!");
};
}
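const fn = createLeakyFunction(); // largeData may be retained
Per the spec, the returned function keeps a reference to the whole enclosing environment, so largeData stays reachable even though the function never uses it. In practice, engines such as V8 usually keep only the variables that some inner function references, so this exact example may not leak; the leak becomes real once any closure created in the same scope mentions largeData (or the scope contains a direct eval).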
WeakMap and WeakSet: Weak References
Specification: §25.3 (WeakMap, pp. 651-656), §25.4 (WeakSet, pp. 657-660)
WeakMap and WeakSet hold weak
references to keys. If the only reference to a key is from
a WeakMap/WeakSet, the key can be garbage collected.
Use case: Associating metadata with objects without preventing their collection.
Example:
const metadata = new WeakMap();
let obj = { id: 1 };
metadata.set(obj, { timestamp: Date.now() });
console.log(metadata.get(obj)); // { timestamp: ... }
obj = null; // Now obj can be GC'd, and the WeakMap entry is removed
Contrast with Map:
const metadata = new Map();
let obj = { id: 1 };
metadata.set(obj, { timestamp: Date.now() });
obj = null; // obj is still reachable via the Map, so it's NOT GC'd
WeakRef (ES2021): For even finer control, WeakRef creates a weak reference to an object. You can check if it’s still alive:
let obj = { data: "important" };
const ref = new WeakRef(obj);
console.log(ref.deref()); // { data: "important" }
obj = null; // Object may be GC'd
setTimeout(() => {
console.log(ref.deref()); // undefined (if GC'd)
}, 1000);
FinalizationRegistry (ES2021): Runs a callback when an object is collected:
const registry = new FinalizationRegistry((heldValue) => {
console.log(`Object with value ${heldValue} was collected`);
});
let obj = { id: 1 };
registry.register(obj, "myObject");
obj = null; // Callback runs eventually after GC
Systems note: If you’re implementing a transpiler with manual memory management (e.g., compiling C to JavaScript), you can use WeakMap to associate metadata (like allocation info) with objects without preventing collection.
Specification Types: Internal Constructs You’ll Never Touch Directly
What Are Specification Types?
In addition to the eight language types (Undefined, Null, Boolean, String, Symbol, Number, BigInt, Object), the spec defines specification types: internal constructs used in algorithmic descriptions. These are not accessible in JavaScript code.
Specification types (§6.2, pp. 43-61):
Completion Record: Represents the result of an operation (normal, return, throw, break, continue).
Reference: Represents a binding (variable, property).
Property Descriptor: Describes a property’s attributes.
Environment Record: Stores variable bindings (lexical scope).
Data Block: Raw binary data (for TypedArrays, SharedArrayBuffer).
Completion Record: How Control Flow Works Internally
Specification: §6.2.3 (pp. 44-45)
Every abstract operation returns a Completion Record:
CompletionRecord {
  [[Type]]: normal | return | throw | break | continue
  [[Value]]: any value
  [[Target]]: label (for break/continue)
}
Normal completion: Operation succeeded,
[[Value]] is the result.
Abrupt completion: Operation failed or control
flow changed (return, throw,
break, continue).
The ? operator (§5.2.3.3,
p. 15):
Let result be ? SomeOperation().
Equivalent to:
Let result be SomeOperation().
If result is an abrupt completion, return result.
Otherwise, let result be result.[[Value]].
The ! operator (§5.2.3.4,
p. 15):
Let result be ! SomeOperation().
Equivalent to:
Let result be SomeOperation().
Assert: result is a normal completion.
Let result be result.[[Value]].
Systems analogy: Completion Records are like
error codes in C (-1 for error, 0 for
success) or Rust’s Result<T, E>. The
? operator is like Rust’s ? operator
(early return on error).
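The propagation rule behind the ? shorthand can be mimicked with plain objects. Everything in this sketch (the record shape, normalCompletion, throwCompletion, parsePositive, area) is a hypothetical model for illustration, not something the language exposes.

```javascript
// Hypothetical model of a Completion Record as a plain object.
const normalCompletion = (value) => ({ type: "normal", value });
const throwCompletion = (value) => ({ type: "throw", value });
const isAbrupt = (completion) => completion.type !== "normal";

// An "abstract operation" that can fail.
function parsePositive(text) {
  const n = Number(text);
  if (Number.isNaN(n) || n <= 0) {
    return throwCompletion(new RangeError(`not a positive number: ${text}`));
  }
  return normalCompletion(n);
}

// Each "?" in the spec corresponds to one check-and-return below.
function area(widthText, heightText) {
  const width = parsePositive(widthText);
  if (isAbrupt(width)) return width;       // ? parsePositive(widthText)
  const height = parsePositive(heightText);
  if (isAbrupt(height)) return height;     // ? parsePositive(heightText)
  return normalCompletion(width.value * height.value);
}

console.log(area("3", "4"));         // { type: 'normal', value: 12 }
console.log(area("3", "oops").type); // throw
```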
Reference: How Variable Lookup Works
Specification: §6.2.4 (pp. 45-46)
A Reference represents a binding to a variable or property:
Reference {
  [[Base]]: environment record or object
  [[ReferencedName]]: property key or variable name
  [[Strict]]: boolean (strict mode?)
}
Example: In obj.prop, the reference
is:
Reference {
  [[Base]]: obj
  [[ReferencedName]]: “prop”
}
GetValue (§6.2.4.5, p. 46) dereferences the reference:
GetValue ( V )
If V is not a Reference, return V.
Let base be V.[[Base]].
If base is an Environment Record, return base.GetBindingValue(V.[[ReferencedName]]).
Else (base is an Object),
- Return ? base.[[Get]](V.[[ReferencedName]], GetThisValue(V)).
Systems note: This is how a.b is evaluated: evaluating the member expression produces a Reference, and GetValue dereferences it when the value is needed.
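A small model makes the two branches of GetValue easier to see. The class and function names here (ReferenceModel, EnvRecordModel, getValue) are invented for this sketch, and the model ignores unresolvable references and super references.

```javascript
// Hypothetical stand-ins for the spec's Reference and Environment Record.
class ReferenceModel {
  constructor(base, referencedName) {
    this.base = base;
    this.referencedName = referencedName;
  }
}

class EnvRecordModel {
  constructor(bindings) {
    this.bindings = new Map(Object.entries(bindings));
  }
  getBindingValue(name) {
    if (!this.bindings.has(name)) throw new ReferenceError(`${name} is not defined`);
    return this.bindings.get(name);
  }
}

// Models GetValue(V).
function getValue(v) {
  if (!(v instanceof ReferenceModel)) return v;              // not a Reference: already a value
  const { base, referencedName } = v;
  if (base instanceof EnvRecordModel) {
    return base.getBindingValue(referencedName);             // variable lookup
  }
  // Property lookup: box primitives, then perform [[Get]] with base as the receiver.
  return Reflect.get(Object(base), referencedName, base);
}

const scope = new EnvRecordModel({ x: 10 });
console.log(getValue(new ReferenceModel(scope, "x")));           // 10
console.log(getValue(new ReferenceModel({ prop: 42 }, "prop"))); // 42
console.log(getValue(new ReferenceModel("abc", "length")));      // 3
```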
Property Descriptor: How Properties Are Stored
Specification: §6.2.5 (pp. 46-47)
A Property Descriptor describes a property’s attributes:
PropertyDescriptor {
  [[Value]]: any (for data properties)
  [[Writable]]: boolean (for data properties)
  [[Get]]: function (for accessor properties)
  [[Set]]: function (for accessor properties)
  [[Enumerable]]: boolean
  [[Configurable]]: boolean
}
Data property:
const obj = {};
Object.defineProperty(obj, "x", {
value: 42,
writable: true,
enumerable: true,
configurable: true
});
Accessor property:
const obj = {};
Object.defineProperty(obj, "x", {
get() { return this._x; },
set(val) { this._x = val; },
enumerable: true,
configurable: true
});
Systems note: If you’re implementing object property access in a transpiler, you need to handle both data and accessor properties correctly.
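Descriptors can be read back with Object.getOwnPropertyDescriptor, which is one way for generated code to tell the two kinds apart. The object and the helper describeKind are hypothetical names used only in this sketch.

```javascript
const account = {
  balance: 10,                                // data property
  get doubled() { return this.balance * 2; }  // accessor property
};

// Hypothetical helper: classifies an own property by its descriptor shape.
function describeKind(obj, key) {
  const descriptor = Object.getOwnPropertyDescriptor(obj, key);
  if (descriptor === undefined) return "missing";
  return ("get" in descriptor || "set" in descriptor) ? "accessor" : "data";
}

const dataDescriptor = Object.getOwnPropertyDescriptor(account, "balance");
console.log(dataDescriptor.value, dataDescriptor.writable); // 10 true
console.log(describeKind(account, "balance"));              // data
console.log(describeKind(account, "doubled"));              // accessor
console.log(describeKind(account, "nope"));                 // missing
```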
Environment Record: Lexical Scope
Specification: §9.1 (pp. 137-145)
An Environment Record stores variable bindings for a scope:
EnvironmentRecord {
  [[OuterEnv]]: parent environment (for scope chain)
  bindings: map of variable names to values
}
Three kinds:
Declarative Environment Record: let, const, function parameters.
Object Environment Record: with statement (legacy), global object.
Global Environment Record: Global scope (combines declarative and object).
Lexical scope is implemented as a linked list of environment records:
Global Environment
-> Function Environment (outer function)
-> Block Environment (let/const in block)
Systems analogy: Environment Records are like stack frames in C, but heap-allocated (for closures). The [[OuterEnv]] pointer is like a static link (access link) in languages with nested functions: it points to the lexically enclosing scope, not to the caller’s frame.
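The linked-list lookup can be sketched in a few lines. ScopeRecord and its methods are hypothetical names; the model covers only declaration and lookup, not assignment or the temporal dead zone.

```javascript
// Hypothetical model of a chain of environment records.
class ScopeRecord {
  constructor(outer = null) {
    this.outer = outer;          // plays the role of [[OuterEnv]]
    this.bindings = new Map();
  }
  declare(name, value) {
    this.bindings.set(name, value);
  }
  lookup(name) {
    // Walk outward until a record holds the binding.
    for (let env = this; env !== null; env = env.outer) {
      if (env.bindings.has(name)) return env.bindings.get(name);
    }
    throw new ReferenceError(`${name} is not defined`);
  }
}

const globalScope = new ScopeRecord();
globalScope.declare("x", "global x");
globalScope.declare("g", "only in global");

const functionScope = new ScopeRecord(globalScope);
functionScope.declare("x", "function x");   // shadows the global x

const blockScope = new ScopeRecord(functionScope);
blockScope.declare("y", "block y");

console.log(blockScope.lookup("y")); // block y
console.log(blockScope.lookup("x")); // function x
console.log(blockScope.lookup("g")); // only in global
```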
End of Chapter 2
In the next chapter, we’ll explore JavaScript’s object model in
depth—prototypes, inheritance, property access, the
this keyword, and how engines optimize object layout
with hidden classes.
Chapter 3: The Execution Model: Event Loop, Call Stack, and Heap
JavaScript’s Single-Threaded Concurrency Model
The Fundamental Constraint: One Thread of Execution
JavaScript was designed for the browser. In 1995, Netscape wanted a simple scripting language that could manipulate HTML without the complexity of threads. Brendan Eich’s solution: a single-threaded execution model with asynchronous I/O.
What does “single-threaded” mean?
Only one JavaScript call stack at a time.
Only one statement executes at any given moment.
No race conditions on shared memory (in the traditional sense).
No need for locks, mutexes, or semaphores in JavaScript code.
Systems analogy: Contrast this with POSIX threads (pthreads) in C:
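#include <pthread.h>
#include <stddef.h>

int shared_counter = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void* increment(void* arg) {
    pthread_mutex_lock(&lock);   // Serialize access to shared_counter
    shared_counter++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t thread1, thread2;
    pthread_create(&thread1, NULL, increment, NULL);
    pthread_create(&thread2, NULL, increment, NULL);
    pthread_join(thread1, NULL); // Wait for both threads to finish
    pthread_join(thread2, NULL);
    return 0;
}
In JavaScript, you never write code like this. The single-threaded model eliminates the need for explicit synchronization primitives at the language level (the exception is memory shared between workers through SharedArrayBuffer, which is coordinated with Atomics).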
But how does JavaScript handle I/O without blocking?
The answer is the event loop—a run-to-completion model where asynchronous operations are queued and executed when the stack is empty.
The Event Loop Architecture: A Conceptual Overview
The JavaScript runtime consists of three core components:
Call Stack: Tracks function execution (stack frames).
Heap: Stores objects and closures (garbage-collected memory).
Event Loop: Coordinates asynchronous tasks (timers, I/O, promises).
Diagram (conceptual):
┌─────────────────────────────────────────────────────────┐
│                   JavaScript Runtime                    │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────────┐           ┌──────────────┐            │
│  │  Call Stack  │           │     Heap     │            │
│  │              │           │              │            │
│  │  frame_n     │           │  Object@0x1  │            │
│  │  frame_n-1   │           │  Object@0x2  │            │
│  │  …           │           │  …           │            │
│  │  frame_0     │           │              │            │
│  └──────────────┘           └──────────────┘            │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │            Event Loop (Message Queue)             │  │
│  ├───────────────────────────────────────────────────┤  │
│  │ Task Queue (Macrotasks):                          │  │
│  │   [setTimeout, setInterval, I/O, UI events]       │  │
│  │ Microtask Queue:                                  │  │
│  │   [Promise callbacks, queueMicrotask]             │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
└─────────────────────────────────────────────────────────┘
                             ▲
                             │
      Web APIs / Node.js APIs (setTimeout, fetch, fs.readFile, etc.)
Execution flow:
Synchronous code runs on the call stack until completion.
Asynchronous operations (timers, I/O) are delegated to Web APIs (browser) or libuv (Node.js).
When an async operation completes, its callback is enqueued.
The event loop dequeues tasks when the call stack is empty.
Key invariant: The event loop never interrupts executing code. Each task runs to completion before the next task starts.
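The run-to-completion rule and the priority of microtasks over tasks can be demonstrated with a toy scheduler. ToyEventLoop is a hypothetical class written for this sketch; it runs synchronously and models only the queue discipline, not timers or I/O.

```javascript
// Hypothetical toy model: one task queue, one microtask queue.
class ToyEventLoop {
  constructor() {
    this.tasks = [];
    this.microtasks = [];
    this.log = [];
  }
  queueTask(fn) { this.tasks.push(fn); }
  queueMicrotask(fn) { this.microtasks.push(fn); }
  drainMicrotasks() {
    // Microtasks queued while draining run in the same drain.
    while (this.microtasks.length > 0) {
      const microtask = this.microtasks.shift();
      microtask();
    }
  }
  run(mainScript) {
    mainScript();              // the initial script behaves like the first task
    this.drainMicrotasks();
    while (this.tasks.length > 0) {
      const task = this.tasks.shift();
      task();                  // each task runs to completion
      this.drainMicrotasks();  // then all pending microtasks run
    }
  }
}

const loop = new ToyEventLoop();
loop.run(() => {
  loop.log.push("sync start");
  loop.queueTask(() => loop.log.push("task"));
  loop.queueMicrotask(() => {
    loop.log.push("microtask 1");
    loop.queueMicrotask(() => loop.log.push("microtask 2"));
  });
  loop.log.push("sync end");
});

console.log(loop.log);
// [ 'sync start', 'sync end', 'microtask 1', 'microtask 2', 'task' ]
```

In a real runtime the same ordering appears when "task" is scheduled with setTimeout and the microtasks are promise callbacks.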
The Event Loop Specification (HTML Living Standard)
JavaScript’s core spec (ECMA-262) does not define the event loop. It’s defined in:
Browsers: HTML Living Standard (the “Event loops” section)
Node.js: libuv documentation
ECMA-262’s Job Queue (§9.5, pp. 150-151):
The spec defines an abstract job queue for promise reactions (microtasks), but leaves the event loop implementation-defined:
Jobs are scheduled for future execution via EnqueueJob. The Host (browser or Node.js) determines when jobs are executed.
HTML Living Standard’s Event Loop:
Each event loop has:
Task queues (plural): Multiple queues for different task sources (timers, I/O, rendering).
Microtask queue: Single queue for promise callbacks.
Rendering pipeline: Steps for updating the DOM and screen.
Node.js’s Event Loop:
Six phases, each with a queue:
Timers: setTimeout, setInterval.
Pending callbacks: I/O callbacks deferred from the previous cycle.
Idle, prepare: Internal use.
Poll: Retrieve new I/O events; execute I/O callbacks.
Check: setImmediate callbacks.
Close callbacks: socket.on('close', ...).
Between each phase, the microtask queue is drained; since Node.js 11 it is also drained after each individual callback within a phase, matching browser behavior.
Systems takeaway: The event loop is host-defined, but the core principle is the same: run to completion + task queues.
The Call Stack: Execution Contexts and Stack Frames
Execution Contexts: The Spec’s Abstraction
Specification: §9.4 (pp. 146-150)
An execution context is the spec’s abstraction for “what’s currently executing.” It contains:
ExecutionContext {
  CodeEvaluationState: suspend/resume (for generators/async)
  Function: the function being executed (or null for global/eval)
  Realm: the global object and intrinsics
  ScriptOrModule: the script or module
  LexicalEnvironment: current lexical scope (let/const)
  VariableEnvironment: current variable scope (var)
  PrivateEnvironment: private fields (class fields)
}
Execution context stack: A stack of execution contexts (analogous to the call stack in C).
Example:
function outer() {
let x = 1;
function inner() {
let y = 2;
console.log(x + y);
}
inner();
}
outer();
Execution context stack evolution:
Initial: [GlobalExecutionContext]
outer() called: [GlobalExecutionContext, outer_ExecutionContext]
inner() called: [GlobalExecutionContext, outer_ExecutionContext, inner_ExecutionContext]
inner() returns: [GlobalExecutionContext, outer_ExecutionContext]
outer() returns: [GlobalExecutionContext]
Systems analogy: This is closely analogous to the call stack in C:
#include <stdio.h>

void inner(void) { int y = 2; printf("%d\n", y); }
void outer(void) { int x = 1; (void)x; inner(); }
int main(void) { outer(); return 0; }
Call stack:
main’s frame
-> outer’s frame
-> inner’s frame
Stack Frames in JavaScript Engines
Engines don’t literally implement the spec’s execution context stack—they use optimized stack frames (like C compilers).
V8’s stack frame (simplified):
┌─────────────────────────────────────┐
│ Return Address (caller’s PC)        │
│ Frame Pointer (FP, points to base)  │
│ Context (pointer to lexical env)    │
│ Function (pointer to JSFunction)    │
│ Receiver (this)                     │
│ Argument 0                          │
│ Argument 1                          │
│ …                                   │
│ Local Variable 0                    │
│ Local Variable 1                    │
│ …                                   │
│ Temporary Value 0 (stack operands)  │
│ Temporary Value 1                   │
│ …                                   │
└─────────────────────────────────────┘
Key elements:
Return address: Where to jump after the function returns.
Frame pointer (FP): Base of the current frame (like rbp in x86-64).
Context: Pointer to the LexicalEnvironment (for closures).
Receiver: The this value.
Arguments: Function parameters.
Locals: Local variables.
Temporaries: Operands for bytecode instructions.
Systems note: The frame itself lives on the machine stack. What moves to the heap is the Context: variables captured by a closure are stored in a heap-allocated Context object so they can outlive the call, while uncaptured locals stay in the frame (or in registers).
Stack Overflow: Recursion Depth Limits
JavaScript has a maximum call stack size
(implementation-defined). Exceeding it throws
RangeError: Maximum call stack size exceeded.
Example:
function recurse() {
recurse(); // Infinite recursion
}
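recurse(); // RangeError
Stack size limits (rough, order-of-magnitude figures; the real depth depends on frame size, engine version, and the configured stack size):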
V8 (Chrome/Node.js): ~10,000-15,000 frames (depends on platform).
SpiderMonkey (Firefox): ~50,000 frames.
JavaScriptCore (Safari): ~100,000 frames.
Systems implication: If you’re compiling a recursive language (Scheme, Haskell) to JavaScript, you need tail call optimization (TCO) or trampolining.
Tail Call Optimization (TCO): ES6 specified proper tail calls in strict mode (§15.9.1, p. 361), but only Safari (JavaScriptCore) ships them. V8 and SpiderMonkey have declined to ship the feature, citing debugging (lost stack frames) and implementation concerns.
Trampolining: Convert recursion to iteration:
function trampoline(fn) {
let result = fn();
while (typeof result === "function") {
result = result();
}
return result;
}
function factorial(n, acc = 1) {
if (n <= 1) return acc;
return () => factorial(n - 1, n * acc); // Return a thunk
}
console.log(trampoline(() => factorial(100000))); // No stack overflow (prints Infinity: the Number result overflows; use BigInt for an exact value)
Continuation-Passing Style (CPS): Another approach is to pass the “rest of the computation” as a callback:
function factorial_cps(n, k) {
if (n <= 1) return k(1);
return factorial_cps(n - 1, (result) => k(n * result));
}
factorial_cps(5, (result) => console.log(result)); // 120
But this still overflows for large n without TCO. You’d need to trampoline the CPS’d function.
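The combination mentioned above can be sketched as follows. This is a minimal illustration, not a canonical implementation: the names trampoline and factorialCps are chosen here for the example, BigInt is used so the result stays exact, and every call (including each continuation call) returns a thunk so that the stack depth stays constant.

```javascript
// Trampoline: keep invoking thunks until a non-function value appears.
function trampoline(thunk) {
  let result = thunk;
  while (typeof result === "function") {
    result = result();
  }
  return result;
}

// CPS factorial in which every step returns a thunk instead of calling directly.
// Assumption for this sketch: the final result is never itself a function.
function factorialCps(n, k) {
  if (n <= 1n) return () => k(1n);
  return () => factorialCps(n - 1n, (result) => () => k(n * result));
}

// The identity continuation hands the final BigInt back to the trampoline.
const small = trampoline(() => factorialCps(5n, (result) => result));
console.log(small); // 120n

// A depth that would typically overflow the stack as plain recursion.
const large = trampoline(() => factorialCps(20000n, (result) => result));
console.log(large.toString().length); // number of decimal digits in 20000!
```

The cost is one closure allocation per step, which trades stack depth for heap pressure in the young generation.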
Stack Traces and Debugging
Error.stack (non-standard but universal):
function foo() { bar(); }
function bar() { baz(); }
function baz() { console.log(new Error().stack); }
foo();
Output (V8):
Error at baz (
Source maps: For transpiled/minified code,
engines use source maps (.map files)
to map stack traces back to original source.
Systems note: If you’re building a transpiler, generate source maps (§13.5) so stack traces are useful.
The Heap: Object Storage and Garbage Collection
Heap Layout: Young Gen, Old Gen, Large Objects
JavaScript engines use generational garbage collection with multiple heap regions:
V8’s heap structure:
┌─────────────────────────────────────────────────────┐
│ V8 Heap                                             │
├─────────────────────────────────────────────────────┤
│ Young Generation (Scavenger, Semi-Space)            │
│   - From-space (active)                             │
│   - To-space (evacuation target)                    │
│   - Typical size: 1-8 MB                            │
│   - GC frequency: Very frequent (minor GC)          │
├─────────────────────────────────────────────────────┤
│ Old Generation (Mark-Sweep-Compact)                 │
│   - Old pointer space (objects with pointers)       │
│   - Old data space (objects without pointers)       │
│   - Typical size: 100s of MB                        │
│   - GC frequency: Infrequent (major GC)             │
├─────────────────────────────────────────────────────┤
│ Large Object Space                                  │
│   - Objects > 64 KB (in V8)                         │
│   - Never moved (to avoid copying overhead)         │
├─────────────────────────────────────────────────────┤
│ Code Space                                          │
│   - Compiled machine code (JIT’d functions)         │
│   - Executable memory (W^X protection)              │
└─────────────────────────────────────────────────────┘
Young generation (nursery):
New objects are allocated here.
Minor GC (Scavenger): Runs frequently (~1-10 ms pause).
Copying collector: Evacuates live objects to to-space, then swaps spaces.
Survival: Objects that survive 2-3 minor GCs are promoted to old gen.
Old generation:
Long-lived objects.
Major GC (Mark-Sweep-Compact): Runs infrequently (~100-1000 ms pause).
Tri-color marking: Incremental marking to reduce pauses.
Compaction: Periodically compacts memory to reduce fragmentation.
Large object space:
Objects too large to fit in young/old gen (e.g., large arrays, buffers).
Never moved (to avoid copying cost).
Systems analogy: Similar to multi-generational GC in JVM (Hotspot), GHC (Haskell), or .NET CLR.
Garbage Collection Algorithms
Minor GC: Cheney’s Semi-Space Collector
Algorithm:
Divide young gen into two equal semi-spaces: from-space and to-space.
Allocate objects in from-space (bump-pointer allocation, very fast).
When from-space is full, evacuate live objects to to-space:
Start from roots (global objects, stack, registers).
Traverse the object graph (BFS or DFS).
Copy each reachable object to to-space.
Update pointers to point to the new location.
Swap from-space and to-space.
Dead objects are implicitly reclaimed (left in the old from-space).
Complexity:
Time: O(n), where n is the number of live objects (dead objects are ignored).
Space: Wastes 50% of young gen (to-space is empty until GC).
Why so fast? Most objects die young (the generational hypothesis). Copying only live objects is faster than marking all objects.
Systems note: This is the same algorithm used in early LISP systems (Cheney 1970).
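The evacuation steps above can be modeled in a few lines. The following is a toy sketch under stated assumptions: objects are plain { id, refs } records, the two semi-spaces are ordinary arrays, and a Map stands in for the forwarding pointers a real collector writes into object headers. The function name cheneyCollect is made up for this illustration.

```javascript
// Toy Cheney-style copying collection over objects shaped like { id, refs: [...] }.
function cheneyCollect(roots) {
  const toSpace = [];          // destination semi-space (append = bump allocation)
  const forwarded = new Map(); // old object -> its copy (stand-in for forwarding pointers)

  // Copy an object on first visit; afterwards return the existing copy.
  function copy(obj) {
    if (obj === null) return null;
    if (!forwarded.has(obj)) {
      const clone = { id: obj.id, refs: obj.refs.slice() };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  }

  // Step 1: evacuate everything directly reachable from the roots.
  const newRoots = roots.map(copy);

  // Step 2: the scan index chases the allocation end of toSpace (breadth-first).
  // Each scanned object has its references redirected to the new copies.
  for (let scan = 0; scan < toSpace.length; scan++) {
    const obj = toSpace[scan];
    obj.refs = obj.refs.map(copy);
  }

  return { toSpace, newRoots };
}

// Usage: a cycle (objA <-> objB) survives; an unreachable object is never copied.
const objA = { id: "a", refs: [] };
const objB = { id: "b", refs: [objA] };
objA.refs.push(objB);
const unreachable = { id: "dead", refs: [objA] };

const collected = cheneyCollect([objA]);
console.log(collected.toSpace.map((o) => o.id)); // [ 'a', 'b' ]
```

Note that the loop touches only live objects, which is why the running time is proportional to the survivors rather than to the size of from-space.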
Major GC: Mark-Sweep-Compact
Mark phase:
Start from roots.
Mark reachable objects (set a bit in the object header or a bitmap).
Use tri-color marking for incrementality:
White: Unvisited.
Gray: Visited, but children not yet scanned.
Black: Visited, children scanned.
Sweep phase:
Scan the entire old gen.
Add unmarked objects to the free list.
Reset mark bits.
Compact phase (optional, periodic):
Move objects to eliminate fragmentation.
Update all pointers.
Incremental marking: V8 splits the mark phase into small steps (1-5 ms each), interleaved with JavaScript execution. This reduces pause times.
Concurrent marking/sweeping: V8 uses parallel threads to mark/sweep concurrently with JavaScript execution (requires write barriers to track pointer updates).
Systems analogy: Similar to mark-sweep in Boehm GC (C/C++) or the concurrent marking performed by the JVM’s CMS and G1 collectors.
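The mark and sweep phases can be sketched with an explicit gray worklist. This is a simplified, stop-the-world model rather than V8's implementation: objects are plain { id, refs } records, colors live in a Map instead of header bits, and the names markHeap and sweepHeap are invented for this example.

```javascript
const WHITE = 0; // unvisited
const GRAY = 1;  // visited, children not yet scanned
const BLACK = 2; // visited, children scanned

// Mark phase: returns a Map from object to color.
function markHeap(roots) {
  const color = new Map();
  const worklist = [];

  // Shade a white object gray and queue it for scanning.
  function shade(obj) {
    if ((color.get(obj) ?? WHITE) === WHITE) {
      color.set(obj, GRAY);
      worklist.push(obj);
    }
  }

  roots.forEach(shade);
  while (worklist.length > 0) {
    const obj = worklist.pop();
    obj.refs.forEach(shade); // children become gray
    color.set(obj, BLACK);   // this object is fully scanned
  }
  return color;
}

// Sweep phase: keep black objects, drop everything else.
function sweepHeap(heap, color) {
  return heap.filter((obj) => color.get(obj) === BLACK);
}

// Usage: root -> x -> y, with z unreachable.
const y = { id: "y", refs: [] };
const x = { id: "x", refs: [y] };
const z = { id: "z", refs: [x] };
const heap = [x, y, z];

const survivors = sweepHeap(heap, markHeap([x]));
console.log(survivors.map((o) => o.id)); // [ 'x', 'y' ]
```

An incremental collector would process only part of the worklist per step, which is exactly where write barriers become necessary.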
Write Barriers: Tracking Pointer Updates
Problem: If the GC runs incrementally, JavaScript code might update pointers while marking is in progress. The GC must track these updates to avoid missing live objects.
Solution: Write barriers—instrumentation inserted by the compiler to track pointer writes.
Example (pseudo-code):
// Without write barrier:
obj.field = newValue;
// With write barrier:
obj.field = newValue;
if (isMarking && isBlack(obj) && isWhite(newValue)) {
markGray(newValue); // Prevent missing this object
}
V8’s write barrier (simplified):
; Store newValue to obj.field
mov [obj + field_offset], newValue
; Write barrier check
test [marking_flag], 1
jz skip_barrier
call WriteBarrier
skip_barrier:
Cost: Write barriers add overhead to every pointer write (~5-10% slowdown). But they enable concurrent GC, which reduces pauses.
Systems note: If you compile to WebAssembly linear memory and bring your own GC, you must implement write barriers yourself. The WebAssembly GC proposal, which lets the host collector manage objects, removes that burden where it is available.
Memory Leaks: Common Pitfalls
Even with GC, you can leak memory by keeping objects reachable when they’re no longer needed.
Pitfall 1: Accidental Global Variables
function leak() {
leakyVariable = "This is a global!"; // Forgot 'let', becomes global
}
leak();
Fix: Use strict mode ("use strict";), which throws ReferenceError on assignment to an undeclared variable.
Pitfall 2: Closures Capturing Large Scopes
function createLeakyFunction() {
const largeArray = new Array(1000000).fill("data");
const smallValue = 42;
return function() {
console.log(smallValue); // Only uses smallValue
};
}
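const fn = createLeakyFunction(); // largeArray may be retained!
Why? A closure keeps its enclosing lexical environment alive. The spec models this as a reference to the whole environment, so anything stored there can stay reachable even if this particular closure never reads it. Engines trim this in practice, but variables captured by any closure created in the same scope typically share one context and live as long as the longest-lived of those closures.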
Fix (manual scope minimization):
function createFunction() {
const smallValue = 42;
{
// Block scope: largeArray is not visible to the returned closure
const largeArray = new Array(1000000).fill("data");
// Process largeArray here; it becomes unreachable when the block ends
// ...
}
return function() {
console.log(smallValue); // Only smallValue is captured
};
}
V8 optimization: Modern engines analyze closures and capture only the variables that some inner function references. This is an implementation detail, not a guarantee of the spec.
Pitfall 3: Event Listeners Not Removed
const element = document.getElementById("myButton");
const handler = () => console.log("Clicked!");
element.addEventListener("click", handler);
// Later, remove the element from DOM:
element.remove();
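// But the listener is still registered! As long as `element` or the handler stays referenced, neither can be collected.
Fix: Remove listeners explicitly:
element.removeEventListener("click", handler);
Or use { once: true } (the listener is removed automatically after it fires once):
element.addEventListener("click", handler, { once: true });
Pitfall 4: Caches Without Eviction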
const cache = {};
function getCachedData(key) {
if (!(key in cache)) {
cache[key] = expensiveComputation(key);
}
return cache[key];
}
Problem: cache grows unbounded.
Fix: Use Map with size limits, or
WeakMap (if keys are objects):
const cache = new WeakMap();
function getCachedData(obj) {
if (!cache.has(obj)) {
cache.set(obj, expensiveComputation(obj));
}
return cache.get(obj);
}
// If obj is no longer reachable, the cache entry is GC'd
Profiling Heap Usage
Chrome DevTools:
Open DevTools → Memory tab.
Take a heap snapshot.
Analyze object retainers (why an object is still alive).
Node.js:
const v8 = require("v8");
const heapStats = v8.getHeapStatistics();
console.log(heapStats);
Output:
{
total_heap_size: 7376896,
used_heap_size: 4523432,
heap_size_limit: 2197815296,
// ...
}
Memory snapshots:
const v8 = require("v8");
const fs = require("fs");
const snapshot = v8.writeHeapSnapshot();
console.log(`Snapshot written to ${snapshot}`);
Open the .heapsnapshot file in Chrome DevTools for analysis.
The Event Loop: Task Queues and Microtasks
Macrotasks vs. Microtasks
JavaScript has two types of asynchronous tasks:
Macrotasks (Tasks): setTimeout, setInterval, I/O, UI events.
Microtasks (Jobs): Promise callbacks (then, catch, finally), queueMicrotask(), MutationObserver.
Execution order:
Execute one macrotask from the task queue.
Execute ALL microtasks from the microtask queue.
Render (if browser, and if necessary).
Repeat.
Key difference: Microtasks never preempt running code. They run as soon as the current macrotask finishes (the call stack is empty), and the entire microtask queue is drained before the next macrotask or render.
Example:
console.log("Script start");
setTimeout(() => console.log("setTimeout"), 0);
Promise.resolve()
.then(() => console.log("Promise 1"))
.then(() => console.log("Promise 2"));
console.log("Script end");
Output:
Script start
Script end
Promise 1
Promise 2
setTimeout
Explanation:
Synchronous code: "Script start", "Script end".
Microtasks: "Promise 1", "Promise 2".
Macrotask: "setTimeout".
Systems analogy: Microtasks are like interrupt handlers—they run before the next task. Macrotasks are like scheduled jobs.
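The ordering rule can be reproduced with a toy scheduler. The sketch below is a deliberately simplified model, not how any host implements its loop: it has one task queue, one microtask queue, no rendering step, and no timers. The function runToyEventLoop and its small API object are invented for this illustration.

```javascript
// Toy event loop: one macrotask at a time, then drain all microtasks.
function runToyEventLoop(initialTask) {
  const log = [];
  const tasks = [];
  const microtasks = [];

  // Minimal "host API" handed to every callback.
  const api = {
    log: (message) => log.push(message),
    queueTask: (fn) => tasks.push(fn),
    queueMicrotask: (fn) => microtasks.push(fn),
  };

  tasks.push(initialTask);
  while (tasks.length > 0) {
    const task = tasks.shift();
    task(api); // run to completion
    // Microtask checkpoint: includes microtasks queued by other microtasks.
    while (microtasks.length > 0) {
      microtasks.shift()(api);
    }
  }
  return log;
}

// Mirrors the setTimeout / Promise example above.
const order = runToyEventLoop((loop) => {
  loop.log("Script start");
  loop.queueTask((l) => l.log("setTimeout"));
  loop.queueMicrotask((l) => {
    l.log("Promise 1");
    l.queueMicrotask((inner) => inner.log("Promise 2"));
  });
  loop.log("Script end");
});

console.log(order.join(", "));
// Script start, Script end, Promise 1, Promise 2, setTimeout
```

The inner while loop also shows why a microtask that keeps queuing microtasks prevents the outer loop from ever reaching the next task.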
Task Queue: Macrotask Sources
Browser:
setTimeout/setInterval: Timers.
I/O: fetch(), XMLHttpRequest.
UI events: click, keydown, etc.
Rendering: requestAnimationFrame.
Node.js:
Timers: setTimeout, setInterval.
I/O: fs.readFile, net.connect.
Immediate: setImmediate (runs after I/O polling, before the next round of timers).
Close callbacks: socket.on('close').
Task ordering: The spec does not guarantee FIFO order for tasks from different sources. In practice, each task source has its own queue, and the event loop selects from them in implementation-defined order.
Microtask Queue: Promise Callbacks
HTML spec: After each task, the event loop performs a microtask checkpoint:
Microtask Checkpoint
If the microtask queue is empty, return.
Let oldestMicrotask be the first microtask in the queue.
Remove oldestMicrotask from the queue.
Run oldestMicrotask.
Goto step 1 (repeat until queue is empty).
ECMA-262 spec: §9.5.4 (p. 151) defines PerformPromiseJob (equivalent to microtask execution).
Example (nested promises):
Promise.resolve().then(() => {
console.log("1");
Promise.resolve().then(() => console.log("2"));
});
Promise.resolve().then(() => console.log("3"));
Output:
1
3
2
Explanation:
First then callback: console.log("1"), enqueues "2".
Second then callback: console.log("3").
Nested then callback: console.log("2").
Microtask starvation: If a microtask keeps enqueuing more microtasks, the event loop never advances to the next macrotask:
function recursiveMicrotask() {
queueMicrotask(recursiveMicrotask);
}
recursiveMicrotask();
// Infinite loop! No macrotasks (timers, I/O) will ever run.
Systems lesson: Microtasks can starve macrotasks. Use with caution.
queueMicrotask(): Manual Microtask Scheduling
Specification: HTML Living Standard, Section 8.5.2
API:
queueMicrotask(callback);
Example:
console.log("Start");
queueMicrotask(() => console.log("Microtask"));
console.log("End");
Output:
Start
End
Microtask
Use case: Batching updates:
let updates = [];
function scheduleUpdate(data) {
updates.push(data);
if (updates.length === 1) {
queueMicrotask(() => {
flushUpdates();
updates = [];
});
}
}
function flushUpdates() {
console.log("Flushing:", updates);
}
scheduleUpdate("A");
scheduleUpdate("B");
scheduleUpdate("C");
// Output: Flushing: ["A", "B", "C"]
Node.js Event Loop: Six Phases
Node.js event loop (libuv):
   ┌───────────────────────────┐
┌─>│           timers          │  (setTimeout, setInterval)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │  (I/O callbacks deferred from prev cycle)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │  (internal)
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │  (setImmediate)
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │  (e.g., socket.on('close'))
   └───────────────────────────┘
Between each phase: Microtask queue is drained.
Timers phase:
setTimeout(() => console.log("Timer 1"), 0);
setTimeout(() => console.log("Timer 2"), 0);
// Both run in the timers phase, in order.
Poll phase:
Wait for I/O events (with a timeout).
Execute I/O callbacks.
If no timers and no setImmediate, block until I/O arrives.
Check phase (setImmediate):
setImmediate(() => console.log("Immediate"));
Runs after the poll phase, even if scheduled in the same iteration.
Close callbacks phase:
const net = require("net");
const socket = new net.Socket();
socket.on("close", () => console.log("Socket closed"));
socket.destroy();
process.nextTick(): The Hidden Microtask Queue
Node.js-specific:
process.nextTick() schedules a callback to run
before the event loop continues to the next
phase.
Execution order:
Run the current phase’s callbacks.
Execute ALL process.nextTick callbacks.
Execute ALL promise microtasks.
Move to the next phase.
Example:
console.log("Start");
setTimeout(() => console.log("setTimeout"), 0);
setImmediate(() => console.log("setImmediate"));
process.nextTick(() => console.log("nextTick 1"));
Promise.resolve().then(() => console.log("Promise"));
process.nextTick(() => console.log("nextTick 2"));
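console.log("End");
Output (Node.js):
Start
End
nextTick 1
nextTick 2
Promise
setTimeout
setImmediate
Note: The relative order of setTimeout(..., 0) and setImmediate is not guaranteed when both are scheduled from the main module; it depends on how quickly the process reaches the timers phase. Inside an I/O callback, setImmediate always fires first.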
Why nextTick before promises?
Node.js’s nextTick queue is processed
before the microtask queue. This is a
Node.js-specific quirk.
Warning: process.nextTick can
starve the event loop:
function recursiveNextTick() {
process.nextTick(recursiveNextTick);
}
recursiveNextTick();
// Infinite loop! Event loop phases never advance.
Systems lesson: Prefer queueMicrotask() or promises for portability. Only use process.nextTick() when you explicitly need to run before microtasks (rare).
Timers: setTimeout, setInterval, and Precision
setTimeout: Single-Shot Timer
Specification: HTML Living Standard, Section 8.6
API:
const timerId = setTimeout(callback, delay, ...args);
clearTimeout(timerId);
Example:
setTimeout(() => console.log("Hello"), 1000);
Key points:
Minimum delay: Browsers clamp the delay to at least 4ms once timers are nested more than 5 levels deep. Node.js treats any delay below 1ms (including 0) as 1ms.
Not guaranteed: The callback runs at least delay milliseconds later, but may be delayed by:
Long-running tasks on the call stack.
Other tasks in the queue.
Passed arguments: Any arguments after delay are passed to the callback:
setTimeout((a, b) => console.log(a + b), 1000, 5, 10); // Logs 15 after 1s
Canceling a timer:
const id = setTimeout(() => console.log("Never runs"), 1000);
clearTimeout(id);
setInterval: Repeating Timer
API:
const intervalId = setInterval(callback, delay, ...args);
clearInterval(intervalId);
Example:
let count = 0;
const id = setInterval(() => {
console.log(++count);
if (count === 5) clearInterval(id);
}, 1000);
Pitfall: Interval drift: If the callback takes longer than delay, invocations cannot overlap (JavaScript runs one callback at a time); instead they run back-to-back and the schedule drifts.
Example:
setInterval(() => {
const start = Date.now();
while (Date.now() - start < 100) {} // Simulate 100ms work
console.log("Done");
}, 50); // Scheduled every 50ms, but takes 100ms
Output: “Done” every ~100ms, not 50ms.
Fix: Use setTimeout recursively:
function repeat() {
console.log("Done");
setTimeout(repeat, 50);
}
repeat();
Now each invocation is scheduled after the previous one completes.
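Recursive setTimeout guarantees a gap between runs, but the schedule still slips by the callback's own duration on every tick. If a fixed cadence matters, one option is to compute each delay against the original start time. The sketch below assumes that approach; nextDelay and startSteadyTimer are hypothetical helper names, not standard APIs.

```javascript
// Pure helper: how long to wait so that tick number `tickNumber`
// lands on the grid startTime + tickNumber * intervalMs.
function nextDelay(startTime, tickNumber, intervalMs, now) {
  const target = startTime + tickNumber * intervalMs;
  return Math.max(0, target - now); // never negative: a late tick fires as soon as possible
}

// Hypothetical helper: calls callback(tickNumber) on a fixed grid.
// Returns a function that stops the timer.
function startSteadyTimer(callback, intervalMs) {
  const startTime = Date.now();
  let tick = 1;
  let stopped = false;
  let timerId = setTimeout(run, intervalMs);

  function run() {
    if (stopped) return;
    callback(tick);
    tick++;
    // Compensate for however long the callback and the loop took.
    timerId = setTimeout(run, nextDelay(startTime, tick, intervalMs, Date.now()));
  }

  return function stop() {
    stopped = true;
    clearTimeout(timerId);
  };
}

// The second tick is due at 1100; at time 1060 there are 40ms left.
console.log(nextDelay(1000, 2, 50, 1060)); // 40
// If the callback overran past the target, the delay bottoms out at 0.
console.log(nextDelay(1000, 2, 50, 1200)); // 0
```

A caller would use it as const stop = startSteadyTimer((n) => console.log("tick", n), 50) and invoke stop() when done. A callback that consistently overruns the interval still falls behind; this only removes accumulated drift.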
Timer Precision: The Reality Check
JavaScript timers are not precise. The delay is a minimum, not a guarantee.
Example:
const start = Date.now();
setTimeout(() => {
console.log(`Elapsed: ${Date.now() - start}ms`);
}, 1000);
Output: Typically 1000-1005ms, but can be 1050ms or more if the system is busy.
Why?
Event loop: Timers are checked at the start of each event loop iteration. If the loop is blocked, timers are delayed.
OS scheduler: The OS may not wake the process exactly when the timer expires.
Browser throttling: Background tabs throttle timers to 1-second intervals (to save battery).
High-precision timers (performance.now()):
const start = performance.now();
setTimeout(() => {
const elapsed = performance.now() - start;
console.log(`Elapsed: ${elapsed}ms`);
}, 1000);
Output: Finer resolution (sub-millisecond, although browsers may coarsen it to mitigate timing attacks), but still subject to event loop delays.
Systems note: If you need precise timing (e.g., for animation), use requestAnimationFrame() (browser) or high-resolution timers (Node.js process.hrtime.bigint()).
Promises and Async/Await: Syntactic Sugar Over the Event Loop
Promises: Microtask-Based Asynchrony
Specification: §27.2 (pp. 673-706)
Promise states (§27.2.1.1, p. 674):
Pending: Initial state.
Fulfilled: Operation completed successfully.
Rejected: Operation failed.
State transitions:
pending → fulfilled (resolve)
pending → rejected (reject)
Once fulfilled or rejected, a promise is settled (immutable).
Creating a promise:
const promise = new Promise((resolve, reject) => {
setTimeout(() => {
if (Math.random() > 0.5) {
resolve("Success!");
} else {
reject(new Error("Failure!"));
}
}, 1000);
});
promise
.then(result => console.log(result))
.catch(error => console.error(error));
Chaining:
Promise.resolve(5)
.then(x => x * 2)
.then(x => x + 3)
.then(x => console.log(x)); // 13
Each then returns a new promise, allowing chains.
Async/Await: State Machine Compilation
Specification: §15.8 (pp. 346-354)
async/await is syntactic
sugar for promises. The compiler transforms it into a state
machine.
Example:
async function fetchData() {
const response = await fetch("https://api.example.com/data");
const data = await response.json();
return data;
}
Desugared (conceptual):
function fetchData() {
return new Promise((resolve, reject) => {
fetch("https://api.example.com/data")
.then(response => response.json())
.then(data => resolve(data))
.catch(error => reject(error));
});
}
Actually, the compiler generates a state machine (similar to generators). A simplified sketch that omits try/catch handling:
function fetchData() {
let state = 0;
let response, data;
function step(value) {
switch (state) {
case 0:
state = 1;
return fetch("https://api.example.com/data");
case 1:
response = value;
state = 2;
return response.json();
case 2:
data = value;
return data;
}
}
return new Promise((resolve, reject) => {
function next(value) {
const result = step(value);
if (result instanceof Promise) {
result.then(next, reject);
} else {
resolve(result);
}
}
next();
});
}
Systems analogy: Async functions are compiled to coroutines with explicit state. The await keyword is a suspension point: execution pauses, the promise is awaited, and execution resumes when the promise settles.
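The coroutine view can be made concrete by driving a generator with promises, which is roughly how async functions were emulated before the syntax existed. This is a sketch of that emulation, not the engine's actual lowering; runAsync is a name chosen for this example, and yield plays the role of await.

```javascript
// Drives a generator: each yielded value is awaited, and the settled result
// is sent back in with next() (fulfilled) or throw() (rejected).
function runAsync(generatorFn, ...args) {
  return new Promise((resolve, reject) => {
    const gen = generatorFn(...args);

    function step(method, value) {
      let result;
      try {
        result = gen[method](value); // resume the coroutine
      } catch (error) {
        reject(error);               // uncaught throw inside the generator
        return;
      }
      if (result.done) {
        resolve(result.value);       // generator returned
        return;
      }
      // Suspension point: wait for the yielded value, then resume.
      Promise.resolve(result.value).then(
        (fulfilled) => step("next", fulfilled),
        (rejected) => step("throw", rejected)
      );
    }

    step("next", undefined);
  });
}

// Usage: reads like an async function, with yield in place of await.
runAsync(function* () {
  const a = yield Promise.resolve(2);
  const b = yield 3; // non-promise values are wrapped, as with await
  return a * b;
}).then((value) => console.log(value)); // 6
```

Because rejections re-enter the generator through throw(), an ordinary try/catch around a yield behaves like try/catch around an await.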
Error Handling: try/catch in Async Functions
async function fetchData() {
try {
const response = await fetch("https://api.example.com/data");
const data = await response.json();
return data;
} catch (error) {
console.error("Failed to fetch:", error);
return null;
}
}
Desugared:
function fetchData() {
return fetch("https://api.example.com/data")
.then(response => response.json())
.catch(error => {
console.error("Failed to fetch:", error);
return null;
});
}
Key point: A rejection from an awaited promise inside a try block is caught by the catch block. This is more ergonomic than .catch() chains.
Top-Level Await (ES2022)
Specification: §16.1.8 (p. 383)
ES2022 allows await at the top level of modules:
// module.js
const data = await fetch("https://api.example.com/data");
export default data;
How it works: The module’s evaluation is suspended until the promise settles. Dependent modules wait for this module to finish.
Execution order:
// a.js
console.log("A start");
await delay(100);
console.log("A end");
// b.js
console.log("B start");
await delay(50);
console.log("B end");
// main.js
import "./a.js";
import "./b.js";
console.log("Main");
Output (assuming delay(ms) returns a promise that resolves after ms milliseconds):
A start
B start
B end
A end
Main
Sibling modules are evaluated concurrently (started in import order), but main.js waits for both to settle.
Systems note: Top-level await changes module loading from synchronous to asynchronous. This affects bundlers (Webpack, Rollup) and requires careful dependency management.
Web Workers and Shared Memory: Breaking the Single-Threaded Model
Web Workers: True Parallelism in the Browser
Specification: HTML Living Standard, Section 10.2
Web Workers run JavaScript in a separate thread with a separate event loop, heap, and call stack.
Creating a worker:
// main.js
const worker = new Worker("worker.js");
worker.postMessage({ type: "compute", data: [1, 2, 3, 4, 5] });
worker.onmessage = (event) => {
console.log("Result:", event.data);
};
// worker.js
self.onmessage = (event) => {
const { type, data } = event.data;
if (type === "compute") {
const sum = data.reduce((acc, x) => acc + x, 0);
self.postMessage(sum);
}
};
Key points:
No shared memory (by default): Data is cloned via the structured clone algorithm.
No DOM access: Workers can’t access document, window, or the DOM.
Separate global object: Workers have self (not window).
Structured clone: Serializes objects (including arrays, dates, maps, sets) but not functions, DOM nodes, or prototypes.
Transferable objects: For large data (TypedArrays, ArrayBuffers), use transferables to avoid copying:
const buffer = new ArrayBuffer(1024 * 1024); // 1 MB
worker.postMessage(buffer, [buffer]); // Transfer ownership
console.log(buffer.byteLength); // 0 (detached)
Systems analogy: Web Workers are like POSIX threads, but with message passing (like Erlang) instead of shared memory.
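The cloning and transfer rules can be observed without a worker by calling the global structuredClone() function, which applies the same algorithm that postMessage uses. The sketch assumes a runtime that exposes structuredClone with the transfer option (recent browsers and Node.js 17 or later); the Point class is a made-up example type.

```javascript
// A class instance, to show that prototypes are not preserved.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

const original = {
  when: new Date(0),
  tags: new Set(["a", "b"]),
  point: new Point(1, 2),
};

const copy = structuredClone(original);
console.log(copy !== original);           // true: a deep copy
console.log(copy.when instanceof Date);   // true: built-in types survive
console.log(copy.tags.has("a"));          // true
console.log(copy.point instanceof Point); // false: plain object with x and y

// Functions cannot be cloned; the attempt throws.
let cloneError = null;
try {
  structuredClone({ callback() {} });
} catch (error) {
  cloneError = error;
}
console.log(cloneError !== null);         // true

// Transfer moves the ArrayBuffer instead of copying it.
const buffer = new ArrayBuffer(1024);
const moved = structuredClone(buffer, { transfer: [buffer] });
console.log(buffer.byteLength, moved.byteLength); // 0 1024
```

The same transfer list semantics apply to worker.postMessage(value, [buffer]): after the call the sender's buffer is detached.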
SharedArrayBuffer: Shared Memory Concurrency
Specification: §25.2 (pp. 646-651)
SharedArrayBuffer (SAB) allows shared memory between the main thread and workers.
Creating a shared buffer:
// main.js
const sab = new SharedArrayBuffer(1024);
const view = new Int32Array(sab);
view[0] = 42;
const worker = new Worker("worker.js");
worker.postMessage(sab);
// worker.js
self.onmessage = (event) => {
const sab = event.data;
const view = new Int32Array(sab);
console.log(view[0]); // 42 (shared!)
view[0] = 100;
};
Race conditions: Without synchronization, reads/writes can race:
// main.js
view[0] = 0;
for (let i = 0; i < 10000; i++) {
view[0]++;
}
// worker.js
for (let i = 0; i < 10000; i++) {
view[0]++;
}
// Final value may be less than 20000 (data race: increments can be lost)
Atomics API: Provides atomic operations (§25.4, pp. 657-670):
// main.js
Atomics.store(view, 0, 0);
for (let i = 0; i < 10000; i++) {
Atomics.add(view, 0, 1); // Atomic increment
}
// worker.js
for (let i = 0; i < 10000; i++) {
Atomics.add(view, 0, 1);
}
// Final value is guaranteed to be 20000
Atomic operations:
Atomics.load(ta, index): Atomic read.
Atomics.store(ta, index, value): Atomic write.
Atomics.add(ta, index, value): Atomic fetch-and-add.
Atomics.compareExchange(ta, index, expected, replacement): CAS.
Atomics.wait(ta, index, value, timeout): Block while ta[index] equals value, until notified or timed out (not allowed on the browser main thread).
Atomics.notify(ta, index, count): Wake waiting threads.
Systems analogy: Atomics provides
the same primitives as C11’s _Atomic or C++’s
std::atomic.
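As with C11 atomics, operations that have no dedicated primitive are usually built from a compare-and-swap retry loop. The sketch below shows that pattern for an atomic maximum; atomicUpdate is a helper name invented for this example, and it runs on a single thread here only to demonstrate the calls.

```javascript
// Generic CAS loop: read, compute, and publish only if nobody else wrote in between.
function atomicUpdate(view, index, updateFn) {
  for (;;) {
    const observed = Atomics.load(view, index);
    const desired = updateFn(observed);
    // compareExchange returns the value that was actually in the slot.
    const previous = Atomics.compareExchange(view, index, observed, desired);
    if (previous === observed) {
      return desired; // our write won
    }
    // Another thread changed the slot first: retry with the fresh value.
  }
}

const shared = new Int32Array(new SharedArrayBuffer(4));

atomicUpdate(shared, 0, (current) => Math.max(current, 7));
atomicUpdate(shared, 0, (current) => Math.max(current, 3)); // no effect: 7 is larger
console.log(Atomics.load(shared, 0)); // 7
```

With several workers calling atomicUpdate on the same slot, each call either publishes its result or retries, so no update is silently lost.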
Security note: SharedArrayBuffer was
disabled in browsers in 2018 due to Spectre
vulnerabilities. It’s now re-enabled with cross-origin
isolation requirements
(Cross-Origin-Opener-Policy: same-origin +
Cross-Origin-Embedder-Policy: require-corp
headers).
Node.js: libuv and the I/O Subsystem
libuv: The Cross-Platform Async I/O Library
Node.js’s event loop is implemented by libuv, a C library providing:
Event loop.
Asynchronous I/O (file system, network, etc.).
Thread pool (for blocking operations such as the file system calls behind fs.readFile).
libuv architecture:
┌─────────────────────────────────────────────────────────┐
│ Node.js Process                                         │
├─────────────────────────────────────────────────────────┤
│ JavaScript (V8 Engine)                                  │
│ ↓                                                       │
│ Node.js Bindings (C++)                                  │
│ ↓                                                       │
│ libuv (C Library)                                       │
│ ↓                                                       │
│ OS-Specific I/O APIs                                    │
│   - epoll (Linux)                                       │
│   - kqueue (BSD, macOS)                                 │
│   - IOCP (Windows)                                      │
└─────────────────────────────────────────────────────────┘
Thread pool: libuv maintains a default
thread pool of 4 threads (configurable via
UV_THREADPOOL_SIZE environment variable) for blocking
operations:
File system (except fs.readFileSync and the other *Sync calls, which block the main thread).
DNS lookups (dns.lookup()).
Crypto operations (the asynchronous ones, such as crypto.pbkdf2 and crypto.randomBytes).
Non-blocking I/O: Network operations (sockets) use the OS’s native non-blocking I/O (epoll, kqueue, IOCP).
Asynchronous File System Operations
Example:
const fs = require("fs");
fs.readFile("data.txt", "utf8", (err, data) => {
if (err) throw err;
console.log(data);
});
console.log("Reading file...");
Output:
Reading file...
[file contents]
How it works:
fs.readFile() enqueues a task to the thread pool.
A worker thread performs the blocking read() syscall.
When done, the callback is enqueued to the event loop.
The event loop dequeues and executes the callback.
Synchronous alternative (blocks the event loop):
const data = fs.readFileSync("data.txt", "utf8");
console.log(data);
Avoid *Sync methods in production request paths (startup scripts are the usual exception), as they block the event loop and stall all other I/O.
Streams: Backpressure and Flow Control
Node.js Streams (§10.5 in the Node.js docs) are an abstraction for reading/writing data incrementally:
Readable: Source of data (fs.createReadStream, http.IncomingMessage).
Writable: Destination (fs.createWriteStream, http.ServerResponse).
Duplex: Both readable and writable (net.Socket).
Transform: Duplex stream that modifies data (zlib.createGzip).
Example:
const fs = require("fs");
const readable = fs.createReadStream("input.txt");
const writable = fs.createWriteStream("output.txt");
readable.pipe(writable);
Backpressure: If the writable stream can’t keep up with the readable stream, .pipe() automatically pauses the readable until the writable drains.
Manual backpressure handling:
readable.on("data", (chunk) => {
const canContinue = writable.write(chunk);
if (!canContinue) {
readable.pause(); // Pause reading
}
});
writable.on("drain", () => {
readable.resume(); // Resume reading
});
Systems analogy: Backpressure is like flow control in TCP: the receiver signals the sender to slow down when its buffer is full.
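In current Node.js the pause/resume bookkeeping is usually delegated to pipeline(), which also propagates errors and cleans up both ends. The sketch below uses the promise-based pipeline from node:stream/promises with an in-memory source and a deliberately slow sink; the function name copyThroughSlowSink and the tiny highWaterMark are choices made for this illustration.

```javascript
const { Readable, Writable } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Copies chunks from an in-memory source into a sink that acknowledges
// each write only on a later loop iteration, forcing backpressure.
async function copyThroughSlowSink(chunks) {
  const received = [];

  const slowSink = new Writable({
    objectMode: true,
    highWaterMark: 1, // tiny buffer so the source is paused almost immediately
    write(chunk, _encoding, callback) {
      received.push(chunk);
      setImmediate(callback); // signal "ready for more" asynchronously
    },
  });

  // pipeline() wires up data flow, backpressure, and error handling.
  await pipeline(Readable.from(chunks), slowSink);
  return received;
}

copyThroughSlowSink(["a", "b", "c"]).then((received) => {
  console.log(received); // [ 'a', 'b', 'c' ]
});
```

Unlike a bare .pipe(), a failure in either stream rejects the returned promise and destroys both streams, so file descriptors are not left open.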
Performance Optimization: Understanding the Runtime
Avoid Blocking the Event Loop
Rule: Never perform synchronous blocking operations in the event loop.
Bad:
const result = someExpensiveComputation(); // 1 second
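console.log(result);
This blocks the event loop for 1 second, freezing all I/O, timers, and user interactions.
Deferred (setImmediate lets already-queued callbacks run first, but the computation still blocks the loop for its full duration once it starts):
setImmediate(() => {
const result = someExpensiveComputation();
console.log(result);
});
Better: split the work into small chunks that yield between steps, or move it to a worker: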
const worker = new Worker("worker.js");
worker.postMessage({ task: "compute" });
worker.onmessage = (event) => {
console.log(event.data);
};
Minimize Microtask Queue Depth
Bad:
Promise.resolve()
.then(() => Promise.resolve())
.then(() => Promise.resolve())
.then(() => Promise.resolve())
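// 1000 more .then() calls
Each .then() enqueues a microtask. A chain whose promises are already resolved runs to completion before the next macrotask, so very deep chains can delay timers and I/O.
Good: Flatten chains with async/await (each await still costs a microtask, but the control flow is easier to follow and to break up):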
async function run() {
await step1();
await step2();
await step3();
}
Use Object Pools for High-Frequency Allocations
Problem: Frequent allocations pressure the GC.
Solution: Reuse objects:
class ObjectPool {
constructor(factory, size) {
this.factory = factory;
this.pool = Array(size).fill(null).map(() => factory());
}
acquire() {
return this.pool.pop() || this.factory();
}
release(obj) {
this.pool.push(obj);
}
}
const pool = new ObjectPool(() => ({ x: 0, y: 0 }), 100);
function usePoint() {
const point = pool.acquire();
point.x = 10;
point.y = 20;
// ... use point
pool.release(point);
}
Systems note: Object pooling is common in game engines (Unity, Unreal) to reduce GC pressure.
Profile Before Optimizing
Chrome DevTools:
Performance tab → Record → Stop.
Analyze flame chart for bottlenecks.
Node.js:
node --inspect app.js
Open chrome://inspect and profile.
Benchmarking:
const { performance } = require("perf_hooks");
const start = performance.now();
someFunction();
const end = performance.now();
console.log(`Elapsed: ${end - start}ms`);
End of Chapter 3
In the next chapter, we’ll dive into JavaScript’s object model—prototypes, inheritance, property access, hidden classes, and how engines optimize object operations.
Chapter 4: Functions, Closures, and Scopes
Functions: First-Class Citizens and the Function Object
The Function as an Object: Callable, Constructable, and More
JavaScript functions are first-class objects.
They’re not just executable code—they’re instances of the
Function constructor with properties, methods, and
internal slots.
Specification: §20.2 (pp. 480-502)
Every function object has:
[[Call]]: Internal method making it callable.
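[[Construct]]: Internal method making it constructable (for new). Arrow functions and methods lack it.
prototype: Property holding the object that becomes the prototype of instances created with new.
length: Number of formal parameters before the first default or rest parameter.
name: The function’s name ("" for anonymous function expressions; "anonymous" for functions created with new Function()).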
Example:
function greet(name, greeting = "Hello") {
console.log(`${greeting}, ${name}!`);
}
console.log(greet.length); // 1 (counts parameters before the first default)
console.log(greet.name); // "greet"
console.log(typeof greet); // "function"
console.log(greet instanceof Object); // true
Internal slots (§10.2.1, pp. 181-182):
FunctionObject {
  [[Call]]: executable code
  [[Construct]]: constructor behavior (if present)
  [[Environment]]: lexical environment (for closures)
  [[FormalParameters]]: parameter list
  [[ECMAScriptCode]]: parsed function body
  [[Realm]]: the realm in which the function was created
  [[HomeObject]]: for super references (methods)
  [[ThisMode]]: lexical, strict, or global
}
Systems perspective: A JavaScript function is
like a C function pointer plus a struct containing
its lexical environment. The closure captures variables from outer
scopes, stored in [[Environment]].
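The "function pointer plus environment struct" picture can be seen directly: calling the same outer function twice produces two closures that share code but own separate environments. A minimal sketch (makeCounter is an illustrative name):

```javascript
// Each call creates a fresh environment holding its own `count`.
function makeCounter(start) {
  let count = start; // lives in the environment captured by the closures below
  return {
    increment: () => ++count,
    current: () => count,
  };
}

const first = makeCounter(0);
const second = makeCounter(10);

first.increment();
first.increment();
second.increment();

// Same code, different captured environments.
console.log(first.current(), second.current()); // 2 11
```

Both methods of one counter close over the same environment, which is why increment and current agree on the value of count.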
Function Declaration vs. Function Expression vs. Arrow Function
Function Declaration
Syntax:
function name(params) {
// body
}
Hoisting: Function declarations are hoisted, so the entire function is available before the declaration in source order:
console.log(add(2, 3)); // 5
function add(a, b) {
return a + b;
}
Specification: §14.1 (pp. 265-269)
The parser creates a function environment record during the instantiation phase, before executing code. This is why hoisting works.
Systems analogy: Similar to C’s function prototypes—the compiler knows the signature before seeing the definition.
Function Expression
Syntax:
const name = function(params) {
// body
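};
No hoisting of the value: With var, the variable is hoisted
(initialized to undefined), but the function assignment
happens at runtime:
console.log(add); // undefined
console.log(add(2, 3)); // TypeError: add is not a function
var add = function(a, b) {
return a + b;
};
With let or const, the binding is in the temporal dead zone until the declaration runs, so even the first line would throw a ReferenceError.
Named function expressions: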
const factorial = function fact(n) {
if (n <= 1) return 1;
return n * fact(n - 1); // Can use 'fact' for recursion
};
console.log(factorial.name); // "fact"
console.log(fact); // ReferenceError: fact is not defined
The name fact is only visible inside the
function body.
Arrow Functions
Syntax:
const name = (params) => expression;
const name = (params) => { /* body */ };
Key differences from regular functions:
No this binding: Arrow functions inherit this from the enclosing scope (lexical this).
No arguments object: Use rest parameters (...args) instead.
Cannot be used as constructors: No [[Construct]] method.
No prototype property.
No own super or new.target: both resolve lexically, as this does. yield is not allowed, because an arrow function cannot be a generator.
Specification: §15.3 (pp. 321-326)
Arrow functions have [[ThisMode]] set to
"lexical" (§10.2.1.1, p. 182).
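Two of the differences listed above can be checked directly. This is a minimal sketch; the names arrow, outer, inner, constructError, and seenByInner are illustrative only.

```javascript
const arrow = () => "called";

// No [[Construct]]: using new on an arrow function throws a TypeError.
let constructError = null;
try {
  new arrow();
} catch (err) {
  constructError = err;
}

// No own arguments object: the arrow sees the arguments of the enclosing function.
function outer() {
  const inner = () => arguments[0];
  return inner("passed to inner");
}
const seenByInner = outer("passed to outer");

console.log(constructError instanceof TypeError); // true
console.log(seenByInner);                         // "passed to outer"
```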
Example:
function Timer() {
this.seconds = 0;
// Arrow function captures 'this' from Timer
setInterval(() => {
this.seconds++;
console.log(this.seconds);
}, 1000);
}
new Timer(); // 1, 2, 3, ...
Contrast with regular function:
function Timer() {
this.seconds = 0;
setInterval(function() {
this.seconds++; // 'this' is global object or undefined (strict)
console.log(this.seconds); // NaN
}, 1000);
}
new Timer();
Fix (pre-ES6 pattern):
function Timer() {
const self = this; // Capture 'this'
this.seconds = 0;
setInterval(function() {
self.seconds++;
console.log(self.seconds);
}, 1000);
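}
Systems insight: As far as this is concerned, an arrow function behaves much like a regular function bound with .bind(this). Conceptually,
const fn = () => this.value;
behaves like:
const fn = function() { return this.value; }.bind(this);
The engine performs no such rewrite, though. An arrow function has no this binding of its own, so this resolves through the enclosing environment, and no runtime .bind() call or bound-function wrapper is needed.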
The arguments Object: Array-Like and Aliased
Specification: §10.2.1.3 (pp. 183-184)
Non-arrow functions have an implicit arguments
object:
function sum() {
let total = 0;
for (let i = 0; i < arguments.length; i++) {
total += arguments[i];
}
return total;
}
console.log(sum(1, 2, 3, 4)); // 10
Array-like: arguments has a
length property and numeric indices, but is not
an Array:
function test() {
console.log(Array.isArray(arguments)); // false
console.log(arguments.length); // number of args
}
test(1, 2, 3);
Convert to array:
function test() {
const args = Array.from(arguments); // ES6
const args2 = [...arguments]; // ES6 spread
const args3 = Array.prototype.slice.call(arguments); // ES5
}
Aliasing (in non-strict mode):
function test(a, b) {
console.log(arguments[0]); // 1
a = 10;
console.log(arguments[0]); // 10 (aliased!)
}
test(1, 2);
In strict mode, parameters are not aliased:
"use strict";
function test(a, b) {
a = 10;
console.log(arguments[0]); // 1 (not aliased)
}
Systems note: Aliasing complicates optimization. The engine
has to keep each parameter and its arguments slot in sync, which can block
optimizations that would otherwise apply. Avoid arguments in modern
code and use rest parameters instead.
Rest Parameters and Spread Syntax
Rest parameters (§15.1, p. 312):
function sum(...numbers) {
return numbers.reduce((acc, n) => acc + n, 0);
}
console.log(sum(1, 2, 3, 4)); // 10
Key differences from arguments:
True array: numbers is an Array instance.
No aliasing: Changes to numbers don’t affect named parameters.
Only trailing parameters: Must be the last parameter.
function fn(a, b, ...rest) {
console.log(a, b, rest);
}
fn(1, 2, 3, 4, 5); // 1, 2, [3, 4, 5]
Spread syntax (§13.2.4, pp. 246-247):
const arr = [1, 2, 3];
console.log(Math.max(...arr)); // 3
const arr2 = [0, ...arr, 4]; // [0, 1, 2, 3, 4]
Systems insight: Spread is specified in terms of the iteration protocol, but engines are free to skip the generic iterator loop when the result is unobservable. V8, for example, has a fast path for spreading plain arrays.
Default Parameters: Temporal Dead Zone and Scope
Specification: §15.1.4 (pp. 313-315)
Syntax:
function greet(name = "World", greeting = "Hello") {
console.log(`${greeting}, ${name}!`);
}
greet(); // Hello, World!
greet("Alice"); // Hello, Alice!
greet("Bob", "Hi"); // Hi, Bob!
Default parameters are evaluated at call time:
let counter = 0;
function fn(x = counter++) {
console.log(x);
}
fn(); // 0
fn(); // 1
fn(); // 2
Parameters have their own scope:
function fn(a = 1, b = a + 1) {
console.log(a, b);
}
fn(); // 1, 2
fn(5); // 5, 6
fn(5, 10); // 5, 10
Temporal Dead Zone (TDZ):
function fn(a = b, b = 1) {
console.log(a, b);
}
fn(); // ReferenceError: Cannot access 'b' before initialization
Why? Parameters are evaluated
left-to-right. When evaluating a = b,
b hasn’t been initialized yet (it’s in the TDZ). The error is raised when the function is called, not when it is defined.
Parameter scope vs. function scope:
let x = 1;
function fn(a = x, x = 2) {
console.log(a, x);
}
fn(); // ReferenceError: Cannot access 'x' before initialization
Explanation: The parameter x = 2
shadows the outer x. When evaluating
a = x, the parameter x is
in the TDZ.
Systems lesson: JavaScript’s parameter defaults
create a separate lexical scope for parameters,
distinct from the function body scope. This is similar to Scheme’s
let* (sequential bindings).
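The separation between parameter scope and body scope can be observed with a default value that closes over a name the body later redeclares. A small sketch; readThroughDefault and getX are hypothetical names.

```javascript
const x = "outer";

function readThroughDefault(getX = () => x) {
  // This var lives in the body scope, which the default expression cannot see.
  var x = "body";
  return [getX(), x];
}

const result = readThroughDefault();
console.log(result); // ["outer", "body"]
```

The default expression was created in the parameter scope, so it still resolves x to the outer binding even though the body declares its own x.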
Scopes: Lexical Environments and Closure Semantics
Lexical Scoping: The Foundation
Specification: §9.1 (pp. 129-145)
JavaScript uses lexical scoping (also called static scoping): A variable’s scope is determined by its position in the source code, not by the call stack at runtime.
Example:
let x = "global";
function outer() {
let x = "outer";
function inner() {
console.log(x); // "outer" (lexical scope)
}
return inner;
}
const fn = outer();
fn(); // "outer"
Contrast with dynamic scoping (used in Emacs Lisp, early Perl):
// Hypothetical dynamic scoping (not JavaScript!)
let x = "global";
function outer() {
let x = "outer";
inner();
}
function inner() {
console.log(x); // Would print "outer" in dynamic scoping
}
outer();
In dynamic scoping, inner() would look up
x in the call stack, finding
outer’s x. In JavaScript’s lexical
scoping, inner() looks up x in its
lexical environment (where it was defined), finding
the global x.
Environment Records: The Spec’s Abstraction
Specification: §9.1.1 (pp. 129-136)
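An Environment Record is the spec’s abstraction for “where variables live.” The spec defines three concrete kinds, plus two specializations of the declarative kind:
Declarative Environment Record: For let, const, and class bindings, and (through the function specialization below) for var and function parameters.
Function Environment Record: A declarative record for a function’s top-level scope. It also holds the this binding.
Module Environment Record: A declarative record for a module’s top-level scope, including its import bindings.
Object Environment Record: For with statements and global object properties.
Global Environment Record: Hybrid of declarative + object records.
Structure:
EnvironmentRecord {
  bindings: Map<String, Value>
  outer: EnvironmentRecord | null
}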
Example:
let globalVar = "global";
function outer() {
let outerVar = "outer";
function inner() {
let innerVar = "inner";
console.log(globalVar, outerVar, innerVar);
}
inner();
}
outer();
Environment chain:
inner’s Environment {
  bindings: { innerVar: "inner" }
  outer: outer’s Environment
}
outer’s Environment {
  bindings: { outerVar: "outer" }
  outer: Global Environment
}
Global Environment {
  bindings: { globalVar: "global", outer: <function> }
  outer: null
}
Variable lookup: Start at the current
environment, walk the outer chain until found (or
ReferenceError if not found).
Systems analogy: Environment records are like
stack frames in C, but they’re
heap-allocated if captured by closures. The
outer link is analogous to the static
link in ALGOL-style languages.
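The lookup rule above can be modeled in a few lines. This is a toy model, not how any engine stores bindings; the Environment class and its declare and lookup methods are invented for illustration.

```javascript
// Toy model of an environment record: a map of bindings plus an outer link.
class Environment {
  constructor(outer = null) {
    this.bindings = new Map();
    this.outer = outer;
  }
  declare(name, value) {
    this.bindings.set(name, value);
    return this;
  }
  lookup(name) {
    // Walk the outer chain until some record holds the binding.
    for (let env = this; env !== null; env = env.outer) {
      if (env.bindings.has(name)) return env.bindings.get(name);
    }
    throw new ReferenceError(`${name} is not defined`);
  }
}

const globalEnv = new Environment().declare("globalVar", "global");
const outerEnv = new Environment(globalEnv).declare("outerVar", "outer");
const innerEnv = new Environment(outerEnv).declare("innerVar", "inner");

console.log(innerEnv.lookup("innerVar"));  // "inner"
console.log(innerEnv.lookup("globalVar")); // "global"
```

A lookup from outerEnv for innerVar fails, which mirrors the fact that outer scopes cannot see inner bindings.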
Block Scopes: let and const
Specification: §14.3.1 (pp. 276-277)
ES6 introduced block-scoped variables
(let, const) that create a new environment
for each block:
{
let x = 1;
const y = 2;
var z = 3; // Function-scoped
}
console.log(z); // 3
console.log(x); // ReferenceError
Temporal Dead Zone (TDZ):
let/const variables cannot be accessed
before their declaration:
console.log(x); // ReferenceError: Cannot access 'x' before initialization
let x = 1;
Contrast with var (hoisted,
initialized to undefined):
console.log(x); // undefined
var x = 1;
Block scoping in loops:
// The loop variable is scoped to the loop
for (let i = 0; i < arr.length; i++) {
console.log(i); // OK
}
console.log(i); // ReferenceError: i is not defined (out of scope, not a TDZ error)
// Each iteration has a new 'i'
for (let i = 0; i < 3; i++) {
setTimeout(() => console.log(i), 100);
}
// Logs: 0, 1, 2 (each closure captures a different 'i')
Contrast with var:
for (var i = 0; i < 3; i++) {
setTimeout(() => console.log(i), 100);
}
// Logs: 3, 3, 3 (all closures share the same 'i')
Why? let creates a new
binding for each loop iteration, while var
reuses the same binding.
Specification: §14.7.4.2 (pp. 299-300) describes loop iteration environment creation.
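A hand-written approximation of the per-iteration binding may make the mechanism clearer. It is only a sketch of the idea, not the spec algorithm; viaLet, byHand, and carried are illustrative names.

```javascript
// The built-in behavior: one binding of i per iteration.
const viaLet = [];
for (let i = 0; i < 3; i++) {
  viaLet.push(() => i);
}

// Roughly what the loop above does, spelled out by hand.
const byHand = [];
{
  let carried = 0;          // value handed from one iteration to the next
  while (carried < 3) {
    let i = carried;        // fresh binding for this iteration
    byHand.push(() => i);   // the closure captures this iteration's binding
    carried = i + 1;        // the next iteration starts from the incremented value
  }
}

console.log(viaLet.map((f) => f())); // [0, 1, 2]
console.log(byHand.map((f) => f())); // [0, 1, 2]
```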
var Hoisting: Function-Scoped and Initialized
Hoisting: Variable declarations are moved to the top of their function (or global scope):
function test() {
console.log(x); // undefined (not ReferenceError)
var x = 1;
console.log(x); // 1
}
Desugared:
function test() {
var x; // Hoisted to top, initialized to undefined
console.log(x);
x = 1;
console.log(x);
}
Function-scoped, not block-scoped:
function test() {
if (true) {
var x = 1;
}
console.log(x); // 1 (x is function-scoped)
}
Systems lesson: var is a legacy of
JavaScript’s hasty design. Always use
let/const in modern code.
Closures: Capturing Lexical Environments
What is a Closure?
Definition: A closure is a function that captures variables from its enclosing lexical scope, even after the outer function has returned.
Example:
function makeCounter() {
let count = 0;
return function() {
return ++count;
};
}
const counter = makeCounter();
console.log(counter()); // 1
console.log(counter()); // 2
console.log(counter()); // 3
How it works:
makeCounter creates a local variable count.
The inner function captures count in its [[Environment]] slot.
When makeCounter returns, its execution context is popped, but the environment record is not garbage-collected because the inner function still references it.
Each call to counter() accesses the captured count.
Systems analogy: Closures are like upvalues in Lua or lexical bindings in Scheme. The captured variables are stored in a heap-allocated environment (not on the stack).
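A variation on the counter shows two consequences of this model: closures created by the same call share one environment, and each call creates a separate one. The increment and current method names are illustrative.

```javascript
function makeCounter() {
  let count = 0; // lives in the environment created by this particular call
  return {
    increment: () => ++count, // both closures share that environment
    current: () => count,
  };
}

const first = makeCounter();
const second = makeCounter(); // a separate call, hence a separate environment

first.increment();
first.increment();
second.increment();

console.log(first.current());  // 2
console.log(second.current()); // 1
```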
Closure Implementation: Hidden Classes and Contexts
V8’s implementation:
Context object: A heap-allocated object storing captured variables.
[[Environment]] slot: The function object points to the context.
Variable access: Load from context, not from stack frame.
Pseudo-code:
// Source:
function makeCounter() {
let count = 0;
return function() { return ++count; };
}
// V8 internal representation:
Context {
count: 0
}
Function {
[[Environment]]: Context
[[Code]]: bytecode for "return ++count"
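}
Bytecode (simplified):
LdaContextSlot [0]   ; Load count from context slot 0
Inc                  ; Increment
StaContextSlot [0]   ; Store back to context slot 0
Return
Systems note: V8 allocates context slots only for variables that some inner function references. A variable that no closure uses is not stored in the context. The context is shared by every closure created in the same scope, so a variable captured by one closure stays alive as long as any of them does.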
Closures in Loops: The Classic Pitfall
Problem:
const functions = [];
for (var i = 0; i < 3; i++) {
functions.push(function() {
console.log(i);
});
}
functions[0](); // 3
functions[1](); // 3
functions[2](); // 3
Why? All closures capture the
same i (function-scoped by
var). After the loop, i is
3.
Fix 1: Use let:
for (let i = 0; i < 3; i++) {
functions.push(function() {
console.log(i);
});
}
functions[0](); // 0
functions[1](); // 1
functions[2](); // 2
Fix 2: IIFE (Immediately Invoked Function Expression):
for (var i = 0; i < 3; i++) {
(function(i) {
functions.push(function() {
console.log(i);
});
})(i);
}
Fix 3: .bind() (pre-apply the current value of i as the first argument):
for (var i = 0; i < 3; i++) {
functions.push(function(i) {
console.log(i);
}.bind(null, i));
}
Systems lesson: Understanding closures is critical for avoiding subtle bugs. The spec’s environment model explains why these fixes work.
Closure Memory Leaks: Retaining References
Problem:
function setupHandler() {
const largeData = new Array(1000000).fill("data");
document.getElementById("button").addEventListener("click", function() {
console.log("Button clicked!");
// Doesn't use largeData, but still captures it
});
}
setupHandler();
Why? In the spec’s model, the closure references the entire
lexical environment, including largeData, even
though it’s unused.
V8 optimization: Modern engines analyze closures and keep only variables that some inner function references, so here largeData would normally be collectable. This isn’t guaranteed: if any other closure created in the same scope uses largeData, the shared context keeps it alive for all of them.
Fix: Explicitly null out references:
function setupHandler() {
let largeData = new Array(1000000).fill("data");
// Use largeData here...
largeData = null; // Drop the reference so the array can be collected
document.getElementById("button").addEventListener("click", function() {
console.log("Button clicked!");
});
}
Or use WeakMap:
const dataMap = new WeakMap();
function setupHandler(button) {
const largeData = new Array(1000000).fill("data");
dataMap.set(button, largeData);
button.addEventListener("click", function() {
const data = dataMap.get(button);
// Use data if needed
});
}
// Once the button is removed from the DOM and nothing else references it, largeData can be GC'd
The this Keyword: Dynamic Binding and Confusion
this Binding Rules
JavaScript’s this is dynamically
bound based on how a function is called, not where it’s
defined.
Four binding rules:
Default binding: Global object (or undefined in strict mode).
Implicit binding: The object the method is called on.
Explicit binding: .call(), .apply(), .bind().
new binding: The newly created object.
Rule 1: Default Binding
function fn() {
console.log(this);
}
fn(); // Window (browser) or global (Node.js)
Strict mode:
"use strict";
function fn() {
console.log(this);
}
fn(); // undefined
Rule 2: Implicit Binding
const obj = {
value: 42,
fn: function() {
console.log(this.value);
}
};
obj.fn(); // 42
Gotcha: Losing binding:
const fn = obj.fn;
fn(); // undefined (or error in strict mode)
Why? fn is now a standalone
function, losing its context.
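Two common ways to keep the receiver when a method is passed around are sketched below. The account object and the bound and wrapped names are hypothetical.

```javascript
const account = {
  owner: "Alice",
  describe() {
    return `owner: ${this.owner}`;
  },
};

// Option 1: fix the receiver permanently with bind.
const bound = account.describe.bind(account);

// Option 2: wrap the call so it is always made through the object.
const wrapped = () => account.describe();

console.log(bound());   // "owner: Alice"
console.log(wrapped()); // "owner: Alice"
```

The wrapper looks up describe at call time, so it picks up a later reassignment of account.describe; the bound function does not.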
Rule 3: Explicit Binding
.call() and
.apply():
function greet(greeting) {
console.log(`${greeting}, ${this.name}!`);
}
const person = { name: "Alice" };
greet.call(person, "Hello"); // Hello, Alice!
greet.apply(person, ["Hi"]); // Hi, Alice!
.bind(): Returns a new function
with this permanently bound:
const boundGreet = greet.bind(person);
boundGreet("Hey"); // Hey, Alice!
Rule 4: new Binding
Specification: §20.2.3.1 (pp. 489-490)
When a function is called with new:
A new object is created.
The object’s [[Prototype]] is set to Constructor.prototype.
The constructor is called with this bound to the new object.
If the constructor returns an object, that object is returned; otherwise, the new object is returned.
function Person(name) {
this.name = name;
}
const alice = new Person("Alice");
console.log(alice.name); // "Alice"
Returning an object overrides the new object:
function Person(name) {
this.name = name;
return { name: "Overridden" };
}
const bob = new Person("Bob");
console.log(bob.name); // "Overridden"
Arrow Functions: Lexical this
Arrow functions do not bind
this—they inherit it from the enclosing
scope:
const obj = {
value: 42,
regularFn: function() {
setTimeout(function() {
console.log(this.value); // undefined (or error in strict)
}, 100);
},
arrowFn: function() {
setTimeout(() => {
console.log(this.value); // 42
}, 100);
}
};
obj.regularFn(); // undefined
obj.arrowFn(); // 42
Systems insight: Arrow functions are
not closures over this. They
don’t have a this binding at all.
Variable lookup for this walks the environment chain,
just like any other variable.
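One observable consequence is that explicit binding has no effect on an arrow function. A minimal sketch, with makeArrow, owner, and other as illustrative names:

```javascript
function makeArrow() {
  return () => this; // this is resolved in makeArrow's environment
}

const owner = { label: "owner" };
const other = { label: "other" };

const arrow = makeArrow.call(owner);

console.log(arrow() === owner);             // true
console.log(arrow.call(other) === owner);   // true: call cannot rebind this
console.log(arrow.bind(other)() === owner); // true: neither can bind
```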
The globalThis Object (ES2020)
Specification: §19.1 (p. 447)
globalThis provides a standard way to access the
global object in any environment:
// Browser: globalThis === window
// Node.js: globalThis === global
// Web Worker: globalThis === self
console.log(globalThis);
Pre-ES2020 workaround:
const globalObject = (function() {
return this || (new Function("return this"))();
})();
Higher-Order Functions: Functional Programming Patterns
Functions as Arguments: Callbacks and Iterators
Higher-order function: A function that takes or returns other functions.
Example: Array methods:
const numbers = [1, 2, 3, 4, 5];
const doubled = numbers.map(x => x * 2);
const evens = numbers.filter(x => x % 2 === 0);
const sum = numbers.reduce((acc, x) => acc + x, 0);
console.log(doubled); // [2, 4, 6, 8, 10]
console.log(evens); // [2, 4]
console.log(sum); // 15
Custom higher-order function:
function repeat(n, action) {
for (let i = 0; i < n; i++) {
action(i);
}
}
repeat(3, i => console.log(`Iteration ${i}`));
Functions as Return Values: Function Factories
Currying:
function add(a) {
return function(b) {
return a + b;
};
}
const add5 = add(5);
console.log(add5(3)); // 8
console.log(add5(10)); // 15
Generic currying:
function curry(fn) {
return function curried(...args) {
if (args.length >= fn.length) {
return fn(...args);
}
return (...nextArgs) => curried(...args, ...nextArgs);
};
}
function multiply(a, b, c) {
return a * b * c;
}
const curriedMultiply = curry(multiply);
console.log(curriedMultiply(2)(3)(4)); // 24
console.log(curriedMultiply(2, 3)(4)); // 24
console.log(curriedMultiply(2, 3, 4)); // 24
Systems note: Currying is fundamental in Haskell/ML. In JavaScript, it’s less common but useful for partial application.
Partial Application
.bind() for partial
application:
function greet(greeting, name) {
console.log(`${greeting}, ${name}!`);
}
const sayHello = greet.bind(null, "Hello");
sayHello("Alice"); // Hello, Alice!
sayHello("Bob"); // Hello, Bob!
Custom partial application:
function partial(fn, ...presetArgs) {
return function(...laterArgs) {
return fn(...presetArgs, ...laterArgs);
};
}
const add = (a, b, c) => a + b + c;
const add5and10 = partial(add, 5, 10);
console.log(add5and10(3)); // 18
Composition and Pipelining
Function composition: compose applies its functions right-to-left (the last argument runs first); pipe applies them left-to-right:
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);
const pipe = (...fns) => x => fns.reduce((acc, fn) => fn(acc), x);
const double = x => x * 2;
const increment = x => x + 1;
const square = x => x * x;
const composed = compose(square, increment, double);
console.log(composed(3)); // square(increment(double(3))) = square(7) = 49
const piped = pipe(double, increment, square);
console.log(piped(3)); // square(increment(double(3))) = square(7) = 49
Systems analogy: Function composition is like
Unix pipes: cmd1 | cmd2 | cmd3.
Function Optimization: Inline Caching and Hidden Classes
Inline Caching: Speeding Up Property Access
Problem: Property lookup in JavaScript is expensive—walk the prototype chain, check property attributes, etc.
Solution: Inline caching (IC)—cache the result of property lookups.
Example:
function getX(obj) {
return obj.x;
}
getX({ x: 1 }); // First call: IC miss, cache { x: 1 }'s shape
getX({ x: 2 }); // Second call: IC hit (same shape), fast path
IC states (the thresholds are V8 implementation details and may change):
Uninitialized: Never called.
Monomorphic: Sees one object shape (fast).
Polymorphic: Sees 2-4 shapes (slower, but still cached).
Megamorphic: Sees more than 4 shapes (falls back to a slower, shared lookup cache).
Systems lesson: Keep object shapes consistent for performance:
// Bad: Different shapes
function Point1(x, y) {
this.x = x;
this.y = y;
}
function Point2(y, x) { // Different order!
this.y = y;
this.x = x;
}
const p1 = new Point1(1, 2);
const p2 = new Point2(2, 1);
// p1 and p2 have different hidden classes!
Good: Same shape:
function Point(x, y) {
this.x = x;
this.y = y;
}
const p1 = new Point(1, 2);
const p2 = new Point(3, 4);
// p1 and p2 share the same hidden class
Hidden Classes (Maps): V8’s Optimization
V8 internal: Objects don’t store property names—they store a pointer to a hidden class (also called a map or shape):
Object {
  map: HiddenClass
  properties: [value1, value2, …]
}
HiddenClass {
  x: offset 0
  y: offset 1
}
Property access: obj.x → look up
x in obj.map → get offset → load from
obj.properties[offset].
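The lookup path above, together with the idea that adding a property moves an object to a new hidden class, can be simulated in plain JavaScript. This is a toy model under simplifying assumptions, not V8's data structures; Shape, ToyObject, and emptyShape are invented names.

```javascript
// Toy hidden class: maps property names to slot offsets and remembers transitions.
class Shape {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next Shape
  }
  // Return the shape reached by adding `name`, creating it on first use.
  add(name) {
    if (!this.transitions.has(name)) {
      const next = new Map(this.offsets).set(name, this.offsets.size);
      this.transitions.set(name, new Shape(next));
    }
    return this.transitions.get(name);
  }
}

const emptyShape = new Shape();

class ToyObject {
  constructor() {
    this.shape = emptyShape;
    this.slots = [];
  }
  set(name, value) {
    if (!this.shape.offsets.has(name)) this.shape = this.shape.add(name);
    this.slots[this.shape.offsets.get(name)] = value;
  }
  get(name) {
    const offset = this.shape.offsets.get(name); // look up offset in the shape
    return offset === undefined ? undefined : this.slots[offset];
  }
}

const a = new ToyObject(); a.set("x", 1); a.set("y", 2);
const b = new ToyObject(); b.set("x", 3); b.set("y", 4);
const c = new ToyObject(); c.set("y", 5); c.set("x", 6);

console.log(a.shape === b.shape); // true: same insertion order, same shape
console.log(a.shape === c.shape); // false: different order, different shape
console.log(c.get("x"));          // 6
```

Because a and b followed the same transitions, a cache keyed on their shared shape could reuse one offset for both, which is the intuition behind inline caching.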
Hidden class transitions:
const obj = {};
// Hidden class: C0 (empty)
obj.x = 1;
// Transition: C0 → C1 (has 'x')
obj.y = 2;
// Transition: C1 → C2 (has 'x' and 'y')
Adding properties in different orders creates different hidden classes:
const obj1 = {};
obj1.x = 1;
obj1.y = 2;
// Hidden class: C2
const obj2 = {};
obj2.y = 2;
obj2.x = 1;
// Hidden class: C2' (different from C2!)
Systems advice: Initialize all properties in the constructor in the same order:
class Point {
constructor(x, y) {
this.x = x;
this.y = y;
}
}
// All Point instances share the same hidden class
Inlining and Deoptimization
Inlining: The JIT compiler inlines small functions for performance:
function add(a, b) {
return a + b;
}
function compute(x) {
return add(x, 10);
}
// After JIT:
function compute(x) {
return x + 10; // 'add' is inlined
}
Deoptimization: If assumptions are violated (e.g., type changes), the JIT deoptimizes back to bytecode:
function add(a, b) {
return a + b;
}
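add(1, 2); // Once the function is hot, the JIT specializes for small-integer inputs
add(1.5, 2.5); // Still numbers, but doubles rather than small integers; the engine may have to widen its assumption
add("a", "b"); // Strings: the numeric assumption fails, so optimized code deoptimizes
Systems lesson: Avoid polymorphic code in hot paths. Keep types stable.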
Practical Patterns: Memoization, Debouncing, Throttling
Memoization: Caching Function Results
Pattern: Cache expensive function calls:
function memoize(fn) {
const cache = new Map();
return function(...args) {
const key = JSON.stringify(args);
if (cache.has(key)) {
return cache.get(key);
}
const result = fn(...args);
cache.set(key, result);
return result;
};
}
function fibonacci(n) {
if (n <= 1) return n;
return fibonacci(n - 1) + fibonacci(n - 2);
}
const memoizedFib = memoize(fibonacci);
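console.log(memoizedFib(40)); // Slow the first time: the recursive calls bypass the cache
console.log(memoizedFib(40)); // Fast: the result is served from the cache
Systems note: For recursive functions, make the recursion go through the memoized wrapper:
const fibonacci = memoize(function(n) {
if (n <= 1) return n;
return fibonacci(n - 1) + fibonacci(n - 2); // Calls the memoized wrapper
});
Debouncing: Delay Execution Until Idle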
Pattern: Execute a function only after a delay since the last call:
function debounce(fn, delay) {
let timeoutId;
return function(...args) {
clearTimeout(timeoutId);
timeoutId = setTimeout(() => fn(...args), delay);
};
}
const search = debounce((query) => {
console.log(`Searching for: ${query}`);
}, 300);
// User types: "h" → "he" → "hel" → "hell" → "hello"
// Only one search after 300ms of no typing
Use case: Search boxes, resize handlers.
Throttling: Limit Execution Rate
Pattern: Execute at most once per interval:
function throttle(fn, interval) {
let lastCall = 0;
return function(...args) {
const now = Date.now();
if (now - lastCall >= interval) {
lastCall = now;
fn(...args);
}
};
}
const logScroll = throttle(() => {
console.log("Scrolled!");
}, 1000);
window.addEventListener("scroll", logScroll);
// Logs at most once per second
Use case: Scroll handlers, mouse move handlers.
Generators and Iterators: Pausable Functions
Generator Functions: function* and yield
Specification: §15.5 (pp. 332-340)
Syntax:
function* generatorFn() {
yield 1;
yield 2;
yield 3;
}
const gen = generatorFn();
console.log(gen.next()); // { value: 1, done: false }
console.log(gen.next()); // { value: 2, done: false }
console.log(gen.next()); // { value: 3, done: false }
console.log(gen.next()); // { value: undefined, done: true }
Generators are iterators:
for (const value of generatorFn()) {
console.log(value); // 1, 2, 3
}
Infinite generators:
function* fibonacci() {
let [a, b] = [0, 1];
while (true) {
yield a;
[a, b] = [b, a + b];
}
}
const fib = fibonacci();
console.log(fib.next().value); // 0
console.log(fib.next().value); // 1
console.log(fib.next().value); // 1
console.log(fib.next().value); // 2
console.log(fib.next().value); // 3
yield*: Delegating to Another Generator
function* gen1() {
yield 1;
yield 2;
}
function* gen2() {
yield* gen1();
yield 3;
}
console.log([...gen2()]); // [1, 2, 3]
Generators as Coroutines: Bidirectional Communication
next(value) passes a value back to the
generator:
function* dialogue() {
const name = yield "What's your name?";
const age = yield `Hello, ${name}! How old are you?`;
return `${name} is ${age} years old.`;
}
const conv = dialogue();
console.log(conv.next().value); // "What's your name?"
console.log(conv.next("Alice").value); // "Hello, Alice! How old are you?"
console.log(conv.next(30).value); // "Alice is 30 years old."
Systems note: This is similar to Lua coroutines or Python generators: pausable functions that can resume with input.
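A second sketch of the same mechanism keeps state between resumptions: a coroutine that receives numbers and yields their running average. The runningAverage name and its variables are illustrative.

```javascript
function* runningAverage() {
  let sum = 0;
  let count = 0;
  let average; // undefined until the first value arrives
  while (true) {
    const value = yield average; // pause here; resume with the next number
    sum += value;
    count += 1;
    average = sum / count;
  }
}

const avg = runningAverage();
avg.next(); // prime the generator: run it up to the first yield

const afterFirst = avg.next(10).value;
const afterSecond = avg.next(20).value;
const afterThird = avg.next(30).value;

console.log(afterFirst, afterSecond, afterThird); // 10 15 20
```

The priming call is needed because a value passed to the very first next() has no yield expression to receive it.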
Async Generators (ES2018)
Specification: §27.7 (pp. 718-722)
Combine async/await with
generators:
async function* fetchPages() {
let page = 1;
while (page <= 3) {
const response = await fetch(`https://api.example.com/page/${page}`);
const data = await response.json();
yield data;
page++;
}
}
(async () => {
for await (const data of fetchPages()) {
console.log(data);
}
})();
End of Chapter 4
In the next chapter, we’ll explore JavaScript’s object model—prototypes, inheritance, property descriptors, proxies, and how engines optimize object operations through hidden classes and inline caches.
Chapter 5: Objects, Prototypes, and the Class Syntax
Objects: Property Bags, Dictionaries, and More
The Object Type: JavaScript’s Fundamental Composite
Specification: §6.1.7 (pp. 73-80)
In JavaScript, everything except primitives is an object. An object is a collection of properties, where each property is a key-value pair. The key is always a string or Symbol; the value can be any type.
Object as a property bag:
const person = {
name: "Alice",
age: 30,
greet: function() {
console.log(`Hello, I'm ${this.name}`);
}
};
console.log(person.name); // "Alice"
person.greet(); // "Hello, I'm Alice"
Internal structure (§6.1.7.1, pp. 74-75):
Object {
  [[Prototype]]: Object.prototype
  [[Extensible]]: true
  properties: {
    "name": { value: "Alice", writable: true, enumerable: true, configurable: true }
    "age": { value: 30, writable: true, enumerable: true, configurable: true }
    "greet": { value: <function>, writable: true, enumerable: true, configurable: true }
  }
}
Each property is represented by a Property Descriptor with attributes:
[[Value]]: The property’s value.
[[Writable]]: Can the value be changed?
[[Enumerable]]: Does it show up in for...in loops?
[[Configurable]]: Can the property be deleted or its attributes changed?
For accessors (getters/setters):
[[Get]]: Getter function.
[[Set]]: Setter function.
[[Enumerable]] and [[Configurable]] (same as data properties).
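These attributes can be inspected from user code with Object.getOwnPropertyDescriptor. A brief sketch; the point object and its property names are made up.

```javascript
const point = {
  x: 1,
  get doubled() {
    return this.x * 2;
  },
};

// Data property created by a literal: all three flags default to true.
const dataDesc = Object.getOwnPropertyDescriptor(point, "x");

// Accessor property: get/set instead of value/writable.
const accessorDesc = Object.getOwnPropertyDescriptor(point, "doubled");

// Property created by defineProperty: omitted flags default to false.
Object.defineProperty(point, "id", { value: 7 });
const idDesc = Object.getOwnPropertyDescriptor(point, "id");

console.log(dataDesc);     // { value: 1, writable: true, enumerable: true, configurable: true }
console.log(accessorDesc); // get is a function, set is undefined
console.log(idDesc);       // { value: 7, writable: false, enumerable: false, configurable: false }
```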
Object Creation: Literals, Constructors, and Object.create
Object Literal Syntax
Specification: §13.2.5 (pp. 247-251)
const obj = {
x: 1,
y: 2,
method() {
return this.x + this.y;
}
};
Shorthand property names (ES6):
const x = 10, y = 20;
const obj = { x, y }; // Equivalent to { x: x, y: y }
Computed property names:
const propName = "dynamicKey";
const obj = {
[propName]: "value",
[`${propName}_2`]: "value2"
};
console.log(obj.dynamicKey); // "value"
console.log(obj.dynamicKey_2); // "value2"
Spread syntax (ES2018):
const obj1 = { a: 1, b: 2 };
const obj2 = { ...obj1, c: 3 };
console.log(obj2); // { a: 1, b: 2, c: 3 }
// Shallow copy
const copy = { ...obj1 };
Constructor Functions
function Person(name, age) {
this.name = name;
this.age = age;
}
Person.prototype.greet = function() {
console.log(`Hello, I'm ${this.name}`);
};
const alice = new Person("Alice", 30);
alice.greet(); // "Hello, I'm Alice"
What new does (§20.2.3.1, pp. 489-490):
Create a new empty object: const obj = {}.
Set obj.[[Prototype]] to Constructor.prototype.
Call Constructor.call(obj, ...args) (bind this to obj).
If the constructor returns an object, return that; otherwise, return obj.
Pseudo-code:
function myNew(Constructor, ...args) {
const obj = Object.create(Constructor.prototype);
const result = Constructor.apply(obj, args);
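// A returned function also counts as an object, although typeof reports 'function'
const isObject = (typeof result === 'object' && result !== null) || typeof result === 'function';
return isObject ? result : obj;
}
Object.create(): Explicit Prototype Specification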
Specification: §20.1.2.2 (pp. 461-462)
const proto = {
greet() {
console.log(`Hello, I'm ${this.name}`);
}
};
const alice = Object.create(proto);
alice.name = "Alice";
alice.greet(); // "Hello, I'm Alice"
Create an object with null
prototype (no inherited properties):
const pureMap = Object.create(null);
pureMap.toString = "custom"; // No conflict with Object.prototype.toString
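console.log(pureMap.toString); // "custom"
Systems insight:
Object.create(null) is useful for dictionaries: with no inherited keys, lookups cannot
collide with Object.prototype members, which also removes one avenue for
prototype pollution attacks.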
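The difference from an ordinary object literal shows up as soon as keys collide with inherited names. A small sketch; plain and dict are illustrative names.

```javascript
const plain = {};
const dict = Object.create(null);

// An ordinary object "has" keys it never stored, through Object.prototype.
const plainHasToString = "toString" in plain; // true
const dictHasToString = "toString" in dict;   // false

// On a null-prototype object, "__proto__" is just another key.
dict["__proto__"] = "just a key";
const dictKeys = Object.keys(dict);
const dictProto = Object.getPrototypeOf(dict);

console.log(plainHasToString, dictHasToString); // true false
console.log(dictKeys);                          // ["__proto__"]
console.log(dictProto);                         // null
```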
Property Access: Dot vs. Bracket Notation
Dot notation:
const obj = { name: "Alice" };
console.log(obj.name); // "Alice"
Bracket notation:
console.log(obj["name"]); // "Alice"
const key = "name";
console.log(obj[key]); // "Alice"
When to use bracket notation:
Dynamic keys: obj[variableKey].
Invalid identifiers: obj["invalid-key"], obj["123"].
Symbols: obj[Symbol.iterator].
Property access algorithm (§13.3.2, pp. 251-252):
Evaluate the object reference.
Evaluate the property key (convert to string or Symbol).
Perform GetValue via the internal method [[Get]], which traverses the prototype chain until it finds a matching property key or returns undefined.
Prototype lookup pseudocode:
function Get(obj, key) {
let current = obj;
while (current !== null) {
if (Object.hasOwn(current, key)) return current[key];
current = Object.getPrototypeOf(current);
}
return undefined;
}
If the property is not found on the object itself, the lookup
continues upward through each object’s [[Prototype]]
link until reaching null.
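The lookup rule also explains shadowing: writing to an inherited property creates an own property instead of changing the prototype. A minimal sketch with hypothetical base and derived objects:

```javascript
const base = {
  kind: "base",
  describe() {
    return `I am ${this.kind}`;
  },
};
const derived = Object.create(base);

const beforeWrite = derived.describe(); // kind is found on base

derived.kind = "derived";               // creates an own property that shadows base.kind
const afterWrite = derived.describe();
const baseUntouched = base.kind;

delete derived.kind;                    // removing the own property reveals base.kind again
const afterDelete = derived.kind;

console.log(beforeWrite, "|", afterWrite, "|", baseUntouched, "|", afterDelete);
// I am base | I am derived | base | base
```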
The Prototype Chain and Inheritance
The prototype Property of Constructor Functions
In JavaScript, constructor functions automatically have a
.prototype property that becomes the
[[Prototype]] of instances created with
new.
function Person(name) {
this.name = name;
}
Person.prototype.sayHi = function() {
console.log(`Hi, I'm ${this.name}`);
};
const alice = new Person('Alice');
alice.sayHi(); // from Person.prototype
Internally:
alice.[[Prototype]] → Person.prototype
Person.prototype.[[Prototype]] → Object.prototype
Object.prototype.[[Prototype]] → null
Thus, every property lookup climbs the chain
until a match is found or termination occurs at
null.
Manipulating the Prototype Directly
Object.getPrototypeOf(obj): returns the current [[Prototype]].
Object.setPrototypeOf(obj, proto): changes the link (discouraged at runtime for performance).
Creating objects with a specific prototype:
const proto = { greet() { console.log('Hello'); } };
const obj = Object.create(proto);
obj.greet(); // "Hello"
Changing prototype chains after object creation can deoptimize hidden-class optimizations (see below).
Classes: Modern Syntax for Prototype Inheritance
Basic Form
Introduced in ES2015, the class keyword provides a
declarative layer over traditional prototype mechanics.
class Person {
constructor(name) {
this.name = name;
}
greet() {
console.log(`Hello, I'm ${this.name}`);
}
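}
Roughly equivalent behavior (a class additionally runs in strict mode, cannot be called without new, and defines its methods as non-enumerable):
function Person(name) { this.name = name; }
Person.prototype.greet = function() { console.log(`Hello, I'm ${this.name}`); };
Adding Inheritance with extends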
class Employee extends Person {
constructor(name, role) {
super(name);
this.role = role;
}
describe() {
console.log(`${this.name} works as ${this.role}`);
}
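}
The super() call invokes the parent constructor, so its initialization logic runs before the subclass’s. In a derived constructor, this cannot be accessed until super() has returned; touching it earlier throws a ReferenceError.
Prototype chain under extends:
Employee.prototype.[[Prototype]] → Person.prototype
Employee.[[Prototype]] → Person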
Static methods on the parent class are inherited by the subclass through the second prototype link.
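Both links can be verified with Object.getPrototypeOf. The sketch below uses trimmed-down Person and Employee classes and a hypothetical static create method.

```javascript
class Person {
  constructor(name) {
    this.name = name;
  }
  static create(name) {
    return new this(name); // `this` is whichever class the method was called on
  }
}
class Employee extends Person {}

const instanceLink = Object.getPrototypeOf(Employee.prototype) === Person.prototype;
const staticLink = Object.getPrototypeOf(Employee) === Person;

// create is not an own property of Employee; it is found through the static link.
const ownsCreate = Object.prototype.hasOwnProperty.call(Employee, "create");
const made = Employee.create("Alice");

console.log(instanceLink, staticLink);  // true true
console.log(ownsCreate);                // false
console.log(made instanceof Employee);  // true
```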
Static vs. Instance Methods
Instance methods live on the prototype and are shared by all instances.
Static methods belong directly to the constructor (class object itself).
class MathUtil {
static clamp(value, min, max) {
return Math.min(Math.max(value, min), max);
}
}
console.log(MathUtil.clamp(10, 0, 5)); // 5
Static methods do not appear on instances.
this Binding in Methods and Arrow Functions
Within class method bodies, this refers to the
instance. Arrow functions capture this lexically:
class Counter {
constructor() {
this.c = 0;
}
inc() {
this.c++;
}
incAsync() {
setTimeout(() => this.inc(), 100); // lexical this
}
}
Using traditional functions inside setTimeout would
lose this unless bound manually
(.bind(this)), so arrow functions are common for
callbacks.
Object Representation in Engines
Engines like V8, SpiderMonkey, and JavaScriptCore translate objects into internal “hidden classes” (also called shapes).
When an object’s property layout changes, the engine may create a new hidden class.
Each hidden class maps property names → memory offsets.
Consistent object shapes enable Inline Caching (IC)—fast access when the same property is read repeatedly on likewise-shaped objects.
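Example of a shape transition:
const obj = {};
obj.a = 1; // shape #1
obj.b = 2; // new hidden class (shape #2)
Property addition order matters: objects that receive the same properties in a different order end up with different hidden classes, which can turn a fast monomorphic inline cache into a slower polymorphic one.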
Enumerability and Property Inspection
Object.keys(obj) lists own enumerable string keys.
Object.getOwnPropertyNames(obj) returns all own string keys, enumerable or not.
Object.getOwnPropertySymbols(obj) lists own Symbol keys.
Reflect.ownKeys(obj) returns all own keys, both string and Symbol.
Iteration with for...in traverses the entire
prototype chain, enumerating enumerable properties only.
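To make the differences concrete, the sketch below builds one object with an inherited property, a non-enumerable property, and a Symbol key, then runs each inspection API over it. The property names are arbitrary.

```javascript
const sym = Symbol("id");
const proto = { inherited: true };

const obj = Object.create(proto);
obj.visible = 1;
obj[sym] = 2;
Object.defineProperty(obj, "hidden", { value: 3, enumerable: false });

const keys = Object.keys(obj);                     // own + enumerable + string
const names = Object.getOwnPropertyNames(obj);     // own + string
const symbols = Object.getOwnPropertySymbols(obj); // own + Symbol
const all = Reflect.ownKeys(obj);                  // own, strings then Symbols

const forIn = [];
for (const key in obj) forIn.push(key);            // enumerable strings, own then inherited

console.log(keys);   // [ 'visible' ]
console.log(names);  // [ 'visible', 'hidden' ]
console.log(all);    // [ 'visible', 'hidden', Symbol(id) ]
console.log(forIn);  // [ 'visible', 'inherited' ]
```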
Controlling Mutability: Sealing, Freezing, Extensions
| Method | Prevent Add | Prevent Delete | Prevent Reconfigure | Prevent Write |
|---|---|---|---|---|
| Object.preventExtensions() | ✅ | ❌ | ❌ | ❌ |
| Object.seal() | ✅ | ✅ | ✅ | ❌ |
| Object.freeze() | ✅ | ✅ | ✅ | ✅ |
Each returns the same object reference after applying its restrictions. To check the current state, use Object.isExtensible(obj), Object.isSealed(obj), and Object.isFrozen(obj).
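The table can be exercised with a small strict-mode experiment. This is a sketch: the attempt helper and its labels are invented for the demonstration. Strict mode matters here because sloppy-mode code silently ignores the rejected writes instead of throwing.

```javascript
function demo() {
  "use strict"; // in sloppy mode the failed writes below would be silently ignored
  const results = {};
  const attempt = (label, mutation) => {
    try { mutation(); results[label] = "ok"; }
    catch (err) { results[label] = err.constructor.name; }
  };

  const sealed = Object.seal({ x: 1 });
  attempt("sealed write", () => { sealed.x = 2; });      // allowed
  attempt("sealed add", () => { sealed.y = 1; });        // rejected
  attempt("sealed delete", () => { delete sealed.x; });  // rejected

  const frozen = Object.freeze({ x: 1, nested: { n: 1 } });
  attempt("frozen write", () => { frozen.x = 2; });             // rejected
  attempt("frozen nested write", () => { frozen.nested.n = 2; }); // allowed: freeze is shallow

  return { results, sealed, frozen };
}

const { results, sealed, frozen } = demo();
console.log(results);
console.log(Object.isSealed(sealed), Object.isFrozen(frozen)); // true true
```

Note the last case: Object.freeze applies only to the object's own properties, so nested objects stay mutable unless they are frozen as well.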
Proxies and the Reflect API
Proxies
Proxies intercept fundamental operations like property getting, setting, calling, or construction. (§27.5 in ECMA‑262)
const target = { message: "Hi" };
const handler = {
get(obj, prop, receiver) {
console.log(`Accessing ${prop}`);
return obj[prop];
}
};
const proxy = new Proxy(target, handler);
console.log(proxy.message); // Logs + returns “Hi”
Internally, the get trap intercepts the [[Get]] operation; other traps include set, has, deleteProperty, construct, etc.
Reflect API
Reflect exposes the same internal operations without interception semantics, enabling transparent invocation:
Reflect.get(target, "message");
Reflect.set(target, "message", "Hello");
When designing Proxy handlers, prefer delegating to Reflect methods for consistent behavior.
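One place where the difference shows up is the receiver argument. The sketch below (object and property names are made up) forwards the receiver through Reflect.get and Reflect.set so that a getter on the target sees the object the access started from, even when the proxy sits on a prototype chain. Returning target[prop] instead would run the getter against the target.

```javascript
const base = {
  _name: "base",
  get name() { return this._name; } // getter depends on its receiver
};

const logged = [];
const handler = {
  get(target, prop, receiver) {
    logged.push(String(prop));
    // Forward the receiver so getters see the original object as `this`.
    return Reflect.get(target, prop, receiver);
  },
  set(target, prop, value, receiver) {
    if (prop === "_name" && typeof value !== "string") {
      throw new TypeError("_name must be a string");
    }
    // With a receiver other than the target, the property lands on the receiver.
    return Reflect.set(target, prop, value, receiver);
  }
};

const proxy = new Proxy(base, handler);
const child = Object.create(proxy); // the proxy is child's prototype
child._name = "child";              // goes through the set trap

console.log(child.name);  // "child"
console.log(base._name);  // "base" (the target was not modified)
console.log(logged);      // [ 'name' ]
```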
Accessor and Data Properties
To define or inspect property attributes:
Object.defineProperty(obj, "x", {
value: 1,
writable: false,
enumerable: true,
configurable: false
});
console.log(Object.getOwnPropertyDescriptor(obj, "x"));
Accessor example:
Object.defineProperty(obj, "y", {
get() { return this._y; },
set(v) { this._y = v; },
enumerable: true
});
Defining properties precisely allows you to implement computed members and validations. Note that _y above is still an ordinary, publicly reachable property; hiding the backing storage requires a closure or a private field.
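A minimal sketch of the closure approach follows. The createTemperature factory and its property names are hypothetical; the point is that the stored value lives in a local variable, so the accessor is the only way in or out.

```javascript
function createTemperature(initialCelsius) {
  let celsius = initialCelsius; // backing storage lives in the closure, not on the object
  const reading = {};

  Object.defineProperty(reading, "celsius", {
    get() { return celsius; },
    set(value) {
      // Validation happens on every write.
      if (typeof value !== "number" || Number.isNaN(value)) {
        throw new TypeError("celsius must be a number");
      }
      celsius = value;
    },
    enumerable: true,
    configurable: false
  });

  // Computed member: derived on every read, never stored.
  Object.defineProperty(reading, "fahrenheit", {
    get() { return celsius * 9 / 5 + 32; },
    enumerable: true
  });

  return reading;
}

const t = createTemperature(100);
t.celsius = 25;
console.log(t.fahrenheit);   // 77
console.log(Object.keys(t)); // [ 'celsius', 'fahrenheit' ]
```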
The super Keyword and Lexical Home Objects
Inside a method using super, JavaScript references
the method on the prototype of the current home
object—the lexical container where the method was
defined—not the runtime receiver.
class A {
greet() { console.log("Hi from A"); }
}
class B extends A {
greet() {
super.greet();
console.log("Hi from B");
}
}
new B().greet();
Call resolution:
B.prototype.greet uses its [[HomeObject]] = B.prototype.
super.greet reads from Object.getPrototypeOf(B.prototype) → A.prototype.
This ensures super behaves predictably even when
functions are borrowed or rebound.
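The borrowed-method case can be demonstrated in a few lines. In this sketch the borrower object and its label property are invented for illustration; its prototype is Object.prototype, which has no greet method, yet super.greet() still finds A.prototype.greet because the lookup starts from the method's home object.

```javascript
class A {
  greet() { return `Hi from A (receiver: ${this.label})`; }
}
class B extends A {
  greet() { return `${super.greet()} + Hi from B`; }
}

// Borrow B's method onto an unrelated plain object.
const borrower = { label: "borrower", greet: B.prototype.greet };
const message = borrower.greet();

console.log(message); // "Hi from A (receiver: borrower) + Hi from B"
```

The lookup of super is fixed lexically, while this inside the parent method is still the runtime receiver.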
Summary
Objects are extensible property maps managed through descriptors.
Prototype chains handle inheritance and influence lookup semantics.
Classes provide syntactic sugar around prototype linkage and constructor invocation.
Hidden classes and inline caches optimize property access internally.
Proxies and Reflect expose or intercept fundamental object operations.
Descriptors and freeze/seal control configurability and immutability.
super relies on lexical [[HomeObject]] for predictable prototype method delegation.
Together, these mechanisms define the structure and performance behavior of objects—the backbone of all JavaScript execution models.
Chapter 6: Arrays, Typed Arrays, and Buffers
Arrays: JavaScript’s Flexible Ordered Collections
The Array Type: Objects with Special Length Behavior
Specification: §23.1 (pp. 545-584)
In JavaScript, arrays are specialized objects
where integer-indexed properties (0, 1, 2, …) are automatically
tracked via a magic length property. Arrays inherit
from Array.prototype, gaining powerful iteration and
transformation methods.
Basic array creation:
const arr = [1, 2, 3];
console.log(arr.length); // 3
console.log(arr[0]); // 1
arr[5] = 10;
console.log(arr.length); // 6 (automatically updated)
console.log(arr); // [1, 2, 3, <2 empty items>, 10]
Internal representation (conceptual):
Array {
  [[Prototype]]: Array.prototype
  0: 1
  1: 2
  2: 3
  5: 10
  length: 6 // automatically maintained
}
The length property is writable:
setting it truncates or extends the array.
arr.length = 3;
console.log(arr); // [1, 2, 3] (elements beyond index 2 removed)
arr.length = 5;
console.log(arr); // [1, 2, 3, <2 empty items>]
Array Creation Methods
Array Literals
const empty = [];
const nums = [1, 2, 3];
const mixed = [1, "two", { three: 3 }, [4, 5]];
Sparse arrays (with holes):
const sparse = [1, , 3];
console.log(sparse.length); // 3
console.log(1 in sparse); // false (no property at index 1)
Array Constructor
const arr1 = new Array(3); // [<3 empty items>], length = 3
const arr2 = new Array(1, 2, 3); // [1, 2, 3]
const arr3 = Array.of(3); // [3] (avoids ambiguity)
Note: new Array(n) creates a sparse array of length n when it is called with a single numeric argument.
Array.from(): Converting Iterables to Arrays
Specification: §23.1.2.1 (pp. 547-548)
// From string
Array.from("hello"); // ['h', 'e', 'l', 'l', 'o']
// From Set
Array.from(new Set([1, 2, 2, 3])); // [1, 2, 3]
// With mapping function
Array.from([1, 2, 3], x => x * 2); // [2, 4, 6]
// From array-like object
const arrayLike = { 0: 'a', 1: 'b', length: 2 };
Array.from(arrayLike); // ['a', 'b']
Array-like objects have a length property and indexed elements, but lack array methods. Array.from() converts them to real arrays.
Spread Syntax (ES6)
const arr1 = [1, 2];
const arr2 = [...arr1, 3, 4]; // [1, 2, 3, 4]
// Shallow copy
const copy = [...arr1];
// Concatenation
const combined = [...arr1, ...arr2];
Array Methods: Iteration and Transformation
Specification: §23.1.3 (pp. 549-584)
Mutating Methods
| Method | Description | Returns |
|---|---|---|
| push(item) | Add to end | New length |
| pop() | Remove from end | Removed item |
| shift() | Remove from start | Removed item |
| unshift(item) | Add to start | New length |
| splice(start, deleteCount, ...items) | Remove/insert at index | Removed items array |
| reverse() | Reverse in place | The array itself |
| sort([compareFn]) | Sort in place | The array itself |
const arr = [3, 1, 4, 1, 5];
arr.push(9); // [3, 1, 4, 1, 5, 9]
arr.pop(); // [3, 1, 4, 1, 5]
arr.unshift(0); // [0, 3, 1, 4, 1, 5]
arr.shift(); // [3, 1, 4, 1, 5]
arr.splice(2, 1, 2); // Remove 1 item at index 2, insert 2
// [3, 1, 2, 1, 5]
arr.reverse(); // [5, 1, 2, 1, 3]
arr.sort(); // [1, 1, 2, 3, 5] (lexicographic by default!)
Sorting gotcha: Default sort converts elements to strings!
[10, 2, 1].sort(); // [1, 10, 2] (string comparison!)
// Numeric sort:
[10, 2, 1].sort((a, b) => a - b); // [1, 2, 10]
Non-Mutating Methods
| Method | Description | Returns |
|---|---|---|
| concat(...arrays) | Merge arrays | New array |
| slice(start, end) | Extract subarray | New array |
| join(separator) | Join to string | String |
| indexOf(item) | First index of item | Index or -1 |
| lastIndexOf(item) | Last index of item | Index or -1 |
| includes(item) | Check presence (ES7) | Boolean |
const arr = [1, 2, 3, 4, 5];
arr.slice(1, 3); // [2, 3]
arr.concat([6, 7]); // [1, 2, 3, 4, 5, 6, 7]
arr.join('-'); // "1-2-3-4-5"
arr.indexOf(3); // 2
arr.includes(4); // true
Higher-Order Methods (Iteration)
Specification: §23.1.3.10, §23.1.3.15, §23.1.3.7, §23.1.3.8 (pp. 554-564)
| Method | Description | Returns |
|---|---|---|
| forEach(fn) | Execute fn for each element | undefined |
| map(fn) | Transform each element | New array |
| filter(fn) | Keep elements where fn returns truthy | New array |
| reduce(fn, initial) | Accumulate values | Single value |
| find(fn) | First element where fn returns truthy | Element or undefined |
| findIndex(fn) | Index of first match | Index or -1 |
| some(fn) | Any element passes test | Boolean |
| every(fn) | All elements pass test | Boolean |
const nums = [1, 2, 3, 4, 5];
nums.forEach(x => console.log(x));
const doubled = nums.map(x => x * 2); // [2, 4, 6, 8, 10]
const evens = nums.filter(x => x % 2 === 0); // [2, 4]
const sum = nums.reduce((acc, x) => acc + x, 0); // 15
const firstEven = nums.find(x => x % 2 === 0); // 2
const firstEvenIdx = nums.findIndex(x => x % 2 === 0); // 1
nums.some(x => x > 10); // false
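nums.every(x => x > 0); // true
Key insight: None of these methods mutates the original array on its own. forEach differs only in that it returns undefined, so it is useful solely for side effects; any mutation comes from the callback you pass.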
Performance consideration: Chaining multiple array methods creates intermediate arrays. For large datasets, consider single-pass iteration or generator-based approaches.
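As a rough illustration of the generator-based approach, the sketch below defines three small lazy helpers (filter, map, and take are names chosen here, not built-ins) and counts how many source elements are actually visited. Each element flows through the whole pipeline before the next one is pulled, and nothing past the third match is touched.

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function* map(iterable, transform) {
  for (const item of iterable) yield transform(item);
}
function* take(iterable, limit) {
  if (limit <= 0) return;
  let count = 0;
  for (const item of iterable) {
    yield item;
    if (++count >= limit) return; // stops pulling from upstream
  }
}

let visited = 0;
const source = [-1, 4, 0, 7, 2, 9, 5];
// Instrumented pass-through so we can count how many elements were read.
const counted = map(source, x => { visited++; return x; });

const firstThree = [...take(map(filter(counted, x => x > 0), x => x * 2), 3)];

console.log(firstThree); // [ 8, 14, 4 ]
console.log(visited);    // 5 (9 and 5 were never read)
```

No intermediate arrays are allocated; the only array is the final result. For small inputs the per-element generator overhead can outweigh that saving, so measure before adopting this style.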
Array Holes and Sparse Arrays
Arrays can have holes—missing indices that don’t hold a value.
const sparse = [1, , 3];
console.log(sparse.length); // 3
console.log(1 in sparse); // false
sparse.forEach(x => console.log(x)); // 1, 3 (skips hole)
sparse.map(x => x * 2); // [2, <1 empty item>, 6]
Holes vs. undefined:
const withUndefined = [1, undefined, 3];
console.log(1 in withUndefined); // true (has property)
const withHole = [1, , 3];
console.log(1 in withHole); // false (no property)
Most iteration methods skip holes, but some (like map) preserve them in results.
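Not every construct skips holes, though. The short sketch below contrasts forEach with the iterator-based constructs (for...of and Array.from), which read every index and therefore report a hole as undefined.

```javascript
const withHole = [1, , 3];

const seenByForEach = [];
withHole.forEach(x => seenByForEach.push(x));  // callback is not invoked for the hole

const seenByForOf = [];
for (const x of withHole) seenByForOf.push(x); // the iterator reads index 1 and gets undefined

const copied = Array.from(withHole);           // the hole becomes a real undefined element
const ownIndices = Object.keys(withHole);      // only indices that actually exist

console.log(seenByForEach); // [ 1, 3 ]
console.log(seenByForOf);   // [ 1, undefined, 3 ]
console.log(1 in copied);   // true
console.log(ownIndices);    // [ '0', '2' ]
```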
Typed Arrays: Fixed-Size Binary Data Views
Motivation: Performance and Binary Data
From the extracted text (wasm-defguide.pdf):
“JavaScript is a flexible and dynamic language, but it has not historically made it easy or efficient to deal with individual bytes of large data sets. This complicates the use of low-level libraries, as the data has to be copied into and out of JavaScript-native formats, which is inefficient.”
Typed Arrays were introduced for WebGL and now power:
Canvas 2D
XMLHttpRequest2
File API
WebSockets (binary)
WebAssembly memory
The ArrayBuffer: Raw Binary Data
Specification: §25.1 (pp. 636-638)
An ArrayBuffer represents a fixed-length raw
binary data buffer—just bytes in memory.
const buffer = new ArrayBuffer(16); // 16 bytes
console.log(buffer.byteLength); // 16
You cannot read or write an ArrayBuffer directly; you need a view.
Typed Array Views: Interpreting Bytes
Specification: §23.2 (pp. 585-618)
Typed arrays are views over an
ArrayBuffer, interpreting bytes according to a specific
type.
Available types:
| Type | Bytes per Element | C Equivalent | Range |
|---|---|---|---|
| Int8Array | 1 | int8_t | -128 to 127 |
| Uint8Array | 1 | uint8_t | 0 to 255 |
| Uint8ClampedArray | 1 | uint8_t (clamped) | 0 to 255 |
| Int16Array | 2 | int16_t | -32768 to 32767 |
| Uint16Array | 2 | uint16_t | 0 to 65535 |
| Int32Array | 4 | int32_t | -2^31 to 2^31 - 1 |
| Uint32Array | 4 | uint32_t | 0 to 2^32 - 1 |
| Float32Array | 4 | float | IEEE 754 single |
| Float64Array | 8 | double | IEEE 754 double |
| BigInt64Array | 8 | int64_t | -2^63 to 2^63 - 1 |
| BigUint64Array | 8 | uint64_t | 0 to 2^64 - 1 |
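Values outside an element type's range are converted on write rather than rejected. The sketch below shows the behavior you would expect from the corresponding C types: wrap-around for the plain integer views, saturation for Uint8ClampedArray, and rounding for Float32Array.

```javascript
const u8 = new Uint8Array(1);
u8[0] = 257;              // wraps modulo 2^8

const i8 = new Int8Array(1);
i8[0] = 200;              // 200 - 256 = -56

const clamped = new Uint8ClampedArray(2);
clamped[0] = 300;         // saturates at 255
clamped[1] = -5;          // saturates at 0

const u32 = new Uint32Array(1);
u32[0] = -1;              // wraps to 2^32 - 1

const f32 = new Float32Array(1);
f32[0] = 0.1;             // rounded to the nearest single-precision value

const i64 = new BigInt64Array(1);
i64[0] = 2n ** 63n;       // BigInt views wrap too: becomes -2^63

console.log(u8[0], i8[0]);            // 1 -56
console.log(clamped[0], clamped[1]);  // 255 0
console.log(u32[0]);                  // 4294967295
console.log(f32[0] === 0.1);          // false
console.log(i64[0]);                  // -9223372036854775808n
```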
Creating Typed Arrays
// From length (creates new buffer)
const u8 = new Uint8Array(10);
console.log(u8.length); // 10
console.log(u8.byteLength); // 10
console.log(u8.buffer); // ArrayBuffer(10)
// From array
const u32 = new Uint32Array([1, 2, 3]);
// From existing buffer
const buffer = new ArrayBuffer(16);
const view1 = new Uint32Array(buffer); // 4 elements (16 / 4)
const view2 = new Uint8Array(buffer); // 16 elements (16 / 1)
Example: Multiple Views of Same Buffer
From the extracted text (wasm-defguide.pdf, page 68):
var u32arr = new Uint32Array(10);
u32arr[0] = 257;
var u32buf = u32arr.buffer;
var u8arr = new Uint8Array(u32buf);
console.log(u32arr); // Uint32Array(10) [ 257, 0, 0, 0, ... ]
console.log(u8arr); // Uint8Array(40) [ 1, 1, 0, 0, 0, 0, ... ]
Why does 257 appear as [1, 1, 0, 0]?
The number 257 in binary: 0000 0001 0000 0001 (two bytes).
Little-endian (most common on x86/x64):
Least significant byte stored first
257 = 0x0101 → stored as [0x01, 0x01, 0x00, 0x00]
Visual representation:
Uint32Array index 0: 257
             [ byte 0 ][ byte 1 ][ byte 2 ][ byte 3 ]
Uint8Array:  [    1   ][    1   ][    0   ][    0   ]
Binary:      [00000001][00000001][00000000][00000000]
The Uint8Array view shows the raw
bytes underlying the Uint32Array.
Endianness: Little-Endian vs. Big-Endian
From the extracted text (wasm-defguide.pdf, page 69):
“In this case, a little endian system stores the least significant bytes first (the 1s). A big endian system would store the 0s first. In the grand scheme of things, it does not matter how they are stored, but different systems and protocols will pick one or the other.”
Little-endian (x86, ARM in little mode):
- Number 0x12345678 stored as:
[78, 56, 34, 12]
Big-endian (network byte order, some ARM modes):
- Number 0x12345678 stored as:
[12, 34, 56, 78]
Systems consideration: When interfacing with binary protocols or file formats, you must know the expected endianness. JavaScript Typed Arrays use the platform’s native endianness (usually little-endian).
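Because typed arrays expose the native byte order, two views over one buffer are enough to detect it at runtime. A minimal sketch, with isLittleEndian as a helper name chosen here:

```javascript
function isLittleEndian() {
  // Write a 16-bit value whose two bytes differ, then look at byte 0.
  const probe = new Uint16Array([0x0102]);
  const bytes = new Uint8Array(probe.buffer);
  return bytes[0] === 0x02; // low-order byte first means little-endian
}

const little = isLittleEndian();

// Cross-check: DataView with the detected order must read the value back.
const check = new DataView(new Uint16Array([0x0102]).buffer).getUint16(0, little);

console.log(little ? "little-endian" : "big-endian");
console.log(check === 0x0102); // true
```

In practice you rarely branch on this result; when a format fixes the byte order, it is simpler to read and write through DataView with an explicit flag.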
DataView: Explicit Endianness Control
Specification: §25.3 (pp. 643-647)
DataView allows reading/writing values with
explicit endianness.
const buffer = new ArrayBuffer(4);
const view = new DataView(buffer);
// Write 32-bit integer in big-endian
view.setUint32(0, 0x12345678, false); // false = big-endian
// Read as bytes
const u8 = new Uint8Array(buffer);
console.log(u8); // [0x12, 0x34, 0x56, 0x78]
// Read back as little-endian
view.getUint32(0, true); // 0x78563412
// Read back as big-endian
view.getUint32(0, false); // 0x12345678
DataView methods:
// Getters: getInt8, getUint8, getInt16, getUint16, getInt32, getUint32,
// getFloat32, getFloat64, getBigInt64, getBigUint64
// Setters: setInt8, setUint8, etc.
view.getInt16(byteOffset, littleEndian);
view.setFloat32(byteOffset, value, littleEndian);
Use cases:
Parsing binary file formats (BMP, WAV, etc.)
Network protocols
WebAssembly memory interaction
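As a sketch of the parsing use case, the code below round-trips a small record header. The 10-byte layout is invented purely for illustration (it is not BMP, WAV, or any real format) and deliberately mixes byte orders and an unaligned 32-bit field, both of which DataView handles.

```javascript
// Hypothetical 10-byte record header (not a real file format):
//   bytes 0-1  magic   uint16,  big-endian
//   bytes 2-5  length  uint32,  little-endian (unaligned on purpose)
//   bytes 6-9  scale   float32, little-endian
function writeHeader({ magic, length, scale }) {
  const buffer = new ArrayBuffer(10);
  const view = new DataView(buffer);
  view.setUint16(0, magic, false);
  view.setUint32(2, length, true);
  view.setFloat32(6, scale, true);
  return buffer;
}

function readHeader(buffer) {
  const view = new DataView(buffer);
  return {
    magic: view.getUint16(0, false),
    length: view.getUint32(2, true),
    scale: view.getFloat32(6, true)
  };
}

const encoded = writeHeader({ magic: 0xCAFE, length: 1024, scale: 0.5 });
const decoded = readHeader(encoded);
const firstSix = [...new Uint8Array(encoded, 0, 6)];

console.log(firstSix); // [ 202, 254, 0, 4, 0, 0 ]
console.log(decoded);  // { magic: 51966, length: 1024, scale: 0.5 }
```

A Uint32Array could not be created at byte offset 2 because typed array views must be aligned to their element size; DataView has no such restriction.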
Typed Arrays and WebAssembly Memory
WebAssembly Linear Memory
Specification: WebAssembly Core Spec §4.2.8 (pp. 37-38)
WebAssembly modules have a linear memory—a contiguous, resizable array of bytes starting at address 0.
Accessing from JavaScript:
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64KB
const buffer = memory.buffer; // ArrayBuffer
const u8 = new Uint8Array(buffer);
u8[0] = 42;
const i32 = new Int32Array(buffer);
i32[1] = 0x12345678;
// Grow memory
memory.grow(1); // Add 1 page (64KB)
// Note: grow() detaches the old buffer; re-read memory.buffer afterwards
Memory growth invalidates old buffer references:
const oldBuffer = memory.buffer;
memory.grow(1);
// oldBuffer is now detached (oldBuffer.byteLength is 0)
const newBuffer = memory.buffer; // Must get new reference
Sharing Memory Between WebAssembly and JavaScript
Pattern: Allocate space in WebAssembly memory, pass offset to JS, manipulate via Typed Array.
// WebAssembly exports memory
const wasmModule = await WebAssembly.instantiate(wasmBytes);
const { memory, allocate, process } = wasmModule.instance.exports;
// Allocate 1024 bytes in Wasm memory
const ptr = allocate(1024);
// View as Uint8Array
const u8 = new Uint8Array(memory.buffer, ptr, 1024);
// Write data
u8.set([1, 2, 3, 4]);
// Call Wasm function to process data
process(ptr, 1024);
This pattern avoids copying data between JavaScript and WebAssembly.
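One caveat with this pattern is that any view created before the memory grows becomes unusable afterwards. A defensive sketch, assuming a standalone WebAssembly.Memory and a hypothetical bytesAt helper, is to construct views from memory.buffer at the point of use instead of caching them:

```javascript
const PAGE_SIZE = 64 * 1024;
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// Hypothetical helper: always build the view from the current memory.buffer.
function bytesAt(ptr, length) {
  return new Uint8Array(memory.buffer, ptr, length);
}

const stale = bytesAt(0, 4);
stale.set([1, 2, 3, 4]);

const previousPages = memory.grow(1); // returns the size before growing, in pages

const fresh = bytesAt(0, 4);          // re-acquired after the grow

console.log(previousPages);                        // 1
console.log(stale.byteLength);                     // 0 (its buffer was detached)
console.log([...fresh]);                           // [ 1, 2, 3, 4 ]
console.log(memory.buffer.byteLength / PAGE_SIZE); // 2
```

The contents survive the grow; only the JavaScript-side handles are invalidated. Any exported function that may allocate can trigger growth, so re-acquiring views after each call into the module is the cautious choice.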
SharedArrayBuffer and Atomics: True Parallelism
SharedArrayBuffer: Memory Shared Between Workers
Specification: §25.2 (pp. 638-643)
SharedArrayBuffer allows multiple workers to
access the same memory—enabling true parallelism.
// Main thread
const sab = new SharedArrayBuffer(1024);
const worker = new Worker('worker.js');
worker.postMessage(sab);
// worker.js
self.onmessage = (e) => {
const sab = e.data;
const u32 = new Uint32Array(sab);
u32[0] = 42; // Visible to main thread
};
Security note: Browsers temporarily disabled SharedArrayBuffer after the Spectre/Meltdown disclosures. It is now available only on cross-origin isolated pages, which requires:
HTTPS (a secure context)
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
Atomics: Safe Concurrent Access
Specification: §25.4 (pp. 647-659)
Without synchronization, concurrent reads/writes cause data races.
Atomics operations:
| Method | Description |
|---|---|
| Atomics.load(ta, index) | Atomic read |
| Atomics.store(ta, index, value) | Atomic write |
| Atomics.add(ta, index, value) | Atomic add, return old value |
| Atomics.sub(ta, index, value) | Atomic subtract |
| Atomics.and(ta, index, value) | Atomic bitwise AND |
| Atomics.or(ta, index, value) | Atomic bitwise OR |
| Atomics.xor(ta, index, value) | Atomic bitwise XOR |
| Atomics.exchange(ta, index, value) | Atomic swap |
| Atomics.compareExchange(ta, index, expected, replacement) | CAS operation |
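The read-modify-write operations all return the value that was in the cell before the operation, which is what makes them composable. The single-threaded sketch below shows those return values and then builds a very small try-lock on compareExchange; tryLock and unlock are hypothetical helpers, and a production lock would also need Atomics.wait/notify to avoid spinning.

```javascript
const shared = new SharedArrayBuffer(8);
const cells = new Int32Array(shared);

const beforeAdd = Atomics.add(cells, 0, 5);                // 0; cell is now 5
const beforeSwap = Atomics.exchange(cells, 0, 9);          // 5; cell is now 9
const failedCas = Atomics.compareExchange(cells, 0, 5, 1); // 9; no change (expected 5)
const okCas = Atomics.compareExchange(cells, 0, 9, 1);     // 9; cell is now 1

// Hypothetical try-lock built on CAS: 0 = unlocked, 1 = locked.
function tryLock(ta, index) {
  return Atomics.compareExchange(ta, index, 0, 1) === 0;
}
function unlock(ta, index) {
  Atomics.store(ta, index, 0);
}

const first = tryLock(cells, 1);   // acquires the lock
const second = tryLock(cells, 1);  // fails: already held
unlock(cells, 1);
const third = tryLock(cells, 1);   // succeeds again

console.log(beforeAdd, beforeSwap, failedCas, okCas); // 0 5 9 9
console.log(Atomics.load(cells, 0));                  // 1
console.log(first, second, third);                    // true false true
```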
Wait/notify for coordination:
// Worker 1: Wait for signal
const i32 = new Int32Array(sab);
Atomics.wait(i32, 0, 0); // Sleep if i32[0] is still 0, until notified
console.log('Woken up!');
// Worker 2: Send signal
Atomics.store(i32, 0, 1);
Atomics.notify(i32, 0, 1); // Wake 1 waiter
Example: Atomic counter:
const sab = new SharedArrayBuffer(4);
const counter = new Int32Array(sab);
// Multiple workers increment safely
Atomics.add(counter, 0, 1);
// Read final value
console.log(Atomics.load(counter, 0));
Array-like Objects and the Iterable Protocol
Array-like Objects
An array-like object has:
A length property
Indexed properties (0, 1, 2, …)
const arrayLike = {
0: 'a',
1: 'b',
2: 'c',
length: 3
};
// Convert to real array
const arr = Array.from(arrayLike);
console.log(arr); // ['a', 'b', 'c']
// Use array methods
Array.prototype.forEach.call(arrayLike, item => console.log(item));
Common array-like objects:
arguments (inside functions)
DOM NodeLists
Typed Arrays
The Iterable Protocol
Specification: §27.1 (pp. 661-671)
An object is iterable if it implements
Symbol.iterator, which returns an
iterator.
const arr = [1, 2, 3];
const iterator = arr[Symbol.iterator]();
console.log(iterator.next()); // { value: 1, done: false }
console.log(iterator.next()); // { value: 2, done: false }
console.log(iterator.next()); // { value: 3, done: false }
console.log(iterator.next()); // { value: undefined, done: true }
Custom iterable:
const range = {
start: 1,
end: 5,
[Symbol.iterator]() {
let current = this.start;
const end = this.end;
return {
next() {
if (current <= end) {
return { value: current++, done: false };
}
return { done: true };
}
};
}
};
for (const n of range) {
console.log(n); // 1, 2, 3, 4, 5
}
Performance Considerations
Array Method Overhead
Chaining array methods creates intermediate arrays:
// Three arrays allocated: two intermediates plus the final result
const result = arr
.filter(x => x > 0)
.map(x => x * 2)
.slice(0, 10);
Optimization: Use single-pass iteration when possible:
const result = [];
for (const x of arr) {
if (x > 0) {
result.push(x * 2);
if (result.length === 10) break;
}
}
Typed Array Benefits
Performance advantages:
Fixed size: No reallocation overhead
Uniform element type: Every element has the same numeric type, so no per-element type checks are needed (values are coerced on write)
Memory efficient: Compact storage
Cache friendly: Contiguous memory layout
Native optimization: JIT can generate specialized code
// Often slower: the array grows dynamically and elements may be boxed
const arr = [];
for (let i = 0; i < 1000000; i++) arr.push(i);
// Often faster: preallocated, unboxed, contiguous memory
const ta = new Uint32Array(1000000);
for (let i = 0; i < 1000000; i++) ta[i] = i;
When to Use Typed Arrays
Use Typed Arrays when:
Working with binary data (files, network, WebAssembly)
Need fixed-size, homogeneous numeric data
Performance is critical
Interfacing with native APIs (WebGL, Canvas, Audio)
Use regular Arrays when:
Need dynamic sizing
Mixed types
Length-changing methods (push, pop, splice), which typed arrays lack
Readability over raw performance
Summary
Arrays are JavaScript’s flexible, dynamic ordered collections:
Automatic length management
Rich suite of transformation methods
Support for sparse arrays and holes
Based on prototype inheritance from Array.prototype
Typed Arrays provide high-performance, fixed-size binary data views:
Multiple views (Uint8Array, Float32Array, etc.) over ArrayBuffer
Essential for WebGL, Canvas, WebAssembly, and binary protocols
Endianness matters for multi-byte values
DataView offers explicit endianness control
SharedArrayBuffer and Atomics enable true multi-threaded parallelism:
Workers share memory via SharedArrayBuffer
Atomics prevent data races through synchronized operations
Requires a secure context and cross-origin isolation headers (COOP and COEP)
Performance insights:
Regular arrays optimize for flexibility
Typed arrays optimize for raw speed and memory efficiency
Choose based on your use case: dynamic vs. fixed, mixed vs. homogeneous, JS-only vs. native interop
Together, these mechanisms provide JavaScript with both high-level ergonomics and low-level control over memory—bridging the gap between scripting convenience and systems programming performance.
Chapter 7: Modules, Imports, and Code Organization
The Evolution of JavaScript Modules
Pre-Module Era: Script Tags and Global Scope
Before ES6 (2015), JavaScript had no native module system. Code organization relied on:
Multiple <script> tags with global variables
Immediately Invoked Function Expressions (IIFE) for encapsulation
Community solutions: CommonJS (Node.js) and AMD (RequireJS)
The global namespace problem:
// file1.js
var counter = 0;
function increment() { counter++; }
// file2.js
var counter = 10; // Collision! Overwrites file1's counter
function increment() { /* different implementation */ } // Collision!
HTML loading order dependency:
<script src="library.js"></script>
<script src="plugin.js"></script> <!-- Must load AFTER library -->
<script src="app.js"></script> <!-- Must load AFTER plugin -->
Loading order errors were common and hard to debug.
IIFE Pattern: Manual Encapsulation
// Module pattern using IIFE
var myModule = (function() {
// Private variables
var privateVar = 'secret';
function privateFunction() {
return privateVar;
}
// Public API
return {
publicMethod: function() {
return privateFunction();
}
};
})();
myModule.publicMethod(); // 'secret'
myModule.privateVar; // undefined (encapsulated)
Limitations:
No dependency management
Manual dependency ordering
No static analysis
No tree-shaking
Verbose syntax
CommonJS: Node.js Module System
Specification: Not part of ECMA-262 (Node.js specific)
// math.js
function add(a, b) {
return a + b;
}
module.exports = { add };
// or: exports.add = add;
// app.js
const math = require('./math');
console.log(math.add(2, 3)); // 5
Characteristics:
Synchronous loading: Blocks until module loaded (fine for server, bad for browser)
Dynamic imports: require() can be called conditionally
Runtime resolution: Dependencies resolved during execution
Single export object: module.exports or exports
Module caching:
// counter.js
let count = 0;
module.exports = {
increment: () => ++count,
get: () => count
};
// app.js
const counter1 = require('./counter');
const counter2 = require('./counter');
counter1.increment();
console.log(counter2.get()); // 1 (same instance!)
Modules are cached after first load; subsequent require() calls return the same object.
ES6 Modules: Native JavaScript Module System
Basic Syntax and Semantics
Specification: §16.2 (pp. 377-396)
ES6 modules use static import/export syntax with these key features:
File-based: Each file is a separate module
Strict mode by default: All module code runs in strict mode
Top-level scope: Variables don’t leak to global
Static structure: Imports/exports must be at top level (not in blocks)
Asynchronous loading: Designed for browsers
Basic export:
// math.js
export function add(a, b) {
return a + b;
}
export function subtract(a, b) {
return a - b;
}
export const PI = 3.14159;
Basic import:
// app.js
import { add, subtract, PI } from './math.js';
console.log(add(2, 3)); // 5
console.log(subtract(5, 2)); // 3
console.log(PI); // 3.14159
Named Exports vs. Default Export
Named Exports
// utils.js
export function helper1() { }
export function helper2() { }
export const CONFIG = { };
// app.js
import { helper1, helper2, CONFIG } from './utils.js';
Renaming on export:
function internalName() { }
export { internalName as publicName };
Renaming on import:
import { helper1 as h1, helper2 as h2 } from './utils.js';
Default Export
Each module can have one default export:
// logger.js
export default function log(message) {
console.log(message);
}
// app.js
import log from './logger.js'; // No braces
log('Hello');
// Can use any name
import myLogger from './logger.js';
myLogger('Hello');
Default + named exports:
// module.js
export default function main() { }
export function helper() { }
// app.js
import main, { helper } from './module.js';
Default export gotchas:
// ❌ Invalid syntax
export default const x = 1;
// ✅ Valid alternatives
const x = 1;
export default x;
// or
export default 1;
Import Variations and Patterns
Namespace Import
Import all exports as a single object:
// math.js
export const PI = 3.14159;
export function add(a, b) { return a + b; }
export function multiply(a, b) { return a * b; }
// app.js
import * as math from './math.js';
console.log(math.PI); // 3.14159
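console.log(math.add(2, 3)); // 5
Systems insight: Namespace imports keep call sites organized, and because member accesses such as math.add are statically visible, most bundlers can still tree-shake the exports that are never referenced.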
Re-exporting
Create barrel files (index.js) to aggregate exports:
// components/Button.js
export default function Button() { }
// components/Input.js
export default function Input() { }
// components/index.js (barrel)
export { default as Button } from './Button.js';
export { default as Input } from './Input.js';
// app.js
import { Button, Input } from './components/index.js';
Re-export all:
// Re-export everything from another module
export * from './other-module.js';
// Re-export everything as namespace
export * as utils from './utils.js';
Side-effect Imports
Import for side effects only (no bindings):
// polyfill.js
if (!Array.prototype.includes) {
Array.prototype.includes = function(item) {
return this.indexOf(item) !== -1;
};
}
// app.js
import './polyfill.js'; // Execute but don't import anything
Use cases:
Polyfills
Global CSS imports
Registering web components
Database connection initialization
Live Bindings: A Key Semantic Difference
Specification: §16.2.1.5 (pp. 381-383)
Unlike CommonJS (which exports values), ES6 modules export live bindings—references to the exported variables.
// counter.js (ES6 module)
export let count = 0;
export function increment() {
count++;
}
// app.js
import { count, increment } from './counter.js';
console.log(count); // 0
increment();
console.log(count); // 1 (automatically updated!)
// ❌ Cannot assign to an imported binding
count = 5; // TypeError at runtime in native engines; many bundlers reject it at build time
Contrast with CommonJS:
// counter.js (CommonJS)
let count = 0;
module.exports = {
count,
increment() { count++; }
};
// app.js
const counter = require('./counter');
console.log(counter.count); // 0
counter.increment();
console.log(counter.count); // 0 (NOT updated! Value was copied)
Systems insight: Live bindings enable:
Circular dependencies to work correctly
Hot module replacement (HMR)
Better tree-shaking (bundlers can trace usage)
Module Loading and Resolution
Browser Module Loading
Specification: HTML spec §8.1.4 (not in ECMA-262)
<!-- Load as module -->
<script type="module" src="./app.js"></script>
<!-- Inline module -->
<script type="module">
import { render } from './render.js';
render();
</script>
Module script behavior:
Deferred by default: Wait for HTML parsing (like defer attribute)
CORS required: Cross-origin modules need CORS headers
Strict mode: Always runs in strict mode
Top-level await: Allowed (blocks dependent modules)
No document.write(): Would break deferred loading
Module resolution algorithm:
1. Normalize module specifier
- Relative: ‘./module.js’, ‘../utils.js’
- Absolute: ‘/lib/module.js’
- URL: ‘https://cdn.example.com/lib.js’
- Bare specifier: ‘lodash’ (requires import map)
2. Check module cache
- If cached, return cached module
3. Fetch module source
- Parse as module code
- Recursively fetch dependencies
4. Instantiate module
- Create module environment
- Bind exports
5. Execute module code
- Run module body once
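The first two steps can be sketched with the standard URL class. This is a simplified illustration, not the algorithm from the HTML specification: resolveSpecifier and loadOnce are hypothetical helpers, the import map is a flat object, and trailing-slash prefixes and scopes are not handled.

```javascript
// Step 1 (simplified): turn a specifier into an absolute URL string.
function resolveSpecifier(specifier, baseURL, importMap = {}) {
  if (specifier.startsWith("./") || specifier.startsWith("../") || specifier.startsWith("/")) {
    return new URL(specifier, baseURL).href; // relative or absolute path
  }
  try {
    return new URL(specifier).href;          // already a full URL
  } catch {
    // Not parseable as a URL, so treat it as a bare specifier.
  }
  if (Object.prototype.hasOwnProperty.call(importMap, specifier)) {
    return new URL(importMap[specifier], baseURL).href;
  }
  throw new TypeError(`Bare specifier "${specifier}" is not mapped`);
}

// Step 2 (simplified): one cache entry per resolved URL, evaluated at most once.
const moduleCache = new Map();
function loadOnce(url, evaluate) {
  if (!moduleCache.has(url)) moduleCache.set(url, evaluate(url));
  return moduleCache.get(url);
}

const base = "https://example.com/app/main.js";
const map = { lodash: "/node_modules/lodash-es/lodash.js" };

console.log(resolveSpecifier("./module.js", base));   // https://example.com/app/module.js
console.log(resolveSpecifier("../utils.js", base));   // https://example.com/utils.js
console.log(resolveSpecifier("/lib/module.js", base)); // https://example.com/lib/module.js
console.log(resolveSpecifier("lodash", base, map));   // https://example.com/node_modules/lodash-es/lodash.js
```

Keying the cache by resolved URL rather than by specifier is what lets './a.js' and '../app/a.js' share a single module instance.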
Import Maps: Bare Specifier Resolution
Specification: HTML spec (WICG proposal, now baseline)
<script type="importmap">
{
"imports": {
"lodash": "/node_modules/lodash-es/lodash.js",
"jquery": "https://cdn.jsdelivr.net/npm/jquery@3/dist/jquery.min.js",
"utils/": "/src/utils/"
}
}
</script>
<script type="module">
import _ from 'lodash'; // Resolves to /node_modules/...
import $ from 'jquery'; // Resolves to CDN
import { helper } from 'utils/helper.js'; // Resolves to /src/utils/helper.js
</script>
Scoped imports (different resolution per path):
{
"imports": {
"lodash": "/node_modules/lodash-es/lodash.js"
},
"scopes": {
"/legacy/": {
"lodash": "/node_modules/lodash/lodash.js"
}
}
}
Node.js Module Resolution
Algorithm (simplified):
Core modules: fs, path, http (built-in)
Relative/absolute paths: Resolve directly
Bare specifiers: Search node_modules/
- Current directory’s node_modules/
- Parent directory’s node_modules/
- Recursively up to filesystem root
File/directory resolution:
// require('./module') (CommonJS; ES module imports must spell out the full file name)
// Tries in order:
// 1. ./module.js
// 2. ./module.json
// 3. ./module.node
// 4. ./module/package.json (check "main" field)
// 5. ./module/index.js
ES Modules in Node.js (v12.17+):
.mjs extension: Always treated as ES module
.js extension: Check nearest package.json for "type": "module"
.cjs extension: Always treated as CommonJS
// package.json
{
"type": "module", // All .js files are ES modules
"exports": {
".": "./src/index.js",
"./utils": "./src/utils.js"
}
}
Dynamic Imports: Runtime Module Loading
import() Expression
Specification: §13.3.10 (pp. 258-259)
Dynamic imports return a Promise that resolves to the module namespace:
// Static import (top-level only)
import { helper } from './utils.js';
// Dynamic import (anywhere, returns Promise)
import('./utils.js').then(module => {
module.helper();
});
// With async/await
async function load() {
const module = await import('./utils.js');
module.helper();
}
Use cases:
1. Code Splitting (Lazy Loading)
async function showAdmin() {
const { AdminPanel } = await import('./admin-panel.js');
// admin-panel.js only loaded when needed
return new AdminPanel();
}
document.getElementById('admin-btn').onclick = showAdmin;
2. Conditional Loading
async function loadPolyfills() {
if (!('IntersectionObserver' in window)) {
await import('./intersection-observer-polyfill.js');
}
}
3. Computed Module Paths
const locale = navigator.language;
const translations = await import(`./i18n/${locale}.js`);
4. Dynamic Feature Detection
async function getRenderer() {
if (supportsWebGL()) {
return import('./webgl-renderer.js');
} else {
return import('./canvas-renderer.js');
}
}
Error handling:
try {
const module = await import('./might-not-exist.js');
} catch (err) {
console.error('Module load failed:', err);
// Fallback logic
}
Systems insight: Bundlers (webpack, Rollup) use dynamic imports as split points to create separate chunks, enabling:
Faster initial load (smaller main bundle)
Parallel loading (chunks load independently)
Better caching (unchanged chunks stay cached)
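Application code often wraps a split point in a small helper so that repeated triggers (say, several clicks on the same button) share one in-flight load. The sketch below is one possible shape: lazy is a hypothetical helper, and fakeImport stands in for () => import('./admin-panel.js') so the example runs without a second file.

```javascript
// Wrap a loader so concurrent and repeated calls share one Promise.
function lazy(loader) {
  let pending = null;
  return function load() {
    if (pending === null) {
      pending = loader().catch(err => {
        pending = null; // allow a retry after a failed load
        throw err;
      });
    }
    return pending;
  };
}

let calls = 0;
// Stand-in for `() => import('./admin-panel.js')`; the module shape is made up.
const fakeImport = () => {
  calls++;
  return Promise.resolve({ AdminPanel: class AdminPanel {} });
};

const loadAdmin = lazy(fakeImport);
const first = loadAdmin();
const second = loadAdmin();

console.log(first === second, calls); // true 1
first.then(({ AdminPanel }) => console.log(typeof AdminPanel)); // "function"
```

The engine already caches successfully loaded modules, so the wrapper mainly helps when the loader does extra work around import() or when you want an explicit retry policy after failures.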
Module Patterns and Best Practices
The Singleton Pattern
Modules execute once and cache their state:
// database.js
let connection = null;
export function connect(config) {
if (!connection) {
connection = new DatabaseConnection(config);
}
return connection;
}
export function query(sql) {
return connection.execute(sql);
}
Multiple imports get the same instance:
// file1.js
import { connect } from './database.js';
connect({ host: 'localhost' });
// file2.js
import { query } from './database.js';
query('SELECT * FROM users'); // Uses connection from file1
The Factory Pattern
Export functions that create instances:
// logger.js
export function createLogger(name) {
return {
log(message) {
console.log(`[${name}] ${message}`);
}
};
}
// app.js
import { createLogger } from './logger.js';
const userLogger = createLogger('User');
const authLogger = createLogger('Auth');
userLogger.log('Created'); // [User] Created
authLogger.log('Login'); // [Auth] Login
Dependency Injection via Modules
// services.js
let db = null;
let cache = null;
export function configure(dependencies) {
db = dependencies.db;
cache = dependencies.cache;
}
export function getUserById(id) {
const cached = cache.get(id);
if (cached) return cached;
const user = db.query('SELECT * FROM users WHERE id = ?', id);
cache.set(id, user);
return user;
}
// main.js
import { configure } from './services.js';
import { createDB } from './database.js';
import { createCache } from './cache.js';
configure({
db: createDB(),
cache: createCache()
});
Circular Dependencies
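ES6 modules handle circular dependencies via live bindings, but a binding can only be read after the exporting module has initialized it:
// a.js (entry point)
import { b } from './b.js';
export const a = 'A';
console.log(b); // 'B' (b.js has already finished executing)
// b.js
import { a } from './a.js';
export const b = 'B';
console.log(a); // ReferenceError: a.js has not run yet, so a is still uninitialized
Execution order:
Start loading a.js
Encounter import of b.js, start loading it
b.js imports a.js (already loading, continue)
b.js executes first: it exports b, then accesses a (binding exists but is not yet initialized, so a top-level read throws ReferenceError)
If b.js does not read a at the top level, control returns to a.js, which exports a and accesses b (now initialized)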
Best practice: Avoid circular dependencies when possible. If necessary:
Use functions (not top-level code) to access circular imports
Ensure initialization order doesn’t matter
// ✅ Safe circular dependency
// a.js
import { getB } from './b.js';
export function getA() { return 'A'; }
console.log(getB());
// b.js
import { getA } from './a.js';
export function getB() { return 'B'; }
console.log(getA());
Module Bundling and Build Tools
Why Bundling?
Problems with native modules in production:
Too many HTTP requests: Each import = separate request
No minification: Unoptimized source code
No transpilation: Can’t use newer syntax on older browsers
No tree-shaking: Dead code included
Bundlers solve these by:
Concatenating modules into fewer files
Resolving dependencies statically
Eliminating dead code (tree-shaking)
Minifying output
Code splitting for optimal loading
Tree-Shaking: Dead Code Elimination
Specification: Not in ECMA-262 (bundler feature)
Tree-shaking removes unused exports by analyzing static imports:
// utils.js
export function used() { }
export function unused() { }
// app.js
import { used } from './utils.js';
used();
// Bundled output (unused eliminated)
function used() { }
used();
Why static imports matter:
// ✅ Tree-shakeable (static)
import { func } from './module.js';
// ❌ Not tree-shakeable (dynamic)
const module = require('./module.js');
const { func } = module;
// ❌ Not valid at all: import declarations may only appear at the top level (SyntaxError)
if (condition) {
import { func } from './module.js';
}
Best practices for tree-shaking:
Use named exports (not default)
Avoid namespace imports (import * as)
Mark side-effect-free in package.json:
{
"sideEffects": false
}
Or specify files with side effects:
{
"sideEffects": ["./src/polyfills.js", "*.css"]
}
Common Bundlers
Webpack: Full-featured, complex configuration
Entry/output configuration
Loaders for non-JS assets
Plugins for optimization
Code splitting via dynamic imports
Rollup: Optimized for libraries
Better tree-shaking
Smaller output
Multiple output formats (ESM, CJS, UMD)
esbuild: Extremely fast (written in Go)
Reported by its authors as 10-100× faster than webpack
Built-in TypeScript support
Minimal configuration
Vite: Modern dev server + Rollup for production
Native ES modules in development
Instant server start
Hot Module Replacement (HMR)
Module Scope and this
Top-Level this is undefined
Specification: §16.1.7 (pp. 375-376)
// module.js
console.log(this); // undefined
// script.js (non-module)
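console.log(this); // window in a browser script; module.exports ({}) in a Node.js CommonJS module
Why? Module code is always strict, and the specification additionally gives module scope no this binding of its own, so top-level this evaluates to undefined and a module cannot reach the global object by accident. Strict mode alone does not do this: a strict classic script still sees the global object as its top-level this.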
Variables Don’t Leak to Global
// module.js
var x = 1;
let y = 2;
const z = 3;
// None of these leak to window/global
console.log(window.x); // undefined
Contrast with scripts:
// script.js
var x = 1;
console.log(window.x); // 1 (leaked to global!)
Top-Level await
Specification: §16.1.7 (pp. 375-376)
Modules can use await at the top level:
// data.js
export const users = await fetch('/api/users').then(r => r.json());
// app.js
import { users } from './data.js';
console.log(users); // Waits for data.js to finish loading
Execution model:
Module starts loading
Reaches await, pauses execution
Dependent modules wait for this module
Once resolved, continues execution
Use cases:
Fetch configuration before app starts
Load translations
Initialize database connections
Caution: Blocks dependent modules—use sparingly for critical resources only.
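Because dependents are blocked until the awaiting module finishes, the order in which several resources are awaited matters. The sketch below starts every task before waiting, so dependents are delayed by roughly the slowest task rather than by the sum of all of them. The helper name `loadAll` and the URLs in the usage comment are assumptions made for illustration.

```javascript
// Start every task immediately, then wait for all of them together.
// `tasks` maps a name to a zero-argument function returning a Promise.
async function loadAll(tasks) {
  const names = Object.keys(tasks);
  const values = await Promise.all(names.map((name) => tasks[name]()));
  return Object.fromEntries(names.map((name, i) => [name, values[i]]));
}

// Possible use at the top level of a module (URLs are illustrative):
// export const { config, translations } = await loadAll({
//   config: () => fetch('/api/config').then((r) => r.json()),
//   translations: () => fetch('/api/i18n/en').then((r) => r.json())
// });
```

If any task rejects, `Promise.all` rejects as well, which makes the awaiting module (and everything that imports it) fail to load, so a fallback inside each task may be worth considering.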
Worker Modules
Specification: HTML spec (Web Workers)
Workers can load ES modules:
// main.js
const worker = new Worker('./worker.js', { type: 'module' });
worker.postMessage({ cmd: 'process', data: [1, 2, 3] });
// worker.js (ES module)
import { process } from './processor.js';
self.onmessage = (e) => {
const result = process(e.data.data);
self.postMessage(result);
};
Benefits:
Clean imports (no importScripts())
Static analysis
Same module syntax as main thread
Summary
JavaScript modules have evolved from global scripts to a sophisticated system:
Historical progression:
IIFE pattern: Manual encapsulation
CommonJS: Synchronous, dynamic, Node.js
ES6 modules: Static, asynchronous, native
ES6 module characteristics:
Static structure: Enables tree-shaking and static analysis
Live bindings: Exports are references, not values
Strict mode: Always enabled
Deferred execution: In browsers, like the defer attribute
Singleton by default: Module code runs once
Import/export patterns:
Named exports: Multiple exports per module
Default export: One primary export
Namespace import: Import all as object
Re-exports: Barrel files for aggregation
Side-effect imports: Execute without bindings
Dynamic imports:
Runtime loading: import() returns a Promise
Code splitting: Lazy load features
Conditional loading: Load based on runtime conditions
Build tools:
Bundlers: Webpack, Rollup, esbuild, Vite
Tree-shaking: Remove unused code
Code splitting: Multiple output chunks
Best practices:
Use named exports for tree-shaking
Avoid circular dependencies
Mark side effects explicitly
Use dynamic imports for code splitting
Prefer static imports for core dependencies
The module system bridges JavaScript’s dynamic nature with the performance and tooling benefits of static analysis—enabling both developer productivity and runtime efficiency.
Chapter 8: The Browser Environment (Beyond ECMA-262)
The Boundary Between ECMAScript and the Web Platform
What ECMA-262 Does NOT Specify
Important distinction: ECMA-262 defines the JavaScript language, not the browser environment. The following are NOT part of the ECMAScript specification:
window, document, navigator
DOM APIs (getElementById, querySelector, etc.)
setTimeout, setInterval, requestAnimationFrame
fetch, XMLHttpRequest
localStorage, sessionStorage, IndexedDB
console (though engines implement it)
Browser events (click, load, resize, etc.)
Canvas, WebGL, Web Audio
alert, confirm, prompt
These are defined by:
HTML Standard (WHATWG)
W3C Web APIs
CSSWG (CSS Object Model)
Browser vendor extensions
Why this matters: Understanding the boundary helps explain:
Why Node.js lacks window but has global
Why Deno 1.x exposed window (removed in Deno 2) but never document
Why “JavaScript” behaves differently across environments
The Global Object in Browsers
window: The Browser’s Global Object
HTML Standard §8.1.1 (not ECMA-262)
In browsers, the global object is window:
// All equivalent
var x = 1;
window.x = 1;
this.x = 1; // In non-module, non-strict code
console.log(window.x); // 1
window properties:
window.document // The DOM
window.location // Current URL
window.history // Navigation history
window.navigator // Browser info
window.screen // Screen dimensions
window.localStorage // Persistent storage
window.sessionStorage // Session storage
window.console // Developer console
window.fetch // HTTP requests
window.setTimeout // Timer functions
window.addEventListener // Event registration
globalThis: Universal Global Access
ECMA-262 §19.3 (pp. 505-506)
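ES2020 introduced globalThis for platform-independent global access:
// globalThis itself works everywhere; what hangs off it depends on the host
globalThis.setTimeout // Defined in browsers and in Node.js
globalThis.global // Node.js only (global is an alias of globalThis there)
globalThis.window // Browsers only (window is the same object as globalThis)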
// Platform-specific checks
if (typeof window !== 'undefined') {
// Browser environment
} else if (typeof global !== 'undefined') {
// Node.js environment
}
// Better approach
if (typeof globalThis.document !== 'undefined') {
// Browser (DOM available)
}
Why globalThis?
Before ES2020, accessing the global object required:
// Unreliable cross-platform hack
const globalObj = (function() {
return this;
})() || (typeof window !== 'undefined' ? window : global);
// Now just:
const globalObj = globalThis;
self: Worker-Compatible Global
HTML Standard (Web Workers)
self refers to the global object in both the main thread and Workers:
// main.js (works)
self.addEventListener('load', () => {});
// worker.js (also works)
self.addEventListener('message', () => {});
// But window doesn't work in Workers:
// window.addEventListener('message', () => {}); // ❌ ReferenceError
Best practice: Use self in code that might run in Workers.
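The checks above can be folded into one function. This is a heuristic sketch rather than a robust detector: the function name `detectRuntime` and the returned labels are made up for illustration, and other hosts (Deno, Bun, embedded engines) may need extra cases. Passing the global object as a parameter keeps the function testable outside a browser.

```javascript
// Classify the runtime by probing for host-provided globals.
// `g` defaults to globalThis; a plain object can be passed in tests.
function detectRuntime(g = globalThis) {
  if (typeof g.document !== 'undefined') {
    return 'browser-main-thread'; // DOM is available
  }
  if (typeof g.importScripts === 'function') {
    return 'web-worker'; // worker global scopes expose importScripts
  }
  if (typeof g.process !== 'undefined' && g.process.versions && g.process.versions.node) {
    return 'node'; // Node.js (or a host that imitates its process object)
  }
  return 'unknown';
}

console.log(detectRuntime());
```

Feature probes like these are generally more reliable than comparing names such as window or global, because they test for the capability the code actually needs.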
The Document Object Model (DOM)
DOM Structure and Representation
DOM Standard (WHATWG)
The DOM is a tree representation of HTML:
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<div id="app">
<h1>Hello</h1>
<p class="intro">Text</p>
</div>
</body>
</html>
Tree structure:
Document
└── html (HTMLHtmlElement)
    ├── head (HTMLHeadElement)
    │   └── title (HTMLTitleElement)
    │       └── #text: “Page Title”
    └── body (HTMLBodyElement)
        └── div#app (HTMLDivElement)
            ├── h1 (HTMLHeadingElement)
            │   └── #text: “Hello”
            └── p.intro (HTMLParagraphElement)
                └── #text: “Text”
Node Types and Hierarchy
Node interface (base for all DOM nodes):
// Node types (constants)
Node.ELEMENT_NODE // 1
Node.ATTRIBUTE_NODE // 2 (deprecated)
Node.TEXT_NODE // 3
Node.CDATA_SECTION_NODE // 4
Node.PROCESSING_INSTRUCTION_NODE // 7
Node.COMMENT_NODE // 8
Node.DOCUMENT_NODE // 9
Node.DOCUMENT_TYPE_NODE // 10
Node.DOCUMENT_FRAGMENT_NODE // 11
// Example
const div = document.createElement('div');
console.log(div.nodeType); // 1 (ELEMENT_NODE)
Inheritance hierarchy:
EventTarget
└── Node
    ├── Element
    │   ├── HTMLElement
    │   │   ├── HTMLDivElement
    │   │   ├── HTMLSpanElement
    │   │   ├── HTMLInputElement
    │   │   └── …
    │   └── SVGElement
    ├── Text
    ├── Comment
    └── Document
        └── HTMLDocument
DOM Traversal
Properties for navigation:
const element = document.getElementById('app');
// Parent/child
element.parentNode // Parent node
element.parentElement // Parent element (null if parent is Document)
element.childNodes // NodeList (includes text nodes)
element.children // HTMLCollection (elements only)
element.firstChild // First node (may be text)
element.firstElementChild // First element
element.lastChild
element.lastElementChild
// Siblings
element.nextSibling // Next node
element.nextElementSibling // Next element
element.previousSibling
element.previousElementSibling
Example: Walking the tree:
function walkDOM(node, callback) {
callback(node);
node = node.firstChild;
while (node) {
walkDOM(node, callback);
node = node.nextSibling;
}
}
walkDOM(document.body, (node) => {
if (node.nodeType === Node.ELEMENT_NODE) {
console.log(node.tagName);
}
});
DOM Querying: Finding Elements
getElementById
const el = document.getElementById('app');
// Returns Element or null
// Only searches by ID attribute
getElementsByClassName
const elements = document.getElementsByClassName('intro');
// Returns live HTMLCollection
// Updates automatically when DOM changes
getElementsByTagName
const divs = document.getElementsByTagName('div');
// Returns live HTMLCollection
querySelector / querySelectorAll
Recommended approach (uses CSS selectors):
// Returns first match or null
const first = document.querySelector('.intro');
const firstDiv = document.querySelector('div');
const specific = document.querySelector('#app > p.intro');
// Returns static NodeList (not live)
const all = document.querySelectorAll('.intro');
const complex = document.querySelectorAll('div.container > p:not(.hidden)');
// Iterate NodeList
all.forEach(element => {
console.log(element);
});
// Or convert to array
const array = Array.from(all);
Live vs. static collections:
// Live HTMLCollection
const liveList = document.getElementsByClassName('item');
console.log(liveList.length); // 3
// Add new element
const newItem = document.createElement('div');
newItem.className = 'item';
document.body.appendChild(newItem);
console.log(liveList.length); // 4 (automatically updated!)
// Static NodeList
const staticList = document.querySelectorAll('.item');
console.log(staticList.length); // 4
const anotherItem = document.createElement('div');
anotherItem.className = 'item';
document.body.appendChild(anotherItem);
console.log(staticList.length); // Still 4 (snapshot)
DOM Manipulation
Creating Elements
// Create element
const div = document.createElement('div');
const text = document.createTextNode('Hello');
const comment = document.createComment('This is a comment');
// Set attributes
div.id = 'myDiv';
div.className = 'container';
div.setAttribute('data-value', '123');
// Add content
div.textContent = 'Hello'; // Escapes HTML
div.innerHTML = '<b>Bold</b>'; // Parses HTML (XSS risk!)
// Add to DOM
document.body.appendChild(div);
Modifying Elements
const element = document.getElementById('app');
// Content
element.textContent = 'New text';
element.innerHTML = '<p>HTML content</p>';
// Attributes
element.setAttribute('data-id', '123');
element.getAttribute('data-id'); // '123'
element.removeAttribute('data-id');
element.hasAttribute('data-id'); // false
// Classes
element.classList.add('active');
element.classList.remove('hidden');
element.classList.toggle('expanded');
element.classList.contains('active'); // true
element.classList.replace('old', 'new');
// Styles
element.style.color = 'red';
element.style.backgroundColor = 'blue';
element.style.fontSize = '16px';
// Better: Use CSS classes instead
element.classList.add('highlighted');
Inserting Elements
const parent = document.getElementById('container');
const child = document.createElement('div');
// Classic methods
parent.appendChild(child); // Add to end
parent.insertBefore(child, refNode); // Insert before reference
parent.removeChild(child); // Remove child
parent.replaceChild(newChild, oldChild);
// Modern methods (more intuitive)
parent.append(child); // Add to end (can add multiple)
parent.prepend(child); // Add to beginning
child.before(newElement); // Insert before child
child.after(newElement); // Insert after child
child.replaceWith(newElement); // Replace child
child.remove(); // Remove from parent
// Example: Multiple insertions
parent.append(
document.createElement('div'),
'Plain text',
document.createElement('span')
);
Document Fragments (Performance)
Best practice for bulk operations:
// ❌ Slow: Multiple reflows
for (let i = 0; i < 1000; i++) {
const div = document.createElement('div');
div.textContent = i;
document.body.appendChild(div); // Mutates the live tree each time (forces a reflow whenever layout is read in between)
}
// ✅ Fast: Single reflow
const fragment = document.createDocumentFragment();
for (let i = 0; i < 1000; i++) {
const div = document.createElement('div');
div.textContent = i;
fragment.appendChild(div); // No reflow
}
document.body.appendChild(fragment); // Single reflow
Systems insight: Document fragments exist in memory only, so modifying them doesn’t trigger layout recalculation (reflow) or repaint.
The Event System
Event Flow: Capturing and Bubbling
DOM Events Standard (WHATWG)
Events propagate in three phases:
Capturing phase: From window down to target
Target phase: Event at the target element
Bubbling phase: From target back up to window
<div id="outer">
<div id="inner">
<button id="btn">Click</button>
</div>
</div>
Event flow when button clicked:
Capturing: window → document → html → body → outer → inner → btn
Target: btn
Bubbling: btn → inner → outer → body → html → document → window
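The flow above can be modeled as a small pure function. This is a simplified sketch of the visiting order only, not the DOM dispatch algorithm: it ignores stopPropagation, shadow DOM retargeting, and listener options. The function name `propagationOrder` is hypothetical.

```javascript
// Given the path from the outermost ancestor down to the target,
// return each node visit together with the phase in which it happens.
function propagationOrder(pathToTarget, { bubbles = true } = {}) {
  const target = pathToTarget[pathToTarget.length - 1];
  const ancestors = pathToTarget.slice(0, -1);
  const visits = [];
  for (const node of ancestors) {
    visits.push({ node, phase: 'capturing' }); // outermost first
  }
  visits.push({ node: target, phase: 'target' });
  if (bubbles) {
    for (const node of [...ancestors].reverse()) {
      visits.push({ node, phase: 'bubbling' }); // innermost first
    }
  }
  return visits;
}

const visits = propagationOrder(
  ['window', 'document', 'html', 'body', 'outer', 'inner', 'btn']
);
console.log(visits.map((v) => `${v.phase}:${v.node}`).join(' → '));
```

Passing `{ bubbles: false }` models events such as focus, which do not bubble: the ancestors are visited only on the way down.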
Registering listeners:
const btn = document.getElementById('btn');
const inner = document.getElementById('inner');
const outer = document.getElementById('outer');
// Bubbling phase (default)
btn.addEventListener('click', (e) => {
console.log('Button clicked');
});
inner.addEventListener('click', (e) => {
console.log('Inner div clicked');
});
outer.addEventListener('click', (e) => {
console.log('Outer div clicked');
});
// Click button logs:
// "Button clicked"
// "Inner div clicked"
// "Outer div clicked"
// Capturing phase (third argument = true)
outer.addEventListener('click', (e) => {
console.log('Outer (capturing)');
}, true);
// Click button logs:
// "Outer (capturing)" ← Capturing phase
// "Button clicked" ← Target phase
// "Inner div clicked" ← Bubbling phase
// "Outer div clicked" ← Bubbling phase
Event Object
Properties and methods:
element.addEventListener('click', (event) => {
// Target information
event.target // Element that triggered event
event.currentTarget // Element with listener attached
// Event details
event.type // 'click'
event.timeStamp // High-resolution timestamp
event.isTrusted // true if user-initiated
// Mouse events
event.clientX // X relative to viewport
event.clientY // Y relative to viewport
event.pageX // X relative to document
event.pageY // Y relative to document
event.screenX // X relative to screen
event.screenY // Y relative to screen
event.button // Which mouse button (0=left, 1=middle, 2=right)
// Keyboard events
event.key // Key name ('Enter', 'a', 'Shift')
event.code // Physical key ('KeyA', 'Enter')
event.keyCode // Deprecated numeric code
event.ctrlKey // Ctrl pressed?
event.shiftKey // Shift pressed?
event.altKey // Alt pressed?
event.metaKey // Meta/Cmd pressed?
// Control flow
event.preventDefault(); // Prevent default action
event.stopPropagation(); // Stop bubbling/capturing
event.stopImmediatePropagation(); // Stop other listeners on same element
});
Example: target vs. currentTarget:
outer.addEventListener('click', (e) => {
console.log('target:', e.target.id); // 'btn' (what was clicked)
console.log('currentTarget:', e.currentTarget.id); // 'outer' (listener owner)
});
// Click button logs:
// target: btn
// currentTarget: outer
Event Delegation
Pattern: Attach listener to parent, handle events from children.
// ❌ Inefficient: Multiple listeners
document.querySelectorAll('.item').forEach(item => {
item.addEventListener('click', handleClick);
});
// ✅ Efficient: Single listener on parent
document.getElementById('list').addEventListener('click', (e) => {
if (e.target.classList.contains('item')) {
handleClick(e);
}
});
Benefits:
Fewer event listeners → less memory
Works with dynamically added elements
Better performance for large lists
Advanced delegation (with closest):
document.getElementById('list').addEventListener('click', (e) => {
// Find nearest ancestor with .item class
const item = e.target.closest('.item');
if (item) {
console.log('Item clicked:', item.dataset.id);
}
});
Preventing Default Behavior
// Prevent form submission
form.addEventListener('submit', (e) => {
e.preventDefault();
// Handle with fetch instead
});
// Prevent link navigation
link.addEventListener('click', (e) => {
e.preventDefault();
// Custom navigation logic
});
// Prevent context menu
element.addEventListener('contextmenu', (e) => {
e.preventDefault();
// Show custom menu
});
Custom Events
// Create custom event
const customEvent = new CustomEvent('userLogin', {
detail: { username: 'alice', timestamp: Date.now() },
bubbles: true,
cancelable: true
});
// Dispatch event
element.dispatchEvent(customEvent);
// Listen for custom event
element.addEventListener('userLogin', (e) => {
console.log('User logged in:', e.detail.username);
});
Timers and Animation
setTimeout and setInterval
HTML Standard §8.6 (not ECMA-262)
// Execute once after delay
const timeoutId = setTimeout(() => {
console.log('Executed after 1 second');
}, 1000);
// Cancel timer
clearTimeout(timeoutId);
// Execute repeatedly
const intervalId = setInterval(() => {
console.log('Executed every 1 second');
}, 1000);
// Cancel interval
clearInterval(intervalId);
Important gotchas:
// ❌ Incorrect: 'this' context lost
class Timer {
constructor() {
this.count = 0;
}
start() {
setInterval(function() {
this.count++; // 'this' is window, not Timer instance!
}, 1000);
}
}
// ✅ Correct: Use arrow function
class Timer {
constructor() {
this.count = 0;
}
start() {
setInterval(() => {
this.count++; // 'this' is Timer instance
}, 1000);
}
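}
Minimum delay:
// The HTML spec clamps the delay to at least 4ms once timers are nested more than 5 levels deep
function tick(depth) {
if (depth === 0) return;
setTimeout(() => tick(depth - 1), 0); // The first few levels may run sooner; deeper ones wait ≥4ms
}
tick(10);
requestAnimationFrame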
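Optimal for animations (one callback per display refresh, typically 60 FPS):
let position = 0; // Illustrative starting state
const velocity = 2; // Pixels per frame (illustrative)
function animate() {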
// Update animation state
element.style.left = `${position}px`;
position += velocity;
// Schedule next frame
requestAnimationFrame(animate);
}
// Start animation
requestAnimationFrame(animate);
Advantages over setInterval:
Synchronized with screen refresh (60 Hz = ~16.67ms)
Pauses when tab inactive (saves battery)
Automatic throttling (prevents overload)
With timestamp:
let startTime = null;
function animate(timestamp) {
if (!startTime) startTime = timestamp;
const elapsed = timestamp - startTime;
const progress = Math.min(elapsed / 1000, 1); // 0 to 1 over 1 second
element.style.left = `${progress * 100}px`;
if (progress < 1) {
requestAnimationFrame(animate);
}
}
requestAnimationFrame(animate);
Cancel animation:
const animationId = requestAnimationFrame(animate);
cancelAnimationFrame(animationId);
Web Storage APIs
localStorage and sessionStorage
HTML Standard §12.2
Both provide key-value storage (strings only):
// localStorage: Persistent across sessions
localStorage.setItem('username', 'alice');
localStorage.getItem('username'); // 'alice'
localStorage.removeItem('username');
localStorage.clear(); // Remove all
localStorage.length; // Number of items
// sessionStorage: Cleared when tab closes
sessionStorage.setItem('tempToken', 'xyz123');
Storing objects (requires serialization):
// ❌ Incorrect: Stores "[object Object]"
localStorage.setItem('user', { name: 'alice' });
// ✅ Correct: Serialize to JSON
const user = { name: 'alice', age: 30 };
localStorage.setItem('user', JSON.stringify(user));
// Retrieve and parse
const retrieved = JSON.parse(localStorage.getItem('user'));
console.log(retrieved.name); // 'alice'
Quota: Typically 5-10 MB per origin (varies by browser).
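Since stored values are strings and the quota is finite, both reading and writing can fail: JSON.parse throws on malformed data, and setItem throws when the quota is exceeded. A minimal wrapper might look like the sketch below. The names `createJsonStore` and `createMemoryStorage` are hypothetical, and the in-memory stand-in exists only so the example also runs outside a browser.

```javascript
// Hypothetical wrapper around any object with the Web Storage methods
// getItem / setItem / removeItem (for example window.localStorage).
function createJsonStore(storage) {
  return {
    get(key, fallback = null) {
      const raw = storage.getItem(key);
      if (raw === null) return fallback; // key is absent
      try {
        return JSON.parse(raw);
      } catch {
        return fallback; // stored value was not valid JSON
      }
    },
    set(key, value) {
      try {
        storage.setItem(key, JSON.stringify(value));
        return true;
      } catch {
        return false; // e.g. quota exceeded or storage disabled
      }
    },
    remove(key) {
      storage.removeItem(key);
    }
  };
}

// In-memory stand-in that mimics the string-only behavior of Web Storage.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem: (k, v) => { map.set(k, String(v)); },
    removeItem: (k) => { map.delete(k); }
  };
}

const store = createJsonStore(createMemoryStorage());
store.set('user', { name: 'alice', age: 30 });
console.log(store.get('user').name);
```

In a browser, `createJsonStore(window.localStorage)` would be the likely call; returning a boolean from `set` lets the caller decide how to react to a full quota.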
Storage events (cross-tab communication):
// Listen for storage changes from other tabs
window.addEventListener('storage', (e) => {
console.log('Key:', e.key);
console.log('Old value:', e.oldValue);
console.log('New value:', e.newValue);
console.log('URL:', e.url);
});
// In another tab
localStorage.setItem('sharedData', 'value'); // Triggers event in first tab
IndexedDB: Client-Side Database
Indexed Database API (W3C)
Key-value database with indexes and transactions:
// Open database
const request = indexedDB.open('MyDatabase', 1);
request.onupgradeneeded = (event) => {
const db = event.target.result;
// Create object store
const store = db.createObjectStore('users', { keyPath: 'id' });
// Create indexes
store.createIndex('name', 'name', { unique: false });
store.createIndex('email', 'email', { unique: true });
};
request.onsuccess = (event) => {
const db = event.target.result;
// Add data
const transaction = db.transaction(['users'], 'readwrite');
const store = transaction.objectStore('users');
store.add({ id: 1, name: 'Alice', email: 'alice@example.com' });
// Query by key
const getRequest = store.get(1);
getRequest.onsuccess = () => {
console.log(getRequest.result);
};
// Query by index
const index = store.index('email');
const searchRequest = index.get('alice@example.com');
};
Modern Promise-based wrapper (idb library):
import { openDB } from 'idb';
const db = await openDB('MyDatabase', 1, {
upgrade(db) {
const store = db.createObjectStore('users', { keyPath: 'id' });
store.createIndex('name', 'name');
}
});
// Add
await db.add('users', { id: 1, name: 'Alice' });
// Get
const user = await db.get('users', 1);
// Query by index
const aliceRecords = await db.getAllFromIndex('users', 'name', 'Alice');
Use cases:
Offline applications
Large datasets (GBs, not just MBs)
Structured data with complex queries
Network: fetch API
Making HTTP Requests
Fetch Standard (WHATWG)
// Simple GET
const listResponse = await fetch('https://api.example.com/users');
const users = await listResponse.json();
// With options
const response = await fetch('https://api.example.com/users', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer token123'
},
body: JSON.stringify({ name: 'Alice' })
});
// Check response
if (!response.ok) {
throw new Error(`HTTP error: ${response.status}`);
}
const data = await response.json();
Response Methods
const response = await fetch(url);
// A body can be read only once: the calls below are alternatives, not a sequence.
// Call response.clone() first if two representations are needed.
// Parse as JSON
const json = await response.json();
// Get as text
const text = await response.text();
// Get as blob (binary)
const blob = await response.blob();
// Get as ArrayBuffer
const buffer = await response.arrayBuffer();
// Get as FormData
const formData = await response.formData();
Request Configuration
fetch(url, {
method: 'POST', // GET, POST, PUT, DELETE, PATCH
headers: new Headers({ // or plain object
'Content-Type': 'application/json'
}),
body: JSON.stringify(data), // string, FormData, Blob, ArrayBuffer
mode: 'cors', // cors, no-cors, same-origin
credentials: 'include', // include, same-origin, omit
cache: 'default', // default, no-store, reload, no-cache, force-cache, only-if-cached
redirect: 'follow', // follow, error, manual
referrer: 'about:client', // 'about:client' (default), '' (no referrer), or a same-origin URL
signal: abortController.signal // Abort signal
});
Aborting Requests
const controller = new AbortController();
fetch(url, { signal: controller.signal })
.then(response => response.json())
.catch(err => {
if (err.name === 'AbortError') {
console.log('Request aborted');
}
});
// Abort after 5 seconds
setTimeout(() => controller.abort(), 5000);
CORS: Cross-Origin Resource Sharing
Same-Origin Policy restricts cross-origin requests:
// ❌ Blocked by default
fetch('https://other-domain.com/api/data')
.catch(err => {
// CORS error: No 'Access-Control-Allow-Origin' header
});
// ✅ Allowed if server sends proper headers
// Server must respond with:
// Access-Control-Allow-Origin: https://your-domain.com
// Access-Control-Allow-Methods: GET, POST
// Access-Control-Allow-Headers: Content-Type
Preflight request for non-simple requests:
Browser sends OPTIONS request:
Origin: https://your-domain.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Content-Type
Server responds:
Access-Control-Allow-Origin: https://your-domain.com
Access-Control-Allow-Methods: POST
Access-Control-Allow-Headers: Content-Type
Then actual POST request is sent.
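The server side of that exchange can be sketched as a pure function that maps preflight request headers to a response. Everything named here is an assumption for illustration: the `policy` object shape, the function name `answerPreflight`, and the choice of status 403 for a rejected preflight (the browser blocks the request whenever the required Access-Control-Allow-* headers are missing, whatever the status).

```javascript
// Hypothetical policy object; the field names are illustrative.
const policy = {
  allowedOrigins: ['https://your-domain.com'],
  allowedMethods: ['GET', 'POST'],
  allowedHeaders: ['content-type']
};

// Returns { status, headers } for an OPTIONS preflight.
// `requestHeaders` is expected to use lowercase keys.
function answerPreflight(requestHeaders, policy) {
  const origin = requestHeaders['origin'];
  const method = requestHeaders['access-control-request-method'];
  const requested = (requestHeaders['access-control-request-headers'] || '')
    .split(',')
    .map((h) => h.trim().toLowerCase())
    .filter(Boolean);

  const allowed =
    policy.allowedOrigins.includes(origin) &&
    policy.allowedMethods.includes(method) &&
    requested.every((h) => policy.allowedHeaders.includes(h));

  if (!allowed) {
    return { status: 403, headers: {} }; // no CORS headers, so the browser blocks the request
  }
  return {
    status: 204,
    headers: {
      'Access-Control-Allow-Origin': origin,
      'Access-Control-Allow-Methods': policy.allowedMethods.join(', '),
      'Access-Control-Allow-Headers': policy.allowedHeaders.join(', '),
      'Vary': 'Origin' // the response differs per origin, so caches must key on it
    }
  };
}
```

Echoing the specific origin instead of sending `*` is what allows credentialed requests, and the `Vary: Origin` header keeps shared caches from serving one origin's answer to another.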
Browser APIs Overview
Location and Navigation
// Current URL
console.log(window.location.href); // Full URL
console.log(window.location.protocol); // 'https:'
console.log(window.location.host); // 'example.com' (includes ':port' only for a non-default port)
console.log(window.location.hostname); // 'example.com'
console.log(window.location.port); // '' for the default port (443 for https:)
console.log(window.location.pathname); // '/path/to/page'
console.log(window.location.search); // '?key=value'
console.log(window.location.hash); // '#section'
// Navigate
window.location.href = 'https://example.com';
window.location.assign('https://example.com');
window.location.replace('https://example.com'); // No history entry
window.location.reload(); // Refresh
// History API
history.pushState({ page: 1 }, 'Title', '/page1');
history.replaceState({ page: 2 }, 'Title', '/page2');
history.back();
history.forward();
history.go(-2); // Go back 2 pages
Navigator: Browser Information
navigator.userAgent // Browser identification string
navigator.language // 'en-US'
navigator.languages // ['en-US', 'en']
navigator.onLine // Network connectivity
navigator.cookieEnabled // Cookies allowed?
navigator.geolocation // Geolocation API
navigator.mediaDevices // Camera/microphone access
navigator.serviceWorker // Service Worker registration
Console API
Not in ECMA-262 (but universally implemented):
console.log('Message', variable);
console.info('Info');
console.warn('Warning');
console.error('Error');
console.table([{ a: 1, b: 2 }, { a: 3, b: 4 }]);
console.group('Group');
console.log('Inside group');
console.groupEnd();
console.time('Timer');
// ... code ...
console.timeEnd('Timer'); // Logs elapsed time
console.assert(condition, 'Assertion failed');
console.trace(); // Stack trace
Performance APIs
performance.now()
High-resolution monotonic timestamp (sub-millisecond; browsers may reduce the resolution to mitigate timing attacks):
const start = performance.now();
expensiveOperation();
const end = performance.now();
console.log(`Took ${end - start} milliseconds`);
// More precise than Date.now()
Date.now(); // ~1ms resolution
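performance.now(); // Sub-millisecond resolution (if high-resolution timing allowed)
Performance Monitoring
// Navigation timing (performance.timing is deprecated; newer code reads performance.getEntriesByType('navigation'))
const timing = performance.timing;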
console.log('DOM loaded:', timing.domContentLoadedEventEnd - timing.navigationStart);
console.log('Page loaded:', timing.loadEventEnd - timing.navigationStart);
// Resource timing
const resources = performance.getEntriesByType('resource');
resources.forEach(resource => {
console.log(`${resource.name}: ${resource.duration}ms`);
});
// Mark and measure
performance.mark('start-render');
renderComponents();
performance.mark('end-render');
performance.measure('render-duration', 'start-render', 'end-render');
const measure = performance.getEntriesByName('render-duration')[0];
console.log(`Rendering took ${measure.duration}ms`);
Summary
The browser environment extends JavaScript with:
Global objects:
window: Browser global object (HTML Standard)
globalThis: Universal global access (ECMA-262)
self: Worker-compatible global reference
DOM (Document Object Model):
Tree structure: Nodes with parent/child relationships
Querying: querySelector, getElementById, etc.
Manipulation: Creating, modifying, removing elements
Live vs. static collections
Event system:
Three phases: Capturing → Target → Bubbling
Event delegation: Efficient pattern for dynamic content
Custom events: Application-specific communication
Timers and animation:
setTimeout/setInterval: Basic timing (delay clamped to ≥4ms for deeply nested timers)
requestAnimationFrame: Synchronized with display refresh
Storage:
localStorage/sessionStorage: Simple key-value (5-10 MB)
IndexedDB: Structured database (GBs)
Networking:
fetch API: Modern HTTP requests
CORS: Cross-origin security mechanism
AbortController: Canceling requests
Other APIs:
Location/History: Navigation control
Navigator: Browser capabilities
Performance: High-resolution timing
Understanding the boundary between ECMA-262 and web standards is essential for:
Writing portable code (works in Node.js, Deno, browsers)
Debugging environment-specific issues
Leveraging platform-specific optimizations
The browser is a rich execution environment—JavaScript is just the language that coordinates it all.
Chapter 9: Node.js Runtime (Resident ECMA-262)
Node.js: JavaScript Beyond the Browser
What Node.js Is (and Isn’t)
Node.js is a JavaScript runtime built on Chrome’s V8 engine, designed for server-side execution. It is not:
A programming language (uses JavaScript/ECMA-262)
A framework (like Express or Nest.js)
A browser environment
Core architecture:
┌─────────────────────────────────────────┐
│ JavaScript Application Code             │
├─────────────────────────────────────────┤
│ Node.js APIs (C++ bindings)             │
│ fs, http, crypto, path, os, etc.        │
├─────────────────────────────────────────┤
│ V8 Engine                               │
│ (ECMA-262 implementation + JIT)         │
├─────────────────────────────────────────┤
│ libuv                                   │
│ (Event loop, async I/O, thread pool)    │
├─────────────────────────────────────────┤
│ Operating System                        │
└─────────────────────────────────────────┘
Key components:
V8: Executes JavaScript (same engine as Chrome)
libuv: Cross-platform asynchronous I/O library (written in C)
Native modules: C++ bindings to OS functionality
Node.js APIs: JavaScript APIs wrapping native functionality
Global Object:
global vs. globalThis
Node.js specifics:
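// Node.js global object (pre-ES2020)
global.setTimeout // Available
global.Buffer // Node.js-specific
global.process // Node.js-specific
// __dirname and __filename are NOT properties of global: they are per-module
// variables supplied by the CommonJS wrapper (and are not defined in ES modules)
__dirname // In CommonJS modules only
__filename // In CommonJS modules only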
// ES2020 universal global
globalThis.setTimeout // Works everywhere
globalThis === global // true in Node.js
// Browser comparison
// Browser: globalThis === window === self
// Node.js: globalThis === global
Top-level this differences:
// CommonJS module (default in Node.js)
console.log(this); // {} (empty object, module.exports)
// ES Module (.mjs or "type": "module")
console.log(this); // undefined (same as browsers)
// Browser (non-module)
console.log(this); // window
The Node.js Event Loop
Architecture: Single-Threaded with Thread Pool
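Important distinction: Node.js runs JavaScript on a single thread, but libuv offloads blocking work (file system operations, dns.lookup(), and some crypto and zlib functions) to a thread pool. Network sockets do not use the pool; they rely on the operating system's non-blocking I/O facilities (epoll, kqueue, IOCP).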
// JavaScript code runs on single thread
console.log('1');
setTimeout(() => console.log('2'), 0);
console.log('3');
// Output: 1, 3, 2
// File system work, however, runs on libuv's thread pool
const fs = require('fs');
fs.readFile('large-file.txt', (err, data) => {
// This callback runs on the main thread,
// but the file was read on a thread-pool thread
});
Event Loop Phases (Node.js-Specific)
Six phases in each iteration:
   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘
Phase descriptions:
Timers: Execute setTimeout and setInterval callbacks whose delay has elapsed
Pending callbacks: Execute I/O callbacks deferred from the previous iteration
Idle, prepare: Internal use only
Poll: Retrieve new I/O events; execute I/O callbacks (everything except close callbacks, timers, and setImmediate)
Check: Execute setImmediate callbacks
Close callbacks: Execute close event callbacks (e.g., socket.on('close', ...))
Between callbacks: Node.js drains the process.nextTick() queue and then the microtask queue (Promises). Since Node.js 11 this happens after each individual callback, not only at phase boundaries.
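The draining of the nextTick and microtask queues can be observed with two timers that expire together. This is a minimal sketch, assuming Node.js 11 or later (older releases ran both timers before draining the queues); the names order and done are illustrative only.

```javascript
// Records the order in which callbacks run ("order" and "done" are illustrative names).
const order = [];

const done = new Promise((resolve) => {
  // First timer: schedules a nextTick callback and a microtask from inside its callback.
  setTimeout(() => {
    order.push('timer 1');
    process.nextTick(() => order.push('nextTick'));
    Promise.resolve().then(() => order.push('promise'));
  }, 0);

  // Second timer: runs only after the queues filled by the first timer are drained.
  setTimeout(() => {
    order.push('timer 2');
    resolve(order);
  }, 0);
});

done.then((result) => console.log(result.join(' -> ')));
// On Node.js 11 and later: timer 1 -> nextTick -> promise -> timer 2
```

The nextTick callback runs before the Promise callback because the nextTick queue is always drained first.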
process.nextTick() vs. Microtasks vs. Macrotasks
Critical ordering:
console.log('1');
setTimeout(() => console.log('2'), 0); // Macrotask (timers phase)
Promise.resolve().then(() => console.log('3')); // Microtask
process.nextTick(() => console.log('4')); // nextTick queue
console.log('5');
// Output: 1, 5, 4, 3, 2
Execution order:
Synchronous code (1, 5)
process.nextTick() queue (4)
Microtask queue (Promises) (3)
Macrotask queue (setTimeout) (2)
Example showing all phases:
const fs = require('fs');
console.log('Start');
// Timers phase
setTimeout(() => console.log('setTimeout'), 0);
// Check phase
setImmediate(() => console.log('setImmediate'));
// Poll phase
fs.readFile(__filename, () => {
console.log('readFile callback');
setTimeout(() => console.log('setTimeout in readFile'), 0);
setImmediate(() => console.log('setImmediate in readFile'));
process.nextTick(() => console.log('nextTick in readFile'));
});
// nextTick queue
process.nextTick(() => console.log('nextTick'));
// Microtask queue
Promise.resolve().then(() => console.log('Promise'));
console.log('End');
// Output:
// Start
// End
// nextTick
// Promise
// setTimeout (or setImmediate - order not guaranteed initially)
// setImmediate (or setTimeout)
// readFile callback
// nextTick in readFile
// setImmediate in readFile
// setTimeout in readFile
Why setImmediate in an I/O callback runs before setTimeout:
After an I/O callback (poll phase), the event loop moves directly to the check phase (where setImmediate runs), then circles back to the timers phase (where setTimeout runs).
setImmediate vs. setTimeout(..., 0)
Outside I/O cycle (order not guaranteed):
setTimeout(() => console.log('setTimeout'), 0);
setImmediate(() => console.log('setImmediate'));
// Output varies:
// Could be: setTimeout, setImmediate
// Could be: setImmediate, setTimeout
Inside I/O cycle (setImmediate always first):
fs.readFile(__filename, () => {
setTimeout(() => console.log('setTimeout'), 0);
setImmediate(() => console.log('setImmediate'));
});
// Output (guaranteed):
// setImmediate
// setTimeout
Systems insight: When the event loop finishes running I/O callbacks in the poll phase, it proceeds to the check phase and runs setImmediate callbacks before returning to the timers phase.
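The guarantee can also be written as a self-checking sketch. The helper name orderInsideIo is hypothetical, and fs.stat on the current working directory is used here only as a convenient way to obtain an I/O callback; any asynchronous fs call would serve.

```javascript
const fs = require('fs');

// Resolves with the order in which the two callbacks fired.
// "orderInsideIo" is an illustrative helper, not a Node.js API.
function orderInsideIo() {
  return new Promise((resolve, reject) => {
    // The callback below runs in the poll phase.
    fs.stat(process.cwd(), (err) => {
      if (err) return reject(err);
      const order = [];
      setTimeout(() => {
        order.push('setTimeout');
        resolve(order); // the timer fires last, so the order is complete here
      }, 0);
      setImmediate(() => order.push('setImmediate'));
    });
  });
}

const ioOrder = orderInsideIo();
ioOrder.then((order) => console.log(order)); // [ 'setImmediate', 'setTimeout' ]
```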
Module Systems in Node.js
CommonJS: The Original Node.js Module System
Default in Node.js (files with a .js extension when package.json does not set "type": "module"):
// math.js (CommonJS module)
function add(a, b) {
return a + b;
}
function multiply(a, b) {
return a * b;
}
// Option A: export individual functions
exports.add = add;
exports.multiply = multiply;
// Option B: export an object (replaces anything set via exports.*)
module.exports = { add, multiply };
// Option C: export a single function
module.exports = add;
Importing CommonJS modules:
// Import entire module
const math = require('./math');
console.log(math.add(2, 3));
// Destructure imports
const { add, multiply } = require('./math');
console.log(add(2, 3));
// Built-in modules
const fs = require('fs');
const path = require('path');
const http = require('http');
CommonJS characteristics:
// 1. Synchronous loading
const data = require('./data.json'); // Blocks until loaded
// 2. Cached after first load
const math1 = require('./math');
const math2 = require('./math');
console.log(math1 === math2); // true (same object)
// 3. Dynamic imports possible
const moduleName = './math';
const math = require(moduleName); // Works
// 4. Module wrapper function
// Node.js wraps every module in:
(function(exports, require, module, __filename, __dirname) {
// Your module code here
});
// 5. Available variables
console.log(__filename); // Absolute path to current file
console.log(__dirname); // Absolute path to directory
console.log(module); // Module object
console.log(exports); // Reference to module.exports
console.log(require); // Function to load modules
ES Modules in Node.js
Enable ES modules:
Option 1: Use .mjs extension:
// math.mjs
export function add(a, b) {
return a + b;
}
export function multiply(a, b) {
return a * b;
}
// main.mjs
import { add, multiply } from './math.mjs';
Option 2: Set "type": "module" in package.json:
{
"type": "module"
}
// Now .js files are ES modules
// math.js
export function add(a, b) {
return a + b;
}
// main.js
import { add } from './math.js'; // Must include .js extension!
ES Module characteristics in Node.js:
// 1. Asynchronous loading (top-level await supported)
const data = await fetch('https://api.example.com/data');
// 2. Static imports (must be at top level)
import { add } from './math.js'; // ✅
if (condition) {
import { add } from './math.js'; // ❌ Syntax error
}
// 3. Dynamic imports (returns Promise)
const modulePath = './math.js';
const math = await import(modulePath); // ✅
// 4. No __filename, __dirname
// Use import.meta instead
console.log(import.meta.url); // file:///path/to/module.js
// Get __dirname equivalent
import { fileURLToPath } from 'url';
import { dirname } from 'path';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
// 5. File extensions required
import { add } from './math'; // ❌ Error
import { add } from './math.js'; // ✅ Required
Interoperability: CommonJS ↔ ES Modules
Importing CommonJS from ES Module:
// math.cjs (CommonJS)
module.exports = { add: (a, b) => a + b };
// main.mjs (ES Module)
import math from './math.cjs'; // Default import always works (it is module.exports)
console.log(math.add(2, 3));
// Named imports are best-effort: Node.js statically analyzes the CommonJS
// source and exposes only the export names it can detect
import { add } from './math.cjs'; // ⚠️ Fails if 'add' is not detected
Importing ES Module from CommonJS:
// math.mjs (ES Module)
export function add(a, b) {
return a + b;
}
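// main.cjs (CommonJS)
// Older Node.js releases cannot require() an ES module
const math = require('./math.mjs'); // ❌ Throws ERR_REQUIRE_ESM on releases without require(esm) support
// (Node.js 22.12 and later can require() an ES module that has no top-level await)
// Dynamic import works on every release that supports ES modules
(async () => {
const math = await import('./math.mjs');
console.log(math.add(2, 3));
})();
Package exports field (best practice):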
{
"name": "my-package",
"exports": {
".": {
"import": "./esm/index.js",
"require": "./cjs/index.js"
}
}
}
Built-in Node.js Modules
File System: fs and fs/promises
Callback-based API (traditional):
const fs = require('fs');
// Read file
fs.readFile('file.txt', 'utf8', (err, data) => {
if (err) {
console.error(err);
return;
}
console.log(data);
});
// Write file
fs.writeFile('output.txt', 'Hello', (err) => {
if (err) throw err;
console.log('File written');
});
// Check if file exists
fs.access('file.txt', fs.constants.F_OK, (err) => {
console.log(err ? 'Does not exist' : 'Exists');
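});
Promise-based API (modern, cleaner):
const fs = require('fs').promises;
// or: import fs from 'fs/promises';
// Note: top-level await requires an ES module; in CommonJS, wrap these calls in an async function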
// Read file
try {
const data = await fs.readFile('file.txt', 'utf8');
console.log(data);
} catch (err) {
console.error(err);
}
// Write file
await fs.writeFile('output.txt', 'Hello');
// Read directory
const files = await fs.readdir('.');
console.log(files);
// File stats
const stats = await fs.stat('file.txt');
console.log(stats.size);
console.log(stats.isFile());
console.log(stats.isDirectory());
// Create directory
await fs.mkdir('new-dir', { recursive: true });
// Delete file
await fs.unlink('file.txt');
// Rename/move file
await fs.rename('old.txt', 'new.txt');
Synchronous API (blocks event loop, use sparingly):
const fs = require('fs');
// Read file synchronously
const data = fs.readFileSync('file.txt', 'utf8');
// Write file synchronously
fs.writeFileSync('output.txt', 'Hello');
// Use case: Loading config at startup
const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));
Streams (for large files):
const fs = require('fs');
// Read stream
const readStream = fs.createReadStream('large-file.txt', 'utf8');
readStream.on('data', (chunk) => {
console.log('Chunk:', chunk.length);
});
readStream.on('end', () => {
console.log('Done reading');
});
// Write stream
const writeStream = fs.createWriteStream('output.txt');
writeStream.write('Hello\n');
writeStream.write('World\n');
writeStream.end();
// Pipe (copy file efficiently)
fs.createReadStream('input.txt').pipe(fs.createWriteStream('output.txt'));
Path: Cross-Platform Path Handling
const path = require('path');
// Join paths (cross-platform)
const filePath = path.join(__dirname, 'data', 'file.txt');
// Windows: C:\project\data\file.txt
// Unix: /project/data/file.txt
// Resolve to absolute path
const absolute = path.resolve('data', 'file.txt');
// /current/working/directory/data/file.txt
// Get directory name
path.dirname('/path/to/file.txt'); // '/path/to'
// Get base name
path.basename('/path/to/file.txt'); // 'file.txt'
path.basename('/path/to/file.txt', '.txt'); // 'file'
// Get extension
path.extname('file.txt'); // '.txt'
// Parse path
const parsed = path.parse('/path/to/file.txt');
// {
// root: '/',
// dir: '/path/to',
// base: 'file.txt',
// ext: '.txt',
// name: 'file'
// }
// Build path from object
path.format(parsed); // '/path/to/file.txt'
// Normalize path
path.normalize('/path//to/../file.txt'); // '/path/file.txt'
// Platform-specific separator
path.sep; // '/' on Unix, '\' on Windows
Process: Runtime Information and Control
// Command-line arguments
console.log(process.argv);
// ['/path/to/node', '/path/to/script.js', 'arg1', 'arg2']
// Environment variables
console.log(process.env.NODE_ENV);
console.log(process.env.PATH);
// Set environment variable
process.env.MY_VAR = 'value';
// Current working directory
console.log(process.cwd());
// Change directory
process.chdir('/new/directory');
// Platform information
console.log(process.platform); // 'linux', 'darwin', 'win32'
console.log(process.arch); // 'x64', 'arm64'
// Process ID
console.log(process.pid);
// Exit process
process.exit(0); // Success
process.exit(1); // Error
// Exit handlers
process.on('exit', (code) => {
console.log(`Exiting with code ${code}`);
});
// Uncaught exception handler
process.on('uncaughtException', (err) => {
console.error('Uncaught exception:', err);
process.exit(1);
});
// Unhandled promise rejection
process.on('unhandledRejection', (reason, promise) => {
console.error('Unhandled rejection:', reason);
});
// Memory usage
console.log(process.memoryUsage());
// {
// rss: 4935680, // Resident set size
// heapTotal: 1826816, // Total heap size
// heapUsed: 650472, // Used heap size
// external: 49879 // C++ objects bound to JS
// }
// CPU usage
console.log(process.cpuUsage());
// { user: 38579, system: 6986 }
// Uptime (seconds)
console.log(process.uptime());
HTTP: Creating Servers
Basic HTTP server:
const http = require('http');
const server = http.createServer((req, res) => {
console.log(`${req.method} ${req.url}`);
// Set response headers
res.writeHead(200, { 'Content-Type': 'text/plain' });
// Send response
res.end('Hello World\n');
});
server.listen(3000, () => {
console.log('Server running at http://localhost:3000/');
});
JSON API server:
const http = require('http');
const server = http.createServer((req, res) => {
if (req.url === '/api/users' && req.method === 'GET') {
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ users: ['Alice', 'Bob'] }));
} else if (req.url === '/api/users' && req.method === 'POST') {
let body = '';
req.on('data', chunk => {
body += chunk.toString();
});
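req.on('end', () => {
try {
const data = JSON.parse(body);
res.writeHead(201, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ created: data }));
} catch (err) {
// Malformed JSON in the request body must not crash the server
res.writeHead(400, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ error: 'Invalid JSON' }));
}
});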
} else {
res.writeHead(404);
res.end('Not Found');
}
});
server.listen(3000);
Making HTTP requests:
const http = require('http');
http.get('http://api.example.com/data', (res) => {
let data = '';
res.on('data', (chunk) => {
data += chunk;
});
res.on('end', () => {
console.log(JSON.parse(data));
});
}).on('error', (err) => {
console.error(err);
});
Crypto: Hashing and Encryption
const crypto = require('crypto');
// Hash (SHA-256)
const hash = crypto.createHash('sha256')
.update('password')
.digest('hex');
console.log(hash);
// HMAC (keyed hash)
const hmac = crypto.createHmac('sha256', 'secret-key')
.update('message')
.digest('hex');
// Random bytes
const randomBytes = crypto.randomBytes(16).toString('hex');
// Password hashing (with salt)
const password = 'user-password';
const salt = crypto.randomBytes(16).toString('hex');
const passwordHash = crypto.pbkdf2Sync(password, salt, 100000, 64, 'sha512').toString('hex');
// Encryption (AES-256-GCM)
const algorithm = 'aes-256-gcm';
const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(12); // 96-bit IV, the recommended size for GCM
const cipher = crypto.createCipheriv(algorithm, key, iv);
let encrypted = cipher.update('secret message', 'utf8', 'hex');
encrypted += cipher.final('hex');
const authTag = cipher.getAuthTag();
// Decryption
const decipher = crypto.createDecipheriv(algorithm, key, iv);
decipher.setAuthTag(authTag);
let decrypted = decipher.update(encrypted, 'hex', 'utf8');
decrypted += decipher.final('utf8');
URL: Parsing and Formatting URLs
const { URL } = require('url');
// Parse URL
const myURL = new URL('https://user:pass@example.com:8080/path?query=value#hash');
console.log(myURL.protocol); // 'https:'
console.log(myURL.hostname); // 'example.com'
console.log(myURL.port); // '8080'
console.log(myURL.pathname); // '/path'
console.log(myURL.search); // '?query=value'
console.log(myURL.hash); // '#hash'
console.log(myURL.username); // 'user'
console.log(myURL.password); // 'pass'
// Query parameters
console.log(myURL.searchParams.get('query')); // 'value'
myURL.searchParams.append('key', 'new');
myURL.searchParams.delete('query');
// Convert back to string
console.log(myURL.href);
console.log(myURL.toString());
Buffer: Binary Data Handling
Creating Buffers
Node.js-specific (not in ECMA-262):
// Create buffer from string
const buf1 = Buffer.from('Hello');
console.log(buf1); // <Buffer 48 65 6c 6c 6f>
// Create buffer from array
const buf2 = Buffer.from([72, 101, 108, 108, 111]);
// Create buffer with a given size
const buf3 = Buffer.alloc(10); // Filled with zeros
const buf4 = Buffer.allocUnsafe(10); // Uninitialized: may contain old data (faster)
// Create from hexadecimal
const buf5 = Buffer.from('48656c6c6f', 'hex');
// Create from base64
const buf6 = Buffer.from('SGVsbG8=', 'base64');
Reading and Writing Buffers
const buf = Buffer.alloc(12); // 5 (string) + 1 + 2 + 4 bytes
// Write string
buf.write('Hello', 0, 'utf8');
// Write integers
buf.writeUInt8(255, 5); // 1 byte at index 5
buf.writeUInt16BE(1000, 6); // 2 bytes at index 6, big-endian
buf.writeUInt32LE(1000000, 8); // 4 bytes at index 8, little-endian
// Read integers
const byte = buf.readUInt8(5);
const short = buf.readUInt16BE(6);
const int = buf.readUInt32LE(8);
// Read string
const str = buf.toString('utf8', 0, 5); // 'Hello'
Buffer Encoding
const buf = Buffer.from('Hello');
// Various encodings
buf.toString('utf8'); // 'Hello'
buf.toString('hex'); // '48656c6c6f'
buf.toString('base64'); // 'SGVsbG8='
buf.toString('binary'); // Legacy alias for 'latin1'
// Supported encodings:
// 'utf8', 'utf16le', 'latin1', 'base64', 'hex', 'ascii', 'binary'
Buffer vs. TypedArray
Similarity: Both provide views over binary data (Buffer is in fact a subclass of Uint8Array)
// Buffer (Node.js-specific)
const buf = Buffer.from([1, 2, 3, 4]);
// TypedArray (standard JavaScript)
const arr = new Uint8Array([1, 2, 3, 4]);
// Convert Buffer to TypedArray
const typedArray = new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);
// Convert TypedArray to Buffer
const buffer = Buffer.from(arr.buffer, arr.byteOffset, arr.byteLength);
Difference: Buffer is more convenient for Node.js I/O operations:
const fs = require('fs');
// Read file as Buffer
const buf = fs.readFileSync('file.bin');
console.log(buf instanceof Buffer); // true
// Write Buffer to file
fs.writeFileSync('output.bin', buf);
Streams: Efficient Data Processing
Stream Types
Four types:
Readable: Source of data (e.g., fs.createReadStream)
Writable: Destination (e.g., fs.createWriteStream)
Duplex: Both readable and writable (e.g., TCP socket)
Transform: Duplex that modifies data (e.g., compression)
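The roles can be seen side by side in a small in-memory pipeline. This is a sketch under a few assumptions: it uses the promise-based pipeline from stream/promises (available in Node.js 15 and later), and the names source, upper, sink, received, and result are illustrative. A Transform is itself a Duplex, so the example touches all four types.

```javascript
const { Readable, Writable, Transform } = require('stream');
const { pipeline } = require('stream/promises');

// Readable: an in-memory source built from an array.
const source = Readable.from(['node ', 'streams']);

// Transform: upper-cases every chunk that passes through.
const upper = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
});

// Writable: collects whatever arrives.
const received = [];
const sink = new Writable({
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    callback();
  }
});

// pipeline() wires the three together and resolves once the sink has finished.
const result = pipeline(source, upper, sink).then(() => received.join(''));
result.then((text) => console.log(text)); // NODE STREAMS
```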
Readable Streams
const fs = require('fs');
const readable = fs.createReadStream('large-file.txt', {
encoding: 'utf8',
highWaterMark: 16 * 1024 // 16 KB chunks
});
// Event-based consumption
readable.on('data', (chunk) => {
console.log(`Received ${chunk.length} bytes`);
});
readable.on('end', () => {
console.log('No more data');
});
readable.on('error', (err) => {
console.error(err);
});
// Pause and resume
readable.pause();
setTimeout(() => readable.resume(), 1000);
Async iteration (modern approach):
const fs = require('fs');
async function processFile() {
const readable = fs.createReadStream('file.txt', 'utf8');
for await (const chunk of readable) {
console.log(chunk);
}
}
Writable Streams
const fs = require('fs');
const writable = fs.createWriteStream('output.txt');
writable.write('Hello\n');
writable.write('World\n');
writable.end('Done\n'); // Closes stream
writable.on('finish', () => {
console.log('All writes completed');
});
writable.on('error', (err) => {
console.error(err);
});
Backpressure handling:
const fs = require('fs');
const writable = fs.createWriteStream('output.txt');
function writeMillionTimes(writer, data, callback) {
let i = 1000000;
write();
function write() {
let ok = true;
while (i > 0 && ok) {
i--;
if (i === 0) {
writer.write(data, callback);
} else {
// Returns false if internal buffer is full
ok = writer.write(data);
}
}
if (i > 0) {
// Wait for 'drain' event before continuing
writer.once('drain', write);
}
}
}
writeMillionTimes(writable, 'Hello\n', () => {
console.log('Done');
});
Piping Streams
pipe() connects a readable stream to a writable one and handles backpressure automatically:
const fs = require('fs');
const zlib = require('zlib');
// Copy file
fs.createReadStream('input.txt')
.pipe(fs.createWriteStream('output.txt'));
// Compress file
fs.createReadStream('input.txt')
.pipe(zlib.createGzip())
.pipe(fs.createWriteStream('input.txt.gz'));
// Chain multiple transforms
// (transform() is a placeholder for any function that returns a Transform stream)
fs.createReadStream('input.txt.gz')
.pipe(zlib.createGunzip())
.pipe(transform())
.pipe(zlib.createGzip())
.pipe(fs.createWriteStream('output.txt.gz'));
Pipeline (better error handling):
const { pipeline } = require('stream');
const fs = require('fs');
const zlib = require('zlib');
pipeline(
fs.createReadStream('input.txt'),
zlib.createGzip(),
fs.createWriteStream('input.txt.gz'),
(err) => {
if (err) {
console.error('Pipeline failed:', err);
} else {
console.log('Pipeline succeeded');
}
}
);
Transform Streams
Custom transform:
const { Transform } = require('stream');
const fs = require('fs');
// Uppercase transform
const uppercaseTransform = new Transform({
transform(chunk, encoding, callback) {
this.push(chunk.toString().toUpperCase());
callback();
}
});
fs.createReadStream('input.txt')
.pipe(uppercaseTransform)
.pipe(fs.createWriteStream('output.txt'));
Child Processes: Running External Commands
exec: Simple Command Execution
const { exec } = require('child_process');
exec('ls -la', (error, stdout, stderr) => {
if (error) {
console.error(`Error: ${error.message}`);
return;
}
if (stderr) {
// Output on stderr does not necessarily mean the command failed
console.error(`stderr: ${stderr}`);
}
console.log(`stdout: ${stdout}`);
});
// Promise version (await requires an async function or an ES module)
const { promisify } = require('util');
const execPromise = promisify(exec);
const { stdout, stderr } = await execPromise('ls -la');
console.log(stdout);
spawn: Streaming Output
const { spawn } = require('child_process');
const ls = spawn('ls', ['-la']);
ls.stdout.on('data', (data) => {
console.log(`stdout: ${data}`);
});
ls.stderr.on('data', (data) => {
console.error(`stderr: ${data}`);
});
ls.on('close', (code) => {
console.log(`Process exited with code ${code}`);
});
Pipe data to child process:
const child = spawn('wc', ['-w']);
child.stdout.on('data', (data) => {
console.log(`Word count: ${data}`);
});
child.stdin.write('Hello world\n');
child.stdin.write('This is a test\n');
child.stdin.end();
fork: Node.js Child Processes
Communication via IPC:
// parent.js
const { fork } = require('child_process');
const child = fork('child.js');
child.on('message', (msg) => {
console.log('Message from child:', msg);
});
child.send({ hello: 'world' });
// child.js
process.on('message', (msg) => {
console.log('Message from parent:', msg);
process.send({ received: true });
});
Worker Threads: True Parallelism
Creating Worker Threads
Node.js 10.5.0+ (experimental in 10.x, stable in 12+):
// main.js
const { Worker } = require('worker_threads');
const worker = new Worker('./worker.js');
worker.on('message', (msg) => {
console.log('Message from worker:', msg);
});
worker.on('error', (err) => {
console.error('Worker error:', err);
});
worker.on('exit', (code) => {
console.log(`Worker exited with code ${code}`);
});
worker.postMessage({ task: 'compute' });
// worker.js
const { parentPort } = require('worker_threads');
parentPort.on('message', (msg) => {
console.log('Message from main:', msg);
// Perform computation (heavyComputation is a placeholder for CPU-bound work)
const result = heavyComputation();
parentPort.postMessage({ result });
});
SharedArrayBuffer in Workers
Shared memory between threads:
// main.js
const { Worker } = require('worker_threads');
const sharedBuffer = new SharedArrayBuffer(4);
const sharedArray = new Int32Array(sharedBuffer);
const worker = new Worker('./worker.js', {
workerData: { sharedBuffer }
});
// Atomically increment
Atomics.add(sharedArray, 0, 1);
console.log(Atomics.load(sharedArray, 0)); // 1 in practice (the worker is still starting up)
worker.on('message', () => {
console.log(Atomics.load(sharedArray, 0)); // 2 (incremented by worker)
});
// worker.js
const { parentPort, workerData } = require('worker_threads');
const sharedArray = new Int32Array(workerData.sharedBuffer);
// Atomically increment
Atomics.add(sharedArray, 0, 1);
parentPort.postMessage('done');
Package Management: npm and package.json
package.json Structure
{
"name": "my-app",
"version": "1.0.0",
"description": "My application",
"main": "index.js",
"type": "module",
"scripts": {
"start": "node index.js",
"test": "jest",
"build": "webpack"
},
"dependencies": {
"express": "^4.18.0",
"lodash": "~4.17.21"
},
"devDependencies": {
"jest": "^29.0.0",
"webpack": "^5.75.0"
},
"engines": {
"node": ">=16.0.0"
},
"keywords": ["example"],
"author": "Your Name",
"license": "MIT"
}
Semantic versioning:
^4.18.0 → >=4.18.0 <5.0.0 (compatible changes)
~4.18.0 → >=4.18.0 <4.19.0 (bug fixes only)
4.18.0 → Exact version
* → Latest version (dangerous!)
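The caret and tilde rules above can be expressed as a small range check. This is a deliberately simplified sketch, not the algorithm npm uses: it assumes plain major.minor.patch versions with a major of 1 or higher, and it ignores prerelease tags and the special rules for 0.x versions. The function names parse, compare, and satisfies are hypothetical.

```javascript
// Simplified range check for versions with major >= 1.
// Not the algorithm npm uses; prerelease tags and 0.x rules are ignored.
function parse(version) {
  return version.split('.').map(Number);
}

// Returns -1, 0, or 1, comparing major, then minor, then patch.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

function satisfies(version, range) {
  if (range === '*') return true;
  const v = parse(version);
  const operator = range[0];
  if (operator !== '^' && operator !== '~') {
    return compare(v, parse(range)) === 0; // exact version
  }
  const min = parse(range.slice(1));
  const max = operator === '^'
    ? [min[0] + 1, 0, 0]       // caret: up to the next major
    : [min[0], min[1] + 1, 0]; // tilde: up to the next minor
  return compare(v, min) >= 0 && compare(v, max) < 0;
}

console.log(satisfies('4.19.2', '^4.18.0')); // true
console.log(satisfies('5.0.0', '^4.18.0'));  // false
console.log(satisfies('4.18.3', '~4.18.0')); // true
console.log(satisfies('4.19.0', '~4.18.0')); // false
```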
npm Commands
# Install dependencies
npm install
npm install express # Add to dependencies
npm install --save-dev jest # Add to devDependencies
npm install -g nodemon # Install globally
# Update packages
npm update
npm outdated # Check for updates
# Remove package
npm uninstall express
# Run scripts
npm start # Runs "start" script
npm test # Runs "test" script
npm run build # Runs custom script
# View package info
npm info express
npm list # Installed packages
npm list --depth=0 # Top-level packages only
# Audit security
npm audit
npm audit fix # Auto-fix vulnerabilities
Summary
Node.js runtime provides:
Global environment:
global (pre-ES2020) / globalThis (universal)
process: Runtime information, exit handling
Buffer: Binary data handling
Event loop architecture:
Six phases: timers → pending callbacks → idle/prepare → poll → check → close callbacks
process.nextTick(): Executed before the event loop continues
setImmediate(): Executed in check phase
Module systems:
CommonJS: require(), module.exports, synchronous
ES Modules: import/export, asynchronous, .mjs or "type": "module"
Built-in modules:
fs: File system operations (callback, promise, sync APIs)
path: Cross-platform path handling
http: HTTP server and client
crypto: Hashing, encryption
stream: Efficient data processing
Concurrency:
Child processes: Run external commands (exec, spawn, fork)
Worker threads: True parallelism with SharedArrayBuffer
Package management:
npm: Install, update, manage dependencies
package.json: Project configuration, scripts, versioning
Node.js extends ECMA-262 with system-level capabilities while preserving the single-threaded execution model that makes JavaScript reasoning tractable. Understanding the event loop phases and module interoperability is critical for building robust server-side applications.
Chapter 10: Browser Extensions and Userscripts
Introduction: Extending Browser Functionality
Browser extensions and userscripts allow developers to modify web page behavior, inject custom JavaScript, and extend browser capabilities beyond what standard web applications can achieve. They operate with elevated privileges compared to regular web pages, accessing APIs unavailable to standard JavaScript.
Key differences:
┌─────────────────────────────────────────────────────────┐
│ Privilege Levels                                        │
├─────────────────────────────────────────────────────────┤
│ Web Page JavaScript                                     │
│   • Sandboxed execution                                 │
│   • Cross-origin requests subject to CORS               │
│   • Limited browser APIs                                │
├─────────────────────────────────────────────────────────┤
│ Userscripts (Tampermonkey/Greasemonkey)                 │
│   • Runs in page context                                │
│   • Cross-origin XHR with GM_xmlhttpRequest             │
│   • Can modify page DOM                                 │
│   • Limited storage (GM_setValue/getValue)              │
├─────────────────────────────────────────────────────────┤
│ Browser Extensions (WebExtensions API)                  │
│   • Isolated world execution                            │
│   • Full browser.* API access                           │
│   • Background service workers                          │
│   • Cross-origin requests without CORS                  │
│   • Persistent storage                                  │
│   • Tab/window management                               │
└─────────────────────────────────────────────────────────┘
Userscripts: Quick Page Modifications
What Are Userscripts?
Userscripts are JavaScript programs that modify web pages in the browser. They require a userscript manager extension:
Tampermonkey (Chrome, Firefox, Safari, Edge)
Greasemonkey (Firefox only, original)
Violentmonkey (Chrome, Firefox, open-source)
Use cases:
Remove ads or annoying elements
Add missing features to websites
Customize page appearance
Automate repetitive tasks
Enhance privacy
Basic Userscript Structure
Metadata block (required):
// ==UserScript==
// @name Example Userscript
// @namespace http://example.com/
// @version 1.0.0
// @description Demonstrates userscript basics
// @author Your Name
// @match https://www.example.com/*
// @grant GM_xmlhttpRequest
// @grant GM_setValue
// @grant GM_getValue
// @require https://code.jquery.com/jquery-3.6.0.min.js
// @run-at document-end
// ==/UserScript==
(function() {
'use strict';
// Your code here
console.log('Userscript loaded!');
// Modify page
document.body.style.backgroundColor = '#f0f0f0';
// Add button
const button = document.createElement('button');
button.textContent = 'Click me';
button.onclick = () => alert('Userscript button clicked!');
document.body.prepend(button);
})();Metadata directives:
| Directive | Purpose | Example |
|---|---|---|
| @name | Script name | My Script |
| @namespace | Unique identifier | http://example.com/ |
| @version | Version number | 1.0.0 |
| @description | What the script does | Removes ads |
| @author | Script creator | Your Name |
| @match | URL pattern to run on | https://example.com/* |
| @include | Alternative URL pattern | *://example.com/* |
| @exclude | URLs to skip | https://example.com/admin/* |
| @grant | API permissions | GM_xmlhttpRequest |
| @require | External libraries | jQuery URL |
| @run-at | Execution timing | document-start, document-end, document-idle |
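Because the metadata block is plain comment text, reading it is a matter of simple string processing. The following is an illustrative sketch only; it does not reproduce how any particular userscript manager parses the block, and the function name parseUserscriptMetadata and the sample script are hypothetical. Directives that may repeat (such as @grant or @match) are collected into arrays.

```javascript
// Illustrative parser: collects every "// @key value" line between the
// ==UserScript== markers. Real managers apply stricter rules.
function parseUserscriptMetadata(source) {
  const block = source.match(/\/\/ ==UserScript==([\s\S]*?)\/\/ ==\/UserScript==/);
  if (!block) return null;

  const meta = {};
  for (const line of block[1].split('\n')) {
    const match = line.match(/^\s*\/\/\s*@(\S+)\s*(.*)$/);
    if (!match) continue;
    const [, key, value] = match;
    if (!meta[key]) meta[key] = [];
    meta[key].push(value.trim());
  }
  return meta;
}

// Hypothetical sample input.
const sample = [
  '// ==UserScript==',
  '// @name        Example Userscript',
  '// @match       https://www.example.com/*',
  '// @grant       GM_setValue',
  '// @grant       GM_getValue',
  '// ==/UserScript=='
].join('\n');

const meta = parseUserscriptMetadata(sample);
console.log(meta.name);  // [ 'Example Userscript' ]
console.log(meta.grant); // [ 'GM_setValue', 'GM_getValue' ]
```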
URL Matching Patterns
Match patterns:
// Exact domain
// @match https://www.example.com/*
// Any subdomain
// @match https://*.example.com/*
// Any protocol
// @match *://example.com/*
// Multiple patterns
// @match https://example.com/*
// @match https://example.org/*
// Include (wildcards or a regular expression allowed)
// @include /^https?://example\.com/.*/
// Exclude specific pages
// @exclude https://example.com/login
Run timing:
// @run-at document-start
// Runs as early as possible, before the DOM has been built
// @run-at document-end
// Runs once the DOM is ready (DOMContentLoaded)
// @run-at document-idle
// Runs after the DOM is ready, typically once the page has finished loading
// The default when @run-at is omitted depends on the userscript manager
Greasemonkey API (GM.* Functions)
Cross-origin requests (bypass CORS):
// @grant GM_xmlhttpRequest
GM_xmlhttpRequest({
method: 'GET',
url: 'https://api.example.com/data',
headers: {
'User-Agent': 'MyUserscript/1.0'
},
onload: function(response) {
console.log(response.responseText);
const data = JSON.parse(response.responseText);
// Use data
},
onerror: function(error) {
console.error('Request failed:', error);
}
});
// Modern async/await wrapper
function gmXHR(config) {
return new Promise((resolve, reject) => {
GM_xmlhttpRequest({
...config,
onload: resolve,
onerror: reject
});
});
}
// Usage
const response = await gmXHR({
method: 'GET',
url: 'https://api.example.com/data'
});
const data = JSON.parse(response.responseText);
Persistent storage:
// @grant GM_setValue
// @grant GM_getValue
// @grant GM_deleteValue
// @grant GM_listValues
// Save data
GM_setValue('username', 'Alice');
GM_setValue('settings', JSON.stringify({ theme: 'dark' }));
// Read data
const username = GM_getValue('username', 'default'); // with default
const settingsJSON = GM_getValue('settings', '{}'); // default avoids JSON.parse(undefined)
const settings = JSON.parse(settingsJSON);
// Delete data
GM_deleteValue('username');
// List all keys (synchronous in the GM_* API; the promise-based variant is GM.listValues())
const keys = GM_listValues();
Other GM functions:
// Open new tab
GM_openInTab('https://example.com', { active: true });
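// Get resource URL (requires a matching @resource line in the metadata block)
// @resource logo https://example.com/logo.png
const logoURL = GM_getResourceURL('logo');
// Get resource text (assumes a @resource named customCSS is declared as well)
const cssText = GM_getResourceText('customCSS');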
// Add CSS
GM_addStyle(`
.annoying-ad {
display: none !important;
}
`);
// Script info
const info = GM_info;
console.log(info.script.name);
console.log(info.script.version);
Practical Userscript Examples
Example 1: Remove elements:
// ==UserScript==
// @name Remove Ads
// @match https://example.com/*
// @run-at document-end
// ==/UserScript==
(function() {
'use strict';
// Remove by class
document.querySelectorAll('.ad, .advertisement').forEach(el => el.remove());
// Remove by ID
const banner = document.getElementById('annoying-banner');
if (banner) banner.remove();
// Observe for dynamically added ads
const observer = new MutationObserver((mutations) => {
mutations.forEach((mutation) => {
mutation.addedNodes.forEach((node) => {
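if (node.nodeType === 1) { // Element node
if (node.matches('.ad, .advertisement')) {
node.remove();
} else {
// The ad may be nested inside the added subtree
node.querySelectorAll('.ad, .advertisement').forEach(el => el.remove());
}
}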
});
});
});
observer.observe(document.body, {
childList: true,
subtree: true
});
})();
Example 2: Auto-fill form:
// ==UserScript==
// @name Auto-fill Login
// @match https://example.com/login
// @grant GM_getValue
// @grant GM_setValue
// ==/UserScript==
(function() {
'use strict';
const username = GM_getValue('saved_username', '');
const password = GM_getValue('saved_password', '');
const usernameInput = document.querySelector('input[name="username"]');
const passwordInput = document.querySelector('input[name="password"]');
if (usernameInput && username) {
usernameInput.value = username;
}
if (passwordInput && password) {
passwordInput.value = password;
}
// Add save button
const saveButton = document.createElement('button');
saveButton.textContent = 'Save Credentials';
saveButton.type = 'button';
saveButton.onclick = () => {
GM_setValue('saved_username', usernameInput.value);
GM_setValue('saved_password', passwordInput.value);
alert('Credentials saved!');
};
document.querySelector('form').appendChild(saveButton);
})();
Example 3: Fetch external data:
// ==UserScript==
// @name Weather Widget
// @match https://example.com/*
// @grant GM_xmlhttpRequest
// @connect api.weatherapi.com
// @grant GM_addStyle
// ==/UserScript==
(function() {
'use strict';
GM_addStyle(`
#weather-widget {
position: fixed;
top: 10px;
right: 10px;
padding: 10px;
background: white;
border: 1px solid #ccc;
border-radius: 5px;
box-shadow: 0 2px 5px rgba(0,0,0,0.2);
z-index: 9999;
}
`);
const widget = document.createElement('div');
widget.id = 'weather-widget';
widget.textContent = 'Loading weather...';
document.body.appendChild(widget);
GM_xmlhttpRequest({
method: 'GET',
url: 'https://api.weatherapi.com/v1/current.json?key=YOUR_KEY&q=London',
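onload: function(response) {
const data = JSON.parse(response.responseText);
// Build nodes with textContent so API data is never parsed as HTML
const name = document.createElement('strong');
name.textContent = data.location.name;
widget.replaceChildren(
name,
document.createElement('br'),
`${data.current.temp_c}°C, ${data.current.condition.text}`
);
}
});
})();
Browser Extensions: Full Browser Integration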
WebExtensions API
WebExtensions is a cross-browser API standard supported by:
Chrome/Chromium (Manifest V3)
Firefox
Edge
Safari (partial support)
Manifest V2 vs V3:
| Feature | Manifest V2 | Manifest V3 |
|---|---|---|
| Background scripts | Persistent pages | Service workers |
| Host permissions | permissions | host_permissions |
| Blocking webRequest | Synchronous | Declarative (limited) |
| Remote code | eval(), remote scripts | Forbidden |
| Status | Deprecated (2024+) | Current standard |
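The host-permissions row can be illustrated with a small helper that separates an MV2-style mixed permissions array into the two MV3 arrays. This is only a sketch: splitPermissions is a hypothetical name, and the pattern test is a simplifying assumption (anything equal to <all_urls> or containing :// is treated as a host pattern).

```javascript
// Hypothetical helper: split an MV2 "permissions" array into MV3's
// "permissions" (API names) and "host_permissions" (match patterns).
function splitPermissions(v2Permissions) {
  // Assumption: host patterns are "<all_urls>" or contain a scheme separator.
  const isHostPattern = (p) => p === '<all_urls>' || p.includes('://');
  return {
    permissions: v2Permissions.filter((p) => !isHostPattern(p)),
    host_permissions: v2Permissions.filter(isHostPattern),
  };
}

const migrated = splitPermissions(['storage', 'tabs', 'https://api.example.com/*']);
console.log(migrated);
```

A real migration also has to replace the background page with a service worker, which cannot be done mechanically.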
Extension Structure
Directory layout:
my-extension/
├── manifest.json          # Required: metadata and configuration
├── background.js          # Service worker (MV3) or background script (MV2)
├── content-script.js      # Injected into pages (isolated world, shared DOM)
├── popup.html             # Extension popup UI
├── popup.js               # Popup logic
├── options.html           # Settings page
├── options.js             # Settings logic
├── icons/
│   ├── icon16.png
│   ├── icon48.png
│   └── icon128.png
└── lib/
    └── external-lib.js
manifest.json (Manifest V3)
Complete example:
{
"manifest_version": 3,
"name": "Example Extension",
"version": "1.0.0",
"description": "Demonstrates browser extension capabilities",
"icons": {
"16": "icons/icon16.png",
"48": "icons/icon48.png",
"128": "icons/icon128.png"
},
"action": {
"default_popup": "popup.html",
"default_icon": {
"16": "icons/icon16.png",
"48": "icons/icon48.png"
},
"default_title": "Click to open"
},
"background": {
"service_worker": "background.js"
},
"content_scripts": [
{
"matches": ["https://example.com/*"],
"js": ["content-script.js"],
"css": ["styles.css"],
"run_at": "document_end"
}
],
"permissions": [
"storage",
"tabs",
"activeTab",
"notifications"
],
"host_permissions": [
"https://api.example.com/*"
],
"options_page": "options.html",
"web_accessible_resources": [
{
"resources": ["images/*"],
"matches": ["https://example.com/*"]
}
]
}
Key fields:
| Field | Purpose |
|---|---|
| manifest_version | Must be 3 (or 2 for legacy) |
| name, version, description | Extension metadata |
| icons | Extension icons (various sizes) |
| action | Browser toolbar button (popup UI) |
| background | Background service worker |
| content_scripts | Scripts injected into pages |
| permissions | API permissions required |
| host_permissions | Cross-origin request permissions |
| options_page | Settings page |
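As a quick sanity check on the fields above, the sketch below reports which of the three keys every MV3 manifest needs (manifest_version, name, version) are absent from a parsed manifest object. The function name missingManifestKeys is hypothetical, and the check ignores value types and all optional fields.

```javascript
// Hypothetical check for the keys every MV3 manifest must contain.
function missingManifestKeys(manifest) {
  const required = ['manifest_version', 'name', 'version'];
  return required.filter((key) => !(key in manifest));
}

// Example manifest object that forgot its version.
const draftManifest = { manifest_version: 3, name: 'Example Extension' };
const missing = missingManifestKeys(draftManifest);
console.log(missing);
```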
Content Scripts: Isolated Worlds
Content scripts run in an isolated world:
┌─────────────────────────────────────────┐
│ Web Page Context                        │
│ • Page’s JavaScript                     │
│ • Page’s variables/functions            │
│ • Shared DOM                            │
└─────────────────────────────────────────┘
              ↕ (DOM only)
┌─────────────────────────────────────────┐
│ Content Script Context                  │
│ • Extension’s JavaScript                │
│ • Cannot access page variables          │
│ • Shared DOM                            │
│ • Limited chrome.* APIs                 │
└─────────────────────────────────────────┘
          ↕ (message passing)
┌─────────────────────────────────────────┐
│ Background Script Context               │
│ • Full chrome.* API access              │
│ • State persisted via chrome.storage    │
│ • No DOM access                         │
└─────────────────────────────────────────┘
content-script.js:
// This runs in isolated world
console.log('Content script loaded');
// Can access DOM
const title = document.querySelector('h1').textContent;
// Can modify DOM
const banner = document.createElement('div');
banner.textContent = 'Extension is active!';
banner.style.cssText = 'position:fixed; top:0; left:0; right:0; background:yellow; padding:10px; z-index:99999;';
document.body.prepend(banner);
// CANNOT access page variables
// console.log(window.somePageVariable); // undefined
// Can use limited chrome APIs
chrome.runtime.sendMessage({ title: title }, (response) => {
console.log('Background responded:', response);
});
// Listen for messages from background
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
if (message.action === 'highlightText') {
document.querySelectorAll('p').forEach(p => {
p.style.backgroundColor = 'yellow';
});
sendResponse({ done: true });
}
});
Injecting into page context (to access page variables):
// content-script.js
function injectScript() {
const script = document.createElement('script');
script.textContent = `
// This code runs in page context
console.log('Injected script running');
console.log(window.somePageVariable); // Now accessible
// Send data back to content script via custom events
window.postMessage({ type: 'FROM_PAGE', data: somePageVariable }, '*');
`;
document.documentElement.appendChild(script);
script.remove();
}
// Listen for messages from page
window.addEventListener('message', (event) => {
if (event.source === window && event.data.type === 'FROM_PAGE') {
console.log('Data from page:', event.data.data);
}
});
injectScript();
Background Scripts: Service Workers (MV3)
background.js (service worker in MV3):
// Service workers must be event-driven (no persistent state)
// Installation
chrome.runtime.onInstalled.addListener((details) => {
if (details.reason === 'install') {
console.log('Extension installed');
// Set default settings
chrome.storage.sync.set({ enabled: true });
} else if (details.reason === 'update') {
console.log('Extension updated');
}
});
// Listen for messages from content scripts
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
console.log('Message from content script:', message);
console.log('Sender tab:', sender.tab?.id); // sender.tab is undefined for messages from the popup
// Async operation
(async () => {
const data = await fetchData();
sendResponse({ data: data });
})();
return true; // Required for async sendResponse
});
// Toolbar button clicked (fires only when the manifest sets no default_popup)
chrome.action.onClicked.addListener((tab) => {
console.log('Extension icon clicked on tab:', tab.id);
// Send message to content script
chrome.tabs.sendMessage(tab.id, { action: 'highlightText' });
});
// Tab updated
chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
if (changeInfo.status === 'complete' && tab.url?.includes('example.com')) {
console.log('Tab loaded:', tab.url);
// Inject content script dynamically (requires the "scripting" permission)
chrome.scripting.executeScript({
target: { tabId: tabId },
files: ['content-script.js']
});
}
});
// Web request interception (declarative in MV3)
chrome.declarativeNetRequest.updateDynamicRules({
addRules: [{
id: 1,
priority: 1,
action: { type: 'block' },
condition: {
urlFilter: '*://ads.example.com/*',
resourceTypes: ['script', 'image']
}
}],
removeRuleIds: [1] // Remove any earlier copy of rule 1 first; duplicate IDs are rejected
});
Service worker lifecycle:
// Service workers can be terminated after inactivity
// Must re-establish state on each wake-up
// BAD: This won't persist
let counter = 0;
chrome.runtime.onMessage.addListener((msg) => {
counter++; // Lost when service worker terminates
});
// GOOD: Use chrome.storage
chrome.runtime.onMessage.addListener(async (msg) => {
const { counter = 0 } = await chrome.storage.local.get('counter');
await chrome.storage.local.set({ counter: counter + 1 });
});
Message Passing
Content script ↔︎ Background:
// content-script.js → background.js
chrome.runtime.sendMessage(
{ type: 'GET_DATA', url: window.location.href },
(response) => {
console.log('Response:', response);
}
);
// background.js listening
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
if (message.type === 'GET_DATA') {
fetchData(message.url).then(data => {
sendResponse({ data: data });
});
return true; // Keep message channel open for async
}
});
Long-lived connections:
// content-script.js
const port = chrome.runtime.connect({ name: 'my-channel' });
port.postMessage({ type: 'INIT' });
port.onMessage.addListener((msg) => {
console.log('Received:', msg);
});
// background.js
chrome.runtime.onConnect.addListener((port) => {
console.log('Connected:', port.name);
port.onMessage.addListener((msg) => {
console.log('Message:', msg);
port.postMessage({ response: 'acknowledged' });
});
});
Cross-extension messaging:
// Send to another extension
chrome.runtime.sendMessage(
'other-extension-id',
{ type: 'HELLO' },
(response) => {
console.log('Response from other extension:', response);
}
);
Storage API
Types of storage:
// 1. chrome.storage.local (local to device, ~10MB; 5MB in Chrome 113 and earlier)
await chrome.storage.local.set({ key: 'value' });
const { key } = await chrome.storage.local.get('key');
// 2. chrome.storage.sync (synced across devices, ~100KB)
await chrome.storage.sync.set({ theme: 'dark' });
const { theme } = await chrome.storage.sync.get('theme');
// 3. chrome.storage.session (session-only, MV3)
await chrome.storage.session.set({ tempData: '...' });
// Save multiple items
await chrome.storage.local.set({
username: 'Alice',
settings: { theme: 'dark', lang: 'en' }
});
// Get multiple items
const items = await chrome.storage.local.get(['username', 'settings']);
console.log(items.username);
console.log(items.settings.theme);
// Get all items
const allItems = await chrome.storage.local.get(null);
// Remove item
await chrome.storage.local.remove('username');
// Clear all
await chrome.storage.local.clear();
// Listen for changes
chrome.storage.onChanged.addListener((changes, areaName) => {
for (let [key, { oldValue, newValue }] of Object.entries(changes)) {
console.log(`${key} changed from ${oldValue} to ${newValue} in ${areaName}`);
}
});
Tabs API
// Get current tab
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
console.log('Current URL:', tab.url);
// Create new tab
const newTab = await chrome.tabs.create({
url: 'https://example.com',
active: true // Switch to new tab
});
// Update tab
await chrome.tabs.update(tab.id, {
url: 'https://example.org'
});
// Close tab
await chrome.tabs.remove(tab.id);
// Get all tabs
const tabs = await chrome.tabs.query({});
// Query tabs by URL pattern
const exampleTabs = await chrome.tabs.query({
url: '*://example.com/*'
});
// Reload tab
await chrome.tabs.reload(tab.id);
// Execute script in tab
await chrome.scripting.executeScript({
target: { tabId: tab.id },
func: () => {
document.body.style.backgroundColor = 'red';
}
});
// Inject CSS
await chrome.scripting.insertCSS({
target: { tabId: tab.id },
css: 'body { background: blue !important; }'
});
Other Useful APIs
Notifications:
chrome.notifications.create({
type: 'basic',
iconUrl: 'icons/icon48.png',
title: 'Extension Notification',
message: 'Something happened!',
priority: 2
});
Context menus (right-click menu):
// background.js
chrome.runtime.onInstalled.addListener(() => {
chrome.contextMenus.create({
id: 'search-selection',
title: 'Search "%s" on Example',
contexts: ['selection']
});
});
chrome.contextMenus.onClicked.addListener((info, tab) => {
if (info.menuItemId === 'search-selection') {
const query = encodeURIComponent(info.selectionText);
chrome.tabs.create({
url: `https://example.com/search?q=${query}`
});
}
});
Alarms (scheduled tasks):
// Create alarm
chrome.alarms.create('fetchData', {
periodInMinutes: 30
});
// Listen for alarm
chrome.alarms.onAlarm.addListener((alarm) => {
if (alarm.name === 'fetchData') {
console.log('Time to fetch data!');
fetchData();
}
});
Cookies:
// Get cookies
const cookies = await chrome.cookies.getAll({
url: 'https://example.com'
});
// Set cookie
await chrome.cookies.set({
url: 'https://example.com',
name: 'session',
value: 'abc123'
});
// Remove cookie
await chrome.cookies.remove({
url: 'https://example.com',
name: 'session'
});
Web request (Manifest V3 - declarative only):
// Block ads
chrome.declarativeNetRequest.updateDynamicRules({
addRules: [
{
id: 1,
priority: 1,
action: { type: 'block' },
condition: {
urlFilter: '*://ads.example.com/*',
resourceTypes: ['script', 'image', 'sub_frame']
}
}
],
removeRuleIds: [1] // Replace any existing rule with the same ID
});
// Redirect
chrome.declarativeNetRequest.updateDynamicRules({
addRules: [
{
id: 2,
priority: 1,
action: {
type: 'redirect',
redirect: { url: 'https://example.org/alternative.js' }
},
condition: {
urlFilter: 'https://example.com/script.js',
resourceTypes: ['script']
}
}
],
removeRuleIds: [2] // Replace any existing rule with the same ID
});
Popup UI
popup.html:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
body {
width: 300px;
padding: 10px;
font-family: Arial, sans-serif;
}
button {
width: 100%;
padding: 10px;
margin: 5px 0;
}
</style>
</head>
<body>
<h3>Extension Popup</h3>
<button id="highlight">Highlight Page</button>
<button id="screenshot">Take Screenshot</button>
<div id="status"></div>
<script src="popup.js"></script>
</body>
</html>
popup.js:
document.getElementById('highlight').addEventListener('click', async () => {
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
chrome.tabs.sendMessage(tab.id, { action: 'highlightText' }, (response) => { // Must match the action name the content script listens for
document.getElementById('status').textContent = 'Highlighted!';
});
});
document.getElementById('screenshot').addEventListener('click', async () => {
const dataUrl = await chrome.tabs.captureVisibleTab(null, {
format: 'png'
});
// Download screenshot
chrome.downloads.download({
url: dataUrl,
filename: 'screenshot.png'
});
});
Complete Example: YouTube Enhancer
manifest.json:
{
"manifest_version": 3,
"name": "YouTube Enhancer",
"version": "1.0.0",
"permissions": ["storage"],
"content_scripts": [
{
"matches": ["*://*.youtube.com/*"],
"js": ["content.js"],
"run_at": "document_end"
}
],
"action": {
"default_popup": "popup.html"
}
}
content.js:
(async function() {
'use strict';
// Get settings
const { autoHD = true, hideComments = false } =
await chrome.storage.sync.get(['autoHD', 'hideComments']);
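// Auto-switch to HD
// Note: these quality methods belong to YouTube's player object (#movie_player)
// and are defined by page scripts, so they are normally invisible from a content
// script's isolated world; the guard below then simply skips the block.
if (autoHD) {
const observer = new MutationObserver(() => {
const player = document.getElementById('movie_player');
if (player && player.getAvailableQualityLevels) {
const levels = player.getAvailableQualityLevels();
if (levels.includes('hd1080')) {
player.setPlaybackQualityRange('hd1080');
observer.disconnect(); // Stop observing once the quality is set
}
}
});
observer.observe(document.body, { childList: true, subtree: true });
}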
// Hide comments
if (hideComments) {
const style = document.createElement('style');
style.textContent = '#comments { display: none !important; }';
document.head.appendChild(style);
}
// Add custom button
function addCustomButton() {
const controls = document.querySelector('.ytp-right-controls');
if (!controls || document.getElementById('custom-btn')) return;
const button = document.createElement('button');
button.id = 'custom-btn';
button.textContent = '⚡';
button.className = 'ytp-button';
button.title = 'Custom action';
button.onclick = () => {
alert('Custom button clicked!');
};
controls.prepend(button);
}
// Wait for player to load
const playerObserver = new MutationObserver(() => {
if (document.querySelector('.ytp-right-controls')) {
addCustomButton();
playerObserver.disconnect();
}
});
playerObserver.observe(document.body, { childList: true, subtree: true });
})();
Security Considerations
Content Security Policy (CSP)
Extensions have strict CSP (Manifest V3):
// ❌ FORBIDDEN in MV3:
eval('console.log("test")');
new Function('return 1')();
// ❌ FORBIDDEN: Inline scripts
<script>alert('test')</script>
// ❌ FORBIDDEN: Remote scripts
<script src="https://example.com/script.js"></script>
// ✅ ALLOWED: External scripts in extension package
<script src="local-script.js"></script>
Workarounds:
// Instead of eval, use safe alternatives
const data = JSON.parse(jsonString);
// Instead of new Function, use pre-defined functions
const operations = {
add: (a, b) => a + b,
multiply: (a, b) => a * b
};
const result = operations[operation](x, y);
Permissions
Principle of least privilege:
{
"permissions": [
"activeTab" // ✅ Only current tab
// "tabs" // ❌ Avoid: All tabs' URLs
],
"host_permissions": [
"https://api.example.com/*" // ✅ Specific domain
// "*://*/*" // ❌ Avoid: All websites
]
}
Optional permissions (request at runtime):
{
"optional_permissions": ["downloads"],
"optional_host_permissions": ["*://*/*"]
}
// Request permission when needed
document.getElementById('enable').addEventListener('click', async () => {
const granted = await chrome.permissions.request({
permissions: ['downloads']
});
if (granted) {
console.log('Permission granted!');
}
});
// Check if permission is granted
const hasPermission = await chrome.permissions.contains({
permissions: ['downloads']
});
XSS Prevention
Never inject unsanitized content:
// ❌ DANGEROUS
const userInput = message.text;
document.body.innerHTML = userInput; // XSS vulnerability!
// ✅ SAFE
const userInput = message.text;
document.body.textContent = userInput; // Text only, no HTML
// ✅ SAFE (if HTML needed)
const sanitized = DOMPurify.sanitize(userInput);
document.body.innerHTML = sanitized;
Publishing Extensions
Chrome Web Store
Create developer account ($5 one-time fee)
Prepare assets:
Icon (128×128 PNG)
Screenshots (1280×800 or 640×400)
Promotional images (optional)
Description, screenshots
Zip extension:
zip -r extension.zip * -x "*.git*" -x "*node_modules*"
Upload to dashboard
Fill metadata
Submit for review (1-3 days typically)
Firefox Add-ons
Create Mozilla account (free)
Sign extension:
web-ext sign --api-key=$API_KEY --api-secret=$API_SECRET
Upload .xpi file
Review (automated + manual for some extensions)
Summary
Userscripts:
Quick page modifications
Require userscript manager (Tampermonkey)
Limited API (GM_* functions)
Easy to share and install
Browser extensions:
Full browser integration
WebExtensions API (cross-browser)
Manifest V3 (service workers, declarative)
Isolated worlds (content scripts)
Message passing architecture
Rich APIs (tabs, storage, notifications, etc.)
Key concepts:
Content scripts: Run in isolated world, access DOM
Background scripts: Service workers, full API access
Message passing: Communication between contexts
Storage: chrome.storage for persistence
Security: CSP, permissions, XSS prevention
Extensions and userscripts extend ECMA-262 JavaScript with browser-specific capabilities, allowing developers to enhance web experiences beyond what standard web pages can achieve.
Chapter 11: JavaScript as a Compilation Target
Introduction: The Shift in Perspective
For most of its history, JavaScript was written directly by developers. However, as web applications grew in complexity and developers sought to use other languages, JavaScript increasingly became a compilation target—the output of compilers rather than hand-written code.
Why compile to JavaScript?
Universal runtime: Every browser executes JavaScript
No installation: Zero-friction deployment
Performance: Modern engines (V8, SpiderMonkey) are highly optimized
Ecosystem: Rich library and tooling support
Portability: Write once, run everywhere
Evolution timeline:
1995-2005: Hand-written JavaScript
2006-2010: JavaScript libraries (jQuery, Prototype)
2009: CoffeeScript (first major compile-to-JS language)
2011: Dart
2012: TypeScript
2013: React JSX
2013: asm.js (C/C++ → optimizable JS subset)
2015: Babel (ES6+ → ES5 transpilation)
2017: WebAssembly MVP (binary compilation target)
2020+: WASM + JavaScript interop dominance
Categories of Compilation to JavaScript
1. Transpilation (Source-to-Source)
Definition: Converting code from one high-level language to another high-level language (JavaScript).
Examples:
TypeScript → JavaScript
ES6+ → ES5 (Babel)
JSX → JavaScript
CoffeeScript → JavaScript
Dart → JavaScript
Characteristics:
Readable output: Generated JS often resembles input
Similar abstraction level: Both input and output are high-level
Debugging: Source maps map output back to input
Type erasure: Type information (if present) is stripped
2. Compilation (Low-level to High-level)
Definition: Converting code from lower-level languages (C, C++, Rust) to JavaScript.
Examples:
C/C++ → asm.js → JavaScript
C/C++ → WASM → (JS glue code)
Rust → WASM → (JS interop)
Characteristics:
Less readable output: Heavily optimized, machine-like code
Performance focus: Targets asm.js subset or WebAssembly
Memory management: Manual memory via ArrayBuffer/TypedArrays
FFI required: Foreign function interface for host APIs
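The manual-memory point can be made concrete with a bump allocator over a single ArrayBuffer, which is roughly how compiled code sees its heap. This is a minimal sketch: createHeap, alloc, and the 8-byte alignment are illustrative assumptions, not the allocator any real toolchain ships.

```javascript
// Sketch of manual memory management over one ArrayBuffer (hypothetical API).
function createHeap(sizeInBytes) {
  const buffer = new ArrayBuffer(sizeInBytes);
  const HEAP32 = new Int32Array(buffer);
  let next = 8; // keep address 0 unused so it can act as a null pointer

  // Bump allocation: hand out the next aligned byte offset; nothing is freed.
  function alloc(bytes) {
    const ptr = next;
    const aligned = (bytes + 7) & ~7; // round up to a multiple of 8
    if (ptr + aligned > sizeInBytes) throw new RangeError('out of memory');
    next = ptr + aligned;
    return ptr;
  }

  return { buffer, HEAP32, alloc };
}

const heap = createHeap(1024);
const a = heap.alloc(4);        // "pointer" to one int32
const b = heap.alloc(12);       // "pointer" to three int32s
heap.HEAP32[a >> 2] = 42;       // store through the pointer
heap.HEAP32[(b >> 2) + 1] = 7;  // second element of the array at b
console.log(a, b, heap.HEAP32[a >> 2]);
```

Pointers here are plain byte offsets, so every access has to convert them to a typed-array index, exactly as generated asm.js code does.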
Transpilers: TypeScript, Babel, and Beyond
TypeScript: Static Typing for JavaScript
TypeScript adds static type checking to JavaScript while remaining a superset of valid JavaScript.
Type system features:
// Basic types
let name: string = "Alice";
let age: number = 30;
let active: boolean = true;
// Arrays and tuples
let numbers: number[] = [1, 2, 3];
let tuple: [string, number] = ["Alice", 30];
// Objects
interface User {
name: string;
age: number;
email?: string; // Optional property
}
const user: User = {
name: "Bob",
age: 25
};
// Functions
function greet(name: string): string {
return `Hello, ${name}!`;
}
// Generics
function identity<T>(arg: T): T {
return arg;
}
const result = identity<string>("test");
// Union types
function process(value: string | number): void {
if (typeof value === "string") {
console.log(value.toUpperCase());
} else {
console.log(value.toFixed(2));
}
}
// Type aliases
type Point = { x: number; y: number };
type Callback = (data: string) => void;
// Enums
enum Direction {
Up,
Down,
Left,
Right
}
// Classes with access modifiers
class Animal {
private name: string;
protected age: number;
constructor(name: string, age: number) {
this.name = name;
this.age = age;
}
public speak(): void {
console.log(`${this.name} makes a sound`);
}
}
Compilation process:
# Install TypeScript
npm install -g typescript
# Compile single file
tsc example.ts
# Output: example.js
# Compile with config
tsc --project tsconfig.json
tsconfig.json:
{
"compilerOptions": {
"target": "ES2020",
"module": "ESNext",
"lib": ["ES2020", "DOM"],
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"outDir": "./dist",
"rootDir": "./src",
"sourceMap": true,
"declaration": true
},
"include": ["src/**/*"],
"exclude": ["node_modules"]
}
Input vs. Output:
// Input: example.ts
interface Person {
name: string;
age: number;
}
function greet(person: Person): string {
return `Hello, ${person.name}!`;
}
const user: Person = { name: "Alice", age: 30 };
console.log(greet(user));
// Output: example.js (types erased)
function greet(person) {
return `Hello, ${person.name}!`;
}
const user = { name: "Alice", age: 30 };
console.log(greet(user));
Type erasure: Type annotations and interfaces are removed during compilation, so no runtime type checking occurs. A few constructs, such as enums, still emit runtime JavaScript.
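Because the annotations vanish, data that arrives at runtime (for example from JSON.parse) has to be checked by ordinary JavaScript. Here is a minimal sketch of such a check for the Person shape above; isPerson is a hypothetical helper written by hand, not something the compiler generates.

```javascript
// Hypothetical runtime check mirroring the erased `Person` interface.
function isPerson(value) {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof value.name === 'string' &&
    typeof value.age === 'number'
  );
}

const good = JSON.parse('{"name":"Alice","age":30}');
const bad = JSON.parse('{"name":"Alice","age":"thirty"}');
console.log(isPerson(good)); // true
console.log(isPerson(bad));  // false
```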
Babel: Modern JavaScript to Legacy JavaScript
Babel transpiles modern JavaScript (ES6+) to older versions (ES5) for browser compatibility.
Common transformations:
// Input: ES6+
const greet = (name) => `Hello, ${name}!`;
class Counter {
count = 0;
increment() {
this.count++;
}
}
const [first, ...rest] = [1, 2, 3, 4];
const obj = { x: 1, y: 2, ...other };
async function fetchData() {
const response = await fetch('/api/data');
return response.json();
}
// Output: ES5 (simplified; helper definitions omitted)
"use strict";
var greet = function greet(name) {
return "Hello, ".concat(name, "!");
};
var Counter = function Counter() {
_classCallCheck(this, Counter);
this.count = 0;
};
Counter.prototype.increment = function increment() {
this.count++;
};
var first = [1, 2, 3, 4][0];
var rest = [1, 2, 3, 4].slice(1);
var obj = Object.assign({ x: 1, y: 2 }, other);
function fetchData() {
var response;
return regeneratorRuntime.async(function fetchData$(_context) {
while (1) {
switch (_context.prev = _context.next) {
case 0:
_context.next = 2;
return regeneratorRuntime.awrap(fetch('/api/data'));
case 2:
response = _context.sent;
return _context.abrupt("return", response.json());
case "end":
return _context.stop();
}
}
});
}
Configuration (.babelrc):
{
"presets": [
["@babel/preset-env", {
"targets": "> 0.25%, not dead",
"useBuiltIns": "usage",
"corejs": 3
}]
],
"plugins": [
"@babel/plugin-proposal-class-properties",
"@babel/plugin-transform-runtime"
]
}
Polyfills: Babel can inject polyfills for missing APIs:
// Input
const arr = [1, 2, 3];
const doubled = arr.map(x => x * 2);
const hasTwo = arr.includes(2); // ES2016 method
// Output with polyfill injection
require("core-js/modules/es.array.includes");
const arr = [1, 2, 3];
const doubled = arr.map(x => x * 2);
const hasTwo = arr.includes(2); // Polyfill loaded
JSX: Declarative UI Syntax
JSX (JavaScript XML) embeds XML-like syntax in JavaScript, primarily used by React.
Input (JSX):
const App = ({ name, count }) => {
const [state, setState] = React.useState(0);
const handleClick = () => {
setState(state + 1);
};
return (
<div className="container">
<h1>Hello, {name}!</h1>
<p>Count: {count}</p>
<button onClick={handleClick}>
Increment
</button>
{state > 5 && <p>State is greater than 5</p>}
<List items={[1, 2, 3]} />
</div>
);
};
Output (JavaScript):
const App = ({ name, count }) => {
const [state, setState] = React.useState(0);
const handleClick = () => {
setState(state + 1);
};
return React.createElement(
"div",
{ className: "container" },
React.createElement("h1", null, "Hello, ", name, "!"),
React.createElement("p", null, "Count: ", count),
React.createElement(
"button",
{ onClick: handleClick },
"Increment"
),
state > 5 && React.createElement("p", null, "State is greater than 5"),
React.createElement(List, { items: [1, 2, 3] })
);
};
Transformation details:
Tags → React.createElement(type, props, ...children)
Attributes → Props object
Self-closing tags → Single element
Expressions in {} → Evaluated JavaScript
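To see what the compiled calls produce, here is a toy stand-in for React.createElement that only builds plain objects. It is a sketch under simplifying assumptions (no keys, refs, or component rendering); the name h and the returned object shape are hypothetical and do not reflect React's internal element format.

```javascript
// Toy element factory with the same (type, props, ...children) signature.
function h(type, props, ...children) {
  return { type, props: props || {}, children };
}

const name = 'Alice';
// Hand-compiled equivalent of:
//   <div className="container"><h1>Hello, {name}!</h1></div>
const tree = h(
  'div',
  { className: 'container' },
  h('h1', null, 'Hello, ', name, '!')
);

console.log(JSON.stringify(tree, null, 2));
```

Note how the text and the {name} expression become separate children, matching the "Hello, ", name, "!" arguments in the compiled output above.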
Other Notable Transpilers
CoffeeScript (2009):
# Input: CoffeeScript
square = (x) -> x * x
numbers = [1..10]
squares = (square num for num in numbers)
// Output: JavaScript
var num, numbers, square, squares;
square = function(x) {
return x * x;
};
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
squares = (function() {
var i, len, results;
results = [];
for (i = 0, len = numbers.length; i < len; i++) {
num = numbers[i];
results.push(square(num));
}
return results;
})();
Dart (with dart2js):
// Input: Dart
void main() {
var greeting = 'Hello, World!';
print(greeting);
var numbers = [1, 2, 3, 4, 5];
var doubled = numbers.map((n) => n * 2).toList();
print(doubled);
}
Output is heavily optimized JavaScript (several thousand lines for even simple programs due to runtime library).
asm.js: The Optimizable Subset
What is asm.js?
asm.js is a strict subset of JavaScript designed to be optimized ahead-of-time (AOT) by JavaScript engines.
Key principles:
Statically typed (through type annotations via bitwise ops)
No garbage collection (manual memory management)
Predictable performance (no dynamic behavior)
Validation (can be verified as valid asm.js)
Example asm.js module:
function MyModule(stdlib, foreign, heap) {
"use asm";
// Imports
var sqrt = stdlib.Math.sqrt;
var log = foreign.log;
// Heap (typed array view)
var HEAP32 = new stdlib.Int32Array(heap);
var HEAPF64 = new stdlib.Float64Array(heap);
// Functions
function add(x, y) {
x = x | 0; // x is int32
y = y | 0; // y is int32
return (x + y) | 0;
}
function distance(x1, y1, x2, y2) {
x1 = +x1; // x1 is double
y1 = +y1; // y1 is double
x2 = +x2; // x2 is double
y2 = +y2; // y2 is double
var dx = 0.0;
var dy = 0.0;
dx = x2 - x1;
dy = y2 - y1;
return +sqrt(dx * dx + dy * dy);
}
function processArray(start, length) {
start = start | 0;
length = length | 0;
var i = 0;
var sum = 0;
for (i = 0; (i | 0) < (length | 0); i = (i + 1) | 0) {
sum = (sum + (HEAP32[(start + (i << 2)) >> 2] | 0)) | 0; // start is a byte offset; i counts 4-byte elements
}
return sum | 0;
}
return {
add: add,
distance: distance,
processArray: processArray
};
}
// Usage
const stdlib = {
Math: Math,
Int32Array: Int32Array,
Float64Array: Float64Array
};
const foreign = {
log: console.log
};
const heap = new ArrayBuffer(1024 * 1024); // 1MB
const asmModule = MyModule(stdlib, foreign, heap); // Avoids clashing with Node's own `module`
console.log(asmModule.add(5, 3)); // 8
console.log(asmModule.distance(0, 0, 3, 4)); // 5
Type Annotations via Coercion
asm.js uses bitwise/unary operations for type annotations:
function typed(x, y) {
"use asm";
x = x | 0; // x: int32
y = +y; // y: double
var i = 0; // i: int32 (inferred)
var f = 0.0; // f: double (inferred)
i = (x + 10) | 0; // Ensure int32
f = y + 3.14; // Already double
return +f; // Return double
}
Type coercion operators:
| Operator | Type | Example |
|---|---|---|
| \| 0 | int32 | x = x \| 0 |
| + (unary) | double | y = +y |
| ~~ | int32 | x = ~~x |
| >>> 0 | uint32 | x = x >>> 0 |
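These operators are ordinary JavaScript, so their effect can be checked in any engine without "use asm". The sketch below collects one example per operator; the property names in coerced are illustrative only.

```javascript
// Each coercion forces a value into one numeric representation.
const coerced = {
  truncated: 3.7 | 0,      // 3: fractional part dropped
  wrapped: (2 ** 31) | 0,  // -2147483648: wraps around at 32 bits
  asDouble: +'3.5',        // 3.5: unary plus converts to a number
  towardZero: ~~-3.7,      // -3: double bitwise NOT truncates toward zero
  unsigned: -1 >>> 0,      // 4294967295: same bits read as uint32
};
console.log(coerced);
```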
Memory Model
asm.js uses typed array views over ArrayBuffer:
function MemoryModule(stdlib, foreign, heap) {
"use asm";
var HEAP8 = new stdlib.Int8Array(heap);
var HEAP16 = new stdlib.Int16Array(heap);
var HEAP32 = new stdlib.Int32Array(heap);
var HEAPF32 = new stdlib.Float32Array(heap);
var HEAPF64 = new stdlib.Float64Array(heap);
function writeInt32(offset, value) {
offset = offset | 0;
value = value | 0;
HEAP32[offset >> 2] = value;
}
function readInt32(offset) {
offset = offset | 0;
return HEAP32[offset >> 2] | 0;
}
function writeFloat64(offset, value) {
offset = offset | 0;
value = +value;
HEAPF64[offset >> 3] = value;
}
return {
writeInt32: writeInt32,
readInt32: readInt32,
writeFloat64: writeFloat64
};
}
Byte offset calculations:
Int32: offset >> 2 (divide by 4)
Float64: offset >> 3 (divide by 8)
Int16: offset >> 1 (divide by 2)
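The shift rules can be verified outside asm.js by writing through typed-array views and reading the same bytes back with a DataView, which takes byte offsets directly. This sketch assumes a little-endian host (true of virtually all current hardware); the variable names are illustrative.

```javascript
const buffer = new ArrayBuffer(32);
const HEAP32 = new Int32Array(buffer);
const HEAPF64 = new Float64Array(buffer);
const view = new DataView(buffer); // independent way to read raw bytes

const intPtr = 8;               // byte offset, must be a multiple of 4
HEAP32[intPtr >> 2] = 123;      // 8 >> 2 selects element 2
const doublePtr = 16;           // byte offset, must be a multiple of 8
HEAPF64[doublePtr >> 3] = 1.5;  // 16 >> 3 selects element 2

// The second argument `true` asks DataView for little-endian decoding.
const intRead = view.getInt32(intPtr, true);
const doubleRead = view.getFloat64(doublePtr, true);
console.log(intRead, doubleRead);
```

Both views share one buffer, so the same byte offset maps to a different element index depending on the element size.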
Emscripten: C/C++ to asm.js
Emscripten compiles C/C++ to asm.js using LLVM.
Compilation pipeline:
C/C++ source
↓
Clang (frontend)
↓
LLVM IR (intermediate representation)
↓
LLVM optimization passes
↓
Emscripten backend
↓
asm.js output
Example C code:
// example.c
#include <stdio.h>
#include <emscripten.h>
EMSCRIPTEN_KEEPALIVE
int fibonacci(int n) {
if (n <= 1) return n;
return fibonacci(n - 1) + fibonacci(n - 2);
}
int main() {
printf("fib(10) = %d\n", fibonacci(10));
return 0;
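}
Compile (current Emscripten releases emit WebAssembly by default; -s WASM=0 requests JavaScript output instead):
emcc example.c -o example.js -s WASM=0 -s EXPORTED_FUNCTIONS='["_fibonacci"]' -s EXPORTED_RUNTIME_METHODS='["ccall"]'
Generated output (simplified):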
// example.js (thousands of lines)
var Module = {
// ... runtime setup ...
};
function _fibonacci(n) {
n = n | 0;
var $0 = 0, $1 = 0;
if ((n | 0) <= 1) {
return n | 0;
}
$0 = _fibonacci((n - 1) | 0) | 0;
$1 = _fibonacci((n - 2) | 0) | 0;
return ($0 + $1) | 0;
}
// Exported interface
Module.ccall = function(name, returnType, argTypes, args) {
// ... calling convention ...
};
Usage from JavaScript:
// Load generated module
const result = Module.ccall('fibonacci', 'number', ['number'], [10]);
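console.log(result); // 55
Performance characteristics:
Roughly 2-4x slower than native C/C++, depending on workload and engine
Often 2-10x faster than equivalent regular JavaScript on numeric workloads
Predictable: engines that validate asm.js ahead of time avoid most JIT warm-up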
WebAssembly: The Binary Compilation Target
WebAssembly vs. asm.js
WebAssembly (WASM) evolved from the asm.js concept as a true binary format rather than a JavaScript subset.
Key differences:
| asm.js | WebAssembly |
|---|---|
| Text format (JavaScript) | Binary format (.wasm) |
| Large file size | Compact (~50% smaller) |
| Parsing overhead | Fast decode/validate |
| ~2x slower than native | ~1.5x slower than native |
| JIT compilation | AOT compilation |
| JavaScript subset | Independent bytecode |
WASM Module Structure
Binary format (.wasm) consists of sections:
┌───────────────────────────────────────┐
│ WebAssembly Module                    │
├───────────────────────────────────────┤
│ Magic number: 0x00 0x61 0x73 0x6d     │
│ Version:      0x01 0x00 0x00 0x00     │
├───────────────────────────────────────┤
│ 1. Type Section                       │
│    Function signatures                │
├───────────────────────────────────────┤
│ 2. Import Section                     │
│    External functions/memory          │
├───────────────────────────────────────┤
│ 3. Function Section                   │
│    Function type indices              │
├───────────────────────────────────────┤
│ 4. Memory Section                     │
│    Linear memory definition           │
├───────────────────────────────────────┤
│ 5. Export Section                     │
│    Exported functions/memory          │
├───────────────────────────────────────┤
│ 6. Code Section                       │
│    Function bodies (bytecode)         │
├───────────────────────────────────────┤
│ 7. Data Section                       │
│    Initial memory contents            │
└───────────────────────────────────────┘
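To make the layout concrete, the following sketch hand-assembles a module that exports a single `add` function and instantiates it with the synchronous WebAssembly API. The byte values follow the WebAssembly 1.0 binary encoding. Note that the section IDs in the binary (1 = type, 3 = function, 7 = export, 10 = code) are not the list positions used in the diagram above, which leaves out the table, global, start, and element sections.

```javascript
// Hand-assembled module, equivalent to:
//   (module (func (export "add") (param i32 i32) (result i32)
//     local.get 0 local.get 1 i32.add))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,             // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00,             // version 1
  // Type section (id 1), 7 bytes: one signature (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // Function section (id 3), 2 bytes: one function using type index 0
  0x03, 0x02, 0x01, 0x00,
  // Export section (id 7), 7 bytes: export function 0 under the name "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // Code section (id 10), 9 bytes: one 7-byte body with no extra locals
  0x0a, 0x09, 0x01, 0x07, 0x00,
  0x20, 0x00,                         // local.get 0
  0x20, 0x01,                         // local.get 1
  0x6a,                               // i32.add
  0x0b                                // end
]);

console.log(WebAssembly.validate(bytes));  // true
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
console.log(instance.exports.add(5, 3));   // 8
```

Synchronous compilation is used here only because the module is tiny; larger modules are normally loaded with the asynchronous APIs.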
Text Format (WAT)
WebAssembly Text format for human readability:
;; example.wat
(module
;; Import console.log from JavaScript
(import "env" "log" (func $log (param i32)))
;; Define memory (1 page = 64KB)
(memory 1)
;; Export memory to JavaScript
(export "memory" (memory 0))
;; Function: add two numbers
(func $add (param $x i32) (param $y i32) (result i32)
local.get $x
local.get $y
i32.add
)
;; Export add function
(export "add" (func $add))
;; Function: fibonacci
(func $fib (param $n i32) (result i32)
(local $a i32)
(local $b i32)
(local $temp i32)
(local $i i32)
;; Base cases
(if (i32.le_s (local.get $n) (i32.const 1))
(then
(return (local.get $n))
)
)
;; Initialize
(local.set $a (i32.const 0))
(local.set $b (i32.const 1))
(local.set $i (i32.const 2))
;; Loop
(block $break
(loop $continue
;; temp = a + b
(local.set $temp
(i32.add (local.get $a) (local.get $b))
)
;; a = b
(local.set $a (local.get $b))
;; b = temp
(local.set $b (local.get $temp))
;; i++
(local.set $i
(i32.add (local.get $i) (i32.const 1))
)
;; if i <= n, continue
(br_if $continue
(i32.le_s (local.get $i) (local.get $n))
)
)
)
(local.get $b)
)
(export "fib" (func $fib))
)
Compile WAT to WASM:
wat2wasm example.wat -o example.wasm
Loading and Using WASM in JavaScript
Basic loading:
// Fetch and instantiate
const response = await fetch('example.wasm');
const buffer = await response.arrayBuffer();
const module = await WebAssembly.instantiate(buffer, {
env: {
log: (x) => console.log('WASM log:', x)
}
});
const { add, fib, memory } = module.instance.exports;
console.log(add(5, 3)); // 8
console.log(fib(10)); // 55
// Access linear memory
const view = new Uint8Array(memory.buffer);
console.log(view[0]); // First byte of memory
Streaming compilation (more efficient):
const { instance } = await WebAssembly.instantiateStreaming(
fetch('example.wasm'),
{
env: {
log: console.log
}
}
);
const { add, fib } = instance.exports;
Memory Sharing Between JS and WASM
Linear memory is shared via ArrayBuffer:
// JavaScript side
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64KB
const { writeString } = await WebAssembly.instantiateStreaming(
fetch('string.wasm'),
{
env: { memory }
}
).then(m => m.instance.exports);
// Write string to WASM memory
const encoder = new TextEncoder();
const text = "Hello, WASM!";
const bytes = encoder.encode(text);
const view = new Uint8Array(memory.buffer);
view.set(bytes, 0); // Write at offset 0
// Call WASM function (processes string in memory)
const length = bytes.length;
writeString(0, length);
// Read back result
const resultBytes = view.slice(0, length);
const decoder = new TextDecoder();
console.log(decoder.decode(resultBytes));
WAT code for string processing:
(module
(memory (import "env" "memory") 1)
(func $writeString (param $offset i32) (param $length i32)
(local $i i32)
(local $byte i32)
(local.set $i (i32.const 0))
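(block $break
(loop $continue
;; Exit once i >= length (also covers length == 0)
(br_if $break
(i32.ge_u (local.get $i) (local.get $length))
)
;; Get byte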
(local.set $byte
(i32.load8_u (i32.add (local.get $offset) (local.get $i)))
)
;; Convert to uppercase (if lowercase letter)
(if (i32.and
(i32.ge_u (local.get $byte) (i32.const 97))
(i32.le_u (local.get $byte) (i32.const 122))
)
(then
(local.set $byte
(i32.sub (local.get $byte) (i32.const 32))
)
(i32.store8
(i32.add (local.get $offset) (local.get $i))
(local.get $byte)
)
)
)
;; Increment
(local.set $i (i32.add (local.get $i) (i32.const 1)))
;; Loop condition
(br_if $continue
(i32.lt_u (local.get $i) (local.get $length))
)
)
)
)
(export "writeString" (func $writeString))
)
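One consequence of sharing memory through an `ArrayBuffer` is worth spelling out: when linear memory grows, the old buffer is detached and any typed-array view created earlier becomes empty. A minimal sketch using only the standard `WebAssembly.Memory` API (no module is needed to observe this; the variable names are illustrative):

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 }); // sizes in 64 KiB pages

const staleView = new Uint8Array(memory.buffer);
staleView[0] = 42;
console.log(staleView.length);   // 65536

memory.grow(1);                  // add one page; the previous ArrayBuffer is detached

console.log(staleView.length);   // 0 (a view over a detached buffer is empty)
const freshView = new Uint8Array(memory.buffer);
console.log(freshView.length);   // 131072
console.log(freshView[0]);       // 42 (contents are preserved across the grow)
```

For this reason, code that allows memory growth should re-create its views from `memory.buffer` after any call that might allocate.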
Compiling C/C++ to WASM with Emscripten
Modern Emscripten targets WASM instead of asm.js:
// example.c
#include <emscripten.h>
#include <math.h>
#include <stdlib.h>  // malloc, free
EMSCRIPTEN_KEEPALIVE
double calculate(double x, double y) {
return sqrt(x * x + y * y);
}
EMSCRIPTEN_KEEPALIVE
int* createArray(int size) {
int* arr = (int*)malloc(size * sizeof(int));
for (int i = 0; i < size; i++) {
arr[i] = i * i;
}
return arr;
}
EMSCRIPTEN_KEEPALIVE
void freeArray(int* arr) {
free(arr);
}
Compile to WASM (MODULARIZE and EXPORT_NAME make the glue code expose the createModule() factory used below):
emcc example.c -o example.js \
-s WASM=1 \
-s MODULARIZE=1 \
-s EXPORT_NAME=createModule \
-s EXPORTED_FUNCTIONS='["_calculate","_createArray","_freeArray"]' \
-s EXPORTED_RUNTIME_METHODS='["ccall","cwrap"]' \
-s ALLOW_MEMORY_GROWTH=1
Generated files:
example.wasm: Binary module
example.js: JavaScript glue code (loading, memory management, exports)
Usage:
// Load generated module
const Module = await createModule();
// Call C function directly
const result = Module._calculate(3.0, 4.0);
console.log(result); // 5.0
// Or use ccall wrapper
const result2 = Module.ccall(
'calculate', // Function name
'number', // Return type
['number', 'number'], // Argument types
[3.0, 4.0] // Arguments
);
// Create array in WASM memory
const ptr = Module._createArray(10);
// Access array through Emscripten's Int32Array view of linear memory
// (recent Emscripten releases may require exporting HEAP32 explicitly)
const offset = ptr >> 2; // Convert byte pointer to int32 index
for (let i = 0; i < 10; i++) {
console.log(Module.HEAP32[offset + i]); // 0, 1, 4, 9, 16, ...
}
// Free memory
Module._freeArray(ptr);
Compiling Rust to WASM
Rust has excellent WASM support:
// src/lib.rs
use wasm_bindgen::prelude::*;
// Export to JavaScript
#[wasm_bindgen]
pub fn greet(name: &str) -> String {
format!("Hello, {}!", name)
}
#[wasm_bindgen]
pub fn fibonacci(n: u32) -> u32 {
match n {
0 => 0,
1 => 1,
_ => fibonacci(n - 1) + fibonacci(n - 2),
}
}
#[wasm_bindgen]
pub struct Point {
x: f64,
y: f64,
}
#[wasm_bindgen]
impl Point {
#[wasm_bindgen(constructor)]
pub fn new(x: f64, y: f64) -> Point {
Point { x, y }
}
pub fn distance(&self, other: &Point) -> f64 {
let dx = self.x - other.x;
let dy = self.y - other.y;
(dx * dx + dy * dy).sqrt()
}
}
Build:
# Install wasm-pack
cargo install wasm-pack
# Build for web
wasm-pack build --target web
Generated files (in pkg/):
example_bg.wasm: WASM module
example.js: JavaScript bindings
example.d.ts: TypeScript definitions
Usage:
import init, { greet, fibonacci, Point } from './pkg/example.js';
// Initialize WASM module
await init();
// Call functions
console.log(greet("Alice")); // "Hello, Alice!"
console.log(fibonacci(10)); // 55
// Use exported class
const p1 = new Point(0, 0);
const p2 = new Point(3, 4);
console.log(p1.distance(p2)); // 5.0
Performance Considerations
When to Use WASM
Good use cases:
CPU-intensive computations (image processing, cryptography, compression)
Existing C/C++/Rust codebases
Predictable performance requirements
Games and simulations
Audio/video processing
Poor use cases:
DOM manipulation (WASM has no direct DOM access, so every DOM call goes through JavaScript anyway)
Simple logic (overhead of WASM call not justified)
Frequent string operations (encoding/decoding overhead)
Heavy JS interop (boundary crossing has cost)
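The string case deserves a concrete illustration. In the baseline WebAssembly feature set, function signatures use numeric types only, so a JavaScript string has to be encoded to bytes, copied into linear memory, and decoded again on the way back. A sketch of that round trip using the standard `TextEncoder`/`TextDecoder` APIs; the helper names `writeUtf8` and `readUtf8` are illustrative, not part of any toolchain:

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Copy a JS string into linear memory as UTF-8; returns the byte length.
function writeUtf8(text, offset) {
  const bytes = encoder.encode(text);                // allocation + encoding pass
  new Uint8Array(memory.buffer).set(bytes, offset);  // copy into WASM memory
  return bytes.length;
}

// Decode UTF-8 bytes from linear memory back into a JS string.
function readUtf8(offset, length) {
  return decoder.decode(new Uint8Array(memory.buffer, offset, length));
}

const byteLength = writeUtf8('héllo', 0);
console.log(byteLength);               // 6 ('é' takes two bytes in UTF-8)
console.log(readUtf8(0, byteLength));  // héllo
```

Each direction costs one pass over the data plus a copy, which is why string-heavy workloads tend to see little benefit from moving into WASM.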
Benchmark: JS vs. WASM
Fibonacci (recursive):
// JavaScript
function fib(n) {
if (n <= 1) return n;
return fib(n - 1) + fib(n - 2);
}
console.time('JS fib(40)');
console.log(fib(40));
console.timeEnd('JS fib(40)');
// ~800ms (varies by engine)
// C compiled to WASM
int fib(int n) {
if (n <= 1) return n;
return fib(n - 1) + fib(n - 2);
}
// From JavaScript
console.time('WASM fib(40)');
console.log(Module._fib(40));
console.timeEnd('WASM fib(40)');
// ~400ms (roughly 2x faster; varies by engine)
Image processing (pixel manipulation):
// JavaScript
function grayscale(imageData) {
const data = imageData.data;
for (let i = 0; i < data.length; i += 4) {
const avg = (data[i] + data[i+1] + data[i+2]) / 3;
data[i] = data[i+1] = data[i+2] = avg;
}
}
// 1920×1080 image: ~15ms
// C compiled to WASM
void grayscale(unsigned char* data, int length) {
for (int i = 0; i < length; i += 4) {
unsigned char avg = (data[i] + data[i+1] + data[i+2]) / 3;
data[i] = data[i+1] = data[i+2] = avg;
}
}
// 1920×1080 image: ~5ms (roughly 3x faster; varies by engine)
Call Overhead
Crossing the JS/WASM boundary has a cost:
// Many small calls: SLOWER
for (let i = 0; i < 1000000; i++) {
Module._add(i, i); // Call overhead × 1M
}
// One large call: FASTER
const result = Module._processArray(ptr, 1000000);
Rule of thumb: Minimize boundary crossings. Do bulk work in WASM.
Source Maps and Debugging
Source Maps
Source maps map generated code back to original source.
TypeScript example:
tsc --sourceMap example.ts
Generated files:
example.js: Compiled JavaScript
example.js.map: Source map
Source map structure (JSON):
{
"version": 3,
"file": "example.js",
"sourceRoot": "",
"sources": ["example.ts"],
"names": [],
"mappings": "AAAA,IAAM,KAAK,GAAG,CAAC,CAAC,EAAE,CAAC..."
}
Browser support: Modern browsers automatically load source maps and show the original TypeScript in DevTools.
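The `mappings` string is a sequence of Base64 VLQ segments, separated by commas within a generated line and by semicolons between lines. Each segment decodes to a short list of integers (generated column, source index, source line, source column, and optionally a name index), mostly expressed as deltas from the previous segment. A minimal decoder sketch; the function name `decodeVlqSegment` is an assumption for illustration, and the bit arithmetic only handles values that fit in 32 bits:

```javascript
const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one comma-delimited segment of a source map "mappings" string.
function decodeVlqSegment(segment) {
  const values = [];
  let value = 0;
  let shift = 0;
  for (const ch of segment) {
    const digit = BASE64.indexOf(ch);
    if (digit === -1) throw new Error(`Invalid Base64 character '${ch}'`);
    value |= (digit & 31) << shift;      // low 5 bits carry data
    if (digit & 32) {
      shift += 5;                        // bit 5 set: more digits follow
    } else {
      const negative = value & 1;        // lowest bit of the number is the sign
      const magnitude = value >>> 1;
      values.push(negative ? -magnitude : magnitude);
      value = 0;
      shift = 0;
    }
  }
  return values;
}

console.log(decodeVlqSegment('AAAA').join(','));  // 0,0,0,0
console.log(decodeVlqSegment('IAAM').join(','));  // 4,0,0,6
console.log(decodeVlqSegment('gB').join(','));    // 16
```

Read against the example above, `IAAM` says the second segment starts 4 columns after the first in the generated file and 6 columns after it in the same source line.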
Debugging WASM
DWARF debug info (C/C++):
emcc example.c -o example.js -g
# Includes source-level debug information
Chrome DevTools supports WASM debugging:
Set breakpoints in WASM code
Inspect local variables
Step through instructions
View call stack
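Console:
// List a compiled module's exports as { name, kind } descriptors
// (this does not produce the text representation)
WebAssembly.Module.exports(module);
// Exported WASM functions stringify as native code, not as a disassembly;
// use the DevTools Sources panel to view the WASM text
console.log(instance.exports.add.toString());
Tooling Ecosystem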
Build Tools
Webpack:
// webpack.config.js
module.exports = {
entry: './src/index.js',
output: {
filename: 'bundle.js'
},
module: {
rules: [
{
test: /\.tsx?$/,
use: 'ts-loader',
exclude: /node_modules/
},
{
test: /\.jsx?$/,
use: 'babel-loader'
}
]
},
experiments: {
asyncWebAssembly: true
}
};
Vite (modern, faster):
// vite.config.js
export default {
build: {
target: 'es2020'
},
optimizeDeps: {
exclude: ['example.wasm']
}
};
Package Managers
npm/yarn support WASM packages:
npm install @example/wasm-package
import init, { process } from '@example/wasm-package';
await init();
const result = process(data);
Summary
JavaScript as a compilation target has transformed web development:
Transpilers (source-to-source):
TypeScript: Static typing, compile-time checks
Babel: Modern JS → Legacy JS, polyfills
JSX: Declarative UI → React.createElement
Output is readable, high-level JavaScript
Compilers (low-level to JS/WASM):
asm.js: Optimizable JS subset, manual memory management
WebAssembly: Binary format, near-native performance
Emscripten: C/C++ → WASM/asm.js
Rust: First-class WASM support via wasm-bindgen
Key insights:
WASM is not a replacement for JavaScript, but a complement
Use WASM for CPU-intensive tasks
Use JS for DOM manipulation and high-level logic
Minimize JS ↔︎ WASM boundary crossings
Source maps enable debugging of generated code
Evolution: Hand-written JS → Transpiled JS → asm.js → WebAssembly
The future is polyglot: write in any language, compile to WASM, interoperate seamlessly with JavaScript in the browser runtime defined by ECMA-262 and extended by Web APIs.
Chapter 12: Building a Source-to-Source Compiler
Introduction: Anatomy of a Transpiler
A source-to-source compiler (transpiler) transforms code from one high-level language to another. Unlike traditional compilers that target machine code, transpilers maintain the abstraction level.
This chapter builds a complete mini-transpiler that converts a simple ML-like language to JavaScript.
Source language features:
let x = 5
let y = x + 3
let square = fn(n) => n * n
print(square(y))
Target (JavaScript):
const x = 5;
const y = x + 3;
const square = (n) => n * n;
console.log(square(y));
Compiler pipeline:
┌─────────────────────────────────────────────────────┐
│                 Compilation Phases                  │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Source Code                                        │
│       ↓                                             │
│  1. Lexical Analysis (Lexer/Tokenizer)              │
│       └→ Token stream                               │
│       ↓                                             │
│  2. Syntax Analysis (Parser)                        │
│       └→ Abstract Syntax Tree (AST)                 │
│       ↓                                             │
│  3. Semantic Analysis (optional)                    │
│       └→ Annotated AST / Symbol tables              │
│       ↓                                             │
│  4. Optimization (optional)                         │
│       └→ Transformed AST                            │
│       ↓                                             │
│  5. Code Generation (Emitter)                       │
│       └→ Target code                                │
│                                                     │
└─────────────────────────────────────────────────────┘
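Before building each phase in full, it may help to see the mandatory phases (1, 2, and 5) end to end on a deliberately tiny language: integer arithmetic with `+`, `*`, and parentheses. This is a sketch for orientation only; the function names `tokenize`, `parse`, `emit`, and `compile` are illustrative and are not the classes developed in the rest of the chapter.

```javascript
// Phase 1: lexer. "1 + 2 * 3" -> NUM(1) + NUM(2) * NUM(3) EOF
function tokenize(source) {
  const tokens = [];
  for (const [text] of source.matchAll(/\d+|\S/g)) {
    if (/^\d+$/.test(text)) tokens.push({ type: 'NUM', value: Number(text) });
    else if ('+*()'.includes(text)) tokens.push({ type: text });
    else throw new Error(`Unexpected character '${text}'`);
  }
  tokens.push({ type: 'EOF' });
  return tokens;
}

// Phase 2: recursive descent parser. One function per precedence level,
// so '*' binds tighter than '+' purely through the call structure.
function parse(tokens) {
  let pos = 0;
  const peek = () => tokens[pos].type;
  const next = () => tokens[pos++];
  function primary() {
    const tok = next();
    if (tok.type === 'NUM') return { type: 'Num', value: tok.value };
    if (tok.type === '(') {
      const inner = sum();
      if (next().type !== ')') throw new Error("Expected ')'");
      return inner;
    }
    throw new Error(`Unexpected token ${tok.type}`);
  }
  function product() {
    let left = primary();
    while (peek() === '*') { next(); left = { type: 'Bin', op: '*', left, right: primary() }; }
    return left;
  }
  function sum() {
    let left = product();
    while (peek() === '+') { next(); left = { type: 'Bin', op: '+', left, right: product() }; }
    return left;
  }
  const ast = sum();
  if (peek() !== 'EOF') throw new Error(`Unexpected token ${peek()}`);
  return ast;
}

// Phase 5: emitter. Walk the AST and print fully parenthesized JavaScript.
function emit(node) {
  if (node.type === 'Num') return String(node.value);
  return `(${emit(node.left)} ${node.op} ${emit(node.right)})`;
}

const compile = (source) => emit(parse(tokenize(source)));
console.log(compile('1 + 2 * 3'));    // (1 + (2 * 3))
console.log(compile('(1 + 2) * 3'));  // ((1 + 2) * 3)
```

The full compiler below follows the same shape, with richer tokens, more AST node types, and an optional semantic pass between parsing and emission.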
Phase 1: Lexical Analysis (Tokenization)
Token Definition
A token is the smallest meaningful unit of source code.
Token types:
const TokenType = {
// Literals
NUMBER: 'NUMBER',
STRING: 'STRING',
IDENTIFIER: 'IDENTIFIER',
// Keywords
LET: 'LET',
FN: 'FN',
IF: 'IF',
ELSE: 'ELSE',
RETURN: 'RETURN',
PRINT: 'PRINT',
// Operators
PLUS: 'PLUS',
MINUS: 'MINUS',
STAR: 'STAR',
SLASH: 'SLASH',
PERCENT: 'PERCENT',
EQUAL: 'EQUAL',
EQUAL_EQUAL: 'EQUAL_EQUAL',
BANG_EQUAL: 'BANG_EQUAL',
LESS: 'LESS',
LESS_EQUAL: 'LESS_EQUAL',
GREATER: 'GREATER',
GREATER_EQUAL: 'GREATER_EQUAL',
// Delimiters
LPAREN: 'LPAREN',
RPAREN: 'RPAREN',
LBRACE: 'LBRACE',
RBRACE: 'RBRACE',
COMMA: 'COMMA',
ARROW: 'ARROW',
// Special
NEWLINE: 'NEWLINE',
EOF: 'EOF'
};
class Token {
constructor(type, value, line, column) {
this.type = type;
this.value = value;
this.line = line;
this.column = column;
}
toString() {
return `Token(${this.type}, ${JSON.stringify(this.value)}, ${this.line}:${this.column})`;
}
}
Lexer Implementation
The lexer scans source code character-by-character and produces tokens:
class Lexer {
constructor(source) {
this.source = source;
this.pos = 0;
this.line = 1;
this.column = 1;
this.current = this.source[0] || null;
}
// Advance to next character
advance() {
if (this.current === '\n') {
this.line++;
this.column = 1;
} else {
this.column++;
}
this.pos++;
this.current = this.pos < this.source.length ? this.source[this.pos] : null;
}
// Peek ahead without consuming
peek(offset = 1) {
const pos = this.pos + offset;
return pos < this.source.length ? this.source[pos] : null;
}
// Skip whitespace (except newlines, which are significant)
skipWhitespace() {
while (this.current && /[ \t\r]/.test(this.current)) {
this.advance();
}
}
// Skip comments
skipComment() {
if (this.current === '#') {
while (this.current && this.current !== '\n') {
this.advance();
}
}
}
// Read number literal
readNumber() {
const startLine = this.line;
const startColumn = this.column;
let numStr = '';
while (this.current && /[0-9]/.test(this.current)) {
numStr += this.current;
this.advance();
}
// Handle decimal point
if (this.current === '.' && /[0-9]/.test(this.peek())) {
numStr += this.current;
this.advance();
while (this.current && /[0-9]/.test(this.current)) {
numStr += this.current;
this.advance();
}
}
return new Token(
TokenType.NUMBER,
parseFloat(numStr),
startLine,
startColumn
);
}
// Read string literal
readString() {
const startLine = this.line;
const startColumn = this.column;
const quote = this.current;
let str = '';
this.advance(); // Skip opening quote
while (this.current && this.current !== quote) {
if (this.current === '\\') {
this.advance();
// Handle escape sequences
switch (this.current) {
case 'n': str += '\n'; break;
case 't': str += '\t'; break;
case 'r': str += '\r'; break;
case '\\': str += '\\'; break;
case quote: str += quote; break;
default:
throw new Error(
`Invalid escape sequence \\${this.current} at ${startLine}:${startColumn}`
);
}
this.advance();
} else {
str += this.current;
this.advance();
}
}
if (!this.current) {
throw new Error(`Unterminated string at ${startLine}:${startColumn}`);
}
this.advance(); // Skip closing quote
return new Token(TokenType.STRING, str, startLine, startColumn);
}
// Read identifier or keyword
readIdentifier() {
const startLine = this.line;
const startColumn = this.column;
let ident = '';
while (this.current && /[a-zA-Z0-9_]/.test(this.current)) {
ident += this.current;
this.advance();
}
// Check if keyword
const keywords = {
'let': TokenType.LET,
'fn': TokenType.FN,
'if': TokenType.IF,
'else': TokenType.ELSE,
'return': TokenType.RETURN,
'print': TokenType.PRINT
};
const type = keywords[ident] || TokenType.IDENTIFIER;
return new Token(type, ident, startLine, startColumn);
}
// Get next token
nextToken() {
while (this.current) {
// Skip whitespace
if (/[ \t\r]/.test(this.current)) {
this.skipWhitespace();
continue;
}
// Skip comments
if (this.current === '#') {
this.skipComment();
continue;
}
const line = this.line;
const column = this.column;
// Newline
if (this.current === '\n') {
this.advance();
return new Token(TokenType.NEWLINE, '\\n', line, column);
}
// Numbers
if (/[0-9]/.test(this.current)) {
return this.readNumber();
}
// Strings
if (this.current === '"' || this.current === "'") {
return this.readString();
}
// Identifiers and keywords
if (/[a-zA-Z_]/.test(this.current)) {
return this.readIdentifier();
}
// Two-character operators
if (this.current === '=' && this.peek() === '>') {
this.advance();
this.advance();
return new Token(TokenType.ARROW, '=>', line, column);
}
if (this.current === '=' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.EQUAL_EQUAL, '==', line, column);
}
if (this.current === '!' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.BANG_EQUAL, '!=', line, column);
}
if (this.current === '<' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.LESS_EQUAL, '<=', line, column);
}
if (this.current === '>' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.GREATER_EQUAL, '>=', line, column);
}
// Single-character tokens
const singleChar = {
'+': TokenType.PLUS,
'-': TokenType.MINUS,
'*': TokenType.STAR,
'/': TokenType.SLASH,
'%': TokenType.PERCENT,
'=': TokenType.EQUAL,
'<': TokenType.LESS,
'>': TokenType.GREATER,
'(': TokenType.LPAREN,
')': TokenType.RPAREN,
'{': TokenType.LBRACE,
'}': TokenType.RBRACE,
',': TokenType.COMMA
};
if (this.current in singleChar) {
const token = new Token(
singleChar[this.current],
this.current,
line,
column
);
this.advance();
return token;
}
throw new Error(
`Unexpected character '${this.current}' at ${line}:${column}`
);
}
return new Token(TokenType.EOF, null, this.line, this.column);
}
// Tokenize entire source
tokenize() {
const tokens = [];
let token;
do {
token = this.nextToken();
// Skip newlines for simplicity (could be kept for semicolon inference)
if (token.type !== TokenType.NEWLINE) {
tokens.push(token);
}
} while (token.type !== TokenType.EOF);
return tokens;
}
}
Example usage:
const source = `
let x = 42
let greeting = "Hello, World!"
let add = fn(a, b) => a + b
print(add(x, 8))
`;
const lexer = new Lexer(source);
const tokens = lexer.tokenize();
tokens.forEach(token => console.log(token.toString()));
// Output:
// Token(LET, "let", 2:1)
// Token(IDENTIFIER, "x", 2:5)
// Token(EQUAL, "=", 2:7)
// Token(NUMBER, 42, 2:9)
// Token(LET, "let", 3:1)
// Token(IDENTIFIER, "greeting", 3:5)
// Token(EQUAL, "=", 3:14)
// Token(STRING, "Hello, World!", 3:16)
// Token(LET, "let", 4:1)
// Token(IDENTIFIER, "add", 4:5)
// Token(EQUAL, "=", 4:9)
// Token(FN, "fn", 4:11)
// ...
Phase 2: Syntax Analysis (Parsing)
Abstract Syntax Tree (AST)
The AST represents the hierarchical structure of the program.
AST node types:
// Base class
class ASTNode {
constructor(type) {
this.type = type;
}
}
// Program (root node)
class Program extends ASTNode {
constructor(statements) {
super('Program');
this.statements = statements;
}
}
// Variable declaration: let x = 5
class LetStatement extends ASTNode {
constructor(name, value) {
super('LetStatement');
this.name = name; // Identifier node
this.value = value; // Expression node
}
}
// Identifier: x
class Identifier extends ASTNode {
constructor(name) {
super('Identifier');
this.name = name;
}
}
// Number literal: 42
class NumberLiteral extends ASTNode {
constructor(value) {
super('NumberLiteral');
this.value = value;
}
}
// String literal: "hello"
class StringLiteral extends ASTNode {
constructor(value) {
super('StringLiteral');
this.value = value;
}
}
// Binary operation: a + b
class BinaryExpression extends ASTNode {
constructor(left, operator, right) {
super('BinaryExpression');
this.left = left;
this.operator = operator;
this.right = right;
}
}
// Function call: foo(a, b)
class CallExpression extends ASTNode {
constructor(callee, args) {
super('CallExpression');
this.callee = callee;
this.args = args;
}
}
// Function expression: fn(x) => x * 2
class FunctionExpression extends ASTNode {
constructor(params, body) {
super('FunctionExpression');
this.params = params; // Array of Identifier
this.body = body; // Expression or BlockStatement
}
}
// Block statement: { ... }
class BlockStatement extends ASTNode {
constructor(statements) {
super('BlockStatement');
this.statements = statements;
}
}
// If statement: if (cond) { ... } else { ... }
class IfStatement extends ASTNode {
constructor(condition, consequent, alternate) {
super('IfStatement');
this.condition = condition;
this.consequent = consequent;
this.alternate = alternate; // Can be null
}
}
// Return statement: return expr
class ReturnStatement extends ASTNode {
constructor(value) {
super('ReturnStatement');
this.value = value;
}
}
// Print statement: print(expr)
class PrintStatement extends ASTNode {
constructor(expression) {
super('PrintStatement');
this.expression = expression;
}
}
Recursive Descent Parser
The parser uses recursive descent to build the AST:
class Parser {
constructor(tokens) {
this.tokens = tokens;
this.pos = 0;
this.current = this.tokens[0];
}
// Advance to next token
advance() {
this.pos++;
this.current = this.pos < this.tokens.length ? this.tokens[this.pos] : null;
}
// Check current token type
check(type) {
return this.current && this.current.type === type;
}
// Consume expected token or throw error
expect(type, message) {
if (!this.check(type)) {
throw new Error(
message || `Expected ${type} but got ${this.current?.type} at ${this.current?.line}:${this.current?.column}`
);
}
const token = this.current;
this.advance();
return token;
}
// Parse entire program
parse() {
const statements = [];
while (!this.check(TokenType.EOF)) {
statements.push(this.parseStatement());
}
return new Program(statements);
}
// Parse statement
parseStatement() {
if (this.check(TokenType.LET)) {
return this.parseLetStatement();
}
if (this.check(TokenType.PRINT)) {
return this.parsePrintStatement();
}
if (this.check(TokenType.IF)) {
return this.parseIfStatement();
}
if (this.check(TokenType.RETURN)) {
return this.parseReturnStatement();
}
if (this.check(TokenType.LBRACE)) {
return this.parseBlockStatement();
}
// Expression statement (for side effects)
const expr = this.parseExpression();
return expr;
}
// Parse: let x = expr
parseLetStatement() {
this.expect(TokenType.LET);
const name = this.expect(TokenType.IDENTIFIER);
this.expect(TokenType.EQUAL);
const value = this.parseExpression();
return new LetStatement(
new Identifier(name.value),
value
);
}
// Parse: print(expr)
parsePrintStatement() {
this.expect(TokenType.PRINT);
this.expect(TokenType.LPAREN);
const expr = this.parseExpression();
this.expect(TokenType.RPAREN);
return new PrintStatement(expr);
}
// Parse: if (cond) { ... } else { ... }
parseIfStatement() {
this.expect(TokenType.IF);
this.expect(TokenType.LPAREN);
const condition = this.parseExpression();
this.expect(TokenType.RPAREN);
const consequent = this.parseBlockStatement();
let alternate = null;
if (this.check(TokenType.ELSE)) {
this.advance();
alternate = this.check(TokenType.IF)
? this.parseIfStatement()
: this.parseBlockStatement();
}
return new IfStatement(condition, consequent, alternate);
}
// Parse: return expr
parseReturnStatement() {
this.expect(TokenType.RETURN);
const value = this.parseExpression();
return new ReturnStatement(value);
}
// Parse: { stmt1 stmt2 ... }
parseBlockStatement() {
this.expect(TokenType.LBRACE);
const statements = [];
while (!this.check(TokenType.RBRACE) && !this.check(TokenType.EOF)) {
statements.push(this.parseStatement());
}
this.expect(TokenType.RBRACE);
return new BlockStatement(statements);
}
// Parse expression (entry point)
parseExpression() {
return this.parseComparison();
}
// Parse: expr == expr, expr != expr, etc.
parseComparison() {
let left = this.parseAddition();
while (this.check(TokenType.EQUAL_EQUAL) ||
this.check(TokenType.BANG_EQUAL) ||
this.check(TokenType.LESS) ||
this.check(TokenType.LESS_EQUAL) ||
this.check(TokenType.GREATER) ||
this.check(TokenType.GREATER_EQUAL)) {
const operator = this.current.value;
this.advance();
const right = this.parseAddition();
left = new BinaryExpression(left, operator, right);
}
return left;
}
// Parse: expr + expr, expr - expr
parseAddition() {
let left = this.parseMultiplication();
while (this.check(TokenType.PLUS) || this.check(TokenType.MINUS)) {
const operator = this.current.value;
this.advance();
const right = this.parseMultiplication();
left = new BinaryExpression(left, operator, right);
}
return left;
}
// Parse: expr * expr, expr / expr, expr % expr
parseMultiplication() {
let left = this.parseCall();
while (this.check(TokenType.STAR) ||
this.check(TokenType.SLASH) ||
this.check(TokenType.PERCENT)) {
const operator = this.current.value;
this.advance();
const right = this.parseCall();
left = new BinaryExpression(left, operator, right);
}
return left;
}
// Parse: primary(arg1, arg2, ...)
parseCall() {
let expr = this.parsePrimary();
while (this.check(TokenType.LPAREN)) {
this.advance();
const args = [];
if (!this.check(TokenType.RPAREN)) {
args.push(this.parseExpression());
while (this.check(TokenType.COMMA)) {
this.advance();
args.push(this.parseExpression());
}
}
this.expect(TokenType.RPAREN);
expr = new CallExpression(expr, args);
}
return expr;
}
// Parse primary expressions
parsePrimary() {
// Number literal
if (this.check(TokenType.NUMBER)) {
const value = this.current.value;
this.advance();
return new NumberLiteral(value);
}
// String literal
if (this.check(TokenType.STRING)) {
const value = this.current.value;
this.advance();
return new StringLiteral(value);
}
// Identifier
if (this.check(TokenType.IDENTIFIER)) {
const name = this.current.value;
this.advance();
return new Identifier(name);
}
// Function expression: fn(params) => body
if (this.check(TokenType.FN)) {
return this.parseFunctionExpression();
}
// Parenthesized expression
if (this.check(TokenType.LPAREN)) {
this.advance();
const expr = this.parseExpression();
this.expect(TokenType.RPAREN);
return expr;
}
throw new Error(
`Unexpected token ${this.current?.type} at ${this.current?.line}:${this.current?.column}`
);
}
// Parse: fn(x, y) => expr or fn(x, y) { ... }
parseFunctionExpression() {
this.expect(TokenType.FN);
this.expect(TokenType.LPAREN);
const params = [];
if (!this.check(TokenType.RPAREN)) {
params.push(new Identifier(this.expect(TokenType.IDENTIFIER).value));
while (this.check(TokenType.COMMA)) {
this.advance();
params.push(new Identifier(this.expect(TokenType.IDENTIFIER).value));
}
}
this.expect(TokenType.RPAREN);
this.expect(TokenType.ARROW);
// Arrow function with expression body
const body = this.check(TokenType.LBRACE)
? this.parseBlockStatement()
: this.parseExpression();
return new FunctionExpression(params, body);
}
}
Example usage:
const source = `
let x = 10
let double = fn(n) => n * 2
let result = double(x)
print(result)
`;
const lexer = new Lexer(source);
const tokens = lexer.tokenize();
const parser = new Parser(tokens);
const ast = parser.parse();
console.log(JSON.stringify(ast, null, 2));
Output AST (simplified):
{
"type": "Program",
"statements": [
{
"type": "LetStatement",
"name": { "type": "Identifier", "name": "x" },
"value": { "type": "NumberLiteral", "value": 10 }
},
{
"type": "LetStatement",
"name": { "type": "Identifier", "name": "double" },
"value": {
"type": "FunctionExpression",
"params": [{ "type": "Identifier", "name": "n" }],
"body": {
"type": "BinaryExpression",
"left": { "type": "Identifier", "name": "n" },
"operator": "*",
"right": { "type": "NumberLiteral", "value": 2 }
}
}
},
{
"type": "LetStatement",
"name": { "type": "Identifier", "name": "result" },
"value": {
"type": "CallExpression",
"callee": { "type": "Identifier", "name": "double" },
"args": [{ "type": "Identifier", "name": "x" }]
}
},
{
"type": "PrintStatement",
"expression": { "type": "Identifier", "name": "result" }
}
]
}
Phase 3: Semantic Analysis (Optional)
Semantic analysis checks for logical errors that syntax alone cannot catch.
Symbol Table
Track variable declarations and scopes:
class SymbolTable {
constructor(parent = null) {
this.parent = parent;
this.symbols = new Map();
}
define(name, info) {
if (this.symbols.has(name)) {
throw new Error(`Variable '${name}' already declared in this scope`);
}
this.symbols.set(name, info);
}
resolve(name) {
if (this.symbols.has(name)) {
return this.symbols.get(name);
}
if (this.parent) {
return this.parent.resolve(name);
}
return null;
}
enterScope() {
return new SymbolTable(this);
}
}
Semantic Analyzer
class SemanticAnalyzer {
constructor() {
this.globalScope = new SymbolTable();
this.currentScope = this.globalScope;
this.errors = [];
}
analyze(ast) {
this.visit(ast);
return this.errors;
}
visit(node) {
const methodName = `visit${node.type}`;
if (this[methodName]) {
return this[methodName](node);
}
throw new Error(`No visit method for ${node.type}`);
}
visitProgram(node) {
node.statements.forEach(stmt => this.visit(stmt));
}
visitLetStatement(node) {
// Report a redeclaration as a collected error instead of letting define() throw
if (this.currentScope.symbols.has(node.name.name)) {
this.errors.push(
`Variable '${node.name.name}' already declared in this scope`
);
} else {
// Define variable
this.currentScope.define(node.name.name, {
type: 'variable',
node: node
});
}
// Analyze value expression
this.visit(node.value);
}
visitIdentifier(node) {
// Check if variable is declared
const symbol = this.currentScope.resolve(node.name);
if (!symbol) {
this.errors.push(`Undefined variable '${node.name}'`);
}
}
visitBinaryExpression(node) {
this.visit(node.left);
this.visit(node.right);
}
visitCallExpression(node) {
this.visit(node.callee);
node.args.forEach(arg => this.visit(arg));
}
visitFunctionExpression(node) {
// Enter new scope for function
const previousScope = this.currentScope;
this.currentScope = this.currentScope.enterScope();
// Define parameters
node.params.forEach(param => {
this.currentScope.define(param.name, {
type: 'parameter',
node: param
});
});
// Analyze body
this.visit(node.body);
// Restore scope
this.currentScope = previousScope;
}
visitBlockStatement(node) {
const previousScope = this.currentScope;
this.currentScope = this.currentScope.enterScope();
node.statements.forEach(stmt => this.visit(stmt));
this.currentScope = previousScope;
}
visitIfStatement(node) {
this.visit(node.condition);
this.visit(node.consequent);
if (node.alternate) {
this.visit(node.alternate);
}
}
visitReturnStatement(node) {
this.visit(node.value);
}
visitPrintStatement(node) {
this.visit(node.expression);
}
visitNumberLiteral(node) {
// Nothing to check
}
visitStringLiteral(node) {
// Nothing to check
}
}
Usage:
const analyzer = new SemanticAnalyzer();
const errors = analyzer.analyze(ast);
if (errors.length > 0) {
console.error('Semantic errors:');
errors.forEach(err => console.error(' -', err));
} else {
console.log('✓ Semantic analysis passed');
}
Phase 4: Code Generation
JavaScript Code Emitter
Generate JavaScript from AST:
class JavaScriptEmitter {
constructor() {
this.indent = 0;
}
emit(ast) {
return this.visit(ast);
}
visit(node) {
const methodName = `visit${node.type}`;
if (this[methodName]) {
return this[methodName](node);
}
throw new Error(`No emit method for ${node.type}`);
}
getIndent() {
return ' '.repeat(this.indent);
}
visitProgram(node) {
return node.statements
.map(stmt => this.visit(stmt))
.join('\n') + '\n';
}
visitLetStatement(node) {
const name = this.visit(node.name);
const value = this.visit(node.value);
return `${this.getIndent()}const ${name} = ${value};`;
}
visitIdentifier(node) {
return node.name;
}
visitNumberLiteral(node) {
return String(node.value);
}
visitStringLiteral(node) {
// Escape special characters
const escaped = node.value
.replace(/\\/g, '\\\\')
.replace(/"/g, '\\"')
.replace(/\n/g, '\\n')
.replace(/\r/g, '\\r') // the lexer accepts \r escapes, so emit them too
.replace(/\t/g, '\\t');
return `"${escaped}"`;
}
visitBinaryExpression(node) {
const left = this.visit(node.left);
const right = this.visit(node.right);
// Add parentheses for clarity
return `(${left} ${node.operator} ${right})`;
}
visitCallExpression(node) {
const callee = this.visit(node.callee);
const args = node.args.map(arg => this.visit(arg)).join(', ');
return `${callee}(${args})`;
}
visitFunctionExpression(node) {
const params = node.params.map(p => this.visit(p)).join(', ');
// Single expression body
if (node.body.type !== 'BlockStatement') {
const body = this.visit(node.body);
return `(${params}) => ${body}`;
}
// Block body
this.indent++;
const body = node.body.statements
.map(stmt => this.visit(stmt))
.join('\n');
this.indent--;
return `(${params}) => {\n${body}\n${this.getIndent()}}`;
}
visitBlockStatement(node) {
this.indent++;
const statements = node.statements
.map(stmt => this.visit(stmt))
.join('\n');
this.indent--;
return `${this.getIndent()}{\n${statements}\n${this.getIndent()}}`;
}
visitIfStatement(node) {
const condition = this.visit(node.condition);
const consequent = this.visit(node.consequent);
let code = `${this.getIndent()}if (${condition}) ${consequent}`;
if (node.alternate) {
const alternate = this.visit(node.alternate);
// Else-if
if (node.alternate.type === 'IfStatement') {
code += ` else ${alternate.trimStart()}`;
} else {
code += ` else ${alternate}`;
}
}
return code;
}
visitReturnStatement(node) {
const value = this.visit(node.value);
return `${this.getIndent()}return ${value};`;
}
visitPrintStatement(node) {
const expr = this.visit(node.expression);
return `${this.getIndent()}console.log(${expr});`;
}
}
Usage:
const emitter = new JavaScriptEmitter();
const jsCode = emitter.emit(ast);
console.log(jsCode);
Output:
const x = 10;
const double = (n) => (n * 2);
const result = double(x);
console.log(result);
Complete Compiler Pipeline
Putting it all together:
class Compiler {
constructor() {
this.lexer = null;
this.parser = null;
this.analyzer = null;
this.emitter = null;
}
compile(source, options = {}) {
const {
skipSemanticAnalysis = false,
outputAST = false
} = options;
try {
// Phase 1: Lexical analysis
this.lexer = new Lexer(source);
const tokens = this.lexer.tokenize();
// Phase 2: Parsing
this.parser = new Parser(tokens);
const ast = this.parser.parse();
if (outputAST) {
console.log('AST:', JSON.stringify(ast, null, 2));
}
// Phase 3: Semantic analysis (optional)
if (!skipSemanticAnalysis) {
this.analyzer = new SemanticAnalyzer();
const errors = this.analyzer.analyze(ast);
if (errors.length > 0) {
throw new Error(
'Semantic errors:\n' + errors.map(e => ' - ' + e).join('\n')
);
}
}
// Phase 4: Code generation
this.emitter = new JavaScriptEmitter();
const jsCode = this.emitter.emit(ast);
return {
success: true,
code: jsCode,
ast: ast
};
} catch (error) {
return {
success: false,
error: error.message,
stack: error.stack
};
}
}
compileAndRun(source) {
const result = this.compile(source);
if (!result.success) {
console.error('Compilation failed:');
console.error(result.error);
return;
}
console.log('Generated JavaScript:');
console.log(result.code);
console.log('\nExecution:');
try {
eval(result.code);
} catch (error) {
console.error('Runtime error:', error.message);
}
}
}
Example usage:
const compiler = new Compiler();
const source = `
# Factorial example
let factorial = fn(n) => {
if (n <= 1) {
return 1
}
return n * factorial(n - 1)
}
let result = factorial(5)
print(result)
`;
compiler.compileAndRun(source);
Output:
Generated JavaScript:
const factorial = (n) => {
if ((n <= 1)) {
return 1;
}
return (n * factorial((n - 1)));
};
const result = factorial(5);
console.log(result);
Execution:
120
Optimization Techniques
Constant Folding
Evaluate constant expressions at compile time:
class ConstantFolder {
visit(node) {
if (node.type === 'BinaryExpression') {
return this.foldBinaryExpression(node);
}
// Recursively fold children
for (const key in node) {
if (node[key] && typeof node[key] === 'object') {
if (Array.isArray(node[key])) {
node[key] = node[key].map(child => this.visit(child));
} else if (node[key].type) {
node[key] = this.visit(node[key]);
}
}
}
return node;
}
foldBinaryExpression(node) {
// Fold operands first
node.left = this.visit(node.left);
node.right = this.visit(node.right);
// Check if both operands are literals
if (node.left.type === 'NumberLiteral' && node.right.type === 'NumberLiteral') {
const left = node.left.value;
const right = node.right.value;
let result;
switch (node.operator) {
case '+': result = left + right; break;
case '-': result = left - right; break;
case '*': result = left * right; break;
case '/': result = left / right; break;
case '%': result = left % right; break;
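// Comparisons fold to 1/0. The unfolded JavaScript would produce true/false,
// so print(1 == 1) changes its output once folded; fold to a boolean node if the language has one.
case '==': result = left === right ? 1 : 0; break;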
case '!=': result = left !== right ? 1 : 0; break;
case '<': result = left < right ? 1 : 0; break;
case '<=': result = left <= right ? 1 : 0; break;
case '>': result = left > right ? 1 : 0; break;
case '>=': result = left >= right ? 1 : 0; break;
default: return node;
}
return new NumberLiteral(result);
}
return node;
}
}
Usage:
// Before: let x = 2 + 3 * 4
// After: let x = 14
const folder = new ConstantFolder();
const optimizedAST = folder.visit(ast);
Dead Code Elimination
Remove unreachable code:
class DeadCodeEliminator {
visit(node) {
if (node.type === 'BlockStatement') {
return this.eliminateDeadCode(node);
}
// Recursively process children
for (const key in node) {
if (node[key] && typeof node[key] === 'object') {
if (Array.isArray(node[key])) {
node[key] = node[key].map(child => this.visit(child));
} else if (node[key].type) {
node[key] = this.visit(node[key]);
}
}
}
return node;
}
eliminateDeadCode(node) {
const statements = [];
let reachable = true;
for (const stmt of node.statements) {
if (!reachable) {
// Skip unreachable code
continue;
}
statements.push(this.visit(stmt));
// Return makes subsequent code unreachable
if (stmt.type === 'ReturnStatement') {
reachable = false;
}
}
node.statements = statements;
return node;
}
}
Error Handling and Reporting
Better Error Messages
class CompilerError extends Error {
constructor(message, line, column, source) {
super(message);
this.line = line;
this.column = column;
this.source = source;
}
format() {
const lines = this.source.split('\n');
const errorLine = lines[this.line - 1] ?? '';
// Offset the caret by the "<line> | " gutter so it lands under the offending column
const gutter = `${this.line} | `;
const pointer = ' '.repeat(gutter.length + this.column - 1) + '^';
return `
Error at ${this.line}:${this.column}
${this.message}
${gutter}${errorLine}
${pointer}
`;
}
}
Enhanced lexer with better errors:
throw new CompilerError(
`Unexpected character '${this.current}'`,
this.line,
this.column,
this.source
);
Testing the Compiler
Unit Tests
function test(name, fn) {
try {
fn();
console.log(`✓ ${name}`);
} catch (error) {
console.error(`✗ ${name}`);
console.error(` ${error.message}`);
}
}
function assertEquals(actual, expected) {
if (JSON.stringify(actual) !== JSON.stringify(expected)) {
throw new Error(`Expected ${JSON.stringify(expected)} but got ${JSON.stringify(actual)}`);
}
}
// Lexer tests
test('Lexer tokenizes numbers', () => {
const lexer = new Lexer('42 3.14');
const tokens = lexer.tokenize();
assertEquals(tokens[0].type, TokenType.NUMBER);
assertEquals(tokens[0].value, 42);
assertEquals(tokens[1].type, TokenType.NUMBER);
assertEquals(tokens[1].value, 3.14);
});
test('Lexer tokenizes strings', () => {
const lexer = new Lexer('"hello world"');
const tokens = lexer.tokenize();
assertEquals(tokens[0].type, TokenType.STRING);
assertEquals(tokens[0].value, 'hello world');
});
// Parser tests
test('Parser parses let statement', () => {
const tokens = new Lexer('let x = 42').tokenize();
const parser = new Parser(tokens);
const ast = parser.parse();
assertEquals(ast.statements[0].type, 'LetStatement');
assertEquals(ast.statements[0].name.name, 'x');
assertEquals(ast.statements[0].value.value, 42);
});
// Emitter tests
test('Emitter generates correct JavaScript', () => {
const tokens = new Lexer('let x = 42').tokenize();
const parser = new Parser(tokens);
const ast = parser.parse();
const emitter = new JavaScriptEmitter();
const code = emitter.emit(ast);
assertEquals(code.trim(), 'const x = 42;');
});
// Integration tests
test('Compile and execute factorial', () => {
const source = `
let fac = fn(n) => {
if (n <= 1) {
return 1
}
return n * fac(n - 1)
}
let result = fac(5)
`;
const compiler = new Compiler();
const { success, code } = compiler.compile(source);
assertEquals(success, true);
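// const/let declared inside eval stay local to the evaluated code,
// so read the answer from eval's completion value instead of an outer variable
const value = eval(code + 'result');
assertEquals(value, 120);
});
Summary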
We built a complete source-to-source compiler from scratch:
Phase 1: Lexical Analysis
Tokenize source into tokens
Handle numbers, strings, identifiers, keywords, operators
Track line/column for error messages
Phase 2: Syntax Analysis
Build Abstract Syntax Tree (AST) via recursive descent parsing
Define AST node types for all language constructs
Handle operator precedence correctly
Phase 3: Semantic Analysis (Optional)
Symbol table tracks variable declarations
Check for undefined variables
Enforce scoping rules
Phase 4: Code Generation
Traverse AST and emit JavaScript
Handle indentation and formatting
Map source constructs to target equivalents
Optimization:
Constant folding: Evaluate constants at compile-time
Dead code elimination: Remove unreachable code
Error handling:
Provide helpful error messages with source location
Format errors with context and pointer
This mini-compiler demonstrates the core principles used by production transpilers like TypeScript, Babel, and others. The same architecture scales to more complex languages with additional features like type systems, generics, and advanced optimizations.
Chapter 13: WebAssembly Fundamentals
Introduction: A New Compilation Target for the Web
WebAssembly (Wasm) is a binary instruction format designed as a portable compilation target for high-level languages. It represents a fundamental shift in web development capabilities.
Key characteristics:
┌─────────────────────────────────────────────────────┐
│             WebAssembly Core Properties             │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ✓ Binary format (fast to parse/decode)             │
│  ✓ Stack-based virtual machine                      │
│  ✓ Strongly typed (validated before execution)      │
│  ✓ Memory-safe (sandboxed execution)                │
│  ✓ Near-native performance                          │
│  ✓ Language-agnostic compilation target             │
│  ✓ Designed to work alongside JavaScript            │
│                                                     │
└─────────────────────────────────────────────────────┘
Design goals (from specification):
Safe: Sandboxed execution with memory safety
Fast: Near-native speed with efficient validation
Portable: Architecture-independent binary format
Compact: Small binary size for fast network transfer
WebAssembly vs. JavaScript
Complementary Technologies
WebAssembly does NOT replace JavaScript – they work together:
// JavaScript: High-level, dynamic, great for DOM/APIs
document.addEventListener('click', async (e) => {
// Load WebAssembly module
const wasm = await loadWasmModule();
// JavaScript handles UI logic
const imageData = getImageFromCanvas();
// WebAssembly handles computation
const processed = wasm.processImage(imageData);
// JavaScript updates DOM
displayResult(processed);
});
When to Use Each
Use JavaScript for:
DOM manipulation
Event handling
Rapid prototyping
String/text processing
Small computations
Async I/O operations
Use WebAssembly for:
CPU-intensive computations
Image/video processing
Physics simulations
Cryptography
Game engines
Porting existing C/C++/Rust code
Performance Comparison
Parsing/Load Time:
JavaScript:  ████████████████░░░░ (slower - must parse text)
WebAssembly: ████░░░░░░░░░░░░░░░░ (faster - binary format)
Execution Speed:
JavaScript:  ████████████░░░░░░░░ (JIT-compiled, optimized)
WebAssembly: ██████████████████░░ (near-native, predictable)
Startup Cost:
JavaScript:  ████░░░░░░░░░░░░░░░░ (low - runs immediately)
WebAssembly: ████████░░░░░░░░░░░░ (higher - compile + instantiate)
Memory Usage:
JavaScript:  ████████████████████ (GC overhead)
WebAssembly: ████████░░░░░░░░░░░░ (manual control, compact)
Rule of thumb: If a task fits within one frame at 60 Hz (about 16 ms), JavaScript is usually fine. For longer-running computations, consider WebAssembly.
WebAssembly Architecture
Stack-Based Virtual Machine
WebAssembly uses a stack machine model:
Traditional Register Machine:
ADD r1, r2, r3 ; r1 = r2 + r3
WebAssembly Stack Machine:
local.get 0 ; push local[0] onto stack
local.get 1 ; push local[1] onto stack
i32.add ; pop two values, push sum
local.set 2 ; pop result, store in local[2]
Stack operations visualization:
Instruction      Stack State
─────────────    ───────────
                 [ ]
i32.const 5      [ 5 ]
i32.const 3      [ 5, 3 ]
i32.add          [ 8 ]
i32.const 2      [ 8, 2 ]
i32.mul          [ 16 ]
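The trace above can be reproduced with a few lines of JavaScript. This is only a sketch of the stack discipline, assuming a hypothetical `runStackProgram` helper and a tuple encoding of instructions; real engines validate the code and compile it to machine code rather than interpreting it this way.

```javascript
// Minimal interpreter for a handful of i32 instructions (illustrative only).
// Each instruction is [opcode, immediate?]; `runStackProgram` is a made-up name.
function runStackProgram(instructions, locals = []) {
  const stack = [];
  const trace = [];
  for (const [op, arg] of instructions) {
    switch (op) {
      case 'i32.const': stack.push(arg | 0); break;
      case 'local.get': stack.push(locals[arg] | 0); break;
      case 'local.set': locals[arg] = stack.pop(); break;
      case 'i32.add': { const b = stack.pop(); const a = stack.pop(); stack.push((a + b) | 0); break; }
      case 'i32.mul': { const b = stack.pop(); const a = stack.pop(); stack.push(Math.imul(a, b)); break; }
      default: throw new Error(`Unsupported instruction: ${op}`);
    }
    // Record the stack after every instruction, like the table above
    trace.push(`${op}${arg === undefined ? '' : ' ' + arg}  [ ${stack.join(', ')} ]`);
  }
  return { stack, locals, trace };
}

const { stack, trace } = runStackProgram([
  ['i32.const', 5], ['i32.const', 3], ['i32.add'],
  ['i32.const', 2], ['i32.mul'],
]);
trace.forEach(line => console.log(line));
console.log(stack); // [ 16 ]
```

Binary operators pop the right operand first, which is why `b` is popped before `a`.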
Module Structure
A WebAssembly module is organized into sections:
┌──────────────────────────────────────────┐
│            WebAssembly Module            │
├──────────────────────────────────────────┤
│                                          │
│  Section 1: Type                         │
│    - Function signatures                 │
│                                          │
│  Section 2: Import                       │
│    - Imported functions/memory/tables    │
│                                          │
│  Section 3: Function                     │
│    - Function type indices               │
│                                          │
│  Section 4: Table                        │
│    - Indirect function call tables       │
│                                          │
│  Section 5: Memory                       │
│    - Linear memory definitions           │
│                                          │
│  Section 6: Global                       │
│    - Global variables                    │
│                                          │
│  Section 7: Export                       │
│    - Exported functions/memory           │
│                                          │
│  Section 8: Start                        │
│    - Initialization function             │
│                                          │
│  Section 9: Element                      │
│    - Table initialization                │
│                                          │
│  Section 10: Code                        │
│    - Function bodies (bytecode)          │
│                                          │
│  Section 11: Data                        │
│    - Memory initialization               │
│                                          │
└──────────────────────────────────────────┘
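To make the section layout concrete, here is a hand-assembled module containing only the Type, Function, Export, and Code sections. The byte values are written out by hand for illustration (normally a toolchain emits them), and the names `addModuleBytes`, `wasmModule`, and `wasmInstance` are assumptions of this sketch.

```javascript
// Each section is: id byte, payload size, payload.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,  // magic "\0asm"
  0x01, 0x00, 0x00, 0x00,  // binary format version 1
  // Section 1 (Type): one signature, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // Section 3 (Function): one function, using type index 0
  0x03, 0x02, 0x01, 0x00,
  // Section 7 (Export): name "add" -> function index 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // Section 10 (Code): no locals; local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Synchronous compilation is fine for a module this small
const wasmModule = new WebAssembly.Module(addModuleBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

console.log(WebAssembly.Module.exports(wasmModule)); // lists the "add" function export
console.log(wasmInstance.exports.add(10, 20));
```

Sections that a module does not need (Import, Table, Memory, and so on) are simply omitted; those that are present must appear in ascending id order.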
WebAssembly Type System
Value Types
WebAssembly supports four numeric types:
i32 ; 32-bit integer
i64 ; 64-bit integer
f32 ; 32-bit floating-point (IEEE 754)
f64 ; 64-bit floating-point (IEEE 754)
And reference types (newer specification):
funcref ; Reference to function
externref ; Reference to external (JS) object
Function Signatures
Functions have typed parameters and results:
;; Function signature: (param i32 i32) (result i32)
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
Multiple return values (multi-value extension, added after the MVP):
;; Returns two values
(func $divmod (param $a i32) (param $b i32) (result i32 i32)
local.get $a
local.get $b
i32.div_u
local.get $a
local.get $b
i32.rem_u
)
WAT: WebAssembly Text Format
S-Expression Syntax
WAT (WebAssembly Text) is the human-readable representation:
(module
;; Import JavaScript function
(import "env" "log" (func $log (param i32)))
;; Define memory
(memory 1)
;; Export memory
(export "memory" (memory 0))
;; Add function
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
;; Export function
(export "add" (func $add))
;; Factorial function (recursive)
(func $factorial (param $n i32) (result i32)
(if (result i32)
(i32.le_s (local.get $n) (i32.const 1))
(then
(i32.const 1)
)
(else
(i32.mul
(local.get $n)
(call $factorial
(i32.sub (local.get $n) (i32.const 1))
)
)
)
)
)
(export "factorial" (func $factorial))
)
Folded vs. Linear Format
Folded (S-expression):
(i32.add
(i32.const 2)
(i32.mul
(i32.const 3)
(i32.const 4)
)
)
Linear (instruction sequence):
i32.const 2
i32.const 3
i32.const 4
i32.mul
i32.add
Both represent: 2 + (3 * 4) = 14
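The relationship between the two forms is a post-order traversal: operands first, then the operator. A minimal sketch, assuming folded expressions are modeled as nested arrays (an encoding chosen here for illustration, not a WABT data structure):

```javascript
// A node is [opcode, ...children]. Array children are operand expressions;
// anything else is an immediate that stays attached to the opcode.
function linearize(node) {
  const [op, ...rest] = node;
  const operands = rest.filter(Array.isArray);
  const immediates = rest.filter(item => !Array.isArray(item));
  return [
    ...operands.flatMap(child => linearize(child)), // operands are emitted first
    [op, ...immediates].join(' '),                  // then the instruction itself
  ];
}

const folded = ['i32.add', ['i32.const', 2], ['i32.mul', ['i32.const', 3], ['i32.const', 4]]];
console.log(linearize(folded).join('\n'));
```

Running it prints the five instructions in the same order as the linear listing above.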
Core Instructions
Arithmetic Operations
Integer operations:
i32.add ; Addition
i32.sub ; Subtraction
i32.mul ; Multiplication
i32.div_s ; Signed division
i32.div_u ; Unsigned division
i32.rem_s ; Signed remainder
i32.rem_u ; Unsigned remainder
i32.and ; Bitwise AND
i32.or ; Bitwise OR
i32.xor ; Bitwise XOR
i32.shl ; Shift left
i32.shr_s ; Arithmetic shift right
i32.shr_u ; Logical shift right
i32.rotl ; Rotate left
i32.rotr ; Rotate right
Floating-point operations:
f64.add ; Addition
f64.sub ; Subtraction
f64.mul ; Multiplication
f64.div ; Division
f64.sqrt ; Square root
f64.min ; Minimum
f64.max ; Maximum
f64.ceil ; Ceiling
f64.floor ; Floor
f64.abs ; Absolute value
f64.neg ; Negation
Comparison Operations
i32.eq ; Equal
i32.ne ; Not equal
i32.lt_s ; Less than (signed)
i32.lt_u ; Less than (unsigned)
i32.le_s ; Less or equal (signed)
i32.gt_s ; Greater than (signed)
i32.ge_s ; Greater or equal (signed)
i32.eqz ; Equal to zero
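WebAssembly integers carry no sign; the `_s` and `_u` suffixes choose how an instruction interprets the same 32 bits. JavaScript's bitwise operators give a rough analogue, sketched below: `| 0` reads a value as signed 32-bit and `>>> 0` as unsigned. The variable names are illustrative.

```javascript
const bits = 0xFFFFFFF8;           // one 32-bit pattern
const asSigned = bits | 0;         // -8
const asUnsigned = bits >>> 0;     // 4294967288

const shrS = asSigned >> 1;        // like i32.shr_s: sign bit is copied in
const shrU = asSigned >>> 1;       // like i32.shr_u: zero is shifted in
const divS = (asSigned / 2) | 0;   // like i32.div_s: truncates toward zero
const divU = Math.floor(asUnsigned / 2) >>> 0; // like i32.div_u
const ltS = asSigned < 1;          // like i32.lt_s
const ltU = asUnsigned < 1;        // like i32.lt_u

console.log({ asSigned, asUnsigned, shrS, shrU, divS, divU, ltS, ltU });
```

Operations such as `i32.add`, `i32.sub`, and `i32.mul` need no suffix because two's-complement arithmetic yields the same bits under either interpretation.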
Control Flow
Structured control flow (no goto):
;; If-then-else
(if (result i32)
(i32.lt_s (local.get $x) (i32.const 0))
(then
(i32.const -1)
)
(else
(i32.const 1)
)
)
;; Block (labeled)
(block $my_block
;; code
br $my_block ;; branch to end of block
;; unreachable
)
;; Loop
(loop $continue
;; code
(br_if $continue (i32.const 1)) ;; conditional branch
)
Example: Loop to sum 1..n:
(func $sum (param $n i32) (result i32)
(local $i i32)
(local $sum i32)
;; i = 0, sum = 0
(local.set $i (i32.const 0))
(local.set $sum (i32.const 0))
(loop $continue
;; sum += i
(local.set $sum
(i32.add (local.get $sum) (local.get $i))
)
;; i++
(local.set $i
(i32.add (local.get $i) (i32.const 1))
)
;; if (i <= n) continue loop
(br_if $continue
(i32.le_s (local.get $i) (local.get $n))
)
)
(local.get $sum)
)
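Because `br_if` sits at the bottom of the `loop`, the construct behaves like a do-while: the body always runs at least once. A JavaScript rendering of the same function, written here only to show the control flow, might look like this:

```javascript
function sum(n) {
  let i = 0;
  let total = 0;
  do {
    total = (total + i) | 0; // i32.add wraps on overflow
    i = (i + 1) | 0;
  } while (i <= n);          // br_if $continue
  return total;
}

console.log(sum(10)); // 55
console.log(sum(0));  // 0
console.log(sum(-5)); // 0 (the body runs once and adds i = 0)
```

A loop that must be able to run zero times needs an enclosing `block` and a `br_if` out of it before the first iteration.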
Linear Memory
Memory Model
WebAssembly uses linear memory: a contiguous, resizable byte array.
┌────────────────────────────────────────────────────┐
│             WebAssembly Linear Memory              │
├────────────────────────────────────────────────────┤
│                                                    │
│  Address 0x0000: [byte][byte][byte][byte] …        │
│  Address 0x0004: [byte][byte][byte][byte] …        │
│  Address 0x0008: [byte][byte][byte][byte] …        │
│  …                                                 │
│                                                    │
│  Each page = 64 KiB (65,536 bytes)                 │
│  Maximum size = 4 GiB (65,536 pages)               │
│                                                    │
└────────────────────────────────────────────────────┘
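The same memory model is visible from JavaScript through `WebAssembly.Memory`. The sketch below creates a one-page memory and stores a 32-bit value the way `i32.store` would, in little-endian byte order. The address 8 and the names `PAGE_SIZE`, `view`, and `stored` are arbitrary choices for the example.

```javascript
const PAGE_SIZE = 64 * 1024; // 65,536 bytes per page

const memory = new WebAssembly.Memory({ initial: 1, maximum: 10 });
console.log(memory.buffer.byteLength === PAGE_SIZE); // true

// Wasm loads and stores are little-endian, so pass `true` to DataView accessors
const view = new DataView(memory.buffer);
view.setInt32(8, 0x12345678, true);

const stored = new Uint8Array(memory.buffer, 8, 4);
console.log([...stored].map(byte => byte.toString(16))); // [ '78', '56', '34', '12' ]
console.log(view.getInt32(8, true).toString(16));        // 12345678
```

Typed arrays such as `Int32Array` use the host's byte order, which is little-endian on the platforms browsers run on, so they usually agree with `DataView` in little-endian mode.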
Memory Operations
Define memory:
;; Define 1 page (64 KiB) of memory, max 10 pages
(memory 1 10)
;; Export memory to JavaScript
(export "memory" (memory 0))
Load/store operations:
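;; Load 32-bit integer: pops an address, pushes the value stored there
;; (an optional static offset=N immediate is added to the address)
i32.load
;; Store 32-bit integer: pops a value, then an address
i32.store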
;; Load/store with different sizes
i32.load8_s ; Load signed 8-bit, extend to 32-bit
i32.load8_u ; Load unsigned 8-bit
i32.load16_s ; Load signed 16-bit
i32.store8 ; Store low 8 bits
i32.store16 ; Store low 16 bits
Example: Read/write memory:
(func $writeInt (param $addr i32) (param $value i32)
(i32.store
(local.get $addr)
(local.get $value)
)
)
(func $readInt (param $addr i32) (result i32)
(i32.load (local.get $addr))
)
Memory Growth
Dynamic memory allocation:
;; Grow memory by n pages, returns previous size or -1
(memory.grow (i32.const 1))
;; Get current memory size in pages
(memory.size)
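Growth has a consequence on the JavaScript side that is easy to miss: growing a memory detaches its previous `ArrayBuffer`, so existing typed-array views become empty and must be recreated. A small sketch using the JavaScript API (variable names are illustrative):

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 3 });
const staleView = new Uint8Array(memory.buffer);

const previousPages = memory.grow(1); // returns the old size in pages
console.log(previousPages);           // 1
console.log(staleView.length);        // 0, the old buffer was detached

const freshView = new Uint8Array(memory.buffer);
console.log(freshView.length);        // 131072 (two pages)

// Unlike the memory.grow instruction, which yields -1 on failure,
// the JavaScript method throws when the maximum would be exceeded.
let growFailed = false;
try {
  memory.grow(5);
} catch (error) {
  growFailed = error instanceof RangeError;
}
console.log(growFailed); // true
```

For this reason, code that calls into Wasm functions which may grow memory should create its views after the call rather than caching them.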
JavaScript Interoperability
Loading WebAssembly
Streaming compilation (recommended):
async function loadWasm(url, importObject = {}) {
const response = await fetch(url);
const { instance, module } = await WebAssembly.instantiateStreaming(
response,
importObject
);
return instance;
}
const wasmInstance = await loadWasm('module.wasm');
From buffer (for Node.js or non-streaming):
// ES module syntax, so that top-level await is available
import { readFileSync } from 'node:fs';
const wasmBuffer = readFileSync('module.wasm');
const wasmModule = await WebAssembly.compile(wasmBuffer);
const wasmInstance = await WebAssembly.instantiate(wasmModule, importObject);
Calling Wasm from JavaScript
// Call exported Wasm function
const result = wasmInstance.exports.add(5, 3);
console.log(result); // 8
// Access exported memory
const memory = wasmInstance.exports.memory;
const view = new Uint8Array(memory.buffer);
// Read from memory
const value = new Int32Array(memory.buffer, 0, 1)[0];
// Write to memory
new Int32Array(memory.buffer, 4, 1)[0] = 42;
Importing JavaScript Functions
Import object:
const importObject = {
env: {
// Import JavaScript function
log: (x) => console.log('Wasm says:', x),
// Import global
globalValue: new WebAssembly.Global({
value: 'i32',
mutable: true
}, 42)
}
};
Use in Wasm:
(module
;; Import JS function
(import "env" "log" (func $log (param i32)))
;; Import JS global (declared mut to match the mutable global created in JavaScript)
(import "env" "globalValue" (global $g (mut i32)))
(func $test
;; Call imported function
(call $log (global.get $g))
)
)
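On the JavaScript side, a `WebAssembly.Global` is a small box around a typed value, and the mutability chosen at construction must match what the importing module declares. A brief sketch of how the two kinds behave, using illustrative names:

```javascript
// Mutable global: readable and writable from JavaScript (and from Wasm)
const counter = new WebAssembly.Global({ value: 'i32', mutable: true }, 42);
counter.value += 1;
console.log(counter.value); // 43

// Immutable global: `mutable` defaults to false
const limit = new WebAssembly.Global({ value: 'i32' }, 7);
let writeRejected = false;
try {
  limit.value = 8;
} catch (error) {
  writeRejected = error instanceof TypeError;
}
console.log(limit.value, writeRejected); // 7 true
```

Because the same object can be passed to several instances, a mutable global is one way to share a small piece of state between modules.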
Passing Complex Data
Strings (no native string type):
// JavaScript → Wasm: Write UTF-8 string to memory
function writeString(instance, str, addr) {
const memory = new Uint8Array(instance.exports.memory.buffer);
const encoded = new TextEncoder().encode(str);
memory.set(encoded, addr);
return encoded.length;
}
// Wasm → JavaScript: Read UTF-8 string from memory
function readString(instance, addr, length) {
const memory = new Uint8Array(instance.exports.memory.buffer);
const bytes = memory.slice(addr, addr + length);
return new TextDecoder().decode(bytes);
}
Arrays:
// Pass array to Wasm
function passArray(instance, array) {
const memory = new Float32Array(instance.exports.memory.buffer);
memory.set(array, 0); // Write at address 0
// Call Wasm function with array address and length
instance.exports.processArray(0, array.length);
}
// Get array from Wasm
function getArray(instance, addr, length) {
const memory = new Float32Array(instance.exports.memory.buffer);
return Array.from(memory.slice(addr / 4, addr / 4 + length));
}
Building WebAssembly
From WAT to WASM
Using WABT (WebAssembly Binary Toolkit):
# Install wabt
npm install -g wabt
# Compile WAT to WASM
wat2wasm module.wat -o module.wasm
# Disassemble WASM to WAT
wasm2wat module.wasm -o module.wat
# Validate WASM module
wasm-validate module.wasm
Example module (add.wat):
(module
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
(export "add" (func $add))
)
Compile and use:
wat2wasm add.wat -o add.wasm
const wasm = await loadWasm('add.wasm');
console.log(wasm.exports.add(10, 20)); // 30
Compilation from High-Level Languages
From C/C++ (covered in detail in a later chapter):
emcc source.c -o output.wasm \
-s EXPORTED_FUNCTIONS='["_myFunction"]' \
-s STANDALONE_WASM
From Rust (covered in detail in a later chapter):
cargo build --target wasm32-unknown-unknown --release
From AssemblyScript (TypeScript-like):
// module.ts
export function add(a: i32, b: i32): i32 {
return a + b;
}
asc module.ts -o module.wasm --optimize
Performance Characteristics
Advantages
1. Fast parsing/loading:
Binary format → Direct decoding → No parsing overhead
vs.
JavaScript text → Lexing → Parsing → AST → Bytecode
2. Predictable performance:
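Little JIT warm-up (engines compile the module before it runs, typically with a fast baseline tier followed by an optimizing tier)
Static types, so no speculative optimization or deoptimization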
Consistent execution times
3. Compact size:
Typical size comparison:
JavaScript (minified): 100 KB
WebAssembly: 30-50 KB (50-70% smaller)
4. Memory efficiency:
Manual memory management
No garbage collection overhead
Precise control over layout
Limitations
1. Startup cost:
// Compile + instantiate time
const t0 = performance.now();
const wasm = await WebAssembly.instantiateStreaming(fetch('module.wasm'));
const t1 = performance.now();
console.log(`Startup: ${t1 - t0}ms`);
Typical: 10-100ms depending on module size.
2. JS ↔︎ Wasm boundary cost:
// Expensive if called millions of times
for (let i = 0; i < 1_000_000; i++) {
wasm.exports.smallFunction(i); // Boundary crossing
}
// Better: Do work in Wasm
wasm.exports.processMillionItems(); // One crossing
3. No direct DOM access:
WebAssembly cannot directly manipulate DOM – must call JavaScript.
4. Limited debugging:
Binary format makes debugging harder (though tooling is improving with source maps and DWARF debug information).
Practical Example: Image Processing
Grayscale Filter
Wasm module (filter.wat):
(module
(memory (export "memory") 1)
;; Convert RGBA image to grayscale
;; addr: start address of image data
;; length: number of pixels
(func $grayscale (param $addr i32) (param $length i32)
(local $i i32)
(local $r i32)
(local $g i32)
(local $b i32)
(local $gray i32)
(local $offset i32)
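;; Nothing to do for an empty image (the loop below always runs at least once)
(if (i32.eqz (local.get $length)) (then (return)))
(loop $continue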
;; Calculate byte offset (4 bytes per pixel)
(local.set $offset
(i32.mul (local.get $i) (i32.const 4))
)
;; Load R, G, B
(local.set $r (i32.load8_u (i32.add (local.get $addr) (local.get $offset))))
(local.set $g (i32.load8_u (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 1)))))
(local.set $b (i32.load8_u (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 2)))))
;; Calculate grayscale: 0.299*R + 0.587*G + 0.114*B
;; Using integer approximation: (77*R + 150*G + 29*B) / 256
(local.set $gray
(i32.div_u
(i32.add
(i32.add
(i32.mul (local.get $r) (i32.const 77))
(i32.mul (local.get $g) (i32.const 150))
)
(i32.mul (local.get $b) (i32.const 29))
)
(i32.const 256)
)
)
;; Write grayscale value to R, G, B
(i32.store8 (i32.add (local.get $addr) (local.get $offset)) (local.get $gray))
(i32.store8 (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 1))) (local.get $gray))
(i32.store8 (i32.add (local.get $addr) (i32.add (local.get $offset) (i32.const 2))) (local.get $gray))
;; i++
(local.set $i (i32.add (local.get $i) (i32.const 1)))
;; Continue if i < length
(br_if $continue (i32.lt_u (local.get $i) (local.get $length)))
)
)
(export "grayscale" (func $grayscale))
)
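The integer approximation used in the module can be checked against the floating-point formula in plain JavaScript. The sketch below assumes nothing beyond the two formulas quoted in the comments above; the function names and the sampling step of 15 are choices made for this example.

```javascript
// Integer approximation: the weights 77 + 150 + 29 sum to 256, so white stays white
function grayInt(r, g, b) {
  return (77 * r + 150 * g + 29 * b) >>> 8; // same as unsigned division by 256
}

// Reference formula with the original floating-point weights
function grayFloat(r, g, b) {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}

// Measure the largest deviation over a coarse grid of colors
let maxDifference = 0;
for (let r = 0; r <= 255; r += 15) {
  for (let g = 0; g <= 255; g += 15) {
    for (let b = 0; b <= 255; b += 15) {
      const difference = Math.abs(grayInt(r, g, b) - grayFloat(r, g, b));
      maxDifference = Math.max(maxDifference, difference);
    }
  }
}

console.log(grayInt(255, 255, 255), grayInt(0, 0, 0)); // 255 0
console.log(`Largest deviation on the sampled grid: ${maxDifference.toFixed(3)}`);
```

The deviation stays below two gray levels because each weight is off by less than 0.002 and the shift discards less than one level.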
JavaScript integration:
async function applyGrayscaleFilter(imageData) {
// Load Wasm module
const wasm = await loadWasm('filter.wasm');
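// Grow linear memory if the image does not fit (the module starts with one 64 KiB page)
const wasmMemory = wasm.exports.memory;
const missingBytes = imageData.data.length - wasmMemory.buffer.byteLength;
if (missingBytes > 0) {
wasmMemory.grow(Math.ceil(missingBytes / 65536));
}
// Get memory view (after growing, because grow() detaches the old buffer)
const memory = new Uint8Array(wasmMemory.buffer);
// Copy image data to Wasm memory
memory.set(imageData.data, 0);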
// Process in Wasm
const pixelCount = imageData.width * imageData.height;
wasm.exports.grayscale(0, pixelCount);
// Copy result back
imageData.data.set(memory.subarray(0, imageData.data.length));
return imageData;
}
// Usage with Canvas
const canvas = document.getElementById('myCanvas');
const ctx = canvas.getContext('2d');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
await applyGrayscaleFilter(imageData);
ctx.putImageData(imageData, 0, 0);
Performance comparison:
// Pure JavaScript version
function grayscaleJS(imageData) {
const data = imageData.data;
for (let i = 0; i < data.length; i += 4) {
const gray = 0.299 * data[i] + 0.587 * data[i+1] + 0.114 * data[i+2];
data[i] = data[i+1] = data[i+2] = gray;
}
}
// Benchmark
const imageData = ctx.getImageData(0, 0, 1920, 1080);
console.time('JavaScript');
grayscaleJS(imageData);
console.timeEnd('JavaScript');
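// JavaScript: ~12ms (illustrative; varies by engine and hardware)
console.time('WebAssembly');
await applyGrayscaleFilter(imageData);
console.timeEnd('WebAssembly');
// WebAssembly: ~4ms (illustrative; as written this also times loading the module
// and copying the pixels, so load the module once up front for a fair comparison)
// ~3x faster in this example
Summary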
WebAssembly fundamentals:
1. Architecture:
Stack-based virtual machine
Strongly typed instruction set
Linear memory model
Module-based structure
2. Type system:
Four numeric types: i32, i64, f32, f64
Reference types: funcref, externref
Function signatures with typed params/results
3. Text format (WAT):
S-expression syntax
Human-readable representation
Converts to binary .wasm format
4. Core features:
Arithmetic/logic operations
Structured control flow (no goto)
Memory load/store operations
Function calls (direct and indirect)
5. JavaScript interop:
Load modules with WebAssembly.instantiateStreaming()
Call exported functions
Share linear memory
Import JavaScript functions
6. Performance:
Fast parsing (binary format)
Near-native execution speed
Predictable performance
Small binary size
Boundary crossing overhead
7. Use cases:
CPU-intensive computations
Image/video processing
Physics simulations
Porting existing C/C++/Rust code
Performance-critical algorithms
WebAssembly provides a safe, fast, portable compilation target that complements JavaScript, enabling new classes of web applications with near-native performance. The following chapters look more closely at the text format and at compiling from high-level languages like C/C++ and Rust to WebAssembly.
Chapter 14: WebAssembly Text Format (WAT)
Introduction: Understanding WAT
WebAssembly Text Format (WAT) is the human-readable representation of WebAssembly modules, using S-expression syntax similar to Lisp. While most developers compile to WebAssembly from high-level languages, understanding WAT is crucial for:
Debugging: Reading compiled output
Learning: Understanding WebAssembly’s instruction set
Manual optimization: Fine-tuning critical sections
Tool development: Building compilers and analyzers
┌─────────────────────────────────────────────────────┐
│             WebAssembly Representations             │
├─────────────────────────────────────────────────────┤
│                                                     │
│  High-level code (C/Rust/etc.)                      │
│         ↓                                           │
│  WebAssembly Text (.wat) ←→ Binary (.wasm)          │
│         ↓                                           │
│  JavaScript integration                             │
│                                                     │
└─────────────────────────────────────────────────────┘
Note: Early tooling used the .wast extension for the text format. The community standardized on .wat for modules; .wast now denotes the superset used for specification test scripts.
S-Expression Syntax
Basic Structure
WAT uses S-expressions (symbolic expressions):
;; S-expression format
(keyword arguments...)
;; Examples
(i32.add (i32.const 1) (i32.const 2))
;; Nested
(i32.mul
(i32.add (i32.const 2) (i32.const 3))
(i32.const 4)
)
;; Evaluates to: (2 + 3) * 4 = 20
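Because every operation is explicitly parenthesized, folded expressions are easy to process mechanically. The sketch below tokenizes, parses, and evaluates the nested example above. It supports only `i32.const`, `i32.add`, and `i32.mul`, and the function names are hypothetical; it is not a substitute for a real WAT parser.

```javascript
// Split source into "(", ")", and atoms; strip ;; line comments first
function tokenize(source) {
  return source.replace(/;;.*$/gm, '').match(/[()]|[^\s()]+/g) ?? [];
}

// Recursive descent: a list becomes an array, an atom stays a string
function parse(tokens) {
  const token = tokens.shift();
  if (token !== '(') return token;
  const list = [];
  while (tokens[0] !== ')') {
    if (tokens.length === 0) throw new Error('Unbalanced parentheses');
    list.push(parse(tokens));
  }
  tokens.shift(); // consume ")"
  return list;
}

// Evaluate operands first, then apply the operator with i32 wrapping
function evaluate(node) {
  const [op, ...args] = node;
  switch (op) {
    case 'i32.const': return Number(args[0]) | 0;
    case 'i32.add': return (evaluate(args[0]) + evaluate(args[1])) | 0;
    case 'i32.mul': return Math.imul(evaluate(args[0]), evaluate(args[1]));
    default: throw new Error(`Unsupported operation: ${op}`);
  }
}

const source = `
(i32.mul
  (i32.add (i32.const 2) (i32.const 3))
  (i32.const 4)
)`;
console.log(evaluate(parse(tokenize(source)))); // 20
```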
Parentheses rules:
Opening ( starts an operation
Closing ) completes it
Everything is explicitly nested
Comments
;; Single-line comment
(;
Multi-line comment
spanning multiple lines
;)
;;; Documentation comment (convention)
;;; Used to document functions/modules
Module Structure
Minimal Module
Every WAT file defines a module:
(module
;; Module contents go here
)
Complete Module Template
(module
;; 1. Type definitions
(type $functype (func (param i32) (result i32)))
;; 2. Imports
(import "env" "log" (func $log (param i32)))
(import "js" "memory" (memory 1))
;; 3. Function declarations
(func $internal ...)
(func $exported ...)
;; 4. Table definitions
(table 10 funcref)
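;; 5. Memory definitions
;; (without the multi-memory extension a module has at most one memory,
;; so define one here or import one above, not both)
(memory 1)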
;; 6. Global variables
(global $counter (mut i32) (i32.const 0))
;; 7. Exports
(export "exported" (func $exported))
(export "memory" (memory 0))
;; 8. Start function (initialization)
(start $init)
;; 9. Element segments (table initialization)
(elem (i32.const 0) $func1 $func2)
;; 10. Data segments (memory initialization)
(data (i32.const 0) "Hello, World!")
)
Value Types
Numeric Types
i32 ;; 32-bit integer
i64 ;; 64-bit integer
f32 ;; 32-bit floating-point (IEEE 754)
f64 ;; 64-bit floating-point (IEEE 754)
Reference Types
funcref ;; Reference to a function
externref ;; Reference to external (JavaScript) object
Type Usage
;; Local variables
(local $x i32)
(local $y f64)
(local $fn funcref)
;; Parameters
(param $a i32)
(param $b f64)
;; Results
(result i32)
(result f64 f64) ;; Multiple results
Constants and Literals
Integer Constants
;; Decimal
(i32.const 42)
(i32.const -17)
;; Hexadecimal
(i32.const 0x2A)
(i32.const 0xFF)
;; Binary literals such as 0b101010 are not part of the text format;
;; use decimal or hexadecimal instead
;; Underscores for readability
(i32.const 1_000_000)
(i32.const 0xFF_FF_FF_FF)
Floating-Point Constants
;; Standard notation
(f32.const 3.14159)
(f32.const -2.5)
;; Scientific notation
(f64.const 1.23e-4)
(f64.const 6.022e23)
;; Special values
(f32.const nan) ;; NaN
(f32.const inf) ;; Infinity
(f32.const -inf) ;; Negative infinity
;; Hexadecimal float
(f64.const 0x1.921fb54442d18p+1) ;; π
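Hexadecimal floats spell out the exact bits of a value: hex digits for the significand and a power-of-two exponent after `p`. The sketch below decodes the notation in JavaScript to confirm that the constant above is exactly the double nearest to π. `parseHexFloat` is a hypothetical helper that ignores edge cases such as `nan` and `inf`.

```javascript
// Decode strings like "0x1.8p1" (= 1.5 * 2^1 = 3)
function parseHexFloat(text) {
  const match = /^([+-]?)0x([0-9a-f]+)(?:\.([0-9a-f]*))?(?:p([+-]?\d+))?$/i.exec(text);
  if (!match) throw new SyntaxError(`Not a hexadecimal float: ${text}`);
  const [, sign, intDigits, fracDigits = '', exponent = '0'] = match;

  // Each fractional hex digit contributes digit / 16^position
  let significand = parseInt(intDigits, 16);
  let scale = 1 / 16;
  for (const digit of fracDigits) {
    significand += parseInt(digit, 16) * scale;
    scale /= 16;
  }

  const value = significand * 2 ** Number(exponent);
  return sign === '-' ? -value : value;
}

console.log(parseHexFloat('0x1.8p1'));                           // 3
console.log(parseHexFloat('0x1p-1'));                            // 0.5
console.log(parseHexFloat('0x1.921fb54442d18p+1') === Math.PI);  // true
```

The notation is useful in generated code because it round-trips without any decimal rounding.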
Functions
Basic Function Definition
;; Unnamed function
(func (param i32 i32) (result i32)
local.get 0
local.get 1
i32.add
)
;; Named function with named parameters
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
Local Variables
(func $compute (param $x i32) (result i32)
;; Declare local variables
(local $temp i32)
(local $result i32)
;; Use locals
(local.set $temp (i32.mul (local.get $x) (i32.const 2)))
(local.set $result (i32.add (local.get $temp) (i32.const 1)))
(local.get $result)
)
Shorthand for multiple locals (unnamed only):
;; Named locals need one declaration each
(local $a i32)
(local $b i32)
(local $c i32)
;; Unnamed locals can share one declaration and are referenced by index
(local i32 i32 i32)
Multiple Return Values
;; Function returning two values
(func $swap (param $a i32) (param $b i32) (result i32 i32)
local.get $b
local.get $a
)
;; Usage
(func $test
(local $x i32)
(local $y i32)
(i32.const 10)
(i32.const 20)
(call $swap)
(local.set $y) ;; Pop second result
(local.set $x) ;; Pop first result
)
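When a multi-value function is exported, the JavaScript API returns its results as an array. The sketch below hand-assembles a module equivalent to `$swap` to show this. The bytes are written out manually for illustration, the names are assumptions of the example, and it needs an engine with multi-value support (current browsers and recent Node.js releases).

```javascript
const swapModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  // Type section: (i32, i32) -> (i32, i32)
  0x01, 0x08, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x02, 0x7f, 0x7f,
  // Function section: one function of type 0
  0x03, 0x02, 0x01, 0x00,
  // Export section: "swap" -> function 0
  0x07, 0x08, 0x01, 0x04, 0x73, 0x77, 0x61, 0x70, 0x00, 0x00,
  // Code section: local.get 1, local.get 0, end
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x01, 0x20, 0x00, 0x0b,
]);

const swapInstance = new WebAssembly.Instance(new WebAssembly.Module(swapModuleBytes), {});
const swapped = swapInstance.exports.swap(10, 20);
console.log(swapped); // [ 20, 10 ]

// Destructuring reads naturally at the call site
const [first, second] = swapInstance.exports.swap(1, 2);
console.log(first, second); // 2 1
```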
Instructions
Stack Manipulation
;; Push constant onto stack
(i32.const 42)
;; Get local variable (push to stack)
(local.get $varname)
(local.get 0) ;; By index
;; Set local variable (pop from stack)
(local.set $varname)
;; Tee: set local AND keep value on stack
(local.tee $varname)
;; Get global variable
(global.get $globalname)
;; Set global variable
(global.set $globalname)
Arithmetic Instructions
;; Integer arithmetic
i32.add ;; Addition
i32.sub ;; Subtraction
i32.mul ;; Multiplication
i32.div_s ;; Signed division
i32.div_u ;; Unsigned division
i32.rem_s ;; Signed remainder
i32.rem_u ;; Unsigned remainder
;; Example: (a * b) + c
(func $formula (param $a i32) (param $b i32) (param $c i32) (result i32)
(i32.add
(i32.mul (local.get $a) (local.get $b))
(local.get $c)
)
)
Bitwise Operations
i32.and ;; Bitwise AND
i32.or ;; Bitwise OR
i32.xor ;; Bitwise XOR
i32.shl ;; Shift left
i32.shr_s ;; Arithmetic shift right
i32.shr_u ;; Logical shift right
i32.rotl ;; Rotate left
i32.rotr ;; Rotate right
i32.clz ;; Count leading zeros
i32.ctz ;; Count trailing zeros
i32.popcnt ;; Count set bits
Example: Check if power of 2:
(func $isPowerOf2 (param $n i32) (result i32)
;; n != 0 && (n & (n-1)) == 0
(i32.and
(i32.ne (local.get $n) (i32.const 0))
(i32.eqz
(i32.and
(local.get $n)
(i32.sub (local.get $n) (i32.const 1))
)
)
)
)
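The same bit trick can be tried out in JavaScript, whose bitwise operators also work on 32-bit integers. This is an illustrative mirror of $isPowerOf2, not part of the module; note that, like the WAT version, it treats the bit pattern 0x80000000 as a power of two.

```javascript
// Mirror of $isPowerOf2 using JavaScript's 32-bit bitwise operators.
function isPowerOf2(n) {
  const x = n | 0; // coerce to a 32-bit integer, as an i32 parameter would
  return x !== 0 && (x & (x - 1)) === 0 ? 1 : 0;
}

console.log(isPowerOf2(64)); // 1
console.log(isPowerOf2(6));  // 0
console.log(isPowerOf2(0));  // 0
```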
Comparison Instructions
i32.eq ;; Equal
i32.ne ;; Not equal
i32.lt_s ;; Less than (signed)
i32.lt_u ;; Less than (unsigned)
i32.le_s ;; Less or equal (signed)
i32.le_u ;; Less or equal (unsigned)
i32.gt_s ;; Greater than (signed)
i32.gt_u ;; Greater than (unsigned)
i32.ge_s ;; Greater or equal (signed)
i32.ge_u ;; Greater or equal (unsigned)
i32.eqz ;; Equal to zero
Conversion Instructions
;; Wrap (truncate)
i32.wrap_i64 ;; i64 → i32 (truncate)
;; Extend
i64.extend_i32_s ;; i32 → i64 (sign extend)
i64.extend_i32_u ;; i32 → i64 (zero extend)
;; Truncate float to int
i32.trunc_f32_s ;; f32 → i32 (signed)
i32.trunc_f32_u ;; f32 → i32 (unsigned)
i32.trunc_f64_s ;; f64 → i32 (signed)
;; Convert int to float
f32.convert_i32_s ;; i32 → f32 (signed)
f32.convert_i32_u ;; i32 → f32 (unsigned)
;; Promote/Demote
f64.promote_f32 ;; f32 → f64
f32.demote_f64 ;; f64 → f32
;; Reinterpret bits
i32.reinterpret_f32 ;; Reinterpret f32 bits as i32
f32.reinterpret_i32 ;; Reinterpret i32 bits as f32
Example: Float to int with rounding (valid for non-negative inputs, since truncation rounds toward zero):
(func $roundToInt (param $x f64) (result i32)
(i32.trunc_f64_s
(f64.add (local.get $x) (f64.const 0.5))
)
)
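Adding 0.5 and truncating only rounds correctly for non-negative inputs, because truncation moves toward zero. The effect is easy to reproduce in JavaScript; roundByTruncation below is an illustrative stand-in for the WAT function. Keep in mind that i32.trunc_f64_s also traps on NaN and out-of-range values, which this sketch does not model.

```javascript
// Same arithmetic as $roundToInt: add 0.5, then truncate toward zero.
function roundByTruncation(x) {
  return Math.trunc(x + 0.5);
}

console.log(roundByTruncation(2.5));  // 3
console.log(roundByTruncation(-2.7)); // -2, although the nearest integer is -3
console.log(Math.round(-2.7));        // -3
```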
Control Flow
Blocks
Block: Creates a label for branching to the end:
(block $label (result i32)
;; code
(br $label (i32.const 42)) ;; Jump to end of block; the operand becomes the block result
;; unreachable code
(i32.const 0) ;; Fall-through result (never reached here)
)
Example: Early exit:
(func $absoluteValue (param $x i32) (result i32)
(block $done (result i32)
;; If x >= 0, return x immediately
(drop ;; br_if leaves its value operand on the stack when the branch is not taken
(br_if $done
(local.get $x)
(i32.ge_s (local.get $x) (i32.const 0))
)
)
;; Otherwise return -x
(i32.sub (i32.const 0) (local.get $x))
)
)
Loops
Loop: Creates a label for branching to the beginning:
(loop $continue
;; code
(br $continue) ;; Jump to start of loop
)
Example: Sum 1 to n:
(func $sumToN (param $n i32) (result i32)
(local $i i32)
(local $sum i32)
(local.set $i (i32.const 1))
(local.set $sum (i32.const 0))
(loop $continue
;; sum += i
(local.set $sum
(i32.add (local.get $sum) (local.get $i))
)
;; i++
(local.set $i
(i32.add (local.get $i) (i32.const 1))
)
;; if (i <= n) continue
(br_if $continue
(i32.le_s (local.get $i) (local.get $n))
)
)
(local.get $sum)
)
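Because the br_if sits at the bottom of the loop, $sumToN behaves like a do-while: the body always runs once before the condition is tested. A JavaScript mirror of the same control flow (an illustration only) makes the edge case visible.

```javascript
// Mirror of $sumToN: the condition is tested after the body, as with br_if at the loop's end.
function sumToN(n) {
  let i = 1;
  let sum = 0;
  do {
    sum = (sum + i) | 0; // i32.add wraps at 32 bits
    i = (i + 1) | 0;
  } while (i <= n);
  return sum;
}

console.log(sumToN(10)); // 55
console.log(sumToN(0));  // 1, because the body runs once before the first test
```

If n can be zero or negative, guard the loop with a check before entering it.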
If-Then-Else
;; Without result
(if (i32.gt_s (local.get $x) (i32.const 0))
(then
;; code when true
)
(else
;; code when false
)
)
;; With result
(if (result i32)
(i32.lt_s (local.get $x) (i32.const 0))
(then
(i32.const -1)
)
(else
(i32.const 1)
)
)
Example: Max function:
(func $max (param $a i32) (param $b i32) (result i32)
(if (result i32)
(i32.gt_s (local.get $a) (local.get $b))
(then (local.get $a))
(else (local.get $b))
)
)
Select
Select: Ternary operator equivalent:
;; select(value_if_true, value_if_false, condition)
(select
(i32.const 100) ;; value if true
(i32.const 0) ;; value if false
(i32.gt_s (local.get $x) (i32.const 0)) ;; condition
)
;; Equivalent to: x > 0 ? 100 : 0
Branch Instructions
br $label ;; Unconditional branch
br_if $label ;; Conditional branch (pops condition)
br_table $l1 $l2 $default ;; Switch-case style
return ;; Return from function
unreachable ;; Trap execution
Example: Switch statement:
(func $getDayName (param $day i32) (result i32)
(block $default
(block $case6
(block $case5
(block $case4
(block $case3
(block $case2
(block $case1
(block $case0
;; Jump table: branch to the label selected by $day;
;; out-of-range values take the last label ($default)
(br_table $case0 $case1 $case2 $case3
$case4 $case5 $case6 $default
(local.get $day)
)
)
;; Case 0: Sunday
(return (i32.const 0))
)
;; Case 1: Monday
(return (i32.const 1))
)
;; Case 2: Tuesday
(return (i32.const 2))
)
;; ... etc
)
)
)
)
;; Default case (reached after the $default block ends)
(i32.const -1)
)
Memory Operations
Memory Definition
;; Define 1 page (64KB) minimum, 10 pages maximum
(memory 1 10)
;; Export memory
(export "memory" (memory 0))
;; Import memory from JavaScript
(import "js" "mem" (memory 1))
Load Instructions
;; Load 32-bit integer
i32.load ;; Optional immediates: offset=N align=N
;; Load with specific size
i32.load8_s ;; Load signed 8-bit, extend to 32-bit
i32.load8_u ;; Load unsigned 8-bit
i32.load16_s ;; Load signed 16-bit
i32.load16_u ;; Load unsigned 16-bit
;; Offset and alignment
(i32.load offset=4 align=4)
Example: Read integer from address:
(func $readInt (param $addr i32) (result i32)
(i32.load (local.get $addr))
)
Store Instructions
;; Store 32-bit integer
i32.store ;; Optional immediates: offset=N align=N
;; Store with specific size
i32.store8 ;; Store low 8 bits
i32.store16 ;; Store low 16 bits
;; Example with offset
(i32.store offset=8 align=4)
Example: Write integer to address:
(func $writeInt (param $addr i32) (param $value i32)
(i32.store
(local.get $addr)
(local.get $value)
)
)
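Two details of loads and stores are worth seeing from the host side: the effective address is the dynamic address plus the static offset immediate, and linear memory is always little-endian. The following sketch reproduces both with a DataView over a WebAssembly.Memory; the addresses and values are arbitrary choices for the illustration.

```javascript
const demoMemory = new WebAssembly.Memory({ initial: 1 });
const dataView = new DataView(demoMemory.buffer);

// Equivalent of (i32.store offset=8 (i32.const 16) (i32.const 0x11223344)):
// effective address = 16 + 8 = 24, stored little-endian.
const dynamicAddress = 16;
const staticOffset = 8;
dataView.setInt32(dynamicAddress + staticOffset, 0x11223344, true);

const storedBytes = Array.from(new Uint8Array(demoMemory.buffer, 24, 4));
console.log(storedBytes.map((b) => b.toString(16))); // [ '44', '33', '22', '11' ]

// i32.load8_s sign-extends, i32.load8_u zero-extends the same byte.
dataView.setUint8(0, 0xff);
const signedByte = dataView.getInt8(0);
const unsignedByte = dataView.getUint8(0);
console.log(signedByte, unsignedByte); // -1 255
```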
Memory Size and Growth
;; Get current memory size in pages
(memory.size)
;; Grow memory by n pages (returns previous size or -1)
(memory.grow (i32.const 1))
Example: Allocate memory:
(func $malloc (param $size i32) (result i32)
(local $oldPages i32)
;; Grow by the number of pages needed, rounded up
;; (assumes $size + 65535 does not overflow).
;; memory.grow returns the previous size in pages, or -1 on failure.
(local.set $oldPages
(memory.grow
(i32.div_u
(i32.add (local.get $size) (i32.const 65535))
(i32.const 65536)
)
)
)
;; Propagate failure to the caller instead of returning a bogus address
(if (i32.eq (local.get $oldPages) (i32.const -1))
(then (return (i32.const -1)))
)
;; Return address (start of new memory)
(i32.mul (local.get $oldPages) (i32.const 65536))
)
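The page arithmetic and the growth semantics can be checked from JavaScript without compiling a module, since WebAssembly.Memory exposes grow() directly. This is a small sketch; pagesFor is a hypothetical helper, and note one difference from the instruction: the JavaScript method throws a RangeError on failure, whereas memory.grow returns -1.

```javascript
const PAGE_SIZE = 65536;

// Hypothetical helper: number of 64KB pages needed to hold `bytes`.
function pagesFor(bytes) {
  return Math.ceil(bytes / PAGE_SIZE);
}

const growable = new WebAssembly.Memory({ initial: 1, maximum: 3 });
const bufferBeforeGrow = growable.buffer;

const previousPages = growable.grow(pagesFor(100000)); // 100000 bytes need 2 pages
console.log(previousPages);               // 1
console.log(growable.buffer.byteLength);  // 196608
console.log(bufferBeforeGrow.byteLength); // 0, the old ArrayBuffer is detached

let growFailed = false;
try {
  growable.grow(1); // would exceed the 3-page maximum
} catch (error) {
  growFailed = error instanceof RangeError;
}
console.log(growFailed); // true
```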
Data Segments
Initialize memory with data:
;; Active segment (written at instantiation)
(data (i32.const 0) "Hello, World!\00")
;; With offset expression
(data (offset (i32.const 1024)) "\01\02\03\04")
;; Passive segment (copied manually)
(data $mydata "Some data")
;; Copy passive segment
(memory.init $mydata
(i32.const 0) ;; Destination address
(i32.const 0) ;; Source offset
(i32.const 9) ;; Length
)
Tables and Indirect Calls
Table Definition
;; Define table of function references
(table 10 20 funcref) ;; min 10, max 20 entries
;; Export table
(export "table" (table 0))
Element Segments
Initialize table with functions:
(func $func1 ...)
(func $func2 ...)
(func $func3 ...)
;; Active element (written at instantiation)
(elem (i32.const 0) $func1 $func2 $func3)
;; With offset expression
(elem (offset (i32.const 5)) $func1 $func2)
Indirect Function Calls
;; Call function by table index
call_indirect (type $signature)
;; Example
(type $binop (func (param i32 i32) (result i32)))
(func $add (param i32 i32) (result i32)
(i32.add (local.get 0) (local.get 1))
)
(func $sub (param i32 i32) (result i32)
(i32.sub (local.get 0) (local.get 1))
)
(table 2 funcref)
(elem (i32.const 0) $add $sub)
(func $compute (param $op i32) (param $a i32) (param $b i32) (result i32)
(call_indirect (type $binop)
(local.get $a)
(local.get $b)
(local.get $op) ;; Table index
)
)
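To see the dispatch in action, the module above can be exercised from JavaScript. The sketch below uses a hand-assembled binary intended to match the WAT listing, with $compute exported as "compute"; the byte array and the export are assumptions added for this illustration. An index outside the table traps, which surfaces as a WebAssembly.RuntimeError.

```javascript
// Hand-assembled module: $add, $sub, a 2-entry funcref table, and $compute (exported).
const dispatchBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,         // magic, version
  0x01, 0x0e, 0x02,                                       // type section, 2 types
  0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,                     //   $binop: (i32 i32) -> i32
  0x60, 0x03, 0x7f, 0x7f, 0x7f, 0x01, 0x7f,               //   (i32 i32 i32) -> i32
  0x03, 0x04, 0x03, 0x00, 0x00, 0x01,                     // functions: add, sub, compute
  0x04, 0x04, 0x01, 0x70, 0x00, 0x02,                     // table 2 funcref
  0x07, 0x0b, 0x01, 0x07, 0x63, 0x6f, 0x6d, 0x70, 0x75, 0x74, 0x65, 0x00, 0x02, // export "compute"
  0x09, 0x08, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x02, 0x00, 0x01, // elem (i32.const 0) $add $sub
  0x0a, 0x1d, 0x03,                                       // code section, 3 bodies
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,         //   add
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6b, 0x0b,         //   sub
  0x0b, 0x00, 0x20, 0x01, 0x20, 0x02, 0x20, 0x00, 0x11, 0x00, 0x00, 0x0b, // compute
]);

const dispatch = new WebAssembly.Instance(new WebAssembly.Module(dispatchBytes));
const { compute } = dispatch.exports;

console.log(compute(0, 7, 3)); // 10 (table[0] is $add)
console.log(compute(1, 7, 3)); // 4  (table[1] is $sub)

let trapped = false;
try {
  compute(2, 7, 3); // index 2 is outside the table
} catch (error) {
  trapped = error instanceof WebAssembly.RuntimeError;
}
console.log(trapped); // true
```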
Global Variables
Immutable Globals
;; Constant global
(global $pi f64 (f64.const 3.14159265359))
;; Get global value
(func $circumference (param $r f64) (result f64)
(f64.mul
(f64.mul (local.get $r) (f64.const 2.0))
(global.get $pi)
)
)
Mutable Globals
;; Mutable global
(global $counter (mut i32) (i32.const 0))
;; Increment counter
(func $increment
(global.set $counter
(i32.add (global.get $counter) (i32.const 1))
)
)
;; Get counter value
(func $getCount (result i32)
(global.get $counter)
)
Imported/Exported Globals
;; Import global from JavaScript
(import "env" "maxValue" (global $max i32))
;; Export global to JavaScript
(global $status (mut i32) (i32.const 0))
(export "status" (global $status))
Imports and Exports
Importing Functions
;; Import function from JavaScript
(import "env" "log" (func $log (param i32)))
(import "console" "error" (func $error (param i32 i32)))
;; Use imported function
(func $debug (param $x i32)
(call $log (local.get $x))
)
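On the JavaScript side, each import is resolved through a two-level lookup: importObject[module][name]. The sketch below instantiates a hand-assembled binary shaped like the snippet above, with $debug additionally exported as "debug"; the bytes and that export are assumptions made for this illustration. A missing function import fails at instantiation with a WebAssembly.LinkError.

```javascript
// Hand-assembled module: imports env.log (param i32), exports "debug" which calls it.
const importDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic, version
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,                         // type: (i32) -> ()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import "env" "log"
  0x03, 0x02, 0x01, 0x00,                                           // one local function
  0x07, 0x09, 0x01, 0x05, 0x64, 0x65, 0x62, 0x75, 0x67, 0x00, 0x01, // export "debug" (func 1)
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b,       // local.get 0, call 0
]);
const importDemoModule = new WebAssembly.Module(importDemoBytes);

const logged = [];
const importDemo = new WebAssembly.Instance(importDemoModule, {
  env: { log: (value) => logged.push(value) },
});
importDemo.exports.debug(42);
console.log(logged); // [ 42 ]

let linkFailure = null;
try {
  new WebAssembly.Instance(importDemoModule, { env: {} }); // env.log is missing
} catch (error) {
  linkFailure = error;
}
console.log(linkFailure instanceof WebAssembly.LinkError); // true
```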
Exporting Functions
;; Export function to JavaScript
(func $add (param $a i32) (param $b i32) (result i32)
(i32.add (local.get $a) (local.get $b))
)
(export "add" (func $add))
;; Export with different name
(func $internalName ...)
(export "externalName" (func $internalName))
Importing Memory/Tables
;; Import memory
(import "js" "memory" (memory 1))
;; Import table
(import "env" "table" (table 10 funcref))
;; Import global
(import "env" "timestamp" (global $timestamp i64))
Exporting Memory/Tables
;; Export memory
(memory 1)
(export "memory" (memory 0))
;; Export table
(table 10 funcref)
(export "table" (table 0))
;; Export global
(global $version i32 (i32.const 1))
(export "version" (global $version))
Practical Examples
Example 1: Fibonacci
(module
;; Recursive Fibonacci
(func $fib (param $n i32) (result i32)
(if (result i32)
(i32.le_s (local.get $n) (i32.const 1))
(then
(local.get $n)
)
(else
(i32.add
(call $fib (i32.sub (local.get $n) (i32.const 1)))
(call $fib (i32.sub (local.get $n) (i32.const 2)))
)
)
)
)
(export "fib" (func $fib))
)
Example 2: String Length
(module
(memory 1)
(export "memory" (memory 0))
;; Calculate length of null-terminated string
(func $strlen (param $addr i32) (result i32)
(local $len i32)
(local.set $len (i32.const 0))
(loop $continue
;; Load byte at current position
(if (i32.load8_u
(i32.add (local.get $addr) (local.get $len)))
(then
;; Not null, increment length
(local.set $len (i32.add (local.get $len) (i32.const 1)))
(br $continue)
)
)
)
(local.get $len)
)
(export "strlen" (func $strlen))
;; Initialize with test string
(data (i32.const 0) "Hello, WAT!\00")
)
Example 3: Array Sum
(module
(memory 1)
(export "memory" (memory 0))
;; Sum array of i32 values
;; @param addr: start address
;; @param length: number of elements
(func $sumArray (param $addr i32) (param $length i32) (result i32)
(local $i i32)
(local $sum i32)
(local.set $i (i32.const 0))
(local.set $sum (i32.const 0))
(loop $continue
;; Load element at index i
(local.set $sum
(i32.add
(local.get $sum)
(i32.load
(i32.add
(local.get $addr)
(i32.mul (local.get $i) (i32.const 4))
)
)
)
)
;; i++
(local.set $i (i32.add (local.get $i) (i32.const 1)))
;; Continue if i < length
(br_if $continue
(i32.lt_u (local.get $i) (local.get $length))
)
)
(local.get $sum)
)
(export "sumArray" (func $sumArray))
)
Example 4: Pointer-Based Data Structure
(module
(memory 1)
(export "memory" (memory 0))
;; Linked list node: [value: i32, next: i32]
;; Node size: 8 bytes
;; The heap starts at 8 so that no node lives at address 0, which serves as null
(global $heapPtr (mut i32) (i32.const 8))
;; Allocate node
(func $allocNode (result i32)
(local $ptr i32)
(local.set $ptr (global.get $heapPtr))
(global.set $heapPtr
(i32.add (global.get $heapPtr) (i32.const 8))
)
(local.get $ptr)
)
;; Create node with value
(func $createNode (param $value i32) (result i32)
(local $node i32)
(local.set $node (call $allocNode))
;; Set value
(i32.store (local.get $node) (local.get $value))
;; Set next to null (0)
(i32.store offset=4 (local.get $node) (i32.const 0))
(local.get $node)
)
;; Get value from node
(func $getValue (param $node i32) (result i32)
(i32.load (local.get $node))
)
;; Get next pointer
(func $getNext (param $node i32) (result i32)
(i32.load offset=4 (local.get $node))
)
;; Set next pointer
(func $setNext (param $node i32) (param $next i32)
(i32.store offset=4 (local.get $node) (local.get $next))
)
;; Sum linked list
(func $sumList (param $head i32) (result i32)
(local $sum i32)
(local $current i32)
(local.set $sum (i32.const 0))
(local.set $current (local.get $head))
(loop $continue
(if (local.get $current)
(then
;; Add value
(local.set $sum
(i32.add (local.get $sum) (call $getValue (local.get $current)))
)
;; Move to next
(local.set $current (call $getNext (local.get $current)))
(br $continue)
)
)
)
(local.get $sum)
)
(export "createNode" (func $createNode))
(export "setNext" (func $setNext))
(export "sumList" (func $sumList))
)
Converting WAT to WASM
Using WABT Tools
# Install WABT (WebAssembly Binary Toolkit)
npm install -g wabt
# Compile WAT to WASM
wat2wasm module.wat -o module.wasm
# Disassemble WASM to WAT
wasm2wat module.wasm -o module.wat
# Validate WASM
wasm-validate module.wasm
# View binary structure
wasm-objdump -x module.wasm
# View disassembly
wasm-objdump -d module.wasm
Using in JavaScript
// Load and instantiate
const response = await fetch('module.wasm');
const { instance } = await WebAssembly.instantiateStreaming(response);
// Call exported function
const result = instance.exports.fib(10);
console.log(result); // 55
Debugging WAT
Adding Debug Information
(module $MyModule
;; $-identifiers (module, function, parameter, and local names) are symbolic
;; in the text format. wat2wasm can preserve them in the binary's "name"
;; custom section when invoked with --debug-names.
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
)
Common Debugging Techniques
1. Import console.log:
(import "console" "log" (func $log (param i32)))
(func $debug
(call $log (i32.const 42)) ;; Log value
)
2. Use unreachable for breakpoints:
(func $test
;; code
unreachable ;; Trap here
;; more code
)
3. Return intermediate values:
(func $compute (param $x i32) (result i32 i32)
(local $intermediate i32)
(local.set $intermediate (i32.mul (local.get $x) (i32.const 2)))
;; Return both intermediate and final
(local.get $intermediate)
(i32.add (local.get $intermediate) (i32.const 1))
)
Best Practices
1. Use Named Parameters and Locals
;; Good: Named and clear
(func $calculateArea (param $width f64) (param $height f64) (result f64)
(f64.mul (local.get $width) (local.get $height))
)
;; Bad: Unnamed
(func (param f64) (param f64) (result f64)
(f64.mul (local.get 0) (local.get 1))
)
2. Add Comments
;; Calculate compound interest
;; Formula: A = P(1 + r)^t
(func $compoundInterest
(param $principal f64) ;; Initial amount
(param $rate f64) ;; Annual interest rate
(param $time i32) ;; Years
(result f64) ;; Final amount
;; Implementation...
)
3. Use Local Variables for Clarity
;; Good: Clear intent
(func $pythagorean (param $a f64) (param $b f64) (result f64)
(local $aSquared f64)
(local $bSquared f64)
(local.set $aSquared (f64.mul (local.get $a) (local.get $a)))
(local.set $bSquared (f64.mul (local.get $b) (local.get $b)))
(f64.sqrt (f64.add (local.get $aSquared) (local.get $bSquared)))
)
4. Prefer Folded Format for Readability
;; Folded (more readable)
(i32.add
(i32.mul (local.get $x) (i32.const 2))
(i32.const 1)
)
;; Linear (harder to read)
local.get $x
i32.const 2
i32.mul
i32.const 1
i32.add
5. Validate Early
# Always validate before deploying (wat2wasm validates by default)
wat2wasm module.wat -o module.wasm
wasm-validate module.wasm
Summary
WebAssembly Text Format (WAT) provides:
1. S-expression syntax:
Lisp-like parentheses
Explicit nesting
Human-readable representation
2. Module structure:
Types, imports, functions
Memory, tables, globals
Exports, initialization
3. Type system:
Four numeric types: i32, i64, f32, f64
Reference types: funcref, externref
4. Instructions:
Stack-based operations
Arithmetic, logic, comparison
Memory load/store
Control flow (block, loop, if)
5. Memory model:
Linear memory array
Load/store with offsets
Data segments for initialization
6. Functions:
Named parameters/locals
Multiple return values
Direct and indirect calls
7. Tooling:
WABT for conversion/validation
Browser DevTools support
Source maps for debugging
Understanding WAT is essential for working with WebAssembly at a low level, debugging compiled output, and building tools that generate WebAssembly code. In the next chapter, we’ll explore how JavaScript and WebAssembly interoperate.
Chapter 15: JavaScript and WebAssembly Interop
Introduction: Bridging Two Worlds
JavaScript-WebAssembly interoperability is the foundation of practical WebAssembly applications. While WebAssembly excels at CPU-intensive computations, it relies on JavaScript for:
I/O operations (network, file system, DOM)
Complex data structures (objects, arrays, strings)
Browser APIs (Canvas, WebGL, Audio)
User interaction and event handling
┌────────────────────────────────────────────────────┐
│            JavaScript Host Environment             │
├────────────────────────────────────────────────────┤
│                                                    │
│    ┌──────────────┐        ┌──────────────┐        │
│    │  JavaScript  │ ←────→ │ WebAssembly  │        │
│    │    Engine    │ Interop│    Module    │        │
│    └──────────────┘        └──────────────┘        │
│           ↓                       ↓                │
│    ┌──────────────┐        ┌──────────────┐        │
│    │  JS Objects  │        │ Linear Memory│        │
│    │   DOM APIs   │        │ (Typed Data) │        │
│    └──────────────┘        └──────────────┘        │
│                                                    │
└────────────────────────────────────────────────────┘
Key Challenge: The fundamental mismatch between JavaScript’s dynamic, garbage-collected objects and WebAssembly’s static, linear memory model.
Loading WebAssembly Modules
Basic Loading Pattern
// Modern async approach (recommended)
async function loadWasm(url) {
try {
const response = await fetch(url);
const { instance, module } = await WebAssembly.instantiateStreaming(response);
return instance;
} catch (error) {
console.error('Failed to load WebAssembly:', error);
throw error;
}
}
// Usage
const wasmInstance = await loadWasm('module.wasm');
const result = wasmInstance.exports.add(5, 3);
console.log(result); // 8
Synchronous Loading (Node.js or with ArrayBuffer)
// Node.js
const fs = require('fs');
function loadWasmSync(filepath) {
const buffer = fs.readFileSync(filepath);
const module = new WebAssembly.Module(buffer);
const instance = new WebAssembly.Instance(module);
return instance;
}
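// Browser (after fetching). Synchronous compilation suits small modules only:
// some browsers (e.g. Chrome) reject large buffers on the main thread.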
async function loadWithArrayBuffer(url) {
const response = await fetch(url);
const buffer = await response.arrayBuffer();
const module = new WebAssembly.Module(buffer);
const instance = new WebAssembly.Instance(module);
return instance;
}
With Import Object
async function loadWithImports(url, importObject) {
const response = await fetch(url);
const { instance } = await WebAssembly.instantiateStreaming(
response,
importObject
);
return instance;
}
// Define imports
const importObject = {
env: {
log: (value) => console.log('WASM says:', value),
abort: (msg, file, line, column) => {
console.error(`Abort at ${file}:${line}:${column} - ${msg}`);
}
},
js: {
memory: new WebAssembly.Memory({ initial: 1 })
}
};
const instance = await loadWithImports('module.wasm', importObject);
Data Type Mapping
Primitive Types
WebAssembly ↔︎ JavaScript type correspondence:
// WebAssembly types → JavaScript types
/*
i32 → Number (32-bit integer)
i64 → BigInt
f32 → Number (32-bit float)
f64 → Number (64-bit float)
*/
// Example module
const wasmCode = `
(module
(func $testTypes
(param $int i32)
(param $bigint i64)
(param $float f32)
(param $double f64)
(result f64)
;; Convert and add all values
(f64.add
(f64.add
(f64.convert_i32_s (local.get $int))
(f64.convert_i64_s (local.get $bigint))
)
(f64.add
(f64.promote_f32 (local.get $float))
(local.get $double)
)
)
)
(export "testTypes" (func $testTypes))
)
`;
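// JavaScript usage
// loadWasmFromText is a helper assumed throughout this chapter: it compiles
// WAT source to a binary (for example with WABT) and instantiates it.
const instance = await loadWasmFromText(wasmCode);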
const result = instance.exports.testTypes(
10, // i32
BigInt(20), // i64 (must use BigInt!)
3.14, // f32
2.71828 // f64
);
console.log(result); // ≈ 35.85828 (3.14 is first rounded to f32 precision)
Important: i64 parameters and returns require BigInt in JavaScript:
// Correct
wasmFunc(42, BigInt(100));
// TypeError: Cannot convert 100 to a BigInt
wasmFunc(42, 100);
Type Coercion Table
| Wasm Type | JS Input | JS Output | Notes |
|---|---|---|---|
| i32 | Number | Number | Wrapped to 32 bits (ToInt32); results are always signed |
| i64 | BigInt | BigInt | Must use BigInt; wrapped to 64 bits |
| f32 | Number | Number | Rounded to single precision; precision loss possible |
| f64 | Number | Number | Full precision |
| funcref | Exported Wasm function or null | Function or null | Reference types (MVP+) |
| externref | Any JS value | Same value | Opaque reference (MVP+) |
Memory Management
Accessing Linear Memory
// WAT module with memory
const wasmCode = `
(module
(memory 1) ;; 1 page = 64KB
(export "memory" (memory 0))
;; Store integer at address
(func $writeInt (param $addr i32) (param $value i32)
(i32.store (local.get $addr) (local.get $value))
)
;; Read integer from address
(func $readInt (param $addr i32) (result i32)
(i32.load (local.get $addr))
)
(export "writeInt" (func $writeInt))
(export "readInt" (func $readInt))
)
`;
const instance = await loadWasmFromText(wasmCode);
// Access memory from JavaScript
const memory = instance.exports.memory;
const buffer = memory.buffer;
// Create typed array views
const int32View = new Int32Array(buffer);
const uint8View = new Uint8Array(buffer);
const float64View = new Float64Array(buffer);
// Write via JavaScript
int32View[0] = 42;
int32View[1] = 100;
// Read via WebAssembly
console.log(instance.exports.readInt(0)); // 42
console.log(instance.exports.readInt(4)); // 100
// Write via WebAssembly
instance.exports.writeInt(8, 999);
// Read via JavaScript
console.log(int32View[2]); // 999
Shared Memory Creation
// JavaScript creates memory, shares with Wasm
const memory = new WebAssembly.Memory({
initial: 1, // 1 page (64KB)
maximum: 10 // Max 10 pages (640KB)
});
const importObject = {
js: { memory }
};
const wasmCode = `
(module
(import "js" "memory" (memory 1))
(func $init
;; Initialize memory
(i32.store (i32.const 0) (i32.const 42))
)
(export "init" (func $init))
)
`;
const instance = await loadWasmFromText(wasmCode, importObject);
// Both JS and Wasm share the same memory
instance.exports.init();
const view = new Int32Array(memory.buffer);
console.log(view[0]); // 42
Memory Growth
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
;; Grow memory by n pages
(func $growMemory (param $pages i32) (result i32)
(memory.grow (local.get $pages))
)
(export "growMemory" (func $growMemory))
)
`;
const instance = await loadWasmFromText(wasmCode);
const memory = instance.exports.memory;
console.log('Initial size:', memory.buffer.byteLength); // 65536 (64KB)
// Grow by 2 pages
const oldSize = instance.exports.growMemory(2);
console.log('Previous size (pages):', oldSize); // 1
console.log('New size:', memory.buffer.byteLength); // 196608 (192KB)
// IMPORTANT: buffer reference is now stale!
// Must get new buffer reference after growth
const newBuffer = memory.buffer;
const newView = new Uint8Array(newBuffer);
Critical Warning: After memory.grow(), the old buffer reference becomes detached. Always re-obtain the buffer:
// ❌ Wrong: Stale buffer reference
const oldBuffer = memory.buffer;
const oldView = new Uint8Array(oldBuffer);
instance.exports.growMemory(1);
oldView[0] = 42; // Silently ignored: the detached view now has length 0
// ✅ Correct: Get new buffer
function getMemoryView(memory) {
return new Uint8Array(memory.buffer);
}
let view = getMemoryView(memory);
instance.exports.growMemory(1);
view = getMemoryView(memory); // Refresh reference
view[0] = 42; // OK
String Handling
The String Problem
WebAssembly has no native string type. Strings must be:
Encoded as byte sequences in linear memory
Shared via memory addresses (pointers)
Decoded on the receiving end
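The three steps above can be sketched with nothing more than TextEncoder, TextDecoder, and a WebAssembly.Memory. The address 32 is an arbitrary choice for the illustration; in a real module it would come from an allocator. Note that the UTF-8 byte length differs from the JavaScript string length as soon as non-ASCII characters appear.

```javascript
const textEncoder = new TextEncoder();
const textDecoder = new TextDecoder("utf-8");
const stringMemory = new WebAssembly.Memory({ initial: 1 });

const text = "héllo ✓";
const utf8 = textEncoder.encode(text);   // 1. encode as bytes
console.log(text.length); // 7 UTF-16 code units
console.log(utf8.length); // 10 UTF-8 bytes

const address = 32;                      // 2. share via an address (arbitrary here)
const bytesView = new Uint8Array(stringMemory.buffer);
bytesView.set(utf8, address);
bytesView[address + utf8.length] = 0;    // null terminator for C-style readers

const roundTrip = textDecoder.decode(    // 3. decode on the receiving end
  bytesView.subarray(address, address + utf8.length)
);
console.log(roundTrip === text); // true
```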
JavaScript → WebAssembly (Passing Strings)
// JavaScript side: String encoding utilities
class WasmStringHelper {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.memory = wasmInstance.exports.memory;
this.encoder = new TextEncoder();
this.decoder = new TextDecoder('utf-8');
}
// Get current memory view (refresh after growth)
getMemoryView() {
return new Uint8Array(this.memory.buffer);
}
// Allocate space for string in Wasm memory
allocateString(str) {
const bytes = this.encoder.encode(str);
const len = bytes.length;
// Allocate memory in Wasm (assumes malloc export)
const ptr = this.instance.exports.malloc(len + 1); // +1 for null terminator
// Copy string bytes
const view = this.getMemoryView();
view.set(bytes, ptr);
view[ptr + len] = 0; // Null terminator
return { ptr, len };
}
// Read C-style string from memory (null-terminated)
readCString(ptr) {
const view = this.getMemoryView();
let end = ptr;
// Find null terminator (bounded, so a missing terminator cannot loop forever)
while (end < view.length && view[end] !== 0) end++;
// Decode bytes
return this.decoder.decode(view.subarray(ptr, end));
}
// Read string with known length
readString(ptr, length) {
const view = this.getMemoryView();
return this.decoder.decode(view.subarray(ptr, ptr + length));
}
}
// Example usage
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
(global $heapPtr (mut i32) (i32.const 0))
;; Simple malloc
(func $malloc (param $size i32) (result i32)
(local $ptr i32)
(local.set $ptr (global.get $heapPtr))
(global.set $heapPtr
(i32.add (global.get $heapPtr) (local.get $size))
)
(local.get $ptr)
)
;; String length (C-style)
(func $strlen (param $ptr i32) (result i32)
(local $len i32)
(loop $continue
(if (i32.load8_u (i32.add (local.get $ptr) (local.get $len)))
(then
(local.set $len (i32.add (local.get $len) (i32.const 1)))
(br $continue)
)
)
)
(local.get $len)
)
;; Reverse string in place
(func $reverseString (param $ptr i32) (param $len i32)
(local $i i32)
(local $j i32)
(local $temp i32)
(local.set $i (i32.const 0))
(local.set $j (i32.sub (local.get $len) (i32.const 1)))
(loop $continue
(if (i32.lt_s (local.get $i) (local.get $j))
(then
;; Swap bytes
(local.set $temp
(i32.load8_u (i32.add (local.get $ptr) (local.get $i)))
)
(i32.store8
(i32.add (local.get $ptr) (local.get $i))
(i32.load8_u (i32.add (local.get $ptr) (local.get $j)))
)
(i32.store8
(i32.add (local.get $ptr) (local.get $j))
(local.get $temp)
)
(local.set $i (i32.add (local.get $i) (i32.const 1)))
(local.set $j (i32.sub (local.get $j) (i32.const 1)))
(br $continue)
)
)
)
)
(export "malloc" (func $malloc))
(export "strlen" (func $strlen))
(export "reverseString" (func $reverseString))
)
`;
const instance = await loadWasmFromText(wasmCode);
const helper = new WasmStringHelper(instance);
// Pass string to WebAssembly
const inputStr = "Hello, WebAssembly!";
const { ptr, len } = helper.allocateString(inputStr);
console.log('String at address:', ptr);
console.log('Length:', len);
// Reverse string in Wasm memory
instance.exports.reverseString(ptr, len);
// Read result
const reversed = helper.readCString(ptr);
console.log('Reversed:', reversed); // "!ylbmessAbeW ,olleH"
WebAssembly → JavaScript (Returning Strings)
// Pattern 1: Return pointer, JS reads from memory
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
;; Store greeting in memory (explicit null terminator for readCString)
(data (i32.const 0) "Hello from Wasm!\00")
;; Return pointer to string
(func $getGreeting (result i32)
(i32.const 0)
)
(export "getGreeting" (func $getGreeting))
)
`;
const instance = await loadWasmFromText(wasmCode);
const helper = new WasmStringHelper(instance);
const ptr = instance.exports.getGreeting();
const greeting = helper.readCString(ptr);
console.log(greeting); // "Hello from Wasm!"
// Pattern 2: Return pointer + length
const wasmCode2 = `
(module
(memory 1)
(export "memory" (memory 0))
(global $strPtr i32 (i32.const 0))
(global $strLen i32 (i32.const 16))
(data (i32.const 0) "Hello from Wasm!")
(func $getStringPtr (result i32)
(global.get $strPtr)
)
(func $getStringLen (result i32)
(global.get $strLen)
)
(export "getStringPtr" (func $getStringPtr))
(export "getStringLen" (func $getStringLen))
)
`;
const instance2 = await loadWasmFromText(wasmCode2);
const helper2 = new WasmStringHelper(instance2);
const strPtr = instance2.exports.getStringPtr();
const strLen = instance2.exports.getStringLen();
const message = helper2.readString(strPtr, strLen);
console.log(message); // "Hello from Wasm!"
Advanced: String Builder Pattern
// JavaScript wrapper for string building
class WasmStringBuilder {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.helper = new WasmStringHelper(wasmInstance);
this.bufferPtr = null;
this.capacity = 0;
}
// Initialize buffer
init(initialCapacity = 256) {
this.capacity = initialCapacity;
this.bufferPtr = this.instance.exports.malloc(this.capacity);
}
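// Append string (stringBuilderAppend and stringBuilderLength are exports
// the Wasm module is assumed to provide)
append(str) {
// Use the UTF-8 byte length from allocateString, not str.length (UTF-16 code units)
const { ptr, len } = this.helper.allocateString(str);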
// Call Wasm function to append
this.instance.exports.stringBuilderAppend(
this.bufferPtr,
this.capacity,
ptr,
len
);
}
// Get result
toString() {
const len = this.instance.exports.stringBuilderLength(this.bufferPtr);
return this.helper.readString(this.bufferPtr, len);
}
}
Complex Data Structures
Arrays
// Passing JavaScript arrays to WebAssembly
class WasmArrayHelper {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.memory = wasmInstance.exports.memory;
}
// Allocate and copy i32 array
allocateInt32Array(jsArray) {
const length = jsArray.length;
const bytes = length * 4; // 4 bytes per i32
// Allocate
const ptr = this.instance.exports.malloc(bytes);
// Copy data
const view = new Int32Array(this.memory.buffer, ptr, length);
view.set(jsArray);
return { ptr, length };
}
// Read i32 array from memory
readInt32Array(ptr, length) {
const view = new Int32Array(this.memory.buffer, ptr, length);
return Array.from(view);
}
// Allocate and copy f64 array (ptr must be a multiple of 8,
// otherwise the Float64Array constructor throws a RangeError)
allocateFloat64Array(jsArray) {
const length = jsArray.length;
const bytes = length * 8; // 8 bytes per f64
const ptr = this.instance.exports.malloc(bytes);
const view = new Float64Array(this.memory.buffer, ptr, length);
view.set(jsArray);
return { ptr, length };
}
readFloat64Array(ptr, length) {
const view = new Float64Array(this.memory.buffer, ptr, length);
return Array.from(view);
}
}
// Example: Vector operations
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
(global $heapPtr (mut i32) (i32.const 0))
(func $malloc (param $size i32) (result i32)
(local $ptr i32)
(local.set $ptr (global.get $heapPtr))
(global.set $heapPtr
(i32.add (global.get $heapPtr) (local.get $size))
)
(local.get $ptr)
)
;; Add two f64 arrays element-wise
(func $addVectors
(param $a i32) ;; Pointer to array A
(param $b i32) ;; Pointer to array B
(param $result i32) ;; Pointer to result array
(param $length i32) ;; Array length
(local $i i32)
(local $offset i32)
(loop $continue
(if (i32.lt_u (local.get $i) (local.get $length))
(then
;; offset = i * 8 (8 bytes per f64)
(local.set $offset (i32.mul (local.get $i) (i32.const 8)))
;; result[i] = a[i] + b[i]
(f64.store
(i32.add (local.get $result) (local.get $offset))
(f64.add
(f64.load (i32.add (local.get $a) (local.get $offset)))
(f64.load (i32.add (local.get $b) (local.get $offset)))
)
)
(local.set $i (i32.add (local.get $i) (i32.const 1)))
(br $continue)
)
)
)
)
;; Dot product of two vectors
(func $dotProduct
(param $a i32)
(param $b i32)
(param $length i32)
(result f64)
(local $i i32)
(local $sum f64)
(local $offset i32)
(local.set $sum (f64.const 0))
(loop $continue
(if (i32.lt_u (local.get $i) (local.get $length))
(then
(local.set $offset (i32.mul (local.get $i) (i32.const 8)))
;; sum += a[i] * b[i]
(local.set $sum
(f64.add
(local.get $sum)
(f64.mul
(f64.load (i32.add (local.get $a) (local.get $offset)))
(f64.load (i32.add (local.get $b) (local.get $offset)))
)
)
)
(local.set $i (i32.add (local.get $i) (i32.const 1)))
(br $continue)
)
)
)
(local.get $sum)
)
(export "malloc" (func $malloc))
(export "addVectors" (func $addVectors))
(export "dotProduct" (func $dotProduct))
)
`;
const instance = await loadWasmFromText(wasmCode);
const arrayHelper = new WasmArrayHelper(instance);
// JavaScript arrays
const vecA = [1.0, 2.0, 3.0, 4.0];
const vecB = [5.0, 6.0, 7.0, 8.0];
// Allocate in Wasm memory
const { ptr: ptrA } = arrayHelper.allocateFloat64Array(vecA);
const { ptr: ptrB } = arrayHelper.allocateFloat64Array(vecB);
const { ptr: ptrResult } = arrayHelper.allocateFloat64Array(new Array(4).fill(0));
// Vector addition
instance.exports.addVectors(ptrA, ptrB, ptrResult, 4);
const sum = arrayHelper.readFloat64Array(ptrResult, 4);
console.log('A + B =', sum); // [6, 8, 10, 12]
// Dot product
const dot = instance.exports.dotProduct(ptrA, ptrB, 4);
console.log('A · B =', dot); // 70
Structures (Records)
// JavaScript representation of C-like struct
class WasmStructHelper {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.memory = wasmInstance.exports.memory;
}
getMemoryView() {
return new DataView(this.memory.buffer);
}
// Example: Person struct
// struct Person {
// i32 id; // offset 0
// i32 age; // offset 4
// f64 salary; // offset 8
// };
// Total size: 16 bytes
writePerson(ptr, person) {
const view = this.getMemoryView();
view.setInt32(ptr, person.id, true); // Little-endian
view.setInt32(ptr + 4, person.age, true);
view.setFloat64(ptr + 8, person.salary, true);
}
readPerson(ptr) {
const view = this.getMemoryView();
return {
id: view.getInt32(ptr, true),
age: view.getInt32(ptr + 4, true),
salary: view.getFloat64(ptr + 8, true)
};
}
// Allocate person
allocatePerson(person) {
const ptr = this.instance.exports.malloc(16);
this.writePerson(ptr, person);
return ptr;
}
}
// Example usage
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
(global $heapPtr (mut i32) (i32.const 0))
(func $malloc (param $size i32) (result i32)
(local $ptr i32)
(local.set $ptr (global.get $heapPtr))
(global.set $heapPtr
(i32.add (global.get $heapPtr) (local.get $size))
)
(local.get $ptr)
)
;; Calculate bonus (10% of salary)
(func $calculateBonus (param $personPtr i32) (result f64)
(f64.mul
(f64.load offset=8 (local.get $personPtr)) ;; Load salary
(f64.const 0.1)
)
)
;; Increment age
(func $incrementAge (param $personPtr i32)
(i32.store offset=4
(local.get $personPtr)
(i32.add
(i32.load offset=4 (local.get $personPtr))
(i32.const 1)
)
)
)
(export "malloc" (func $malloc))
(export "calculateBonus" (func $calculateBonus))
(export "incrementAge" (func $incrementAge))
)
`;
const instance = await loadWasmFromText(wasmCode);
const structHelper = new WasmStructHelper(instance);
// Create person
const person = {
id: 101,
age: 30,
salary: 75000.0
};
// Allocate in Wasm
const personPtr = structHelper.allocatePerson(person);
// Calculate bonus
const bonus = instance.exports.calculateBonus(personPtr);
console.log('Bonus:', bonus); // 7500
// Increment age
instance.exports.incrementAge(personPtr);
// Read updated person
const updated = structHelper.readPerson(personPtr);
console.log('Updated:', updated); // { id: 101, age: 31, salary: 75000 }
Importing JavaScript Functions
Basic Function Import
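The listings in this subsection depend on `loadWasmFromText`, which needs a WAT-to-binary step. As a self-contained check of how an import object is matched by module name and field name, the following sketch instantiates a hand-assembled binary directly. The byte array is an assumption written for this illustration (a module that imports `env.log` and exports `run`), not output from the toolchain used elsewhere in the chapter.

```javascript
// Hand-assembled module, equivalent to:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") (call $log (i32.const 42))))
const importDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import: module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func of type 0
  0x03, 0x02, 0x01, 0x01,                                     // one defined function, type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export "run" = function index 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b  // body: i32.const 42; call 0; end
]);

const importDemoModule = new WebAssembly.Module(importDemoBytes);

// The import object mirrors the two-level (module, field) naming in the binary.
const logged = [];
const demoImports = { env: { log: (value) => logged.push(value) } };
const importDemo = new WebAssembly.Instance(importDemoModule, demoImports);
importDemo.exports.run();
console.log(logged); // [ 42 ]

// A field that is present in the module but missing from the import object
// is rejected when the module is instantiated, not when it is first called.
let linkFailed = false;
try {
  new WebAssembly.Instance(importDemoModule, { env: {} });
} catch (error) {
  linkFailed = error instanceof WebAssembly.LinkError;
}
console.log(linkFailed); // true
```

Imported functions occupy the lowest function indices, which is why the exported function is index 1 and the `call` targets index 0.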
const importObject = {
env: {
// Simple logging
log: (value) => {
console.log('Wasm log:', value);
},
// Math operations
randomFloat: () => Math.random(),
getCurrentTime: () => Date.now(),
// Assertions
assert: (condition) => {
if (!condition) {
throw new Error('Assertion failed');
}
}
}
};
const wasmCode = `
(module
(import "env" "log" (func $log (param i32)))
(import "env" "randomFloat" (func $randomFloat (result f64)))
(import "env" "getCurrentTime" (func $getCurrentTime (result f64)))
(import "env" "assert" (func $assert (param i32)))
(func $test
;; Locals must be declared before the first instruction
(local $rand f64)
;; Log a value
(call $log (i32.const 42))
;; Get random number
(local.set $rand (call $randomFloat))
;; Assert it's in range [0, 1)
(call $assert
(f64.lt (local.get $rand) (f64.const 1.0))
)
;; Get timestamp
(drop (call $getCurrentTime))
)
(export "test" (func $test))
)
`;
const instance = await loadWasmFromText(wasmCode, importObject);
instance.exports.test();
Callback Pattern
// JavaScript provides callback, Wasm calls it
class WasmWithCallbacks {
constructor() {
this.callbacks = new Map();
this.nextId = 0;
}
createImportObject() {
return {
env: {
// Register callback
registerCallback: (callbackId) => {
console.log('Callback registered:', callbackId);
},
// Invoke callback
invokeCallback: (callbackId, value) => {
const callback = this.callbacks.get(callbackId);
if (callback) {
return callback(value);
}
return 0;
}
}
};
}
// JavaScript registers callback
onEvent(callback) {
const id = this.nextId++;
this.callbacks.set(id, callback);
return id;
}
}
const manager = new WasmWithCallbacks();
const wasmCode = `
(module
(import "env" "registerCallback" (func $registerCallback (param i32)))
(import "env" "invokeCallback" (func $invokeCallback (param i32 i32) (result i32)))
;; Process data with callback
(func $processWithCallback (param $callbackId i32) (param $data i32) (result i32)
;; Do some processing
(local $processed i32)
(local.set $processed (i32.mul (local.get $data) (i32.const 2)))
;; Invoke JavaScript callback
(call $invokeCallback
(local.get $callbackId)
(local.get $processed)
)
)
(export "processWithCallback" (func $processWithCallback))
)
`;
const instance = await loadWasmFromText(
wasmCode,
manager.createImportObject()
);
// Register callback
const callbackId = manager.onEvent((value) => {
console.log('Callback invoked with:', value);
return value + 10;
});
// Process data
const result = instance.exports.processWithCallback(callbackId, 5);
console.log('Final result:', result); // Callback invoked with: 10, Final result: 20
Exporting to JavaScript
Exporting Functions
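The exports in this subsection take i32 and f64 values, which cross the boundary as ordinary JavaScript numbers. An exported function with i64 parameters behaves differently: it accepts and returns BigInt. The sketch below assumes a hand-assembled module (the byte array is written for this illustration) exporting `add` with signature (i64, i64) -> i64.

```javascript
// Hand-assembled module, equivalent to:
//   (module (func (export "add") (param i64 i64) (result i64)
//     (i64.add (local.get 0) (local.get 1))))
const add64Bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7e, 0x7e, 0x01, 0x7e,       // type: (i64, i64) -> i64
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x7c, 0x0b // local.get 0/1; i64.add
]);

const add64 = new WebAssembly.Instance(new WebAssembly.Module(add64Bytes)).exports.add;

// i64 values cross the boundary as BigInt
const big = add64(2n ** 40n, 1n);
console.log(big); // 1099511627777n

// Results are signed 64-bit, so overflow wraps around
const wrapped = add64(2n ** 63n - 1n, 1n);
console.log(wrapped); // -9223372036854775808n

// Plain numbers are not converted implicitly
let numberRejected = false;
try {
  add64(1, 2);
} catch (error) {
  numberRejected = error instanceof TypeError;
}
console.log(numberRejected); // true
```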
const wasmCode = `
(module
;; Math utilities
(func $add (param $a i32) (param $b i32) (result i32)
(i32.add (local.get $a) (local.get $b))
)
(func $multiply (param $a f64) (param $b f64) (result f64)
(f64.mul (local.get $a) (local.get $b))
)
;; Export with same name
(export "add" (func $add))
;; Export with different name
(export "mul" (func $multiply))
)
`;
const instance = await loadWasmFromText(wasmCode);
// Access exports
console.log(instance.exports.add(5, 3)); // 8
console.log(instance.exports.mul(2.5, 4.0)); // 10
Exporting Memory
const wasmCode = `
(module
(memory 2)
(export "memory" (memory 0))
;; Initialize with data
(data (i32.const 0) "WebAssembly")
(func $getDataPtr (result i32)
(i32.const 0)
)
(export "getDataPtr" (func $getDataPtr))
)
`;
const instance = await loadWasmFromText(wasmCode);
// Access exported memory
const memory = instance.exports.memory;
const view = new Uint8Array(memory.buffer);
const ptr = instance.exports.getDataPtr();
const decoder = new TextDecoder();
const text = decoder.decode(view.subarray(ptr, ptr + 11));
console.log(text); // "WebAssembly"
Exporting Globals
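An exported global reaches JavaScript as a `WebAssembly.Global` object, and the same class can be constructed directly. That makes the mutable/immutable distinction easy to try without compiling a module. A small sketch; the variable names are illustrative:

```javascript
// A mutable i32 global, comparable to (global $counter (mut i32) (i32.const 0))
const counterGlobal = new WebAssembly.Global({ value: 'i32', mutable: true }, 0);
counterGlobal.value = 42;
console.log(counterGlobal.value); // 42

// An immutable global, comparable to (global $VERSION i32 (i32.const 100))
const versionGlobal = new WebAssembly.Global({ value: 'i32', mutable: false }, 100);
console.log(versionGlobal.value); // 100

// Writing to an immutable global throws instead of being silently ignored
let writeRejected = false;
try {
  versionGlobal.value = 101;
} catch (error) {
  writeRejected = error instanceof TypeError;
}
console.log(writeRejected); // true
```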
const wasmCode = `
(module
;; Immutable global (constant)
(global $VERSION i32 (i32.const 100))
(export "VERSION" (global $VERSION))
;; Mutable global (state)
(global $counter (mut i32) (i32.const 0))
(export "counter" (global $counter))
(func $increment
(global.set $counter
(i32.add (global.get $counter) (i32.const 1))
)
)
(export "increment" (func $increment))
)
`;
const instance = await loadWasmFromText(wasmCode);
// Read constant
console.log('Version:', instance.exports.VERSION.value); // 100
// Read/write mutable global
console.log('Counter:', instance.exports.counter.value); // 0
instance.exports.increment();
console.log('Counter:', instance.exports.counter.value); // 1
// Set from JavaScript
instance.exports.counter.value = 42;
console.log('Counter:', instance.exports.counter.value); // 42
Exporting Tables
const wasmCode = `
(module
(type $binop (func (param i32 i32) (result i32)))
(func $add (param i32 i32) (result i32)
(i32.add (local.get 0) (local.get 1))
)
(func $subtract (param i32 i32) (result i32)
(i32.sub (local.get 0) (local.get 1))
)
(func $multiply (param i32 i32) (result i32)
(i32.mul (local.get 0) (local.get 1))
)
;; Function table
(table $ops 3 funcref)
(elem (i32.const 0) $add $subtract $multiply)
(export "ops" (table $ops))
;; Indirect call wrapper
(func $calculate (param $op i32) (param $a i32) (param $b i32) (result i32)
(call_indirect (type $binop)
(local.get $a)
(local.get $b)
(local.get $op)
)
)
(export "calculate" (func $calculate))
)
`;
const instance = await loadWasmFromText(wasmCode);
// Call via table index
console.log(instance.exports.calculate(0, 10, 5)); // 15 (add)
console.log(instance.exports.calculate(1, 10, 5)); // 5 (subtract)
console.log(instance.exports.calculate(2, 10, 5)); // 50 (multiply)
// Access table from JavaScript
const table = instance.exports.ops;
console.log('Table length:', table.length); // 3
// Get function from table
const addFunc = table.get(0);
console.log('Direct call:', addFunc(10, 5)); // 15
Performance Considerations
Minimizing Boundary Crossings
❌ Bad: Frequent JS ↔︎ Wasm calls:
// Inefficient: Call Wasm for each element
for (let i = 0; i < 10000; i++) {
result[i] = wasmInstance.exports.process(data[i]);
}
✅ Good: Batch processing:
// Efficient: Single call with array
const { ptr: dataPtr } = allocateInt32Array(data);
const { ptr: resultPtr } = allocateInt32Array(new Array(10000));
wasmInstance.exports.processArray(dataPtr, resultPtr, 10000);
const result = readInt32Array(resultPtr, 10000);
Memory Access Patterns
✅ Use TypedArrays for bulk operations:
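One constraint applies to any typed view over linear memory: the byte offset must be a multiple of the element size, whereas a DataView can read and write at any offset. A sketch using a standalone `WebAssembly.Memory`; the pointer values are hypothetical and no allocator is involved:

```javascript
const scratch = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// An 8-byte-aligned pointer works for a Float64Array view
const alignedPtr = 16;
const doubles = new Float64Array(scratch.buffer, alignedPtr, 4);
doubles.set([1, 2, 3, 4]);
for (let i = 0; i < doubles.length; i++) {
  doubles[i] *= 2;
}
console.log(Array.from(doubles)); // [ 2, 4, 6, 8 ]

// Offset 100 is not a multiple of 8, so the view cannot be created
let misaligned = false;
try {
  new Float64Array(scratch.buffer, 100, 1);
} catch (error) {
  misaligned = error instanceof RangeError;
}
console.log(misaligned); // true

// DataView has no alignment requirement; pass true for little-endian,
// the byte order Wasm loads and stores use
const unaligned = new DataView(scratch.buffer);
unaligned.setFloat64(100, 1.5, true);
console.log(unaligned.getFloat64(100, true)); // 1.5
```

If an allocator can return pointers that are not 8-byte aligned, either round allocations up or fall back to DataView for the affected fields.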
// Fast: Direct memory access
const view = new Float64Array(memory.buffer, ptr, length);
for (let i = 0; i < length; i++) {
view[i] *= 2.0;
}
// Slower: Individual loads/stores through Wasm
for (let i = 0; i < length; i++) {
const value = instance.exports.load(ptr + i * 8);
instance.exports.store(ptr + i * 8, value * 2.0);
}
String Encoding Optimization
✅ Cache encoder/decoder instances:
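Beyond caching the instances, `TextEncoder.prototype.encodeInto` can write UTF-8 straight into a view over linear memory, which avoids the intermediate array that `encode` allocates. A sketch under stated assumptions: `writeString` and `readString` are hypothetical helpers, and the target address is fixed rather than obtained from an allocator.

```javascript
const stringMemory = new WebAssembly.Memory({ initial: 1 });
const sharedEncoder = new TextEncoder();
const sharedDecoder = new TextDecoder('utf-8');

// Hypothetical helper: encodes str directly into memory at ptr and returns
// the number of bytes written. Throws if the region is too small.
function writeString(str, ptr, maxBytes) {
  const target = new Uint8Array(stringMemory.buffer, ptr, maxBytes);
  const { read, written } = sharedEncoder.encodeInto(str, target);
  if (read < str.length) {
    throw new RangeError('String does not fit in the target region');
  }
  return written;
}

// Hypothetical helper: decodes byteLength bytes of UTF-8 starting at ptr.
function readString(ptr, byteLength) {
  return sharedDecoder.decode(new Uint8Array(stringMemory.buffer, ptr, byteLength));
}

const bytesWritten = writeString('h\u00e9llo', 0, 64);
console.log(bytesWritten); // 6 (the accented letter takes two bytes)
console.log(readString(0, bytesWritten)); // héllo

// A region that is too small is reported rather than silently truncated
let tooSmall = false;
try {
  writeString('hello', 128, 3);
} catch (error) {
  tooSmall = error instanceof RangeError;
}
console.log(tooSmall); // true
```

The byte count, not the string length, is what the Wasm side needs when it receives the pointer.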
// Good: Reuse encoder/decoder
class StringHelper {
constructor() {
this.encoder = new TextEncoder();
this.decoder = new TextDecoder('utf-8');
}
}
// Bad: Create new instances each time
function encodeString(str) {
return new TextEncoder().encode(str); // Wasteful
}
Avoid Memory Growth During Hot Paths
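Growth is a hazard because `grow()` on a non-shared memory detaches the old ArrayBuffer, so every view created earlier becomes empty. The following sketch shows the effect with a standalone memory; the names are illustrative:

```javascript
const growable = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// A view created before growth
const staleView = new Uint8Array(growable.buffer);
staleView[0] = 42;

// grow() returns the previous size in pages
const previousPages = growable.grow(1);
console.log(previousPages); // 1

// The old buffer is now detached: the stale view has length 0
console.log(staleView.length); // 0

// The contents were preserved; a fresh view over memory.buffer sees them
const freshView = new Uint8Array(growable.buffer);
console.log(freshView.length); // 131072 (2 pages x 65536 bytes)
console.log(freshView[0]); // 42
```

Helper classes that cache views should therefore re-create them after any call that might allocate, or create them on demand as `getMemoryView()` does in the struct helper earlier in the chapter.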
// Pre-allocate sufficient memory
const memory = new WebAssembly.Memory({
initial: 100, // 100 pages × 64 KiB = 6.25 MiB
maximum: 1000 // 1000 pages × 64 KiB = 62.5 MiB
});
// Growth invalidates all TypedArray views!
// Avoid growing during performance-critical operations
Error Handling
Catching Wasm Traps
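The listing in this subsection assumes a `divide` export. For a version that runs on its own, the sketch below uses a hand-assembled module (the byte array is written for this illustration) whose `divide` is a bare `i32.div_s`, and shows the two inputs on which that instruction traps.

```javascript
// Hand-assembled module, equivalent to:
//   (module (func (export "divide") (param i32 i32) (result i32)
//     (i32.div_s (local.get 0) (local.get 1))))
const divideBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x0a, 0x01, 0x06, 0x64, 0x69, 0x76, 0x69, 0x64, 0x65, // export "divide"
  0x00, 0x00,                                                 //   = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6d, 0x0b // local.get 0/1; i32.div_s
]);

const divide = new WebAssembly.Instance(new WebAssembly.Module(divideBytes)).exports.divide;

// Returns true if calling fn raises a Wasm trap
function traps(fn) {
  try {
    fn();
    return false;
  } catch (error) {
    if (error instanceof WebAssembly.RuntimeError) return true;
    throw error; // not a trap: let it propagate
  }
}

console.log(divide(10, 2)); // 5
console.log(traps(() => divide(10, 0))); // true (division by zero)
console.log(traps(() => divide(-2147483648, -1))); // true (result does not fit in i32)
console.log(traps(() => divide(7, -1))); // false
```

After a trap the instance remains usable; only the call that trapped is abandoned.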
try {
// This might trap (divide by zero, out of bounds, etc.)
const result = instance.exports.divide(10, 0);
} catch (error) {
if (error instanceof WebAssembly.RuntimeError) {
console.error('Wasm runtime error:', error.message);
// Handle trap
} else {
throw error;
}
}
Custom Error Handling
// Import error handler
const importObject = {
env: {
throwError: (code) => {
const errors = {
1: 'Invalid input',
2: 'Out of bounds',
3: 'Division by zero'
};
throw new Error(errors[code] || 'Unknown error');
}
}
};
const wasmCode = `
(module
(import "env" "throwError" (func $throwError (param i32)))
(func $safeDivide (param $a i32) (param $b i32) (result i32)
;; Check for division by zero
(if (i32.eqz (local.get $b))
(then
(call $throwError (i32.const 3))
(unreachable)
)
)
(i32.div_s (local.get $a) (local.get $b))
)
(export "safeDivide" (func $safeDivide))
)
`;
const instance = await loadWasmFromText(wasmCode, importObject);
try {
instance.exports.safeDivide(10, 0);
} catch (error) {
console.error('Error:', error.message); // "Division by zero"
}
Advanced Patterns
Async Wasm Operations
// Wasm is synchronous, but we can wrap in async
class AsyncWasmWorker {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.queue = [];
this.processing = false;
}
async compute(data) {
return new Promise((resolve, reject) => {
this.queue.push({ data, resolve, reject });
this.processQueue();
});
}
async processQueue() {
if (this.processing || this.queue.length === 0) return;
this.processing = true;
while (this.queue.length > 0) {
const { data, resolve, reject } = this.queue.shift();
try {
// Yield to event loop
await new Promise(r => setTimeout(r, 0));
// Compute in Wasm
const result = this.instance.exports.heavyComputation(data);
resolve(result);
} catch (error) {
reject(error);
}
}
this.processing = false;
}
}
Web Workers with Wasm
// main.js
const worker = new Worker('wasm-worker.js');
worker.postMessage({
type: 'init',
wasmUrl: 'module.wasm'
});
worker.onmessage = (event) => {
if (event.data.type === 'result') {
console.log('Result:', event.data.value);
}
};
// Send work
worker.postMessage({
type: 'compute',
data: [1, 2, 3, 4, 5]
});
// wasm-worker.js
let wasmInstance;
self.onmessage = async (event) => {
if (event.data.type === 'init') {
const response = await fetch(event.data.wasmUrl);
const { instance } = await WebAssembly.instantiateStreaming(response);
wasmInstance = instance;
self.postMessage({ type: 'ready' });
}
if (event.data.type === 'compute') {
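// Process in Wasm. A JS array cannot be passed to an export directly, so copy it
// into linear memory and pass a pointer plus length. This assumes the module
// exports malloc and memory, and that process takes (ptr, length) over i32 elements.
const { malloc, memory, process } = wasmInstance.exports;
const input = Int32Array.from(event.data.data);
const ptr = malloc(input.byteLength);
new Int32Array(memory.buffer, ptr, input.length).set(input);
const result = process(ptr, input.length);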
self.postMessage({
type: 'result',
value: result
});
}
};
SIMD Operations (Bonus)
// WebAssembly SIMD support (post-MVP feature)
const wasmCode = `
(module
(memory 1)
(export "memory" (memory 0))
;; SIMD vector addition (4x f32)
(func $addVec4 (param $a i32) (param $b i32) (param $result i32)
(v128.store
(local.get $result)
(f32x4.add
(v128.load (local.get $a))
(v128.load (local.get $b))
)
)
)
(export "addVec4" (func $addVec4))
)
`;
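// Check for SIMD support. There is no WebAssembly.SIMD property; a common
// approach is to validate a tiny module that uses v128 instructions.
// The probe below encodes a function returning i8x16.popcnt(i8x16.splat(0)).
const simdProbe = new Uint8Array([
0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0,
10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11
]);
if (WebAssembly.validate(simdProbe)) {
// Use SIMD version
} else {
// Fallback to scalar version
}
Complete Example: Image Processing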
// Image blur using WebAssembly
class ImageProcessor {
constructor(wasmInstance) {
this.instance = wasmInstance;
this.memory = wasmInstance.exports.memory;
}
// Load image data into Wasm memory
loadImageData(imageData) {
const { width, height, data } = imageData;
const length = data.length;
// Allocate memory
const ptr = this.instance.exports.malloc(length);
// Copy pixel data
const view = new Uint8ClampedArray(this.memory.buffer, ptr, length);
view.set(data);
return { ptr, width, height };
}
// Blur image
blur(imageData, radius = 5) {
const { ptr, width, height } = this.loadImageData(imageData);
// Allocate output buffer
const outputPtr = this.instance.exports.malloc(imageData.data.length);
// Apply blur in Wasm
this.instance.exports.boxBlur(
ptr,
outputPtr,
width,
height,
radius
);
// Read result
const resultView = new Uint8ClampedArray(
this.memory.buffer,
outputPtr,
imageData.data.length
);
// Create new ImageData
const result = new ImageData(
new Uint8ClampedArray(resultView),
width,
height
);
// Free memory
this.instance.exports.free(ptr);
this.instance.exports.free(outputPtr);
return result;
}
}
// Usage with Canvas
const canvas = document.getElementById('myCanvas');
const ctx = canvas.getContext('2d');
// Load Wasm module
const instance = await loadWasm('image-processor.wasm');
const processor = new ImageProcessor(instance);
// Get image data
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
// Process
const blurred = processor.blur(imageData, 5);
// Draw result
ctx.putImageData(blurred, 0, 0);
Summary
JavaScript-WebAssembly interop enables powerful hybrid applications:
Key Concepts:
Loading: instantiateStreaming() for modules
Type mapping: Numbers, BigInt for i64
Memory: Shared linear memory via TypedArrays
Strings: Encode/decode via UTF-8 in memory
Arrays: Pass via memory pointers
Structures: Layout data according to C conventions
Imports: JavaScript functions callable from Wasm
Exports: Functions, memory, globals, tables
Performance Tips:
Minimize boundary crossings
Batch operations
Pre-allocate memory
Use TypedArrays for bulk access
Cache encoder/decoder instances
Best Practices:
Clear ownership of memory allocation
Proper error handling
Memory growth awareness
Use Web Workers for parallelism
The interop layer is where WebAssembly’s computational power meets JavaScript’s rich ecosystem, enabling applications that leverage the strengths of both platforms.
Chapter 16: Building a WebAssembly Compiler
Introduction: Compiling to WebAssembly
Building a compiler that targets WebAssembly involves transforming a high-level source language into the low-level WebAssembly binary format. This chapter demonstrates building a complete compiler for a simple language that generates valid .wasm modules.
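To make "valid .wasm module" concrete: the smallest valid binary is just the 8-byte header, four magic bytes followed by a version number. `WebAssembly.validate` can be used to check any byte sequence a compiler emits. A short sketch; the variable names are illustrative:

```javascript
// "\0asm" magic followed by version 1 (little-endian u32)
const emptyModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: \0 a s m
  0x01, 0x00, 0x00, 0x00  // version: 1
]);
console.log(WebAssembly.validate(emptyModuleBytes)); // true

// An unsupported version number is rejected
const badVersionBytes = Uint8Array.from(emptyModuleBytes);
badVersionBytes[4] = 0x02;
console.log(WebAssembly.validate(badVersionBytes)); // false

// So is a truncated header
console.log(WebAssembly.validate(emptyModuleBytes.subarray(0, 4))); // false

// The empty module instantiates, and exports nothing
const emptyInstance = new WebAssembly.Instance(new WebAssembly.Module(emptyModuleBytes));
console.log(Object.keys(emptyInstance.exports).length); // 0
```

Running `validate` on generated output during development tends to surface encoding mistakes earlier than instantiation errors do.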
Compiler Pipeline Overview:
Source Code
        ↓
┌──────────────────┐
│ Lexical Analysis │ → Tokens
└──────────────────┘
        ↓
┌──────────────────┐
│ Syntax Analysis  │ → AST
└──────────────────┘
        ↓
┌──────────────────┐
│ Semantic Check   │ → Typed AST
└──────────────────┘
        ↓
┌──────────────────┐
│ Code Generation  │ → WAT (Text)
└──────────────────┘
        ↓
┌──────────────────┐
│ WAT → Binary     │ → .wasm file
└──────────────────┘
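The pipeline can be previewed in miniature before the full implementation. The sketch below is not SimpleScript: it assumes a toy language of non-negative integer literals joined by `+`, and the stage functions `lex`, `parse`, `check`, and `emit` are hypothetical stand-ins for the first four boxes. The WAT-to-binary step is left out.

```javascript
// Stage 1: source text -> tokens (digits and '+' only in this toy language)
function lex(source) {
  return source.match(/\d+|\+/g) ?? [];
}

// Stage 2: tokens -> AST (a left-associative chain of additions)
function parse(tokens) {
  let node = { kind: 'num', value: Number(tokens[0]) };
  for (let i = 1; i < tokens.length; i += 2) {
    if (tokens[i] !== '+') throw new Error(`Expected '+', got '${tokens[i]}'`);
    const right = { kind: 'num', value: Number(tokens[i + 1]) };
    node = { kind: 'add', left: node, right };
  }
  return node;
}

// Stage 3: semantic check (every literal must be a valid non-negative i32)
function check(ast) {
  if (ast.kind === 'num') {
    if (!Number.isInteger(ast.value) || ast.value > 0x7fffffff) {
      throw new RangeError(`Literal ${ast.value} is not a valid i32`);
    }
  } else {
    check(ast.left);
    check(ast.right);
  }
  return ast;
}

// Stage 4: AST -> WAT text
function emit(ast) {
  return ast.kind === 'num'
    ? `(i32.const ${ast.value})`
    : `(i32.add ${emit(ast.left)} ${emit(ast.right)})`;
}

// The compiler is the composition of the stages
const compileToy = (source) => emit(check(parse(lex(source))));

console.log(compileToy('1 + 2 + 3'));
// (i32.add (i32.add (i32.const 1) (i32.const 2)) (i32.const 3))
```

Each stage consumes only the previous stage's output, which is what lets the phases in the rest of the chapter be developed and tested separately.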
Target Language: SimpleScript
We’ll compile a simplified language with these features:
Syntax:
// Variables and types
let x: i32 = 42;
let y: f64 = 3.14;
// Functions
fn add(a: i32, b: i32): i32 {
return a + b;
}
// Control flow
if (x > 10) {
return 1;
} else {
return 0;
}
// Loops
while (x < 100) {
x = x + 1;
}
// Export directive
@export
fn main(): i32 {
return add(5, 3);
}
Supported Types: i32, i64, f32, f64
Operations: +, -, *, /, %, ==, !=, <, >, <=, >=
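Each operator in this list corresponds to a different Wasm instruction depending on the operand type. The sketch below shows one plausible mapping. It assumes that integers are treated as signed, which the language description does not state, and `instructionFor` is a hypothetical helper name. Wasm has no floating-point remainder instruction, so `%` is defined here for integer types only.

```javascript
// Instruction suffixes for integer operands (signed interpretation assumed)
const INT_OPS = {
  '+': 'add', '-': 'sub', '*': 'mul', '/': 'div_s', '%': 'rem_s',
  '==': 'eq', '!=': 'ne', '<': 'lt_s', '>': 'gt_s', '<=': 'le_s', '>=': 'ge_s'
};

// Instruction suffixes for floating-point operands (no remainder instruction)
const FLOAT_OPS = {
  '+': 'add', '-': 'sub', '*': 'mul', '/': 'div',
  '==': 'eq', '!=': 'ne', '<': 'lt', '>': 'gt', '<=': 'le', '>=': 'ge'
};

// Hypothetical helper: returns the Wasm instruction name for operator on type
function instructionFor(operator, type) {
  const table = type === 'i32' || type === 'i64' ? INT_OPS : FLOAT_OPS;
  if (!Object.hasOwn(table, operator)) {
    throw new Error(`Operator '${operator}' is not defined for ${type}`);
  }
  return `${type}.${table[operator]}`;
}

console.log(instructionFor('+', 'i32'));  // i32.add
console.log(instructionFor('<', 'i64'));  // i64.lt_s
console.log(instructionFor('<', 'f64'));  // f64.lt
console.log(instructionFor('/', 'f32'));  // f32.div

let remRejected = false;
try {
  instructionFor('%', 'f64');
} catch (error) {
  remRejected = true;
}
console.log(remRejected); // true
```

Comparison instructions always produce an i32 (0 or 1) regardless of operand type, which is why the semantic analyzer later in the chapter reports i32 as the result type of every comparison.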
Phase 1: Lexical Analysis
// Token types
const TokenType = {
// Literals
NUMBER: 'NUMBER',
IDENTIFIER: 'IDENTIFIER',
// Keywords
LET: 'LET',
FN: 'FN',
RETURN: 'RETURN',
IF: 'IF',
ELSE: 'ELSE',
WHILE: 'WHILE',
// Types
I32: 'I32',
I64: 'I64',
F32: 'F32',
F64: 'F64',
// Symbols
COLON: 'COLON',
SEMICOLON: 'SEMICOLON',
COMMA: 'COMMA',
LPAREN: 'LPAREN',
RPAREN: 'RPAREN',
LBRACE: 'LBRACE',
RBRACE: 'RBRACE',
EQUALS: 'EQUALS',
PLUS: 'PLUS',
MINUS: 'MINUS',
STAR: 'STAR',
SLASH: 'SLASH',
PERCENT: 'PERCENT',
// Comparisons
EQ_EQ: 'EQ_EQ',
NOT_EQ: 'NOT_EQ',
LT: 'LT',
GT: 'GT',
LT_EQ: 'LT_EQ',
GT_EQ: 'GT_EQ',
// Special
AT: 'AT',
EXPORT: 'EXPORT',
EOF: 'EOF'
};
class Token {
constructor(type, value, line, column) {
this.type = type;
this.value = value;
this.line = line;
this.column = column;
}
}
class Lexer {
constructor(source) {
this.source = source;
this.pos = 0;
this.line = 1;
this.column = 1;
this.keywords = {
'let': TokenType.LET,
'fn': TokenType.FN,
'return': TokenType.RETURN,
'if': TokenType.IF,
'else': TokenType.ELSE,
'while': TokenType.WHILE,
'i32': TokenType.I32,
'i64': TokenType.I64,
'f32': TokenType.F32,
'f64': TokenType.F64,
'export': TokenType.EXPORT
};
}
current() {
return this.source[this.pos];
}
peek(offset = 1) {
return this.source[this.pos + offset];
}
advance() {
const ch = this.current();
this.pos++;
if (ch === '\n') {
this.line++;
this.column = 1;
} else {
this.column++;
}
return ch;
}
skipWhitespace() {
while (this.pos < this.source.length) {
const ch = this.current();
if (ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r') {
this.advance();
} else if (ch === '/' && this.peek() === '/') {
// Skip line comment
while (this.current() !== '\n' && this.pos < this.source.length) {
this.advance();
}
} else if (ch === '/' && this.peek() === '*') {
// Skip block comment
this.advance(); // /
this.advance(); // *
while (this.pos < this.source.length) {
if (this.current() === '*' && this.peek() === '/') {
this.advance(); // *
this.advance(); // /
break;
}
this.advance();
}
} else {
break;
}
}
}
readNumber() {
const start = this.pos;
const startColumn = this.column;
let isFloat = false;
while (this.pos < this.source.length) {
const ch = this.current();
if (ch >= '0' && ch <= '9') {
this.advance();
} else if (ch === '.' && !isFloat) {
isFloat = true;
this.advance();
} else {
break;
}
}
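const value = this.source.substring(start, this.pos);
const token = new Token(
TokenType.NUMBER,
isFloat ? parseFloat(value) : parseInt(value, 10),
this.line,
startColumn
);
// Record the literal's kind: parseFloat('2.0') yields the integer-valued 2,
// so the numeric value alone cannot distinguish 2.0 from 2
token.isFloat = isFloat;
return token;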
}
readIdentifier() {
const start = this.pos;
const startColumn = this.column;
while (this.pos < this.source.length) {
const ch = this.current();
if ((ch >= 'a' && ch <= 'z') ||
(ch >= 'A' && ch <= 'Z') ||
(ch >= '0' && ch <= '9') ||
ch === '_') {
this.advance();
} else {
break;
}
}
const value = this.source.substring(start, this.pos);
const type = this.keywords[value] || TokenType.IDENTIFIER;
return new Token(type, value, this.line, startColumn);
}
nextToken() {
this.skipWhitespace();
if (this.pos >= this.source.length) {
return new Token(TokenType.EOF, null, this.line, this.column);
}
const ch = this.current();
const column = this.column;
// Numbers
if (ch >= '0' && ch <= '9') {
return this.readNumber();
}
// Identifiers and keywords
if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch === '_') {
return this.readIdentifier();
}
// Two-character operators
if (ch === '=' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.EQ_EQ, '==', this.line, column);
}
if (ch === '!' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.NOT_EQ, '!=', this.line, column);
}
if (ch === '<' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.LT_EQ, '<=', this.line, column);
}
if (ch === '>' && this.peek() === '=') {
this.advance();
this.advance();
return new Token(TokenType.GT_EQ, '>=', this.line, column);
}
// Single-character tokens
const single = {
':': TokenType.COLON,
';': TokenType.SEMICOLON,
',': TokenType.COMMA,
'(': TokenType.LPAREN,
')': TokenType.RPAREN,
'{': TokenType.LBRACE,
'}': TokenType.RBRACE,
'=': TokenType.EQUALS,
'+': TokenType.PLUS,
'-': TokenType.MINUS,
'*': TokenType.STAR,
'/': TokenType.SLASH,
'%': TokenType.PERCENT,
'<': TokenType.LT,
'>': TokenType.GT,
'@': TokenType.AT
};
if (ch in single) {
this.advance();
return new Token(single[ch], ch, this.line, column);
}
throw new Error(`Unexpected character '${ch}' at ${this.line}:${column}`);
}
tokenize() {
const tokens = [];
while (true) {
const token = this.nextToken();
tokens.push(token);
if (token.type === TokenType.EOF) break;
}
return tokens;
}
}
Phase 2: Syntax Analysis (Parser)
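The parser in this phase encodes operator precedence in its call structure: each precedence level is one function that calls the next-tighter level for its operands. The following standalone sketch isolates that idea for plain arithmetic. It is an illustration only and is separate from the SimpleScript parser; `parseArithmetic` and `show` are hypothetical names.

```javascript
// Recursive descent over + - * / and parentheses.
// additive -> multiplicative -> primary, loosest binding first.
function parseArithmetic(source) {
  const tokens = source.match(/\d+|[-+*/()]/g) ?? [];
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  function primary() {
    const tok = next();
    if (tok === '(') {
      const inner = additive();
      if (next() !== ')') throw new Error("Expected ')'");
      return inner;
    }
    if (!/^\d+$/.test(tok ?? '')) throw new Error(`Unexpected token '${tok}'`);
    return { num: Number(tok) };
  }

  function multiplicative() {
    let left = primary();
    while (peek() === '*' || peek() === '/') {
      const op = next();
      left = { op, left, right: primary() }; // loop makes it left-associative
    }
    return left;
  }

  function additive() {
    let left = multiplicative();
    while (peek() === '+' || peek() === '-') {
      const op = next();
      left = { op, left, right: multiplicative() };
    }
    return left;
  }

  const ast = additive();
  if (pos !== tokens.length) throw new Error(`Unexpected token '${peek()}'`);
  return ast;
}

// Renders an AST with explicit parentheses so the grouping is visible
const show = (node) =>
  'num' in node ? String(node.num) : `(${show(node.left)} ${node.op} ${show(node.right)})`;

console.log(show(parseArithmetic('1 + 2 * 3')));   // (1 + (2 * 3))
console.log(show(parseArithmetic('8 - 3 - 2')));   // ((8 - 3) - 2)
console.log(show(parseArithmetic('(1 + 2) * 3'))); // ((1 + 2) * 3)
```

Adding a precedence level means adding one function between two existing ones, which is how the SimpleScript parser layers comparison above additive and multiplicative.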
// AST Node types
class ASTNode {
constructor(type, line, column) {
this.type = type;
this.line = line;
this.column = column;
}
}
class Program extends ASTNode {
constructor(declarations) {
super('Program', 1, 1);
this.declarations = declarations;
}
}
class FunctionDeclaration extends ASTNode {
constructor(name, params, returnType, body, isExport, line, column) {
super('FunctionDeclaration', line, column);
this.name = name;
this.params = params; // [{ name, type }]
this.returnType = returnType;
this.body = body;
this.isExport = isExport;
}
}
class VariableDeclaration extends ASTNode {
constructor(name, varType, initializer, line, column) {
super('VariableDeclaration', line, column);
this.name = name;
this.varType = varType;
this.initializer = initializer;
}
}
class ReturnStatement extends ASTNode {
constructor(expression, line, column) {
super('ReturnStatement', line, column);
this.expression = expression;
}
}
class IfStatement extends ASTNode {
constructor(condition, thenBranch, elseBranch, line, column) {
super('IfStatement', line, column);
this.condition = condition;
this.thenBranch = thenBranch;
this.elseBranch = elseBranch;
}
}
class WhileStatement extends ASTNode {
constructor(condition, body, line, column) {
super('WhileStatement', line, column);
this.condition = condition;
this.body = body;
}
}
class BlockStatement extends ASTNode {
constructor(statements, line, column) {
super('BlockStatement', line, column);
this.statements = statements;
}
}
class ExpressionStatement extends ASTNode {
constructor(expression, line, column) {
super('ExpressionStatement', line, column);
this.expression = expression;
}
}
class BinaryExpression extends ASTNode {
constructor(operator, left, right, line, column) {
super('BinaryExpression', line, column);
this.operator = operator;
this.left = left;
this.right = right;
}
}
class AssignmentExpression extends ASTNode {
constructor(name, value, line, column) {
super('AssignmentExpression', line, column);
this.name = name;
this.value = value;
}
}
class CallExpression extends ASTNode {
constructor(callee, args, line, column) {
super('CallExpression', line, column);
this.callee = callee;
this.args = args;
}
}
class Identifier extends ASTNode {
constructor(name, line, column) {
super('Identifier', line, column);
this.name = name;
}
}
class Literal extends ASTNode {
constructor(value, valueType, line, column) {
super('Literal', line, column);
this.value = value;
this.valueType = valueType; // 'i32', 'f64', etc.
}
}
class Parser {
constructor(tokens) {
this.tokens = tokens;
this.pos = 0;
}
current() {
return this.tokens[this.pos];
}
peek(offset = 1) {
return this.tokens[this.pos + offset];
}
advance() {
return this.tokens[this.pos++];
}
expect(type) {
const token = this.current();
if (token.type !== type) {
throw new Error(
`Expected ${type} but got ${token.type} at ${token.line}:${token.column}`
);
}
return this.advance();
}
match(...types) {
return types.includes(this.current().type);
}
// Parse entry point
parse() {
const declarations = [];
while (this.current().type !== TokenType.EOF) {
declarations.push(this.parseDeclaration());
}
return new Program(declarations);
}
parseDeclaration() {
// Check for @export annotation
let isExport = false;
if (this.match(TokenType.AT)) {
this.advance();
this.expect(TokenType.EXPORT);
isExport = true;
}
if (this.match(TokenType.FN)) {
return this.parseFunctionDeclaration(isExport);
}
throw new Error(`Unexpected token at ${this.current().line}:${this.current().column}`);
}
parseFunctionDeclaration(isExport) {
const fnToken = this.expect(TokenType.FN);
const name = this.expect(TokenType.IDENTIFIER).value;
this.expect(TokenType.LPAREN);
const params = this.parseParameterList();
this.expect(TokenType.RPAREN);
this.expect(TokenType.COLON);
const returnType = this.parseType();
const body = this.parseBlockStatement();
return new FunctionDeclaration(
name,
params,
returnType,
body,
isExport,
fnToken.line,
fnToken.column
);
}
parseParameterList() {
const params = [];
if (this.match(TokenType.RPAREN)) {
return params;
}
do {
const name = this.expect(TokenType.IDENTIFIER).value;
this.expect(TokenType.COLON);
const type = this.parseType();
params.push({ name, type });
if (this.match(TokenType.COMMA)) {
this.advance();
} else {
break;
}
} while (true);
return params;
}
parseType() {
const token = this.advance();
if ([TokenType.I32, TokenType.I64, TokenType.F32, TokenType.F64].includes(token.type)) {
return token.value;
}
throw new Error(`Invalid type at ${token.line}:${token.column}`);
}
parseBlockStatement() {
const lbrace = this.expect(TokenType.LBRACE);
const statements = [];
while (!this.match(TokenType.RBRACE) && !this.match(TokenType.EOF)) {
statements.push(this.parseStatement());
}
this.expect(TokenType.RBRACE);
return new BlockStatement(statements, lbrace.line, lbrace.column);
}
parseStatement() {
if (this.match(TokenType.LET)) {
return this.parseVariableDeclaration();
}
if (this.match(TokenType.RETURN)) {
return this.parseReturnStatement();
}
if (this.match(TokenType.IF)) {
return this.parseIfStatement();
}
if (this.match(TokenType.WHILE)) {
return this.parseWhileStatement();
}
if (this.match(TokenType.LBRACE)) {
return this.parseBlockStatement();
}
return this.parseExpressionStatement();
}
parseVariableDeclaration() {
const letToken = this.expect(TokenType.LET);
const name = this.expect(TokenType.IDENTIFIER).value;
this.expect(TokenType.COLON);
const varType = this.parseType();
let initializer = null;
if (this.match(TokenType.EQUALS)) {
this.advance();
initializer = this.parseExpression();
}
this.expect(TokenType.SEMICOLON);
return new VariableDeclaration(
name,
varType,
initializer,
letToken.line,
letToken.column
);
}
parseReturnStatement() {
const returnToken = this.expect(TokenType.RETURN);
let expression = null;
if (!this.match(TokenType.SEMICOLON)) {
expression = this.parseExpression();
}
this.expect(TokenType.SEMICOLON);
return new ReturnStatement(expression, returnToken.line, returnToken.column);
}
parseIfStatement() {
const ifToken = this.expect(TokenType.IF);
this.expect(TokenType.LPAREN);
const condition = this.parseExpression();
this.expect(TokenType.RPAREN);
const thenBranch = this.parseStatement();
let elseBranch = null;
if (this.match(TokenType.ELSE)) {
this.advance();
elseBranch = this.parseStatement();
}
return new IfStatement(
condition,
thenBranch,
elseBranch,
ifToken.line,
ifToken.column
);
}
parseWhileStatement() {
const whileToken = this.expect(TokenType.WHILE);
this.expect(TokenType.LPAREN);
const condition = this.parseExpression();
this.expect(TokenType.RPAREN);
const body = this.parseStatement();
return new WhileStatement(condition, body, whileToken.line, whileToken.column);
}
parseExpressionStatement() {
const expression = this.parseExpression();
this.expect(TokenType.SEMICOLON);
return new ExpressionStatement(
expression,
expression.line,
expression.column
);
}
parseExpression() {
return this.parseAssignment();
}
parseAssignment() {
const expr = this.parseComparison();
if (this.match(TokenType.EQUALS)) {
this.advance();
const value = this.parseAssignment();
if (expr instanceof Identifier) {
return new AssignmentExpression(
expr.name,
value,
expr.line,
expr.column
);
}
throw new Error('Invalid assignment target');
}
return expr;
}
parseComparison() {
let left = this.parseAdditive();
while (this.match(TokenType.EQ_EQ, TokenType.NOT_EQ,
TokenType.LT, TokenType.GT,
TokenType.LT_EQ, TokenType.GT_EQ)) {
const operator = this.advance().type;
const right = this.parseAdditive();
left = new BinaryExpression(
operator,
left,
right,
left.line,
left.column
);
}
return left;
}
parseAdditive() {
let left = this.parseMultiplicative();
while (this.match(TokenType.PLUS, TokenType.MINUS)) {
const operator = this.advance().type;
const right = this.parseMultiplicative();
left = new BinaryExpression(
operator,
left,
right,
left.line,
left.column
);
}
return left;
}
parseMultiplicative() {
let left = this.parsePrimary();
while (this.match(TokenType.STAR, TokenType.SLASH, TokenType.PERCENT)) {
const operator = this.advance().type;
const right = this.parsePrimary();
left = new BinaryExpression(
operator,
left,
right,
left.line,
left.column
);
}
return left;
}
parsePrimary() {
// Number literal
if (this.match(TokenType.NUMBER)) {
const token = this.advance();
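// Prefer the lexer's isFloat flag when the token carries one: a literal such as 2.0
// is integer-valued, so Number.isInteger alone would classify it as i32
const isFloat = token.isFloat ?? !Number.isInteger(token.value);
const valueType = isFloat ? 'f64' : 'i32';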
return new Literal(token.value, valueType, token.line, token.column);
}
// Identifier or function call
if (this.match(TokenType.IDENTIFIER)) {
const token = this.advance();
if (this.match(TokenType.LPAREN)) {
return this.parseCallExpression(token);
}
return new Identifier(token.value, token.line, token.column);
}
// Parenthesized expression
if (this.match(TokenType.LPAREN)) {
this.advance();
const expr = this.parseExpression();
this.expect(TokenType.RPAREN);
return expr;
}
throw new Error(`Unexpected token at ${this.current().line}:${this.current().column}`);
}
parseCallExpression(nameToken) {
this.expect(TokenType.LPAREN);
const args = [];
if (!this.match(TokenType.RPAREN)) {
do {
args.push(this.parseExpression());
if (this.match(TokenType.COMMA)) {
this.advance();
} else {
break;
}
} while (true);
}
this.expect(TokenType.RPAREN);
return new CallExpression(
nameToken.value,
args,
nameToken.line,
nameToken.column
);
}
}
Phase 3: Semantic Analysis
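The analyzer in this phase resolves names through a stack of scopes: definitions go into the innermost scope, and lookups search from the innermost scope outward. A standalone sketch of just that mechanism, showing how an inner declaration shadows an outer one; the `ScopeStack` class is a hypothetical reduction written for this illustration:

```javascript
class ScopeStack {
  constructor() {
    this.scopes = [new Map()]; // index 0 is the global scope
  }
  push() {
    this.scopes.push(new Map());
  }
  pop() {
    this.scopes.pop();
  }
  // Defines name in the innermost scope; returns false on redeclaration there
  define(name, info) {
    const scope = this.scopes[this.scopes.length - 1];
    if (scope.has(name)) return false;
    scope.set(name, info);
    return true;
  }
  // Searches from the innermost scope outward
  lookup(name) {
    for (let i = this.scopes.length - 1; i >= 0; i--) {
      if (this.scopes[i].has(name)) return this.scopes[i].get(name);
    }
    return null;
  }
}

const scopes = new ScopeStack();
scopes.define('x', { type: 'i32' });

scopes.push();                                           // enter a block
const shadowed = scopes.define('x', { type: 'f64' });    // allowed: different scope
const innerType = scopes.lookup('x').type;
const duplicate = scopes.define('x', { type: 'i64' });   // rejected: same scope
scopes.pop();                                            // leave the block

const outerType = scopes.lookup('x').type;
console.log(shadowed, innerType, duplicate, outerType);  // true f64 false i32
console.log(scopes.lookup('y'));                         // null
```

Whether shadowing should be allowed at all is a language design decision; this mechanism permits it across scopes and rejects only redeclaration within the same scope.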
class SemanticAnalyzer {
constructor(ast) {
this.ast = ast;
this.scopes = [new Map()]; // Stack of scopes
this.currentFunction = null;
this.errors = [];
}
analyze() {
this.visitProgram(this.ast);
if (this.errors.length > 0) {
throw new Error('Semantic errors:\n' + this.errors.join('\n'));
}
return this.ast;
}
error(message, node) {
this.errors.push(`${message} at ${node.line}:${node.column}`);
}
pushScope() {
this.scopes.push(new Map());
}
popScope() {
this.scopes.pop();
}
defineSymbol(name, info, node) {
const scope = this.scopes[this.scopes.length - 1];
if (scope.has(name)) {
this.error(`Redeclaration of '${name}'`, node);
}
scope.set(name, info);
}
lookupSymbol(name) {
for (let i = this.scopes.length - 1; i >= 0; i--) {
if (this.scopes[i].has(name)) {
return this.scopes[i].get(name);
}
}
return null;
}
visitProgram(node) {
// First pass: collect all function signatures
for (const decl of node.declarations) {
if (decl instanceof FunctionDeclaration) {
const paramTypes = decl.params.map(p => p.type);
this.defineSymbol(decl.name, {
kind: 'function',
paramTypes,
returnType: decl.returnType
}, decl);
}
}
// Second pass: visit function bodies
for (const decl of node.declarations) {
this.visitFunctionDeclaration(decl);
}
}
visitFunctionDeclaration(node) {
this.currentFunction = node;
this.pushScope();
// Define parameters
for (const param of node.params) {
this.defineSymbol(param.name, {
kind: 'variable',
type: param.type
}, node);
}
this.visitBlockStatement(node.body);
this.popScope();
this.currentFunction = null;
}
visitBlockStatement(node) {
this.pushScope();
for (const stmt of node.statements) {
this.visitStatement(stmt);
}
this.popScope();
}
visitStatement(node) {
if (node instanceof VariableDeclaration) {
return this.visitVariableDeclaration(node);
}
if (node instanceof ReturnStatement) {
return this.visitReturnStatement(node);
}
if (node instanceof IfStatement) {
return this.visitIfStatement(node);
}
if (node instanceof WhileStatement) {
return this.visitWhileStatement(node);
}
if (node instanceof BlockStatement) {
return this.visitBlockStatement(node);
}
if (node instanceof ExpressionStatement) {
return this.visitExpression(node.expression);
}
}
visitVariableDeclaration(node) {
if (node.initializer) {
const initType = this.visitExpression(node.initializer);
// Check type compatibility
if (initType !== node.varType) {
this.error(
`Type mismatch: expected ${node.varType}, got ${initType}`,
node
);
}
}
this.defineSymbol(node.name, {
kind: 'variable',
type: node.varType
}, node);
}
visitReturnStatement(node) {
if (node.expression) {
const exprType = this.visitExpression(node.expression);
if (exprType !== this.currentFunction.returnType) {
this.error(
`Return type mismatch: expected ${this.currentFunction.returnType}, got ${exprType}`,
node
);
}
}
}
visitIfStatement(node) {
this.visitExpression(node.condition);
this.visitStatement(node.thenBranch);
if (node.elseBranch) {
this.visitStatement(node.elseBranch);
}
}
visitWhileStatement(node) {
this.visitExpression(node.condition);
this.visitStatement(node.body);
}
visitExpression(node) {
if (node instanceof Literal) {
return node.valueType;
}
if (node instanceof Identifier) {
const symbol = this.lookupSymbol(node.name);
if (!symbol) {
this.error(`Undefined variable '${node.name}'`, node);
return 'i32'; // Default for error recovery
}
return symbol.type;
}
if (node instanceof BinaryExpression) {
const leftType = this.visitExpression(node.left);
const rightType = this.visitExpression(node.right);
if (leftType !== rightType) {
this.error(
`Type mismatch in binary operation: ${leftType} vs ${rightType}`,
node
);
}
// Comparison operators return i32 (boolean)
if ([TokenType.EQ_EQ, TokenType.NOT_EQ, TokenType.LT,
TokenType.GT, TokenType.LT_EQ, TokenType.GT_EQ].includes(node.operator)) {
return 'i32';
}
return leftType;
}
if (node instanceof AssignmentExpression) {
const symbol = this.lookupSymbol(node.name);
if (!symbol) {
this.error(`Undefined variable '${node.name}'`, node);
return 'i32';
}
const valueType = this.visitExpression(node.value);
if (valueType !== symbol.type) {
this.error(
`Type mismatch in assignment: expected ${symbol.type}, got ${valueType}`,
node
);
}
return symbol.type;
}
if (node instanceof CallExpression) {
const funcSymbol = this.lookupSymbol(node.callee);
if (!funcSymbol || funcSymbol.kind !== 'function') {
this.error(`Undefined function '${node.callee}'`, node);
return 'i32';
}
// Check argument count
if (node.args.length !== funcSymbol.paramTypes.length) {
this.error(
`Expected ${funcSymbol.paramTypes.length} arguments, got ${node.args.length}`,
node
);
}
// Check argument types
for (let i = 0; i < node.args.length; i++) {
const argType = this.visitExpression(node.args[i]);
const expectedType = funcSymbol.paramTypes[i];
if (argType !== expectedType) {
this.error(
`Argument ${i + 1} type mismatch: expected ${expectedType}, got ${argType}`,
node
);
}
}
return funcSymbol.returnType;
}
return 'i32';
}
}
Phase 4: Code Generation (WAT)
class WATGenerator {
constructor(ast) {
this.ast = ast;
this.output = [];
this.indent = 0;
this.localIndex = 0;
this.locals = new Map();
}
generate() {
this.emit('(module');
this.indent++;
for (const decl of this.ast.declarations) {
if (decl instanceof FunctionDeclaration) {
this.generateFunction(decl);
}
}
this.indent--;
this.emit(')');
return this.output.join('\n');
}
emit(code) {
const spaces = ' '.repeat(this.indent);
this.output.push(spaces + code);
}
generateFunction(node) {
this.locals = new Map();
this.localIndex = 0;
// Allocate parameter indices
for (const param of node.params) {
this.locals.set(param.name, this.localIndex++);
}
// Start function
this.emit(`(func $${node.name}`);
this.indent++;
// Parameters
for (const param of node.params) {
this.emit(`(param $${param.name} ${param.type})`);
}
// Return type
if (node.returnType !== 'void') {
this.emit(`(result ${node.returnType})`);
}
// Collect local variables
const localVars = this.collectLocals(node.body);
for (const [name, type] of localVars) {
this.emit(`(local $${name} ${type})`);
this.locals.set(name, this.localIndex++);
}
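// Function body
this.generateBlockStatement(node.body, false);
// When every path returns inside an if/else, control appears to fall off
// the end with an empty stack; a trailing unreachable keeps non-void
// functions valid
if (node.returnType !== 'void') {
this.emit('(unreachable)');
}
this.indent--;
this.emit(')');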
// Export if needed
if (node.isExport) {
this.emit(`(export "${node.name}" (func $${node.name}))`);
}
}
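collectLocals(node) {
// Walk nested blocks, if branches, and loop bodies so that every
// declaration in the function gets a (local ...) entry
const locals = [];
if (node instanceof VariableDeclaration) {
locals.push([node.name, node.varType]);
} else if (node instanceof BlockStatement) {
for (const stmt of node.statements) {
locals.push(...this.collectLocals(stmt));
}
} else if (node instanceof IfStatement) {
locals.push(...this.collectLocals(node.thenBranch));
if (node.elseBranch) {
locals.push(...this.collectLocals(node.elseBranch));
}
} else if (node instanceof WhileStatement) {
locals.push(...this.collectLocals(node.body));
}
return locals;
}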
generateBlockStatement(node, emitBlock = true) {
if (emitBlock) {
this.emit('(block');
this.indent++;
}
for (const stmt of node.statements) {
this.generateStatement(stmt);
}
if (emitBlock) {
this.indent--;
this.emit(')');
}
}
generateStatement(node) {
if (node instanceof VariableDeclaration) {
return this.generateVariableDeclaration(node);
}
if (node instanceof ReturnStatement) {
return this.generateReturnStatement(node);
}
if (node instanceof IfStatement) {
return this.generateIfStatement(node);
}
if (node instanceof WhileStatement) {
return this.generateWhileStatement(node);
}
if (node instanceof BlockStatement) {
return this.generateBlockStatement(node, true);
}
if (node instanceof ExpressionStatement) {
this.generateExpression(node.expression);
this.emit('(drop)'); // Discard expression result
}
}
generateVariableDeclaration(node) {
if (node.initializer) {
this.generateExpression(node.initializer);
this.emit(`(local.set $${node.name})`);
}
}
generateReturnStatement(node) {
if (node.expression) {
this.generateExpression(node.expression);
}
this.emit('(return)');
}
generateIfStatement(node) {
this.emit('(if');
this.indent++;
// Condition
this.generateExpression(node.condition);
// Then branch
this.emit('(then');
this.indent++;
this.generateStatement(node.thenBranch);
this.indent--;
this.emit(')');
// Else branch
if (node.elseBranch) {
this.emit('(else');
this.indent++;
this.generateStatement(node.elseBranch);
this.indent--;
this.emit(')');
}
this.indent--;
this.emit(')');
}
generateWhileStatement(node) {
this.emit('(block $break');
this.indent++;
this.emit('(loop $continue');
this.indent++;
// Check condition
this.generateExpression(node.condition);
this.emit('(i32.eqz)');
this.emit('(br_if $break)');
// Body
this.generateStatement(node.body);
// Continue loop
this.emit('(br $continue)');
this.indent--;
this.emit(')');
this.indent--;
this.emit(')');
}
generateExpression(node) {
if (node instanceof Literal) {
const instruction = node.valueType === 'i32' ? 'i32.const' :
node.valueType === 'i64' ? 'i64.const' :
node.valueType === 'f32' ? 'f32.const' : 'f64.const';
this.emit(`(${instruction} ${node.value})`);
} else if (node instanceof Identifier) {
this.emit(`(local.get $${node.name})`);
} else if (node instanceof BinaryExpression) {
this.generateBinaryExpression(node);
} else if (node instanceof AssignmentExpression) {
this.generateExpression(node.value);
this.emit(`(local.set $${node.name})`);
this.emit(`(local.get $${node.name})`); // Assignment is an expression
} else if (node instanceof CallExpression) {
// Push arguments
for (const arg of node.args) {
this.generateExpression(arg);
}
this.emit(`(call $${node.callee})`);
}
}
generateBinaryExpression(node) {
this.generateExpression(node.left);
this.generateExpression(node.right);
// Determine type (assume left operand type)
const type = this.getExpressionType(node.left);
const opMap = {
[TokenType.PLUS]: `${type}.add`,
[TokenType.MINUS]: `${type}.sub`,
[TokenType.STAR]: `${type}.mul`,
[TokenType.SLASH]: type.startsWith('i') ? `${type}.div_s` : `${type}.div`,
[TokenType.PERCENT]: `${type}.rem_s`,
[TokenType.EQ_EQ]: `${type}.eq`,
[TokenType.NOT_EQ]: `${type}.ne`,
[TokenType.LT]: type.startsWith('i') ? `${type}.lt_s` : `${type}.lt`,
[TokenType.GT]: type.startsWith('i') ? `${type}.gt_s` : `${type}.gt`,
[TokenType.LT_EQ]: type.startsWith('i') ? `${type}.le_s` : `${type}.le`,
[TokenType.GT_EQ]: type.startsWith('i') ? `${type}.ge_s` : `${type}.ge`
};
this.emit(`(${opMap[node.operator]})`);
}
getExpressionType(node) {
if (node instanceof Literal) {
return node.valueType;
}
return 'i32'; // Default
}
}
Phase 5: Binary Generation
const wabt = require('wabt')(); // WebAssembly Binary Toolkit
class Compiler {
compile(source) {
try {
// Lexical analysis
const lexer = new Lexer(source);
const tokens = lexer.tokenize();
console.log('✓ Lexical analysis complete');
// Syntax analysis
const parser = new Parser(tokens);
const ast = parser.parse();
console.log('✓ Syntax analysis complete');
// Semantic analysis
const analyzer = new SemanticAnalyzer(ast);
analyzer.analyze();
console.log('✓ Semantic analysis complete');
// Code generation
const generator = new WATGenerator(ast);
const wat = generator.generate();
console.log('✓ Code generation complete');
console.log('\nGenerated WAT:\n');
console.log(wat);
return wat;
} catch (error) {
console.error('Compilation error:', error.message);
throw error;
}
}
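async compileToWasm(source) {
const wat = this.compile(source);
// Recent wabt releases return a Promise from require('wabt')(),
// so resolve it before use (awaiting a non-Promise is harmless)
const wabtApi = await wabt;
// Convert WAT to WASM binary
const wasmModule = wabtApi.parseWat('module.wat', wat);
const { buffer } = wasmModule.toBinary({});
return buffer;
}
}
Complete Example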
// Example source code
const source = `
@export
fn fibonacci(n: i32): i32 {
if (n <= 1) {
return n;
} else {
return fibonacci(n - 1) + fibonacci(n - 2);
}
}
@export
fn factorial(n: i32): i32 {
let result: i32 = 1;
let i: i32 = 2;
while (i <= n) {
result = result * i;
i = i + 1;
}
return result;
}
@export
fn sumSquares(a: i32, b: i32): i32 {
return (a * a) + (b * b);
}
`;
// Compile
const compiler = new Compiler();
async function main() {
try {
const wasmBuffer = await compiler.compileToWasm(source);
// Save to file (Node.js)
const fs = require('fs');
fs.writeFileSync('output.wasm', wasmBuffer);
console.log('\n✓ Binary written to output.wasm');
// Load and test
const wasmModule = await WebAssembly.compile(wasmBuffer);
const instance = await WebAssembly.instantiate(wasmModule);
console.log('\nTesting compiled functions:');
console.log('fibonacci(10) =', instance.exports.fibonacci(10));
console.log('factorial(5) =', instance.exports.factorial(5));
console.log('sumSquares(3, 4) =', instance.exports.sumSquares(3, 4));
} catch (error) {
console.error('Error:', error);
}
}
main();
Output:
✓ Lexical analysis complete
✓ Syntax analysis complete
✓ Semantic analysis complete
✓ Code generation complete
Generated WAT:
(module
  (func $fibonacci (param $n i32) (result i32)
    (if (i32.le_s (local.get $n) (i32.const 1))
      (then (local.get $n) (return))
      (else
        (i32.add
          (call $fibonacci (i32.sub (local.get $n) (i32.const 1)))
          (call $fibonacci (i32.sub (local.get $n) (i32.const 2))))
        (return))))
  (export "fibonacci" (func $fibonacci))
  …
)
✓ Binary written to output.wasm
Testing compiled functions:
fibonacci(10) = 55
factorial(5) = 120
sumSquares(3, 4) = 25
Optimization Passes (Bonus)
class Optimizer {
optimize(ast) {
ast = this.constantFolding(ast);
ast = this.deadCodeElimination(ast);
return ast;
}
constantFolding(node) {
if (node instanceof BinaryExpression) {
const left = this.constantFolding(node.left);
const right = this.constantFolding(node.right);
// Both operands are constants
if (left instanceof Literal && right instanceof Literal) {
const result = this.evaluateBinaryOp(
node.operator,
left.value,
right.value
);
return new Literal(result, left.valueType, node.line, node.column);
}
node.left = left;
node.right = right;
}
// Recursively optimize children...
return node;
}
evaluateBinaryOp(operator, left, right) {
switch (operator) {
case TokenType.PLUS: return left + right;
case TokenType.MINUS: return left - right;
case TokenType.STAR: return left * right;
case TokenType.SLASH: return Math.trunc(left / right); // integer operands assumed; i32.div_s truncates toward zero
// ... other operators
}
}
deadCodeElimination(node) {
// Remove unreachable code after return statements
// ...
return node;
}
}
Summary
Building a WebAssembly compiler involves:
Lexical Analysis: Tokenize source code
Syntax Analysis: Build AST from tokens
Semantic Analysis: Type checking, symbol resolution
Code Generation: Emit WAT (WebAssembly Text)
Binary Generation: Convert WAT to .wasm binary
Key Concepts:
S-expressions: WAT uses Lisp-like syntax
Type system: Explicit types (i32, i64, f32, f64)
Stack machine: Expressions leave results on stack
Structured control flow: block, loop, if
Local variables: Indexed, typed locals
Tools:
WABT (WebAssembly Binary Toolkit): wat2wasm, wasm2wat
Binaryen: Optimization and validation
Emscripten: C/C++ to WebAssembly
This foundation enables creating domain-specific languages, transpilers, or optimizing compilers targeting WebAssembly!
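The stack-machine idea listed under Key Concepts can be made concrete with a tiny evaluator. This is a sketch only: `runStack` and its `[opcode, operand]` tuples are hypothetical names invented for illustration, not part of the compiler in this chapter, and only a handful of i32 instructions are supported.

```javascript
// Minimal sketch of a stack machine for a few i32 instructions.
// `runStack` and the [opcode, operand] tuple format are hypothetical.
function runStack(instructions, locals = {}) {
  const stack = [];
  for (const [op, arg] of instructions) {
    switch (op) {
      case 'i32.const':
        stack.push(arg | 0);
        break;
      case 'local.get':
        stack.push(locals[arg] | 0);
        break;
      case 'i32.add': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0); // wrap to 32 bits, as Wasm does
        break;
      }
      case 'i32.mul': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(Math.imul(a, b)); // 32-bit wrapping multiply
        break;
      }
      default:
        throw new Error(`Unsupported instruction: ${op}`);
    }
  }
  return stack.pop();
}

// (a * a) + (b * b), listed in the post-order a code generator would emit
const sumSquares = [
  ['local.get', 'a'], ['local.get', 'a'], ['i32.mul'],
  ['local.get', 'b'], ['local.get', 'b'], ['i32.mul'],
  ['i32.add'],
];
console.log(runStack(sumSquares, { a: 3, b: 4 })); // 25
```

Each operand is pushed before the operator that consumes it, which is why a post-order walk of the AST yields valid stack code without any register allocation.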
Chapter 17: WebAssembly System Interface (WASI)
Introduction: Beyond the Browser
WebAssembly was designed for the web, but its portability and safety make it attractive for standalone applications outside browsers. However, the WebAssembly specification deliberately avoids defining system APIs (file I/O, networking, environment access) to remain platform-agnostic.
The Problem:
Browser: Rich APIs (DOM, fetch, WebGL) but sandboxed
Server/CLI: No standard APIs—each runtime invents its own
Result: Portability broken outside browsers
WASI (WebAssembly System Interface) solves this by providing:
Standardized system calls (POSIX-like APIs)
Capability-based security (explicit permissions)
Cross-platform compatibility (Windows, Linux, macOS)
Language-agnostic (works from C, Rust, Go, etc.)
WASI Design Principles
1. Capability-Based Security
Traditional POSIX: Ambient authority (any code can access any file if OS permits)
// Traditional C - can access ANY file
FILE* f = fopen("/etc/passwd", "r"); // OS decides access
WASI: Explicit capabilities (must be granted by host)
// WASI - file descriptor must be pre-opened by host
// Application can ONLY access what was granted
int fd = __wasi_path_open(preopened_dir, "data.txt", ...);
Key Concept: The host grants specific directories/resources. The WASM module cannot escape the sandbox.
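To illustrate the kind of check a host performs before honoring a path relative to a pre-opened directory, here is a minimal sketch. `resolveInPreopen` is a hypothetical helper, not a WASI or runtime API, and it assumes POSIX-style separators. Real runtimes also have to deal with symlinks, which this sketch ignores.

```javascript
// Sketch of a host-side path check in the spirit of WASI preopens.
// `resolveInPreopen` is a hypothetical helper, not a WASI API.
function resolveInPreopen(preopenRoot, guestPath) {
  if (guestPath.startsWith('/')) {
    throw new Error('absolute paths are not allowed');
  }
  const parts = [];
  for (const segment of guestPath.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') {
      // Walking above the granted directory would escape the sandbox
      if (parts.length === 0) {
        throw new Error('path escapes the pre-opened directory');
      }
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return [preopenRoot.replace(/\/+$/, ''), ...parts].join('/');
}

console.log(resolveInPreopen('/srv/data', 'logs/../data.txt')); // /srv/data/data.txt
```

A request such as `../etc/passwd` throws instead of resolving, which is the behavior the capability model depends on.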
2. Virtualization
WASI virtualizes OS concepts:
File descriptors (stdin, stdout, stderr, files, sockets)
Clocks (monotonic, realtime)
Random data (secure entropy)
Environment variables
Command-line arguments
This allows WASM modules to be portable across different host environments.
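As a rough illustration of how a host can virtualize command-line arguments, the sketch below lays them out in linear memory using the two-call pattern of Preview 1 (`args_sizes_get`, then `args_get`). The `makeArgsImports` helper and the memory offsets are assumptions made for this example; only the import names and their pointer-based convention come from WASI.

```javascript
// Sketch: how a host might lay out command-line arguments in linear memory.
// `makeArgsImports` is a hypothetical helper; only the two import names and
// their pointer-based calling convention come from WASI Preview 1.
function makeArgsImports(getMemoryBuffer, args) {
  const encoder = new TextEncoder();
  const encoded = args.map((a) => encoder.encode(a + '\0')); // NUL-terminated
  return {
    args_sizes_get(argcPtr, bufSizePtr) {
      const view = new DataView(getMemoryBuffer());
      view.setUint32(argcPtr, encoded.length, true);
      view.setUint32(bufSizePtr, encoded.reduce((n, e) => n + e.length, 0), true);
      return 0; // errno: success
    },
    args_get(argvPtr, argvBufPtr) {
      const buffer = getMemoryBuffer();
      const view = new DataView(buffer);
      const bytes = new Uint8Array(buffer);
      let offset = argvBufPtr;
      encoded.forEach((e, i) => {
        view.setUint32(argvPtr + i * 4, offset, true); // argv[i] pointer
        bytes.set(e, offset);                          // string bytes
        offset += e.length;
      });
      return 0;
    },
  };
}

// Exercise the helper against a plain ArrayBuffer standing in for Wasm memory.
const fakeMemory = new ArrayBuffer(256);
const argImports = makeArgsImports(() => fakeMemory, ['app.wasm', 'arg1']);
argImports.args_sizes_get(0, 4);
argImports.args_get(16, 64);
const check = new DataView(fakeMemory);
console.log(check.getUint32(0, true), check.getUint32(4, true)); // 2 14
```

The guest first asks for the sizes, allocates its own buffers, and only then asks the host to fill them, so the host never allocates inside guest memory.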
WASI Preview 1 (Current Stable)
Core API Functions
WASI functions follow the naming pattern __wasi_* and return errno-style error codes.
File Descriptors
// Read from a file descriptor
__wasi_errno_t __wasi_fd_read(
__wasi_fd_t fd, // File descriptor
const __wasi_iovec_t *iovs, // I/O vectors (scatter-gather)
size_t iovs_len, // Number of vectors
__wasi_size_t *nread // Bytes actually read
);
// Write to a file descriptor
__wasi_errno_t __wasi_fd_write(
__wasi_fd_t fd,
const __wasi_ciovec_t *iovs,
size_t iovs_len,
__wasi_size_t *nwritten
);
// Close a file descriptor
__wasi_errno_t __wasi_fd_close(__wasi_fd_t fd);
// Seek within a file
__wasi_errno_t __wasi_fd_seek(
__wasi_fd_t fd,
__wasi_filedelta_t offset,
__wasi_whence_t whence,
__wasi_filesize_t *newoffset
);
Path Operations
// Open a file relative to a directory
__wasi_errno_t __wasi_path_open(
__wasi_fd_t dirfd, // Pre-opened directory
__wasi_lookupflags_t dirflags,
const char *path,
size_t path_len,
__wasi_oflags_t oflags, // O_CREAT, O_TRUNC, etc.
__wasi_rights_t fs_rights_base,
__wasi_rights_t fs_rights_inheriting,
__wasi_fdflags_t fs_flags,
__wasi_fd_t *fd
);
// Create a directory
__wasi_errno_t __wasi_path_create_directory(
__wasi_fd_t fd,
const char *path,
size_t path_len
);
// Remove a file
__wasi_errno_t __wasi_path_unlink_file(
__wasi_fd_t fd,
const char *path,
size_t path_len
);
// Get file metadata
__wasi_errno_t __wasi_path_filestat_get(
__wasi_fd_t fd,
__wasi_lookupflags_t flags,
const char *path,
size_t path_len,
__wasi_filestat_t *buf
);
Environment & Args
// Get size of environment variables
__wasi_errno_t __wasi_environ_sizes_get(
__wasi_size_t *environ_count,
__wasi_size_t *environ_buf_size
);
// Get environment variables
__wasi_errno_t __wasi_environ_get(
uint8_t **environ,
uint8_t *environ_buf
);
// Get command-line argument sizes
__wasi_errno_t __wasi_args_sizes_get(
__wasi_size_t *argc,
__wasi_size_t *argv_buf_size
);
// Get command-line arguments
__wasi_errno_t __wasi_args_get(
uint8_t **argv,
uint8_t *argv_buf
);
Clock & Time
// Get current time
__wasi_errno_t __wasi_clock_time_get(
__wasi_clockid_t clock_id, // REALTIME or MONOTONIC
__wasi_timestamp_t precision,
__wasi_timestamp_t *time
);
// High-resolution sleep
__wasi_errno_t __wasi_poll_oneoff(
const __wasi_subscription_t *in,
__wasi_event_t *out,
__wasi_size_t nsubscriptions,
__wasi_size_t *nevents
);
Random Data
// Get cryptographically secure random bytes
__wasi_errno_t __wasi_random_get(
uint8_t *buf,
__wasi_size_t buf_len
);
Process Control
// Exit the process
_Noreturn void __wasi_proc_exit(__wasi_exitcode_t rval);
// Raise a signal (limited support)
__wasi_errno_t __wasi_proc_raise(__wasi_signal_t sig);
Example 1: Hello World in WASI
WAT Implementation
(module
;; Import fd_write from WASI
(import "wasi_snapshot_preview1" "fd_write"
(func $fd_write (param i32 i32 i32 i32) (result i32)))
(memory 1)
(export "memory" (memory 0))
;; Write "Hello, WASI!\n" at offset 8
(data (i32.const 8) "Hello, WASI!\n")
(func $main (export "_start")
;; Create I/O vector in memory at offset 0
;; iov_base = 8 (pointer to string)
(i32.store (i32.const 0) (i32.const 8))
;; iov_len = 13 (length of string)
(i32.store (i32.const 4) (i32.const 13))
;; Call fd_write(stdout=1, iovs=0, iovs_len=1, nwritten=24)
;; nwritten goes at offset 24, past the 13-byte string stored at 8..20
(call $fd_write
(i32.const 1) ;; stdout
(i32.const 0) ;; pointer to iovs
(i32.const 1) ;; number of iovs
(i32.const 24) ;; pointer to store nwritten
)
drop ;; Ignore return value
)
)
Running with WASI Runtime
# Compile WAT to WASM
wat2wasm hello.wat -o hello.wasm
# Run with wasmtime
wasmtime hello.wasm
# Output: Hello, WASI!
# Run with wasmer
wasmer hello.wasm
# Output: Hello, WASI!
Example 2: File I/O in Rust
Rust Source
// Compile with: rustc --target wasm32-wasip1 -O main.rs (the target is named wasm32-wasi on older toolchains)
use std::fs;
use std::io::{self, Write};
fn main() -> io::Result<()> {
// Write to a file
fs::write("output.txt", "Hello from WASI!\n")?;
// Read from the file
let contents = fs::read_to_string("output.txt")?;
// Write to stdout
io::stdout().write_all(contents.as_bytes())?;
// List directory contents
for entry in fs::read_dir(".")? {
let entry = entry?;
println!("{}", entry.path().display());
}
Ok(())
}
Compiling and Running
# Compile to WASI (the target is named wasm32-wasip1 in current Rust; older toolchains call it wasm32-wasi)
rustc --target wasm32-wasip1 -O main.rs -o app.wasm
# Run with directory access
wasmtime --dir=. app.wasm
# Output:
# Hello from WASI!
# ./app.wasm
# ./output.txt
# ...
# Try WITHOUT directory permission (will fail)
wasmtime app.wasm
# Error: access denied
Key Point: The --dir=. flag grants the WASM module access to the current directory. Without it, file operations fail (capability-based security in action).
Example 3: Environment Variables & Arguments
C Source
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
// Print arguments
printf("Arguments (%d):\n", argc);
for (int i = 0; i < argc; i++) {
printf(" argv[%d] = %s\n", i, argv[i]);
}
// Print environment variables
printf("\nEnvironment:\n");
char *home = getenv("HOME");
if (home) {
printf(" HOME = %s\n", home);
}
char *path = getenv("PATH");
if (path) {
printf(" PATH = %s\n", path);
}
return 0;
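}
Running
# Compile
clang --target=wasm32-wasi -O2 env.c -o env.wasm
# Run with environment variables (wasmtime options must precede the module;
# anything after env.wasm is passed to the program as an argument)
wasmtime --env HOME=/home/user --env PATH=/bin env.wasm arg1 arg2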
# Output:
# Arguments (3):
# argv[0] = env.wasm
# argv[1] = arg1
# argv[2] = arg2
#
# Environment:
# HOME = /home/user
# PATH = /bin
WASI I/O Vectors (Scatter-Gather)
WASI uses I/O vectors for efficient batched reads/writes:
typedef struct __wasi_iovec_t {
uint8_t *buf; // Pointer to buffer
__wasi_size_t buf_len; // Buffer length
} __wasi_iovec_t;
JavaScript Implementation
// Implementing fd_write in JavaScript
function fd_write(fd, iovs_ptr, iovs_len, nwritten_ptr) {
const memory = wasmInstance.exports.memory.buffer;
const view = new DataView(memory);
let totalWritten = 0;
// Read each iovec
for (let i = 0; i < iovs_len; i++) {
const iov_base = view.getUint32(iovs_ptr + (i * 8), true);
const iov_len = view.getUint32(iovs_ptr + (i * 8) + 4, true);
// Extract bytes
const bytes = new Uint8Array(memory, iov_base, iov_len);
// Write to appropriate output
if (fd === 1) { // stdout
console.log(new TextDecoder().decode(bytes));
} else if (fd === 2) { // stderr
console.error(new TextDecoder().decode(bytes));
}
totalWritten += iov_len;
}
// Write total bytes written
view.setUint32(nwritten_ptr, totalWritten, true);
return 0; // Success
}
// Import object
const importObject = {
wasi_snapshot_preview1: {
fd_write: fd_write,
proc_exit: (code) => {
console.log(`Process exited with code ${code}`);
}
// ... other WASI functions
}
};
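// Instantiating from bytes resolves to { module, instance };
// the instance is bound to the name that fd_write reads memory from
const { instance: wasmInstance } = await WebAssembly.instantiate(wasmBytes, importObject);
wasmInstance.exports._start();
WASI Runtimes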
Popular WASI Implementations
Wasmtime (Bytecode Alliance)
# Install
curl https://wasmtime.dev/install.sh -sSf | bash
# Run with capabilities
wasmtime --dir=. --env KEY=value app.wasm arg1 arg2
Wasmer (Wasmer Inc.)
# Install
curl https://get.wasmer.io -sSf | sh
# Run
wasmer run app.wasm --dir=. -- arg1 arg2
WasmEdge (CNCF Project)
# Install
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash
# Run
wasmedge --dir=.:. app.wasm
Node.js (Experimental)
const { WASI } = require('wasi');
const fs = require('fs');
const wasi = new WASI({
version: 'preview1', // mandatory in recent Node.js releases
args: process.argv,
env: process.env,
preopens: {
'/sandbox': '/real/path'
}
});
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
(async () => {
const wasm = await WebAssembly.compile(fs.readFileSync('./app.wasm'));
const instance = await WebAssembly.instantiate(wasm, importObject);
wasi.start(instance);
})();
Advanced WASI Features
1. Pre-opened Directories
# Map a host directory to a guest path (HOST::GUEST; older wasmtime releases used --mapdir GUEST::HOST)
wasmtime --dir=/home/user/project::/app app.wasm
# Inside WASM, "/app" refers to /home/user/project on the host
2. Network Sockets (WASI Preview 2)
// Future WASI will support networking
use std::net::TcpListener;
let listener = TcpListener::bind("127.0.0.1:8080")?;
for stream in listener.incoming() {
// Handle connection
}
3. Asynchronous I/O
// Poll for events
__wasi_subscription_t subscriptions[2];
// Subscribe to stdin readability
subscriptions[0].u.tag = __WASI_EVENTTYPE_FD_READ;
subscriptions[0].u.u.fd_read.file_descriptor = 0; // stdin
// Subscribe to timer
subscriptions[1].u.tag = __WASI_EVENTTYPE_CLOCK;
subscriptions[1].u.u.clock.timeout = 1000000000; // 1 second
__wasi_event_t events[2];
size_t nevents;
__wasi_poll_oneoff(subscriptions, events, 2, &nevents);
WASI vs. Emscripten
| Feature | WASI | Emscripten |
|---|---|---|
| Target | Standalone/Server | Browser |
| APIs | POSIX-like system calls | Browser APIs (DOM, WebGL) |
| Binary Size | Small (~100KB runtime) | Large (>1MB runtime) |
| Startup | Fast | Slower (async compilation) |
| Security | Capability-based | Browser sandbox |
| Portability | High (cross-runtime) | Browser-only |
When to use:
WASI: CLI tools, servers, plugins, edge computing
Emscripten: Games, graphics, browser apps
WASI Preview 2 & Component Model
Current Evolution
WASI is transitioning to a component model with:
Interface Types: High-level types (strings, lists, records)
Components: Composable WASM modules
WIT (Wasm Interface Type): IDL for components
// Define an interface in WIT
interface filesystem {
record file-stat {
size: u64,
modified: u64
}
read-file: func(path: string) -> result<list<u8>, error>
write-file: func(path: string, data: list<u8>) -> result<_, error>
stat: func(path: string) -> result<file-stat, error>
}
Component Composition
# Compose components
wasm-tools compose producer.wasm consumer.wasm -o app.wasm
Producer exports an interface
Consumer imports and uses it
No JavaScript glue needed!
Summary
WASI provides:
Standard System API: POSIX-like functions portable across runtimes
Capability Security: Explicit permission model (no ambient authority)
Language Neutral: Works from C, Rust, Go, Zig, etc.
Cross-Platform: Windows, Linux, macOS, embedded systems
Key Functions:
File I/O: fd_read, fd_write, path_open
Environment: environ_get, args_get
Time: clock_time_get, poll_oneoff
Random: random_get
Future (Preview 2):
Networking (sockets)
Component model (high-level composition)
Interface types (no manual serialization)
WASI enables WebAssembly to become a universal runtime for portable, secure applications beyond the browser!
Chapter 18: JavaScript Engines and WebAssembly Implementations
Introduction: The Engine Landscape
JavaScript engines are the heart of modern web browsers and runtimes. They parse, compile, and execute JavaScript code—and increasingly, WebAssembly modules. Understanding how these engines work internally reveals why WebAssembly is fast and how it integrates with JavaScript.
Major JavaScript Engines:
V8 (Google) - Chrome, Node.js, Deno, Edge
SpiderMonkey (Mozilla) - Firefox
JavaScriptCore (JSC) (Apple) - Safari, Bun
ChakraCore (Microsoft) - Legacy Edge (archived)
Each engine implements both the ECMAScript specification and the WebAssembly specification, but with different architectural approaches.
JavaScript Engine Architecture
Traditional JavaScript Pipeline
Source Code → Lexer → Parser → AST → Bytecode → Interpreter
↓
Profiler (Hot Path Detection)
↓
JIT Compiler (Optimized Machine Code)
V8 Architecture (Modern Multi-Tier)
JavaScript Source
↓
Parser → AST
↓
Ignition (Bytecode Interpreter)
↓ (if hot)
TurboFan (Optimizing Compiler)
↓
Machine Code (x64, ARM64, etc.)
Key Insight: JavaScript engines use tiered compilation:
Interpreter (fast startup, slower execution)
Baseline JIT (quick compilation, moderate performance)
Optimizing JIT (slow compilation, peak performance)
This trades off startup time vs. steady-state performance.
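The tier-up decision can be modeled with a simple call counter. The sketch below is a toy, not an engine API: `TieredFunction`, its threshold, and the two callbacks are hypothetical stand-ins for an interpreter and an optimizing compiler.

```javascript
// Toy model of tiered execution; all names are hypothetical.
class TieredFunction {
  constructor(interpret, compileOptimized, hotThreshold = 3) {
    this.interpret = interpret;               // slow path, available immediately
    this.compileOptimized = compileOptimized; // produces the fast path on demand
    this.hotThreshold = hotThreshold;
    this.calls = 0;
    this.optimized = null;
  }
  call(...args) {
    if (this.optimized) return this.optimized(...args);
    this.calls++;
    if (this.calls >= this.hotThreshold) {
      // The function is now "hot": pay the compile cost once
      this.optimized = this.compileOptimized();
    }
    return this.interpret(...args);
  }
  get tier() {
    return this.optimized ? 'optimized' : 'interpreter';
  }
}

const square = new TieredFunction(
  (x) => x * x,      // stands in for interpreted bytecode
  () => (x) => x * x // stands in for JIT-compiled machine code
);
square.call(2);
square.call(3);
console.log(square.tier); // interpreter
square.call(4);
console.log(square.tier); // optimized
```

Real engines use richer heuristics (loop back-edge counters, type feedback, compilation on background threads), but the shape of the trade-off is the same: cold code never pays for optimization.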
WebAssembly in JavaScript Engines
WebAssembly Pipeline
.wasm binary
↓
Validation (type checking, structure verification)
↓
Baseline Compiler (fast, unoptimized machine code)
↓ (background thread)
Optimizing Compiler (TurboFan/Ion/OMG)
↓
Optimized Machine Code
Key Differences from JavaScript:
| Feature | JavaScript | WebAssembly |
|---|---|---|
| Parsing | Complex (JS syntax) | Simple (binary format) |
| Validation | Runtime type checks | Ahead-of-time type checking |
| Compilation | Deferred (lazy) | Eager (immediate) |
| Optimization | Speculative (can deoptimize) | Stable (types known) |
| Predictability | Variable (depends on profile) | Consistent (no surprises) |
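The ahead-of-time validation row can be observed directly through the standard `WebAssembly.validate` API. The two modules below are hand-assembled byte arrays written for this example: both declare one function of type `() -> i32`, but only the first leaves an i32 on the stack.

```javascript
// Hand-assembled modules (illustrative bytes, not output of any tool)
const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]; // "\0asm", version 1
const typeSection = [0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f];  // one type: () -> i32
const funcSection = [0x03, 0x02, 0x01, 0x00];                    // one function of type 0
const goodCode = [0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b]; // i32.const 42; end
const badCode = [0x0a, 0x04, 0x01, 0x02, 0x00, 0x0b];              // empty body: no result

const wellTyped = new Uint8Array([...header, ...typeSection, ...funcSection, ...goodCode]);
const illTyped = new Uint8Array([...header, ...typeSection, ...funcSection, ...badCode]);

console.log(WebAssembly.validate(wellTyped)); // true
console.log(WebAssembly.validate(illTyped));  // false
```

The ill-typed module is rejected before any of its code runs, whereas an equivalent mistake in JavaScript would only surface when the function executes.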
V8 Engine Deep Dive
V8 Components
Parser: Converts JavaScript source to AST
Ignition: Bytecode interpreter
TurboFan: Optimizing compiler
Liftoff: WebAssembly baseline compiler
TurboFan (Wasm): WebAssembly optimizing compiler
V8 WebAssembly Compilation Strategy
// Streaming compilation
const response = await fetch('module.wasm');
const { instance } = await WebAssembly.instantiateStreaming(response);
// Behind the scenes:
// 1. Download starts
// 2. Liftoff compiles incrementally as bytes arrive
// 3. TurboFan compiles in background
// 4. Hot functions switch to TurboFan code
Liftoff (Baseline Compiler)
Goal: Fast compilation, reasonable performance
Wasm Function → Liftoff
↓
Simple register allocation
One-pass code generation
No optimizations
~10x faster than TurboFan compilation
↓
Machine Code (adequate performance)
Example: Add Function
(func $add (param $a i32) (param $b i32) (result i32)
local.get $a
local.get $b
i32.add
)
Liftoff Output (pseudo-assembly):
; Load parameters from stack/registers
mov eax, [param_a]
mov ebx, [param_b]
; Add
add eax, ebx
; Return (result in eax)
ret
TurboFan (Optimizing Compiler)
Goal: Maximum performance, slower compilation
Wasm Function → TurboFan
↓
Build intermediate representation (Sea of Nodes)
Type analysis
Inlining
Loop optimizations
Register allocation
Code generation
↓
Optimized Machine Code
Optimizations:
Constant Folding: i32.const 2, i32.const 3, i32.add → i32.const 5
Dead Code Elimination: Remove unreachable code
Loop Invariant Code Motion: Hoist constant calculations out of loops
Inlining: Replace function calls with function bodies
SIMD Vectorization: Use SIMD instructions when available
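Constant folding from the list above can be sketched as a peephole pass over a flat instruction list. This illustrates the idea only and is not how TurboFan is implemented (it works on a graph IR); `foldConstants` and the `{ op, value }` instruction shape are hypothetical.

```javascript
// Peephole constant folding over a flat instruction list (sketch only).
// `foldConstants` and the { op, value } shape are hypothetical.
function foldConstants(instructions) {
  const out = [];
  for (const instr of instructions) {
    const a = out[out.length - 2];
    const b = out[out.length - 1];
    if (instr.op === 'i32.add' && a && b &&
        a.op === 'i32.const' && b.op === 'i32.const') {
      // Both operands are known at compile time: replace the three
      // instructions with a single constant (wrapped to 32 bits)
      out.splice(-2, 2, { op: 'i32.const', value: (a.value + b.value) | 0 });
    } else {
      out.push(instr);
    }
  }
  return out;
}

const folded = foldConstants([
  { op: 'i32.const', value: 2 },
  { op: 'i32.const', value: 3 },
  { op: 'i32.add' },
]);
console.log(folded); // [ { op: 'i32.const', value: 5 } ]
```

Because folded results are pushed back onto the output list, chains such as `1 + 2 + 3` collapse to a single constant in one pass.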
Example: Optimized Loop
(func $sum (param $n i32) (result i32)
(local $i i32)
(local $sum i32)
(loop $continue
;; sum += i
local.get $sum
local.get $i
i32.add
local.set $sum
;; i++
local.get $i
i32.const 1
i32.add
local.tee $i
;; if (i < n) continue
local.get $n
i32.lt_s
br_if $continue
)
local.get $sum
)
TurboFan Optimizations:
Strength Reduction: Convert i32.mul by powers of 2 to shifts
Loop Unrolling: Process multiple iterations per loop
Register Allocation: Keep $sum, $i, $n in registers
SpiderMonkey Engine (Firefox)
Architecture
JavaScript Source
↓
Parser → Bytecode
↓
Baseline Interpreter
↓ (warm)
Baseline JIT Compiler (Ion Baseline)
↓ (hot)
Ion (Optimizing Compiler)
SpiderMonkey WebAssembly
Baseline Compiler: Fast, simple code generation
Ion (Wasm): Optimizing compiler with aggressive optimizations
Unique Features:
Cranelift Backend (optional): Rust-based code generator
Streaming Compilation: Compile as download progresses
Tier-up Strategy: Automatically promote hot code
Ion Optimizations
(func $matrix_multiply (param $n i32)
;; ... matrix multiplication loop ...
)
Ion Optimizations:
Loop Vectorization: Use SIMD for parallel operations
Bounds Check Elimination: Remove redundant array bounds checks
Type Specialization: Optimize for specific numeric types
JavaScriptCore (Safari)
Architecture
JavaScript Source
↓
Parser → Bytecode
↓
LLInt (Low-Level Interpreter)
↓ (warm)
Baseline JIT
↓ (hot)
DFG (Data Flow Graph JIT)
↓ (very hot)
FTL (Faster Than Light) - uses B3/Air backend
JavaScriptCore WebAssembly
BBQ (Baseline): Fast baseline compiler
OMG (Optimizing): Advanced optimizing compiler using B3 backend
B3 (Bare Bones Backend):
Intermediate representation for low-level code
Shared between JavaScript (FTL) and WebAssembly (OMG)
Aggressive optimizations:
Instruction selection
Register allocation
Code layout optimization
Example: SIMD Optimization
(func $add_vectors (param $a i32) (param $b i32) (param $result i32) (param $len i32)
(local $i i32)
(loop $loop
;; Load both SIMD vectors, add them, and store the result
;; (v128.store takes the address first, then the value)
(v128.store
(local.get $result)
(i32x4.add
(v128.load (local.get $a))
(v128.load (local.get $b))))
;; Increment pointers
(local.set $a (i32.add (local.get $a) (i32.const 16)))
(local.set $b (i32.add (local.get $b) (i32.const 16)))
(local.set $result (i32.add (local.get $result) (i32.const 16)))
(local.set $i (i32.add (local.get $i) (i32.const 4)))
;; Loop condition
(local.get $i)
(local.get $len)
i32.lt_u
br_if $loop
)
)
OMG Optimization: Generates native SIMD instructions (SSE/AVX on x86, NEON on ARM)
Memory Management
Linear Memory in Engines
WebAssembly linear memory is separate from JavaScript heap:
┌─────────────────────────────────────┐
│ JavaScript Engine Memory            │
├─────────────────────────────────────┤
│ JavaScript Objects (GC-managed)     │
│ - Strings, Arrays, Objects          │
├─────────────────────────────────────┤
│ WebAssembly Linear Memory           │
│ - ArrayBuffer (unmanaged)           │
│ - Fixed layout, no GC               │
└─────────────────────────────────────┘
V8 Memory Representation
// JavaScript side
const memory = new WebAssembly.Memory({ initial: 10, maximum: 100 });
const buffer = memory.buffer; // ArrayBuffer
// Behind the scenes in V8:
// 1. Allocate ArrayBuffer (10 pages = 640KB)
// 2. Create backing store (native memory)
// 3. Associate ArrayBuffer with Wasm instance
Memory Growth:
(func $grow_memory
;; Grow by 1 page (64KB)
i32.const 1
memory.grow
drop ;; Ignore old page count
)
Engine Behavior:
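Allocate a larger backing store, or extend a reserved address range in place
Copy existing data if the backing store moved
Detach the old ArrayBuffer, so existing TypedArray views become zero-length
Expose a fresh ArrayBuffer through memory.buffer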
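The effect of growth on existing views can be checked from plain JavaScript with the standard `WebAssembly.Memory` API, without compiling a module. The page counts here are arbitrary values chosen for the example.

```javascript
// Growing a memory detaches the old ArrayBuffer and its views.
const growable = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const before = new Uint8Array(growable.buffer);
before[0] = 42;

const previousPages = growable.grow(1); // returns the old size in pages
const after = new Uint8Array(growable.buffer);

console.log(previousPages); // 1
console.log(before.length); // 0 (the old view is detached)
console.log(after.length);  // 131072 (2 pages of 64 KiB)
console.log(after[0]);      // 42 (contents are preserved)
```

Code that caches a view across a call that might grow memory should therefore re-create the view afterwards rather than hold on to the stale one.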
let view = new Uint8Array(memory.buffer);
instance.exports.grow_memory();
// view is now DETACHED - must recreate
view = new Uint8Array(memory.buffer);
Garbage Collection Integration
Current State (MVP)
WebAssembly cannot directly reference JavaScript objects (no GC integration).
Workaround: Reference Types
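// JavaScript-visible table of function references
const table = new WebAssembly.Table({
element: "anyfunc",
initial: 10
});
// A funcref table accepts only functions exported from a Wasm module (or null);
// storing a plain JavaScript closure throws a TypeError. Arbitrary JavaScript
// values need an "externref" table instead.
table.set(0, instance.exports.callback); // 'callback' is a hypothetical export
// Wasm can call via index
(call_indirect (type $void_to_void) (i32.const 0))
Future: GC Proposal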
Goal: Allow WebAssembly to create and manipulate GC’d objects
;; Future syntax (GC proposal)
(type $point (struct
(field $x (mut i32))
(field $y (mut i32))
))
(func $create_point (param $x i32) (param $y i32) (result (ref $point))
;; Allocate GC'd struct
(struct.new $point (local.get $x) (local.get $y))
)
(func $get_x (param $p (ref $point)) (result i32)
(struct.get $point $x (local.get $p))
)
Engine Implementation:
Wasm structs live in JavaScript GC heap
Same GC algorithms (generational, incremental)
Efficient cross-language references
Optimization Challenges
Deoptimization (JavaScript-specific)
JavaScript engines use speculative optimization:
function add(a, b) {
return a + b;
}
// First 1000 calls: a and b are numbers
// Engine optimizes: add = (int a, int b) => a + b
add(1, 2); // Fast path
// 1001st call: a is a string
add("hello", " world"); // DEOPTIMIZATION!
// Engine reverts to unoptimized code
WebAssembly Advantage: Types are static → no deoptimization
Hidden Classes (JavaScript)
function Point(x, y) {
this.x = x; // Empty shape C0 → C1 (adds x)
this.y = y; // C1 → C2 (adds y)
}
// Same property order → same hidden class → fast property access
const p1 = new Point(1, 2); // Ends in C2
const p2 = new Point(3, 4); // Ends in C2 (shared with p1)
WebAssembly: No hidden classes needed (fixed memory layout)
Inline Caching
JavaScript engines cache property lookups:
function getName(obj) {
return obj.name; // IC: cache offset of 'name' for seen object shapes
}
WebAssembly: Direct memory access (no IC needed)
Performance Comparison
Benchmark: Fibonacci (Recursive)
// JavaScript
function fib(n) {
if (n <= 1) return n;
return fib(n - 1) + fib(n - 2);
}
;; WebAssembly
(func $fib (param $n i32) (result i32)
(if (result i32) (i32.le_s (local.get $n) (i32.const 1))
(then (local.get $n))
(else
(i32.add
(call $fib (i32.sub (local.get $n) (i32.const 1)))
(call $fib (i32.sub (local.get $n) (i32.const 2)))
)
)
)
)
Results (fib(40), V8):
JavaScript: ~800ms
WebAssembly: ~600ms
Speedup: 1.3x
Why WebAssembly is faster:
No type checks (static typing)
No deoptimization risk
Better register allocation
Predictable performance
Benchmark: Matrix Multiplication
// Compiled to WebAssembly
void matmul(float *a, float *b, float *c, int n) {
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
float sum = 0;
for (int k = 0; k < n; k++) {
sum += a[i*n + k] * b[k*n + j];
}
c[i*n + j] = sum;
}
}
}
Results (1024×1024 matrices):
JavaScript (typed arrays): ~3000ms
WebAssembly (scalar): ~1200ms
WebAssembly (SIMD): ~400ms
Speedup: 2.5x (scalar), 7.5x (SIMD)
Why SIMD is faster:
Process 4 floats per instruction
Better instruction-level parallelism
Reduced loop overhead
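The source for the "JavaScript (typed arrays)" row is not shown. A plausible baseline is a direct port of the C loop onto Float32Array, sketched below; it is an assumption about what such a baseline looks like, not the code that produced the timings above.

```javascript
// Sketch: typed-array port of the C matmul (row-major, n x n).
function matmul(a, b, c, n) {
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0; // accumulates as a double; it is rounded to float32 when stored
      for (let k = 0; k < n; k++) {
        sum += a[i * n + k] * b[k * n + j];
      }
      c[i * n + j] = sum;
    }
  }
}

// Small usage example with values that are exact in float32.
const n = 2;
const a = new Float32Array([1, 2, 3, 4]);
const b = new Float32Array([5, 6, 7, 8]);
const c = new Float32Array(n * n);
matmul(a, b, c, n);
console.log(Array.from(c)); // [19, 22, 43, 50]
```

Note one semantic difference from the C version: the accumulator here is a 64-bit double, so results can differ in the last bits from a float accumulator unless each step is wrapped in Math.fround.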
Tooling and Debugging
Chrome DevTools
WebAssembly Debugging:
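Source Maps: Map .wasm to original C/Rust/etc.
Breakpoints: Set breakpoints in WebAssembly code
Call Stack: Mixed JavaScript/WebAssembly stack traces
Memory Inspector: View linear memory as hex/typed arrays
// Source maps are not switched on through the JavaScript API. The toolchain
// records a sourceMappingURL custom section in the .wasm file (for example,
// Emscripten's -gsource-map flag), and DevTools loads the map from there.
const { instance } = await WebAssembly.instantiateStreaming(
fetch('module.wasm'),
imports
);
V8 Flags for Profiling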
# Run Node.js with profiling
node --prof --no-logfile-per-isolate app.js
# Profile WebAssembly compilation
node --trace-wasm-compiler app.js
# Detailed TurboFan output
node --trace-turbo --trace-turbo-graph app.js
SpiderMonkey Profiling
# Start Firefox with the built-in profiler recording from startup
MOZ_PROFILER_STARTUP=1 firefox
# Capture profile
# Tools → Web Developer → Performance
# Record → Analyze WebAssembly execution
Future Developments
1. WebAssembly Tail Calls
Problem: Deep recursion causes stack overflow
(func $factorial_tail (param $n i64) (param $acc i64) (result i64)
(if (result i64) (i64.eqz (local.get $n))
(then (local.get $acc))
(else
;; Tail call (doesn't grow stack)
(return_call $factorial_tail
(i64.sub (local.get $n) (i64.const 1))
(i64.mul (local.get $n) (local.get $acc))
)
)
)
)
Engine Support: V8, SpiderMonkey (implemented)
2. WebAssembly Threads
Shared Memory:
const memory = new WebAssembly.Memory({
initial: 10,
maximum: 100,
shared: true // Shared between workers
});
const worker = new Worker('worker.js');
worker.postMessage({ memory });
Atomic Operations:
;; Atomic increment
(i32.atomic.rmw.add (i32.const 0) (i32.const 1))
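On the JavaScript side, the same shared memory can be updated with the Atomics API, which plays the role of i32.atomic.rmw.add. This is a single-threaded sketch of the call shapes only; in browsers, shared memory additionally requires a cross-origin-isolated page.

```javascript
// Sketch: the JavaScript counterpart of an atomic add on shared linear memory.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });

// A shared memory is backed by a SharedArrayBuffer, which is never detached.
const isShared = memory.buffer instanceof SharedArrayBuffer;

const counters = new Int32Array(memory.buffer);
const before = Atomics.add(counters, 0, 1); // returns the value before the add
const after = Atomics.load(counters, 0);

console.log(isShared, before, after); // true 0 1
```

Index 0 of the Int32Array corresponds to byte address 0, the same location the WebAssembly snippet above increments.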
3. Exception Handling
(try (result i32)
(do
(call $may_throw)
)
(catch $exception
(i32.const -1) ;; Error code
)
)
Engine Integration: Unified exception handling across JS/Wasm
Summary
JavaScript Engines use tiered compilation:
JavaScript: Interpreter → Baseline JIT → Optimizing JIT (speculative)
WebAssembly: Baseline Compiler → Optimizing Compiler (stable)
Key Engine Components:
| Engine | Wasm Baseline | Wasm Optimizing | Backend |
|---|---|---|---|
| V8 | Liftoff | TurboFan | Custom |
| SpiderMonkey | Baseline | Ion | Cranelift (optional) |
| JavaScriptCore | BBQ | OMG | B3/Air |
WebAssembly Advantages:
Static typing: No runtime type checks
Predictable performance: No deoptimization
Efficient compilation: Simple binary format
SIMD support: Vectorized operations
Low overhead: Direct memory access
Performance:
Typical speedup: 1.5x–3x over JavaScript
SIMD workloads: 5x–10x speedup
Consistent performance (no warmup needed)
Future:
GC integration (direct object references)
Tail calls (efficient recursion)
Threads (parallel execution)
Exception handling (unified with JavaScript)
JavaScript engines are rapidly evolving to make WebAssembly a first-class citizen alongside JavaScript!
Appendix A: Quick Reference – JavaScript Syntax for C Programmers
This appendix provides a practical reference for C programmers learning JavaScript, particularly in the context of WebAssembly interop. It maps common C constructs to their JavaScript equivalents.
A.1 Variables and Data Types
Variable Declaration
| C | JavaScript | Notes |
|---|---|---|
| int x; | let x; | Block-scoped, mutable |
| int x = 42; | let x = 42; | Type inferred dynamically |
| const int MAX = 100; | const MAX = 100; | Immutable binding |
| static int count = 0; | let count = 0; (module scope) | No static keyword |
Key Difference: JavaScript is dynamically typed — types are determined at runtime.
let x = 42; // Number
x = "hello"; // Now a String (allowed!)
x = [1, 2, 3]; // Now an Array (allowed!)
Primitive Types
| C Type | JavaScript Type | Size | Range |
|---|---|---|---|
| int | Number | 64-bit float | ±(2^53 − 1) (safe integers) |
| long long | BigInt | Arbitrary | Unlimited (use 42n) |
| float | Number | 64-bit float | IEEE 754 double |
| double | Number | 64-bit float | IEEE 754 double |
| char | String | 16-bit UTF-16 | Single character string |
| bool | Boolean | N/A | true or false |
| void* | N/A | Use indices | Pointers → array indices |
Important: JavaScript Number is always a 64-bit IEEE 754 float. For WebAssembly i64, use BigInt:
// WASM i32 → JavaScript Number
const x = 42;
// WASM i64 → JavaScript BigInt
const y = 42n; // Note the 'n' suffix
// BigInt operations
const sum = 100n + 200n; // 300n
Type Checking
// C: compile-time type checking
// JavaScript: runtime type checking
typeof 42; // "number"
typeof 42n; // "bigint"
typeof "hello"; // "string"
typeof true; // "boolean"
typeof undefined; // "undefined"
typeof null; // "object" (historical quirk!)
typeof [1, 2, 3]; // "object"
typeof {x: 1}; // "object"
A.2 Arrays and Memory
Arrays
| C | JavaScript | Notes |
|---|---|---|
| int arr[10]; | let arr = new Array(10); | Creates sparse array |
| int arr[] = {1, 2, 3}; | let arr = [1, 2, 3]; | Array literal |
| arr[i] | arr[i] | Zero-indexed |
| sizeof(arr)/sizeof(arr[0]) | arr.length | Dynamic property |
Typed Arrays (for WebAssembly interop):
// C: int buffer[1024];
// JavaScript: Fixed-type, efficient arrays
const buffer = new Int32Array(1024); // 32-bit signed integers
buffer[0] = 42;
const floats = new Float32Array(256); // 32-bit floats
const bytes = new Uint8Array(1024); // 8-bit unsigned
// Backed by ArrayBuffer (WebAssembly linear memory)
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new Uint8Array(memory.buffer);
Common Typed Arrays
| JavaScript Type | C Equivalent | Bytes per Element |
|---|---|---|
| Int8Array | int8_t[] | 1 |
| Uint8Array | uint8_t[] | 1 |
| Int16Array | int16_t[] | 2 |
| Uint16Array | uint16_t[] | 2 |
| Int32Array | int32_t[] | 4 |
| Uint32Array | uint32_t[] | 4 |
| Float32Array | float[] | 4 |
| Float64Array | double[] | 8 |
| BigInt64Array | int64_t[] | 8 |
| BigUint64Array | uint64_t[] | 8 |
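Typed arrays behave like their C counterparts on overflow: out-of-range values wrap modulo the element width instead of raising an error. A short sketch, with illustrative variable names:

```javascript
// Sketch: element sizes and C-style wraparound in typed arrays.
const sizes = [
  Int8Array.BYTES_PER_ELEMENT,
  Int16Array.BYTES_PER_ELEMENT,
  Int32Array.BYTES_PER_ELEMENT,
  Float64Array.BYTES_PER_ELEMENT,
]; // [1, 2, 4, 8]

const u8 = new Uint8Array(1);
u8[0] = 256;          // wraps to 0, like (uint8_t)256

const i8 = new Int8Array(1);
i8[0] = 200;          // wraps to -56, like (int8_t)200

const i32 = new Int32Array(1);
i32[0] = 2 ** 31;     // wraps to -2147483648

console.log(sizes, u8[0], i8[0], i32[0]);
```

This makes typed arrays a convenient way to get fixed-width integer semantics in plain JavaScript, not only a window onto linear memory.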
Pointers and Memory Access
// C: Pointer arithmetic
int *ptr = array;
int value = *(ptr + 5); // array[5]
ptr++; // Move to next element
// JavaScript: No pointers, use indices
const array = new Int32Array(buffer);
const value = array[5]; // Direct indexing
// No pointer arithmetic needed
WebAssembly Memory Model:
// Linear memory is a giant ArrayBuffer
const memory = instance.exports.memory;
const bytes = new Uint8Array(memory.buffer);
// Read 32-bit int at offset 100
const view32 = new Int32Array(memory.buffer);
const aligned = view32[25]; // offset 100 ÷ 4 bytes = index 25
// Or use DataView for mixed types and unaligned offsets
const dataView = new DataView(memory.buffer);
const value = dataView.getInt32(100, true); // true = little-endian
A.3 Strings
String Basics
| C | JavaScript | Notes |
|---|---|---|
| char str[] = "hello"; | let str = "hello"; | Immutable in JS |
| char *str = "hello"; | const str = "hello"; | String literal |
| strlen(str) | str.length | Property, not function |
| strcmp(a, b) == 0 | a === b | Direct comparison |
| strcat(dest, src) | dest + src | Concatenation |
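One caveat on the strlen row: str.length counts UTF-16 code units, while strlen on the UTF-8 bytes in linear memory counts bytes, so the two agree only for ASCII text. The sketch below uses a hypothetical helper, utf8Length, to show the difference.

```javascript
// Sketch: str.length (UTF-16 code units) versus the byte count a C strlen would see.
const encoder = new TextEncoder(); // always encodes to UTF-8

// Hypothetical helper: number of bytes needed in linear memory, excluding the NUL.
function utf8Length(str) {
  return encoder.encode(str).length;
}

const ascii = "hello";
const accented = "\u00e9";      // "é" as a single code point
const emoji = "\u{1F600}";      // outside the Basic Multilingual Plane

console.log(ascii.length, utf8Length(ascii));       // 5 5
console.log(accented.length, utf8Length(accented)); // 1 2
console.log(emoji.length, utf8Length(emoji));       // 2 4
```

When sizing a buffer for a string passed to WebAssembly, the UTF-8 byte count plus one for the terminator is the number that matters.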
Key Difference: JavaScript strings are immutable and UTF-16 encoded.
const str = "hello";
str[0] = "H"; // No effect in sloppy mode; TypeError in strict mode and modules
const upper = str.toUpperCase(); // Returns new string "HELLO"
String Operations
// Length
"hello".length; // 5
// Indexing (read-only)
"hello"[0]; // "h"
"hello".charAt(0); // "h"
// Concatenation
"hello" + " " + "world"; // "hello world"
// Substring
"hello".substring(1, 4); // "ell" (start, end)
"hello".slice(1, 4); // "ell"
// Search
"hello".indexOf("ll"); // 2
"hello".includes("ll"); // true
// Case conversion
"hello".toUpperCase(); // "HELLO"
"HELLO".toLowerCase(); // "hello"
// Split
"a,b,c".split(","); // ["a", "b", "c"]
C String ↔︎ JavaScript String (WebAssembly)
// Read C string from WebAssembly memory
function readCString(memory, offset) {
const bytes = new Uint8Array(memory.buffer);
let end = offset;
// Find null terminator
while (bytes[end] !== 0) end++;
// Decode UTF-8 bytes
const decoder = new TextDecoder();
return decoder.decode(bytes.subarray(offset, end));
}
// Write JavaScript string to WebAssembly memory
function writeCString(memory, offset, str) {
const bytes = new Uint8Array(memory.buffer);
const encoder = new TextEncoder();
const encoded = encoder.encode(str);
bytes.set(encoded, offset);
bytes[offset + encoded.length] = 0; // Null terminator
return offset;
}
// Usage
const memory = instance.exports.memory;
const ptr = instance.exports.malloc(256);
writeCString(memory, ptr, "Hello from JavaScript!");
instance.exports.printf(ptr); // Calls C printf (assumes the module exports it)
A.4 Operators
Arithmetic Operators
| C | JavaScript | Notes |
|---|---|---|
| a + b | a + b | Addition (or string concatenation) |
| a - b | a - b | Subtraction |
| a * b | a * b | Multiplication |
| a / b | a / b | Always float division |
| a % b | a % b | Remainder |
| ++a | ++a | Pre-increment |
| a++ | a++ | Post-increment |
| pow(a, b) | a ** b | Exponentiation (ES2016); C has no ** operator |
Critical Difference: Integer division
// C: Integer division
int a = 7 / 2; // 3
// JavaScript: Always float division
let a = 7 / 2; // 3.5
let b = Math.trunc(7 / 2); // 3 (truncates toward zero, like C)
let c = (7 / 2) | 0; // 3 (bitwise trick; valid only within the int32 range)
Comparison Operators
| C | JavaScript | Notes |
|---|---|---|
| a == b | a === b | Use strict equality |
| a != b | a !== b | Strict inequality |
| a < b | a < b | Less than |
| a > b | a > b | Greater than |
| a <= b | a <= b | Less or equal |
| a >= b | a >= b | Greater or equal |
Important: Use === (strict) not == (loose):
// Loose equality (type coercion - avoid!)
42 == "42"; // true (bad!)
0 == false; // true (bad!)
null == undefined; // true (confusing!)
// Strict equality (recommended)
42 === "42"; // false (good!)
42 === 42; // true
0 === false; // false
Logical Operators
| C | JavaScript | Notes |
|---|---|---|
| a && b | a && b | Returns a or b (not boolean!) |
| a \|\| b | a \|\| b | Returns a or b |
| !a | !a | Boolean NOT |
| N/A | a ?? b | Nullish coalescing (ES2020) |
Short-circuit evaluation:
// && returns first falsy value or last value
5 && 10; // 10
0 && 10; // 0
// || returns first truthy value or last value
5 || 10; // 5
0 || 10; // 10
null || "default"; // "default"
// ?? returns right side only if left is null/undefined
0 ?? 10; // 0 (0 is not null/undefined)
null ?? 10; // 10
undefined ?? 10; // 10
Bitwise Operators
| C | JavaScript | Notes |
|---|---|---|
| a & b | a & b | AND |
| a \| b | a \| b | OR |
| a ^ b | a ^ b | XOR |
| ~a | ~a | NOT |
| a << b | a << b | Left shift |
| a >> b | a >> b | Signed right shift |
| (unsigned)a >> b | a >>> b | Unsigned right shift (C has no >>> operator) |
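Because these operators coerce their operands to 32-bit integers, they are also the usual way to emulate C's fixed-width arithmetic in JavaScript. The sketch below shows the three idioms that come up most often; the variable names are illustrative.

```javascript
// Sketch: emulating int32_t / uint32_t arithmetic with bitwise coercion.
const INT32_MAX = 0x7fffffff;

// "| 0" truncates to a signed 32-bit integer (wraps like int32_t).
const wrappedSum = (INT32_MAX + 1) | 0;

// ">>> 0" reinterprets the same 32 bits as unsigned (like a cast to uint32_t).
const asUnsigned = -1 >>> 0;

// Multiplication needs Math.imul: the double product loses low bits first.
const naiveProduct = (INT32_MAX * INT32_MAX) | 0;
const exactProduct = Math.imul(INT32_MAX, INT32_MAX);

console.log(wrappedSum);   // -2147483648
console.log(asUnsigned);   // 4294967295
console.log(naiveProduct); // 0 (low bits lost to double rounding)
console.log(exactProduct); // 1 (the true low 32 bits)
```

These idioms match what WebAssembly's i32.add and i32.mul do natively, which is why compilers targeting JavaScript emitted them heavily before WebAssembly existed.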
Important: Bitwise ops convert to 32-bit signed integers:
const a = 0b1010; // 10
const b = 0b1100; // 12
a & b; // 0b1000 = 8
a | b; // 0b1110 = 14
a ^ b; // 0b0110 = 6
~a; // -11 (two's complement)
// Shifts
a << 2; // 40
a >> 1; // 5
-1 >>> 1; // 2147483647 (unsigned)
A.5 Control Flow
Conditionals
// C
if (condition) {
// ...
} else if (other) {
// ...
} else {
// ...
}
// JavaScript (identical syntax)
if (condition) {
// ...
} else if (other) {
// ...
} else {
// ...
}
// Ternary operator
const result = condition ? value1 : value2;
Truthy/Falsy:
// Falsy values (convert to false)
false, 0, 0n, "", null, undefined, NaN
// Everything else is truthy
if (42) { } // true
if ("hello") { } // true
if ([]) { } // true (empty array is truthy!)
if ({}) { } // true (empty object is truthy!)
Switch Statement
// C
switch (value) {
case 1:
// ...
break;
case 2:
case 3:
// ...
break;
default:
// ...
}
// JavaScript (identical syntax)
switch (value) {
case 1:
// ...
break;
case 2:
case 3:
// ...
break;
default:
// ...
}
// Supports any type (not just integers)
switch (str) {
case "hello":
console.log("Greeting");
break;
case "bye":
console.log("Farewell");
break;
}
Loops
// C: while
while (condition) {
// ...
}
// C: do-while
do {
// ...
} while (condition);
// C: for
for (int i = 0; i < n; i++) {
// ...
}
// JavaScript: while (identical)
while (condition) {
// ...
}
// JavaScript: do-while (identical)
do {
// ...
} while (condition);
// JavaScript: for (identical)
for (let i = 0; i < n; i++) {
// ...
}
// JavaScript: for-of (iterate values)
for (const value of array) {
console.log(value);
}
// JavaScript: for-in (iterate keys - avoid for arrays!)
for (const key in object) {
console.log(key, object[key]);
}
Loop Control
| C | JavaScript | Notes |
|---|---|---|
| break; | break; | Exit loop |
| continue; | continue; | Next iteration |
| goto label; | N/A | No goto in JavaScript |
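The most common C use of goto, leaving nested loops in one step, has a direct JavaScript replacement: labeled break (and labeled continue). A small sketch, with a hypothetical findCell helper:

```javascript
// Sketch: labeled break replaces "goto done;" for exiting nested loops.
function findCell(grid, target) {
  let found = null;
  outer:
  for (let row = 0; row < grid.length; row++) {
    for (let col = 0; col < grid[row].length; col++) {
      if (grid[row][col] === target) {
        found = [row, col];
        break outer; // leaves both loops at once
      }
    }
  }
  return found;
}

console.log(findCell([[1, 2], [3, 4]], 3)); // [1, 0]
console.log(findCell([[1, 2], [3, 4]], 9)); // null
```

Labels can only target enclosing statements, so they cannot jump forward or backward arbitrarily the way goto can; this mirrors WebAssembly's own structured br to an enclosing block.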
A.6 Functions
Function Declaration
// C
int add(int a, int b) {
return a + b;
}
void print_message(const char *msg) {
printf("%s\n", msg);
}
// JavaScript: function declaration
function add(a, b) {
return a + b;
}
function printMessage(msg) {
console.log(msg);
}
// JavaScript: arrow function (ES2015)
const add = (a, b) => a + b;
const square = x => x * x; // Single param, no parens
const greet = () => "Hello"; // No params
// Arrow with block body
const complex = (a, b) => {
const sum = a + b;
return sum * 2;
};
Function Parameters
// Default parameters (ES2015)
function greet(name = "World") {
return `Hello, ${name}!`;
}
greet(); // "Hello, World!"
greet("Alice"); // "Hello, Alice!"
// Rest parameters (variadic)
function sum(...numbers) {
return numbers.reduce((a, b) => a + b, 0);
}
sum(1, 2, 3, 4); // 10
// Destructuring parameters
function drawPoint({x, y}) {
console.log(`Point at (${x}, ${y})`);
}
drawPoint({x: 10, y: 20});
Function Pointers (WebAssembly Tables)
// C: function pointer
int (*operation)(int, int);
operation = add;
int result = operation(5, 3); // Calls add(5, 3)
// JavaScript: functions are first-class
let operation = add;
let result = operation(5, 3);
// WebAssembly: use Table for function references
const table = new WebAssembly.Table({
element: "anyfunc",
initial: 10
});
// Store function at index 0
table.set(0, instance.exports.add);
// Call via call_indirect in WebAssembly
// (call_indirect (type $binary_op) (i32.const 0))
A.7 Structures and Objects
Structures
// C
struct Point {
int x;
int y;
};
struct Point p = {10, 20};
p.x = 30;
// JavaScript: object literal
const p = {
x: 10,
y: 20
};
p.x = 30;
// Constructor function (old style)
function Point(x, y) {
this.x = x;
this.y = y;
}
const p1 = new Point(10, 20);
// Class (ES2015)
class Point {
constructor(x, y) {
this.x = x;
this.y = y;
}
distance() {
return Math.sqrt(this.x ** 2 + this.y ** 2);
}
}
const p2 = new Point(10, 20);
Memory Layout (WebAssembly Interop)
// C struct
struct Point {
float x; // Offset 0
float y; // Offset 4
}; // Total size: 8 bytes
void move_point(struct Point *p, float dx, float dy) {
p->x += dx;
p->y += dy;
}
// JavaScript: manual memory layout
const memory = instance.exports.memory;
const floats = new Float32Array(memory.buffer);
// Point at offset 100 (byte offset 100 = float index 25)
const pointIndex = 25;
// Read point
const x = floats[pointIndex];
const y = floats[pointIndex + 1];
// Write point
floats[pointIndex] = 10.5;
floats[pointIndex + 1] = 20.3;
// Call C function
const byteOffset = pointIndex * 4; // Convert to byte offset
instance.exports.move_point(byteOffset, 5.0, 3.0);
A.8 Common Patterns
Memory Allocation
// C
int *buffer = malloc(1024 * sizeof(int));
// ... use buffer ...
free(buffer);
// JavaScript: automatic garbage collection
const buffer = new Int32Array(1024);
// ... use buffer ...
// Automatically freed when no longer referenced
// WebAssembly: manual allocation
const malloc = instance.exports.malloc;
const free = instance.exports.free;
const ptr = malloc(1024 * 4); // 1024 ints
// ... use memory at ptr ...
free(ptr);
Error Handling
// C: return codes
int divide(int a, int b, int *result) {
if (b == 0) return -1; // Error
*result = a / b;
return 0; // Success
}
int result;
if (divide(10, 2, &result) < 0) {
fprintf(stderr, "Error!\n");
}
// JavaScript: exceptions
function divide(a, b) {
if (b === 0) {
throw new Error("Division by zero");
}
return a / b;
}
try {
const result = divide(10, 0);
} catch (error) {
console.error("Error:", error.message);
} finally {
console.log("Cleanup");
}
File I/O (WASI)
// C
FILE *f = fopen("data.txt", "r");
char buffer[256];
fgets(buffer, sizeof(buffer), f);
fclose(f);
// JavaScript (Node.js)
import fs from 'fs';
const data = fs.readFileSync('data.txt', 'utf8');
// WebAssembly with WASI
import { WASI } from 'wasi';
const wasi = new WASI({
version: 'preview1', // required by recent Node.js releases
args: process.argv,
env: process.env,
preopens: {
'/sandbox': '.'
}
});
const instance = await WebAssembly.instantiate(wasmModule, {
wasi_snapshot_preview1: wasi.wasiImport
});
wasi.start(instance);
A.9 Common Pitfalls
1. Equality Comparison
// Bad: loose equality
if (x == 42) { } // Avoid!
// Good: strict equality
if (x === 42) { }
2. Integer Division
// Bad: float division
const pages = total / pageSize; // 10.5
// Good: integer division
const pageCount = Math.floor(total / pageSize); // 10
3. Array Bounds
// C: undefined behavior
// JavaScript: returns undefined (no crash)
const arr = [1, 2, 3];
arr[10]; // undefined (not an error!)
4. Type Coercion
// Surprising behavior
"5" + 3; // "53" (string concatenation)
"5" - 3; // 2 (numeric subtraction)
"5" * "3"; // 15 (both converted to numbers)
// Be explicit
Number("5") + 3; // 8
5. this Binding
const obj = {
value: 42,
getValue: function() {
return this.value;
}
};
obj.getValue(); // 42
const fn = obj.getValue;
fn(); // undefined in sloppy mode; TypeError in strict mode and modules (this is not obj!)
// Use arrow functions or bind
const boundFn = obj.getValue.bind(obj);
boundFn(); // 42
A.10 Quick Reference Table
C → JavaScript Conversion
| Category | C | JavaScript |
|---|---|---|
| Variable | int x = 42; | let x = 42; |
| Constant | const int X = 42; | const X = 42; |
| Array | int arr[10]; | new Int32Array(10) |
| String | char *str = "hi"; | let str = "hi"; |
| Function | int add(int a, int b) | function add(a, b) |
| Struct | struct Point {int x, y;}; | {x: 0, y: 0} |
| Pointer | int *ptr | Index into TypedArray |
| malloc | malloc(size) | new Uint8Array(size) |
| free | free(ptr) | Garbage collected |
| printf | printf("%d", x); | console.log(x); |
| NULL | NULL | null or undefined |
| true/false | 1/0 | true/false |
This appendix provides the essential mappings for C programmers working with JavaScript and WebAssembly. For deeper learning, consult the ECMAScript specification and WebAssembly documentation!
Appendix B: WebAssembly Instruction Reference
This appendix is a reference for WebAssembly instructions, based on the WebAssembly specification.
Overview
WebAssembly uses a stack machine architecture with a small, well-defined set of instructions. Instructions manipulate values on an implicit operand stack and can be categorized by their functionality.
Instruction Categories
1. Control Flow Instructions
These manage program flow and block structures:
;; Block structures
block [blocktype] ;; Begin a block
loop [blocktype] ;; Begin a loop
if [blocktype] ;; Conditional execution
else ;; Alternative branch
end ;; End block/loop/if
;; Branching
br [labelidx] ;; Unconditional branch
br_if [labelidx] ;; Conditional branch
br_table [vec(labelidx)] [labelidx] ;; Table branch
return ;; Return from function
;; Function calls
call [funcidx] ;; Direct function call
call_indirect [tableidx] [typeidx] ;; Indirect call via table
2. Parametric Instructions
Stack manipulation operations:
drop ;; Remove top stack value
select ;; Conditional selection
select [vec(valtype)] ;; Typed conditional selection
3. Variable Instructions
Access local and global variables:
;; Local variables
local.get [localidx] ;; Get local variable
local.set [localidx] ;; Set local variable
local.tee [localidx] ;; Set local and keep value on stack
;; Global variables
global.get [globalidx] ;; Get global variable
global.set [globalidx] ;; Set global variable
4. Numeric Instructions
Integer Operations (i32/i64)
Constants:
i32.const [i32] ;; Push 32-bit integer constant
i64.const [i64] ;; Push 64-bit integer constant
Arithmetic:
i32.add / i64.add ;; Addition
i32.sub / i64.sub ;; Subtraction
i32.mul / i64.mul ;; Multiplication
i32.div_s / i64.div_s ;; Signed division
i32.div_u / i64.div_u ;; Unsigned division
i32.rem_s / i64.rem_s ;; Signed remainder
i32.rem_u / i64.rem_u ;; Unsigned remainder
Bitwise:
i32.and / i64.and ;; Bitwise AND
i32.or / i64.or ;; Bitwise OR
i32.xor / i64.xor ;; Bitwise XOR
i32.shl / i64.shl ;; Shift left
i32.shr_s / i64.shr_s ;; Arithmetic shift right
i32.shr_u / i64.shr_u ;; Logical shift right
i32.rotl / i64.rotl ;; Rotate left
i32.rotr / i64.rotr ;; Rotate right
Comparison:
i32.eqz / i64.eqz ;; Equal to zero
i32.eq / i64.eq ;; Equal
i32.ne / i64.ne ;; Not equal
i32.lt_s / i64.lt_s ;; Less than (signed)
i32.lt_u / i64.lt_u ;; Less than (unsigned)
i32.gt_s / i64.gt_s ;; Greater than (signed)
i32.gt_u / i64.gt_u ;; Greater than (unsigned)
i32.le_s / i64.le_s ;; Less or equal (signed)
i32.le_u / i64.le_u ;; Less or equal (unsigned)
i32.ge_s / i64.ge_s ;; Greater or equal (signed)
i32.ge_u / i64.ge_u ;; Greater or equal (unsigned)
Unary:
i32.clz / i64.clz ;; Count leading zeros
i32.ctz / i64.ctz ;; Count trailing zeros
i32.popcnt / i64.popcnt ;; Count set bits
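For comparison, JavaScript has a built-in only for the first of these (Math.clz32); the other two have to be composed from bitwise operators. The helpers below are a sketch with hypothetical names, intended to match the i32 variants' results.

```javascript
// Sketch: JavaScript equivalents of i32.clz, i32.ctz and i32.popcnt.
function clz32(x) {
  return Math.clz32(x); // built in; returns 32 for 0, like i32.clz
}

function ctz32(x) {
  x |= 0;
  if (x === 0) return 32;            // i32.ctz of 0 is 32
  return 31 - Math.clz32(x & -x);    // x & -x isolates the lowest set bit
}

function popcnt32(x) {
  x |= 0;
  // Classic SWAR bit count: sum bits in pairs, nibbles, then bytes.
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

console.log(clz32(1), ctz32(12), popcnt32(0b1011)); // 31 2 3
```

In WebAssembly each of these is a single instruction that engines typically map to one machine instruction where the hardware provides it.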
Floating-Point Operations (f32/f64)
Constants:
f32.const [f32] ;; Push 32-bit float constant
f64.const [f64] ;; Push 64-bit float constant
Arithmetic:
f32.add / f64.add ;; Addition
f32.sub / f64.sub ;; Subtraction
f32.mul / f64.mul ;; Multiplication
f32.div / f64.div ;; Division
f32.min / f64.min ;; Minimum
f32.max / f64.max ;; Maximum
f32.copysign / f64.copysign ;; Copy sign
Unary:
f32.abs / f64.abs ;; Absolute value
f32.neg / f64.neg ;; Negation
f32.sqrt / f64.sqrt ;; Square root
f32.ceil / f64.ceil ;; Ceiling
f32.floor / f64.floor ;; Floor
f32.trunc / f64.trunc ;; Truncate
f32.nearest / f64.nearest ;; Round to nearest
Comparison:
f32.eq / f64.eq ;; Equal
f32.ne / f64.ne ;; Not equal
f32.lt / f64.lt ;; Less than
f32.gt / f64.gt ;; Greater than
f32.le / f64.le ;; Less or equal
f32.ge / f64.ge ;; Greater or equal
5. Conversion Instructions
Type conversions:
;; Integer wrapping/extension
i32.wrap_i64 ;; Wrap i64 to i32
i64.extend_i32_s ;; Extend i32 to i64 (signed)
i64.extend_i32_u ;; Extend i32 to i64 (unsigned)
;; Float truncation to integer
i32.trunc_f32_s ;; Truncate f32 to i32 (signed)
i32.trunc_f32_u ;; Truncate f32 to i32 (unsigned)
i32.trunc_f64_s ;; Truncate f64 to i32 (signed)
i32.trunc_f64_u ;; Truncate f64 to i32 (unsigned)
i64.trunc_f32_s ;; Truncate f32 to i64 (signed)
i64.trunc_f32_u ;; Truncate f32 to i64 (unsigned)
i64.trunc_f64_s ;; Truncate f64 to i64 (signed)
i64.trunc_f64_u ;; Truncate f64 to i64 (unsigned)
;; Integer to float conversion
f32.convert_i32_s ;; Convert i32 to f32 (signed)
f32.convert_i32_u ;; Convert i32 to f32 (unsigned)
f32.convert_i64_s ;; Convert i64 to f32 (signed)
f32.convert_i64_u ;; Convert i64 to f32 (unsigned)
f64.convert_i32_s ;; Convert i32 to f64 (signed)
f64.convert_i32_u ;; Convert i32 to f64 (unsigned)
f64.convert_i64_s ;; Convert i64 to f64 (signed)
f64.convert_i64_u ;; Convert i64 to f64 (unsigned)
;; Float promotion/demotion
f32.demote_f64 ;; Demote f64 to f32
f64.promote_f32 ;; Promote f32 to f64
;; Reinterpretation
i32.reinterpret_f32 ;; Reinterpret f32 as i32
i64.reinterpret_f64 ;; Reinterpret f64 as i64
f32.reinterpret_i32 ;; Reinterpret i32 as f32
f64.reinterpret_i64 ;; Reinterpret i64 as f64
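The reinterpret instructions change the type without touching the bits, like a memcpy between a float and an int in C. In JavaScript the same effect needs a scratch buffer; the helper names below are illustrative, not part of any API.

```javascript
// Sketch: JavaScript analogues of the reinterpret instructions via a scratch DataView.
const scratch = new DataView(new ArrayBuffer(8));

function i32ReinterpretF32(value) {
  scratch.setFloat32(0, value, true);
  return scratch.getInt32(0, true);
}

function f32ReinterpretI32(bits) {
  scratch.setInt32(0, bits, true);
  return scratch.getFloat32(0, true);
}

function i64ReinterpretF64(value) {
  scratch.setFloat64(0, value, true);
  return scratch.getBigInt64(0, true); // i64 maps to BigInt
}

console.log(i32ReinterpretF32(1.0).toString(16)); // "3f800000"
console.log(f32ReinterpretI32(0x3f800000));       // 1
console.log(i32ReinterpretF32(-0));               // -2147483648 (only the sign bit set)
console.log(i64ReinterpretF64(1.0).toString(16)); // "3ff0000000000000"
```

By contrast, the convert and trunc instructions change the value's representation (1.0 becomes the integer 1), which in JavaScript is what Math.trunc and ordinary numeric coercion do.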
6. Memory Instructions
Linear memory access operations:
;; Memory size and growth
memory.size ;; Get memory size (in pages)
memory.grow ;; Grow memory by delta pages
;; Load operations (i32)
i32.load [memarg] ;; Load 32-bit integer
i32.load8_s [memarg] ;; Load 8-bit signed, extend to 32
i32.load8_u [memarg] ;; Load 8-bit unsigned, extend to 32
i32.load16_s [memarg] ;; Load 16-bit signed, extend to 32
i32.load16_u [memarg] ;; Load 16-bit unsigned, extend to 32
;; Load operations (i64)
i64.load [memarg] ;; Load 64-bit integer
i64.load8_s [memarg] ;; Load 8-bit signed, extend to 64
i64.load8_u [memarg] ;; Load 8-bit unsigned, extend to 64
i64.load16_s [memarg] ;; Load 16-bit signed, extend to 64
i64.load16_u [memarg] ;; Load 16-bit unsigned, extend to 64
i64.load32_s [memarg] ;; Load 32-bit signed, extend to 64
i64.load32_u [memarg] ;; Load 32-bit unsigned, extend to 64
;; Load operations (float)
f32.load [memarg] ;; Load 32-bit float
f64.load [memarg] ;; Load 64-bit float
;; Store operations
i32.store [memarg] ;; Store 32-bit integer
i32.store8 [memarg] ;; Store lower 8 bits
i32.store16 [memarg] ;; Store lower 16 bits
i64.store [memarg] ;; Store 64-bit integer
i64.store8 [memarg] ;; Store lower 8 bits
i64.store16 [memarg] ;; Store lower 16 bits
i64.store32 [memarg] ;; Store lower 32 bits
f32.store [memarg] ;; Store 32-bit float
f64.store [memarg] ;; Store 64-bit float
7. Table Instructions
Table manipulation for indirect function calls:
table.get [tableidx] ;; Get table element
table.set [tableidx] ;; Set table element
table.size [tableidx] ;; Get table size
table.grow [tableidx] ;; Grow table
table.fill [tableidx] ;; Fill table range
table.copy [tableidx] [tableidx] ;; Copy table elements
table.init [tableidx] [elemidx] ;; Initialize table from element
elem.drop [elemidx] ;; Drop element segment
8. Reference Instructions
Reference type operations:
ref.null [heaptype] ;; Create null reference
ref.is_null ;; Test if reference is null
ref.func [funcidx] ;; Create function reference
9. Vector (SIMD) Instructions
128-bit vector operations (when SIMD extension is enabled):
Load/Store:
v128.load [memarg] ;; Load 128-bit vector
v128.store [memarg] ;; Store 128-bit vector
v128.const [v128] ;; Vector constant
Lane operations (examples for i8x16, i16x8, i32x4, i64x2, f32x4, f64x2):
i8x16.splat ;; Splat value to all lanes
i8x16.extract_lane_s [laneidx] ;; Extract signed lane
i8x16.extract_lane_u [laneidx] ;; Extract unsigned lane
i8x16.replace_lane [laneidx] ;; Replace lane value
Arithmetic (vectorized):
i32x4.add ;; Vector addition
i32x4.sub ;; Vector subtraction
i32x4.mul ;; Vector multiplication
f32x4.add ;; Float vector addition
f32x4.sqrt ;; Float vector square root
Comparison and selection:
i32x4.eq ;; Vector equality comparison
i32x4.lt_s ;; Vector less than (signed)
v128.bitselect ;; Bitwise selection
Relaxed SIMD operations (0xFD prefix with secondary opcodes):
i8x16.relaxed_swizzle ;; Relaxed swizzle
i32x4.relaxed_trunc_f32x4_s ;; Relaxed truncation
f32x4.relaxed_madd ;; Relaxed multiply-add
f32x4.relaxed_min/max ;; Relaxed min/max
10. Exception Handling Instructions
Try-catch block operations:
try [blocktype] ;; Begin try block
catch [tagidx] ;; Catch specific exception
catch_all ;; Catch any exception
throw [tagidx] ;; Throw exception
rethrow [labelidx] ;; Rethrow exception
11. Atomic Instructions (Threads extension)
Thread-safe memory operations:
memory.atomic.notify [memarg] ;; Notify waiting threads
memory.atomic.wait32 [memarg] ;; Wait on 32-bit value
memory.atomic.wait64 [memarg] ;; Wait on 64-bit value
;; Atomic load/store
i32.atomic.load [memarg]
i32.atomic.store [memarg]
i64.atomic.load [memarg]
i64.atomic.store [memarg]
;; Atomic read-modify-write
i32.atomic.rmw.add [memarg] ;; Atomic add
i32.atomic.rmw.sub [memarg] ;; Atomic subtract
i32.atomic.rmw.and [memarg] ;; Atomic AND
i32.atomic.rmw.or [memarg] ;; Atomic OR
i32.atomic.rmw.xor [memarg] ;; Atomic XOR
i32.atomic.rmw.xchg [memarg] ;; Atomic exchange
i32.atomic.rmw.cmpxchg [memarg] ;; Atomic compare-exchange
12. Bulk Memory Instructions
Efficient memory and table operations:
memory.copy ;; Copy memory region
memory.fill ;; Fill memory region
memory.init [dataidx] ;; Initialize memory from data segment
data.drop [dataidx] ;; Drop data segment
13. GC Instructions (Garbage Collection proposal)
Structure and array operations:
struct.new [typeidx] ;; Create structure
struct.new_default [typeidx] ;; Create with defaults
struct.get [typeidx] [fieldidx] ;; Get field
struct.set [typeidx] [fieldidx] ;; Set field
array.new [typeidx] ;; Create array
array.new_default [typeidx] ;; Create with defaults
array.get [typeidx] ;; Get element
array.set [typeidx] ;; Set element
array.len ;; Get array length
Instruction Encoding
Instructions are encoded in binary format with:
Single-byte opcodes (0x00-0xFF) for most instructions
Multi-byte opcodes with prefixes:
0xFC - Numeric/saturating operations
0xFD - SIMD operations
0xFE - Atomic operations (reserved)
Example encoding:
i32.add → 0x6A
f64.sqrt → 0x9F
v128.load → 0xFD 0x00
i32.atomic.load → 0xFE 0x10 0x00
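To see the single-byte encoding in context, the sketch below hand-assembles a complete module whose only function is local.get 0, local.get 1, i32.add, and instantiates it synchronously. The byte layout follows the binary format's section structure; the export name "add" is chosen for this example.

```javascript
// Sketch: a hand-assembled module containing the i32.add opcode (0x6A).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                               // magic "\0asm"
  0x01, 0x00, 0x00, 0x00,                               // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00,                                           //   local.get 0
  0x20, 0x01,                                           //   local.get 1
  0x6a,                                                 //   i32.add
  0x0b,                                                 //   end
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const add = instance.exports.add;

console.log(add(2, 3));          // 5
console.log(add(0x7fffffff, 1)); // -2147483648 (i32 arithmetic wraps)
```

Synchronous compilation is fine for a module this small; browsers restrict it on the main thread for larger binaries, where WebAssembly.instantiate is the usual route.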
Instruction Properties
Type Signatures
Each instruction has a specific type signature showing stack behavior:
[input_types] → [output_types]
Examples:
i32.add: [i32 i32] → [i32]
i32.eqz: [i32] → [i32]
call: [t1*] → [t2*] (depends on function type)
Validation Rules
Instructions must satisfy:
Type correctness - Stack has required input types
Label validity - Branch targets exist within scope
Index bounds - All indices reference valid definitions
Mutability - Global/memory writes respect mutability
Execution Semantics
Deterministic - Same inputs always produce same outputs (except NaN handling)
Trapping - Invalid operations trap (division by zero, out-of-bounds access)
Stack-based - All operations via implicit stack manipulation
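From JavaScript, a trap surfaces as a WebAssembly.RuntimeError exception. The sketch below hand-assembles a one-function module around i32.div_s (opcode 0x6D) and shows both ways that instruction can trap; the export name "div" and the tryDiv helper are chosen for this example.

```javascript
// Sketch: observing traps from JavaScript with a tiny hand-assembled module.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x64, 0x69, 0x76, 0x00, 0x00, // export "div"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6d, 0x0b,                   // local.get 0, local.get 1, i32.div_s, end
]);

const { div } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;

// Returns the quotient, or the string "trap" if the instruction trapped.
function tryDiv(a, b) {
  try {
    return div(a, b);
  } catch (err) {
    if (err instanceof WebAssembly.RuntimeError) return "trap";
    throw err;
  }
}

console.log(tryDiv(7, 2));            // 3
console.log(tryDiv(-7, 2));           // -3 (truncates toward zero, like C)
console.log(tryDiv(1, 0));            // "trap" (division by zero)
console.log(tryDiv(-2147483648, -1)); // "trap" (result not representable in i32)
```

Unlike C, where both cases are undefined behavior, the outcome here is fully specified: execution stops and the embedder is notified.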
Usage Notes
No strings - Only numeric types in MVP; strings require memory encoding
32-bit addressing - Memory limited to 4GB in current implementations
Little-endian - Memory layout is always little-endian
IEEE 754 - Floating-point follows IEEE 754-2019 standard
Structured control - All control flow via blocks, no arbitrary goto
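Two of these notes, little-endian layout and the absence of strings, can be illustrated from the host side. The sketch below writes into a WebAssembly.Memory from JavaScript; the (offset, length) string convention and the helper names writeString and readString are assumptions made for this example, not something the specification prescribes.

```javascript
const linearMemory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const view = new DataView(linearMemory.buffer);
const raw = new Uint8Array(linearMemory.buffer);

// Lay out a 32-bit value the way i32.store does: least significant byte first.
view.setUint32(0, 0x11223344, true); // true selects little-endian
const storedBytes = Array.from(raw.subarray(0, 4)); // [0x44, 0x33, 0x22, 0x11]

// Hypothetical convention: a string is UTF-8 bytes in memory, passed as (offset, length).
function writeString(offset, text) {
  const encoded = new TextEncoder().encode(text);
  raw.set(encoded, offset);
  return encoded.length; // length in bytes, not characters
}

function readString(offset, byteCount) {
  return new TextDecoder().decode(raw.subarray(offset, offset + byteCount));
}

const encodedLength = writeString(16, "héllo");
const roundTripped = readString(16, encodedLength);

console.log(storedBytes);                 // [68, 51, 34, 17]
console.log(encodedLength, roundTripped); // 6 "héllo"
```

The byte length is 6 rather than 5 because é occupies two bytes in UTF-8, which is why a length in bytes has to travel alongside the pointer.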
Summary Statistics
From the specification index:
More than 400 instructions, including all extensions
4 numeric types (i32, i64, f32, f64)
1 vector type (v128 with SIMD)
Reference types (funcref, externref, custom heap types)
Multi-byte encoding for extended instruction sets
This reference covers the core WebAssembly instruction set as defined in the official specification (Release 3.0, 2025-10-06), including both MVP features and newer extensions like SIMD, exception handling, and garbage collection proposals.
Appendix C: Tools, Libraries, and Resources
This appendix summarizes the tools, libraries, and resources appendix of WebAssembly: The Definitive Guide.
Overview
This appendix provides installation guidance and references for the essential WebAssembly ecosystem tools. As Steve Jobs noted: “Technology is nothing. What’s important is that you have a faith in people, that they’re basically good and smart, and if you give them tools, they’ll do wonderful things with them.”
The tools covered are foundational for WebAssembly development and work across various platforms, though some are easier to install on Linux or macOS than Windows.
1. Emscripten
Description
Emscripten is a complete compiler toolchain that translates C and C++ code to WebAssembly. It provides comprehensive support for running compiled code in both browser and Node.js environments.
Key Features
Full C/C++ to WebAssembly compilation
Support for widely used dependencies:
Standard C/C++ libraries
OpenGL
Other common libraries
Browser and Node.js runtime support
Extensive ecosystem compatibility
Installation
The official Getting Started guide provides detailed instructions for multiple operating systems:
Resource: https://emscripten.org/docs/getting_started/index.html
Platform Support
Linux
macOS
Windows
2. WebAssembly Binary Toolkit (WABT)
Description
WABT (pronounced “wabbit”) is a suite of tools for working with WebAssembly binary and text formats. It’s essential for debugging, validation, and format conversion.
Key Features
Format conversion between all WebAssembly formats:
.wasm (binary)
.wat (text)
And several others
Module inspection tools:
Dumping module details
Structure validation
Disassembly
Online tools available (browser-based)
Installation
Build instructions for all three major operating systems are available on GitHub:
Repository: https://github.com/WebAssembly/wabt
Online Demo
Try the tools without installation:
Demo: https://webassembly.github.io/wabt/demo
Notable Tools in WABT
wasm2wat - Binary to text format converter
wat2wasm - Text to binary format converter
wasm-objdump - Display information about wasm files
wasm-validate - Validate wasm files
wasm-strip - Remove debugging information
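To give a feel for what an inspection tool such as wasm-objdump has to do first, here is a rough sketch of a section lister in JavaScript. It is not WABT's implementation: listSections and readU32Leb are hypothetical helpers, and the sample bytes are a tiny hand-written module. It checks the magic number, then walks the section headers, each of which is a one-byte id followed by a LEB128-encoded size.

```javascript
// Section ids defined by the core binary format (ids 0 through 12).
const SECTION_NAMES = [
  "custom", "type", "import", "function", "table", "memory", "global",
  "export", "start", "element", "code", "data", "data count",
];

// Decode an unsigned LEB128 integer starting at `pos`.
function readU32Leb(bytes, pos) {
  let result = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[pos++];
    result |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: result >>> 0, next: pos };
}

// Return the name, payload offset, and payload size of every section.
function listSections(bytes) {
  const magic = [0x00, 0x61, 0x73, 0x6d];
  if (!magic.every((b, i) => bytes[i] === b)) throw new Error("not a Wasm binary");
  const sections = [];
  let pos = 8; // skip the 4-byte magic and 4-byte version
  while (pos < bytes.length) {
    const id = bytes[pos++];
    const { value: size, next } = readU32Leb(bytes, pos);
    sections.push({ name: SECTION_NAMES[id] ?? `unknown(${id})`, offset: next, size });
    pos = next + size; // jump over the payload to the next header
  }
  return sections;
}

// Sample input: a module exporting add(i32, i32) -> i32.
const sampleModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

const sections = listSections(sampleModule);
for (const s of sections) console.log(`${s.name}: offset ${s.offset}, size ${s.size}`);
```

For real work the WABT tools remain the better choice; the point here is only that the binary format is regular enough to walk in a few lines.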
3. Wasm3
Description
Wasm3 is a high-performance WebAssembly interpreter that claims to be “the fastest WebAssembly interpreter, and the most universal runtime.” It’s designed for embedded systems and resource-constrained environments.
Key Features
Extremely portable - runs on diverse platforms
High performance interpretation
Tracks new proposals actively
Small footprint suitable for embedded systems
Platform Support
Desktop/Server:
Linux
Windows
macOS
FreeBSD
Android
iOS
Embedded/IoT:
OpenWrt
Yocto
Buildroot (network equipment)
Raspberry Pi and other single-board computers
Various microcontrollers
Web:
Most modern browsers
Installation
Multiple installation methods documented:
Installation Guide: https://github.com/wasm3/wasm3/blob/main/docs/Installation.md
Additional Resources
Cookbook: https://github.com/wasm3/wasm3/blob/main/docs/Cookbook.md
Use Cases
Embedded systems
IoT devices
Edge computing
Mobile applications
Testing and development
4. Wasmtime
Description
Wasmtime is described as a “fast, secure, and standards-compliant runtime for WebAssembly and WASI.” It’s one of the most actively developed and feature-complete runtimes.
Key Features
Optimizing runtime - JIT compilation for performance
WASI support - Full WASI (WebAssembly System Interface) implementation
Up-to-date proposals - Tracks and implements latest WebAssembly proposals
Extensive programmatic libraries - APIs for multiple languages
Production-ready - Used in real-world applications
Language Bindings
Wasmtime provides APIs for:
Rust (native)
C/C++
Python
.NET
Go
And others
Installation
Documentation: https://docs.wasmtime.dev
Installation Instructions: https://docs.wasmtime.dev/cli-install.html
Use Cases
Server-side WebAssembly execution
Plugin systems
Sandboxed code execution
Command-line tools
Embedded runtime in applications
Command-Line Tool
# Example usage
wasmtime run module.wasm
wasmtime compile module.wasm
wasmtime wast test.wast
5. SwiftWasm
Description
SwiftWasm is a toolchain for compiling Swift code to WebAssembly, enabling Swift developers to target WebAssembly platforms.
Key Features
Swift to WebAssembly compilation
Swift standard library support
Integration with existing Swift ecosystem
Modern Swift language features
Installation
Multiple installation options available:
Setup Guide: https://book.swiftwasm.org/getting-started/setup.html
Use Cases
iOS/macOS developers targeting WebAssembly
Cross-platform Swift applications
Web applications written in Swift
Shared codebases between native and web
Additional Tools and Utilities
Common C/C++ Development Tools
nm - Binary inspection tool
Prints contents of binary files
Shows symbol tables
Useful for debugging compiled modules
Online Resources
From the book’s references:
C Programming Tutorials:
Learn-C.org - Interactive C tutorial (with ads)
Practical C Programming by Steve Oualline (O’Reilly)
The C Programming Language by Kernighan & Ritchie
LLVM Resources:
Note: LLVM used to stand for “Low-Level Virtual Machine,” but now it’s just “LLVM”
Algorithm Resources:
Merge sort algorithms - O(n log n) complexity
Installation Best Practices
General Guidelines
Check documentation first - Each tool has comprehensive installation guides
Platform-specific instructions - Follow OS-specific steps carefully
Use package managers when available (Homebrew, apt, etc.)
Build from source if needed - Most tools provide clear build instructions
Test installations - Verify tools work before starting projects
Troubleshooting Tips
Dependency issues - Ensure all prerequisites are installed
Path configuration - Add tools to system PATH
Version compatibility - Check that tool versions work together
Online communities - GitHub issues and forums are helpful
Summary Table
| Tool | Primary Use | Platform Support | Installation Difficulty |
|---|---|---|---|
| Emscripten | C/C++ to Wasm compilation | Linux, macOS, Windows | Moderate |
| WABT | Format conversion, inspection | Linux, macOS, Windows | Easy-Moderate |
| Wasm3 | Universal interpreter | Extremely broad | Easy |
| Wasmtime | Optimizing runtime, WASI | Linux, macOS, Windows | Easy |
| SwiftWasm | Swift to Wasm compilation | Linux, macOS | Moderate |
Development Workflow
Typical Tool Chain
Write code in your preferred language (C/C++, Rust, Swift, etc.)
Compile to WebAssembly using appropriate compiler:
Emscripten for C/C++
SwiftWasm for Swift
rustc (with a wasm32 target) for Rust
Inspect/debug using WABT tools:
Convert to WAT for readability
Validate structure
Examine symbols
Test execution with runtime:
Wasmtime for WASI modules
Wasm3 for embedded targets
Browser for web applications
Optimize and deploy
Additional Notes
Quote from the Book
“It is unsurprising, given all of the languages, tools, and frameworks that we discuss in this book, that there is a fair amount to install.”
Important Considerations
No comprehensive list - The appendix acknowledges it’s not exhaustive
Platform variations - Some tools easier on Linux/macOS than Windows
Active development - WebAssembly ecosystem evolves rapidly
Community support - GitHub repos are primary resource locations
Online alternatives - Many tools offer browser-based versions
Historical Context
The book references several foundational concepts:
C’s history is integral to modern operating systems
Security issues have been a major concern in C/C++
LLVM evolution from “Low-Level Virtual Machine” to just “LLVM”
Common idioms like “ten pounds of manure in a five-pound bag” for size constraints
Conclusion
This appendix provides essential starting points for WebAssembly development. While not comprehensive, it covers the most important tools in the ecosystem. The linked documentation for each tool provides detailed, up-to-date installation and usage information.
The WebAssembly tooling landscape continues to evolve, so checking official documentation and GitHub repositories is recommended for the latest updates and features.
Note: The content is based on the extracted pages from WebAssembly: The Definitive Guide (wasm-defguide.pdf, pages 380-387). For the most current information, always refer to the official documentation links provided above.