Skip to content
GitHub Get Started
Reference

Agent SDK Snapshots

This page is an internals deep-dive on agent SDK snapshotting — an optional optimization that loads an agent’s SDK once per sidecar and reuses it for every session, instead of re-evaluating the whole SDK module graph on each createSession. For the agent-author view (how to opt in and the rules your SDK must follow), see Software Definition → SDK snapshotting & snapshot-safety. For how sessions work in general, see Agent Sessions.

The problem: per-session SDK re-evaluation

Section titled “The problem: per-session SDK re-evaluation”

When a session starts, its agent adapter runs inside a fresh V8 isolate and imports the agent SDK (for Pi, @mariozechner/pi-coding-agent). Importing a real-world SDK means resolving, loading, compiling, and evaluating a large module graph — hundreds of modules running their top-level initialization. That evaluation dominates session-creation latency, and because every session gets a fresh isolate, it is paid again on every createSession.

The work is identical every time: the same modules, evaluated to the same post-init heap, only to be thrown away when the session ends. Snapshotting captures that post-init heap once and stamps it into every new isolate.

agentOS already boots every guest isolate from a V8 startup snapshot of the runtime bridge (the polyfill layer that provides fetch, node builtins, the kernel-backed module loader, etc.). A startup snapshot is a serialized image of a V8 heap after some code has run; restoring it into a fresh isolate reproduces that heap by deserialization rather than re-execution, and each restore produces an independent context.

Agent SDK snapshotting extends that same mechanism: it evaluates the agent SDK into the same snapshot context, right after the bridge, so the captured image contains the bridge and the fully-initialized SDK. Restoring it gives a fresh isolate where the SDK is already present — no import, no evaluation.

The three components are unchanged. Snapshotting only changes how the executor’s isolate is seeded:

  • Client. Builds the SDK bundle at package-build time and passes it to the sidecar as trusted VM configuration (jsRuntime.snapshotUserlandCode). It decides which agents opt in.
  • Sidecar. Owns the snapshot. It builds the snapshot once, caches it process-wide, and seeds each session’s isolate from it. The SDK runs inside the isolate under the same permission policy as any guest code — snapshotting grants the SDK no extra capability.
  • Executor. The agent adapter restores into an isolate where the SDK is already on the global, reads it, and proceeds. It is untrusted guest code as always.

1. Bundle the SDK to one snapshottable unit

Section titled “1. Bundle the SDK to one snapshottable unit”

The agent SDK and its transitive dependencies are bundled (esbuild, IIFE) into a single file shipped with the agent package (dist/sdk-snapshot.js). The bundle evaluates the SDK and publishes its public API on a well-known global (e.g. globalThis.__PI_SDK_RUNTIME__). Node builtins stay external (resolved by the bridge’s in-snapshot polyfills); heavy provider SDKs that are only reached via dynamic import() stay lazy and load post-restore from the VFS.

2. Capture the evaluated SDK into the snapshot

Section titled “2. Capture the evaluated SDK into the snapshot”

The sidecar runs the bridge, then the SDK bundle, in one snapshot-creation context, then serializes the heap. The result is a startup blob containing both. Per-session configuration (cwd, model, API keys) is not captured — it is injected after restore, so one blob serves every session.

3. Cache it process-wide, keyed by content

Section titled “3. Cache it process-wide, keyed by content”

The blob is stored in a sidecar-process-wide cache keyed by sha256(bridge + bundle). Any change to the bridge or the bundled dependency graph changes the key and triggers exactly one rebuild; an unchanged bundle is a cache hit. This is what makes it build-once-per-sidecar: the cache is shared across every VM and session in the process.

Building the snapshot is itself the expensive evaluation, just done once. To keep it off the session-create critical path, the sidecar pre-warms at VM creation: when a VM is configured with a snapshot bundle, the sidecar builds the blob into the cache before the first session is created. The first session then restores from a warm cache like every session after it.

On createSession, the agent’s isolate is created from the cached blob. The SDK is already evaluated in the snapshot’s default context, and each session gets a fresh context cloned from it. The adapter reads the SDK off the global instead of importing it. If no snapshot is configured — or snapshot creation fails — the adapter transparently falls back to the per-session dynamic-import path, so snapshotting never affects correctness, only latency.

Each session leases a fresh context cloned from the snapshot’s default context. The captured SDK is shared read-only through the blob, but live state is not: a global, a captured-object mutation, or even a built-in prototype change made in one session is not observable in another. This is the same isolation guarantee as a non-snapshot session — a fresh isolate per session — and is covered by dedicated tests (a session that mutates globalThis, an SDK object, and Array.prototype, with a second session from the same snapshot seeing none of it).

A startup snapshot can only capture a pure JS heap. The SDK’s module-initialization code (everything that runs at import time, before any function is called) must not, at top level:

  • Create a native handle — load a .node addon, instantiate WebAssembly, or produce any V8 External object (for example an ICU-backed Intl.Segmenter singleton). These cannot be serialized and abort snapshot creation.
  • Open an fd, socket, timer, or worker, or leave a pending promise at the end of evaluation.
  • Read non-deterministic or per-session state (process.env, cwd, model, Date.now(), Math.random(), a random UUID) into a module constant — it would be frozen to the build-time value.

Real-world SDKs frequently break these by accident. agentOS makes such an SDK snapshottable with build-time transforms in the bundle that defer the offending work to first use (e.g. wrap a module-level native singleton in a lazy proxy, inline a top-level config-file read, convert eager fire-and-forget imports to synchronous module references). Each transform asserts the source shape it expects, so an upstream SDK change surfaces as a build error rather than a silent regression. The full author-facing rules live in the Software Definition reference.

Snapshotting is opt-in per agent, via agent.snapshot: true on the agent software descriptor, and requires the agent package to ship a snapshot-safe dist/sdk-snapshot.js. Today only the Pi agent opts in; other agents run the normal per-session import path. An agent qualifies by (1) being snapshot-safe and (2) building the bundle — there is nothing Pi-specific in the runtime mechanism.