
Token Optimization for AI Agents

VulpineOS reduces the tokens required for an agent to understand a web page from tens of thousands down to a few thousand, cutting cost and latency while improving accuracy.

The Problem

A typical e-commerce page produces ~50,000 tokens when serialized as a full accessibility tree. Most agents hit context window limits or waste budget on structural noise. VulpineOS applies four optimization layers.

Viewport Pruning

Only nodes visible in the current viewport (plus a small buffer) are included in snapshots. Off-screen content is excluded entirely.

Full page AX tree: ~50,000 tokens
Viewport-pruned: ~3,000-5,000 tokens (90-94% reduction)

The viewport buffer is configurable — by default it extends 200px beyond the visible area to capture elements the user is about to scroll into:

{
  "method": "Page.getOptimizedDOM",
  "params": { "viewportBuffer": 200 }
}
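The pruning test itself can be sketched as a simple intersection check between a node's bounding box and the buffered viewport. This is an illustrative sketch, not VulpineOS internals; the helper names and box/viewport shapes are assumptions:

```javascript
// Sketch of the viewport-pruning predicate (hypothetical helper names).
// A node survives pruning if its bounding box intersects the viewport
// expanded by the buffer on every side.
function intersectsViewport(box, viewport, buffer = 200) {
  return (
    box.x < viewport.width + buffer &&
    box.x + box.width > -buffer &&
    box.y < viewport.height + buffer &&
    box.y + box.height > -buffer
  );
}

// Keep only nodes whose boxes intersect the buffered viewport.
function pruneToViewport(nodes, viewport, buffer = 200) {
  return nodes.filter((n) => intersectsViewport(n.box, viewport, buffer));
}
```

With the default 200px buffer, an element sitting 100px below the fold is kept, while one a full screen below is dropped.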

Result Caching

Consecutive snapshots of the same page are diffed internally. If the DOM hasn’t changed since the last snapshot, VulpineOS returns a cache hit flag instead of re-serializing the entire tree.

// First call: full snapshot (3,200 tokens)
{ "v": 1, "nodes": [...], "cached": false }

// Second call, no changes: cache hit (12 tokens)
{ "v": 1, "cached": true, "hash": "a1b2c3" }

Cache invalidation happens automatically on navigation, DOM mutations, or scroll position changes beyond the viewport buffer.
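On the agent side, a cache hit means the previous snapshot is still valid and can be reused. A minimal sketch of that handling, assuming a hypothetical `sendRequest` transport function (not a VulpineOS API):

```javascript
// Agent-side cache-hit handling (sketch; `sendRequest` is a hypothetical
// transport function that sends one protocol message and resolves with
// the response).
let lastSnapshot = null;

async function getSnapshot(sendRequest) {
  const res = await sendRequest({ method: 'Page.getOptimizedDOM' });
  if (res.cached) {
    // 12-token cache hit: the tree is unchanged, so reuse the last
    // full snapshot instead of re-parsing anything.
    return lastSnapshot;
  }
  lastSnapshot = res;
  return res;
}
```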

Incremental Snapshots

When the page has changed partially (e.g., a dropdown opened, a form field updated), VulpineOS sends only the delta:

{
  "v": 1,
  "incremental": true,
  "added": [[3, "li", "New option", {"idx": 7}]],
  "removed": [7, 12],
  "modified": [{"idx": 4, "name": "Updated text"}]
}
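A client holding the previous node list can fold such a delta in locally. The sketch below assumes nodes are keyed by `idx` and that an `added` entry is `[depth, role, name, attrs]` with the new index in `attrs.idx`; those field semantics are an interpretation of the example payload, not a documented contract:

```javascript
// Apply an incremental delta to a locally held node list (sketch).
// Assumed delta shape: removed = [idx, ...], added = [[depth, role,
// name, { idx }], ...], modified = [{ idx, ...changedFields }, ...].
function applyDelta(nodes, delta) {
  const byIdx = new Map(nodes.map((n) => [n.idx, n]));
  for (const idx of delta.removed ?? []) byIdx.delete(idx);
  for (const [depth, role, name, attrs] of delta.added ?? []) {
    byIdx.set(attrs.idx, { idx: attrs.idx, depth, role, name });
  }
  for (const mod of delta.modified ?? []) {
    Object.assign(byIdx.get(mod.idx), mod);
  }
  return [...byIdx.values()];
}
```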

Incremental snapshots are typically 70-95% smaller than full snapshots, depending on the scope of the change:

Scenario            Full snapshot   Incremental   Reduction
Dropdown open       3,200 tokens    420 tokens    87%
Form field update   3,200 tokens    180 tokens    94%
Tab switch          3,200 tokens    890 tokens    72%
Page navigation     3,200 tokens    N/A (full)    0%

Tool Call Batching

Multiple tool calls in a single agent turn are batched into one Juggler round-trip, reducing protocol overhead:

// Agent requests 3 actions — batched into one message
{
  "batch": [
    { "method": "Page.getOptimizedDOM" },
    { "method": "Page.screenshot", "params": { "clip": true } },
    { "method": "Page.evaluate", "params": { "expression": "document.title" } }
  ]
}

Batching saves ~200ms of round-trip latency per additional tool call and avoids redundant page state serialization.
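Client-side, batching amounts to queuing calls during one agent turn and flushing them as a single message. A minimal sketch, assuming a hypothetical `send` transport function (not a published VulpineOS API):

```javascript
// Sketch of a tool-call batcher: queue calls during one agent turn,
// then flush them as a single { batch: [...] } message.
function makeBatcher(send) {
  const queue = [];
  return {
    call(method, params) {
      queue.push(params ? { method, params } : { method });
    },
    flush() {
      // splice(0) empties the queue so the batcher can be reused.
      return send({ batch: queue.splice(0) });
    },
  };
}
```

Three queued calls then cost one round-trip instead of three.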

End-to-End Example

A typical agent interaction with a search results page:

Step 1 — Initial snapshot: 4,100 tokens (viewport-pruned + optimized DOM)
Step 2 — Click search result: 0 tokens (action only)
Step 3 — New page snapshot: 3,400 tokens (full, new page)
Step 4 — Scroll down: 620 tokens (incremental, new viewport content)
Step 5 — Extract data: 0 tokens (cached, no DOM change)
─────────
Total: 8,120 tokens
Without optimization: ~200,000 tokens (5 full AX tree dumps)
Savings: 96%
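The tally above works out as follows (the per-step numbers are the ones from the example; the ~200,000-token baseline is five unoptimized full-tree dumps):

```javascript
// Token accounting for the five steps in the example above.
const steps = [4100, 0, 3400, 620, 0];
const total = steps.reduce((a, b) => a + b, 0);          // 8,120 tokens
const baseline = 200000;                                  // 5 full AX tree dumps
const savings = Math.round((1 - total / baseline) * 100); // 96%
```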

Configuration

const context = await browser.newContext({
  firefoxUserPrefs: {
    'vulpineos.dom_export.enabled': true,
    'vulpineos.dom_export.viewport_pruning': true,
    'vulpineos.dom_export.caching': true,
    'vulpineos.dom_export.incremental': true,
  }
})

