# Token Optimization for AI Agents
VulpineOS reduces the tokens required for an agent to understand a web page from tens of thousands down to a few thousand, cutting cost and latency while improving accuracy.
## The Problem
A typical e-commerce page produces ~50,000 tokens when serialized as a full accessibility tree. Most agents hit context window limits or waste budget on structural noise. VulpineOS applies four optimization layers.
## Viewport Pruning
Only nodes visible in the current viewport (plus a small buffer) are included in snapshots. Off-screen content is excluded entirely.
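Conceptually, the pruning step is an intersection test between each node's bounding box and the viewport expanded by the buffer. The sketch below is illustrative only; the node shape (`{ id, rect }`) and viewport shape are hypothetical, not VulpineOS's internal representation:

```javascript
// Illustrative viewport-pruning filter. The node shape ({ id, rect }) and
// viewport shape ({ x, y, width, height }) are hypothetical.
function pruneToViewport(nodes, viewport, buffer = 200) {
  const top = viewport.y - buffer;
  const bottom = viewport.y + viewport.height + buffer;
  const left = viewport.x - buffer;
  const right = viewport.x + viewport.width + buffer;

  // Keep any node whose bounding box intersects the buffered viewport.
  return nodes.filter(({ rect }) =>
    rect.y < bottom && rect.y + rect.height > top &&
    rect.x < right && rect.x + rect.width > left
  );
}

// A node 100px below the fold survives thanks to the 200px buffer;
// a node far down the page is dropped.
const nodes = [
  { id: 1, rect: { x: 0, y: 100, width: 300, height: 50 } },
  { id: 2, rect: { x: 0, y: 900, width: 300, height: 50 } },
  { id: 3, rect: { x: 0, y: 5000, width: 300, height: 50 } },
];
const visible = pruneToViewport(nodes, { x: 0, y: 0, width: 1280, height: 800 });
// visible: nodes 1 and 2 only
```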
- Full page AX tree: ~50,000 tokens
- Viewport-pruned: ~3,000-5,000 tokens (90-94% reduction)

The viewport buffer is configurable — by default it extends 200px beyond the visible area to capture elements the user is about to scroll into:

```json
{ "method": "Page.getOptimizedDOM", "params": { "viewportBuffer": 200 } }
```

## Result Caching
Consecutive snapshots of the same page are diffed internally. If the DOM hasn’t changed since the last snapshot, VulpineOS returns a cache hit flag instead of re-serializing the entire tree.
```jsonc
// First call: full snapshot (3,200 tokens)
{ "v": 1, "nodes": [...], "cached": false }

// Second call, no changes: cache hit (12 tokens)
{ "v": 1, "cached": true, "hash": "a1b2c3" }
```

Cache invalidation happens automatically on navigation, DOM mutations, or scroll position changes beyond the viewport buffer.
## Incremental Snapshots
When the page has changed partially (e.g., a dropdown opened, a form field updated), VulpineOS sends only the delta:
```json
{
  "v": 1,
  "incremental": true,
  "added": [[3, "li", "New option", {"idx": 7}]],
  "removed": [7, 12],
  "modified": [{"idx": 4, "name": "Updated text"}]
}
```

Incremental snapshots are typically 70-95% smaller than full snapshots, depending on the scope of the change:
| Scenario | Full snapshot | Incremental | Reduction |
|---|---|---|---|
| Dropdown open | 3,200 tokens | 420 tokens | 87% |
| Form field update | 3,200 tokens | 180 tokens | 94% |
| Tab switch | 3,200 tokens | 890 tokens | 72% |
| Page navigation | 3,200 tokens | N/A (full) | 0% |
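On the client side, a cached node list can be patched with such a delta instead of re-parsing a full snapshot. The sketch below assumes nodes are addressed by their `idx` field and that added tuples are `[depth, role, name, {idx}]`; that reading is an interpretation of the example above, not a confirmed spec:

```javascript
// Illustrative delta application over a locally cached flat node list.
function applyDelta(nodes, delta) {
  const byIdx = new Map(nodes.map(n => [n.idx, n]));

  // Drop removed nodes first, so a removed idx can be re-added below.
  for (const idx of delta.removed ?? []) byIdx.delete(idx);

  // Merge field-level updates into existing nodes.
  for (const mod of delta.modified ?? []) {
    const node = byIdx.get(mod.idx);
    if (node) Object.assign(node, mod);
  }

  // Insert added nodes (assumed tuple shape: [depth, role, name, {idx}]).
  for (const [depth, role, name, meta] of delta.added ?? []) {
    byIdx.set(meta.idx, { idx: meta.idx, depth, role, name });
  }

  return [...byIdx.values()].sort((a, b) => a.idx - b.idx);
}

const cached = [
  { idx: 4, depth: 2, role: 'textbox', name: 'Old text' },
  { idx: 7, depth: 3, role: 'li', name: 'Stale option' },
];
const updated = applyDelta(cached, {
  added: [[3, 'li', 'New option', { idx: 7 }]],
  removed: [7],
  modified: [{ idx: 4, name: 'Updated text' }],
});
// updated: idx 4 renamed, idx 7 replaced by the new li
```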
## Tool Call Batching
Multiple tool calls in a single agent turn are batched into one Juggler round-trip, reducing protocol overhead:
```jsonc
// Agent requests 3 actions — batched into one message
{
  "batch": [
    { "method": "Page.getOptimizedDOM" },
    { "method": "Page.screenshot", "params": { "clip": true } },
    { "method": "Page.evaluate", "params": { "expression": "document.title" } }
  ]
}
```

Batching saves ~200ms of round-trip latency per additional tool call and avoids redundant page state serialization.
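A client can implement batching by queueing calls during the agent's turn and flushing them as one message. This is an illustrative sketch; `send` stands in for the real transport and `BatchingClient` is not part of the VulpineOS API:

```javascript
// Illustrative client-side batching: queue tool calls, flush once per turn.
class BatchingClient {
  constructor(send) {
    this.send = send;   // (message) => responses; transport is assumed
    this.queue = [];
  }

  call(method, params) {
    this.queue.push(params ? { method, params } : { method });
  }

  flush() {
    const message = { batch: this.queue };
    this.queue = [];
    return this.send(message);   // one round-trip for all queued calls
  }
}

// Usage: three calls, one round-trip.
const sent = [];
const client = new BatchingClient(msg => { sent.push(msg); return []; });
client.call('Page.getOptimizedDOM');
client.call('Page.screenshot', { clip: true });
client.call('Page.evaluate', { expression: 'document.title' });
client.flush();
// sent now holds a single message with a 3-entry batch
```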
## End-to-End Example
A typical agent interaction with a search results page:
```
Step 1 — Initial snapshot:     4,100 tokens (viewport-pruned + optimized DOM)
Step 2 — Click search result:      0 tokens (action only)
Step 3 — New page snapshot:    3,400 tokens (full, new page)
Step 4 — Scroll down:            620 tokens (incremental, new viewport content)
Step 5 — Extract data:             0 tokens (cached, no DOM change)
─────────────────────────────────────────
Total:                         8,120 tokens
```

Without optimization: ~200,000 tokens (5 full AX tree dumps)
Savings: 96%

## Configuration
```js
const context = await browser.newContext({
  firefoxUserPrefs: {
    'vulpineos.dom_export.enabled': true,
    'vulpineos.dom_export.viewport_pruning': true,
    'vulpineos.dom_export.caching': true,
    'vulpineos.dom_export.incremental': true,
  }
})
```

## See also
- Token-Optimized DOM Export — compressed array-of-tuples format
- Cost Tracking — per-agent token usage and budget limits
- MCP Browser Tools — 36 tools for AI agent browser control