MCP Tools
VulpineOS provides a Model Context Protocol (MCP) server that exposes browser automation as tool calls. Each AI browser agent uses these tools to browse — every action goes through the anti-detect browser engine (Camoufox). OpenClaw agents call these tools directly for all browser automation tasks.
./vulpineos --mcp-server

Available Tools (36)
Core Browser Control
| Tool | Description |
|---|---|
| vulpine_navigate | Navigate a session to a URL |
| vulpine_snapshot | Token-optimized semantic DOM with @ref identifiers |
| vulpine_click | Click at page coordinates |
| vulpine_type | Type into the focused element |
| vulpine_screenshot | Capture a base64 PNG screenshot |
| vulpine_scroll | Scroll by pixel delta |
| vulpine_new_context | Create an isolated browser context and page |
| vulpine_close_context | Close a browser context |
| vulpine_get_ax_tree | Read the injection-filtered accessibility tree |
| vulpine_click_ref | Click by @ref from vulpine_snapshot |
| vulpine_type_ref | Focus and type by @ref |
| vulpine_hover_ref | Hover by @ref |
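
Over MCP, a tool invocation travels as a JSON-RPC 2.0 `tools/call` request whose `params` carry the same `name` and `arguments` fields used in the examples on this page. The following is a minimal Python sketch of building that envelope. The helper name `make_tool_call` is hypothetical, and the `url` argument for `vulpine_navigate` is an assumption, since this page does not list each tool's argument schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def make_tool_call(name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Assumed arguments: "sessionId" mirrors the snapshot example; "url" is a guess.
message = make_tool_call(
    "vulpine_navigate",
    {"sessionId": "sess-1", "url": "https://example.com"},
)
print(message)
```

An MCP client library would normally build this envelope for you; the sketch only shows the shape of the message that reaches the server.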
vulpine_snapshot is the primary page-reading tool. It returns a compressed DOM with element refs. A request and its response look like this:
{
  "name": "vulpine_snapshot",
  "arguments": { "sessionId": "sess-1", "viewportOnly": true }
}

{"v":1,"title":"Example","url":"https://example.com","nodes":[
[0,"doc","Example"],
[1,"a","Home",{"hr":"/"},"@0"],
[1,"btn","Sign Up",null,"@1"]
]}

Reliability And Interaction
| Tool | Description |
|---|---|
| vulpine_wait | Wait for an element, text, network idle, stable DOM, or URL substring |
| vulpine_find | Search interactive elements by text, label, placeholder, or role |
| vulpine_verify | Check state after an action: exists, visible, checked, value, text, URL, title |
| vulpine_screenshot_diff | Compare screenshot checkpoints for visual changes |
| vulpine_page_settled | Wait for ready state, DOM stability, and pending images |
| vulpine_select_option | Select dropdown options by value or visible text |
| vulpine_fill_form | Fill multiple form fields in one call |
| vulpine_page_info | Summarize page URL, title, scroll, forms, buttons, links, and modals |
| vulpine_press_key | Press keys and shortcuts with modifiers |
| vulpine_clear_input | Clear focused or selected input text |
| vulpine_get_form_errors | Extract HTML5, CSS, and ARIA form validation errors |
Human Realism
| Tool | Description |
|---|---|
| vulpine_human_click | Move and click with natural timing, curves, and jitter |
| vulpine_human_type | Type with variable human-like cadence |
| vulpine_human_scroll | Scroll with inertial timing |
Extension Surfaces
These tools expose stable VulpineOS feature surfaces. Open-source builds return a graceful "unavailable" error when a provider is not present; commercial/private builds can attach their own providers without exposing implementation source.
| Tool | Description |
|---|---|
| vulpine_annotated_screenshot | Screenshot with durable @N labels for interactive elements |
| vulpine_click_label | Click an element from the latest annotated screenshot by label |
| vulpine_get_credential | Return stored credential metadata for a site, never plaintext |
| vulpine_autofill | Ask the credential provider to fill username and password fields |
| vulpine_start_audio_capture | Start an audio capture session |
| vulpine_stop_audio_capture | Stop an audio capture session |
| vulpine_read_audio_chunk | Read base64 audio bytes from a capture session |
| vulpine_list_mobile_devices | List mobile devices visible to the bridge provider |
| vulpine_connect_mobile_device | Start a local CDP bridge to an Android device and return an endpoint |
| vulpine_disconnect_mobile_device | Stop a mobile bridge session |
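
Because open-source builds may answer these tools with an unavailable error, an agent harness may want to detect that case and fall back to another tool. The sketch below assumes the generic MCP tool-result shape (an `isError` flag plus a `content` list of text items) and assumes the error text contains the word "unavailable". The exact wording VulpineOS returns is not documented here, and both the helper name `provider_unavailable` and the sample messages are hypothetical.

```python
def provider_unavailable(result: dict) -> bool:
    """Return True when a tool result looks like an 'unavailable' error."""
    if not result.get("isError", False):
        return False
    # Join all text items so the check does not depend on item order.
    text = " ".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    # Assumption: the error message mentions "unavailable".
    return "unavailable" in text.lower()


# Hypothetical results, for illustration only.
missing = {
    "isError": True,
    "content": [{"type": "text", "text": "credential provider unavailable"}],
}
present = {"content": [{"type": "text", "text": "[]"}]}

print(provider_unavailable(missing))  # True
print(provider_unavailable(present))  # False
```

Matching on message text is brittle; if a build exposes a structured error code, checking that instead would be the safer choice.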
{
  "name": "vulpine_connect_mobile_device",
  "arguments": { "udid": "R58N12ABCDE" }
}

Ref-Based Workflow
The recommended agent workflow:
- Call vulpine_snapshot to get the page DOM with @ref identifiers
- Agent reads the snapshot and decides which element to interact with
- Call vulpine_click_ref or vulpine_type_ref with the @ref (no coordinates needed)
- Call vulpine_snapshot again to see the result
This is more reliable than coordinate-based clicking because refs resolve to the actual DOM element, scroll it into view, and compute the center point automatically.
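
The workflow above can be sketched in Python using the sample snapshot from earlier on this page. The node layout `[depth, role, label, attrs, ref]` is inferred from that one sample rather than from a documented schema. The helper `find_ref` and the `ref` argument name for `vulpine_click_ref` are assumptions made for illustration.

```python
import json
from typing import Optional

# Sample snapshot response copied from the vulpine_snapshot example.
SNAPSHOT = """{"v":1,"title":"Example","url":"https://example.com","nodes":[
[0,"doc","Example"],
[1,"a","Home",{"hr":"/"},"@0"],
[1,"btn","Sign Up",null,"@1"]
]}"""


def find_ref(snapshot: dict, role: str, label: str) -> Optional[str]:
    """Return the @ref of the first node matching role and label, if any."""
    for node in snapshot["nodes"]:
        # Assumed layout: [depth, role, label, attrs, ref]; attrs and ref are optional.
        node_role, node_label = node[1], node[2]
        ref = node[4] if len(node) > 4 else None
        if ref and node_role == role and node_label == label:
            return ref
    return None


snapshot = json.loads(SNAPSHOT)
ref = find_ref(snapshot, "btn", "Sign Up")

# "ref" as the argument name is an assumption, not a documented schema.
click_call = {
    "name": "vulpine_click_ref",
    "arguments": {"sessionId": "sess-1", "ref": ref},
}
print(json.dumps(click_call))
```

In a real agent the model usually picks the element from the snapshot text; a deterministic lookup like this is mainly useful for scripted steps and tests.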
OpenClaw Integration
VulpineOS auto-generates ~/.openclaw-vulpine/openclaw.json. When foxbridge is available, it sets browser.cdpUrl so OpenClaw uses its native CDP browser tools through Camoufox:
{
  "browser": {
    "enabled": true,
    "cdpUrl": "ws://localhost:9222"
  }
}

See also
- Getting Started — install and launch your first agent
- Token-Optimized DOM Export — compressed snapshots for LLM context
- Foxbridge CDP Proxy — embedded CDP server for OpenClaw
- Agent Scripting DSL — JSON automation without LLM calls