Vulpine API Reference

Vulpine API is the hosted commercial API for VulpineOS browser extraction. It uses real browser sessions when available, falls back to HTTP fetches where appropriate, and returns structured JSON with usage metadata.

Base URL:

https://api.vulpineos.com

Local development:

http://localhost:8080

Authentication

All /v1/* endpoints except /v1/providers require a bearer token:

Authorization: Bearer vk_live_...

The health endpoint is public.
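
The authentication rule above can be sketched with Python's standard library. This is a minimal sketch, not an official client: the helper name `build_request` and the `PUBLIC_PATHS` set are illustrative, and the function only builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.vulpineos.com"
# Paths this reference lists as not requiring a token.
PUBLIC_PATHS = {"/health", "/v1/providers"}


def build_request(method, path, api_key=None, body=None):
    """Build (but do not send) a request for a Vulpine API endpoint."""
    headers = {"Accept": "application/json"}
    if path not in PUBLIC_PATHS:
        if not api_key:
            raise ValueError(f"{path} requires an API key")
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 bytes.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

Passing the result to `urllib.request.urlopen` would send it, for example `urllib.request.urlopen(build_request("GET", "/v1/usage", api_key=key))`.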

Endpoints

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| GET | /health | No | Service health |
| GET | /v1/providers | No | Available LLM providers and models |
| POST | /v1/extract | Yes | Extract structured data from one URL |
| POST | /v1/batch | Yes | Queue batch extraction jobs |
| GET | /v1/extractions | Yes | List extraction history |
| GET | /v1/tasks/{id} | Yes | Poll an async task |
| POST | /v1/tasks/{id}/cancel | Yes | Cancel a queued/running task |
| GET | /v1/usage | Yes | Current usage and cost metadata |
| GET | /v1/keys | Yes | List API keys |
| POST | /v1/keys | Yes | Create an API key |
| POST | /v1/monitor | Yes | Create a recurring extraction monitor |
| GET | /v1/monitors | Yes | List monitors |
| DELETE | /v1/monitors/{id} | Yes | Delete a monitor |

Extract

POST /v1/extract

Use a JSON Schema when you know the target shape:

{
  "url": "https://example.com/product/123",
  "schema": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "price": { "type": "number" }
    },
    "required": ["name", "price"]
  }
}

Use a prompt when the schema is not fixed:

{ "url": "https://example.com/catalog", "prompt": "Extract product names, prices, and availability from this page." }

Use async mode for long jobs:

{ "url": "https://example.com/catalog", "prompt": "Extract all products.", "async": true }

Use pagination when the target spans multiple pages:

{
  "url": "https://example.com/catalog",
  "prompt": "Extract product names and prices from every catalog page.",
  "options": {
    "paginate": true,
    "max_pages": 5,
    "next_selector": "a[rel='next']"
  }
}
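
The request variants above (schema, prompt, async, pagination) can be assembled by one small helper. This is a sketch under assumptions: the function name `build_extract_body` and its parameter names are illustrative, and the check that at least one of `schema` or `prompt` is present is a guess at sensible client-side validation, not a documented server rule.

```python
def build_extract_body(url, schema=None, prompt=None, run_async=False,
                       max_pages=None, next_selector=None):
    """Assemble a JSON-serializable body for POST /v1/extract."""
    if schema is None and prompt is None:
        # Assumption: a request with neither field is not useful.
        raise ValueError("provide a schema or a prompt")
    body = {"url": url}
    if schema is not None:
        body["schema"] = schema
    if prompt is not None:
        body["prompt"] = prompt
    if run_async:
        body["async"] = True
    if max_pages is not None:
        # Pagination is switched on only when a page limit is given.
        options = {"paginate": True, "max_pages": max_pages}
        if next_selector:
            options["next_selector"] = next_selector
        body["options"] = options
    return body
```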

Response shape:

{
  "status": "completed",
  "data": { "name": "Example Product", "price": 19.99 },
  "metadata": {
    "completeness": 1,
    "tokens": 1240,
    "cost": 0.0021,
    "duration_ms": 3810,
    "pages": 1
  },
  "free_tier_remaining": 99
}

Partial responses keep the same shape with status: "partial" and metadata explaining missing fields or retry behavior.
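
Because completed and partial responses share one shape, a caller can handle both in a single code path. The sketch below is illustrative: `summarize_extract` is a hypothetical helper, and treating `completeness` below 1 as a signal for manual review assumes the field is a 0 to 1 fraction, which this reference does not state explicitly.

```python
def summarize_extract(response):
    """Reduce an extract response to the fields a caller usually acts on."""
    status = response.get("status")
    metadata = response.get("metadata", {})
    return {
        # Both completed and partial responses carry data.
        "usable": status in ("completed", "partial"),
        # Assumption: completeness < 1 means some fields are missing.
        "needs_review": status == "partial" or metadata.get("completeness", 0) < 1,
        "data": response.get("data"),
        "cost": metadata.get("cost"),
    }
```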

Batch

POST /v1/batch

Queues up to 100 URLs with the same schema or prompt. Batch jobs are always asynchronous.

{
  "urls": [
    "https://example.com/a",
    "https://example.com/b"
  ],
  "schema": {
    "type": "object",
    "properties": { "title": { "type": "string" } }
  }
}

Response:

{
  "tasks": [
    { "id": "task_1", "poll": "/v1/tasks/task_1" },
    { "id": "task_2", "poll": "/v1/tasks/task_2" }
  ]
}
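
Since a batch accepts at most 100 URLs, longer lists need to be split across several requests. A minimal sketch, assuming the caller passes the shared `schema` or `prompt` as a dict; the helper names `build_batch_bodies` and `poll_paths` are illustrative.

```python
MAX_BATCH_URLS = 100  # per-request limit stated for POST /v1/batch


def build_batch_bodies(urls, shared):
    """Split urls into request bodies of at most MAX_BATCH_URLS each.

    shared is the common part of every body, e.g. {"prompt": "..."}.
    """
    bodies = []
    for start in range(0, len(urls), MAX_BATCH_URLS):
        chunk = urls[start:start + MAX_BATCH_URLS]
        bodies.append({"urls": chunk, **shared})
    return bodies


def poll_paths(batch_response):
    """Collect the poll path of every task in a batch response."""
    return [task["poll"] for task in batch_response["tasks"]]
```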

Tasks

Poll an async task:

GET /v1/tasks/{id}

Cancel a queued or running task:

POST /v1/tasks/{id}/cancel

Task response:

{
  "id": "task_123",
  "status": "completed",
  "result": {
    "data": { "title": "Example" },
    "metadata": { "completeness": 1 }
  }
}

Statuses are queued, running, completed, failed, or canceled.
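
Polling can stop as soon as a task reaches `completed`, `failed`, or `canceled`. The loop below is a sketch, not an official client: `wait_for_task` is a hypothetical helper, `fetch_task` stands in for whatever function performs `GET /v1/tasks/{id}` and returns the parsed JSON, and the 2-second interval and 300-second timeout are arbitrary defaults.

```python
import time

# Statuses after which a task will not change again.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def wait_for_task(fetch_task, task_id, interval=2.0, timeout=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_task(task_id) until the task reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        task = fetch_task(task_id)
        if task["status"] in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(
                f"task {task_id} still {task['status']} after {timeout}s"
            )
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real waiting.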

Extraction History

GET /v1/extractions

Returns recent extraction tasks and metadata for the authenticated tenant.

Usage

GET /v1/usage

Returns monthly extraction count, token totals, and estimated cost metadata.

API Keys

List keys:

GET /v1/keys

Create a key:

POST /v1/keys
{ "name": "Production worker" }

Key creation returns the secret once. Store it immediately.

Providers

GET /v1/providers

Returns available model providers and supported model identifiers.

Monitors

Create a recurring extraction monitor:

POST /v1/monitor
{
  "url": "https://example.com/pricing",
  "prompt": "Extract the current plan prices.",
  "interval_seconds": 3600,
  "max_runs": 24,
  "webhook_url": "https://example.com/webhooks/vulpine"
}

List monitors:

GET /v1/monitors

Delete a monitor:

DELETE /v1/monitors/{id}

Monitors currently only enqueue recurring extraction jobs. Change detection, condition evaluation, and richer webhook change payloads are tracked separately in the paid API backlog.
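
A monitor body can be checked locally before it is sent. This is a sketch under assumptions: `build_monitor_body` and `approximate_span_hours` are hypothetical helpers, the positive-integer checks and the optional `webhook_url` are guesses rather than documented rules, and the span estimate ignores whether the first run happens immediately.

```python
def build_monitor_body(url, prompt, interval_seconds, max_runs, webhook_url=None):
    """Assemble a JSON-serializable body for POST /v1/monitor."""
    for name, value in (("interval_seconds", interval_seconds),
                        ("max_runs", max_runs)):
        # Assumption: both values must be positive integers.
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    body = {
        "url": url,
        "prompt": prompt,
        "interval_seconds": interval_seconds,
        "max_runs": max_runs,
    }
    if webhook_url:
        body["webhook_url"] = webhook_url
    return body


def approximate_span_hours(body):
    """Rough wall-clock span covered by all runs, in hours."""
    return body["interval_seconds"] * body["max_runs"] / 3600
```

With the example values above (`interval_seconds` of 3600 and `max_runs` of 24), the estimate comes to about one day of hourly runs.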

Errors

Errors return JSON:

{ "error": "missing url" }

Common status codes:

| Status | Meaning |
| --- | --- |
| 400 | Invalid request body or missing required field |
| 401 | Missing or invalid API key |
| 404 | Task or monitor not found |
| 429 | Rate limit or quota exceeded |
| 500 | Internal execution error |
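
The status codes and the JSON error body can be combined into one structured value on the client side. A minimal sketch: `describe_error` is a hypothetical helper, and marking 429 and 500 as retryable is an assumption about reasonable client behavior, not something this reference specifies.

```python
import json

STATUS_MEANINGS = {
    400: "Invalid request body or missing required field",
    401: "Missing or invalid API key",
    404: "Task or monitor not found",
    429: "Rate limit or quota exceeded",
    500: "Internal execution error",
}
# Assumption: only these codes are worth retrying after a delay.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status_code, raw_body):
    """Turn an HTTP status and raw response body into a summary dict."""
    try:
        message = json.loads(raw_body).get("error", "")
    except (ValueError, TypeError, AttributeError):
        # Body was missing, not JSON, or not a JSON object.
        message = ""
    return {
        "status": status_code,
        "meaning": STATUS_MEANINGS.get(status_code, "Unknown error"),
        "message": message,
        "retryable": status_code in RETRYABLE_STATUSES,
    }
```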

SDKs

TypeScript:

import { Vulpine } from '@vulpineos/sdk'

const client = new Vulpine({ apiKey: process.env.VULPINE_API_KEY })

const result = await client.extract({
  url: 'https://example.com/catalog',
  prompt: 'Extract product names and prices',
  options: { paginate: true, max_pages: 5 }
})

Python:

import os

from vulpineos import Vulpine, ExtractRequest

client = Vulpine(api_key=os.environ["VULPINE_API_KEY"])

result = client.extract(ExtractRequest(
    url="https://example.com/catalog",
    prompt="Extract product names and prices",
    options={"paginate": True, "max_pages": 5},
))
