Your Agent's Browser
Stop managing browsers. Elevate your agents.
One API call. Many actions. Rich results.
Structured Steps API
Your agent generates JSON steps. We execute them and return rich results.
const response = await fetch("https://api.riddledc.com/v1/run", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    steps: [
      { goto: "https://example.com/login" },
      { fill: { selector: "#email", value: "user@example.com" } },
      { fill: { selector: "#password", value: "secret" } },
      { click: "button[type=submit]" },
      { waitForUrl: "**/dashboard**" },
      { screenshot: "dashboard" }
    ]
  })
});
// Sync by default - results returned directly
const { status, screenshots, console: logs } = await response.json();
// screenshots.dashboard = "https://cdn.riddledc.com/..."

Rich results
Get back screenshots, console logs, network HAR, assertion results, and downloaded files—not just pixels.
No Playwright required
Your agent generates JSON, not code. No syntax errors, no script injection risks.
Assert without vision
Use assert steps to check page state. Only screenshot when you need to.
Pack work into sessions
30s minimum per job. Do multiple navigations and screenshots in one call for maximum value.
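Because the agent emits plain JSON rather than code, its output can be checked before it is sent. The following is a minimal sketch, assuming the only step keys are the ones that appear in the examples on this page (`goto`, `fill`, `click`, `waitForUrl`, `screenshot`, `assert`, plus `onFail` as a modifier). The real API may accept more, and `validate_steps` is a hypothetical helper, not part of Riddle.

```python
import json

# Step keys seen in this page's examples; an assumption, not the full API surface.
KNOWN_ACTIONS = {"goto", "fill", "click", "waitForUrl", "screenshot", "assert"}
KNOWN_MODIFIERS = {"onFail"}


def validate_steps(raw: str) -> list:
    """Parse agent output and reject anything that is not a list of known steps."""
    steps = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed output
    if not isinstance(steps, list):
        raise ValueError("steps must be a JSON array")
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {index} is not an object")
        actions = set(step) & KNOWN_ACTIONS
        unknown = set(step) - KNOWN_ACTIONS - KNOWN_MODIFIERS
        if unknown:
            raise ValueError(f"step {index} has unknown keys: {sorted(unknown)}")
        if len(actions) != 1:
            raise ValueError(f"step {index} must contain exactly one action")
    return steps
```

Rejecting unknown keys up front means a hallucinated step fails locally instead of consuming a billed browser job.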
The Problem with Vision Loops
The screenshot-per-action pattern is expensive and slow:
Each browser call starts at ~$0.004, plus vision API costs. A 50-step task built on a naive one-call-per-step architecture can cost $0.20+ in browser time alone (50 × $0.004). The fix: batch deterministic steps into single calls and screenshot only at decision points.
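The arithmetic behind that figure can be sketched in a few lines. This assumes every call is billed at the ~$0.004 starting price (that is, each batched call still fits inside the billing minimum); the batch size of 10 is purely illustrative.

```python
import math

PRICE_PER_CALL = 0.004  # approximate starting price per browser call, from the text above


def browser_cost(total_steps: int, steps_per_call: int) -> float:
    """Browser cost in dollars when steps are grouped into calls of a fixed size."""
    calls = math.ceil(total_steps / steps_per_call)
    return calls * PRICE_PER_CALL


naive = browser_cost(50, 1)     # one call per step: 50 calls
batched = browser_cost(50, 10)  # ten steps per call: 5 calls (hypothetical batch size)
print(f"naive: ${naive:.2f}, batched: ${batched:.2f}")
# prints "naive: $0.20, batched: $0.02"
```

Vision API costs come on top of both numbers and depend on how many screenshots the agent actually sends to the model.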
What Slows Your Agent Down
The Login Loop
Every screenshot restarts the browser. Every restart loses cookies. Your agent wastes 30+ seconds re-authenticating for each action.
Chrome Memory Instability
Running headless Chrome locally eats RAM. Parallel agents crash. Memory leaks accumulate. Your agent becomes unreliable after a few hundred steps.
Infrastructure Complexity
Docker containers, browser pools, scaling—you want to build agent logic, not manage Chrome infrastructure.
Cost Accumulation
Vision API calls run $0.01-0.02 per image. Your browser costs pile on top. Browser costs should be noise, not a line item.
Assert Before You Screenshot
Check page state without burning vision API credits. Only screenshot when you need human-level understanding.
const response = await fetch("https://api.riddledc.com/v1/run", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    steps: [
      { goto: "https://example.com/checkout" },
      // Check if we're on the right page without screenshotting
      { assert: [{ selectorExists: ".checkout-form" }], onFail: ["screenshot", "abort"] },
      // Fill the form
      { fill: { selector: "#card-number", value: "4242424242424242" } },
      { fill: { selector: "#expiry", value: "12/25" } },
      { click: "button.submit-payment" },
      // Assert success, only screenshot if something went wrong
      { assert: [{ urlIncludes: "/confirmation" }], onFail: ["screenshot", "abort"] },
      // Success! Screenshot the confirmation
      { screenshot: "confirmation" }
    ]
  })
});
// If all asserts pass: 1 screenshot
// If any assert fails: screenshot of failure state + abort

Or Just Grab a Screenshot
Don't need multi-step flows? Get a PNG in one call.
curl -X POST "https://api.riddledc.com/v1/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "sync": true}' \
  -o screenshot.png
# PNG bytes. 3-4 seconds. No polling.

Skip the Login Loop
Inject cookies or headers. Your agent authenticates once, screenshots forever.
# Pass session cookies - skip login entirely
curl -X POST "https://api.riddledc.com/v1/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://app.example.com/dashboard",
    "options": {
      "cookies": [
        {"name": "session_id", "value": "abc123", "domain": "app.example.com"}
      ]
    }
  }' -o dashboard.png

# Or use Bearer tokens for API-protected pages
curl -X POST "https://api.riddledc.com/v1/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://app.example.com/dashboard",
    "options": {
      "headers": {"Authorization": "Bearer YOUR_APP_TOKEN"}
    }
  }' -o dashboard.png

Your agent logs in once, extracts the session cookie, then passes it to every Riddle call. No more 30-second login loops burning tokens and time.
Full auth guide →
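One way the "extract the session cookie" step might look, using only the Python standard library: the sketch below turns a `Set-Cookie` header value into the `{name, value, domain}` objects shown in the cookie example. How the agent obtains that header (its own login request) is outside the sketch, and `cookies_for_riddle` is a hypothetical helper name.

```python
from http.cookies import SimpleCookie


def cookies_for_riddle(set_cookie_header: str, default_domain: str) -> list:
    """Turn a Set-Cookie header value into the cookie objects used in `options.cookies`."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return [
        {
            "name": name,
            "value": morsel.value,
            # Fall back to the host that set the cookie when no Domain attribute is present.
            "domain": morsel["domain"] or default_domain,
        }
        for name, morsel in jar.items()
    ]


cookies = cookies_for_riddle("session_id=abc123; Path=/", "app.example.com")
```

The resulting list can be placed under `options.cookies` in each later request, so the login flow runs once rather than per screenshot.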
Batch for Sub-Penny Screenshots
Need multiple screenshots? One call, multiple URLs, one billing minimum.
# 5 screenshots, one API call, ~$0.0008 each
curl -X POST "https://api.riddledc.com/v1/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example.com",
      "https://example.com/pricing",
      "https://example.com/docs",
      "https://example.com/about",
      "https://example.com/contact"
    ]
  }'
# Returns job_id, poll for screenshots
# Total cost: ~$0.004 for all 5

Python Integration
No SDK needed. Standard HTTP requests.
import base64
import os

import openai
import requests

# Same environment variable name as the TypeScript example
API_KEY = os.environ["RIDDLE_API_KEY"]


def screenshot_for_vision(url, api_key, cookies=None):
    """Take screenshot, return base64 for vision LLM."""
    body = {"url": url}
    if cookies:
        body["options"] = {"cookies": cookies}
    response = requests.post(
        "https://api.riddledc.com/v1/run",
        headers={"Authorization": f"Bearer {api_key}"},
        json=body,
    )
    response.raise_for_status()  # fail loudly instead of base64-encoding an error body
    return base64.b64encode(response.content).decode()


# Use with a vision-capable model (gpt-4o here)
screenshot_b64 = screenshot_for_vision(
    "https://example.com",
    API_KEY,
    cookies=[{"name": "session", "value": "abc123", "domain": "example.com"}],
)
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What actions are available on this page?"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}},
        ],
    }],
)

TypeScript / Node.js Integration
For LangChainJS, Next.js, or any Node-based agent.
import OpenAI from "openai";

async function screenshotForVision(
  url: string,
  apiKey: string,
  cookies?: { name: string; value: string; domain: string }[]
): Promise<string> {
  const body: Record<string, unknown> = { url };
  if (cookies) body.options = { cookies };
  const response = await fetch("https://api.riddledc.com/v1/run", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  const buffer = await response.arrayBuffer();
  return Buffer.from(buffer).toString("base64");
}

// Use with OpenAI SDK
const openai = new OpenAI();
const screenshotB64 = await screenshotForVision(
  "https://example.com",
  process.env.RIDDLE_API_KEY!,
  [{ name: "session", value: "abc123", domain: "example.com" }]
);
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "What actions are available on this page?" },
      { type: "image_url", image_url: { url: `data:image/png;base64,${screenshotB64}` } }
    ]
  }]
});

Works With Your Stack
Browser-Use
Replace local Chrome with Riddle API calls. Same observe-think-act loop, no infrastructure.
LangChain
Custom tool wrapper for PlayWrightBrowserToolkit. Simpler than managing browser pools.
CrewAI
Screenshot tool for your crew's web research and verification tasks.
Custom Agents
Simple REST API. Works with any language, any framework. No SDK required.
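As an illustration of the "no SDK required" point, a custom agent tool can be a thin wrapper over the standard library. This is a sketch under assumptions: the endpoint and headers are the ones used in the examples on this page, while `build_run_request`, `run`, and the 60-second timeout are hypothetical choices rather than anything Riddle prescribes.

```python
import json
import urllib.request

RIDDLE_RUN_URL = "https://api.riddledc.com/v1/run"  # endpoint used throughout this page


def build_run_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the run endpoint."""
    return urllib.request.Request(
        RIDDLE_RUN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run(api_key: str, body: dict, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body (PNG bytes or JSON text)."""
    with urllib.request.urlopen(build_run_request(api_key, body), timeout=timeout) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes the wrapper easy to unit test without network access, and `run` can then be registered as a tool in whichever agent framework is in use.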
Need More Control?
For multi-step workflows, use script mode. Navigate, click, fill forms, then screenshot.
# Login flow + screenshot in one call
curl -X POST "https://api.riddledc.com/v1/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "script": "await page.goto(\"https://app.example.com/login\"); await page.fill(\"input[name=email]\", \"user@example.com\"); await page.fill(\"input[name=password]\", \"password123\"); await page.click(\"button[type=submit]\"); await page.waitForURL(\"**/dashboard\"); await saveScreenshot(\"dashboard\");"
  }'

Script mode runs full Playwright scripts server-side. Great for complex workflows, but most agents should use steps mode or url mode with cookie injection.
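Because the script travels inside a JSON string, any double quotes and newlines in it must be escaped. Letting a JSON serializer build the request body avoids hand-escaping. A minimal sketch, reusing the login script from the example above; only the `script` field shown on this page is assumed.

```python
import json

# The Playwright script from the login example, written as ordinary multi-line text.
SCRIPT = """\
await page.goto("https://app.example.com/login");
await page.fill("input[name=email]", "user@example.com");
await page.fill("input[name=password]", "password123");
await page.click("button[type=submit]");
await page.waitForURL("**/dashboard");
await saveScreenshot("dashboard");
"""

# json.dumps escapes the inner double quotes and newlines, so the body is valid JSON.
body = json.dumps({"script": SCRIPT})
```

The resulting `body` string can be sent as the POST payload with a `Content-Type: application/json` header.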
vs. Self-Hosted Chrome
| | Self-Hosted Puppeteer | Riddle API |
|---|---|---|
| Memory per agent | 500MB-2GB | 0 (API call) |
| Parallel agents | Limited by RAM | Unlimited |
| Setup time | Hours (Docker, deps) | Minutes (API key) |
| Session persistence | Complex pooling | Cookie injection |
| Cost per job | ~$0.001 + infra + your time | from $0.004 (<$0.001/screenshot batched) |
Pricing
30s minimum, sync by default
Multiple screenshots per job
Browser costs should be ~5% of your LLM spend, not a major line item. No subscriptions. Pay for what you use.
Agent Guide
Building an agent? Get the complete technical reference—copy-paste ready for your agent's context.
Give Your Agent a Browser
Create an account and start making requests in minutes.