E2E Testing Tips: Lessons from Testing with Riddle
Practical tips learned from extensive browser automation—including using Riddle to test Riddle itself.
We recently ran a comprehensive test suite against our own playground UI—using Riddle to test Riddle. Along the way, we learned (and re-learned) several lessons about effective browser automation.
Here are the practical tips that will save you time and money.
1. Know When to Use Sync vs Async
Riddle offers two modes, and choosing correctly matters:
Sync Mode
- 28-second hard limit—your workflow must complete within this window
- Returns results directly in the response
- Great for quick screenshots and simple workflows
- Use include: ['screenshots'] to get JSON with all screenshots
// Sync mode - results come back directly
const response = await fetch("https://api.riddledc.com/v1/run", {
  method: "POST",
  headers: {
    "Authorization": "Bearer " + token,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    steps: [
      { goto: "https://example.com" },
      { screenshot: "homepage" }
    ],
    sync: true,
    timeout_sec: 28,
    include: ["screenshots"] // Important! Gets JSON with all screenshots
  })
});
const data = await response.json();
// data.screenshots = [{ name: "homepage.png", data: "base64..." }]
Async Mode
- No 28-second cap: jobs can run for up to 30 minutes
- Returns a job ID immediately; you then poll for results
- Required for complex, multi-page workflows
- Better for production pipelines
// Async mode - poll for results
const submitResponse = await fetch("https://api.riddledc.com/v1/run", {
  method: "POST",
  headers: { "Authorization": "Bearer " + token, "Content-Type": "application/json" },
  body: JSON.stringify({
    steps: [...],
    sync: false, // Async mode
    timeout_sec: 120
  })
});
const { job_id } = await submitResponse.json();
// Poll for completion
const artifacts = await pollForArtifacts(job_id);
Rule of thumb: Use sync for anything under 20 seconds, async for everything else.
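
The pollForArtifacts helper used in the async example is not shown in this post. As a rough sketch of what such a polling loop might look like, the function below repeatedly calls a caller-supplied status check until the job reaches a terminal state or a deadline passes. The checkStatus callback, the option names, and the "completed"/"failed" status values are all assumptions for illustration, not Riddle's documented API.

```javascript
// Resolves after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Generic polling loop. `checkStatus` is a hypothetical async function supplied
// by the caller that returns an object such as { status: "running" }.
async function pollUntilDone(checkStatus, { intervalMs = 2000, timeoutMs = 30 * 60 * 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await checkStatus();
    // Assumed terminal states; adjust to whatever the real status response contains.
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    // Give up if the next wait would run past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Job did not finish within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

Passing the status check in as a function keeps the loop independent of any particular endpoint, so the same helper can be exercised with a fake in unit tests.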
2. The include Parameter Changes Everything
Without include, sync mode returns a raw PNG of just the first screenshot. With it, you get rich JSON:
{
  include: ["screenshots", "console", "har"]
}
- screenshots: All captured screenshots as base64 (or URLs if the response is large)
- console: Browser console logs with timestamps
- har: Full network request/response data
Pro tip: If you're capturing multiple screenshots in one job, always include screenshots or you'll only get the first one.
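
Because the response shape depends on whether include was set, client code may want to check what it actually received before parsing. The sketch below assumes (this is not confirmed by anything above) that the raw PNG arrives with an image/* Content-Type and the rich response with application/json. The function name readRunResponse is hypothetical.

```javascript
// Normalizes a sync-mode response into { kind, body }.
// Assumption: raw screenshots use an image/* Content-Type, rich responses use JSON.
async function readRunResponse(response) {
  const contentType = response.headers.get("content-type") || "";
  if (contentType.includes("application/json")) {
    return { kind: "json", body: await response.json() };
  }
  if (contentType.startsWith("image/")) {
    // Raw image bytes: only the first screenshot is available in this case.
    return { kind: "image", body: await response.arrayBuffer() };
  }
  throw new Error(`Unexpected content type: ${contentType || "(none)"}`);
}
```

It accepts anything with the shape of a fetch Response, so it can be called directly on the result of the fetch calls shown earlier.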
3. Handling Large Responses Gracefully
Multi-page workflows with HAR can generate massive responses. A workflow navigating through several pages might produce a 10MB+ HAR file.
Riddle handles this automatically: when the response would exceed size limits, screenshots and HAR are returned as CDN URLs instead of base64 data:
// Normal response (small)
{
  screenshots: [
    { name: "page1.png", data: "data:image/png;base64,..." }
  ]
}
// Large response (auto-converted to URLs)
{
  screenshots: [
    { name: "page1.png", url: "https://cdn.riddledc.com/..." }
  ],
  har: {
    url: "https://cdn.riddledc.com/...",
    _note: "HAR returned as URL due to size limits"
  },
  _note: "Screenshots returned as URLs instead of base64 due to response size limits"
}
Your code should handle both formats:
function getScreenshotUrl(screenshot) {
  if (screenshot.data) {
    // Base64 data - use directly as image src
    return screenshot.data.startsWith('data:')
      ? screenshot.data
      : `data:image/png;base64,${screenshot.data}`;
  } else if (screenshot.url) {
    // CDN URL - fetch or use directly
    return screenshot.url;
  }
}
4. Always Wait After Navigation
This is the most common source of flaky tests. After any action that triggers navigation, wait for the page to stabilize:
// BAD - screenshot might capture loading state
await page.click("a.next-page");
await saveScreenshot("results");
// GOOD - wait for page to fully load
await page.click("a.next-page");
await page.waitForLoadState("networkidle");
await saveScreenshot("results");
In steps mode (JSON), use the waitForLoadState step:
{
  "steps": [
    { "click": "a.next-page" },
    { "waitForLoadState": "networkidle" },
    { "screenshot": "results" }
  ]
}
networkidle waits until there are no network requests for 500ms, which is usually what you want for SPAs and dynamic content.
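
If you build steps arrays programmatically, it is easy to forget a wait. One possible safeguard is a small post-processing function that adds a waitForLoadState step after every step likely to trigger navigation. The sketch below is a hypothetical helper, and treating goto and click as the navigation-triggering steps is an assumption.

```javascript
// Step types assumed to trigger navigation.
const NAVIGATION_STEPS = ["goto", "click"];

// Returns a new steps array with a waitForLoadState step inserted after each
// navigation step that is not already followed by one.
function addWaits(steps, state = "networkidle") {
  const result = [];
  steps.forEach((step, i) => {
    result.push(step);
    const navigates = NAVIGATION_STEPS.some((key) => key in step);
    const next = steps[i + 1];
    const alreadyWaits = next !== undefined && "waitForLoadState" in next;
    if (navigates && !alreadyWaits) {
      result.push({ waitForLoadState: state });
    }
  });
  return result;
}
```

Applied to [{ click: "a.next-page" }, { screenshot: "results" }], this yields the same three-step sequence shown in the JSON example above, and running it on an array that already contains the waits leaves it unchanged.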
5. Write Reliable Selectors
Generic selectors are the enemy of reliable tests:
// FRAGILE - could match anything
await page.click("button");
await page.click("a");
await page.locator("div").first();
// ROBUST - specific and intentional
await page.click('button:has-text("Submit")');
await page.click('a[href="/dashboard"]');
await page.locator('[data-testid="user-menu"]');
Selector priority (most to least reliable):
- data-testid attributes (if available)
- Unique IDs: #submit-btn
- Text content: button:has-text("Submit")
- Specific classes: .primary-action-btn
- Attribute selectors: input[name="email"]
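
A cheap way to catch the fragile cases above before a job runs is to scan your selectors for ones that are nothing but a tag name. The following is a minimal sketch of such a check. Both function names are hypothetical, and the check only covers the bare-tag case ("button", "a", "div"), not every kind of weak selector.

```javascript
// True when the selector is only a tag name, e.g. "button" or "div".
function isBareTagSelector(selector) {
  return /^[a-z][a-z0-9-]*$/i.test(selector.trim());
}

// Returns the subset of selectors that look too generic to be reliable.
function findFragileSelectors(selectors) {
  return selectors.filter(isBareTagSelector);
}
```

For example, given "button", 'button:has-text("Submit")', and "#submit-btn", only "button" is flagged, because the other two contain characters that narrow the match.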
6. Use Console Logging for Debugging
When things go wrong, console logs are your best friend:
await page.goto("https://example.com");
// Log what you find
const itemCount = await page.locator(".item").count();
console.log("Found items:", itemCount);
const buttonVisible = await page.locator('button:has-text("Submit")').isVisible();
console.log("Submit button visible:", buttonVisible);
// Log current URL after navigation
await page.click("a.next");
await page.waitForLoadState("networkidle");
console.log("Current URL:", page.url());
await saveScreenshot("debug");
Include console in your request to retrieve these logs:
{
  include: ["screenshots", "console"]
}
// Response includes:
{
  console: {
    summary: { total_entries: 3, log_count: 3 },
    entries: {
      log: [
        { timestamp: 1702..., message: "Found items: 5" },
        { timestamp: 1702..., message: "Submit button visible: true" },
        { timestamp: 1702..., message: "Current URL: https://example.com/results" }
      ]
    }
  }
}
7. Optimize for the 30-Second Minimum
Riddle bills per second with a 30-second minimum per job. This has two implications:
Pack Multiple Screenshots Into One Job
A 5-second job costs the same as a 30-second job. So this:
// 5 separate jobs = 5 × $0.004 = $0.02
await runJob({ url: "https://example.com/page1" });
await runJob({ url: "https://example.com/page2" });
await runJob({ url: "https://example.com/page3" });
await runJob({ url: "https://example.com/page4" });
await runJob({ url: "https://example.com/page5" });
Is much more expensive than this:
// 1 job with 5 screenshots = $0.004
await runJob({
  steps: [
    { goto: "https://example.com/page1" },
    { screenshot: "page1" },
    { goto: "https://example.com/page2" },
    { screenshot: "page2" },
    { goto: "https://example.com/page3" },
    { screenshot: "page3" },
    { goto: "https://example.com/page4" },
    { screenshot: "page4" },
    { goto: "https://example.com/page5" },
    { screenshot: "page5" }
  ]
});
5x cheaper for the same result.
Don't Fear Longer Jobs
After the 30-second minimum, you're only paying ~$0.008/minute. A complex 2-minute workflow costs about $0.02—don't artificially split it into multiple jobs just to "keep things short."
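
To make the arithmetic concrete, here is a small estimator built only from the figures quoted in this section: $0.004 for the 30-second minimum and roughly $0.008 per minute beyond it. It assumes pricing is strictly linear per second, which is an inference from those numbers rather than a published rate, and the function name is made up for illustration.

```javascript
const MINIMUM_BILLED_SECONDS = 30;
// $0.004 per 30 seconds, i.e. about $0.008 per minute (assumed linear).
const PRICE_PER_SECOND = 0.004 / 30;

// Estimated cost in dollars for a job that runs for `durationSec` seconds.
function estimateJobCost(durationSec) {
  const billedSeconds = Math.max(durationSec, MINIMUM_BILLED_SECONDS);
  return billedSeconds * PRICE_PER_SECOND;
}

console.log(estimateJobCost(5).toFixed(4));   // 0.0040
console.log(estimateJobCost(120).toFixed(4)); // 0.0160
```

The 5-second job is billed as 30 seconds, and the 2-minute job comes to $0.016, which rounds to the $0.02 quoted above.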
8. The E2E Testing Pattern
Here's the pattern we used to test our own UI:
async function testFeature(token, testName, script) {
  // Submit job
  const job = await submitJob(token, script);
  console.log(`[${testName}] Job: ${job.job_id}`);
  // Poll for completion
  const result = await pollForCompletion(token, job.job_id);
  console.log(`[${testName}] Status: ${result.status}`);
  // Check console output for test result
  const consoleData = await fetchConsole(result.artifacts);
  const logs = consoleData?.entries?.log || [];
  const resultLog = logs.find(l => l.message.includes("RESULT:"));
  const testResult = resultLog?.message.replace("RESULT:", "").trim();
  return {
    test: testName,
    status: testResult === "SUCCESS" ? "PASS" : "FAIL",
    screenshots: result.artifacts.filter(a => a.name.endsWith(".png")).length
  };
}
// The test script reports its own result
const script = `
  await page.goto("https://myapp.com/login");
  await page.fill("#email", "test@example.com");
  await page.fill("#password", "password");
  await page.click('button[type="submit"]');
  await page.waitForLoadState("networkidle");
  const loggedIn = await page.locator(".dashboard").isVisible();
  if (loggedIn) {
    console.log("RESULT: SUCCESS");
  } else {
    console.log("RESULT: FAIL");
  }
  await saveScreenshot("final-state");
`;
This pattern lets you run comprehensive test suites and programmatically check results.
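
To turn individual tests into a suite, something has to run them and tally the outcomes. The sketch below is one way to do that, not the runner we actually used. runSuite and its argument shapes are hypothetical, and runTest stands in for a function like testFeature that resolves to an object with a status of "PASS" or "FAIL".

```javascript
// Runs tests one after another and summarizes the outcome.
// `tests` is an array of { name, script }; `runTest(name, script)` is supplied by the caller.
async function runSuite(tests, runTest) {
  const results = [];
  for (const { name, script } of tests) {
    try {
      results.push(await runTest(name, script));
    } catch (error) {
      // A thrown error (network failure, timeout) counts as a failed test.
      results.push({ test: name, status: "FAIL", error: error.message });
    }
  }
  const passed = results.filter((r) => r.status === "PASS").length;
  return { passed, failed: results.length - passed, results };
}
```

With the function above it could be called as runSuite(tests, (name, script) => testFeature(token, name, script)). Running tests sequentially keeps the logs readable. Running them in parallel would be faster but is left out here for simplicity.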
9. Give Screenshots Unique, Descriptive Names
Each saveScreenshot() call needs a unique name:
// BAD - overwrites previous screenshot
await saveScreenshot("screenshot");
await page.click(".next");
await saveScreenshot("screenshot"); // Overwrites!
// GOOD - clear, sequential names
await saveScreenshot("1-homepage");
await page.click(".next");
await saveScreenshot("2-after-click");
await page.fill("#search", "query");
await saveScreenshot("3-search-filled");
Descriptive names make debugging much easier when reviewing results.
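
If numbering screenshots by hand gets tedious, a tiny name generator can guarantee uniqueness. This is a hypothetical helper, not part of Riddle. It prefixes an incrementing counter and turns a free-form label into a lowercase, hyphenated slug.

```javascript
// Returns a function that produces unique, ordered screenshot names.
function createScreenshotNamer() {
  let counter = 0;
  return function nextName(label) {
    counter += 1;
    const slug = label
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into a hyphen
      .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
    return `${counter}-${slug}`;
  };
}
```

Calling the returned function with "Homepage" and then "After click" gives "1-homepage" and "2-after-click". Because the counter always advances, reusing a label still yields a distinct name.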
Quick Reference
| Situation | Solution |
|---|---|
| Quick screenshot (<20s) | Sync mode |
| Complex workflow | Async mode |
| Multiple screenshots | include: ["screenshots"] |
| Debugging failures | include: ["console"] + console.log() |
| Flaky after clicks | waitForLoadState("networkidle") |
| Cost optimization | Batch screenshots into one job |
Try It Yourself
The best way to learn is by doing. Start with a simple workflow and build up from there.