What you’ll know by the end of this check
- Why “no API” is no longer a wall — it’s a fork in the road
- The two ways Claude can drive a web app, and which one to reach for first
- A four-option ladder for handling credentials without leaking them
The setup: a real family problem
Annie got a Skylight. It’s a smart-home display that lives on the kitchen wall. The to-do list lives at app.ourskylight.com. The family lives in Slack.
Goal: when someone adds “pick up prescriptions” to the Skylight, it shows up in #to-dos so anyone can see it without walking to the kitchen.
Skylight has no API. No webhook. No public docs. Nothing to integrate against.
This is not a Skylight problem. It’s the most common shape of automation request you’ll ever get — every internal tool at work, half the SaaS apps you actually use, every legacy admin dashboard with a 2014 login screen. No API.
Before computer use, that meant: write a custom scraper, fight CAPTCHAs, maintain it forever. Now Claude has two ways through the door.
Two tracks, same destination
        ┌─────────────────────────────┐
        │     app.ourskylight.com     │
        │        (To-Do List)         │
        └──────────────┬──────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
  Track A (CLI)               Track B (Desktop)
 Playwright MCP                 Computer Use
  reads the DOM                sees the screen
        │                             │
        └──────────────┬──────────────┘
                       │
                ┌──────▼───────┐
                │    Slack     │
                │   #to-dos    │
                └──────────────┘
Track A — Playwright MCP (DOM-based)
Claude Code in the terminal, plus an MCP server that exposes a real browser. The server is maintained by Microsoft’s Playwright team (microsoft/playwright-mcp, docs).
claude mcp add playwright npx @playwright/mcp@latest
That’s it. The browser auto-downloads on first use. Node 18+ is the only prerequisite.
Claude calls structured tools: browser_navigate, browser_snapshot, browser_click, browser_type. It reads the accessibility tree, not pixels. It sees the page the way a screen reader does.
Fast. Deterministic. Cheap on tokens. Breaks the moment Skylight ships a DOM redesign.
Track B — Desktop Computer Use (screenshot-based)
Claude Desktop app, computer-use toggle on. Claude takes a screenshot, sees the desktop the way you see it, then sends mouse and keyboard events.
No setup beyond installing the app and turning the toggle on.
Slower (every action is a screenshot round trip). Vision-native. Survives any UI change a human could survive. Costs more tokens. Requires the screen unlocked.
Computer use is in beta and lives in the official Anthropic docs. A working reference implementation (Docker + agent loop + web UI) is on GitHub.
When to reach for which
| Choose Track A (Playwright MCP) when… | Choose Track B (Computer Use) when… |
|---|---|
| The site has a clean, stable DOM | The site is canvas, heavy SVG, or vision-only |
| You want speed and low token cost | You’re hitting CAPTCHAs or anti-bot challenges |
| You want to schedule it unattended | You want to debug visually in real time |
| You’re comfortable with CLI tooling | You’d rather watch Claude work than read JSON |
| The site is forms + buttons + lists | The UI is genuinely weird (custom drag-drop, pop-ups, modals stacked four deep) |
Default to Track A. It’s faster to set up, faster to iterate, faster to run, and easier to debug. Reach for Track B when Track A breaks — and let the breakage tell you why.
Credentials: four ways to handle them
Both tracks need to log in. Here’s the ladder, easiest to most secure.
Option 1 — Paste in the prompt
This is the format Anthropic actually recommends in the computer use docs: “If you need the model to log in, provide it with the username and password in your prompt inside xml tags like <robot_credentials>.”
<robot_credentials>
email: annie@example.com
password: hunter2
</robot_credentials>
The same docs link to the prompt-injection mitigation guide and Anthropic’s browser-use defenses post. Read both before pasting anything sensitive.
Pros: zero setup. Works the first try.
Cons: lives in your conversation history forever. Don’t use this for anything you’d care about leaking.
Option 2 — Environment variable
export SKYLIGHT_USER="annie@example.com"
export SKYLIGHT_PASS="hunter2"
Then in the prompt: “Log in using $SKYLIGHT_USER and $SKYLIGHT_PASS from the environment.”
Pros: out of the prompt, out of the transcript.
Cons: lives in your shell history if you typed it directly. Use a .env file with set -a; source .env; set +a and gitignore it.
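A minimal sketch of that .env approach, assuming a POSIX-style shell. load_env is a hypothetical helper name, not part of Claude Code or Skylight; the variable names match the exports above.

```shell
# .env (listed in .gitignore) is assumed to contain:
#   SKYLIGHT_USER="annie@example.com"
#   SKYLIGHT_PASS="hunter2"

# load_env is a hypothetical helper: it exports every variable defined in the file.
load_env() {
  env_file="${1:-./.env}"
  if [ ! -f "$env_file" ]; then
    echo "no such file: $env_file" >&2
    return 1
  fi
  set -a           # auto-export every variable assigned from here on
  . "$env_file"    # "." is the portable spelling of "source"
  set +a           # stop auto-exporting
}

# Usage, from the project directory:
#   load_env
#   claude
```

Nothing is typed at the prompt, so nothing lands in shell history. The file itself is still plaintext on disk, which is the gap Options 3 and 4 close.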
Option 3 — 1Password CLI
op read "op://Personal/Skylight/password"
The format is 1Password’s secret-reference syntax: op://vault/item/field (see the full op read reference). Inject the value at session start. Nothing plaintext anywhere on disk.
Pros: auditable, rotatable, works on every device with op installed. The right answer for anything sensitive.
Cons: requires a 1Password subscription and the op CLI authenticated to your account.
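One way the inject-at-session-start step might look, sketched under the assumption that op is installed and signed in. load_skylight_password is a hypothetical helper name, and SKYLIGHT_PASS simply reuses the variable name from Option 2.

```shell
# Hypothetical helper: resolve the secret reference into an exported variable
# that lives only in this shell session (nothing is written to disk).
load_skylight_password() {
  if ! command -v op >/dev/null 2>&1; then
    echo "op CLI not found; install it and sign in first" >&2
    return 1
  fi
  # op read prints the field value for the given secret reference.
  SKYLIGHT_PASS="$(op read "op://Personal/Skylight/password")" || return 1
  export SKYLIGHT_PASS
}

# Usage:
#   load_skylight_password && claude
```

The value disappears when the shell exits, so rotating the password in 1Password requires no local cleanup.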
Option 4 — macOS Keychain
security find-generic-password -s "Skylight" -a "annie@example.com" -w
Native to the OS. Free. Survives reboots. Scriptable. Full flag list at ss64.com/mac/security.html — -s is service, -a is account, -w returns just the password.
Pros: zero extra software. Granular per-app.
Cons: macOS-only. Less polished than 1Password for sharing or rotation.
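A tighter variation, offered as a sketch: hand the Keychain value to a single command instead of exporting it for the whole session. with_keychain_password is a hypothetical helper name; the service and account values are the ones used in the command above.

```shell
# Hypothetical helper: run one command with SKYLIGHT_PASS set, and nothing else.
# The body runs in a subshell (note the parentheses), so the variable never
# reaches the calling shell.
with_keychain_password() (
  SKYLIGHT_PASS="$(security find-generic-password -s "Skylight" -a "annie@example.com" -w)" || exit 1
  export SKYLIGHT_PASS
  "$@"
)

# Usage:
#   with_keychain_password claude
```

macOS may show an access prompt the first time a new program reads the item; after that the lookup is scriptable.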
Rule of thumb: experiments use Option 1, anything you’d repeat uses Option 3 or 4.
Picking a winner
After both tracks log in and post the same list to Slack, you compare. Run them against the same site, the same day, the same network, and rate them on:
| Criterion | What you’re measuring |
|---|---|
| Setup difficulty | Time from “I want to try this” to “it ran.” |
| Reliability | Of 10 runs, how many finished without intervention? |
| Speed | Wall-clock time end-to-end. |
| Token cost | What did the session burn? |
| Brittleness | Will a UI redesign break it tomorrow? |
| Reusability | Can you point this at a different site without rewriting? |
| Debuggability | When it fails, can you see why in 30 seconds? |
| Scheduling potential | Could a cron job run this unattended at 7am? |
Write the verdict down somewhere version-controlled. Future-you and future-teammate-you will both want it.
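The Reliability and Speed rows can be measured rather than guessed. A rough sketch, assuming each track can be launched as a single command; run_trials is a hypothetical helper, and the script name in the usage comment is a placeholder.

```shell
# Hypothetical helper: run a command N times, count clean exits, time the batch.
# The body runs in a subshell so its working variables do not leak.
run_trials() (
  trials="$1"
  shift
  passed=0
  i=0
  start="$(date +%s)"
  while [ "$i" -lt "$trials" ]; do
    # A run counts as passed only if the command exits 0 on its own.
    if "$@" >/dev/null 2>&1; then
      passed=$((passed + 1))
    fi
    i=$((i + 1))
  done
  end="$(date +%s)"
  echo "passed=$passed/$trials seconds=$((end - start))"
)

# Usage (placeholder script name):
#   run_trials 10 ./track-a.sh
```

Exit codes only tell you a run finished, not that the Slack post was correct, so spot-check the output of a few runs by hand.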
In the Skylight run, Track A won on speed and scheduling; Track B won on first-try success, because the Skylight login screen has a JS-rendered form that Playwright didn’t see without an explicit wait. The right answer wasn’t either one alone: it was Track A with a Track B fallback.
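The fallback arrangement could be wired up along these lines. This is a sketch, not the setup used in the Skylight run: track_a_sync and track_b_sync are hypothetical placeholders for however you launch each track, and SYNC_LOG is an assumed variable name.

```shell
# Placeholders: replace these with your real Track A and Track B launchers.
track_a_sync() { echo "track_a_sync: replace with your Playwright MCP run" >&2; return 1; }
track_b_sync() { echo "track_b_sync: replace with your Computer Use run" >&2; return 1; }

# Try the fast track first; fall back to the robust one and keep the error trail.
sync_todos() {
  log="${SYNC_LOG:-sync-failures.log}"
  if track_a_sync 2>>"$log"; then
    echo "track=A"
  elif track_b_sync 2>>"$log"; then
    echo "track=B (fallback; see $log for why Track A failed)"
  else
    echo "both tracks failed; see $log" >&2
    return 1
  fi
}
```

Appending Track A's stderr to a log is what lets the breakage tell you why: a recurring error there is the cue to fix Track A rather than lean on the fallback.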
Things to try right now (15 minutes)
- Pick a web app you use weekly that has no API — your bank, your kid’s school portal, an internal tool at work.
- Open Claude Code. Add Playwright MCP: claude mcp add playwright npx @playwright/mcp@latest
- Ask Claude to navigate to that site and read one specific piece of information back to you. Use Option 1 credentials for the test.
- If it works, you’re on Track A. If it stalls or misreads the page, you’ve found a Track B candidate.
- Write down which site you tried and which track won. That’s your first data point.
Ready to verify this check?
You can name both tracks, explain when to use each, list at least three of the four credential options, and you’ve run one of them against a real site. Mark it cleared.
Sources and further reading
Computer Use (Anthropic)
- Computer use tool (Claude API docs): the canonical reference. Includes the <robot_credentials> recommendation in the prompting tips section.
- Mitigating prompt injections in browser use: Anthropic’s research post on the defenses now shipping in computer use.
- Mitigate jailbreaks and prompt injections: implementation guide.
- anthropic-quickstarts/computer-use-demo: Docker reference implementation.
Playwright MCP
- microsoft/playwright-mcp: official Microsoft repo.
- Playwright MCP getting started: setup walkthrough on playwright.dev.
- @playwright/mcp on npm: package listing.
Credentials
- 1Password CLI · op read: command reference.
- 1Password secret-reference syntax: the op://vault/item/field format.
- security command (ss64): macOS keychain CLI flags.
- SecKeychainFindGenericPassword (Apple Developer): the underlying API the security CLI wraps.