MCP servers that can take a screenshot
6 verified servers expose a tool that can take a screenshot of a web page, a native app, or a design
A screenshot is how an agent shows its work on the visual web: proof a page rendered, a frame from a failing test, a design captured for comparison. Several browser and scraping servers expose it as a single tool, and they differ mainly in how the shot is taken: from a browser session you drive, from a test run, or from a URL alone.
These verified servers let an agent capture an image of a web page, a native app, or a design.
Playwright
Microsoft
Microsoft's official browser-automation server that drives pages via the accessibility tree, not pixels.
browser_take_screenshot
browser_take_screenshot captures whatever Playwright currently has on screen, the default when an agent is driving a live browser and needs to see the result of a step.
Jina AI
Jina AI
Jina AI's official remote server gives agents web search, URL-to-markdown reading, reranking, and embeddings-powered tools.
capture_screenshot_url
Point capture_screenshot_url at a URL and it shoots the page in one call, with no browser session to set up or tear down.
BrowserStack
BrowserStack
BrowserStack's official server runs manual and automated tests on real browsers and devices, and debugs the failures.
takeAppScreenshot
Not a browser page: takeAppScreenshot captures a native app on a real device, for when what renders on an actual phone is the thing to verify.
Cypress
JADEV GROUP
A maintained MCP server that runs your Cypress E2E suite from an agent, returns structured results, and surfaces failure context.
cypress_get_screenshot
Inside a Cypress run, cypress_get_screenshot pulls the frame from a test, so a failing assertion can hand back exactly what broke.
Figma
Figma
Figma's official MCP server: turn designs into code context, read variables and components, and write to the canvas.
get_screenshot
Not a web page at all: get_screenshot renders a selected Figma node to an image, the reference shot an agent diffs its built UI against.
ScrapingBee
ScrapingBee
ScrapingBee's official MCP server: scrape pages to text or HTML, screenshot, extract data, search the web, and pull Amazon, Walmart, and YouTube data.
get_screenshot
get_screenshot runs through ScrapingBee's managed rendering and anti-bot layer, so it captures pages that turn away a bare headless browser, all in one keyed HTTP call.
What to know
The picks split by how the shot is taken. Playwright drives a live browser you control and captures whatever is on screen mid-session, which is what you want when the shot is one step in a longer interaction. A test runner like Cypress is different: it captures during a headless run, and the tool retrieves the PNG Cypress saved when an assertion failed, not a fresh frame. A third group (Jina, ScrapingBee) takes a URL and hands back an image with no browser to manage, which is faster for a one-off grab. BrowserStack and Figma are their own cases: BrowserStack captures a native app on a real device rather than a browser page, for when on-device rendering is what you need to check, and Figma renders a selected design node as the reference image to compare a built UI against.
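For readers wiring these tools into a client, the difference between the URL-in group and the live-session group shows up in the request itself. The following is a minimal sketch, assuming the standard MCP tools/call request over JSON-RPC 2.0. The tool names come from the listings above, but the argument shapes (a url field for capture_screenshot_url, empty arguments for browser_take_screenshot) and the tool_call helper are illustrative assumptions, so check each server's published tool schema before relying on them.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any value unique per request works


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# URL in, image out: a single call with no session state.
# The "url" argument name is an assumption, not a confirmed schema.
one_off = tool_call("capture_screenshot_url", {"url": "https://example.com"})

# Session-based: this only makes sense after earlier calls have opened a page.
# Passing empty arguments is an assumption; the real tool may accept options.
mid_session = tool_call("browser_take_screenshot", {})

print(json.dumps(one_off, indent=2))
```

The one-off request carries everything the server needs, while the session request depends on whatever the agent did before it, which is why the second kind belongs inside a longer interaction.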
A screenshot is evidence, and evidence only helps if you can tie it back to what happened. An image with no record of which step produced it, or what the agent was checking, is just a PNG in a folder.
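One way to keep that link is to write a small record beside every image. This is a rough sketch, not tied to any of the servers above: record_evidence and its field names are hypothetical, and it assumes the screenshot has already been saved to disk as a file.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_evidence(png_path: Path, tool: str, step: str, checking: str) -> Path:
    """Write a JSON sidecar next to a screenshot, tying it to the step that produced it."""
    data = png_path.read_bytes()
    record = {
        "file": png_path.name,
        # The hash lets a later reader confirm the image was not swapped or edited.
        "sha256": hashlib.sha256(data).hexdigest(),
        "tool": tool,          # which screenshot tool produced the image
        "step": step,          # which step of the run it belongs to
        "checking": checking,  # what the agent was trying to verify
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
    # shot.png -> shot.png.json, so the pair sorts together in a folder listing.
    sidecar = png_path.with_name(png_path.name + ".json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar
```

Calling record_evidence(Path("checkout.png"), "browser_take_screenshot", "step 3: submit order", "confirmation banner is visible") would leave checkout.png.json beside the image, so the PNG is no longer an orphan in a folder.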
Questions
- Do I need to run a browser to take a screenshot?
- Not always. Playwright screenshots a browser you are driving, which fits multi-step interactions, and Cypress hands back the shot it saved during a test run. Jina and ScrapingBee take a URL and return an image with no browser to manage, which is quicker for a single capture. Pick based on whether the shot is part of a session or a one-off.
- Can an agent capture what renders on a real phone?
- Through BrowserStack, for a native app: takeAppScreenshot runs against real devices, so you get a genuine on-device capture rather than a desktop browser resized to a narrow width. That distinction matters when the bug you are chasing only shows up on an actual device.