Replicate MCP server

Official · Replicate · Config last verified Jun 1, 2026

Replicate's official MCP server: discover, compare, and run thousands of hosted AI models — image, video, audio, and language — straight from your agent.

The Replicate MCP server is Replicate's official integration that puts its entire HTTP API behind an agent's tool calls. Replicate hosts thousands of community and official models — image generators like Flux and SDXL, video models, speech and music models, upscalers, and language models — each callable as a prediction. This server lets a coding agent search the catalog, read a model's schema, run a prediction, poll for the result, and manage trainings, deployments, files, and collections, all without leaving the editor or chat.

The recommended deployment is the hosted remote endpoint at https://mcp.replicate.com/sse, an SSE transport that runs an OAuth-style web flow: the first time you connect, you paste a Replicate API token into a browser page and the server stores it in Cloudflare KV, acting as a trusted intermediary so the token is never exposed to the model. You can also run it locally over stdio with npx replicate-mcp@latest and a REPLICATE_API_TOKEN environment variable. To keep the context window small, the server exposes a static set of common tools plus three dynamic meta-tools (list_api_endpoints, get_api_endpoint_schema, invoke_api_endpoint) that let the agent reach any endpoint in the API on demand.

Quick install

Copy-paste configs are provided for all 8 supported clients. The Claude Code config is shown below.

Add to ~/.claude.json

json

{
  "mcpServers": {
    "replicate": {
      "type": "http",
      "url": "https://mcp.replicate.com/sse"
    }
  }
}

Or via CLI

bash

claude mcp add --transport http replicate https://mcp.replicate.com/sse

Heads up

The first tool call opens a browser window where you authorize the server with your Replicate API token.

Available tools

Tool	Description
get_account	Return information about the user or organization associated with the provided API token.
list_collections	List the collections of models featured on Replicate, as a paginated list of collection objects.
get_collections	Get a single collection of models by slug, including the nested list of models in that collection.
list_hardware	List the available hardware SKUs (CPU and GPU types) for running models and trainings.
search_models	Get a list of public models matching a search query, ranked by relevance.
list_models	Get a paginated list of public models on Replicate.
get_models	Get the metadata for a public model by owner and name.
create_models	Create a new model on Replicate under your account or organization.
delete_models	Delete a model you own. The model must have no versions and no predictions.
get_models_readme	Get the README content (Markdown) for a model.
list_models_examples	List example predictions made using the model, useful for understanding typical inputs and outputs.
create_models_predictions	Create a prediction using an official model, passing the inputs you provide.
list_models_versions	List the versions of a model, sorted with the most recent version first.
get_models_versions	Get the metadata and input/output schema for a specific model version.
delete_models_versions	Delete a model version and all associated predictions, including all output files.
create_predictions	Create a prediction for the model version and inputs you provide.
get_predictions	Get the current state of a prediction, including outputs once it has completed.
list_predictions	Get a paginated list of all predictions created by the account associated with the API token.
cancel_predictions	Cancel a prediction that is currently running.
create_trainings	Start a new training of the model version you specify, to fine-tune a model.
get_trainings	Get the current state of a training.
list_trainings	Get a paginated list of all trainings created by the account associated with the API token.
cancel_trainings	Cancel a training that is currently running.
create_deployments	Create a new deployment with a chosen model version, hardware, and min/max instances for scaling.
get_deployments	Get information about a deployment by name, including its current release.
list_deployments	List the deployments associated with the current account, including the latest release configuration for each.
update_deployments	Update properties of an existing deployment, including hardware, min/max instances, and the underlying model version.
delete_deployments	Delete a deployment.
create_deployments_predictions	Create a prediction for the deployment and inputs you provide.
create_files	Upload a file to Replicate so it can be used as input to a prediction.
get_files	Get the details of a file you have uploaded.
list_files	Get a paginated list of all files created by the account associated with the API token.
download_files	Download a file by providing the file owner, access expiry, and a valid signature.
delete_files	Delete a file. Subsequent requests to the file resource return 404 Not Found.
get_default_webhooks_secret	Get the signing secret for the default webhook endpoint, used to verify webhook requests come from Replicate.
list_api_endpoints	Dynamic meta-tool: list or search every endpoint in the Replicate API by name, resource, operation, or tag.
get_api_endpoint_schema	Dynamic meta-tool: get the input schema for a named API endpoint so the agent can construct valid arguments.
invoke_api_endpoint	Dynamic meta-tool: invoke any Replicate API endpoint by name with the arguments matching its schema.
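
On the wire, an MCP client invokes any of these tools with a JSON-RPC tools/call request. As a rough illustration, a call to list_hardware could look like the following. It assumes the tool needs no arguments, and the id value is arbitrary.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_hardware",
    "arguments": {}
  }
}
```

The dynamic meta-tools are called the same way. Their argument names are best read from the schema the server advertises rather than guessed.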

Required configuration

REPLICATE_API_TOKEN (required)
Replicate API token from replicate.com/account/api-tokens. Required for the local stdio server; the remote server obtains it through the browser auth flow.
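
For the local stdio option, a minimal sketch of the equivalent config entry might look like this. It assumes the same mcpServers layout as the remote config on this page and the common command/args/env shape that MCP clients use for stdio servers. The -y flag only skips npx's install prompt, and the token value is a placeholder to replace with your own.

```json
{
  "mcpServers": {
    "replicate": {
      "command": "npx",
      "args": ["-y", "replicate-mcp@latest"],
      "env": {
        "REPLICATE_API_TOKEN": "<your-replicate-api-token>"
      }
    }
  }
}
```

With this entry the token stays in the client's config file on your machine, so the browser authorization step used by the remote endpoint does not apply.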

What you can do with it

Run image, video, and audio models from your agent

Ask the agent to generate an image with Flux or transcribe audio: it searches the catalog with search_models, reads the input schema with get_models_versions, calls create_predictions, and polls get_predictions until the output URL is ready, with no manual API wiring.

Manage fine-tunes and production deployments

Kick off a fine-tune with create_trainings, then stand up a scalable endpoint with create_deployments and run inference against it. The agent can also list and update deployments to tune hardware and instance counts.

FAQ

Is it free?

The MCP server is free and open source, but it calls Replicate's API, which is paid and billed per second of compute (or per run for some official models) against your REPLICATE_API_TOKEN. You only pay for the predictions and trainings you run.

Does it support remote/OAuth?

Yes. The recommended deployment is the remote SSE endpoint at https://mcp.replicate.com/sse, which uses a browser-based authentication flow: you paste your Replicate API token into a web page and the server stores it in Cloudflare KV, never exposing it to the model. You can also run it locally over stdio with npx replicate-mcp and a REPLICATE_API_TOKEN.
