Replicate MCP server

Official · Replicate · Config last verified Jun 1, 2026

Replicate's official MCP server: discover, compare, and run thousands of hosted AI models — image, video, audio, and language — straight from your agent.

The Replicate MCP server is Replicate's official integration that puts its entire HTTP API behind an agent's tool calls. Replicate hosts thousands of community and official models — image generators like Flux and SDXL, video models, speech and music models, upscalers, and language models — each callable as a prediction. This server lets a coding agent search the catalog, read a model's schema, run a prediction, poll for the result, and manage trainings, deployments, files, and collections, all without leaving the editor or chat.

The recommended deployment is the hosted remote endpoint at https://mcp.replicate.com/sse, an SSE transport that runs an OAuth-style web flow: the first time you connect, you paste a Replicate API token into a browser page and the server stores it in Cloudflare KV, acting as a trusted intermediary so the token is never exposed to the model. You can also run it locally over stdio with npx replicate-mcp@latest and a REPLICATE_API_TOKEN environment variable. To keep the context window small, the server exposes a static set of common tools plus three dynamic meta-tools (list_api_endpoints, get_api_endpoint_schema, invoke_api_endpoint) that let the agent reach any endpoint in the API on demand.

Quick install

Copy-paste configs are provided for all 8 supported clients; the one below is for Claude Code, which reads ~/.claude.json.

Add to ~/.claude.json:

```json
{
  "mcpServers": {
    "replicate": {
      "type": "sse",
      "url": "https://mcp.replicate.com/sse"
    }
  }
}
```

Or via the CLI:

```bash
claude mcp add --transport sse replicate https://mcp.replicate.com/sse
```

Heads up

  • First tool call opens a browser to authorize.

Available tools

| Tool | Description |
| --- | --- |
| get_account | Return information about the user or organization associated with the provided API token. |
| list_collections | List the collections of models featured on Replicate, as a paginated list of collection objects. |
| get_collections | Get a single collection of models by slug, including the nested list of models in that collection. |
| list_hardware | List the available hardware SKUs (CPU and GPU types) for running models and trainings. |
| search_models | Get a list of public models matching a search query, ranked by relevance. |
| list_models | Get a paginated list of public models on Replicate. |
| get_models | Get the metadata for a public model by owner and name. |
| create_models | Create a new model on Replicate under your account or organization. |
| delete_models | Delete a model you own. The model must have no versions and no predictions. |
| get_models_readme | Get the README content (Markdown) for a model. |
| list_models_examples | List example predictions made using the model, useful for understanding typical inputs and outputs. |
| create_models_predictions | Create a prediction using an official model, passing the inputs you provide. |
| list_models_versions | List the versions of a model, sorted with the most recent version first. |
| get_models_versions | Get the metadata and input/output schema for a specific model version. |
| delete_models_versions | Delete a model version and all associated predictions, including all output files. |
| create_predictions | Create a prediction for the model version and inputs you provide. |
| get_predictions | Get the current state of a prediction, including outputs once it has completed. |
| list_predictions | Get a paginated list of all predictions created by the account associated with the API token. |
| cancel_predictions | Cancel a prediction that is currently running. |
| create_trainings | Start a new training of the model version you specify, to fine-tune a model. |
| get_trainings | Get the current state of a training. |
| list_trainings | Get a paginated list of all trainings created by the account associated with the API token. |
| cancel_trainings | Cancel a training that is currently running. |
| create_deployments | Create a new deployment with a chosen model version, hardware, and min/max instances for scaling. |
| get_deployments | Get information about a deployment by name, including its current release. |
| list_deployments | List the deployments associated with the current account, including the latest release configuration for each. |
| update_deployments | Update properties of an existing deployment, including hardware, min/max instances, and the underlying model version. |
| delete_deployments | Delete a deployment. |
| create_deployments_predictions | Create a prediction for the deployment and inputs you provide. |
| create_files | Upload a file to Replicate so it can be used as input to a prediction. |
| get_files | Get the details of a file you have uploaded. |
| list_files | Get a paginated list of all files created by the account associated with the API token. |
| download_files | Download a file by providing the file owner, access expiry, and a valid signature. |
| delete_files | Delete a file. Subsequent requests to the file resource return 404 Not Found. |
| get_default_webhooks_secret | Get the signing secret for the default webhook endpoint, used to verify webhook requests come from Replicate. |
| list_api_endpoints | Dynamic meta-tool: list or search every endpoint in the Replicate API by name, resource, operation, or tag. |
| get_api_endpoint_schema | Dynamic meta-tool: get the input schema for a named API endpoint so the agent can construct valid arguments. |
| invoke_api_endpoint | Dynamic meta-tool: invoke any Replicate API endpoint by name with the arguments matching its schema. |
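
The three dynamic meta-tools suggest a discover, inspect, invoke sequence. The sketch below illustrates that sequence in Python. It assumes a hypothetical `call_tool(name, arguments)` function standing in for whatever MCP client your agent uses; the argument names (`search_query`, `endpoint`, `endpoint_name`, `args`) and the canned responses are illustrative assumptions, not the server's documented schema.

```python
from typing import Any, Callable

# Hypothetical signature for "call an MCP tool by name with arguments".
ToolCaller = Callable[[str, dict[str, Any]], Any]


def invoke_via_meta_tools(call_tool: ToolCaller, query: str, args: dict[str, Any]) -> Any:
    """Discover an endpoint, check its schema, then invoke it."""
    # 1. Discover: ask the server which endpoints match the query.
    endpoints = call_tool("list_api_endpoints", {"search_query": query})
    if not endpoints:
        raise LookupError(f"no endpoint matches {query!r}")
    name = endpoints[0]["name"]

    # 2. Inspect: fetch the input schema and check required arguments locally.
    schema = call_tool("get_api_endpoint_schema", {"endpoint": name})
    missing = [key for key in schema.get("required", []) if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    # 3. Invoke: call the endpoint by name with the checked arguments.
    return call_tool("invoke_api_endpoint", {"endpoint_name": name, "args": args})


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; returns canned data for the demo."""
    if name == "list_api_endpoints":
        return [{"name": "get_account"}]
    if name == "get_api_endpoint_schema":
        return {"type": "object", "properties": {}, "required": []}
    if name == "invoke_api_endpoint":
        return {"called": arguments["endpoint_name"], "with": arguments["args"]}
    raise KeyError(name)


result = invoke_via_meta_tools(fake_call_tool, "account", {})
print(result)
```

Fetching the schema only for the endpoint that is about to be called is what keeps the context window small: the agent never has to load every endpoint's schema up front.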

Required configuration

  • REPLICATE_API_TOKEN (required)

    Replicate API token from replicate.com/account/api-tokens. Required for the local stdio server; the remote server obtains it through the browser auth flow.
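
For the local stdio server, a client entry might look like the output of the sketch below. It assumes your client accepts the common `command` / `args` / `env` shape for stdio servers under `mcpServers`; that shape and the `-y` flag passed to npx are assumptions to check against your client's documentation, and the token value is a placeholder.

```python
import json


def stdio_server_entry(token: str) -> dict:
    """Build an mcpServers entry that launches the server locally over stdio."""
    return {
        "mcpServers": {
            "replicate": {
                "command": "npx",
                # -y lets npx install the package without prompting (assumed desirable here).
                "args": ["-y", "replicate-mcp@latest"],
                "env": {"REPLICATE_API_TOKEN": token},
            }
        }
    }


# Placeholder only: substitute a real token from replicate.com/account/api-tokens.
config = stdio_server_entry("<your-replicate-api-token>")
print(json.dumps(config, indent=2))
```

Because the token sits in the client config in this mode, treat that file as a secret and keep it out of version control.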

What you can do with it

Run image, video, and audio models from your agent

Ask the agent to generate an image with Flux or transcribe audio: it searches the catalog with search_models, reads the version schema, calls create_predictions, and polls get_predictions until the output URL is ready — no manual API wiring.
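
The create-then-poll part of that flow can be sketched as follows. This is a minimal illustration, not the server's implementation: `call_tool` is a hypothetical stand-in for your MCP client, the argument names (`version`, `input`, `prediction_id`) are assumed to mirror the Replicate HTTP API, and `FakeReplicate` returns canned responses so the example runs offline.

```python
import time
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]

# Prediction statuses after which polling should stop.
TERMINAL = {"succeeded", "failed", "canceled"}


def run_prediction(call_tool: ToolCaller, version: str, inputs: dict[str, Any],
                   poll_seconds: float = 1.0, max_polls: int = 60,
                   sleep: Callable[[float], None] = time.sleep) -> Any:
    """Create a prediction, then poll until it reaches a terminal status."""
    prediction = call_tool("create_predictions", {"version": version, "input": inputs})
    for _ in range(max_polls):
        if prediction["status"] in TERMINAL:
            break
        sleep(poll_seconds)
        prediction = call_tool("get_predictions", {"prediction_id": prediction["id"]})
    if prediction["status"] not in TERMINAL:
        raise TimeoutError(f"prediction {prediction['id']} still {prediction['status']}")
    if prediction["status"] != "succeeded":
        raise RuntimeError(f"prediction ended as {prediction['status']}")
    return prediction["output"]


class FakeReplicate:
    """Canned responses: starting, then processing, then succeeded."""

    def __init__(self) -> None:
        self._statuses = iter(["starting", "processing", "succeeded"])

    def __call__(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        status = next(self._statuses)
        output = ["https://example.com/out.png"] if status == "succeeded" else None
        return {"id": "pred-123", "status": status, "output": output}


output = run_prediction(FakeReplicate(), "hypothetical-version-id",
                        {"prompt": "a red fox"}, sleep=lambda seconds: None)
print(output)
```

The `max_polls` cap is a design choice for the sketch: it bounds how long an agent waits on a slow model, at which point cancel_predictions is an option.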

Manage fine-tunes and production deployments

Kick off a fine-tune with create_trainings, then stand up a scalable endpoint with create_deployments and run inference against it. The agent can also list and update deployments to tune hardware and instance counts.
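
As a rough illustration of what a create_deployments call needs, the sketch below assembles and sanity-checks the arguments before handing them to the tool. The field names are assumed to mirror the Replicate HTTP API's deployment body, and every value shown is a hypothetical placeholder; confirm the real schema with get_api_endpoint_schema and pick a real SKU from list_hardware.

```python
from typing import Any


def deployment_args(name: str, model: str, version: str, hardware: str,
                    min_instances: int, max_instances: int) -> dict[str, Any]:
    """Build and sanity-check arguments for a create_deployments call."""
    if min_instances < 0:
        raise ValueError("min_instances must be >= 0")
    if max_instances < max(min_instances, 1):
        raise ValueError("max_instances must be >= 1 and >= min_instances")
    return {
        "name": name,
        "model": model,
        "version": version,
        "hardware": hardware,
        "min_instances": min_instances,
        "max_instances": max_instances,
    }


args = deployment_args(
    name="my-fox-generator",            # hypothetical deployment name
    model="your-username/your-model",   # hypothetical owner/name
    version="hypothetical-version-id",  # e.g. the version a finished training produced
    hardware="hypothetical-gpu-sku",    # replace with a SKU returned by list_hardware
    min_instances=0,
    max_instances=2,
)
print(args)
```

Checking the instance bounds locally catches an inverted min/max before the request is sent; the same two fields are what update_deployments would later change to tune scaling.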

FAQ

Is it free?
The MCP server itself is free and open source, but every tool call goes to Replicate's API, which is paid: usage is billed per second of compute (or per run for some official models) to the account that owns your REPLICATE_API_TOKEN. You only pay for the predictions and trainings you run.
Does it support remote/OAuth?
Yes. The recommended deployment is the remote SSE endpoint at https://mcp.replicate.com/sse, which uses a browser-based authentication flow: you paste your Replicate API token into a web page and the server stores it in Cloudflare KV, never exposing it to the model. You can also run it locally over stdio with npx replicate-mcp@latest and a REPLICATE_API_TOKEN environment variable.