Best MCP servers for LLM gateways
Routing to the right model is its own problem: you want to compare candidates, validate model IDs, fall back when one provider is down, and keep your code provider-agnostic. LLM gateway servers let an agent browse model catalogs, check what is available, and reach many backends through one interface instead of hard-coding a single vendor. The right pick depends on your goal: a unified router across hundreds of hosted models, a provider that combines inference with a deep catalog of models and datasets, or a platform for operating your own deployed models. For agents that build other agents, being able to discover and validate models programmatically is the difference between a working pipeline and a 404. The servers below cover the common gateway shapes; each is a real MCP server with a verified, current install config.
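The validate-then-fall-back pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any server listed here: `route_with_fallback`, `NoModelAvailable`, the `call_model` callable, and the `vendor-a/large`-style model IDs are all hypothetical names chosen for the example, and the catalog is assumed to be a plain set of ID strings that the agent has already fetched.

```python
from __future__ import annotations

from typing import Callable, Iterable


class NoModelAvailable(RuntimeError):
    """Raised when every candidate is unknown to the catalog or fails."""


def route_with_fallback(
    prompt: str,
    candidates: Iterable[str],
    catalog: set[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try candidates in order; return (model_id, reply) from the first that works.

    `call_model(model_id, prompt)` is a hypothetical provider-agnostic callable
    supplied by the caller; it is expected to raise on provider errors.
    """
    errors: list[str] = []
    for model_id in candidates:
        # Validate the ID first so a typo never reaches the provider.
        if model_id not in catalog:
            errors.append(f"{model_id}: not in catalog")
            continue
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # provider down, rate-limited, etc.
            errors.append(f"{model_id}: {exc}")
    raise NoModelAvailable("; ".join(errors))


def fake_call(model_id: str, prompt: str) -> str:
    """Stand-in backend for the demo: one provider is 'down'."""
    if model_id == "vendor-a/large":
        raise ConnectionError("provider down")
    return f"[{model_id}] {prompt}"


demo_catalog = {"vendor-a/large", "vendor-b/small"}
demo_result = route_with_fallback(
    "hi",
    ["typo/model", "vendor-a/large", "vendor-b/small"],
    demo_catalog,
    fake_call,
)
```

In the demo, the misspelled ID is skipped without a network call, the failing provider is caught, and the third candidate answers. Keeping `call_model` as an injected callable is what keeps the calling code provider-agnostic.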
OpenRouter
heltonteixeira (community)
Community OpenRouter MCP server: chat with 300+ language models through one unified API, search the model catalog, and validate model IDs from your agent.
A community OpenRouter server lets an agent chat with 300+ language models through one unified API, search the model catalog, and validate model IDs. It is the canonical multi-provider router.
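As a rough illustration of what validating a model ID can involve, the sketch below checks an ID against a catalog and suggests near matches for typos. It assumes the catalog arrives as a JSON-like object with a `data` list whose entries carry an `id` field; that shape, the sample IDs, and the helper names `catalog_ids` and `validate_model_id` are assumptions for this example, not the documented interface of this server.

```python
from __future__ import annotations

import difflib


def catalog_ids(payload: dict) -> list[str]:
    """Extract model IDs from a catalog payload (assumed shape: {"data": [{"id": ...}]})."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def validate_model_id(model_id: str, ids: list[str]) -> tuple[bool, list[str]]:
    """Return (is_valid, suggestions). Suggestions are empty when the ID is valid."""
    if model_id in ids:
        return True, []
    # Fuzzy-match against known IDs so the agent can self-correct a typo.
    return False, difflib.get_close_matches(model_id, ids, n=3, cutoff=0.6)


# Hypothetical sample payload used only for the demo.
sample_payload = {
    "data": [
        {"id": "openai/gpt-4o"},
        {"id": "openai/gpt-4o-mini"},
        {"id": "meta-llama/llama-3-8b-instruct"},
    ]
}

known_ids = catalog_ids(sample_payload)
ok, _ = validate_model_id("openai/gpt-4o", known_ids)
bad_ok, suggestions = validate_model_id("openai/gpt4o", known_ids)
```

Returning suggestions instead of a bare boolean gives the agent something to act on when an ID is slightly wrong, which is usually cheaper than discovering the mistake through a failed request.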
Hugging Face
Hugging Face
Hugging Face's official MCP server: search and explore models, datasets, Spaces, papers, and docs from your AI assistant.
Hugging Face's official server searches and explores models, datasets, Spaces, papers, and docs. It is the broadest catalog for discovering which model to use in the first place.
Together AI
Manas Bharadwaj
Community MCP server for Together AI image generation: create high-quality images with the FLUX.1 Schnell model straight from your agent.
A community Together AI server generates high-quality images with FLUX.1 Schnell, giving you a fast hosted-inference option from a major open-model provider.
Baseten
Baseten
Baseten's official MCP servers give an agent live access to your model deployments and Baseten's docs: deploy, call, and operate models from your editor.
Baseten's official servers give an agent live access to your model deployments and Baseten's docs so it can deploy, call, and operate models. They are the best fit when you run your own endpoints.