Hosted arXiv MCP alternatives

The arXiv MCP server runs locally over stdio; there is no managed endpoint. If you would rather add a server by URL and authenticate over a hosted connection, with nothing to install, the servers below all work that way.

They are general web search and scraping servers, not paper indexes. None download arXiv preprints the way the arXiv server does, but they reach the open web over a managed connection, which is what most research beyond arXiv needs.
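To make the stdio-versus-URL distinction concrete, here is a minimal Python sketch of the two entry shapes a client config might hold. It assumes the `mcpServers` layout that many MCP clients use, though exact keys vary by client. The command name, the URL, and the authorization header are placeholders, not real values for the arXiv server or for any server on this page.

```python
import json

# Hypothetical stdio entry: the client launches the server as a local process.
# "example-arxiv-mcp" is a placeholder command, not the real launch line.
STDIO_ENTRY = {"command": "example-arxiv-mcp", "args": []}

# Hypothetical hosted entry: the client connects to a remote endpoint by URL.
# The URL and header are placeholders; check each provider's setup page.
HOSTED_ENTRY = {
    "url": "https://example.com/mcp",
    "headers": {"Authorization": "Bearer <YOUR_API_KEY>"},
}


def transport_of(entry: dict) -> str:
    """Classify an entry as 'stdio' (local process) or 'http' (hosted URL)."""
    if "url" in entry:
        return "http"
    if "command" in entry:
        return "stdio"
    raise ValueError("entry needs either 'url' or 'command'")


def build_config(servers: dict) -> str:
    """Serialize named server entries under an assumed 'mcpServers' key."""
    return json.dumps({"mcpServers": servers}, indent=2)


if __name__ == "__main__":
    servers = {"arxiv": STDIO_ENTRY, "web-search": HOSTED_ENTRY}
    for name, entry in servers.items():
        print(name, "->", transport_of(entry))
    print(build_config(servers))
```

The local entry needs something installed on your machine before the client can start it; the hosted entry needs only a URL and credentials, which is the trade-off this page is about.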

The 8 best hosted alternatives

Firecrawl · Official · 6,500
Firecrawl's official server runs hosted and turns any website into clean, LLM-ready data through scrape, crawl, map, search, and extract. The managed way to pull full web pages into research.
Set up Firecrawl →
Exa · Official · 4,511
Exa's official server offers a hosted endpoint for neural web search and clean full-page content built for LLMs. A strong fit for finding sources by meaning across the open web.
Set up Exa →
Bright Data · Official · 2,426
Hosted and built to get past blocks, CAPTCHAs, and geo-restrictions, Bright Data's official server does reliable web search and scraping. The managed pick for sources that resist scraping.
Set up Bright Data →
Tavily · Official · 2,100
Tavily's official server, built for AI, runs hosted and offers real-time web search with page extraction, crawling, and mapping. A balanced research feed over a URL.
Set up Tavily →
Apify · Official · 1,300
Apify offers a hosted server exposing 6,000+ Actors plus run, dataset, and store tools for scraping and automating the web. A fit for structured extraction at scale with nothing to run.
Set up Apify →
Jina AI · Official · 702
Web search, URL-to-markdown reading, reranking, and embeddings-powered tools all run on Jina AI's remote server. A hosted fit when research results feed an embedding or reranking step.
Set up Jina AI →
SerpApi · Official · 141
Pure search results: SerpApi's official server returns structured output from Google, Bing, and dozens of other engines through one tool. The hosted pick for clean SERP data to widen a search.
Set up SerpApi →
ScrapingBee · Official
ScrapingBee's hosted server converts pages to text or HTML, takes screenshots, extracts data, runs web search, and returns Amazon, Walmart, and YouTube data. Broad scraping over a managed endpoint.
Set up ScrapingBee →

How to choose

None of these read arXiv's preprint corpus, so they widen research beyond papers rather than replace the arXiv server. For pulling web pages, Firecrawl and ScrapingBee fit; Exa is the neural search pick; Tavily combines real-time search with page extraction; Bright Data and Apify scrape at scale; SerpApi returns clean SERP data; Jina AI adds reranking and embeddings. Keep arXiv for papers and add one of these for the open web, all over a managed URL.
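The guidance above can be sketched as a small lookup in Python. The pairings come from this page's own descriptions; the need labels (`full_pages`, `neural_search`, and so on) and the function name are made up for this illustration and are not part of any server's API.

```python
# Need -> hosted picks, paraphrased from this page. The keys are
# hypothetical labels chosen for this sketch.
PICKS = {
    "full_pages": ["Firecrawl", "ScrapingBee"],
    "neural_search": ["Exa"],
    "search_and_extract": ["Tavily"],
    "scrape_at_scale": ["Bright Data", "Apify"],
    "serp_data": ["SerpApi"],
    "rerank_embed": ["Jina AI"],
}


def pick_servers(needs: list[str]) -> list[str]:
    """Return the arXiv server plus the hosted picks for each need.

    The arXiv server always comes first because none of the hosted
    servers read arXiv's preprint corpus. Duplicates are dropped and
    first-seen order is kept.
    """
    chosen = ["arXiv (self-hosted)"]
    for need in needs:
        if need not in PICKS:
            raise KeyError(f"unknown need: {need!r}")
        for name in PICKS[need]:
            if name not in chosen:
                chosen.append(name)
    return chosen


if __name__ == "__main__":
    print(pick_servers(["neural_search", "serp_data"]))
```

Keeping the arXiv server in every result mirrors the advice to add a hosted server alongside it rather than swap it out.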

FAQ

Is the arXiv MCP server hosted or self-hosted?

Self-hosted. The arXiv server runs locally over stdio, with no managed remote endpoint. The servers on this page are hosted, so you connect by URL with nothing to install, though none of them index arXiv papers.

Which hosted alternative is closest to arXiv?

None match arXiv's paper-download tools. For research over a managed connection, Exa comes closest for finding relevant sources by meaning, while Firecrawl and Tavily are best for pulling full page content from the open web rather than preprints.
