For a while, we built the smartest things humanity had ever made and then left them sitting in a sealed room.

That is not a metaphor I am reaching for. It is close to literal. A frontier model in 2024 could reason about your codebase, draft your email, explain a legal clause, and it could do exactly none of it to your actual codebase, your actual inbox, your actual contract, unless someone hand-wired a custom bridge for that one task. The model could think. It could not reach. Every time you wanted it to touch one more real thing in the world, an engineer had to go build a one-off connector, by hand, again.

I felt this personally. I have built tools that wrap language models, and the part that hurt was never the intelligence. It was the plumbing. The model was ready. The world was right there. And between them sat a pile of bespoke glue code that broke every time an API shifted under it.

MCP is the thing that fixed the reaching. And once you understand it, a lot of the AI world stops looking like magic and starts looking like something you could have designed yourself. So let me take you through it slowly, the way I wish someone had taken me.

The one-sentence version

MCP, the Model Context Protocol, is an open standard for connecting AI applications to the outside world: your files, your databases, your tools, your APIs.

The analogy the people who made it use is the right one, so I will keep it: MCP is a USB-C port for AI. Before USB-C, every device had its own charger, its own cable, its own little incompatible plug, and a drawer in your house was full of them. USB-C said: one shape, everything fits. MCP is that, for the connection between a model and the things it needs to touch.

The Modelcan think
MCP
the plug
The Worldfiles, APIs, tools
The whole point in one picture. The intelligence was never the bottleneck. The standard, reliable connection between intelligence and reality was. MCP is that connection.

“But we already had APIs. Why not just use REST?”

This is the question everyone asks, and it is a good one. I asked it too. The honest answer is that comparing MCP to REST is almost a category error, they live at different layers and were built for different kinds of caller. Let me make that concrete instead of hand-wavy.

A REST API was designed for a human developer writing deterministic code. You, the developer, read the docs once, at design time. You learn that GET /users/42/orders exists. You hard-code that call. The program runs the exact path you wrote, every time, forever. REST assumes the caller already knows the map before the journey starts.

An AI agent is a different kind of caller entirely. It is probabilistic. It does not have your code memorized; it figures out what to do at runtime, from intent. So it needs things REST never promised to give it:

REST APIMCP
Built forHuman developers, fixed codeAI agents, runtime reasoning
DiscoveryYou read the docs at design timeThe agent asks tools/list at runtime and gets a live menu
StateStateless. Each call forgets the last.Stateful session. Context carries across steps.
SchemaDescribed in docs for a human to readSelf-describing, machine-readable, handed to the model directly
Adding a featureShip new docs, hope clients update their codeServer announces it; agent discovers it instantly
Not better or worse. Different layers. REST serves code that already knows what it wants. MCP serves a model that has to find out what it can do, while it is doing it.

The cleanest way I can put it: REST tells you what to do if you already read the manual. MCP hands the model the manual and lets it read on the spot. That single shift, discovery at runtime instead of design time, is most of why MCP exists.

And here is the part people miss: MCP usually does not replace your REST API. Most MCP servers are thin translators sitting on top of an existing REST backend, adding the AI-friendly layer. Your API keeps doing its job. MCP just makes it legible to a model.

The problem it really solves: the tangle

There is a deeper reason a standard had to exist, and it is the most convincing argument of all. Picture it honestly.

You have M different AI applications (Claude, Cursor, your own agent). You have N different systems you want them to reach (GitHub, Postgres, Slack, your internal API). Without a standard, connecting them means building a custom bridge for every pair. That is M times N pieces of fragile glue, each one a thing that can rot. People call it the N×M problem, and it is exactly the drawer full of incompatible chargers.

Without a standard: N×M glue

Claude→ custom →GitHub
Claude→ custom →Slack
Cursor→ custom →GitHub
Cursor→ custom →Postgres
Agent→ custom →Slack
…and on, and on, each one breakable

With MCP: one plug each

ClaudeMCP
CursorMCP
AgentMCP
MCPGitHub
MCPSlack, Postgres…
build a server once, every client uses it
Build an MCP server for GitHub once, and every MCP-speaking app can use it. Write a client once, and it can talk to every MCP server in existence. The tangle collapses into a hub.

The three players: host, client, server

Before the moving parts, the cast. This trips people up because the words sound interchangeable, so let me pin them down exactly.

  • The host is the AI application you actually use. Claude Desktop, Cursor, VS Code, your own agent. It is the thing in charge.
  • The server is a program that exposes some slice of the world: a filesystem, a database, the Sentry API. It can run locally on your machine or remotely on someone’s platform.
  • The client is the quiet middleman. The host spins up one client per server, and that client holds the dedicated connection to it. Two servers connected means two clients inside the host.

So when VS Code connects to a GitHub server and a Postgres server, it is running two clients, one married to each server. That one-client-per-server detail is the thing to hold onto.

How a connection actually works, start to finish

Here is where it stops being abstract. Under the hood, MCP is just structured messages going back and forth in a format called JSON-RPC 2.0. Plain text, request and response. Watch one full handshake. This is the real sequence, simplified to its bones:

1
Initialize, the handshakeClient and server greet each other and negotiate what each can do. initialize → "I support tools and resources." This is capability negotiation; nobody assumes, everybody declares.
2
Discover, "what can you do?"The client asks tools/list. The server replies with a live menu: every tool, its description, and the exact shape of input it expects. The model now knows its options.
3
Call, "do this one"The model decides. The client sends tools/call with the tool name and arguments. The server runs the real work and returns the result as content the model can read.
4
Notify, "things changed"If the server's tools change mid-session, it can push notifications/tools/list_changed. No request needed. The client refreshes its menu. The connection stays alive and current.
Greet, discover, call, stay in sync. The magic step is the second one: the agent learns what it can do by asking, at runtime, not by you hard-coding it months earlier. That is the whole difference from REST, captured in one message.

The three things a server can offer

When you build an MCP server, you are not just exposing functions. The protocol gives you three distinct kinds of thing to offer, and choosing the right one is the actual craft. They are the three primitives.

tools Tools Functions the model can call to do something with a side effect. Query a database, send a message, create a file. "Take an action."
resources Resources Read-only data the model can pull in for context. A file's contents, a row, an API response, a schema. No side effects, just knowledge. "Here, read this."
prompts Prompts Reusable templates the server hands over: a structured code-review flow, a tuned query pattern. Pre-built ways to use the rest well. "Try it like this."
Tools act. Resources inform. Prompts guide. A database server, for instance, might expose a tool to run queries, a resource holding the schema, and a prompt with good example queries baked in.

There is a quieter, more elegant half to this that almost nobody mentions: the client has primitives too. The server can ask the client to do things back. It can request sampling (ask the host’s model to think about something, without the server needing its own model), elicitation (ask the human a clarifying question or for confirmation), and logging. So it is genuinely two-way. The server is not just a vending machine; it can tap the model and the user on the shoulder when it needs them. That bidirectionality is one of the prettiest design choices in the whole thing.

And notice who is in control of each piece, because it is deliberate:

Tools
the model decides when to call
Resources
the app picks what to load
Prompts
the user usually invokes
Elicitation
the human answers
Control is spread on purpose. The model drives tool calls, the application governs context, the human stays in the loop for prompts and confirmations. No single party runs away with it.

How you actually build one

Here is the part that surprised me most: building a server is small. The SDKs (Python, TypeScript, Java, Kotlin, C#, Go, Ruby, and more) do the protocol grunt work, the JSON-RPC, the lifecycle, the message framing, so you write almost nothing but your actual logic.

In Python, a tool is, honestly, just a decorated function. You write a normal function with type hints and a docstring, and the SDK turns it into a fully described MCP tool, because the description and input schema the model needs are generated from your hints and docstring. You write a function; you get a tool.

weather_server.py
from mcp.server.fastmcp import FastMCP mcp = FastMCP("weather") @mcp.tool() async def get_forecast(latitude: float, longitude: float) -> str: """Get the weather forecast for a location.""" # your real logic: call an API, return text return fetch_forecast(latitude, longitude) if __name__ == "__main__": mcp.run(transport="stdio")
A working MCP server, near enough. The decorator registers the tool. The type hints become its input schema. The docstring becomes the description the model reads. The last line starts it talking over stdio. That is the shape, in any SDK, the names just change.

That last line names the transport, the channel the messages travel over, and there are two that matter:

  • stdio: the server runs as a local process on your own machine and talks through standard input and output. No network, no latency, perfect for “a server that reads my local files.” This is the default for local servers in Claude Desktop and Claude Code.
  • Streamable HTTP: the server runs remotely as a real web service, reachable over HTTP, optionally streaming results back. This is how a company exposes an official MCP server to the world, with proper authentication (OAuth) on top.

Same exact JSON-RPC messages either way. The transport is just the pipe; the conversation inside it is identical. That clean separation is why a server you wrote for local stdio can later be served remotely with barely a change.

Where this actually shows up: real scenarios

Concepts settle once you see them carrying weight. Here is MCP doing real work:

ScenarioWhat the server exposesWhat you get
Coding assistant on your repoTools to read/edit files, run tests; a resource of the project structureThe agent works in your real codebase, not a copy you pasted in
Chat over your databaseA query tool, the schema as a resource, example queries as a prompt"Show me last quarter's churn" becomes a real, safe query
Design to codeA tool that reads a Figma file's structureThe model generates a web app from the actual design, not a screenshot
Personal assistantCalendar and notes servers (Google Calendar, Notion)"What's my week look like, and draft replies" against your real life
Ops and debuggingAn error-tracking server (like Sentry), remote over HTTPThe agent pulls live incidents and reasons about them in context
Every row is the same story: the model stops guessing from stale pasted text and starts working against the live thing. That is the difference between a clever chatbot and an agent that gets work done.

There is a cost story here worth seeing too. With REST, an agent that needs a user’s order status might call three endpoints (get_user, get_orders, get_shipments) and stitch them together, three round-trips, each one burning tokens and time. A well-designed MCP tool, track_order(email), returns the whole answer in one call. For a human writing code, three calls is nothing. For an agent reasoning step by step, every extra call is real money and real latency:

Three granular REST calls, stitched by the agentmore steps, more tokens
One outcome-shaped MCP toolone round-trip
A lesson the protocol quietly teaches: design tools around outcomes the agent wants, not around your database tables. Agentic iteration is expensive in a way that ordinary code is not, so fewer, smarter tools beat many tiny ones.

The part you must not skip: trust

I will be plain about this, because it matters. An MCP server is something you connect to your AI, and through that AI, to your data and your machine. A malicious or careless server can be handed real reach. Connecting a server you have not vetted is like running a program you downloaded from a stranger, because that is essentially what it is.

So: prefer official servers and ones you wrote yourself. Be especially wary of any server that pulls in content from the open internet, because that content can carry instructions of its own, the prompt-injection problem riding in through the side door. The convenience of MCP is exactly that it lets a model act. Anything that can act on your behalf is something you have to be able to trust. Treat installing a server with the same seriousness you treat installing software, because it is.

Why this one is worth understanding deeply

Most things in the AI stack are getting more complicated. MCP is one of the rare ones that made things simpler, and it did it the way good standards always do: by picking a small, clear contract and getting everyone to agree on it. A host, a client, a server. Three primitives a server can offer. Messages in plain JSON-RPC. Discover at runtime, call when needed, stay in sync. That is the whole spine of it.

Once you hold that spine, the future stops looking like sorcery. The next time an AI app reaches into your calendar, edits your repo, or queries your database and hands you back the answer, you will know there is no magic in it. There is a model that learned to ask “what can I do here?”, a server that answered honestly, and a small, well-designed plug between them, finally letting the thing that could always think, reach out and touch the world.

← Back to blog