MCP Explained: How AI Agents Connect to Any Tool

June 23, 2026 · AI Agent Internals: How Coding Agents Really Work (part 2)

▶ Watch on YouTube & subscribe to The Stack Underflow

In the previous episode, we established that language models only produce text — it is the orchestrator that actually executes tools. That raises an immediate follow-up: how does the orchestrator know what tools are available, and how does it actually talk to them?

The Model Context Protocol (MCP) is the answer. It is a single, shared protocol that lets any AI host (your editor, your agent) connect to any tool server without writing bespoke integration code for each pair.

The one-sentence version: MCP defines a standard three-role architecture — host, client, server — so that one AI editor can talk to every tool, and one tool server can work with every editor, without anyone writing custom glue code.

The problem MCP solves

Before MCP existed, every AI editor had to implement its own integration for every external tool. Cursor talking to GitHub was custom code. VS Code talking to GitHub was different custom code. Add Postgres and you had yet another round of custom code for each editor.

With three editors and two tools, as in the diagram below, that is six separate integrations. Add a new tool and every editor must write another integration. Add a new editor and it must integrate every existing tool from scratch. The count grows multiplicatively:

Before MCP:

  Cursor      ──── custom code ────▶  GitHub
  Cursor      ──── custom code ────▶  Postgres
  VS Code     ──── custom code ────▶  GitHub
  VS Code     ──── custom code ────▶  Postgres
  Claude Code ──── custom code ────▶  GitHub
  Claude Code ──── custom code ────▶  Postgres

  N editors × M tools = N×M integrations

MCP collapses that matrix to N + M: each editor implements MCP once, each tool implements MCP once, and they all just plug in.
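To make the arithmetic concrete, here is a minimal sketch that counts the work both ways for the editors and tools in the diagram above. The function names are illustrative only and are not part of MCP.

```python
def integrations_without_mcp(editors: list[str], tools: list[str]) -> int:
    """Every editor needs its own bespoke adapter for every tool."""
    return len(editors) * len(tools)


def implementations_with_mcp(editors: list[str], tools: list[str]) -> int:
    """Each editor and each tool implements the protocol exactly once."""
    return len(editors) + len(tools)


editors = ["Cursor", "VS Code", "Claude Code"]
tools = ["GitHub", "Postgres"]

print(integrations_without_mcp(editors, tools))  # 6
print(implementations_with_mcp(editors, tools))  # 5

# The gap widens quickly as the ecosystem grows: 10 editors and 50 tools.
print(10 * 50, "vs", 10 + 50)  # 500 vs 60
```

At small sizes the saving is modest (six versus five); the benefit comes from how the two numbers scale as editors and tools are added.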

The three roles: host, client, server

MCP defines exactly three roles, and the distinction between them matters.

Role	What it is	Examples
Host	The AI application the user runs	Cursor, VS Code, Claude Code
Client	A small connector living inside the host	One client per server connection
Server	A separate program that exposes capabilities	A GitHub MCP server, a Postgres MCP server

The relationship to hold in your head: the host contains clients. The host is not the client. If your editor is connected to five MCP servers, it has five clients running inside it. Each client connects to exactly one server. That one-to-one pairing isolates failures: if one server crashes or hangs, only its own client is affected, and the other connections keep working.
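The containment relationship can be sketched as a small data structure. This is a toy model, not code from any MCP SDK: the `Host` and `Client` classes, their fields, and the server names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """One connector inside the host, bound to exactly one server."""
    server_name: str
    connected: bool = True


@dataclass
class Host:
    """The user-facing application; it owns one Client per server."""
    name: str
    clients: dict[str, Client] = field(default_factory=dict)

    def connect(self, server_name: str) -> Client:
        # One-to-one pairing: reuse the existing client rather than
        # opening a second connection to the same server.
        if server_name not in self.clients:
            self.clients[server_name] = Client(server_name)
        return self.clients[server_name]

    def healthy_servers(self) -> list[str]:
        return [n for n, c in self.clients.items() if c.connected]


host = Host("editor")
for server in ["github", "postgres", "filesystem"]:
    host.connect(server)

# Simulate one server crashing: only its own client is affected.
host.clients["postgres"].connected = False

print(len(host.clients))       # 3
print(host.healthy_servers())  # ['github', 'filesystem']
```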

What a server actually exposes

Most explanations stop at “servers expose tools,” but a server exposes three distinct capability types:

Tools — functions the model can call (e.g., create_github_issue, run_query). These are model-controlled: the LLM decides when to invoke them.
Resources — read-only data sources the application can fetch (e.g., file contents, a database table). These are app-controlled.
Prompts — pre-built prompt templates the user can select. These are user-controlled.

Who controls each capability type determines when it gets used. The model decides when to call a tool. The host pulls resources on the app’s terms; the model does not autonomously fetch them. Prompts are surfaced as user-facing options, not autonomous model decisions.
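As a rough illustration, the three capability types might look like this as plain data. The field names (`name`, `description`, `inputSchema`, `uri`, `mimeType`, `arguments`) follow the shapes used in the MCP specification, but the server, the `summarize_pr` prompt, the README resource, and every value here are made up for the example.

```python
# Hypothetical capability listing for an imaginary GitHub-style server.
capabilities = {
    "tools": [
        {
            "name": "create_github_issue",
            "description": "Open a new issue in a repository",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "repo": {"type": "string"},
                    "title": {"type": "string"},
                },
                "required": ["repo", "title"],
            },
        }
    ],
    "resources": [
        {"uri": "file:///README.md", "name": "README", "mimeType": "text/markdown"}
    ],
    "prompts": [
        {
            "name": "summarize_pr",
            "description": "Summarize a pull request",
            "arguments": [{"name": "pr_number", "required": True}],
        }
    ],
}

# Who decides when each capability type is used.
CONTROLLED_BY = {"tools": "model", "resources": "application", "prompts": "user"}


def visible_to_model(caps: dict) -> list[str]:
    """Only tools are offered to the model for autonomous calls."""
    return [tool["name"] for tool in caps["tools"]]


for kind, items in capabilities.items():
    print(f"{kind}: {len(items)} item(s), controlled by the {CONTROLLED_BY[kind]}")

print(visible_to_model(capabilities))  # ['create_github_issue']
```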

The protocol in action: five steps

Here is what happens from the moment your editor starts up to the moment a tool result lands back in the model’s context.

┌─────────────────────────────────────────┐
│  Host (e.g., VS Code)                   │
│                                         │
│  ┌──────────┐    ┌──────────┐           │
│  │ Client A │    │ Client B │  ...      │
│  └────┬─────┘    └────┬─────┘           │
└───────┼───────────────┼─────────────────┘
        │               │
   ┌────▼────┐     ┌────▼─────┐
   │Server A │     │Server B  │
   │(GitHub) │     │(Postgres)│
   └─────────┘     └──────────┘

Step 1 — Handshake. The host creates a client for each configured server. The client and server exchange protocol versions and declare their capabilities.

Step 2 — Discovery. The client asks: “What do you offer?” The server responds with its lists of tools, resources, and prompts (in the protocol, each type has its own list request, such as tools/list). The host forwards the tool schemas (plain JSON Schema objects, the same format seen in episode 1) to the language model.

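Step 3 — Tool call. The model decides it needs a tool and emits a structured tool call in its output text. The host parses that call and hands it to the client connected to the server that owns the tool, and the client sends it on as a tools/call request.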

Step 4 — Execution. The server actually runs the logic: queries the database, calls the GitHub API, reads a file. This is where real side effects happen.

Step 5 — Result. The result flows back through the client into the model’s context, and the loop continues.
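The five steps can be walked through with an in-memory toy. This is a sketch under stated assumptions, not an implementation of the real SDKs: `toy_server`, `send`, the `toy-db` and `toy-host` names, and the canned query result are all hypothetical, and the version string is just an example. The method names (`initialize`, `tools/list`, `tools/call`) and message fields follow the MCP specification; the real handshake also ends with a `notifications/initialized` message from the client, which is omitted here for brevity.

```python
import json
from typing import Optional

PROTOCOL_VERSION = "2025-06-18"  # example version string, chosen for illustration


def error(request_id: int, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def toy_server(request: dict) -> dict:
    """Handle one JSON-RPC request and return the matching response."""
    method, params = request["method"], request.get("params", {})
    if method == "initialize":
        result = {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "toy-db", "version": "0.1.0"},
        }
    elif method == "tools/list":
        result = {"tools": [{
            "name": "run_query",
            "description": "Run a read-only SQL query",
            "inputSchema": {
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        }]}
    elif method == "tools/call":
        if params.get("name") != "run_query":
            return error(request["id"], -32602, "Unknown tool")
        # Step 4: a real server would query the database here.
        text = f"pretend result of: {params['arguments']['sql']}"
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error(request["id"], -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def send(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Client side: build the JSON-RPC envelope and round-trip it as JSON."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    wire = json.dumps(request)  # what would cross the transport
    return toy_server(json.loads(wire))


# Step 1: handshake
init = send(1, "initialize", {
    "protocolVersion": PROTOCOL_VERSION,
    "capabilities": {},
    "clientInfo": {"name": "toy-host", "version": "0.1.0"},
})
# Step 2: discovery
tools = send(2, "tools/list")["result"]["tools"]
# Step 3: the model's tool call, as the host would have parsed it
model_call = {"name": "run_query", "arguments": {"sql": "SELECT 1"}}
# Steps 4 and 5: execution on the server, result back to the host
reply = send(3, "tools/call", model_call)

print(init["result"]["serverInfo"]["name"])   # toy-db
print([t["name"] for t in tools])             # ['run_query']
print(reply["result"]["content"][0]["text"])  # pretend result of: SELECT 1
```

Note that the model never appears in the code: it only produced the contents of `model_call`, and everything else is host, client, and server work.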

Transport: local vs. remote

All communication uses JSON-RPC 2.0 messages, and MCP supports two transport modes:

STDIO transport — the server runs as a child process on the same machine. The client writes newline-delimited JSON-RPC messages to the server’s stdin and reads responses from its stdout. Simple, with no network configuration needed for local tooling.
HTTP transport — the server runs as an independent process, typically on a remote machine. The client sends JSON-RPC messages as HTTP POST requests, and the server can use Server-Sent Events to stream responses. Same protocol, different wire.

The model has no visibility into which transport is in use. From the LLM’s perspective, a tool call goes in and a result comes back — the plumbing is invisible.
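Here is a minimal sketch of the STDIO idea: a client launches a child process and exchanges one JSON message per line over its pipes. The child below is a hypothetical stand-in that simply echoes the method name back; a real MCP server would implement the full handshake and discovery flow.

```python
import json
import subprocess
import sys

# A stand-in "server": reads one JSON-RPC request per line from stdin and
# writes one response per line to stdout. Hypothetical, for illustration.
CHILD_SERVER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""

# The client launches the server as a child process and owns its pipes.
proc = subprocess.Popen(
    [sys.executable, "-c", CHILD_SERVER],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)


def call(request_id: int, method: str) -> dict:
    """Write one newline-delimited request, then read one response line."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())


response = call(1, "tools/list")
print(response)  # {'jsonrpc': '2.0', 'id': 1, 'result': {'echo': 'tools/list'}}

# Closing stdin ends the child's read loop, so it exits cleanly.
proc.stdin.close()
proc.wait(timeout=5)
```

Swapping this for an HTTP transport would change only how `call` moves bytes; the JSON-RPC messages themselves would stay the same.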

The payoff

Build one MCP server for your company’s internal database. Every MCP-aware editor can use it on day one, with no additional integration work. Build a new MCP-aware editor. Every existing MCP server works immediately.

After MCP:

  Cursor ──────┐
  VS Code ─────┼──▶ MCP ──┬──▶ GitHub MCP Server
  Claude Code ─┘          └──▶ Postgres MCP Server

  N editors + M tools = N+M implementations, not N×M

One protocol. Anywhere.

Common misconceptions

“The host and the client are the same thing.” They are not. The host is the user-facing application (your editor). Clients are small internal connectors the host manages — one per server. The distinction matters when debugging which component is failing.
“MCP servers only expose tools.” Servers expose three capability types: tools (model-controlled calls), resources (app-controlled data), and prompts (user-selectable templates). Ignoring resources and prompts means missing significant parts of the protocol.
“MCP requires a remote server.” Local STDIO servers run as child processes on your machine with no network involved. Many MCP servers are local-only command-line programs.
“The model executes tools directly.” The model emits a structured text output that looks like a tool call. The orchestrator (the host) is what actually invokes the server. The model never runs code itself.

Frequently asked questions

What is the relationship between MCP and the orchestrator from episode 1? The orchestrator and the MCP host are the same component seen from different angles. Episode 1 called the thing that executes tools the “orchestrator.” MCP is the protocol that defines how that orchestrator (now called the “host”) discovers and calls tools via clients and servers.

Do all AI editors support MCP? Not all of them, but MCP is an open protocol and adoption is growing fast. Claude Code, Cursor, and VS Code (with extensions) are among the early adopters. Because the protocol is open, any editor can implement it without asking anyone’s permission.

Can a single host connect to multiple MCP servers at once? Yes — and that is the normal case. The host spins up one client per server. Each client manages its own connection and capability list. The host merges all tool schemas before passing them to the model.
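One way a host might merge tool lists from several clients is sketched below. The `merge_tools` function is hypothetical, and the collision policy (prefixing a duplicate name with its server name) is an assumption made for the example, not something the protocol prescribes.

```python
def merge_tools(
    tools_by_server: dict[str, list[dict]],
) -> tuple[list[dict], dict[str, tuple[str, str]]]:
    """Flatten per-server tool lists and record where each tool came from."""
    merged: list[dict] = []
    # Maps the name shown to the model to (server, original tool name).
    owner: dict[str, tuple[str, str]] = {}
    for server, tools in tools_by_server.items():
        for tool in tools:
            shown = tool["name"]
            if shown in owner:
                # Two servers can expose the same tool name. Prefixing is
                # one possible policy, assumed here for illustration.
                shown = f"{server}__{tool['name']}"
            owner[shown] = (server, tool["name"])
            merged.append({**tool, "name": shown})
    return merged, owner


tools_by_server = {
    "github": [{"name": "create_github_issue"}, {"name": "search"}],
    "postgres": [{"name": "run_query"}, {"name": "search"}],
}

merged, owner = merge_tools(tools_by_server)

print([t["name"] for t in merged])
# ['create_github_issue', 'search', 'run_query', 'postgres__search']
print(owner["postgres__search"])  # ('postgres', 'search')
```

The `owner` map is what lets the host route a model's tool call back to the right client and restore the original tool name before sending it to the server.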

Is JSON-RPC a new or unusual choice? No. JSON-RPC is a lightweight, well-understood remote procedure call spec that has been around since 2005, and MCP uses version 2.0 of it. It is a deliberate choice for simplicity: plain JSON requests and responses with a method name and optional params. No binary framing, no specialized tooling required to debug it.
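To show how little there is to the envelope, here is a small sketch that builds and classifies JSON-RPC 2.0 messages using only the standard library. The helper names are hypothetical; the message shapes (an `id` on requests, no `id` on notifications, `result` or `error` on responses) come from the JSON-RPC 2.0 spec.

```python
import json
from typing import Any, Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> str:
    """A request carries an id, so the sender expects a response."""
    msg: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


def make_notification(method: str) -> str:
    """A notification has no id, so no response is sent."""
    return json.dumps({"jsonrpc": "2.0", "method": method})


def classify(raw: str) -> str:
    """Tell the four JSON-RPC 2.0 message shapes apart."""
    msg = json.loads(raw)
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    return "error response" if "error" in msg else "success response"


ok = '{"jsonrpc": "2.0", "id": 1, "result": {}}'
failed = '{"jsonrpc": "2.0", "id": 1, "error": {"code": -32601, "message": "Method not found"}}'

print(make_request(1, "tools/list"))
# {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
print(classify(make_request(1, "tools/list")))                   # request
print(classify(make_notification("notifications/initialized")))  # notification
print(classify(ok))                                              # success response
print(classify(failed))                                          # error response
```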

Where this fits in the series

This is episode 2 of “How Claude Actually Works” (published here as “AI Agent Internals: How Coding Agents Really Work”). Episode 1 established that the model only produces text and the orchestrator executes tools. This episode explained MCP, the protocol that wires those tools up. Episode 3 covers the agent loop (propose, review, apply, repeat), the cycle that turns individual tool calls into multi-step autonomous work.

Browse all tutorials in the series to follow the full arc from raw token generation to production coding agents.
