If you’ve spent time wiring up AI agents to external systems, you’ve probably hit the same wall I did: every integration is a one-off. You write a custom function, shove it into a tool definition, hope the schema is right, debug JSON blobs, and repeat. It works, but it doesn’t compose. Nothing is reusable. Every new agent starts from scratch.

MCP — the Model Context Protocol — is Anthropic’s answer to that. It’s a standardized way for language models to talk to external tools, data sources, and services. Think of it like a universal adapter: you build a server once, and any MCP-compatible client can use it. Claude, Claude Code, and a growing list of other tools all speak MCP natively.

The protocol itself is straightforward. But writing a raw MCP server is not. You’re dealing with JSON-RPC, capability negotiation, input schema generation, and transport layers before you’ve written a single line of actual business logic. That’s where FastMCP comes in.

What FastMCP Actually Does

FastMCP is a Python framework that handles all the protocol plumbing. If you’ve used FastAPI, the API will feel immediately familiar. You define functions, add decorators, and FastMCP handles schema generation, serialization, and the protocol handshake. The result is an MCP server that’s readable, testable, and composable.

Here’s what a minimal server looks like:

from fastmcp import FastMCP

mcp = FastMCP("my-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

if __name__ == "__main__":
    mcp.run()

That’s it. FastMCP reads the type hints, generates the JSON schema, registers the tool, and handles the rest. Run it and you have a working MCP server.
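To make the schema generation less magical, here's a rough, framework-free sketch of the idea — not FastMCP's actual implementation, just the mechanism: read the signature and type hints, map Python types to JSON Schema types, and lift the docstring into the description.

```python
import inspect
from typing import get_type_hints

# Simplified illustration of signature-to-schema mapping.
PY_TO_JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Build a JSON-Schema-shaped tool definition from a function."""
    hints = get_type_hints(fn)
    sig = inspect.signature(fn)
    props = {
        name: {"type": PY_TO_JSON.get(hints.get(name), "string")}
        for name in sig.parameters
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "inputSchema": {
            "type": "object",
            "properties": props,
            "required": list(sig.parameters),
        },
    }

def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

schema = tool_schema(add)
# schema["inputSchema"]["properties"] is
# {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}
```

The real thing handles defaults, nested models, and docstring parsing, but the shape of the output is the same.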

The Three Primitives: Tools, Resources, and Prompts

MCP has three core concepts. Understanding when to use each is the most important design decision you’ll make.

Tools are callable functions. The model decides when to invoke them and with what arguments. Use tools for actions: fetching data, writing to a database, calling an API, running a calculation. Tools should be narrow and composable.

Resources are addressable data. They’re identified by URIs and the model can read them on demand. Think of resources like files or database rows — static or slowly-changing data that the model might want to inspect. Use resources when the model needs to read something without necessarily doing something.

Prompts are reusable message templates. They let you define prompt patterns that clients can invoke with parameters. Useful for standardizing how the model approaches certain tasks.

In practice, 90% of what you’ll build is tools. Resources shine for things like documentation lookup or configuration retrieval. Prompts are underused but powerful for multi-turn workflows.
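Since prompts are the primitive that rarely gets an example, here's what one amounts to conceptually: a parameterized function that expands to chat messages. This sketch is framework-free so the shape is visible — in FastMCP you'd register it with the prompt decorator, and the `review_code` name and message format here are illustrative.

```python
# Illustrative sketch: an MCP prompt is a template that expands to messages.
def review_code(language: str, code: str) -> list[dict]:
    """Prompt template asking the model to review a code snippet."""
    return [
        {
            "role": "user",
            "content": (
                f"Review the following {language} code for bugs and style "
                f"issues. Be specific.\n\n{code}"
            ),
        }
    ]

messages = review_code("python", "def f(x): return x+1")
```

The value is standardization: every client that invokes the prompt gets the same framing, with only the parameters varying.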

Building a Real Server

Let me walk through something more realistic: a server that wraps a few common homelab-adjacent operations. I’ll use anonymized examples, but the pattern applies to anything.

First, install FastMCP:

pip install fastmcp

Now let’s build a server with a mix of sync and async tools:

from fastmcp import FastMCP
import httpx

mcp = FastMCP("ops-helper")

@mcp.tool()
async def check_service_health(url: str) -> dict:
    """Check if a service is responding and return status details."""
    async with httpx.AsyncClient(timeout=5.0) as client:
        try:
            resp = await client.get(url)
            return {
                "status": "up",
                "http_code": resp.status_code,
                "latency_ms": resp.elapsed.total_seconds() * 1000,
            }
        except httpx.TimeoutException:
            return {"status": "timeout", "http_code": None}
        except Exception as e:
            return {"status": "error", "message": str(e)}

@mcp.tool()
def parse_log_level_counts(log_text: str) -> dict:
    """Count occurrences of each log level in a block of log text."""
    levels = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
    return {level: log_text.count(level) for level in levels}

FastMCP infers the input schema from the function signatures. The docstrings become the tool descriptions that the model uses to decide when and how to call each tool. Write them carefully — they’re part of the interface.

Adding Resources

Resources use URI templates. This is where you expose structured data that a model might want to browse or read:

@mcp.resource("config://app/{app_name}")
def get_app_config(app_name: str) -> str:
    """Return the configuration for a named application."""
    configs = {
        "api-server": "port: 8080\nworkers: 4\ntimeout: 30s",
        "worker": "concurrency: 8\nqueue: default\nretry: 3",
    }
    return configs.get(app_name, f"No config found for {app_name}")

The client can now request config://app/api-server and get back structured config text. You can also expose resource lists so the model can discover what’s available:

@mcp.resource("config://app/")
def list_apps() -> list[str]:
    """List all available application configurations."""
    return ["api-server", "worker", "scheduler"]
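Under the hood, a URI template like config://app/{app_name} is just pattern matching: the client requests a concrete URI, the server extracts the named parameters, and your function gets called with them. A simplified sketch of that dispatch — not FastMCP's routing code, and `get_app_config` here is a trimmed-down stand-in:

```python
import re

def compile_template(template: str) -> re.Pattern:
    """Turn 'config://app/{app_name}' into a regex with named groups."""
    # re.split with a capturing group alternates literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)       # literal segment
        else:
            pattern += f"(?P<{part}>[^/]+)"  # {param} becomes a named group
    return re.compile(pattern + "$")

def get_app_config(app_name: str) -> str:
    configs = {"api-server": "port: 8080"}
    return configs.get(app_name, f"No config found for {app_name}")

def read_resource(uri: str) -> str:
    """Match a concrete URI against the template and dispatch."""
    m = compile_template("config://app/{app_name}").match(uri)
    if m is None:
        raise ValueError(f"no resource matches {uri}")
    return get_app_config(**m.groupdict())
```

A real server keeps a table of compiled templates and tries each one, but the parameter extraction works the same way.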

Running and Connecting

FastMCP supports two transport modes. stdio is the default and is what Claude Desktop and Claude Code use for local servers. http (with SSE) is better for remote servers or when you need to run the server as a persistent process.

For stdio:

python my_server.py

FastMCP handles the stdio loop automatically. In Claude Code, you configure this in your settings:

{
  "mcpServers": {
    "ops-helper": {
      "command": "python",
      "args": ["/path/to/my_server.py"]
    }
  }
}
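What "handles the stdio loop" means in practice: the server reads JSON-RPC requests from stdin and writes responses to stdout. Here's a toy dispatcher showing the request/response shape — real MCP adds initialization, capability negotiation, proper error codes, and a richer result format, so treat this strictly as an illustration:

```python
import json

TOOLS = {"add": lambda a, b: a + b}  # toy tool registry

def handle_request(line: str) -> str:
    """Handle one JSON-RPC request line and return the response line."""
    req = json.loads(line)
    if req["method"] == "tools/call":
        params = req["params"]
        result = TOOLS[params["name"]](**params["arguments"])
        resp = {"jsonrpc": "2.0", "id": req["id"], "result": result}
    else:
        resp = {"jsonrpc": "2.0", "id": req["id"],
                "error": {"code": -32601, "message": "method not found"}}
    return json.dumps(resp)

# A real server runs this in a loop over sys.stdin; FastMCP does that
# read-dispatch-write cycle (plus the protocol handshake) for you.
```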

For HTTP mode, you pass transport="sse" when running:

if __name__ == "__main__":
    mcp.run(transport="sse", host="0.0.0.0", port=8000)

This is useful when your MCP server needs to run in a container or on a different machine. For remote servers, you’ll want authentication — FastMCP doesn’t handle auth itself, so put a reverse proxy in front or handle it at the application level.

Testing Without an LLM

One thing I wish I’d known earlier: you can test MCP tools without spinning up an LLM at all. FastMCP exposes a direct Python interface for testing:

# test_server.py
import asyncio
from my_server import mcp

async def test():
    # Call a tool directly
    result = await mcp.call_tool("parse_log_level_counts", {
        "log_text": "INFO starting up\nERROR failed to connect\nINFO retry"
    })
    print(result)

asyncio.run(test())

This is vastly faster than testing through a full agent loop. Write unit tests this way. Your tools should be testable pure functions — if they require too much mocking to test directly, that’s a sign the tool is doing too much.

You can also use the MCP Inspector, a browser-based tool from the MCP team, to interactively test your server. Point it at a running SSE server and you get a UI to call tools and inspect responses. Useful for debugging schema issues.

Composability: Mounting Servers

FastMCP 2.x added something I’ve found genuinely useful: server composition. You can mount one FastMCP server into another, building larger servers from smaller focused ones.

from fastmcp import FastMCP
from observability_server import obs_mcp
from deployment_server import deploy_mcp

gateway = FastMCP("platform-gateway")
gateway.mount(obs_mcp, prefix="obs")
gateway.mount(deploy_mcp, prefix="deploy")

Now tools from both servers are available under namespaced prefixes: obs_check_health, deploy_rollout_status, and so on. This lets you keep servers small and focused — each in its own module, independently testable — while presenting a single interface to the agent.

This is particularly powerful when different servers have different dependency requirements. The observability server might import prometheus-client, the deployment server might need kubernetes. Keep them separate, mount them together.
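The namespacing itself is simple to picture: mounting copies the child's tools into the parent under prefixed names. A toy registry illustrating that merge — not FastMCP's implementation, and `ToyServer` is a deliberately minimal stand-in:

```python
class ToyServer:
    """Minimal stand-in for a tool registry, to illustrate mounting."""

    def __init__(self, name: str):
        self.name = name
        self.tools = {}  # tool name -> function

    def tool(self, fn):
        """Decorator: register a function under its own name."""
        self.tools[fn.__name__] = fn
        return fn

    def mount(self, child: "ToyServer", prefix: str):
        """Copy the child's tools in under prefixed names."""
        for tool_name, fn in child.tools.items():
            self.tools[f"{prefix}_{tool_name}"] = fn

obs = ToyServer("obs")

@obs.tool
def check_health() -> str:
    return "ok"

gateway = ToyServer("gateway")
gateway.mount(obs, prefix="obs")
# gateway.tools now exposes "obs_check_health"
```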

Schema Design Matters More Than You Think

The quality of your tools isn’t just about what they do — it’s about how well the model can figure out when to use them. A few patterns I’ve settled on:

Be specific in docstrings. “Get data” is useless. “Return the last N log lines from a named service, filtered by optional log level” gives the model what it needs to make a good decision.

Use Pydantic models for complex inputs. FastMCP handles Pydantic models natively. For tools that take structured config or multi-field inputs, define a Pydantic model rather than a pile of individual parameters.

from pydantic import BaseModel

class SearchQuery(BaseModel):
    query: str
    max_results: int = 10
    include_archived: bool = False

@mcp.tool()
def search_documents(params: SearchQuery) -> list[dict]:
    """Search the document store using semantic similarity."""
    # implementation
    ...

Return structured data. Return dicts or Pydantic models, not strings. The model handles structured output better and you can evolve the schema without breaking prompts.

One tool, one concern. Tools that do too much become unpredictable. A tool that “either fetches or updates depending on the input” is a footgun. Split it.
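Concretely, instead of one tool with a mode flag, expose two. A sketch of the split — the record store and names are illustrative, and in a real server each function would carry the tool decorator:

```python
# Footgun: one tool whose behavior depends on whether `data` is provided.
# def manage_record(record_id: str, data: dict | None = None) -> dict: ...

STORE: dict[str, dict] = {"r1": {"status": "active"}}  # stand-in data store

def get_record(record_id: str) -> dict:
    """Return the record with the given id, as a structured result."""
    record = STORE.get(record_id)
    if record is None:
        return {"success": False, "error": f"no record {record_id}"}
    return {"success": True, "data": record}

def update_record(record_id: str, data: dict) -> dict:
    """Merge the given fields into an existing record."""
    if record_id not in STORE:
        return {"success": False, "error": f"no record {record_id}"}
    STORE[record_id].update(data)
    return {"success": True, "data": STORE[record_id]}
```

Two narrow tools give the model two unambiguous descriptions to choose between, instead of one description it has to disambiguate from arguments.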

What I’d Do Differently

When I first started building MCP servers, I made a few mistakes that cost time.

I over-designed the resource layer. Resources felt elegant on paper, but in practice most of what I was exposing was better as tool results. Resources shine for large, browsable datasets or documentation — not for dynamic operational data. Start with tools; reach for resources when you find yourself writing list_all_* functions.

I also underestimated the importance of error handling at the tool boundary. An unhandled exception inside a tool propagates up as a protocol error, and the model’s error recovery is unpredictable. Wrap tool internals defensively and return structured error objects when something goes wrong, rather than raising:

@mcp.tool()
async def fetch_metrics(endpoint: str) -> dict:
    """Fetch current metrics from a service endpoint."""
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            resp = await client.get(endpoint)
            return {"success": True, "data": resp.json()}
    except Exception as e:
        return {"success": False, "error": str(e)}

Finally, I underused the context object. FastMCP passes a Context parameter to tools if you request it, and it lets you log progress, report status, and access request metadata. For long-running tools, this is how you give the model visibility into what’s happening:

from fastmcp import Context

@mcp.tool()
async def long_running_task(items: list[str], ctx: Context) -> dict:
    """Process a list of items."""
    results = []
    for i, item in enumerate(items):
        await ctx.report_progress(i, len(items))
        results.append(process(item))  # process() stands in for your per-item work
    return {"results": results, "count": len(results)}

MCP is still early. The ecosystem is moving fast and best practices are still forming. But the protocol is solid, FastMCP makes server development accessible, and the composability story is genuinely compelling for multi-agent setups. If you’re building anything agent-adjacent, it’s worth the afternoon to learn.