Agent Daily News

MCP vs Function Calling: Which Tool Integration to Use


MCP vs function calling for AI agents: they solve different layers of the same problem. When to use each, how they compose, and the token-cost trade-off.

TL;DR — Function calling is how a model requests a tool. MCP is how tools are packaged and shared across agents. They’re not competitors, they’re different layers. Use raw function calling for one app’s private tools. Use MCP when the same tools serve multiple agents or clients. Most real systems use MCP on top of function calling, because under the hood MCP tools still become function-call definitions.

The Question Is Wrong

“MCP vs function calling” gets asked constantly, and it’s a category error. It’s like asking “REST vs HTTP.” Function calling is the model-level mechanism: the LLM outputs a structured request to invoke a named function with arguments. MCP (Model Context Protocol) is a higher-level standard for how tool servers expose capabilities to any client.

Here’s the thing that clears it up: when you connect an MCP server to Claude or GPT-4o, the client application translates the server’s tools into function-call definitions before they reach the model. The model still does function calling. MCP just standardizes where those definitions come from and how the results flow back.
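That translation step is small enough to sketch. The example below is a minimal illustration, not any particular client's code: it assumes an MCP tool descriptor carries `name`, `description`, and `inputSchema` fields, and maps it onto the OpenAI-style definition used later in this article. The function name `mcp_tool_to_function_def` and the sample descriptor are hypothetical.

```python
from typing import Any


def mcp_tool_to_function_def(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    """Map one MCP tool descriptor onto an OpenAI-style tool definition."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # MCP names the JSON Schema "inputSchema"; the chat API names it "parameters".
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Hypothetical descriptor, shaped like one entry of a server's tool listing.
weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}

function_def = mcp_tool_to_function_def(weather_tool)
print(function_def["function"]["name"])
```

The schema itself passes through untouched; only the envelope around it changes, which is why the model cannot tell where a definition came from.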

So the real question isn’t “which one,” it’s “do I need the MCP layer on top of function calling, or not?”

Function Calling: The Mechanism

Every major model API supports function calling. You pass tool definitions in the request; the model responds with a structured call. (See OpenAI’s function-calling docs for the most widely copied format.)

from openai import OpenAI

client = OpenAI(base_url="https://api.sandbase.ai/v1", api_key="sk-...")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

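# Model responds with a tool_call. You execute it, append the result, call again.
# tool_calls is None when the model answers in plain text, so check before indexing.
message = response.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]
    print(tool_call.function.name)       # "get_weather"
    print(tool_call.function.arguments)  # '{"city": "Tokyo"}'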

You define the tool, the model asks to call it, you run the actual function, and you feed the result back. The model never executes anything. It only emits intent. This is the whole mechanism, and MCP doesn’t replace it.
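The "run it and feed the result back" half of the loop can be sketched without touching the network. This is one possible shape, assuming the chat-completions convention of a `tool` role message keyed by `tool_call_id`; `TOOL_REGISTRY`, `run_tool_call`, the stand-in `get_weather`, and the call id are all hypothetical names for illustration.

```python
import json
from typing import Any, Callable


def get_weather(city: str) -> dict[str, Any]:
    # Stand-in implementation; a real one would call a weather service.
    return {"city": city, "temp_c": 21, "conditions": "clear"}


# The only functions this application is willing to run on the model's behalf.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_call(call_id: str, name: str, arguments_json: str) -> dict[str, str]:
    """Execute one requested call and build the 'tool' message to send back."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        result: Any = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = handler(**json.loads(arguments_json))
        except (TypeError, ValueError) as exc:
            # Malformed JSON or wrong argument names: report it instead of crashing.
            result = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}


tool_message = run_tool_call("call_123", "get_weather", '{"city": "Tokyo"}')
print(tool_message["content"])
```

Appending `tool_message` to the conversation and calling the model again closes the loop. Returning errors as content, rather than raising, gives the model a chance to correct its own arguments.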

Strengths: Zero extra infrastructure. Lowest latency. Full control. Perfect for tools that only one application uses.

Weakness: The tool definitions and execution logic live inside your app. Want the same get_weather in another agent? Copy-paste. Five agents, five copies, five places to fix a bug.

MCP: The Packaging Layer

MCP solves the reuse problem. You write a tool server once; it exposes tools, resources, and prompts over a standard protocol. Any MCP-compatible client connects and uses them.

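// A minimal MCP server exposing the same weather tool
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "weather", version: "1.0.0" });

// Register the tool: name, description, zod input shape, async handler
server.tool(
  "get_weather",
  "Get current weather for a city",
  { city: z.string().describe("City name") },
  async ({ city }) => {
    const data = await fetchWeather(city); // your real implementation
    return { content: [{ type: "text", text: JSON.stringify(data) }] };
  }
);

// Serve over stdio so local clients can launch this process and talk to it
await server.connect(new StdioServerTransport());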

Now Claude Desktop, Cursor, your CLI agent, and your production app all connect to this one server. Fix a bug once, every client gets it. That’s the entire pitch. For a hands-on build, see our walkthrough on building a custom MCP server for data analysis; for wiring servers into an agent, see our guide to connecting MCP servers to your agent.
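For a sense of what actually travels between a client and that server, here is a rough sketch of the two requests that matter most for tools. MCP frames messages as JSON-RPC 2.0; the sketch skips the initialization handshake and the transport entirely, and `jsonrpc_request` is a hypothetical helper, not an SDK function.

```python
import json
from itertools import count
from typing import Any, Optional

_request_ids = count(1)


def jsonrpc_request(method: str, params: Optional[dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request, the framing MCP messages use."""
    message: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it offers, then invoke one of them by name.
list_request = jsonrpc_request("tools/list")
call_request = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Tokyo"}},
)
print(list_request)
print(call_request)
```

In practice an SDK client builds and sends these for you. The point is that every client speaks the same two verbs to every server, which is what makes the reuse possible.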

Strengths: Write once, use everywhere. A growing ecosystem of pre-built servers (GitHub, Postgres, Slack, filesystem). Standard auth and transport.

Weakness: More moving parts. Clients typically inject every connected tool’s schema into the context window, and MCP servers tend to expose many tools, so token overhead adds up fast.

The Comparison That Actually Helps

| Dimension | Raw Function Calling | MCP |
|---|---|---|
| What it is | Model-level mechanism | Tool-packaging protocol |
| Reuse across agents | Copy-paste | Connect to same server |
| Infrastructure | None | Run a server (stdio or HTTP) |
| Latency | Lowest | +1 hop (usually negligible) |
| Token overhead | You control exactly what’s sent | All server tools injected |
| Auth/transport | You build it | Standardized |
| Best for | One app’s private tools | Tools shared across clients |
| Ecosystem | None (you write everything) | Pre-built servers available |

The Token Cost Nobody Mentions

Here’s the trap. MCP servers often expose 10-30 tools. Connect three servers and the model sees anywhere from 30 to 90 tool definitions on every single call: schemas, descriptions, parameter docs. That can be 5,000-15,000 tokens of overhead before the user’s question even gets processed.
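A back-of-the-envelope estimate makes this easy to check against your own setup. The sketch below assumes roughly four characters per token, which is only a rule of thumb (real tokenizers differ by model), and sizes 60 synthetic tools, for example three servers with 20 tools each. `estimate_tool_tokens` and `make_tool` are hypothetical helpers; swap in your real definitions to get a meaningful number.

```python
import json
from typing import Any

CHARS_PER_TOKEN = 4  # Rough rule of thumb; real tokenizers vary by model.


def estimate_tool_tokens(tools: list[dict[str, Any]]) -> int:
    """Approximate the tokens spent on tool definitions in a single request."""
    serialized = json.dumps(tools, separators=(",", ":"))
    return len(serialized) // CHARS_PER_TOKEN


def make_tool(index: int) -> dict[str, Any]:
    # Synthetic tool with a description and two parameters, for sizing only.
    return {
        "type": "function",
        "function": {
            "name": f"tool_{index}",
            "description": "Synthetic placeholder tool used only to size the prompt overhead.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Free-text query"},
                    "limit": {"type": "integer", "description": "Maximum number of results"},
                },
                "required": ["query"],
            },
        },
    }


all_tools = [make_tool(i) for i in range(60)]
per_call = estimate_tool_tokens(all_tools)
print(f"{len(all_tools)} tools, roughly {per_call} tokens per call")
print(f"over a 20-call session: roughly {per_call * 20} tokens")
```

The second line is the one to watch: the overhead is paid on every request in an agent loop, not once per conversation.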

With raw function calling, you send exactly the tools relevant to the current task. With MCP, you often get everything the connected servers offer whether you need it or not.

The fix: don’t blindly connect every MCP server. Either connect only what a given agent needs, or put a routing layer in front that filters tools by relevance before they hit the model. This is the same token discipline that shows up in agent memory architectures: load the right context, not all of it.
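A routing layer can start out very simple. The sketch below scores each tool by word overlap between the user's message and the tool's name and description, then keeps the top few. This is a deliberately naive stand-in: a production router would more likely use embeddings or a cheap classifier, and `select_tools`, `keywords`, and the sample tools are all hypothetical.

```python
import re
from typing import Any

# Tiny stop-word list so filler words do not count as matches (illustrative only).
STOP_WORDS = {"a", "an", "the", "in", "for", "of", "to", "is", "s", "what"}


def keywords(text: str) -> set[str]:
    """Lowercase the text and split it into words, dropping stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def select_tools(
    query: str, tools: list[dict[str, Any]], limit: int = 5
) -> list[dict[str, Any]]:
    """Keep at most `limit` tools that share at least one keyword with the query."""
    query_words = keywords(query)
    scored = []
    for tool in tools:
        fn = tool["function"]
        overlap = len(query_words & keywords(f'{fn["name"]} {fn.get("description", "")}'))
        if overlap > 0:
            scored.append((overlap, tool))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


def make_tool(name: str, description: str) -> dict[str, Any]:
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {"type": "object", "properties": {}},
        },
    }


all_tools = [
    make_tool("get_weather", "Get current weather for a city"),
    make_tool("create_issue", "Create a GitHub issue in a repository"),
    make_tool("run_query", "Run a SQL query against Postgres"),
]

selected = select_tools("What's the weather in Tokyo?", all_tools)
print([tool["function"]["name"] for tool in selected])  # ['get_weather']
```

Pass `selected` instead of the full list as the `tools` argument. The trade-off is recall: a filter that is too aggressive hides a tool the model needed, so it is worth logging which tools were dropped.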

How to Actually Decide

Use raw function calling when:

  • The tool is specific to one application
  • You need the absolute lowest latency and token count
  • You want tight control over exactly what schemas reach the model
  • You’re prototyping and don’t want to stand up a server

Use MCP when:

  • The same capability is used by multiple agents, apps, or clients (Claude Desktop, Cursor, your backend)
  • You want to use pre-built community servers instead of writing integrations
  • You need standardized auth and transport across many tools
  • You’re building a platform where teams contribute tools independently

Use both (the common case): Most production systems use MCP for shared, reusable capabilities (database access, GitHub, internal APIs) and raw function calling for app-specific glue logic. Since MCP compiles down to function calls anyway, mixing them is natural.
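One way to mix the two, sketched under the assumption that every tool has already been converted to the same function-definition shape: merge everything into a single list for the model, and keep a side table recording which backend owns each name so the tool call can be dispatched later. `merge_tool_sets` and the sample names are hypothetical.

```python
from typing import Any


def merge_tool_sets(
    local_tools: list[dict[str, Any]],
    mcp_tools_by_server: dict[str, list[dict[str, Any]]],
) -> tuple[list[dict[str, Any]], dict[str, str]]:
    """Combine definitions into one list and record which backend owns each name."""
    merged: list[dict[str, Any]] = []
    owner: dict[str, str] = {}
    sources = [("local", local_tools), *mcp_tools_by_server.items()]
    for source, tools in sources:
        for tool in tools:
            name = tool["function"]["name"]
            if name in owner:
                # The model only sees names, so a collision would be ambiguous.
                raise ValueError(
                    f"tool name {name!r} defined by both {owner[name]} and {source}"
                )
            owner[name] = source
            merged.append(tool)
    return merged, owner


def make_tool(name: str) -> dict[str, Any]:
    return {
        "type": "function",
        "function": {"name": name, "parameters": {"type": "object", "properties": {}}},
    }


tools, owner = merge_tool_sets(
    local_tools=[make_tool("format_report")],
    mcp_tools_by_server={
        "github": [make_tool("create_issue")],
        "postgres": [make_tool("run_query")],
    },
)
print(owner)
```

When the model emits a call, look up `owner[name]`: run `local` tools in-process and forward the rest to the named server. Failing loudly on duplicate names is a design choice here; prefixing names with the server label is a common alternative.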

FAQ

Does MCP replace function calling?

No. MCP sits on top of function calling. When an MCP client connects to a model, it converts the server’s tools into function-call definitions. The model still does plain function calling. MCP standardizes how those definitions are produced and shared.

Is MCP slower than function calling?

Marginally. There’s one extra network hop to the MCP server (or a process boundary for stdio). For local servers it’s a few milliseconds. The bigger cost is token overhead from injecting many tool schemas, not latency.

Can I use MCP tools with any model?

Any model that supports function calling, which is all the major ones (Claude, GPT-4o, Gemini, and most open-source models via their APIs). The MCP client handles converting tools into each model’s function-calling format. Through a gateway like SandBase you can switch models without touching tool code.
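The per-model conversion mostly comes down to key names. As a rough sketch, the OpenAI-style format nests the schema under `function.parameters`, while Anthropic's tool format is flat and uses `input_schema`. The adapter table and `tools_for` below are hypothetical, and other providers would need their own entries.

```python
from typing import Any, Callable


def to_openai_tool(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool["inputSchema"],
        },
    }


def to_anthropic_tool(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    return {
        "name": mcp_tool["name"],
        "description": mcp_tool.get("description", ""),
        "input_schema": mcp_tool["inputSchema"],
    }


ADAPTERS: dict[str, Callable[[dict[str, Any]], dict[str, Any]]] = {
    "openai": to_openai_tool,
    "anthropic": to_anthropic_tool,
}


def tools_for(provider: str, mcp_tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Convert MCP tool descriptors into the named provider's tool format."""
    try:
        adapter = ADAPTERS[provider]
    except KeyError:
        raise ValueError(f"no adapter for provider {provider!r}") from None
    return [adapter(tool) for tool in mcp_tools]


mcp_tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

print(sorted(tools_for("openai", mcp_tools)[0]))
print(sorted(tools_for("anthropic", mcp_tools)[0]))
```

The JSON Schema is identical in both outputs, which is why switching models does not require touching the tool server.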

When is MCP overkill?

For a single app with a handful of private tools that no other client will ever use. Standing up an MCP server adds infrastructure for zero benefit. Raw function calling is simpler and faster there. MCP earns its keep the moment a second client needs the same tools.

Do MCP servers cost more in tokens?

They can, because servers tend to expose many tools and all those schemas get injected into context on every call. Connect only the servers an agent actually needs, or filter tools by relevance, to keep the overhead down.

Key Takeaways

  • Function calling is the model mechanism; MCP is the packaging and sharing layer on top of it. They’re different layers, not rivals.
  • MCP tools become function-call definitions before reaching the model, so the model always does function calling either way.
  • Raw function calling wins for one app’s private tools: lowest latency, lowest tokens, zero infrastructure.
  • MCP wins for tools shared across multiple agents or clients, with a growing ecosystem of pre-built servers, at the cost of token overhead you must manage.
