# Structured output across vendors

## Function calling parity — three APIs, one capability
Function calling (also called "tool use" on Anthropic's API and "tool calls" on Google's) is the modern replacement for prompt-engineering your way to JSON. Instead of asking the model "please return JSON, no markdown fence", you give the API a typed schema and it returns a structured object the model is forced to populate. This is the right primitive for any production code path that consumes structured data.
All three vendors support it. The naming, the shape of the request, and the shape of the response all differ; the capability is roughly the same.
What "the same capability" actually means
Each API exposes a contract that lets you declare a function signature in JSON Schema; the model returns which function it would call and with what arguments. Your code receives those arguments as a typed object, runs the actual function, and feeds the result back. This loop is the building block of agents.
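The loop can be sketched without any vendor SDK. The snippet below uses OpenAI-style message shapes; the `get_weather` tool, its schema, and the fabricated assistant message are illustrative stand-ins, not output from a real API call.

```python
import json

# Hypothetical tool, declared in JSON Schema. This layout follows OpenAI's
# `tools` request field; Anthropic and Google wrap the same schema differently.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> dict:
    # Stand-in for the real implementation.
    return {"city": city, "temp_c": 31}

# A fabricated assistant message of the kind the API returns when the
# model decides to call the tool.
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_1",
        "function": {"name": "get_weather", "arguments": '{"city": "Cairo"}'},
    }],
}

# The loop: read the call, run the function, send the result back.
for call in assistant_msg["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # typed object, not free text
    result = get_weather(**args)
    tool_result_msg = {
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    }
```

The `tool_result_msg` built at the end is what you append to the conversation before the next model turn; that step is where vendor shapes diverge most, as the later sections discuss.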
Across the three:
| Vendor | Field name on request | Field name on response |
|---|---|---|
| Anthropic | tools | content[].type === "tool_use" |
| OpenAI | tools (with type: "function") | tool_calls |
| Google | tools[].function_declarations | parts[].functionCall |
## Function calling — three APIs, three response shapes
The shape of the request is the easiest part to port. The shape of the response is where most porting work happens. Anthropic returns a content block with type tool_use. OpenAI returns a tool_calls array on the message. Google nests the call inside a parts array on the candidate. Three different access paths for the same conceptual thing.
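The three access paths can be shown side by side. The payloads below are abbreviated by hand to match each vendor's documented response shape; a small normalizer flattens all three to the same `(name, args)` pair.

```python
import json

# Abbreviated payloads modeled on each vendor's documented response shape.
anthropic_resp = {"content": [
    {"type": "tool_use", "name": "get_weather", "input": {"city": "Cairo"}},
]}
openai_resp = {"choices": [{"message": {"tool_calls": [
    {"function": {"name": "get_weather", "arguments": '{"city": "Cairo"}'}},
]}}]}
gemini_resp = {"candidates": [{"content": {"parts": [
    {"functionCall": {"name": "get_weather", "args": {"city": "Cairo"}}},
]}}]}

def extract_call(vendor: str, resp: dict) -> tuple:
    """Normalize the first tool call to (name, args_dict)."""
    if vendor == "anthropic":
        # Anthropic: a content block with type "tool_use".
        block = next(b for b in resp["content"] if b["type"] == "tool_use")
        return block["name"], block["input"]
    if vendor == "openai":
        # OpenAI: tool_calls on the message; arguments arrive as a JSON *string*.
        call = resp["choices"][0]["message"]["tool_calls"][0]
        return call["function"]["name"], json.loads(call["function"]["arguments"])
    if vendor == "google":
        # Google: a functionCall part nested inside the candidate's content.
        part = next(p for p in resp["candidates"][0]["content"]["parts"]
                    if "functionCall" in p)
        return part["functionCall"]["name"], part["functionCall"]["args"]
    raise ValueError(vendor)
```

One detail worth the comment above: OpenAI is the only one of the three that serializes the arguments as a JSON string rather than an object, so a port from OpenAI to either other vendor can silently drop a needed `json.loads`.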
## Reliability across vendors
Reliability of function calling diverges roughly the same way prompt-only JSON did in the previous lesson, but the gap narrows. When the JSON schema is enforced by the API rather than just hinted at in a prompt:
- Claude's "wraps in markdown fence" failure mode disappears entirely. Tool-use mode produces typed objects, not text.
- GPT-4o-mini's strict-JSON mode is the most reliable of the three on simple schemas; it almost never returns malformed objects.
- Gemini Flash's truncation pattern is reduced but not eliminated on long argument lists. Keep the schema small.
For Hagar's use case at the Cairo startup, the practical recommendation is: if you have a structured-data extraction task, do not prompt-engineer your way to JSON. Use function calling. The cost is one initial schema definition; the benefit is that you stop debugging output parsing and start debugging actual logic.
## What does not transfer
Three things consistently break when porting function-calling logic between vendors:
- Multi-tool selection. Different models pick different tools when given the same set. Claude tends to chain tools more aggressively. GPT calls one tool and stops. Gemini sometimes returns text saying "I would call tool X" instead of actually calling it. Audit your tool-selection traces during the port.
- Nested object support in arguments. Anthropic and OpenAI handle nested JSON Schema cleanly. Gemini's enforcement is shallower; deeply nested arguments sometimes come back partially populated. Flatten your arguments where possible.
- Tool-result shape. Each vendor expects the function's return value back in a slightly different message shape. This is where most porting bugs live.
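That last point is concrete enough to sketch. The builder below constructs the follow-up message each vendor expects, with shapes modeled on the public docs; field names are the documented ones, but treat this as a sketch to check against the current API reference rather than a drop-in helper.

```python
import json

def tool_result_message(vendor: str, call_id, name: str, result: dict) -> dict:
    """Build the message that hands a tool's return value back to the model."""
    if vendor == "anthropic":
        # Anthropic: a user message containing a "tool_result" content block,
        # linked to the call by tool_use_id.
        return {"role": "user", "content": [{
            "type": "tool_result",
            "tool_use_id": call_id,
            "content": json.dumps(result),
        }]}
    if vendor == "openai":
        # OpenAI: a dedicated "tool" role message, linked by tool_call_id.
        return {"role": "tool", "tool_call_id": call_id,
                "content": json.dumps(result)}
    if vendor == "google":
        # Google: a functionResponse part, linked by function *name* rather
        # than a call id (role conventions vary across SDK versions).
        return {"role": "user", "parts": [{
            "functionResponse": {"name": name, "response": result},
        }]}
    raise ValueError(vendor)
```

Note the three different linking mechanisms: an id field on two vendors (with different names) and the function name on the third. A port that only renames fields and misses the id-versus-name distinction is exactly the class of bug described above.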
Next: schema differences in detail — the exact shape of tools across the three APIs.