Skip to content

Function Calling Overview

Language models generate text. But most real applications need structured data — not "the price is approximately fifty dollars" but {"price": 50.00, "currency": "USD"}. Function calling (also called tool use) is the mechanism that makes LLMs reliable producers of structured output and gives them the ability to invoke external capabilities.

Learning objectives

  • Understand what function calling actually does at the API level
  • Distinguish function calling from JSON mode and prompt-based extraction
  • Know the full request-response cycle: define → call → execute → respond
  • Recognize when tool use is the right approach vs simpler alternatives

The core idea

Without function calling, you parse LLM text hoping it formatted correctly. With function calling, you give the model a schema and it guarantees the output matches it.

Without tool use:
  Prompt: "Return JSON with name and price"
  Response: "Here's the JSON: {"name": "Widget", "price": "$49.99"}"
  Problem: must parse, strip text, handle malformed outputs

With tool use:
  Define: extract_product(name: str, price: float)
  Response: tool_call(name="Widget", price=49.99)
  Result: guaranteed valid Python objects

The four-step cycle

# Step 1: Define tools (JSON schema)
tools = [{"type": "function", "function": {"name": "...", "parameters": {...}}}]

# Step 2: Send to model with tools
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)

# Step 3: Check if model wants to call a tool
if response.choices[0].finish_reason == "tool_calls":
    tool_call = response.choices[0].message.tool_calls[0]
    # Execute the function
    result = execute(tool_call.function.name, json.loads(tool_call.function.arguments))

    # Step 4: Send result back to continue the conversation
    messages.append(response.choices[0].message)  # Assistant's tool call
    messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": str(result)})
    final_response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)

Tool use vs JSON mode vs prompting

Approach Reliability Flexibility Setup
Prompt-based extraction Low High None
JSON mode (response_format) Medium High Minimal
Function calling / tool use High Constrained to schema Schema definition
with_structured_output (LangChain) High Pydantic model Pydantic class

Prompt-based: "Return JSON with these fields." Works until the model decides to add commentary, nest differently, or omit optional fields. Breaks under distribution shift.

JSON mode: Guarantees valid JSON, but the keys and values can still be wrong. The model might return {"sentiment": "somewhat negative"} when you expected "negative".

Function calling: Guarantees the schema — correct field names, correct types, enum constraints respected. The model cannot return malformed output.

Use tool use for any structured output you'll programmatically consume

If the output feeds code (not a human), use function calling or with_structured_output. The reliability improvement over prompt-based extraction is dramatic, especially for edge cases.


When tool use is the right choice

USE_TOOL_CALLING = [
    "Extracting structured entities from unstructured text (NER++)",
    "Connecting LLMs to external APIs (search, databases, calculators)",
    "Multi-step agent loops where each step has a defined action schema",
    "Any output where you need field-level type guarantees",
    "Replacing fragile regex-based parsing pipelines",
]

DO_NOT_USE_TOOL_CALLING = [
    "Simple yes/no or short text answers",
    "Creative writing or open-ended generation",
    "When the structure isn't known in advance",
    "Situations where partial/incomplete output is acceptable",
]

Anatomy of a tool definition

# OpenAI format (also used by LangChain, LiteLLM)
tool = {
    "type": "function",
    "function": {
        "name": "extract_invoice",
        "description": "Extract structured data from an invoice document.",  # Critical: model uses this
        "parameters": {
            "type": "object",
            "properties": {
                "vendor_name": {
                    "type": "string",
                    "description": "Name of the vendor or supplier"
                },
                "total_amount": {
                    "type": "number",
                    "description": "Total invoice amount in USD"
                },
                "invoice_date": {
                    "type": "string",
                    "description": "Invoice date in YYYY-MM-DD format"
                },
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        },
                        "required": ["description", "amount"]
                    },
                    "description": "Individual line items on the invoice"
                },
                "status": {
                    "type": "string",
                    "enum": ["paid", "pending", "overdue"],
                    "description": "Current payment status"
                }
            },
            "required": ["vendor_name", "total_amount", "invoice_date"]
        }
    }
}

Write good descriptions — they're few-shot prompts for the schema

The description fields in your tool schema guide the model's extraction. "Total invoice amount in USD" is better than "amount" — it tells the model the unit and scope. Treat descriptions as micro-prompts.


00-agenda | 02-openai-tools