How to Build AI Agents with Gemma 4 Function Calling

Apr 7, 2026

Gemma 4 ships with native function calling support, which means you can build AI agents that don't just generate text — they take actions. Call APIs, query databases, search the web, run calculations. This guide shows you how to build a working agent from scratch using Gemma 4 and Python.

What Is Function Calling?

Function calling lets the model decide when to use external tools rather than guessing an answer. Ask Gemma 4 "what's the weather in Tokyo?" without tools and you risk a hallucinated response; with tools, the model outputs a structured request to call your weather API, you execute it, feed the result back, and the model generates a natural-language answer based on real data.

The flow looks like this:

User: "What's the weather in Tokyo?"
  → Model: {"function": "get_weather", "args": {"city": "Tokyo"}}
  → Your code calls the weather API → returns {"temp": 22, "condition": "sunny"}
  → Model: "It's 22°C and sunny in Tokyo right now."

Defining Tools with JSON Schema

First, you need to tell Gemma 4 what tools are available. Each tool is defined with a JSON schema that describes its name, purpose, and parameters:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city. Use this when the user asks about weather conditions.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'Tokyo', 'New York'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations. Use for any math the user asks about.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Math expression to evaluate, e.g. '2 + 2', 'sqrt(144)'"
                    }
                },
                "required": ["expression"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information. Use when the user asks about recent events or facts you're unsure about.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (default: 3)"
                    }
                },
                "required": ["query"]
            }
        }
    }
]
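Before executing a call, it's worth checking the model's arguments against the tool's parameter schema. Here's a minimal hand-rolled sketch covering the JSON Schema subset used above (object properties, basic types, enum, required); production code might prefer the jsonschema package instead:

```python
# Minimal argument validation against a tool's JSON-schema "parameters" block.
# Covers only the subset used in this guide; `validate_args` is a hypothetical
# helper, not part of any library.

JSON_TYPES = {"string": str, "integer": int, "number": (int, float),
              "boolean": bool, "object": dict, "array": list}

def validate_args(parameters: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the args look valid."""
    errors = []
    props = parameters.get("properties", {})
    # Required parameters must be present
    for name in parameters.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    # Each supplied argument must be declared, typed correctly, and in any enum
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            errors.append(f"unexpected parameter: {name}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}, "
                          f"got {type(value).__name__}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors
```

Calling `validate_args(tool["function"]["parameters"], func_args)` before dispatch turns a cryptic `TypeError` into an error message you can feed back to the model.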

Good tool descriptions are critical. The model uses them to decide which tool to call and when. Be specific about when each tool should be used.

Building the Agent Loop

Here's a complete Python agent that uses Gemma 4 through the Ollama API:

import json
import requests
import math

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "gemma4:12b"

# Define your tool implementations
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Simulated weather API — replace with a real API call"""
    # In production, call OpenWeatherMap, WeatherAPI, etc.
    return {
        "city": city,
        "temperature": 22,
        "unit": unit,
        "condition": "sunny",
        "humidity": 45
    }

def calculate(expression: str) -> dict:
    """Restricted math evaluation. Note: eval is never fully safe, even with
    empty __builtins__ — for untrusted input, use an AST-based parser."""
    allowed_names = {
        "sqrt": math.sqrt,
        "sin": math.sin,
        "cos": math.cos,
        "pi": math.pi,
        "e": math.e,
        "abs": abs,
        "round": round
    }
    try:
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return {"expression": expression, "result": result}
    except Exception as e:
        return {"expression": expression, "error": str(e)}

def web_search(query: str, num_results: int = 3) -> dict:
    """Simulated web search — replace with real search API"""
    return {
        "query": query,
        "results": [
            {"title": f"Result about {query}", "snippet": "Relevant information..."}
        ]
    }

# Map function names to implementations
TOOL_MAP = {
    "get_weather": get_weather,
    "calculate": calculate,
    "web_search": web_search,
}

def call_gemma(messages: list, tools: list) -> dict:
    """Send a chat request to Gemma 4 via Ollama"""
    response = requests.post(OLLAMA_URL, json={
        "model": MODEL,
        "messages": messages,
        "tools": tools,
        "stream": False
    }, timeout=120)  # local generation can be slow; don't hang forever
    response.raise_for_status()
    return response.json()

def run_agent(user_input: str, tools: list, max_steps: int = 5):
    """Run the agent loop with multi-step tool use"""
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant. Use the provided tools when needed. Always use tools for weather, calculations, and current information instead of guessing."
        },
        {"role": "user", "content": user_input}
    ]

    for step in range(max_steps):
        response = call_gemma(messages, tools)
        message = response["message"]

        # Check if the model wants to call a tool
        if message.get("tool_calls"):
            # Record the assistant's tool-call message once, then append
            # one result per tool call (appending it inside the loop would
            # duplicate it in the history)
            messages.append(message)
            for tool_call in message["tool_calls"]:
                func_name = tool_call["function"]["name"]
                func_args = tool_call["function"]["arguments"]

                print(f"  → Calling {func_name}({func_args})")

                # Execute the tool
                if func_name in TOOL_MAP:
                    result = TOOL_MAP[func_name](**func_args)
                else:
                    result = {"error": f"Unknown tool: {func_name}"}

                # Add the result so the model sees it on the next turn
                messages.append({
                    "role": "tool",
                    "content": json.dumps(result)
                })
        else:
            # Model gave a final text response
            print(f"Agent: {message['content']}")
            return message["content"]

    return "Agent reached maximum steps without completing."

# Run the agent
if __name__ == "__main__":
    run_agent("What's the weather in Tokyo and what's 15% of 8500?", tools)

This agent can handle multi-step queries. When you ask "What's the weather in Tokyo and what's 15% of 8500?", Gemma 4 will call get_weather and calculate separately, then combine both results into a single natural response.
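To make the loop concrete, here's a hypothetical snapshot of the messages list after one iteration (values illustrative, shape matching the loop above). The assistant's tool call and the tool result both stay in the history, so the next model call sees everything:

```python
# Hypothetical conversation state after one pass through the agent loop.
# Both the tool_calls entry and the tool result remain in the history
# that gets sent back to the model on the next step.
history = [
    {"role": "system", "content": "You are a helpful assistant. ..."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"city": "Tokyo"}}},
    ]},
    {"role": "tool",
     "content": '{"city": "Tokyo", "temperature": 22, "condition": "sunny"}'},
]

roles = [m["role"] for m in history]
```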

Multi-Step Agent Patterns

Real agents often need to chain multiple tool calls where later calls depend on earlier results:

# Example: "Find restaurants near me and check the weather there"
# Step 1: Model calls web_search("restaurants near me")
# Step 2: Model sees results, calls get_weather(city from results)
# Step 3: Model combines both into a recommendation

# The agent loop handles this naturally — each tool result
# gets added to the conversation, and the model decides
# what to do next based on the full history

Structured Output for Reliable Tool Use

Gemma 4 supports structured output, which means you can force the model to respond in a specific JSON format. This is useful when you need the agent's final answer in a machine-readable structure:

response = requests.post(OLLAMA_URL, json={
    "model": MODEL,
    "messages": messages,
    "format": {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "confidence": {"type": "number"},
            "sources": {
                "type": "array",
                "items": {"type": "string"}
            }
        },
        "required": ["answer", "confidence"]
    },
    "stream": False
})

This guarantees the output is valid JSON matching your schema — no more parsing free-form text and hoping for the best.
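One wrinkle: even with format set, the JSON typically arrives as a string in message["content"], so you still parse it once. A small sketch (field names taken from the schema above; parse_structured is a hypothetical helper):

```python
import json

def parse_structured(raw_content: str) -> dict:
    """Parse the model's structured reply and guard the required fields."""
    data = json.loads(raw_content)
    # The schema marks these as required; fail loudly if they're missing
    assert "answer" in data and "confidence" in data, "schema violation"
    data.setdefault("sources", [])  # optional field, default to empty
    return data

# Usage (assuming `response` is the requests.Response from the call above):
# parsed = parse_structured(response.json()["message"]["content"])
```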

Tips for Better Function Calling

  • Write detailed tool descriptions: the model relies on descriptions to choose the right tool
  • Use required fields in schemas: prevents the model from omitting critical parameters
  • Limit tool count to 5-10: too many tools confuse the model
  • Include examples in descriptions: "e.g., 'New York', 'London'" helps the model format args correctly
  • Handle errors gracefully: return error info so the model can try again or inform the user
  • Set a max_steps limit: prevents infinite loops if the model keeps calling tools

Common Pitfalls

The model ignores tools and answers directly: Make your system prompt explicitly tell the model to use tools. Add "NEVER guess weather/math/facts — always use the provided tools."

Wrong argument types: If the model sends "3" instead of 3 for an integer parameter, add type coercion in your tool implementations.
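A lenient coercion pass can normalize these before dispatch. This is a sketch; coerce_args is a hypothetical helper, and parameter specs follow the JSON-schema shape used in the tool definitions earlier:

```python
# Coerce model-produced argument values toward the declared schema types,
# e.g. "3" -> 3 for an integer parameter. Values that can't be coerced are
# left as-is for validation to flag.
def coerce_args(parameters: dict, args: dict) -> dict:
    props = parameters.get("properties", {})
    coerced = {}
    for name, value in args.items():
        want = props.get(name, {}).get("type")
        try:
            if want == "integer" and not isinstance(value, int):
                value = int(value)
            elif want == "number" and not isinstance(value, (int, float)):
                value = float(value)
            elif want == "boolean" and isinstance(value, str):
                value = value.lower() in ("true", "1", "yes")
            elif want == "string" and not isinstance(value, str):
                value = str(value)
        except (TypeError, ValueError):
            pass  # leave unchanged; downstream validation reports it
        coerced[name] = value
    return coerced
```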

The model calls tools unnecessarily: Be specific in tool descriptions about when to use them. "Use ONLY when the user explicitly asks about weather" is better than "Get weather information."

Next Steps

  • Want to learn the Ollama API basics first? Check our Gemma 4 API Tutorial for a gentler introduction
  • Need to pick the right model for agent workloads? The 12B model is the sweet spot — see Which Gemma 4 Model?
  • Building multimodal agents? Read the Multimodal Guide to add vision capabilities to your agent

Function calling transforms Gemma 4 from a chatbot into an agent. Once you wire up real APIs for weather, search, databases, and email, you have an assistant that can take meaningful actions in the real world, all running locally on your own hardware.
