Gemma 4 ships with native function calling support, which means you can build AI agents that don't just generate text — they take actions. Call APIs, query databases, search the web, run calculations. This guide shows you how to build a working agent from scratch using Gemma 4 and Python.
## What Is Function Calling?
Function calling lets the model decide when to use external tools instead of guessing an answer. Instead of asking Gemma 4 "what's the weather in Tokyo?" and getting a hallucinated response, the model outputs a structured request to call your weather API, you execute it, feed the result back, and the model generates a natural language answer based on real data.
The flow looks like this:
```
User: "What's the weather in Tokyo?"
→ Model: {"function": "get_weather", "args": {"city": "Tokyo"}}
→ Your code calls the weather API → returns {"temp": 22, "condition": "sunny"}
→ Model: "It's 22°C and sunny in Tokyo right now."
```

## Defining Tools with JSON Schema
First, you need to tell Gemma 4 what tools are available. Each tool is defined with a JSON schema that describes its name, purpose, and parameters:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city. Use this when the user asks about weather conditions.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'Tokyo', 'New York'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations. Use for any math the user asks about.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Math expression to evaluate, e.g. '2 + 2', 'sqrt(144)'"
                    }
                },
                "required": ["expression"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information. Use when the user asks about recent events or facts you're unsure about.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (default: 3)"
                    }
                },
                "required": ["query"]
            }
        }
    }
]
```

Good tool descriptions are critical. The model uses them to decide which tool to call and when. Be specific about when each tool should be used.
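Because the model, not your code, fills in the arguments, it's worth checking them against the schema before dispatching. Here's a minimal hand-rolled check (a sketch; `validate_args` is a hypothetical helper, not part of any API, and a library like `jsonschema` would be more thorough):

```python
def validate_args(tool_schema: dict, args: dict) -> list:
    """Return a list of problems with the model-supplied arguments
    (an empty list means the call looks valid)."""
    params = tool_schema["function"]["parameters"]
    problems = []
    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected parameter: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems

# Example (mirrors the get_weather schema above):
weather_tool = {"function": {"parameters": {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}}}
print(validate_args(weather_tool, {"unit": "kelvin"}))
# → ['missing required parameter: city', "unit must be one of ['celsius', 'fahrenheit']"]
```

Run it before the `TOOL_MAP` dispatch in the agent loop below, and feed any problems back to the model as a tool error so it can retry.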
## Building the Agent Loop
Here's a complete Python agent that uses Gemma 4 through the Ollama API:
```python
import json
import math

import requests

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "gemma4:12b"

# Define your tool implementations
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Simulated weather API; replace with a real API call."""
    # In production, call OpenWeatherMap, WeatherAPI, etc.
    return {
        "city": city,
        "temperature": 22,
        "unit": unit,
        "condition": "sunny",
        "humidity": 45,
    }

def calculate(expression: str) -> dict:
    """Restricted math evaluation (eval with no builtins;
    still treat the input as untrusted)."""
    allowed_names = {
        "sqrt": math.sqrt,
        "sin": math.sin,
        "cos": math.cos,
        "pi": math.pi,
        "e": math.e,
        "abs": abs,
        "round": round,
    }
    try:
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return {"expression": expression, "result": result}
    except Exception as e:
        return {"expression": expression, "error": str(e)}

def web_search(query: str, num_results: int = 3) -> dict:
    """Simulated web search; replace with a real search API."""
    return {
        "query": query,
        "results": [
            {"title": f"Result about {query}", "snippet": "Relevant information..."}
        ],
    }

# Map function names to implementations
TOOL_MAP = {
    "get_weather": get_weather,
    "calculate": calculate,
    "web_search": web_search,
}

def call_gemma(messages: list, tools: list) -> dict:
    """Send a chat request to Gemma 4 via Ollama."""
    response = requests.post(OLLAMA_URL, json={
        "model": MODEL,
        "messages": messages,
        "tools": tools,
        "stream": False,
    })
    response.raise_for_status()
    return response.json()

def run_agent(user_input: str, tools: list, max_steps: int = 5):
    """Run the agent loop with multi-step tool use."""
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant. Use the provided tools "
                       "when needed. Always use tools for weather, calculations, "
                       "and current information instead of guessing.",
        },
        {"role": "user", "content": user_input},
    ]

    for step in range(max_steps):
        response = call_gemma(messages, tools)
        message = response["message"]

        # Check if the model wants to call a tool
        if message.get("tool_calls"):
            # Record the assistant turn once, then process each tool call
            messages.append(message)
            for tool_call in message["tool_calls"]:
                func_name = tool_call["function"]["name"]
                func_args = tool_call["function"]["arguments"]
                print(f" → Calling {func_name}({func_args})")

                # Execute the tool
                if func_name in TOOL_MAP:
                    result = TOOL_MAP[func_name](**func_args)
                else:
                    result = {"error": f"Unknown tool: {func_name}"}

                # Feed the tool result back into the conversation
                messages.append({
                    "role": "tool",
                    "content": json.dumps(result),
                })
        else:
            # Model gave a final text response
            print(f"Agent: {message['content']}")
            return message["content"]

    return "Agent reached maximum steps without completing."

# Run the agent
if __name__ == "__main__":
    run_agent("What's the weather in Tokyo and what's 15% of 8500?", tools)
```

This agent can handle multi-step queries. When you ask "What's the weather in Tokyo and what's 15% of 8500?", Gemma 4 will call get_weather and calculate separately, then combine both results into a single natural response.
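For reference, here is roughly what a tool-call turn from Ollama's `/api/chat` looks like once parsed (an illustrative shape, hedged against version differences; the example payload below is made up). Note that `arguments` arrives as an already-parsed object, so you don't need `json.loads` on it:

```python
import json

# Hypothetical /api/chat response containing a tool call
raw = """
{
  "model": "gemma4:12b",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {"function": {"name": "get_weather",
                    "arguments": {"city": "Tokyo", "unit": "celsius"}}}
    ]
  },
  "done": true
}
"""

response = json.loads(raw)
for call in response["message"].get("tool_calls", []):
    name = call["function"]["name"]
    args = call["function"]["arguments"]  # a dict, not a JSON string
    print(name, args)
```

If a response has no `tool_calls` key at all, the `.get(..., [])` fallback keeps the loop from raising, which is exactly the "final answer" branch the agent loop above relies on.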
## Multi-Step Agent Patterns
Real agents often need to chain multiple tool calls where later calls depend on earlier results:
```python
# Example: "Find restaurants near me and check the weather there"
# Step 1: Model calls web_search("restaurants near me")
# Step 2: Model sees results, calls get_weather(city from results)
# Step 3: Model combines both into a recommendation

# The agent loop handles this naturally: each tool result
# gets added to the conversation, and the model decides
# what to do next based on the full history
```

## Structured Output for Reliable Tool Use
Gemma 4 supports structured output, which means you can force the model to respond in a specific JSON format. This is useful when you need the agent's final answer in a machine-readable structure:
```python
response = requests.post(OLLAMA_URL, json={
    "model": MODEL,
    "messages": messages,
    "format": {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "confidence": {"type": "number"},
            "sources": {
                "type": "array",
                "items": {"type": "string"}
            }
        },
        "required": ["answer", "confidence"]
    },
    "stream": False
})
```

This guarantees the output is valid JSON matching your schema: no more parsing free-form text and hoping for the best.
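One thing the `format` parameter does not do is unwrap the JSON for you: the reply still arrives as a string in `message["content"]`. A small sketch of parsing it and checking the fields you marked required (the `content` string here is a made-up example reply):

```python
import json

# Hypothetical structured reply from the model
content = '{"answer": "It is 22°C and sunny in Tokyo.", "confidence": 0.9}'

data = json.loads(content)
for key in ("answer", "confidence"):
    if key not in data:
        raise ValueError(f"missing required field: {key}")

print(data["answer"], data["confidence"])
```

Checking required fields is belt-and-braces here, but it gives you one place to fail loudly if the backend or model ever changes underneath you.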
## Tips for Better Function Calling
| Tip | Why |
|---|---|
| Write detailed tool descriptions | The model relies on descriptions to choose the right tool |
| Use `required` fields in schemas | Prevents the model from omitting critical parameters |
| Limit tool count to 5-10 | Too many tools confuse the model |
| Include examples in descriptions | "e.g., 'New York', 'London'" helps the model format args correctly |
| Handle errors gracefully | Return error info so the model can try again or inform the user |
| Set a max_steps limit | Prevents infinite loops if the model keeps calling tools |
## Common Pitfalls
**The model ignores tools and answers directly:** Make your system prompt explicitly tell the model to use tools. Add "NEVER guess weather/math/facts; always use the provided tools."
**Wrong argument types:** If the model sends `"3"` instead of `3` for an integer parameter, add type coercion in your tool implementations.
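One way to do that coercion generically is to cast each argument to the type declared in the tool's JSON schema before dispatch. A minimal sketch (`coerce_args` is a hypothetical helper; it deliberately skips booleans, where `bool("false")` would do the wrong thing):

```python
def coerce_args(args: dict, properties: dict) -> dict:
    """Best-effort coercion of model-supplied argument values to the
    types declared in the tool's JSON schema (sketch, not exhaustive)."""
    casts = {"integer": int, "number": float, "string": str}
    coerced = {}
    for name, value in args.items():
        declared = properties.get(name, {}).get("type")
        cast = casts.get(declared)
        if cast is not None and not isinstance(value, cast):
            try:
                value = cast(value)
            except (TypeError, ValueError):
                pass  # leave it; the tool can report the bad value
        coerced[name] = value
    return coerced

props = {"query": {"type": "string"}, "num_results": {"type": "integer"}}
print(coerce_args({"query": "gemma", "num_results": "3"}, props))
# → {'query': 'gemma', 'num_results': 3}
```

Call it in the agent loop right before `TOOL_MAP[func_name](**func_args)`, passing in that tool's `properties` object.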
**The model calls tools unnecessarily:** Be specific in tool descriptions about when to use them. "Use ONLY when the user explicitly asks about weather" is better than "Get weather information."
## Next Steps
- Want to learn the Ollama API basics first? Check our Gemma 4 API Tutorial for a gentler introduction
- Need to pick the right model for agent workloads? The 12B model is the sweet spot — see Which Gemma 4 Model?
- Building multimodal agents? Read the Multimodal Guide to add vision capabilities to your agent
Function calling transforms Gemma 4 from a chatbot into a tool-using agent. Once you wire up real APIs (weather, search, databases, email), you've got an AI agent that can take meaningful actions in the real world, all running locally on your own hardware.