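Getting JSON Out of Gemma 4: A Practical Guide to Structured Output

If you are integrating Gemma 4 into an application, you need structured output: not free text, but JSON that parses every time.

This is one of the trickiest problems with local LLMs, but with the right approach Gemma 4 can be quite reliable.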

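Why you need structured output

When you use Gemma 4 as a system component rather than just a chatbot, you need predictable output:

# What you want:
{"sentiment": "positive", "confidence": 0.92, "topics": ["pricing", "customer service"]}

# What you don't want:
"The sentiment of this text is positive, with a confidence of about 92%..."

The first can be parsed and used directly. The second needs another round of parsing, which adds latency and failure points.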

Method 1: System prompt

The simplest approach is to tell the model exactly what you want in the system prompt:

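import requests
import json

# Ask a local Ollama server for a sentiment analysis, relying on the prompt alone
response = requests.post("http://localhost:11434/api/chat", json={
    "model": "gemma4:26b",
    "messages": [
        {
            "role": "system",
            "content": """You are an API that returns only JSON.
You must return valid JSON and nothing else.
No markdown, no explanations, no code fences, just raw JSON.

Format:
{
  "sentiment": "positive" | "negative" | "neutral",
  "confidence": a number between 0 and 1,
  "topics": an array of strings,
  "summary": a one-sentence summary
}"""
        },
        {
            "role": "user",
            "content": "Analyze: 'The new update is great! The interface is cleaner and loading is faster. My only complaint is the price increase.'"
        }
    ],
    "stream": False,
})
response.raise_for_status()  # surface HTTP errors before touching the body

# Raises json.JSONDecodeError if the model wrapped the JSON in any extra text
result = json.loads(response.json()["message"]["content"])
print(result)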

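This works most of the time, but "most of the time" is not good enough in production. The model will occasionally prepend something like "Here is the JSON:" or wrap the output in a markdown code block.

Method 2: Ollama's format parameter

Ollama has a built-in format parameter that forces valid JSON output: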

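response = requests.post("http://localhost:11434/api/chat", json={
    "model": "gemma4:26b",
    "messages": [
        {
            "role": "system",
            "content": "Analyze the sentiment of the given text. Return a JSON object with: sentiment (positive/negative/neutral), confidence (0-1), topics (list), summary (one sentence)."
        },
        {
            "role": "user",
            "content": "The customer service was terrible, but the product itself is great."
        }
    ],
    "format": "json",  # constrain generation to syntactically valid JSON
    "stream": False,
})

# The content is valid JSON, but its fields are not guaranteed to match your schema
result = response.json()["message"]["content"]
parsed = json.loads(result)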

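Setting format to "json" makes Ollama constrain token generation so that only syntactically valid JSON can be produced, which is far more reliable than relying on the prompt alone. Ollama's documentation also recommends still instructing the model to respond in JSON in the prompt itself when using this mode.

Limitation: this guarantees correct JSON syntax, but not the correct schema. The model may return {"answer": "positive"} instead of the format you expect, so you still need validation.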
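Newer Ollama releases also accept a JSON Schema object in format instead of the string "json", which constrains field names and types as well as syntax. The sketch below assumes your Ollama version supports this. SENTIMENT_SCHEMA, build_payload, and chat are illustrative names rather than part of any library, and the example sticks to the standard library so it does not depend on requests.

```python
import json
import urllib.request

# Hand-written JSON Schema for the sentiment result used in this guide.
SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number"},
        "topics": {"type": "array", "items": {"type": "string"}},
        "summary": {"type": "string"},
    },
    "required": ["sentiment", "confidence", "topics", "summary"],
}


def build_payload(text: str, model: str = "gemma4:26b") -> dict:
    """Build an /api/chat request body whose `format` is a schema, not "json"."""
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Analyze the sentiment of the given text and reply in JSON.",
            },
            {"role": "user", "content": text},
        ],
        "format": SENTIMENT_SCHEMA,
        "stream": False,
    }


def chat(payload: dict, url: str = "http://localhost:11434/api/chat") -> dict:
    """POST the payload to a local Ollama server and parse the model's JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return json.loads(body["message"]["content"])
```

Calling chat(build_payload("...")) requires a running Ollama server. Even with a schema in format, validating the parsed result is still worthwhile, since constraints such as confidence lying between 0 and 1 may not be enforced during generation.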

Method 3: Pydantic schema validation

In production code, define the schema with Pydantic and validate against it:

from pydantic import BaseModel, Field
from typing import Literal
import json
import requests

class SentimentResult(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float = Field(ge=0, le=1)
    topics: list[str]
    summary: str

def analyze_sentiment(text: str) -> SentimentResult:
    # Embed the model's JSON Schema in the prompt so the LLM sees the exact shape
    schema_str = json.dumps(SentimentResult.model_json_schema(), indent=2)

    response = requests.post("http://localhost:11434/api/chat", json={
        "model": "gemma4:26b",
        "messages": [
            {
                "role": "system",
                "content": f"""Return only JSON that conforms to the following schema:
{schema_str}

No other text. Only valid JSON."""
            },
            {
                "role": "user",
                "content": f"Analyze this text: {text}"
            }
        ],
        "format": "json",
        "stream": False,
    })

    # Parse first, then validate: raises ValidationError on a schema mismatch
    raw = json.loads(response.json()["message"]["content"])
    return SentimentResult.model_validate(raw)

# Usage
result = analyze_sentiment("The product is good, but shipping was too slow.")
print(f"Sentiment: {result.sentiment} ({result.confidence:.0%})")
print(f"Topics: {', '.join(result.topics)}")

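This gives you type safety plus validation. If the model returns something wrong, Pydantic raises a clear error instead of silently polluting your data.

Method 4: Validation + retries

For maximum reliability, add a retry loop: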

from pydantic import BaseModel, Field, ValidationError
import json
import requests
import time

def get_structured_output(
    prompt: str,
    schema_class: type[BaseModel],
    model: str = "gemma4:26b",
    max_retries: int = 3,
) -> BaseModel:
    schema_str = json.dumps(schema_class.model_json_schema(), indent=2)

    for attempt in range(max_retries):
        try:
            response = requests.post("http://localhost:11434/api/chat", json={
                "model": model,
                "messages": [
                    {
                        "role": "system",
                        "content": f"Return only JSON that conforms to the following schema:\n{schema_str}"
                    },
                    {"role": "user", "content": prompt}
                ],
                "format": "json",
                "stream": False,
                "options": {
                    # Low temperature first; slightly higher on retries
                    "temperature": 0.1 if attempt == 0 else 0.3,
                },
            })

            raw = json.loads(response.json()["message"]["content"])
            return schema_class.model_validate(raw)

        except (json.JSONDecodeError, ValidationError) as e:
            if attempt == max_retries - 1:
                raise ValueError(
                    f"No valid output after {max_retries} attempts: {e}"
                ) from e
            time.sleep(0.5)  # brief pause before the next attempt

    raise ValueError("unreachable")

# Usage
class ProductReview(BaseModel):
    rating: int = Field(ge=1, le=5)
    pros: list[str]
    cons: list[str]
    recommendation: bool

review = get_structured_output(
    "Review: 'Nice laptop, the keyboard feels good, battery life is a bit weak. 4/5, would buy again.'",
    ProductReview,
)

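Design notes:

The first attempt uses a low temperature (0.1) for consistency; retries raise it to 0.3 for more variety
format: "json" guarantees valid syntax
Pydantic checks that the output matches the schema
At most 3 attempts: if all 3 fail, the prompt needs work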
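The retry logic in these design notes can be separated from the HTTP call, which makes it easy to exercise without a running model. This is a minimal sketch under that assumption: retry_structured, temperature_for, call_model, and validate are hypothetical names, call_model stands in for whatever function sends the request at a given temperature, and validate is any callable that raises ValueError on bad data.

```python
import json
from typing import Any, Callable


def temperature_for(attempt: int) -> float:
    """Low temperature on the first attempt, slightly higher on retries."""
    return 0.1 if attempt == 0 else 0.3


def retry_structured(
    call_model: Callable[[float], str],
    validate: Callable[[Any], Any],
    max_attempts: int = 3,
) -> Any:
    """Call the model until its output parses and validates, or give up."""
    last_error = None
    for attempt in range(max_attempts):
        raw_text = call_model(temperature_for(attempt))
        try:
            # json.JSONDecodeError is a subclass of ValueError, so one
            # except clause covers both parse and validation failures.
            return validate(json.loads(raw_text))
        except ValueError as error:
            last_error = error
    raise ValueError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Because the model call is injected, a test can pass a fake call_model that returns canned strings and check both the result and the temperatures that were requested.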

Common problems and fixes

The model wraps JSON in markdown:

Fix: use Ollama's format: "json". If that is not available, strip the markdown manually:

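def clean_json(text: str) -> str:
    """Strip a surrounding markdown code fence, assuming it sits on its own lines."""
    text = text.strip()
    if text.startswith("```"):
        # Drop the opening fence line (``` or ```json)
        text = text.split("\n", 1)[1]
        # Drop the closing fence and anything after it
        text = text.rsplit("```", 1)[0]
    return text.strip()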
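Fence stripping does not help when the model prepends a sentence such as "Here is the JSON:". One alternative, sketched here with the standard library's json.JSONDecoder.raw_decode, is to scan for the first position where a complete JSON object parses. The function name extract_first_json_object is made up for this example.

```python
import json


def extract_first_json_object(text: str) -> dict:
    """Return the first parseable JSON object embedded anywhere in `text`."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it (closing fences, trailing prose).
            parsed, _end = decoder.raw_decode(text, start)
            return parsed
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```

This tolerates leading prose, code fences, and trailing text, at the cost of silently accepting output that was not pure JSON, so it is better suited as a fallback than as the primary path.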

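The model returns extra fields: Pydantic ignores extra fields by default. To be strict, set model_config = ConfigDict(extra="forbid").

Wrong types: the model may return "0.92" (a string) instead of 0.92 (a number). Pydantic's model_validate handles most of these conversions automatically.

Empty values or null: make fields that may be empty optional: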

class Result(BaseModel):
    name: str
    email: str | None = None  # an email may not be found
    topics: list[str] = []    # defaults to an empty list

Nested objects: Gemma 4 handles nested JSON fine, but it is best to keep nesting within 2-3 levels:

class Address(BaseModel):
    city: str
    country: str

class Person(BaseModel):
    name: str
    age: int
    address: Address  # one level of nesting, which is fine

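Performance tips

Low temperatures (0.1-0.3) produce more stable JSON
Keep the schema lean: don't ask for 20 fields at once
Adding few-shot examples to the system prompt can significantly improve reliability
The 26B model is much better at JSON than E4B (see the model comparison)
Thinking mode handles complex schemas better (see the thinking mode guide)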
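As an illustration of the few-shot tip, the sketch below assembles a system prompt that contains input/output example pairs ahead of the real input. The helper name build_few_shot_messages and the "Input:" / "Output:" layout are assumptions made for this example, not a format that Gemma 4 or Ollama requires.

```python
import json


def build_few_shot_messages(instructions, examples, user_text):
    """Embed example input/output pairs in the system prompt, then add the real input."""
    lines = [instructions, "", "Examples:"]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        # ensure_ascii=False keeps non-ASCII text readable in the prompt
        lines.append(f"Output: {json.dumps(example_output, ensure_ascii=False)}")
    return [
        {"role": "system", "content": "\n".join(lines)},
        {"role": "user", "content": user_text},
    ]


messages = build_few_shot_messages(
    "Return only JSON with the fields sentiment and confidence.",
    [
        ("Love it, works perfectly.", {"sentiment": "positive", "confidence": 0.9}),
        ("Broke after two days.", {"sentiment": "negative", "confidence": 0.85}),
    ],
    "Decent, but overpriced.",
)
```

The resulting messages list can be passed as the "messages" field of the /api/chat request used throughout this guide. One or two short examples are usually enough to show the shape without bloating the prompt.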

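Next steps

Use JSON output in your application: Ollama API tutorial
Deploy a JSON API server: vLLM + Docker
Fine-tune Gemma 4 for a specific JSON format
Use thinking mode for complex structured tasks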
