Fix Gemma4ClippableLinear Not Supported: QLoRA + PEFT

You spin up a QLoRA run on Gemma 4, call get_peft_model(...), and instead of a training progress bar you get a wall of red ending in "Target module Gemma4ClippableLinear(...) is not supported". The model loads fine, bitsandbytes quantizes fine, and then PEFT refuses to attach a single LoRA adapter.

This is a real, reproducible bug — it was filed as PEFT issue #3129 on day zero of Gemma 4's release and confirmed across several independent fine-tuning writeups. The good news: the root cause is narrow and there are three clean fixes depending on which PEFT version you can run. Let's break it down.

What the "Gemma4ClippableLinear is not supported" error looks like

The traceback shows up the moment PEFT tries to wrap your target modules:

ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear4bit(in_features=768, out_features=768, bias=False)
) is not supported. Currently, only the following modules are supported:
`torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv1d`, `torch.nn.Conv2d`,
`torch.nn.Conv3d`, `transformers.pytorch_utils.Conv1D`,
`torch.nn.MultiheadAttention`.

The Linear4bit you see inside is just the bitsandbytes 4-bit layer from QLoRA — in a full-precision LoRA run it would read Linear instead. Either way, the outer wrapper is Gemma4ClippableLinear, and that is the layer PEFT chokes on.

It typically fires when you fine-tune one of the Gemma 4 multimodal checkpoints — google/gemma-4-31B, gemma-4-E2B-it, gemma-4-E4B, or the gemma-4-26B-A4B MoE — with an explicit target_modules list that recurses into the vision or audio encoder. If you have hit other day-zero issues too, our Gemma 4 troubleshooting guide covers the broader set.

Why PEFT rejects Gemma4ClippableLinear: nn.Module vs nn.Linear

Gemma 4 introduced a custom layer called Gemma4ClippableLinear inside its vision and audio encoders. It wraps an ordinary nn.Linear (a Linear4bit under QLoRA) and adds optional input/output clamping for numerical stability. The catch is one line in its class definition: it subclasses nn.Module, not nn.Linear.

PEFT injects LoRA through a dispatch step (_create_new_module in LoraModel) that runs a strict allow-list check on every module your target_modules matches. Only nn.Linear, nn.Embedding, the nn.ConvNd family, the transformers Conv1D, and nn.MultiheadAttention pass. Gemma4ClippableLinear is none of those, so PEFT raises immediately.
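To make the subclassing point concrete, here is a toy model of an allow-list type check. The classes are plain-Python stand-ins invented for illustration (they are not torch or PEFT types), and is_supported is a hypothetical helper rather than PEFT's actual dispatch code.

```python
# Stand-in classes for illustration only; none of these are real torch/PEFT types.
class Module:
    """Plays the role of nn.Module."""


class Linear(Module):
    """Plays the role of nn.Linear."""


class ClippableLinear(Module):
    """Plays the role of Gemma4ClippableLinear: it wraps a Linear but is not one."""

    def __init__(self):
        self.linear = Linear()


# Hypothetical allow-list, mirroring the idea of a supported-module check.
SUPPORTED = (Linear,)


def is_supported(module):
    # isinstance looks only at the class hierarchy, not at what the module contains.
    return isinstance(module, SUPPORTED)


wrapper = ClippableLinear()
print(is_supported(wrapper))         # False: the wrapper itself is not a Linear
print(is_supported(wrapper.linear))  # True: the layer it holds is
```

The wrapper fails the check even though the layer inside it would pass, which is the situation the traceback describes.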

The nasty part: this type check runs before exclude_modules is applied. So even if you try to exclude the vision tower explicitly, you can't — PEFT rejects the layer before it ever reaches the exclusion logic. That is why people who only want to train the text decoder still trip over a vision/audio layer they never intended to touch.
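A text-only name list can still land on tower layers because PEFT treats each entry of a target_modules list as a module-name suffix. The sketch below mimics that rule in plain Python. The module names are hypothetical examples, not names read from a real Gemma 4 checkpoint, and matches_target is an illustrative helper, not a PEFT function.

```python
def matches_target(module_name, target_modules):
    """Suffix rule: an entry matches the exact name or any name ending in '.<entry>'."""
    return any(
        module_name == target or module_name.endswith("." + target)
        for target in target_modules
    )


# Hypothetical module names, invented for illustration.
names = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.mlp.down_proj",
    "model.language_model.layers.0.input_layernorm",
]

hits = [n for n in names if matches_target(n, ["q_proj", "down_proj"])]
for name in hits:
    print(name)
```

Both q_proj entries match, so under this rule the tower layer is selected alongside the decoder layer. Running the same comprehension over the names from model.named_modules() on your own checkpoint shows which layers a given list would really reach.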

The three working fixes, compared

Fix	Needs source edit?	PEFT version	Keeps Google's clip bounds?	Risk of LoRA-ing the vision/audio tower
1. Upgrade PEFT to 0.19+	No	≥ 0.19.0	Yes	None (regex limits to LM layers)
2. Unwrap to inner `.linear`	No (runtime patch)	Any	No (clamp dropped)	None — you target the inner linear
3. Monkey-patch the class	No (runtime patch)	Any	Yes, if you copy the clamp	Depends on your `target_modules`

Pick based on the PEFT version you can actually install: on 0.19.0 or newer, use Fix 1 and stop reading. If you are pinned to an older PEFT (or an older torch), Fix 2 is the simplest, and Fix 3 preserves clamping if you need it.

Fix 1 — Upgrade PEFT to 0.19+ and let it pick target modules

As of PEFT 0.19.0 (released April 2026), the library ships built-in default target_modules for Gemma 4 — a regex that scopes adapters to the language-model layers and skips the vision/audio ClippableLinear wrappers entirely. The cleanest fix is to upgrade and simply omit target_modules so the defaults kick in:

pip install -U "peft>=0.19.0"

from peft import LoraConfig

lora = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # no target_modules — PEFT's Gemma 4 defaults scope to the LM layers
)

This is the official, lowest-friction path and it leaves the clamp logic untouched. One caveat worth verifying yourself: some reports note PEFT 0.19.0 wants torch >= 2.7, so if you are stuck on torch 2.6 you may need Fix 2 or 3 instead. Always skim the PEFT 0.19.0 release notes before pinning.
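If you want the version decision written down, a minimal sketch might look like this. Both helpers are hypothetical, the version parsing is deliberately simplified (it ignores suffixes such as .dev0 and +cu128), and the torch 2.7 floor is the reported figure from above, not a confirmed requirement.

```python
import re


def version_tuple(version):
    """Parse the leading numeric part of a version string into a 3-tuple.

    Suffixes such as '.dev0' or '+cu128' are ignored, which is a simplification.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))[:3]
    return parts + (0,) * (3 - len(parts))


def suggest_fix(peft_version, torch_version):
    """Rule of thumb from this article; the torch 2.7 floor is reported, not confirmed."""
    if version_tuple(peft_version) >= (0, 19, 0):
        return "Fix 1: omit target_modules"
    if version_tuple(torch_version) >= (2, 7, 0):
        return "Fix 1: upgrade PEFT, then omit target_modules"
    return "Fix 2 or Fix 3: runtime patch"


print(suggest_fix("0.18.2.dev0", "2.8.0+cu128"))
print(suggest_fix("0.18.2", "2.6.0"))
```

The first call uses the versions from the original bug report and points at upgrading; the second represents a torch 2.6 pin and falls through to the runtime patches. The helper cannot know about other reasons you might be pinned, so treat its answer as a starting point.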

Fix 2 — Unwrap ClippableLinear to its inner linear (no upgrade)

If you can't upgrade, the most direct workaround is to walk the loaded model and replace every Gemma4ClippableLinear with the plain nn.Linear (or Linear4bit) it already holds inside. Do this after from_pretrained and before get_peft_model:

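from peft import get_peft_model
from transformers.models.gemma4.modeling_gemma4 import Gemma4ClippableLinear

# `model` is the loaded checkpoint and `lora_config` is your LoraConfig.
# Snapshot the module list first, because the loop mutates the module tree.
for name, module in list(model.named_modules()):
    if isinstance(module, Gemma4ClippableLinear):
        parent_name, _, child_name = name.rpartition(".")
        parent = model.get_submodule(parent_name)  # "" returns the model itself
        setattr(parent, child_name, module.linear)  # swap in the inner linear

model = get_peft_model(model, lora_config)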

PEFT now sees a normal linear and attaches adapters happily. The trade-off: unwrapping discards the clamp behavior. For pure text-decoder fine-tuning that is usually harmless, but if you train or run the vision/audio path, the missing numerical clamp can affect stability. This approach comes straight from a real Gemma 4 fine-tuning pipeline writeup.
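Before calling get_peft_model, it can be worth confirming that no wrapper survived the unwrap. The helper below is a hypothetical sketch: it relies only on named_modules(), which every torch.nn.Module provides, and compares class names as strings so it does not need to import the Gemma 4 modeling file.

```python
def leftover_wrappers(model, wrapper_name="Gemma4ClippableLinear"):
    """Return the names of modules whose class name still equals wrapper_name."""
    return [
        name
        for name, module in model.named_modules()
        if type(module).__name__ == wrapper_name
    ]
```

Called right after the unwrap loop, leftover_wrappers(model) should come back as an empty list. Any names it does return are layers that PEFT would still reject if your target_modules matches them.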

Fix 3 — Monkey-patch ClippableLinear to inherit nn.Linear

The third option keeps clamping by redefining the class so it subclasses nn.Linear, which makes PEFT's type check pass. You must patch before loading the model:

import torch
import torch.nn as nn
from transformers.models.gemma4 import modeling_gemma4

class PatchedClippableLinear(nn.Linear):
    # Skeleton only: match the constructor signature in your installed modeling_gemma4.py.
    def __init__(self, in_features, out_features, bias=False):
        super().__init__(in_features, out_features, bias=bias)
        # re-register the clamp buffers exactly as the upstream layer defines them
        # NOTE: the upstream layer keeps its weight on an inner `.linear` (see the
        # traceback above), so confirm the checkpoint weights still load into this class

    def forward(self, x):
        # replicate the upstream input/output clamp here, then:
        return super().forward(x)

# must run before AutoModelForCausalLM.from_pretrained(...)
modeling_gemma4.Gemma4ClippableLinear = PatchedClippableLinear

This is the bridge fix used in the original issue #3129 thread. It works without touching PEFT, but it is fragile: the constructor signature and clamp buffer names must match the upstream modeling_gemma4.py exactly, and a transformers upgrade can break it. Copy the real forward/clamp logic from the installed source — the snippet above is a skeleton, not a drop-in.

Minimal reproducible example

Use this to confirm the error, then apply any fix above and re-run:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-31B",          # or gemma-4-E2B-it
    quantization_config=bnb,
    device_map="auto",
)

lora = LoraConfig(
    r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)        # ← ValueError fires here on a broken setup
model.print_trainable_parameters()         # ← prints once a fix is in place

Environment that reproduced it in issue #3129: peft 0.18.2.dev0, transformers 5.5.0.dev0, torch 2.8.0+cu128, bitsandbytes 0.44.x.
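To compare your own setup against that environment, you can read the installed versions with the standard library. This is a small sketch; environment_report is a hypothetical helper name, and it reports "not installed" instead of raising when a package is missing.

```python
from importlib import metadata

PACKAGES = ("peft", "transformers", "torch", "bitsandbytes")


def environment_report(packages=PACKAGES):
    """Map each package name to its installed version, or 'not installed'."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for name, version in environment_report().items():
    print(f"{name}: {version}")
```

Pasting this output into a bug report or forum question saves a round trip, since the fix you need depends mostly on the peft and torch lines.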

What about target_modules="all-linear"?

You will see advice to just set target_modules="all-linear". Treat it with caution. In principle the all-linear macro scans recursively and should target inner linears — but because PEFT's type check runs first and all-linear matches even more layers, it can still hit the same Gemma4ClippableLinear wrapper and raise the identical ValueError, and it may pull the vision/audio towers into training too. There is no clean code-level confirmation that it dodges this specific bug, so don't rely on it as your primary fix — prefer Fix 1, 2, or 3.

Edge case: when you actually want to LoRA the vision or audio towers

The PEFT 0.19 defaults deliberately skip the multimodal towers. If your goal is to adapt the vision or audio encoder, the defaults won't reach those layers, so you'll need an explicit target_modules regex for the encoder submodules. If you leave the wrappers in place, the regex has to point at each ClippableLinear's inner .linear. If you run the unwrap loop from Fix 2 first, it names the encoder projections directly, and you take on the clamp trade-off yourself.
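When target_modules is a single string, PEFT treats it as a regex that must match the full module name, so you can try a pattern out with re.fullmatch before training. The sketch below is illustrative only: the vision_tower and audio_tower prefixes and every module name in the list are assumptions, so check the real names with model.named_modules() first.

```python
import re

# Hypothetical pattern: inner `.linear` layers under a vision or audio tower.
# The `vision_tower` / `audio_tower` prefixes are assumptions, not verified names.
TOWER_INNER_LINEAR = r".*\.(vision_tower|audio_tower)\..*\.linear"

# Hypothetical module names, invented for illustration.
names = [
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",         # the wrapper
    "model.vision_tower.encoder.layers.0.self_attn.q_proj.linear",  # its inner linear
    "model.audio_tower.layers.3.ffw.proj.linear",
    "model.language_model.layers.0.self_attn.q_proj",
]

selected = [n for n in names if re.fullmatch(TOWER_INNER_LINEAR, n)]
for name in selected:
    print(name)
```

Only the two names ending in .linear under a tower are selected. The wrapper itself and the language-model projection are left alone, which is the scoping you want when the wrappers stay in place.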

Day-zero Gemma 4 fine-tuning tends to surface this layer error alongside two siblings:

"transformers does not recognize the gemma4 architecture" — your transformers is too old; upgrade to a build that ships the Gemma 4 modeling files.
"mm_token_type_ids is required" — a multimodal collator/processor mismatch; pass the multimodal token type ids your processor produces.

Clear all three and your QLoRA run starts. For the full setup around LoRA, QLoRA, and Unsloth, see our Gemma 4 fine-tuning guide, and for loading checkpoints correctly, the Hugging Face guide.

FAQ

Why doesn't exclude_modules skip Gemma4ClippableLinear? Because PEFT's type-check runs before exclusion is applied. The layer is rejected before exclude_modules ever gets a chance to filter it out.

Which Gemma 4 sizes are affected? Any multimodal Gemma 4 checkpoint with the vision/audio encoder — 31B dense, E2B/E4B, and the 26B-A4B MoE — when an explicit target_modules recurses into those encoders.

Does the monkey-patch hurt inference accuracy? It can, if you don't faithfully copy the clamp logic. The clamp exists for numerical stability; drop it and the multimodal path may behave differently. For text-only fine-tuning the impact is usually negligible.

What's the safest version combo? PEFT ≥ 0.19.0 with a transformers build that ships Gemma 4 and a matching torch. That lets you skip target_modules entirely and avoid runtime patches — confirm exact minimums in the PEFT and transformers release notes for your install.

What target_modules list should I use for text-only LoRA? The standard attention + MLP projections — q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj — but on PEFT 0.19+ let the defaults handle scoping so you don't accidentally reach the towers.
