
fix: add support for gemma 4 #287

Merged
p-e-w merged 1 commit into p-e-w:master from MoonRide303:gemma-4
Apr 12, 2026

Conversation

@MoonRide303
Contributor

fix proposal for #278

tested locally on gemma-4-E2B-it and Llama-3.2-3B-Instruct

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request updates the LoRA adapter initialization to use full module names for target identification and improves the logging of target types. A high-severity issue was identified where the sorted list of target modules is immediately overwritten by an unsorted list, which should be corrected to ensure deterministic behavior.
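The overwrite the review describes can be sketched as follows. The function and variable names here are hypothetical stand-ins, not the actual heretic code; the point is only that a sorted result is wasted if the next statement replaces it with an unsorted one:

```python
def dedupe_targets(names):
    """Return a de-duplicated, deterministically ordered list of target names.

    Illustrative sketch of the fix the review asks for (names are assumptions,
    not the real heretic identifiers).
    """
    unique = set(names)
    targets = sorted(unique)  # deterministic, reproducible order
    # Buggy variant flagged by the review:
    # targets = list(unique)  # overwrites the sorted list; order not guaranteed
    return targets

print(dedupe_targets(["mlp.down_proj", "attn.o_proj", "attn.o_proj"]))
# -> ['attn.o_proj', 'mlp.down_proj']
```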

@p-e-w
Owner

p-e-w commented Apr 8, 2026

What is the difference between this PR and #285? Which approach do you think is better?

@MoonRide303
Contributor Author

MoonRide303 commented Apr 8, 2026

@p-e-w It's about the same idea; I didn't notice that #285 existed (it wasn't linked in the issue). That said, I'm not sure the change to try_add in #285 is necessary (it doesn't look necessary to me, and #287 worked fine without it), while adjusting the "LoRA adapters initialized" output in #287 is a good thing to have (otherwise you get a dump of a lot of layers), so I'd lean towards this variant.

@p-e-w
Owner

p-e-w commented Apr 8, 2026

But how can this PR work if it doesn't extract the .linear submodule from the Gemma4ClippableLinear module?

@MoonRide303
Contributor Author

MoonRide303 commented Apr 8, 2026

But the selected modules are already Linear. After adding the debug line print(f"{full_name} -> {type(module).__name__}") just after full_name = module_id_to_full_name.get(id(module)), I get this:

> heretic --n-trials 2 --n-startup-trials 1 --row-normalization PRE --batch-size 128 --dtypes bfloat16 --model .\gemma-4-E2B-it
W0408 18:11:59.357000 32456 site-packages\torch\distributed\elastic\multiprocessing\redirects.py:29] NOTE: Redirects are currently not supported in Windows or MacOs.
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀  v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀  https://github.com/p-e-w/heretic

Detected 1 CUDA device(s) (15.99 GB total VRAM):
* GPU 0: NVIDIA GeForce RTX 4080 (15.99 GB)

You have already processed this model. You can show the results from the previous run, allowing you to export models or to run additional trials. Alternatively, you can ignore the previous run and start from
scratch. This will delete the checkpoint file and all results from the previous run.

? How would you like to proceed? Ignore the previous run and start from scratch

Loading model .\gemma-4-E2B-it...
* Trying dtype bfloat16...
model.language_model.layers.0.self_attn.o_proj -> Linear
model.language_model.layers.0.mlp.down_proj -> Linear
model.language_model.layers.1.self_attn.o_proj -> Linear
model.language_model.layers.1.mlp.down_proj -> Linear
model.language_model.layers.2.self_attn.o_proj -> Linear
model.language_model.layers.2.mlp.down_proj -> Linear
model.language_model.layers.3.self_attn.o_proj -> Linear
model.language_model.layers.3.mlp.down_proj -> Linear
model.language_model.layers.4.self_attn.o_proj -> Linear
model.language_model.layers.4.mlp.down_proj -> Linear
model.language_model.layers.5.self_attn.o_proj -> Linear
model.language_model.layers.5.mlp.down_proj -> Linear
model.language_model.layers.6.self_attn.o_proj -> Linear
model.language_model.layers.6.mlp.down_proj -> Linear
model.language_model.layers.7.self_attn.o_proj -> Linear
model.language_model.layers.7.mlp.down_proj -> Linear
model.language_model.layers.8.self_attn.o_proj -> Linear
(...)

Model made with heretic --batch-size 128 --dtypes bfloat16 --model google/gemma-4-E2B-it using heretic-llm from my gemma-4 branch: https://huggingface.co/MoonRide/gemma-4-E2B-it-heretic
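The module_id_to_full_name lookup mentioned above can be sketched like this. The Module class below is a minimal stand-in so the example is self-contained; in the real code the tree would come from model.named_modules() in PyTorch, and the exact heretic implementation may differ:

```python
class Module:
    """Tiny stand-in for torch.nn.Module, just enough to walk a module tree."""

    def __init__(self, **children):
        for name, child in children.items():
            setattr(self, name, child)

    def named_modules(self, prefix=""):
        # Yield (full_dotted_path, module) pairs, mirroring PyTorch's API shape.
        yield prefix, self
        for name, child in vars(self).items():
            if isinstance(child, Module):
                full = f"{prefix}.{name}" if prefix else name
                yield from child.named_modules(full)


o_proj = Module()
model = Module(self_attn=Module(o_proj=o_proj))

# Map each module's identity to its full dotted path, so a module object
# picked up elsewhere can be resolved back to its unambiguous full name.
module_id_to_full_name = {id(m): name for name, m in model.named_modules()}
print(module_id_to_full_name[id(o_proj)])  # -> self_attn.o_proj
```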

@p-e-w
Owner

p-e-w commented Apr 9, 2026

Doesn't that directly contradict the findings in #278? Their output contained things like

model.audio_tower.layers.{0...11}.self_attn.q_proj.linear.weight

Sorry for not digging deeper into this myself, I don't have time for this right now and I'm really confused by the different approaches between this PR and #285, both of which are claiming to be the better solution.

@MoonRide303
Contributor Author

@p-e-w The old (current master / ara) code was using leaf module names, which PEFT then matched across the entire model, also picking up Gemma4ClippableLinear wrappers from the vision sub-model. When we use full module paths, that no longer happens.

@p-e-w
Owner

p-e-w commented Apr 11, 2026

But where is the code that actually registers the inner Linear module under self_attn.o_proj?

For traditional models, we have this:

# Standard self-attention out-projection (most models).
with suppress(Exception):
    try_add("attn.o_proj", layer.self_attn.o_proj)  # ty:ignore[possibly-missing-attribute]

This code is required for abliteration to happen, because only modules returned by get_layer_modules will actually be targeted (see abliterate, which uses module.weight, which Gemma4ClippableLinear doesn't have).

For Gemma 4, the relevant module is not under self_attn.o_proj, but under self_attn.o_proj.linear. So how does this PR actually work?

@MoonRide303
Contributor Author

Unwrapping via .linear would only be needed if we wanted to process Gemma4ClippableLinear modules, which live inside the vision and audio towers. In the text modules, o_proj and down_proj are plain nn.Linear (see Gemma4TextAttention.o_proj and Gemma4TextMLP.down_proj). get_layers() resolves multimodal models to model.language_model.layers (no audio/vision towers), so get_layer_modules() returns only nn.Linear modules there. So for the modules heretic actually targets, the relevant module is under self_attn.o_proj / mlp.down_proj, NOT under .linear.
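The argument above can be sketched in miniature. The class names and paths here are illustrative stand-ins (not the actual heretic or transformers code): once module discovery is restricted to the language-model layers, only plain Linear projections are seen, so no .linear unwrapping is ever needed:

```python
class Linear:
    """Stand-in for torch.nn.Linear."""


class ClippableLinear:
    """Stand-in for Gemma4ClippableLinear: wraps a Linear under .linear."""

    def __init__(self):
        self.linear = Linear()


# Hypothetical flat view of a multimodal model's modules.
modules = {
    "model.language_model.layers.0.self_attn.o_proj": Linear(),
    "model.language_model.layers.0.mlp.down_proj": Linear(),
    "model.vision_tower.layers.0.self_attn.q_proj": ClippableLinear(),
}

# Restricting discovery to the language-model layers (as get_layers() does for
# multimodal models, per the comment above) leaves only plain Linear modules.
targets = {
    name: m
    for name, m in modules.items()
    if name.startswith("model.language_model.") and isinstance(m, Linear)
}
print(sorted(targets))
# -> ['model.language_model.layers.0.mlp.down_proj',
#     'model.language_model.layers.0.self_attn.o_proj']
```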

@p-e-w
Owner

p-e-w commented Apr 12, 2026

Ah, indeed. We definitely don't want to touch the vision and audio towers.

You have convinced me. Please fix CI so this can be merged.

@MoonRide303
Contributor Author

> Please fix CI so this can be merged.

Done.

@p-e-w p-e-w merged commit e2c74bf into p-e-w:master Apr 12, 2026
4 checks passed
@p-e-w
Owner

p-e-w commented Apr 12, 2026

Splendid, merged! Thanks for the PR, this is a solid improvement that I imagine will make Heretic more likely to support other, future architectures out of the box.

@MoonRide303 MoonRide303 deleted the gemma-4 branch April 12, 2026 07:28
MoonRide303 added a commit to MoonRide303/heretic that referenced this pull request Apr 12, 2026
@MoonRide303
Contributor Author

@p-e-w One more thing worth doing for full gemma 4 support out of the box: bumping the transformers dependency to 5.5.0 or newer (I tested on 5.5.1 and 5.5.3).

Side note: models saved with transformers 5.5.2 or newer might require updating the model loader code (5.5.2 no longer saves shared weights). I'd start with 5.5.1 in the dependencies, and once more downstream apps update their dependencies (llama.cpp b8751+, etc.) it will be fine to move to newer releases.

0xA50C1A1 pushed a commit to 0xA50C1A1/heretic that referenced this pull request Apr 12, 2026
