Name and Version
BeeLlama v0.3.0 (Windows x64, CUDA 12 build)
Operating systems
Windows
GGML backends
CUDA
Hardware
CPU: Ryzen 9 7950X3D
GPU: RTX 3080 Ti
RAM: 96GB
VRAM: 12GB
Models
Model:
- gemma-4-12b-it-UD-Q5_K_XL.gguf
Multimodal projector:
- gemma-4-12b-it-mmproj-BF16.gguf
from: https://huggingface.co/unsloth/gemma-4-12b-it-GGUF
Problem description & steps to reproduce
When loading Gemma 4 12B with its multimodal projector in BeeLlama, model loading fails before inference starts.
The server exits during multimodal / projector initialization with:
load_hparams: unknown projector type: gemma4uv
This appears to be a compatibility issue: the current runtime does not recognize the projector type gemma4uv declared by the Gemma 4 mmproj file. Gemma 4 includes newer multimodal capabilities, and support for them may not yet have landed in this build.
Steps to reproduce:
- Configure BeeLlama to load:
  - gemma-4-12b-it-UD-Q5_K_XL.gguf
  - gemma-4-12b-it-mmproj-BF16.gguf
- Start the server / load the model.
- Wait for initialization.
- Loading fails at clip_init / mtmd_init_from_file and the server exits.
Expected behavior:
The model should load successfully with multimodal support enabled.
Actual behavior:
The server fails during projector loading (clip_init reports the unknown projector type) and exits before the model becomes available.
Workaround:
If I remove / disable the mmproj entry, the text model can be loaded without multimodal support.
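To check what projector type the mmproj file actually declares, independently of BeeLlama, the GGUF metadata header can be read directly. Below is a minimal sketch using only the Python standard library. It assumes the file follows the public GGUF layout (version 2 or later, little-endian) and that the projector type is stored under the key clip.projector_type, as in upstream llama.cpp; I have not confirmed that BeeLlama reads the same key. The function names are made up for illustration.

```python
import struct

# GGUF scalar value-type IDs mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one fixed-size value from the stream."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of GGUF header")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type: {vtype}")


def read_gguf_metadata(f):
    """Return the metadata key/value pairs of a GGUF file as a dict.

    Hypothetical helper: only the header is parsed, tensor data is ignored.
    """
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError(f"unsupported GGUF version: {version}")
    _read(f, "<Q")  # tensor count, not needed here
    kv_count = _read(f, "<Q")
    meta = {}
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _read(f, "<I")
        meta[key] = _read_value(f, vtype)
    return meta


# Usage (path taken from the log below; key name is an assumption):
# with open(r"F:\AI\LLM\models\gemma-4-12b-it-mmproj-BF16.gguf", "rb") as f:
#     print(read_gguf_metadata(f).get("clip.projector_type"))
```

If this prints gemma4uv, the file itself is consistent with the error message, and the problem is that this build has no loader for that projector type rather than a corrupted download.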
First Bad Commit
No response
Relevant log output
0.44.589.181 I srv ensure_model: waiting until model name=gemma-4-12b-it-UD-Q5_K_XL is fully loaded...
[50179] 0.03.748.921 W llama_context: n_ctx_seq (65536) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
[50179] 0.03.867.438 I common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
[50179] TCQ decode: context-adaptive V alpha enabled
[50179] 0.04.222.949 E clip_init: failed to load model 'F:\AI\LLM\models\gemma-4-12b-it-mmproj-BF16.gguf': load_hparams: unknown projector type: gemma4uv
[50179]
[50179] 0.04.223.000 E mtmd_init_from_file: error: Failed to load CLIP model from F:\AI\LLM\models\gemma-4-12b-it-mmproj-BF16.gguf
[50179]
[50179] 0.04.223.009 E srv load_model: failed to load multimodal model, 'F:\AI\LLM\models\gemma-4-12b-it-mmproj-BF16.gguf'
[50179] 0.04.223.014 I srv operator(): operator(): cleaning up before exit...
[50179] 0.04.223.950 E srv llama_server: exiting due to model loading error