diff --git a/README.md b/README.md
index 3f801285e..8f7b65e83 100644
--- a/README.md
+++ b/README.md
@@ -535,9 +535,11 @@ Below are the supported multi-modal models and their respective chat handlers (P
 | [llama-3-vision-alpha](https://huggingface.co/abetlen/llama-3-vision-alpha-gguf) | `Llama3VisionAlphaChatHandler` | `llama-3-vision-alpha` |
 | [minicpm-v-2.6](https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf) | `MiniCPMv26ChatHandler` | `minicpm-v-2.6` |
 | [qwen2.5-vl](https://huggingface.co/unsloth/Qwen2.5-VL-3B-Instruct-GGUF) | `Qwen25VLChatHandler` | `qwen2.5-vl` |
-| [gemma-4](https://huggingface.co/unsloth/gemma-4-E4B-it-GGUF) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abetlen/llama-cpp-python/blob/main/examples/colab/notebook.ipynb) | `Gemma4ChatHandler` | `gemma4` |
+| [gemma-4](https://huggingface.co/unsloth/gemma-4-E4B-it-GGUF) | `Gemma4ChatHandler` | `gemma4` |
 | GGUF models with an mtmd projector and embedded chat template | `MTMDChatHandler` | `mtmd` |
 
+Try Gemma 4 12B in Google Colab -> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abetlen/llama-cpp-python/blob/main/examples/colab/notebook.ipynb)
+
 Then you'll need to use a custom chat handler to load the clip model and process the chat messages and images.
 
 ```python
diff --git a/examples/colab/notebook.ipynb b/examples/colab/notebook.ipynb
index c9b8d8dcb..8e258b9c0 100644
--- a/examples/colab/notebook.ipynb
+++ b/examples/colab/notebook.ipynb
@@ -81,7 +81,7 @@
 " messages=[\n",
 " {\n",
 " \"role\": \"user\",\n",
- " \"content\": \"Write the exact string `` and nothing else.\",\n",
+ " \"content\": \"What is the capital of France? Answer in one sentence.\",\n",
 " }\n",
 " ],\n",
 " max_tokens=32,\n",
@@ -99,7 +99,7 @@
 "source": [
 "from IPython.display import Image, display\n",
 "\n",
- "IMAGE_URL = \"https://raw.githubusercontent.com/abetlen/llama-cpp-python/main/vendor/llama.cpp/tools/mtmd/test-1.jpeg\"\n",
+ "IMAGE_URL = \"https://raw.githubusercontent.com/ggml-org/llama.cpp/master/tools/mtmd/test-1.jpeg\"\n",
 "\n",
 "display(Image(url=IMAGE_URL, width=320))\n"
 ]