From a579312b5c364bfb9fb62d4fcff02e7eb7df6b8c Mon Sep 17 00:00:00 2001 From: Holt Skinner <13262395+holtskinner@users.noreply.github.com> Date: Wed, 4 Feb 2026 12:38:07 -0600 Subject: [PATCH 1/2] Remove Intro to Gemini 2.0 Notebooks --- .../intro_gemini_2_0_flash.ipynb | 1505 ----------------- .../intro_gemini_2_0_flash_lite.ipynb | 1314 -------------- .../intro_gemini_2_0_image_gen.ipynb | 606 ------- .../intro_gemini_2_0_image_gen_rest_api.ipynb | 890 ---------- 4 files changed, 4315 deletions(-) delete mode 100644 gemini/getting-started/intro_gemini_2_0_flash.ipynb delete mode 100644 gemini/getting-started/intro_gemini_2_0_flash_lite.ipynb delete mode 100644 gemini/getting-started/intro_gemini_2_0_image_gen.ipynb delete mode 100644 gemini/getting-started/intro_gemini_2_0_image_gen_rest_api.ipynb diff --git a/gemini/getting-started/intro_gemini_2_0_flash.ipynb b/gemini/getting-started/intro_gemini_2_0_flash.ipynb deleted file mode 100644 index 1711d2901bf..00000000000 --- a/gemini/getting-started/intro_gemini_2_0_flash.ipynb +++ /dev/null @@ -1,1505 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sqi5B7V_Rjim" - }, - "outputs": [], - "source": [ - "# Copyright 2024 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VyPmicX9RlZX" - }, - "source": [ - "# Intro to Gemini 2.0 Flash\n", - "\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Cloud Skills Boost\n", - "
\n", - "
\n", - "\n", - "
\n", - "\n", - "Share to:\n", - "\n", - "\n", - " \"LinkedIn\n", - "\n", - "\n", - "\n", - " \"Bluesky\n", - "\n", - "\n", - "\n", - " \"X\n", - "\n", - "\n", - "\n", - " \"Reddit\n", - "\n", - "\n", - "\n", - " \"Facebook\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8MqT58L6Rm_q" - }, - "source": [ - "| Authors |\n", - "| --- |\n", - "| [Eric Dong](https://github.com/gericdong) |\n", - "| [Holt Skinner](https://github.com/holtskinner) |" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nVxnv1D5RoZw" - }, - "source": [ - "## Overview\n", - "\n", - "**YouTube Video: Introduction to Gemini on Vertex AI**\n", - "\n", - "\n", - " \"Introduction\n", - "\n", - "\n", - "[Gemini 2.0 Flash](https://cloud.google.com/vertex-ai/generative-ai/docs/gemini-v2) is a new multimodal generative ai model from the Gemini family developed by [Google DeepMind](https://deepmind.google/). It is available through the Gemini API in Vertex AI and Vertex AI Studio. The model introduces new features and enhanced core capabilities:\n", - "\n", - "- Multimodal Live API: This new API helps you create real-time vision and audio streaming applications with tool use.\n", - "- Speed and performance: Gemini 2.0 Flash is the fastest model in the industry, with a 3x improvement in time to first token (TTFT) over 1.5 Flash.\n", - "- Quality: The model maintains quality comparable to larger models like Gemini 2.0 and GPT-4o.\n", - "- Improved agentic experiences: Gemini 2.0 delivers improvements to multimodal understanding, coding, complex instruction following, and function calling.\n", - "- New Modalities: Gemini 2.0 introduces native image generation and controllable text-to-speech capabilities, enabling image editing, localized artwork creation, and expressive storytelling.\n", - "- To support the new model, we're also shipping an all new SDK that supports simple migration between the Gemini Developer API and the Gemini API in Vertex AI." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WfFPCBL4Hq8x" - }, - "source": [ - "### Objectives\n", - "\n", - "In this tutorial, you will learn how to use the Gemini API in Vertex AI and the Google Gen AI SDK for Python with the Gemini 2.0 Flash model.\n", - "\n", - "You will complete the following tasks:\n", - "\n", - "- Generate text from text prompts\n", - " - Generate streaming text\n", - " - Start multi-turn chats\n", - " - Use asynchronous methods\n", - "- Configure model parameters\n", - "- Set system instructions\n", - "- Use safety filters\n", - "- Use controlled generation\n", - "- Count tokens\n", - "- Process multimodal (audio, code, documents, images, video) data\n", - "- Use automatic and manual function calling\n", - "- Code execution" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gPiTOAHURvTM" - }, - "source": [ - "## Getting Started" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "CHRZUpfWSEpp" - }, - "source": [ - "### Install Google Gen AI SDK for Python\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sG3_LKsWSD3A" - }, - "outputs": [], - "source": [ - "%pip install --upgrade --quiet google-genai" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HlMVjiAWSMNX" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "If you are running this notebook on Google Colab, run the cell below to authenticate your environment." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "12fnq4V0SNV3" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Ve4YBlDqzyj9" - }, - "source": [ - "### Connect to a generative AI API service\n", - "\n", - "Google Gen AI APIs and models including Gemini are available in the following two API services:\n", - "\n", - "- **[Google AI for Developers](https://ai.google.dev/gemini-api/docs)**: Experiment, prototype, and deploy small projects.\n", - "- **[Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/overview)**: Build enterprise-ready projects on Google Cloud.\n", - "\n", - "The Google Gen AI SDK provides a unified interface to these two API services.\n", - "\n", - "This notebook shows how to use the Google Gen AI SDK with the Gemini API in Vertex AI." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "EdvJRUWRNGHE" - }, - "source": [ - "### Import libraries\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "qgdSpVmDbdQ9" - }, - "outputs": [], - "source": [ - "from IPython.display import HTML, Markdown, display\n", - "from google import genai\n", - "from google.genai.types import (\n", - " FunctionDeclaration,\n", - " GenerateContentConfig,\n", - " GoogleSearch,\n", - " HarmBlockThreshold,\n", - " HarmCategory,\n", - " MediaResolution,\n", - " Part,\n", - " Retrieval,\n", - " SafetySetting,\n", - " Tool,\n", - " ToolCodeExecution,\n", - " VertexAISearch,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "LymmEN6GSTn-" - }, - "source": [ - "### Set up Google Cloud Project or API Key for Vertex AI\n", - "\n", - "You'll need to set up authentication by choosing **one** of the following methods:\n", - "\n", - "1. **Use a Google Cloud Project:** Recommended for most users, this requires enabling the Vertex AI API in your Google Cloud project.\n", - " - [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com)\n", - " - Run the cell below to set your project ID and location.\n", - " - Read more about [Supported locations](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations)\n", - "2. **Use a Vertex AI API Key (Express Mode):** For quick experimentation. \n", - " - [Get an API Key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview)\n", - " - Run the cell further below to use your API key." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "f1933326c939" - }, - "source": [ - "#### Option 1. Use a Google Cloud Project\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UCgUOv4nSWhc" - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", - "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", - " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", - "\n", - "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"global\")\n", - "\n", - "client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "b6aa38ee3158" - }, - "source": [ - "#### Option 2. 
Use a Vertex AI API Key (Express Mode)\n", - "\n", - "Uncomment the following block to use Express Mode" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "zpIPG_YhSjaw" - }, - "outputs": [], - "source": [ - "# API_KEY = \"[your-api-key]\" # @param {type: \"string\", placeholder: \"[your-api-key]\", isTemplate: true}\n", - "\n", - "# if not API_KEY or API_KEY == \"[your-api-key]\":\n", - "# raise Exception(\"You must provide an API key to use Vertex AI in express mode.\")\n", - "\n", - "# client = genai.Client(vertexai=True, api_key=API_KEY)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7b36ce4ac022" - }, - "source": [ - "Verify which mode you are using." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "8338643f335f" - }, - "outputs": [], - "source": [ - "if not client.vertexai:\n", - " print(\"Using Gemini Developer API.\")\n", - "elif client._api_client.project:\n", - " print(\n", - " f\"Using Vertex AI with project: {client._api_client.project} in location: {client._api_client.location}\"\n", - " )\n", - "elif client._api_client.api_key:\n", - " print(\n", - " f\"Using Vertex AI in express mode with API key: {client._api_client.api_key[:5]}...{client._api_client.api_key[-5:]}\"\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "n4yRkFg6BBu4" - }, - "source": [ - "## Use the Gemini 2.0 Flash model" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eXHJi5B6P5vd" - }, - "source": [ - "### Load the Gemini 2.0 Flash model\n", - "\n", - "Learn more about all [Gemini models on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "-coEslfWPrxo" - }, - "outputs": [], - "source": [ - "MODEL_ID = \"gemini-2.0-flash\" # @param {type: \"string\"}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "37CH91ddY9kG" - }, - "source": [ - "### Generate text from text prompts\n", - "\n", - "Use the `generate_content()` method to generate responses to your prompts.\n", - "\n", - "You can pass text to `generate_content()`, and use the `.text` property to get the text content of the response.\n", - "\n", - "By default, Gemini outputs formatted text using [Markdown](https://daringfireball.net/projects/markdown/) syntax." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "xRJuHj0KZ8xz" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID, contents=\"What's the largest planet in our solar system?\"\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JkYQATRxAK1_" - }, - "source": [ - "#### Example prompts\n", - "\n", - "- What are the biggest challenges facing the healthcare industry?\n", - "- What are the latest developments in the automotive industry?\n", - "- What are the biggest opportunities in the retail industry?\n", - "- (Try your own prompts!)\n", - "\n", - "For more examples of prompt engineering, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/prompts/intro_prompt_design.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6lLIxqS6_-l8" - }, - "source": [ - "### Generate content stream\n", - "\n", - "By default, the model returns a response after completing the entire generation process. You can also use the `generate_content_stream` method to stream the response as it is being generated, and the model will return chunks of the response as soon as they are generated." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZiwWBhXsAMnv" - }, - "outputs": [], - "source": [ - "output_text = \"\"\n", - "markdown_display_area = display(Markdown(output_text), display_id=True)\n", - "\n", - "for chunk in client.models.generate_content_stream(\n", - " model=MODEL_ID,\n", - " contents=\"Tell me a story about a lonely robot who finds friendship in a most unexpected place.\",\n", - "):\n", - " output_text += chunk.text\n", - " markdown_display_area.update(Markdown(output_text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "29jFnHZZWXd7" - }, - "source": [ - "### Start a multi-turn chat\n", - "\n", - "The Gemini API supports freeform multi-turn conversations with back-and-forth interactions.\n", - "\n", - "The context of the conversation is preserved between messages." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "DbM12JaLWjiF" - }, - "outputs": [], - "source": [ - "chat = client.chats.create(model=MODEL_ID)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "JQem1halYDBW" - }, - "outputs": [], - "source": [ - "response = chat.send_message(\"Write a function that checks if a year is a leap year.\")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "vUJR4Pno-LGK" - }, - "source": [ - "This follow-up prompt shows how the model responds based on the previous prompt:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "6Fn69TurZ9DB" - }, - "outputs": [], - "source": [ - "response = chat.send_message(\"Write a unit test of the generated function.\")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "arLJE4wOuhh6" - }, - "source": [ - "### Send asynchronous requests\n", - "\n", - "`client.aio` exposes all analogous [async](https://docs.python.org/3/library/asyncio.html) methods that are available on `client`.\n", - "\n", - "For example, `client.aio.models.generate_content` is the async version of `client.models.generate_content`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "gSReaLazs-dP" - }, - "outputs": [], - "source": [ - "response = await client.aio.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Compose a song about the adventures of a time-traveling squirrel.\",\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "hIJVEr0RQY8S" - }, - "source": [ - "## Configure model parameters\n", - "\n", - "You can include parameter values in each call that you send to a model to control how the model generates a response. The model can generate different results for different parameter values.
You can experiment with different model parameters to see how the results change.\n", - "\n", - "- Learn more about [experimenting with parameter values](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values).\n", - "\n", - "- See a list of all [Gemini API parameters](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#parameters).\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "d9NXP5N2Pmfo" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Tell me how the internet works, but pretend I'm a puppy who only understands squeaky toys.\",\n", - " config=GenerateContentConfig(\n", - " temperature=0.4,\n", - " top_p=0.95,\n", - " top_k=20,\n", - " candidate_count=1,\n", - " seed=5,\n", - " max_output_tokens=100,\n", - " stop_sequences=[\"STOP!\"],\n", - " presence_penalty=0.0,\n", - " frequency_penalty=0.0,\n", - " response_logprobs=False, # Set to True to get logprobs, Note this can only be run once per day\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))\n", - "\n", - "if response.candidates[0].logprobs_result:\n", - " print(response.candidates[0].logprobs_result)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "El1lx8P9ElDq" - }, - "source": [ - "## Set system instructions\n", - "\n", - "[System instructions](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/system-instruction-introduction) allow you to steer the behavior of the model. By setting the system instruction, you are giving the model additional context to understand the task, provide more customized responses, and adhere to guidelines over the user interaction." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "7A-yANiyCLaO" - }, - "outputs": [], - "source": [ - "system_instruction = \"\"\"\n", - " You are a helpful language translator.\n", - " Your mission is to translate text in English to Spanish.\n", - "\"\"\"\n", - "\n", - "prompt = \"\"\"\n", - " User input: I like bagels.\n", - " Answer:\n", - "\"\"\"\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " system_instruction=system_instruction,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "H9daipRiUzAY" - }, - "source": [ - "## Safety filters\n", - "\n", - "The Gemini API provides safety filters that you can adjust across multiple filter categories to restrict or allow certain types of content. You can use these filters to adjust what's appropriate for your use case. See the [Configure safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) page for details.\n", - "\n", - "When you make a request to Gemini, the content is analyzed and assigned a safety rating. You can inspect the safety ratings of the generated content by printing out the model responses.\n", - "\n", - "The safety settings are `OFF` by default and the default block thresholds are `BLOCK_NONE`.\n", - "\n", - "For more examples of safety filters, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/responsible-ai/gemini_safety_ratings.ipynb).\n", - "\n", - "You can use `safety_settings` to adjust the safety settings for each request you make to the API. 
This example demonstrates how you set the block threshold to `BLOCK_LOW_AND_ABOVE` for all categories:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yPlDRaloU59b" - }, - "outputs": [], - "source": [ - "system_instruction = \"Be as mean and hateful as possible.\"\n", - "\n", - "prompt = \"\"\"\n", - " Write a list of 5 disrespectful things that I might say to the universe after stubbing my toe in the dark.\n", - "\"\"\"\n", - "\n", - "safety_settings = [\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_HARASSMENT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_HATE_SPEECH,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - "]\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " system_instruction=system_instruction,\n", - " safety_settings=safety_settings,\n", - " ),\n", - ")\n", - "\n", - "# Response will be `None` if it is blocked.\n", - "print(response.text)\n", - "# Finish Reason will be `SAFETY` if it is blocked.\n", - "print(response.candidates[0].finish_reason)\n", - "# Safety Ratings show the levels for each filter.\n", - "for safety_rating in response.candidates[0].safety_ratings:\n", - " print(safety_rating)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "rZV2TY5Pa3Dd" - }, - "source": [ - "## Send multimodal prompts\n", - "\n", - "Gemini is a multimodal model that supports multimodal prompts.\n", - "\n", - "You can include any of the following data types from various sources.\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Data typeSource(s)MIME Type(s)
TextInline, Local File, General URL, Google Cloud Storagetext/plain text/html
CodeInline, Local File, General URL, Google Cloud Storagetext/plain
DocumentLocal File, General URL, Google Cloud Storageapplication/pdf
ImageLocal File, General URL, Google Cloud Storageimage/jpeg image/png image/webp
AudioLocal File, General URL, Google Cloud Storage\n", - " audio/aac audio/flac audio/mp3\n", - " audio/m4a audio/mpeg audio/mpga\n", - " audio/mp4 audio/opus audio/pcm\n", - " audio/wav audio/webm\n", - "
VideoLocal File, General URL, Google Cloud Storage, YouTube\n", - " video/mp4 video/mpeg video/x-flv\n", - " video/quicktime video/mpegps video/mpg\n", - " video/webm video/wmv video/3gpp\n", - "
\n", - "\n", - "Set `config.media_resolution` to optimize for speed or quality. Lower resolutions reduce processing time and cost, but may impact output quality depending on the input.\n", - "\n", - "For more examples of multimodal use cases, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/intro_multimodal_use_cases.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "w4npg1tNTYB9" - }, - "source": [ - "### Send local image\n", - "\n", - "Download an image to local storage from Google Cloud Storage.\n", - "\n", - "For this example, we'll use this image of a meal.\n", - "\n", - "\"Meal\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "4avkv0Z7qUI-" - }, - "outputs": [], - "source": [ - "!wget https://storage.googleapis.com/cloud-samples-data/generative-ai/image/meal.png" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "umhZ61lrSyJh" - }, - "outputs": [], - "source": [ - "with open(\"meal.png\", \"rb\") as f:\n", - " image = f.read()\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_bytes(data=image, mime_type=\"image/png\"),\n", - " \"Write a short and engaging blog post based on this picture.\",\n", - " ],\n", - " # Optional: Use the `media_resolution` parameter to specify the resolution of the input media.\n", - " config=GenerateContentConfig(\n", - " media_resolution=MediaResolution.MEDIA_RESOLUTION_LOW,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iRQyv1DhTbnH" - }, - "source": [ - "### Send document from Google Cloud Storage\n", - "\n", - "This example document is the paper [\"Attention is All You Need\"](https://arxiv.org/abs/1706.03762), created by researchers from Google and the University of Toronto.\n", - "\n", - "Check out this notebook for more examples of document understanding with Gemini:\n", - "\n", - "- [Document Processing with Gemini](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/document-processing/document_processing.ipynb)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "pG6l1Fuka6ZJ" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/pdf/1706.03762v7.pdf\",\n", - " mime_type=\"application/pdf\",\n", - " ),\n", - " \"Summarize the document.\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "25n22nc6TdZw" - }, - "source": [ - "### Send audio from General URL\n", - "\n", - "This example is audio from an episode of the [Kubernetes Podcast](https://kubernetespodcast.com/)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "uVU9XyCCo-h2" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"https://traffic.libsyn.com/secure/e780d51f-f115-44a6-8252-aed9216bb521/KPOD242.mp3\",\n", - " mime_type=\"audio/mpeg\",\n", - " ),\n", - " \"Write a summary of this podcast episode.\",\n", - " ],\n", - " config=GenerateContentConfig(audio_timestamp=True),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8D3_oNUTuW2q" - }, - "source": [ - "### Send video from YouTube URL\n", - "\n", - "This example is the YouTube video [Google — 25 Years in Search: The Most Searched](https://www.youtube.com/watch?v=3KtWfp0UopM).\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "l7-w8G_2wAOw" - }, - "outputs": [], - "source": [ - "video = Part.from_uri(\n", - " file_uri=\"https://www.youtube.com/watch?v=3KtWfp0UopM\",\n", - " mime_type=\"video/mp4\",\n", - ")\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " video,\n", - " \"At what point in the video is Harry Potter shown?\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "df8013cfa7f7" - }, - "source": [ - "### Send web page\n", - "\n", - "This example is from the [Generative AI on Vertex AI documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/overview).\n", - "\n", - "**NOTE:** The URL must be publicly accessible." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "337793322c91" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"https://cloud.google.com/vertex-ai/generative-ai/docs\",\n", - " mime_type=\"text/html\",\n", - " ),\n", - " \"Write a summary of this documentation.\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Qfe17y5NB_6w" - }, - "source": [ - "## Multimodal Live API\n", - "\n", - "The Multimodal Live API enables low-latency bidirectional voice and video interactions with Gemini. Using the Multimodal Live API, you can provide end users with the experience of natural, human-like voice conversations, and with the ability to interrupt the model's responses using voice commands. The model can process text, audio, and video input, and it can provide text and audio output.\n", - "\n", - "The Multimodal Live API is built on [WebSockets](https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API).\n", - "\n", - "For more examples with the Multimodal Live API, refer to the [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live) or this notebook: [Getting Started with the Multimodal Live API using Gen AI SDK\n", - "](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/multimodal-live-api/intro_multimodal_live_api_genai_sdk.ipynb)." 
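- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Below is a minimal, text-only sketch of a Live API session using the Gen AI SDK. Treat it as a sketch under assumptions rather than a definitive implementation: the Live API may require a Live-enabled model variant rather than `MODEL_ID`, and the `session.send()`/`session.receive()` surface has changed across SDK versions, so verify the calls against the current reference docs before relying on them." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from google.genai.types import LiveConnectConfig\n", - "\n", - "# Assumption: MODEL_ID supports the Live API; if not, switch to a\n", - "# Live-enabled variant such as \"gemini-2.0-flash-exp\".\n", - "async with client.aio.live.connect(\n", - "    model=MODEL_ID,\n", - "    config=LiveConnectConfig(response_modalities=[\"TEXT\"]),\n", - ") as session:\n", - "    # Send a single user turn and mark it as complete.\n", - "    await session.send(input=\"Hello, Gemini! Briefly introduce yourself.\", end_of_turn=True)\n", - "\n", - "    # Stream the model's reply chunk by chunk as it arrives.\n", - "    async for message in session.receive():\n", - "        if message.text:\n", - "            print(message.text, end=\"\")"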
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "rVlo0mWuZGkQ" - }, - "source": [ - "## Control generated output\n", - "\n", - "[Controlled generation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output) allows you to define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field.\n", - "\n", - "The response schema is specified in the `response_schema` parameter in `config`, and the model output will strictly follow that schema.\n", - "\n", - "You can provide the schemas as [Pydantic](https://docs.pydantic.dev/) models or a [JSON](https://www.json.org/json-en.html) string and the model will respond as JSON or an [Enum](https://docs.python.org/3/library/enum.html) depending on the value set in `response_mime_type`.\n", - "\n", - "For more examples of controlled generation, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/controlled-generation/intro_controlled_generation.ipynb)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "OjSgf2cDN_bG" - }, - "outputs": [], - "source": [ - "from pydantic import BaseModel\n", - "\n", - "\n", - "class Recipe(BaseModel):\n", - " name: str\n", - " description: str\n", - " ingredients: list[str]\n", - "\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"List a few popular cookie recipes and their ingredients.\",\n", - " config=GenerateContentConfig(\n", - " response_mime_type=\"application/json\",\n", - " response_schema=Recipe,\n", - " ),\n", - ")\n", - "\n", - "print(response.text)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nKai5CP_PGQF" - }, - "source": [ - "You can either parse the response string as JSON, or use the `parsed` field to get the response as an object or dictionary." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZeyDWbnxO-on" - }, - "outputs": [], - "source": [ - "parsed_response: Recipe = response.parsed\n", - "print(parsed_response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SUSLPrvlvXOc" - }, - "source": [ - "You can also define a response schema in a Python dictionary. You can only use the supported fields as listed below. All other fields are ignored.\n", - "\n", - "- `enum`\n", - "- `items`\n", - "- `maxItems`\n", - "- `nullable`\n", - "- `properties`\n", - "- `required`\n", - "\n", - "In this example, you instruct the model to analyze product review data, extract key entities, perform sentiment classification (multiple choices), provide additional explanation, and output the results in JSON format.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "F7duWOq3vMmS" - }, - "outputs": [], - "source": [ - "response_schema = {\n", - " \"type\": \"ARRAY\",\n", - " \"items\": {\n", - " \"type\": \"ARRAY\",\n", - " \"items\": {\n", - " \"type\": \"OBJECT\",\n", - " \"properties\": {\n", - " \"rating\": {\"type\": \"INTEGER\"},\n", - " \"flavor\": {\"type\": \"STRING\"},\n", - " \"sentiment\": {\n", - " \"type\": \"STRING\",\n", - " \"enum\": [\"POSITIVE\", \"NEGATIVE\", \"NEUTRAL\"],\n", - " },\n", - " \"explanation\": {\"type\": \"STRING\"},\n", - " },\n", - " \"required\": [\"rating\", \"flavor\", \"sentiment\", \"explanation\"],\n", - " },\n", - " },\n", - "}\n", - "\n", - "prompt = \"\"\"\n", - " Analyze the following product reviews, output the sentiment classification, and give an explanation.\n", - "\n", - " - \"Absolutely loved it! Best ice cream I've ever had.\" Rating: 4, Flavor: Strawberry Cheesecake\n", - " - \"Quite good, but a bit too sweet for my taste.\" Rating: 1, Flavor: Mango Tango\n", - "\"\"\"\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " response_mime_type=\"application/json\",\n", - " response_schema=response_schema,\n", - " ),\n", - ")\n", - "\n", - "response_dict = response.parsed\n", - "print(response_dict)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gV1dR-QlTKRs" - }, - "source": [ - "## Count tokens and compute tokens\n", - "\n", - "You can use the `count_tokens()` method to calculate the number of input tokens before sending a request to the Gemini API.\n", - "\n", - "For more information, refer to [list and count tokens](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/list-token).\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Syx-fwLkV1j-" - }, - "source": [ - "### Count tokens" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UhNElguLRRNK" - }, - "outputs": [], - "source": [ - "response = client.models.count_tokens(\n", - " model=MODEL_ID,\n", - " contents=\"What's the highest mountain in Africa?\",\n", - ")\n", - "\n", - "print(response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VS-AP7AHUQmV" - }, - "source": [ - "### Compute tokens\n", - "\n", - "The `compute_tokens()` method runs a local tokenizer instead of making an API call. It also provides more detailed token information such as the `token_ids` and the `tokens` themselves.\n", - "\n", - "
\n", - "NOTE: This method is only supported in Vertex AI.\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Cdhi5AX1TuH0" - }, - "outputs": [], - "source": [ - "response = client.models.compute_tokens(\n", - " model=MODEL_ID,\n", - " contents=\"What's the longest word in the English language?\",\n", - ")\n", - "\n", - "print(response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "_BsP0vXOY7hg" - }, - "source": [ - "## Search as a tool (Grounding)\n", - "\n", - "[Grounding](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini) lets you connect real-world data to the Gemini model.\n", - "\n", - "By grounding model responses in Google Search results, the model can access information at runtime that goes beyond its training data which can produce more accurate, up-to-date, and relevant responses.\n", - "\n", - "Using Grounding with Google Search, you can improve the accuracy and recency of responses from the model. Starting with Gemini 2.0, Google Search is available as a tool. This means that the model can decide when to use Google Search.\n", - "\n", - "For more examples of Grounding, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/grounding/intro-grounding-gemini.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4_M_4RRBdO_3" - }, - "source": [ - "### Google Search\n", - "\n", - "You can add the `tools` keyword argument with a `Tool` including `GoogleSearch` to instruct Gemini to first perform a Google Search with the prompt, then construct an answer based on the web search results.\n", - "\n", - "[Dynamic Retrieval](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#dynamic-retrieval) lets you set a threshold for when grounding is used for model responses. This is useful when the prompt doesn't require an answer grounded in Google Search and the supported models can provide an answer based on their knowledge without grounding. This helps you manage latency, quality, and cost more effectively." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yeR09J3AZT4U" - }, - "outputs": [], - "source": [ - "google_search_tool = Tool(google_search=GoogleSearch())\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"When is the next total solar eclipse in the United States?\",\n", - " config=GenerateContentConfig(tools=[google_search_tool]),\n", - ")\n", - "\n", - "display(Markdown(response.text))\n", - "\n", - "print(response.candidates[0].grounding_metadata)\n", - "\n", - "HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "hYKAzG1sH-K1" - }, - "source": [ - "### Vertex AI Search\n", - "\n", - "You can use a [Vertex AI Search data store](https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es) to connect Gemini to your own custom data.\n", - "\n", - "Follow the [get started guide for Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/try-enterprise-search) to create a data store and app, then add the data store ID in the following code cell." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "dYDf4618IG5u" - }, - "outputs": [], - "source": [ - "data_store_location = \"global\"\n", - "data_store_id = \"[your-data-store-id]\" # @param {type: \"string\"}\n", - "\n", - "if data_store_id and data_store_id != \"[your-data-store-id]\":\n", - " vertex_ai_search_tool = Tool(\n", - " retrieval=Retrieval(\n", - " vertex_ai_search=VertexAISearch(\n", - " datastore=f\"projects/{PROJECT_ID}/locations/{data_store_location}/collections/default_collection/dataStores/{data_store_id}\"\n", - " )\n", - " )\n", - " )\n", - "\n", - " response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"What is the company culture like?\",\n", - " config=GenerateContentConfig(tools=[vertex_ai_search_tool]),\n", - " )\n", - "\n", - " display(Markdown(response.text))\n", - "\n", - " print(response.candidates[0].grounding_metadata)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "T0pb-Kh1xEHU" - }, - "source": [ - "## Function calling\n", - "\n", - "[Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling) in Gemini lets developers create a description of a function in their code, then pass that description to a language model in a request.\n", - "\n", - "You can submit a Python function for automatic function calling, which will run the function and return the output in natural language generated by Gemini.\n", - "\n", - "You can also submit an [OpenAPI Specification](https://www.openapis.org/) which will respond with the name of a function that matches the description and the arguments to call it with.\n", - "\n", - "For more examples of Function calling with Gemini, check out this notebook: [Intro to Function Calling with Gemini](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_function_calling.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mSUWWlrrlR-D" - }, - "source": [ - "### Python Function (Automatic Function Calling)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "aRR8HZhLlR-E" - }, - "outputs": [], - "source": [ - "def get_current_weather(location: str) -> str:\n", - " \"\"\"Example method. Returns the current weather.\n", - "\n", - " Args:\n", - " location: The city and state, e.g. 
San Francisco, CA\n", - " \"\"\"\n", - " weather_map: dict[str, str] = {\n", - " \"Boston, MA\": \"snowing\",\n", - " \"San Francisco, CA\": \"foggy\",\n", - " \"Seattle, WA\": \"raining\",\n", - " \"Austin, TX\": \"hot\",\n", - " \"Chicago, IL\": \"windy\",\n", - " }\n", - " return weather_map.get(location, \"unknown\")\n", - "\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"What is the weather like in Austin?\",\n", - " config=GenerateContentConfig(\n", - " tools=[get_current_weather],\n", - " temperature=0,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "h4syyLEClGcn" - }, - "source": [ - "### OpenAPI Specification (Manual Function Calling)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "2BDQPwgcxRN3" - }, - "outputs": [], - "source": [ - "get_destination = FunctionDeclaration(\n", - " name=\"get_destination\",\n", - " description=\"Get the destination that the user wants to go to\",\n", - " parameters={\n", - " \"type\": \"OBJECT\",\n", - " \"properties\": {\n", - " \"destination\": {\n", - " \"type\": \"STRING\",\n", - " \"description\": \"Destination that the user wants to go to\",\n", - " },\n", - " },\n", - " },\n", - ")\n", - "\n", - "destination_tool = Tool(\n", - " function_declarations=[get_destination],\n", - ")\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"I'd like to travel to Paris.\",\n", - " config=GenerateContentConfig(\n", - " tools=[destination_tool],\n", - " temperature=0,\n", - " ),\n", - ")\n", - "\n", - "print(response.function_calls[0])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "MhDs2X3o0neK" - }, - "source": [ - "## Code Execution\n", - "\n", - "The Gemini API [code execution](https://ai.google.dev/gemini-api/docs/code-execution?lang=python) feature enables the model to generate and run Python code and learn iteratively from the results until it arrives at a final output. You can use this code execution capability to build applications that benefit from code-based reasoning and that produce text output. For example, you could use code execution in an application that solves equations or processes text.\n", - "\n", - "The Gemini API provides code execution as a tool, similar to function calling.\n", - "After you add code execution as a tool, the model decides when to use it.\n", - "\n", - "For more examples of Code Execution, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/code-execution/intro_code_execution.ipynb)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "1W-3c7sy0nyz" - }, - "outputs": [], - "source": [ - "code_execution_tool = Tool(code_execution=ToolCodeExecution())\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Calculate the 20th Fibonacci number. 
Then find the nearest palindrome to it.\",\n", - " config=GenerateContentConfig(\n", - " tools=[code_execution_tool],\n", - " temperature=0,\n", - " ),\n", - ")\n", - "\n", - "display(\n", - " Markdown(\n", - " f\"\"\"\n", - "## Code\n", - "\n", - "```py\n", - "{response.executable_code}\n", - "```\n", - "\n", - "### Output\n", - "\n", - "```\n", - "{response.code_execution_result}\n", - "```\n", - "\"\"\"\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9d2d8fdf1d12" - }, - "source": [ - "## Spatial Understanding\n", - "\n", - "Gemini 2.0 includes improved spatial understanding and object detection capabilities. Check out this notebook for examples:\n", - "\n", - "- [2D spatial understanding with Gemini 2.0](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/spatial-understanding/spatial_understanding.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "5e0cbb27a473" - }, - "source": [ - "## Provisioned Throughput\n", - "\n", - "For high-scale production use cases, [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput) allows for reserved capacity of generative AI models on Vertex AI.\n", - "\n", - "Once you have it [set up for your project](https://cloud.google.com/vertex-ai/generative-ai/docs/purchase-provisioned-throughput), refer to [Use Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/use-provisioned-throughput) for usage instructions." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eQwiONFdVHw5" - }, - "source": [ - "## What's next\n", - "\n", - "- See the [Google Gen AI SDK reference docs](https://googleapis.github.io/python-genai/).\n", - "- Explore other notebooks in the [Google Cloud Generative AI GitHub repository](https://github.com/GoogleCloudPlatform/generative-ai).\n", - "- Explore AI models in [Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models)." - ] - } - ], - "metadata": { - "colab": { - "collapsed_sections": [], - "name": "intro_gemini_2_0_flash.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/gemini/getting-started/intro_gemini_2_0_flash_lite.ipynb b/gemini/getting-started/intro_gemini_2_0_flash_lite.ipynb deleted file mode 100644 index 696249e8463..00000000000 --- a/gemini/getting-started/intro_gemini_2_0_flash_lite.ipynb +++ /dev/null @@ -1,1314 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sqi5B7V_Rjim" - }, - "outputs": [], - "source": [ - "# Copyright 2025 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VyPmicX9RlZX" - }, - "source": [ - "# Intro to Gemini 2.0 Flash-Lite\n", - "\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - "\n", - "
\n", - "\n", - "Share to:\n", - "\n", - "\n", - " \"LinkedIn\n", - "\n", - "\n", - "\n", - " \"Bluesky\n", - "\n", - "\n", - "\n", - " \"X\n", - "\n", - "\n", - "\n", - " \"Reddit\n", - "\n", - "\n", - "\n", - " \"Facebook\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8MqT58L6Rm_q" - }, - "source": [ - "| Authors |\n", - "| --- |\n", - "| [Eric Dong](https://github.com/gericdong) |\n", - "| [Holt Skinner](https://github.com/holtskinner) |" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nVxnv1D5RoZw" - }, - "source": [ - "## Overview\n", - "\n", - "**YouTube Video: Introduction to Gemini on Vertex AI**\n", - "\n", - "\n", - " \"Introduction\n", - "\n", - "\n", - "[Gemini 2.0 Flash-Lite](https://cloud.google.com/vertex-ai/generative-ai/docs/gemini-v2#2.0-flash-lite) is our fastest and most cost efficient Flash model. It's an upgrade path for 1.5 Flash users who want better quality for the same price and speed. It is now available as a GA release through the Gemini API in Vertex AI and Vertex AI Studio." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WfFPCBL4Hq8x" - }, - "source": [ - "### Objectives\n", - "\n", - "In this tutorial, you will learn how to use the Gemini API in Vertex AI and the Google Gen AI SDK for Python with the Gemini 2.0 Flash-Lite model.\n", - "\n", - "You will complete the following tasks:\n", - "\n", - "- Generate text from text prompts\n", - " - Generate streaming text\n", - " - Start multi-turn chats\n", - " - Use asynchronous methods\n", - "- Configure model parameters\n", - "- Set system instructions\n", - "- Use safety filters\n", - "- Use controlled generation\n", - "- Count tokens\n", - "- Process multimodal (audio, code, documents, images, video) data\n", - "- Use automatic and manual function calling" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gPiTOAHURvTM" - }, - "source": [ - "## Getting Started" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "CHRZUpfWSEpp" - }, - "source": [ - "### Install Google Gen AI SDK for Python\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sG3_LKsWSD3A" - }, - "outputs": [], - "source": [ - "%pip install --upgrade --quiet google-genai" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HlMVjiAWSMNX" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "If you are running this notebook on Google Colab, run the cell below to authenticate your environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "12fnq4V0SNV3" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Ve4YBlDqzyj9" - }, - "source": [ - "### Connect to a generative AI API service\n", - "\n", - "Google Gen AI APIs and models including Gemini are available in the following two API services:\n", - "\n", - "- **[Google AI for Developers](https://ai.google.dev/gemini-api/docs)**: Experiment, prototype, and deploy small projects.\n", - "- **[Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs)**: Build enterprise-ready projects on Google Cloud.\n", - "\n", - "The Google Gen AI SDK provides a unified interface to these two API services.\n", - "\n", - "This notebook shows how to use the Google Gen AI SDK with the Gemini API in Vertex AI." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "EdvJRUWRNGHE" - }, - "source": [ - "### Import libraries\n" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": { - "id": "qgdSpVmDbdQ9" - }, - "outputs": [], - "source": [ - "from IPython.display import Markdown, display\n", - "from google import genai\n", - "from google.genai.types import (\n", - " FunctionDeclaration,\n", - " GenerateContentConfig,\n", - " HarmBlockThreshold,\n", - " HarmCategory,\n", - " MediaResolution,\n", - " Part,\n", - " SafetySetting,\n", - " Tool,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "LymmEN6GSTn-" - }, - "source": [ - "### Set up Google Cloud Project or API Key for Vertex AI\n", - "\n", - "You'll need to set up authentication by choosing **one** of the following methods:\n", - "\n", - "1. **Use a Google Cloud Project:** Recommended for most users, this requires enabling the Vertex AI API in your Google Cloud project.\n", - " - [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com)\n", - " - Run the cell below to set your project ID and location.\n", - " - Read more about [Supported locations](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations)\n", - "2. **Use a Vertex AI API Key (Express Mode):** For quick experimentation. \n", - " - [Get an API Key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview)\n", - " - Run the cell further below to use your API key." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "a34b28cb8d5a" - }, - "source": [ - "#### Option 1. Use a Google Cloud Project" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UCgUOv4nSWhc" - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", - "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", - " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", - "\n", - "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"global\")\n", - "\n", - "client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "c173348120cf" - }, - "source": [ - "#### Option 2. Use a Vertex AI API Key (Express Mode)\n", - "\n", - "Uncomment the following block to use Express Mode" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "fa3d4873034b" - }, - "outputs": [], - "source": [ - "# API_KEY = \"[your-api-key]\" # @param {type: \"string\", placeholder: \"[your-api-key]\", isTemplate: true}\n", - "\n", - "# if not API_KEY or API_KEY == \"[your-api-key]\":\n", - "# raise Exception(\"You must provide an API key to use Vertex AI in express mode.\")\n", - "\n", - "# client = genai.Client(vertexai=True, api_key=API_KEY)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7b36ce4ac022" - }, - "source": [ - "Verify which mode you are using." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "806008661dc5" - }, - "outputs": [], - "source": [ - "if not client.vertexai:\n", - " print(\"Using Gemini Developer API.\")\n", - "elif client._api_client.project:\n", - " print(\n", - " f\"Using Vertex AI with project: {client._api_client.project} in location: {client._api_client.location}\"\n", - " )\n", - "elif client._api_client.api_key:\n", - " print(\n", - " f\"Using Vertex AI in express mode with API key: {client._api_client.api_key[:5]}...{client._api_client.api_key[-5:]}\"\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "n4yRkFg6BBu4" - }, - "source": [ - "## Use the Gemini 2.0 Flash-Lite model" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eXHJi5B6P5vd" - }, - "source": [ - "### Load the Gemini 2.0 Flash-Lite model\n", - "\n", - "Learn more about all [Gemini models on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models)." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "id": "-coEslfWPrxo" - }, - "outputs": [], - "source": [ - "MODEL_ID = \"gemini-2.0-flash-lite\" # @param {type: \"string\"}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "37CH91ddY9kG" - }, - "source": [ - "### Generate text from text prompts\n", - "\n", - "Use the `generate_content()` method to generate responses to your prompts.\n", - "\n", - "You can pass text to `generate_content()`, and use the `.text` property to get the text content of the response.\n", - "\n", - "By default, Gemini outputs formatted text using [Markdown](https://daringfireball.net/projects/markdown/) syntax." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": { - "id": "xRJuHj0KZ8xz" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID, contents=\"What's the largest planet in our solar system?\"\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JkYQATRxAK1_" - }, - "source": [ - "#### Example prompts\n", - "\n", - "- What are the biggest challenges facing the healthcare industry?\n", - "- What are the latest developments in the automotive industry?\n", - "- What are the biggest opportunities in retail industry?\n", - "- (Try your own prompts!)\n", - "\n", - "For more examples of prompt engineering, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/prompts/intro_prompt_design.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6lLIxqS6_-l8" - }, - "source": [ - "### Generate content stream\n", - "\n", - "By default, the model returns a response after completing the entire generation process. You can also use the `generate_content_stream` method to stream the response as it is being generated, and the model will return chunks of the response as soon as they are generated." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZiwWBhXsAMnv" - }, - "outputs": [], - "source": [ - "output_text = \"\"\n", - "markdown_display_area = display(Markdown(output_text), display_id=True)\n", - "\n", - "for chunk in client.models.generate_content_stream(\n", - " model=MODEL_ID,\n", - " contents=\"Tell me a story about a lonely robot who finds friendship in a most unexpected place.\",\n", - "):\n", - " output_text += chunk.text\n", - " markdown_display_area.update(Markdown(output_text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "29jFnHZZWXd7" - }, - "source": [ - "### Start a multi-turn chat\n", - "\n", - "The Gemini API supports freeform multi-turn conversations across multiple turns with back-and-forth interactions.\n", - "\n", - "The context of the conversation is preserved between messages." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "DbM12JaLWjiF" - }, - "outputs": [], - "source": [ - "chat = client.chats.create(model=MODEL_ID)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "JQem1halYDBW" - }, - "outputs": [], - "source": [ - "response = chat.send_message(\"Write a function that checks if a year is a leap year.\")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "vUJR4Pno-LGK" - }, - "source": [ - "This follow-up prompt shows how the model responds based on the previous prompt:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "6Fn69TurZ9DB" - }, - "outputs": [], - "source": [ - "response = chat.send_message(\"Write a unit test of the generated function.\")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "arLJE4wOuhh6" - }, - "source": [ - "### Send asynchronous requests\n", - "\n", - "`client.aio` exposes all analogous [async](https://docs.python.org/3/library/asyncio.html) methods that are available on `client`.\n", - "\n", - "For example, `client.aio.models.generate_content` is the async version of `client.models.generate_content`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "gSReaLazs-dP" - }, - "outputs": [], - "source": [ - "response = await client.aio.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Compose a song about the adventures of a time-traveling squirrel.\",\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "hIJVEr0RQY8S" - }, - "source": [ - "## Configure model parameters\n", - "\n", - "You can include parameter values in each call that you send to a model to control how the model generates a response. The model can generate different results for different parameter values. 
You can experiment with different model parameters to see how the results change.\n", - "\n", - "- Learn more about [experimenting with parameter values](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values).\n", - "\n", - "- See a list of all [Gemini API parameters](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#parameters).\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "d9NXP5N2Pmfo" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Tell me how the internet works, but pretend I'm a puppy who only understands squeaky toys.\",\n", - " config=GenerateContentConfig(\n", - " temperature=0.4,\n", - " top_p=0.95,\n", - " top_k=20,\n", - " candidate_count=1,\n", - " seed=5,\n", - " max_output_tokens=100,\n", - " stop_sequences=[\"STOP!\"],\n", - " presence_penalty=0.0,\n", - " frequency_penalty=0.0,\n", - " response_logprobs=False, # Set to True to get logprobs; note this can only be run once per day\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))\n", - "\n", - "if response.candidates[0].logprobs_result:\n", - " print(response.candidates[0].logprobs_result)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "El1lx8P9ElDq" - }, - "source": [ - "## Set system instructions\n", - "\n", - "[System instructions](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/system-instruction-introduction) allow you to steer the behavior of the model. By setting the system instruction, you are giving the model additional context to understand the task, provide more customized responses, and adhere to guidelines throughout the user interaction." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "7A-yANiyCLaO" - }, - "outputs": [], - "source": [ - "system_instruction = \"\"\"\n", - " You are a helpful language translator.\n", - " Your mission is to translate text from English to Spanish.\n", - "\"\"\"\n", - "\n", - "prompt = \"\"\"\n", - " User input: I like bagels.\n", - " Answer:\n", - "\"\"\"\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " system_instruction=system_instruction,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "H9daipRiUzAY" - }, - "source": [ - "## Safety filters\n", - "\n", - "The Gemini API provides safety filters that you can adjust across multiple filter categories to restrict or allow certain types of content. You can use these filters to adjust what's appropriate for your use case. See the [Configure safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) page for details.\n", - "\n", - "When you make a request to Gemini, the content is analyzed and assigned a safety rating. You can inspect the safety ratings of the generated content by printing out the model responses.\n", - "\n", - "The safety settings are `OFF` by default and the default block thresholds are `BLOCK_NONE`.\n", - "\n", - "For more examples of safety filters, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/responsible-ai/gemini_safety_ratings.ipynb).\n", - "\n", - "You can use `safety_settings` to adjust the safety settings for each request you make to the API.
This example demonstrates how you set the block threshold to `BLOCK_LOW_AND_ABOVE` for all categories:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yPlDRaloU59b" - }, - "outputs": [], - "source": [ - "system_instruction = \"Be as mean and hateful as possible.\"\n", - "\n", - "prompt = \"\"\"\n", - " Write a list of 5 disrespectful things that I might say to the universe after stubbing my toe in the dark.\n", - "\"\"\"\n", - "\n", - "safety_settings = [\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_HARASSMENT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_HATE_SPEECH,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - " SafetySetting(\n", - " category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,\n", - " threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n", - " ),\n", - "]\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " system_instruction=system_instruction,\n", - " safety_settings=safety_settings,\n", - " ),\n", - ")\n", - "\n", - "# Response will be `None` if it is blocked.\n", - "print(response.text)\n", - "# Finish Reason will be `SAFETY` if it is blocked.\n", - "print(response.candidates[0].finish_reason)\n", - "# Safety Ratings show the levels for each filter.\n", - "for safety_rating in response.candidates[0].safety_ratings:\n", - " print(safety_rating)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "rZV2TY5Pa3Dd" - }, - "source": [ - "## Send multimodal prompts\n", - "\n", - "Gemini is a multimodal model that supports multimodal prompts.\n", - "\n", - "You can include any of the following data types from various sources.\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Data typeSource(s)MIME Type(s)
TextInline, Local File, General URL, Google Cloud Storagetext/plain text/html
CodeInline, Local File, General URL, Google Cloud Storagetext/plain
DocumentLocal File, General URL, Google Cloud Storageapplication/pdf
ImageLocal File, General URL, Google Cloud Storageimage/jpeg image/png image/webp
AudioLocal File, General URL, Google Cloud Storage\n", - " audio/aac audio/flac audio/mp3\n", - " audio/m4a audio/mpeg audio/mpga\n", - " audio/mp4 audio/opus audio/pcm\n", - " audio/wav audio/webm\n", - "
VideoLocal File, General URL, Google Cloud Storage, YouTube\n", - " video/mp4 video/mpeg video/x-flv\n", - " video/quicktime video/mpegps video/mpg\n", - " video/webm video/wmv video/3gpp\n", - "
\n", - "\n", - "Set `config.media_resolution` to optimize for speed or quality. Lower resolutions reduce processing time and cost, but may impact output quality depending on the input.\n", - "\n", - "For more examples of multimodal use cases, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/intro_multimodal_use_cases.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "w4npg1tNTYB9" - }, - "source": [ - "### Send local image\n", - "\n", - "Download an image to local storage from Google Cloud Storage.\n", - "\n", - "For this example, we'll use this image of a meal.\n", - "\n", - "\"Meal\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "4avkv0Z7qUI-" - }, - "outputs": [], - "source": [ - "!wget https://storage.googleapis.com/cloud-samples-data/generative-ai/image/meal.png" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "umhZ61lrSyJh" - }, - "outputs": [], - "source": [ - "with open(\"meal.png\", \"rb\") as f:\n", - " image = f.read()\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_bytes(data=image, mime_type=\"image/png\"),\n", - " \"Write a short and engaging blog post based on this picture.\",\n", - " ],\n", - " # Optional: Use the `media_resolution` parameter to specify the resolution of the input media.\n", - " config=GenerateContentConfig(\n", - " media_resolution=MediaResolution.MEDIA_RESOLUTION_LOW,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iRQyv1DhTbnH" - }, - "source": [ - "### Send document from Google Cloud Storage\n", - "\n", - "This example document is the paper [\"Attention is All You Need\"](https://arxiv.org/abs/1706.03762), created by researchers from Google and the University of Toronto.\n", - "\n", - "Check out this notebook for more examples of document understanding with Gemini:\n", - "\n", - "- [Document Processing with Gemini](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/document-processing/document_processing.ipynb)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "pG6l1Fuka6ZJ" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/pdf/1706.03762v7.pdf\",\n", - " mime_type=\"application/pdf\",\n", - " ),\n", - " \"Summarize the document.\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "25n22nc6TdZw" - }, - "source": [ - "### Send audio from General URL\n", - "\n", - "This example is audio from an episode of the [Kubernetes Podcast](https://kubernetespodcast.com/)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "uVU9XyCCo-h2" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"https://traffic.libsyn.com/secure/e780d51f-f115-44a6-8252-aed9216bb521/KPOD242.mp3\",\n", - " mime_type=\"audio/mpeg\",\n", - " ),\n", - " \"Write a summary of this podcast episode.\",\n", - " ],\n", - " config=GenerateContentConfig(audio_timestamp=True),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8D3_oNUTuW2q" - }, - "source": [ - "### Send video from YouTube URL\n", - "\n", - "This example is the YouTube video [Google — 25 Years in Search: The Most Searched](https://www.youtube.com/watch?v=3KtWfp0UopM).\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "l7-w8G_2wAOw" - }, - "outputs": [], - "source": [ - "video = Part.from_uri(\n", - " file_uri=\"https://www.youtube.com/watch?v=3KtWfp0UopM\",\n", - " mime_type=\"video/mp4\",\n", - ")\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " video,\n", - " \"At what point in the video is Harry Potter shown?\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "04c737631d5d" - }, - "source": [ - "### Send web page\n", - "\n", - "This example is from the [Generative AI on Vertex AI documentation](https://cloud.google.com/vertex-ai/generative-ai/docs).\n", - "\n", - "**NOTE:** The URL must be publicly accessible." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "abf88424e94f" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"https://cloud.google.com/vertex-ai/generative-ai/docs\",\n", - " mime_type=\"text/html\",\n", - " ),\n", - " \"Write a summary of this documentation.\",\n", - " ],\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "rVlo0mWuZGkQ" - }, - "source": [ - "## Control generated output\n", - "\n", - "[Controlled generation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output) allows you to define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field.\n", - "\n", - "The response schema is specified in the `response_schema` parameter in `config`, and the model output will strictly follow that schema.\n", - "\n", - "You can provide the schemas as [Pydantic](https://docs.pydantic.dev/) models or a [JSON](https://www.json.org/json-en.html) string and the model will respond as JSON or an [Enum](https://docs.python.org/3/library/enum.html) depending on the value set in `response_mime_type`.\n", - "\n", - "For more examples of controlled generation, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/controlled-generation/intro_controlled_generation.ipynb)." 
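Before the JSON examples that follow, here is a minimal sketch of the Enum response option mentioned above. The enum class and prompt are illustrative assumptions (not from the original notebook), and it assumes `client`, `MODEL_ID`, and `GenerateContentConfig` are already defined as in the earlier cells:

```python
from enum import Enum


class InstrumentClass(Enum):
    PERCUSSION = "Percussion"
    STRING = "String"
    WOODWIND = "Woodwind"
    BRASS = "Brass"
    KEYBOARD = "Keyboard"


response = client.models.generate_content(
    model=MODEL_ID,
    contents="What type of instrument is an oboe?",
    config=GenerateContentConfig(
        # The enum-specific MIME type asks the model to return exactly one enum value.
        response_mime_type="text/x.enum",
        response_schema=InstrumentClass,
    ),
)

print(response.text)  # Likely output: "Woodwind"
```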
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "OjSgf2cDN_bG" - }, - "outputs": [], - "source": [ - "from pydantic import BaseModel\n", - "\n", - "\n", - "class Recipe(BaseModel):\n", - " name: str\n", - " description: str\n", - " ingredients: list[str]\n", - "\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"List a few popular cookie recipes and their ingredients.\",\n", - " config=GenerateContentConfig(\n", - " response_mime_type=\"application/json\",\n", - " response_schema=Recipe,\n", - " ),\n", - ")\n", - "\n", - "print(response.text)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nKai5CP_PGQF" - }, - "source": [ - "You can either parse the response string as JSON, or use the `parsed` field to get the response as an object or dictionary." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZeyDWbnxO-on" - }, - "outputs": [], - "source": [ - "parsed_response: Recipe = response.parsed\n", - "print(parsed_response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SUSLPrvlvXOc" - }, - "source": [ - "You can also define a response schema in a Python dictionary. You can only use the supported fields as listed below. All other fields are ignored.\n", - "\n", - "- `enum`\n", - "- `items`\n", - "- `maxItems`\n", - "- `nullable`\n", - "- `properties`\n", - "- `required`\n", - "\n", - "In this example, you instruct the model to analyze product review data, extract key entities, perform sentiment classification (multiple choices), provide additional explanation, and output the results in JSON format.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "F7duWOq3vMmS" - }, - "outputs": [], - "source": [ - "response_schema = {\n", - " \"type\": \"ARRAY\",\n", - " \"items\": {\n", - " \"type\": \"ARRAY\",\n", - " \"items\": {\n", - " \"type\": \"OBJECT\",\n", - " \"properties\": {\n", - " \"rating\": {\"type\": \"INTEGER\"},\n", - " \"flavor\": {\"type\": \"STRING\"},\n", - " \"sentiment\": {\n", - " \"type\": \"STRING\",\n", - " \"enum\": [\"POSITIVE\", \"NEGATIVE\", \"NEUTRAL\"],\n", - " },\n", - " \"explanation\": {\"type\": \"STRING\"},\n", - " },\n", - " \"required\": [\"rating\", \"flavor\", \"sentiment\", \"explanation\"],\n", - " },\n", - " },\n", - "}\n", - "\n", - "prompt = \"\"\"\n", - " Analyze the following product reviews, output the sentiment classification, and give an explanation.\n", - "\n", - " - \"Absolutely loved it!
Best ice cream I've ever had.\" Rating: 4, Flavor: Strawberry Cheesecake\n", - " - \"Quite good, but a bit too sweet for my taste.\" Rating: 1, Flavor: Mango Tango\n", - "\"\"\"\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=prompt,\n", - " config=GenerateContentConfig(\n", - " response_mime_type=\"application/json\",\n", - " response_schema=response_schema,\n", - " ),\n", - ")\n", - "\n", - "response_dict = response.parsed\n", - "print(response_dict)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gV1dR-QlTKRs" - }, - "source": [ - "## Count tokens and compute tokens\n", - "\n", - "You can use the `count_tokens()` method to calculate the number of input tokens before sending a request to the Gemini API.\n", - "\n", - "For more information, refer to [list and count tokens](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/list-token).\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Syx-fwLkV1j-" - }, - "source": [ - "### Count tokens" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UhNElguLRRNK" - }, - "outputs": [], - "source": [ - "response = client.models.count_tokens(\n", - " model=MODEL_ID,\n", - " contents=\"What's the highest mountain in Africa?\",\n", - ")\n", - "\n", - "print(response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VS-AP7AHUQmV" - }, - "source": [ - "### Compute tokens\n", - "\n", - "The `compute_tokens()` method provides more detailed token information than `count_tokens()`, such as the `token_ids` and the `tokens` themselves.\n", - "\n", - "
\n", - "NOTE: This method is only supported in Vertex AI.\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Cdhi5AX1TuH0" - }, - "outputs": [], - "source": [ - "response = client.models.compute_tokens(\n", - " model=MODEL_ID,\n", - " contents=\"What's the longest word in the English language?\",\n", - ")\n", - "\n", - "print(response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "T0pb-Kh1xEHU" - }, - "source": [ - "## Function calling\n", - "\n", - "[Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling) in Gemini lets developers create a description of a function in their code, then pass that description to a language model in a request.\n", - "\n", - "You can submit a Python function for automatic function calling, which will run the function and return the output in natural language generated by Gemini.\n", - "\n", - "You can also submit an [OpenAPI Specification](https://www.openapis.org/) which will respond with the name of a function that matches the description and the arguments to call it with.\n", - "\n", - "For more examples of Function Calling, refer to [this notebook](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_function_calling.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mSUWWlrrlR-D" - }, - "source": [ - "### Python Function (Automatic Function Calling)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "aRR8HZhLlR-E" - }, - "outputs": [], - "source": [ - "def get_current_weather(location: str) -> str:\n", - " \"\"\"Example method. Returns the current weather.\n", - "\n", - " Args:\n", - " location: The city and state, e.g. San Francisco, CA\n", - " \"\"\"\n", - " weather_map: dict[str, str] = {\n", - " \"Boston, MA\": \"snowing\",\n", - " \"San Francisco, CA\": \"foggy\",\n", - " \"Seattle, WA\": \"raining\",\n", - " \"Austin, TX\": \"hot\",\n", - " \"Chicago, IL\": \"windy\",\n", - " }\n", - " return weather_map.get(location, \"unknown\")\n", - "\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"What is the weather like in Austin, TX?\",\n", - " config=GenerateContentConfig(\n", - " tools=[get_current_weather],\n", - " temperature=0,\n", - " ),\n", - ")\n", - "\n", - "display(Markdown(response.text))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "h4syyLEClGcn" - }, - "source": [ - "### OpenAPI Specification (Manual Function Calling)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "2BDQPwgcxRN3" - }, - "outputs": [], - "source": [ - "get_destination = FunctionDeclaration(\n", - " name=\"get_destination\",\n", - " description=\"Get the destination that the user wants to go to\",\n", - " parameters={\n", - " \"type\": \"OBJECT\",\n", - " \"properties\": {\n", - " \"destination\": {\n", - " \"type\": \"STRING\",\n", - " \"description\": \"Destination that the user wants to go to\",\n", - " },\n", - " },\n", - " },\n", - ")\n", - "\n", - "destination_tool = Tool(\n", - " function_declarations=[get_destination],\n", - ")\n", - "\n", - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"I'd like to travel to Paris.\",\n", - " config=GenerateContentConfig(\n", - " tools=[destination_tool],\n", - " temperature=0,\n", - " ),\n", - ")\n", - "\n", - "print(response.function_calls[0])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7a733e794b09" - }, - "source": [ - 
"## Provisioned Throughput\n", - "\n", - "For high-scale production use cases, [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput) allows for reserved capacity of generative AI models on Vertex AI.\n", - "\n", - "Once you have it [set up for your project](https://cloud.google.com/vertex-ai/generative-ai/docs/purchase-provisioned-throughput), refer to [Use Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/use-provisioned-throughput) for usage instructions." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eQwiONFdVHw5" - }, - "source": [ - "## What's next\n", - "\n", - "- See the [Google Gen AI SDK reference docs](https://googleapis.github.io/python-genai/).\n", - "- Explore other notebooks in the [Google Cloud Generative AI GitHub repository](https://github.com/GoogleCloudPlatform/generative-ai).\n", - "- Explore AI models in [Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models)." - ] - } - ], - "metadata": { - "colab": { - "collapsed_sections": [ - "hIJVEr0RQY8S", - "rZV2TY5Pa3Dd", - "hYKAzG1sH-K1", - "mSUWWlrrlR-D", - "h4syyLEClGcn" - ], - "name": "intro_gemini_2_0_flash_lite.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb b/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb deleted file mode 100644 index a8d4f00e0fb..00000000000 --- a/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb +++ /dev/null @@ -1,606 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "BCl9pcbOOeA0" - }, - "outputs": [], - "source": [ - "# Copyright 2025 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ZPC2X_a9ErW7" - }, - "source": [ - "# Gemini 2.0 Flash Image Generation in Vertex AI\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - "\n", - "
\n", - "\n", - "Share to:\n", - "\n", - "\n", - " \"LinkedIn\n", - "\n", - "\n", - "\n", - " \"Bluesky\n", - "\n", - "\n", - "\n", - " \"X\n", - "\n", - "\n", - "\n", - " \"Reddit\n", - "\n", - "\n", - "\n", - " \"Facebook\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "f0cc0f48513b" - }, - "source": [ - "| Authors |\n", - "| --- |\n", - "| [Nikita Namjoshi](https://github.com/nikitamaia) |\n", - "| [Katie Nguyen](https://github.com/katiemn) |" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "axauUzNXEl_R" - }, - "source": [ - "## Overview\n", - "\n", - "Gemini 2.0 Flash supports image generation and editing. This enables you to converse with Gemini and create images with interwoven text.\n", - "\n", - "In this tutorial, you learn how to use Gemini 2.0 Flash image generation features in Vertex AI using the Google Gen AI SDK.\n", - "\n", - "You'll try out the following scenarios:\n", - "* Image generation:\n", - " * Text to image\n", - " * Text to image and text (interleaved)\n", - "* Image editing:\n", - " * Text and image to image\n", - " * Multi-turn image editing\n", - " * Images and text to image and text (interleaved)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "D50ekWXjEl_S" - }, - "source": [ - "## Get started" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "jLJQdbgSbb4M" - }, - "source": [ - "### Install Google Gen AI SDK for Python" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": { - "id": "SQ0qcEWuXNXs" - }, - "outputs": [], - "source": [ - "%pip install --upgrade --quiet google-genai" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "dmWOrTJ3gx13" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "If you are running this notebook on Google Colab, run the following cell to authenticate your environment." - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": { - "id": "NyKGtVQjgx13" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gfs2gxVrAN02" - }, - "source": [ - "### Import libraries" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "id": "bPd_HnO0AQmL" - }, - "outputs": [], - "source": [ - "from IPython.display import Image, Markdown, display\n", - "from google import genai\n", - "from google.genai.types import GenerateContentConfig, Part" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "O6ZGaZlxP9L0" - }, - "source": [ - "### Set Google Cloud project information and create client\n", - "\n", - "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", - "\n", - "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." 
- ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "id": "u8IivOG5SqY6" - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", - "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", - " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", - "\n", - "LOCATION = \"global\"\n", - "\n", - "client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "854fbf388e2b" - }, - "source": [ - "### Load the image model\n", - "\n", - "Gemini 2.0 Flash image generation: `gemini-2.0-flash-preview-image-generation`" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "id": "7eeb063ac6d4" - }, - "outputs": [], - "source": [ - "MODEL_ID = \"gemini-2.0-flash-preview-image-generation\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "xgOucVQlVR4t" - }, - "source": [ - "## Image generation\n", - "\n", - "First, send a text prompt to Gemini 2.0 Flash describing the image you want to generate.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "MmA_RAbFwED4" - }, - "source": [ - "### Text to image\n", - "\n", - "In the cell below, you'll call the `generate_content` method and pass in the following arguments:\n", - "\n", - "* `model`: The ID of the model you want to use\n", - "* `contents`: this is your prompt, in this case a text-only user message describing the image to be generated\n", - "* `config`: A config for specifying content settings\n", - " * `response_modalities`: in this case `TEXT` and `IMAGE`; if you do not specify `IMAGE`, you will not get image output, and `IMAGE` alone is not allowed\n", - " * `candidate_count`: the number of candidates to generate\n", - " * `safety_settings`:\n", - " * `method`: HARM_BLOCK_METHOD_UNSPECIFIED, SEVERITY, PROBABILITY\n", - " * `category`: HARM_CATEGORY_UNSPECIFIED, HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_CIVIC_INTEGRITY\n", - " * `threshold`: HARM_BLOCK_THRESHOLD_UNSPECIFIED, BLOCK_LOW_AND_ABOVE, BLOCK_MEDIUM_AND_ABOVE, BLOCK_ONLY_HIGH, BLOCK_NONE, OFF\n", - "\n", - "All generated images include a [SynthID watermark](https://deepmind.google/technologies/synthid/), which can be verified via the Media Studio in [Vertex AI Studio](https://cloud.google.com/generative-ai-studio?hl=en)."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "QK2Mi3zmkHSA" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"generate an image of a penguin driving a taxi in New York City\",\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " candidate_count=1,\n", - " # A single SafetySetting combines the method, category, and threshold.\n", - " safety_settings=[\n", - " {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\",\n", - " },\n", - " ],\n", - " ),\n", - ")\n", - "\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "5l4YLy8Vq_-v" - }, - "source": [ - "### Text to image and text\n", - "\n", - "In addition to generating images, Gemini can generate multiple images and text in an interleaved fashion.\n", - "\n", - "For example, you could ask the model to generate a recipe for banana bread with images showing different stages of the cooking process. Or, you could ask the model to generate images of different wildflowers with accompanying titles and descriptions.\n", - "\n", - "Let's try out the interleaved text and image functionality by prompting Gemini 2.0 Flash to create a tutorial for assembling a peanut butter and jelly sandwich.\n", - "\n", - "You'll notice that in the prompt we ask the model to generate both text and images for each step of the tutorial. This will nudge the model to create text with images interleaved." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WdkrXmyIFtgv" - }, - "source": [ - "⚠️ **Note:** We are asking the model to generate a lot of content in this prompt, so it will take a bit of time for this cell to finish executing." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "FwCeB0Hxlrz2" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=\"Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps.
For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio.\",\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " safety_settings=[\n", - " {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\",\n", - " },\n", - " ],\n", - " ),\n", - ")\n", - "\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "R3g5n23lDtsN" - }, - "source": [ - "## Image editing\n", - "\n", - "You can pass text and an image to Gemini 2.0 Flash for use cases like product captions, information about a particular image, or to make edits or modifications to an existing image.\n", - "\n", - "### Text and image to image\n", - "\n", - "Let's try out a style transfer example and ask Gemini 2.0 Flash to create an image of this dog in a 3D cartoon rendering.\n", - "\n", - "Run the next cell to visualize the starting dog image." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Bd9CASBlNc1p" - }, - "outputs": [], - "source": [ - "image_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/dog-1.jpg\"\n", - ")\n", - "display(Image(url=image_url, width=350, height=350))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UAiUyW1VMzPn" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/image/dog-1.jpg\",\n", - " mime_type=\"image/jpeg\",\n", - " ),\n", - " \"Create a 3D cartoon style portrait of this dog, include rounded, exaggerated facial features, saturated colors, and realistic-looking textures. The dog is wearing a cowboy hat.\",\n", - " ],\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " candidate_count=1,\n", - " safety_settings=[\n", - " {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\",\n", - " },\n", - " ],\n", - " ),\n", - ")\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2vnEO2OfPkjn" - }, - "source": [ - "### Multi-turn image editing\n", - "\n", - "In this next section, you supply a starting image and iteratively alter certain aspects of the image through chatting with Gemini 2.0 Flash.\n", - "\n", - "\n", - "Run the next cell to view the starting image of a vase stored in Google Cloud Storage."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Aq_Y7ORJP1ni" - }, - "outputs": [], - "source": [ - "image_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/vase.png\"\n", - ")\n", - "display(Image(url=image_url, width=350, height=350))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "B-2KXCNgQEYu" - }, - "outputs": [], - "source": [ - "chat = client.chats.create(model=MODEL_ID)\n", - "\n", - "response = chat.send_message(\n", - " message=[\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/image/vase.png\",\n", - " mime_type=\"image/png\",\n", - " ),\n", - " \"add sunflowers to this vase\",\n", - " ],\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " ),\n", - ")\n", - "\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Loi4drVpdCjn" - }, - "source": [ - "Now, send a new text message to the existing chat asking to update the previously generated image." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "99B7me3HGC1I" - }, - "outputs": [], - "source": [ - "response = chat.send_message(\n", - " message=[\n", - " \"change the sunflowers in the vase to pink and purple tulips\",\n", - " ],\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " ),\n", - ")\n", - "\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "1l19iG4CZMEr" - }, - "source": [ - "### Images and text to image and text\n", - "\n", - "When editing images with Gemini 2.0 Flash, you can also supply multiple input images to create new ones. In this next example, you'll prompt Gemini with an image of a teacup and an outdoor table. You'll then ask Gemini to combine the objects from these images in order to create a new one. You'll also ask Gemini to supply text to accompany the image.\n", - "\n", - "\n", - "Run the following cell to visualize the starting images of an outdoor table and teacup." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "HuLVkC3edP6s" - }, - "outputs": [], - "source": [ - "table_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/table.png\"\n", - ")\n", - "display(Image(url=table_url, width=300, height=300))\n", - "\n", - "teacup_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/teacup-1.png\"\n", - ")\n", - "display(Image(url=teacup_url, width=300, height=300))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "WD3XoWjuZQFr" - }, - "outputs": [], - "source": [ - "response = client.models.generate_content(\n", - " model=MODEL_ID,\n", - " contents=[\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/image/table.png\",\n", - " mime_type=\"image/png\",\n", - " ),\n", - " Part.from_uri(\n", - " file_uri=\"gs://cloud-samples-data/generative-ai/image/teacup-1.png\",\n", - " mime_type=\"image/png\",\n", - " ),\n", - " \"Generate a side profile image of a person sitting at this table drinking out of this teacup in a 1:1 aspect ratio. Include a caption that could be used to post this image on social media.\",\n", - " ],\n", - " config=GenerateContentConfig(\n", - " response_modalities=[\"TEXT\", \"IMAGE\"],\n", - " candidate_count=1,\n", - " safety_settings=[\n", - " {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\",\n", - " },\n", - " ],\n", - " ),\n", - ")\n", - "\n", - "for part in response.candidates[0].content.parts:\n", - " if part.text:\n", - " display(Markdown(part.text))\n", - " if part.inline_data:\n", - " display(Image(data=part.inline_data.data, width=350, height=350))" - ] - } - ], - "metadata": { - "colab": { - "name": "intro_gemini_2_0_image_gen.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/gemini/getting-started/intro_gemini_2_0_image_gen_rest_api.ipynb b/gemini/getting-started/intro_gemini_2_0_image_gen_rest_api.ipynb deleted file mode 100644 index 423c414e04e..00000000000 --- a/gemini/getting-started/intro_gemini_2_0_image_gen_rest_api.ipynb +++ /dev/null @@ -1,890 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "dRtCRT8gJxPf" - }, - "outputs": [], - "source": [ - "# Copyright 2025 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ZPC2X_a9ErW7" - }, - "source": [ - "# Gemini 2.0 Flash Image Generation in Vertex AI with REST API\n", - "\n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - "\n", - "
\n", - "\n", - "Share to:\n", - "\n", - "\n", - " \"LinkedIn\n", - "\n", - "\n", - "\n", - " \"Bluesky\n", - "\n", - "\n", - "\n", - " \"X\n", - "\n", - "\n", - "\n", - " \"Reddit\n", - "\n", - "\n", - "\n", - " \"Facebook\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "f0cc0f48513b" - }, - "source": [ - "| Author(s) |\n", - "| --- |\n", - "| [Nikita Namjoshi](https://github.com/nikitamaia) |\n", - "| [Katie Nguyen](https://github.com/katiemn) |" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "axauUzNXEl_R" - }, - "source": [ - "## Overview\n", - "\n", - "Gemini 2.0 Flash supports image generation and editing. This enables you to converse with Gemini and create images with interwoven text.\n", - "\n", - "In this tutorial, you'll learn how to use Gemini 2.0 Flash's image generation features in Vertex AI using the REST API.\n", - "\n", - "You'll try out the following scenarios:\n", - "* Image generation:\n", - " * Text to image\n", - " * Text to image and text (interleaved)\n", - "* Image editing:\n", - " * Text and image to image\n", - " * Multi-turn image editing\n", - " * Images and text to image and text (interleaved)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "D50ekWXjEl_S" - }, - "source": [ - "## Get started" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "jLJQdbgSbb4M" - }, - "source": [ - "### Install required libraries" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": { - "id": "SQ0qcEWuXNXs" - }, - "outputs": [], - "source": [ - "%%capture\n", - "\n", - "!sudo apt install -q jq" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "dmWOrTJ3gx13" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "If you are running this notebook on Google Colab, run the following cell to authenticate your environment." - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": { - "id": "NyKGtVQjgx13" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TH9d_9D-UmIX" - }, - "source": [ - "### Import libraries" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": { - "id": "QI3Hak-wnTQ3" - }, - "outputs": [], - "source": [ - "import base64\n", - "import json\n", - "\n", - "from IPython.display import Image, Markdown, display" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "O6ZGaZlxP9L0" - }, - "source": [ - "### Set Google Cloud project information\n", - "\n", - "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", - "\n", - "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." 
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "id": "u8IivOG5SqY6" - }, - "outputs": [], - "source": [ - "# Use the environment variable if the user doesn't provide Project ID.\n", - "import os\n", - "\n", - "PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", - "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", - " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", - "\n", - "LOCATION = \"global\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "854fbf388e2b" - }, - "source": [ - "### Load the image model\n", - "\n", - "Gemini 2.0 Flash image generation: `gemini-2.0-flash-preview-image-generation`" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "id": "7eeb063ac6d4" - }, - "outputs": [], - "source": [ - "MODEL_ID = \"gemini-2.0-flash-preview-image-generation\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8s84m7h6HSTR" - }, - "source": [ - "### Defining environment variables for cURL commands\n", - "\n", - "These environment variables are used to construct the cURL commands." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "id": "krJ8UOHKoPn3" - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "os.environ[\"PROJECT_ID\"] = PROJECT_ID\n", - "os.environ[\"LOCATION\"] = LOCATION\n", - "\n", - "API_HOST = \"aiplatform.googleapis.com\"\n", - "os.environ[\"API_ENDPOINT\"] = (\n", - " f\"{API_HOST}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "xgOucVQlVR4t" - }, - "source": [ - "## Image generation\n", - "\n", - "First, send a text prompt to Gemini 2.0 Flash describing the image you want to generate.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "MmA_RAbFwED4" - }, - "source": [ - "### Text to image\n", - "\n", - "In the curl command below, you'll see that the payload includes the following keys:\n", - "\n", - "* `contents`: this is your prompt, in this case a text-only user message\n", - "* `generation_config`: this dictionary specifies the desired output modalities, in this case `TEXT` and `IMAGE`. If you do not specify `IMAGE`, you will not get image output, and `IMAGE` alone is not allowed\n", - "* `safetySettings`: select your options from the categories below:\n", - " * `method`: HARM_BLOCK_METHOD_UNSPECIFIED, SEVERITY, PROBABILITY\n", - " * `category`: HARM_CATEGORY_UNSPECIFIED, HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_CIVIC_INTEGRITY\n", - " * `threshold`: HARM_BLOCK_THRESHOLD_UNSPECIFIED, BLOCK_LOW_AND_ABOVE, BLOCK_MEDIUM_AND_ABOVE, BLOCK_ONLY_HIGH, BLOCK_NONE, OFF\n", - "\n", - "The cell below writes the output of running the curl command to the file `response.json`."
- ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": { - "id": "FOynZhO6u3lf" - }, - "outputs": [], - "source": [ - "%%bash\n", - "\n", - "curl -X POST \\\n", - " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n", - " -H \"Content-Type: application/json\" \\\n", - " https://${API_ENDPOINT}:generateContent \\\n", - " -d '{\n", - " \"contents\": {\n", - " \"role\": \"USER\",\n", - " \"parts\": { \"text\": \"generate an image of a penguin driving a taxi in New York City\"},\n", - " },\n", - " \"generation_config\": {\n", - " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n", - " },\n", - " \"safetySettings\": {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\"\n", - " },\n", - " }' 2>/dev/null >response.json" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "xDXZgcTdw-RT" - }, - "source": [ - "Let's examine the output in the `response.json` file.\n", - "\n", - "In `content`, you can see the model has created an `image/png` part, where the base64-encoded image is the value of the `data` key." - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": { - "id": "yki65L0KxE6a" - }, - "outputs": [], - "source": [ - "!cat response.json" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "GysKM62jeSFi" - }, - "source": [ - "Next, load in the data from the `response.json` file so it's easier to work with in Python." - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": { - "id": "4BdxCWrRZnCk" - }, - "outputs": [], - "source": [ - "with open(\"response.json\") as f:\n", - " response_data = json.load(f)\n", - " print(response_data)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mAjVosqHec6F" - }, - "source": [ - "Extract the image data from the response and visualize. All generated images include a [SynthID watermark](https://deepmind.google/technologies/synthid/), which can be verified via the Media Studio in [Vertex AI Studio](https://cloud.google.com/generative-ai-studio?hl=en)." - ] - }, - { - "cell_type": "code", - "execution_count": 41, - "metadata": { - "id": "I6JAcTWdZqlI" - }, - "outputs": [], - "source": [ - "image_part = next(\n", - " filter(\n", - " lambda x: \"inlineData\" in x,\n", - " response_data[\"candidates\"][0][\"content\"][\"parts\"],\n", - " )\n", - ")\n", - "\n", - "image_data = base64.b64decode(image_part[\"inlineData\"][\"data\"])\n", - "display(Image(data=image_data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "5l4YLy8Vq_-v" - }, - "source": [ - "### Text to image and text\n", - "\n", - "In addition to generating images, Gemini can generate multiple images and text in an interleaved fashion.\n", - "\n", - "For example, you could ask the model to generate a recipe for banana bread with images showing different stages of the cooking process. Or, you could ask the model to generate images of different wildflowers with accompanying titles and descriptions.\n", - "\n", - "Let's try out the interleaved text and image functionality by prompting Gemini 2.0 Flash to create a tutorial for assembling a peanut butter and jelly sandwich.\n", - "\n", - "You'll notice that in the prompt we ask the model to generate both text and images. This will nudge the model to create text with images interleaved."
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WdkrXmyIFtgv" - }, - "source": [ - "⚠️ **Note:** we are asking the model to generate a lot of content in this prompt, so it will take a bit of time for this cell to finish executing." - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": { - "id": "LgoqgDCVVR4u" - }, - "outputs": [], - "source": [ - "%%bash\n", - "\n", - "curl -X POST \\\n", - " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n", - " -H \"Content-Type: application/json\" \\\n", - " https://${API_ENDPOINT}:generateContent \\\n", - " -d '{\n", - " \"contents\": {\n", - " \"role\": \"USER\",\n", - " \"parts\": { \"text\": \"Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio.\"},\n", - " },\n", - " \"generation_config\": {\n", - " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n", - " },\n", - " \"safetySettings\": {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\"\n", - " },\n", - " }' 2>/dev/null >response.json" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WvbxLZLqy2Kp" - }, - "source": [ - "Let's visualize the response." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "iUgEbabPHrVh" - }, - "outputs": [], - "source": [ - "with open(\"response.json\") as f:\n", - " response_data = json.load(f)\n", - "\n", - "for part in response_data[\"candidates\"][0][\"content\"][\"parts\"]:\n", - " if \"text\" in part.keys():\n", - " display(Markdown(part[\"text\"]))\n", - " if \"inlineData\" in part.keys():\n", - " content = part[\"inlineData\"][\"data\"]\n", - " image_data = base64.b64decode(content)\n", - " display(Image(data=image_data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "R3g5n23lDtsN" - }, - "source": [ - "## Image editing\n", - "\n", - "You can pass text and an image to Gemini 2.0 Flash for use cases like product captions, information about a particular image, or to make edits or modifications to an existing image.\n", - "\n", - "### Text and image to image\n", - "\n", - "Let's try out a style transfer example and ask Gemini 2.0 Flash to create an image of this dog in a 3D cartoon rendering." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SPLCmWPXaZjA" - }, - "source": [ - "Visualize the starting dog image by running this next cell." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yuTg_ZGCn95N" - }, - "outputs": [], - "source": [ - "image_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/dog-1.jpg\"\n", - ")\n", - "display(Image(url=image_url, width=350, height=350))" - ] - }, - { - "cell_type": "code", - "execution_count": 53, - "metadata": { - "id": "UAiUyW1VMzPn" - }, - "outputs": [], - "source": [ - "%%bash\n", - "\n", - "curl -X POST \\\n", - " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n", - " -H \"Content-Type: application/json\" \\\n", - " https://${API_ENDPOINT}:generateContent \\\n", - " -d '{\n", - " \"contents\": {\n", - " \"role\": \"USER\",\n", - " \"parts\": [\n", - " {\"file_data\": {\n", - " \"mime_type\": \"image/jpeg\",\n", - " \"file_uri\": \"gs://cloud-samples-data/generative-ai/image/dog-1.jpg\"\n", - " }\n", - " },\n", - " {\"text\": \"Create a 3D cartoon style portrait of this dog, include rounded, exaggerated facial features, saturated colors, and realistic-looking textures. The dog is wearing a cowboy hat.\"},\n", - " ]\n", - "\n", - " },\n", - " \"generation_config\": {\n", - " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n", - " },\n", - " \"safetySettings\": {\n", - " \"method\": \"PROBABILITY\",\n", - " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n", - " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\"\n", - " },\n", - " }' 2>/dev/null >response.json" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "53Q1jGTIf8oS" - }, - "source": [ - "Extract the image data from the response and visualize." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "4UuYoh92KJSA" - }, - "outputs": [], - "source": [ - "with open(\"response.json\") as f:\n", - " response_data = json.load(f)\n", - "\n", - "image_part = next(\n", - " filter(\n", - " lambda x: \"inlineData\" in x,\n", - " response_data[\"candidates\"][0][\"content\"][\"parts\"],\n", - " )\n", - ")\n", - "\n", - "image_data = base64.b64decode(image_part[\"inlineData\"][\"data\"])\n", - "display(Image(data=image_data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SlGAQflXXL2w" - }, - "source": [ - "### Multi-turn image editing\n", - "\n", - "In this next section, you supply a starting image and iteratively alter certain aspects of the image through chatting with Gemini 2.0 Flash.\n", - "\n", - "Visualize the starting image of a vase that's stored in Google Cloud Storage by running this next cell."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "mLRjMqLnpVkU" - }, - "outputs": [], - "source": [ - "image_url = (\n", - " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/vase.png\"\n", - ")\n", - "display(Image(url=image_url, width=350, height=350))" - ] - }, - { - "cell_type": "code", - "execution_count": 56, - "metadata": { - "id": "q1n-AD82pak6" - }, - "outputs": [], - "source": [ - "%%bash\n", - "\n", - "curl -X POST \\\n", - " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n", - " -H \"Content-Type: application/json\" \\\n", - " https://${API_ENDPOINT}:generateContent \\\n", - " -d '{\n", - " \"contents\": {\n", - " \"role\": \"USER\",\n", - " \"parts\": [\n", - " {\"file_data\": {\n", - " \"mime_type\": \"image/png\",\n", - " \"file_uri\": \"gs://cloud-samples-data/generative-ai/image/vase.png\"\n", - " }\n", - " },\n", - " {\"text\": \"add sunflowers to this vase\"},\n", - " ]\n", - " },\n", - " \"generation_config\": {\n", - " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n", - " }\n", - " }' 2>/dev/null >response.json" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6ixkSIAAa4mx" - }, - "source": [ - "Extract the image data from the response and visualize.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PCV4kWgmr1oy" - }, - "outputs": [], - "source": [ - "with open(\"response.json\") as f:\n", - " response_data = json.load(f)\n", - "\n", - "image_part = next(\n", - " filter(\n", - " lambda x: \"inlineData\" in x,\n", - " response_data[\"candidates\"][0][\"content\"][\"parts\"],\n", - " )\n", - ")\n", - "\n", - "image_data = base64.b64decode(image_part[\"inlineData\"][\"data\"])\n", - "display(Image(data=image_data, width=350, height=350))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2SDChAxBa_hE" - }, - "source": [ - "Now, you'll add to the `contents` of the last request by including another `user` text prompt." 
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 58,
- "metadata": {
- "id": "Y2DmOotur9Ws"
- },
- "outputs": [],
- "source": [
- "%%bash\n",
- "\n",
- "curl -X POST \\\n",
- " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n",
- " -H \"Content-Type: application/json\" \\\n",
- " https://${API_ENDPOINT}:generateContent \\\n",
- " -d '{\n",
- " \"contents\": [\n",
- " {\n",
- " \"role\": \"user\",\n",
- " \"parts\": [\n",
- " {\"file_data\": {\n",
- " \"mime_type\": \"image/png\",\n",
- " \"file_uri\": \"gs://cloud-samples-data/generative-ai/image/vase.png\"\n",
- " }\n",
- " },\n",
- " {\"text\": \"add sunflowers to this vase\"},\n",
- " ]\n",
- " },\n",
- " {\n",
- " \"role\": \"user\",\n",
- " \"parts\": [\n",
- " { \"text\": \"replace the sunflowers in the vase with pink and purple tulips\" },\n",
- " ],\n",
- " },\n",
- " ],\n",
- " \"generation_config\": {\n",
- " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n",
- " },\n",
- " }' 2>/dev/null >response.json"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "P91PCAb9bU8X"
- },
- "source": [
- "Extract the image data from the response and visualize.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "6JMtRvBHuLGh"
- },
- "outputs": [],
- "source": [
- "with open(\"response.json\") as f:\n",
- " response_data = json.load(f)\n",
- "\n",
- "image_part = next(\n",
- " filter(\n",
- " lambda x: \"inlineData\" in x,\n",
- " response_data[\"candidates\"][0][\"content\"][\"parts\"],\n",
- " )\n",
- ")\n",
- "\n",
- "image_data = base64.b64decode(image_part[\"inlineData\"][\"data\"])\n",
- "display(Image(data=image_data, width=350, height=350))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "IzYiZyGzYZbM"
- },
- "source": [
- "### Images and text to image and text\n",
- "\n",
- "When editing images with Gemini 2.0 Flash, you can also supply multiple input images to create new ones. In this next example, you'll prompt Gemini with an image of a teacup and an outdoor table. You'll then ask Gemini to combine the objects from these images in order to create a new one. You'll also ask Gemini to supply text to accompany the image.\n",
- "\n",
- "Visualize the starting images by running this next cell."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "8Xp4NndcRzey"
- },
- "outputs": [],
- "source": [
- "table_url = (\n",
- " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/table.png\"\n",
- ")\n",
- "display(Image(url=table_url, width=300, height=300))\n",
- "\n",
- "teacup_url = (\n",
- " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/teacup-1.png\"\n",
- ")\n",
- "display(Image(url=teacup_url, width=300, height=300))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 61,
- "metadata": {
- "id": "fVB5dt8sRKf3"
- },
- "outputs": [],
- "source": [
- "%%bash\n",
- "\n",
- "curl -X POST \\\n",
- " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n",
- " -H \"Content-Type: application/json\" \\\n",
- " https://${API_ENDPOINT}:generateContent \\\n",
- " -d '{\n",
- " \"contents\": {\n",
- " \"role\": \"USER\",\n",
- " \"parts\": [\n",
- " { \"text\": \"Generate a side profile image of a person sitting at this table drinking out of this teacup in a 1:1 aspect ratio. Include a caption that could be used to post this image on social media.\"},\n",
- " {\"file_data\": {\n",
- " \"mime_type\": \"image/png\",\n",
- " \"file_uri\": \"gs://cloud-samples-data/generative-ai/image/table.png\"\n",
- " }\n",
- " },\n",
- " {\"file_data\": {\n",
- " \"mime_type\": \"image/png\",\n",
- " \"file_uri\": \"gs://cloud-samples-data/generative-ai/image/teacup-1.png\"\n",
- " }\n",
- " },\n",
- " ]\n",
- " },\n",
- " \"generation_config\": {\n",
- " \"response_modalities\": [\"TEXT\", \"IMAGE\"],\n",
- " },\n",
- " \"safetySettings\": {\n",
- " \"method\": \"PROBABILITY\",\n",
- " \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n",
- " \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\"\n",
- " },\n",
- " }' 2>/dev/null >response.json"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "PcwK0UQ7bk7f"
- },
- "source": [
- "Extract the text and image data from the response and visualize.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "nG_3PKV6S3cY"
- },
- "outputs": [],
- "source": [
- "with open(\"response.json\") as f:\n",
- " response_data = json.load(f)\n",
- "\n",
- "for part in response_data[\"candidates\"][0][\"content\"][\"parts\"]:\n",
- " if \"text\" in part:\n",
- " display(Markdown(part[\"text\"]))\n",
- " if \"inlineData\" in part:\n",
- " content = part[\"inlineData\"][\"data\"]\n",
- " image_data = base64.b64decode(content)\n",
- " display(Image(data=image_data, width=350, height=350))"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "name": "intro_gemini_2_0_image_gen_rest_api.ipynb",
- "toc_visible": true
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}

From 7e9f9bc1d6b8359d81a8aee1606dca3da3167df6 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Wed, 4 Feb 2026 12:40:32 -0600
Subject: [PATCH 2/2] Remove Gemini 2.0 from Styleguide

---
 .gemini/styleguide.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/.gemini/styleguide.md b/.gemini/styleguide.md
index 9033578f6d6..4ec71b1f94a 100644
--- a/.gemini/styleguide.md
+++ b/.gemini/styleguide.md
@@ -110,11 +110,10 @@
 and Vertex AI) as of 2025. Do not use legacy libraries and SDKs.
 - It is also acceptable to use the following models if explicitly requested
   by the user:
-  - **Gemini 2.0 Series**: `gemini-2.0-flash`, `gemini-2.0-flash-lite`
   - **Gemini 2.5 Series**: `gemini-2.5-flash`, `gemini-2.5-pro`
 - Do not use the following deprecated models (or their variants like
   `gemini-1.5-flash-latest`):
-  - **Prohibited:** `gemini-1.5-flash`
-  - **Prohibited:** `gemini-1.5-pro`
+  - **Gemini 2.0 Series**: `gemini-2.0-flash`, `gemini-2.0-flash-lite`
+  - **Gemini 1.5 Series**: `gemini-1.5-flash`, `gemini-1.5-pro`
 - **Prohibited:** `gemini-pro`
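
For reference, a minimal sketch of model usage that complies with the updated styleguide above. It uses the current Google Gen AI SDK with a model ID from the allowed list; the project ID is a hypothetical placeholder:

```python
from google import genai

# Hypothetical project ID; adjust for your environment.
client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")

# "gemini-2.5-flash" is on the allowed list above; the
# gemini-2.0-* and gemini-1.5-* IDs are now deprecated.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Say hello in one sentence.",
)
print(response.text)
```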