
Parallel function calling as the ONLY mode is not sufficient - need a sequential function calling mode #657

Open
jflam opened this issue Dec 26, 2024 · 3 comments
Assignees: Gunand3043
Labels: component:python sdk (Issue/PR related to Python SDK), status:awaiting user response (Awaiting a response from the author), type:help (Support-related issues)

Comments


jflam commented Dec 26, 2024

Description of the bug:

This is the official Google example from the SDK with my change added to it:

import google.generativeai as genai
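# Assumes the API key is already configured, e.g. via the GOOGLE_API_KEY
# environment variable or an explicit genai.configure(api_key=...) call.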

def add(a: float, b: float):
    """returns a + b."""
    return a + b

def subtract(a: float, b: float):
    """returns a - b."""
    return a - b

def multiply(a: float, b: float):
    """returns a * b."""
    return a * b

def divide(a: float, b: float):
    """returns a / b."""
    return a / b

model = genai.GenerativeModel(
    model_name="gemini-1.5-flash", tools=[add, subtract, multiply, divide]
)
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message(
    # "I have 57 cats, each owns 44 mittens, how many mittens is that in total?"
    "What is 3+4*7?"
)
print(response.text)

Actual vs expected behavior:

The actual behavior is a runtime error:

➜  uv run gemini_calc.py
Traceback (most recent call last):
  File "/Users/jflam/src/promptscript/scripts/gemini_calc.py", line 23, in <module>
    response = chat.send_message(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 591, in send_message
    self.history, content, response = self._handle_afc(
                                      ^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 647, in _handle_afc
    fr = tools_lib(fc)
         ^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/types/content_types.py", line 867, in __call__
    response = declaration(fc)
               ^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/types/content_types.py", line 627, in __call__
    result = self.function(**fc.args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/scripts/gemini_calc.py", line 5, in add
    return a + b
           ~~^~~
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735176076.235002 11326435 init.cc:229] grpc_wait_for_shutdown_with_timeout() timed out.

The expected behavior would match how Anthropic and OpenAI handle this case: the SDK loops, calling one function at a time until the final result (31) is computed.

Right now I have no idea how to work around this, as the service complains with a 400 error if I try to execute just the first function and return nothing for the second.

I think the SDK is trying to be too cute with the parallel calls. Please add a sequential execution mode.
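
For reference, the closest I can get to a workaround is to answer every FunctionCall in the batch with a matching FunctionResponse part in the same message (answering only the first call is what triggers the 400). A rough sketch of a single turn, reusing the functions above, with automatic function calling turned off:

# Hypothetical manual handling of one parallel batch: reply with one
# FunctionResponse per FunctionCall, all in a single message.
fns = {"add": add, "subtract": subtract, "multiply": multiply, "divide": divide}
chat = model.start_chat()  # automatic function calling off
response = chat.send_message("What is 3+4*7?")
response_parts = [
    genai.protos.Part(function_response=genai.protos.FunctionResponse(
        name=p.function_call.name,
        response={"result": fns[p.function_call.name](**dict(p.function_call.args))}))
    for p in response.parts if p.function_call
]
response = chat.send_message(response_parts)

But with 3+4*7 the second call in the batch arrives with a missing argument (presumably because it depends on the first call's result), so executing it raises the same TypeError as above.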

Any other information you'd like to share?

No response

Gunand3043 added the type:feature request, status:triaged, and component:python sdk labels Dec 26, 2024
Gunand3043 self-assigned this Dec 26, 2024
Gunand3043 removed the type:feature request label Dec 26, 2024

Gunand3043 commented Dec 26, 2024

Hi @jflam,

The SDK also supports sequential function calling. Here is the tutorial notebook, which includes a function-calling chain example: Notebook
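
In outline, the chain pattern is one function call per model turn, with each result fed back before the model decides on the next call. A condensed sketch (see the notebook for the full version; function names are from your example):

fns = {"add": add, "subtract": subtract, "multiply": multiply, "divide": divide}
chat = model.start_chat()  # manual loop instead of automatic function calling
response = chat.send_message("What is 3+4*7?")
# Loop while the model keeps requesting functions; a capable model asks
# for multiply, then add, one turn at a time, and finishes with text.
while (fc := next((p.function_call for p in response.parts if p.function_call), None)):
    result = fns[fc.name](**dict(fc.args))
    response = chat.send_message(genai.protos.Part(
        function_response=genai.protos.FunctionResponse(
            name=fc.name, response={"result": result})))
print(response.text)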

The issue you are facing is due to model quality. Try one of the more advanced models, gemini-1.5-pro or gemini-2.0-flash-exp; it should work.

Thanks

Gunand3043 added the status:awaiting user response and type:help labels and removed the status:triaged label Dec 26, 2024

jflam commented Dec 26, 2024

Definitely not the model: running with 1.5-pro yields the same result. Incidentally, how would I fix the root cause of the warning that is printed at the bottom? It is unsettling.

Traceback (most recent call last):
  File "/Users/jflam/src/promptscript/scripts/gemini_calc.py", line 23, in <module>
    response = chat.send_message(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 591, in send_message
    self.history, content, response = self._handle_afc(
                                      ^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 647, in _handle_afc
    fr = tools_lib(fc)
         ^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/types/content_types.py", line 867, in __call__
    response = declaration(fc)
               ^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/.venv/lib/python3.12/site-packages/google/generativeai/types/content_types.py", line 627, in __call__
    result = self.function(**fc.args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jflam/src/promptscript/scripts/gemini_calc.py", line 5, in add
    return a + b
           ~~^~~
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735227752.959785 12562432 init.cc:229] grpc_wait_for_shutdown_with_timeout() timed out.

Here is a model program that I wrote a few days ago which does work. I have no idea why it works and my (more complex) system does not.

#!/usr/bin/env python3

import os
from typing import Any, Dict
from rich.pretty import pretty_repr

# Requires: pip install google-generativeai
import google.generativeai as genai
from google.protobuf.struct_pb2 import Struct
from pydantic import BaseModel, ValidationError

#
# 1. Define our Pydantic model
#
class CalculationResult(BaseModel):
    answer: float
    explanation: str

#
# 2. Define local Python "tools" (functions)
#
def add_numbers(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

def subtract_numbers(a: float, b: float) -> float:
    """Subtract second number from first."""
    return a - b

def multiply_numbers(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def divide_numbers(a: float, b: float) -> float:
    """Divide a by b."""
    return a / b

#
# We'll define a special "return_calculation_result" function
# so we can force the final answer to be returned as JSON
#
def return_calculation_result(answer: float, explanation: str):
    """
    Return final structured JSON matching CalculationResult.
    Args:
      answer: numeric answer
      explanation: text explanation
    """
    return {"answer": answer, "explanation": explanation}

# Map from function name to the Python callable
FUNCTIONS = {
    "add_numbers": add_numbers,
    "subtract_numbers": subtract_numbers,
    "multiply_numbers": multiply_numbers,
    "divide_numbers": divide_numbers,
    "return_calculation_result": return_calculation_result,
}

def call_function(function_name: str, args: Dict[str, Any]) -> Any:
    """
    Dispatch a function call to one of our local tools.
    """
    if function_name not in FUNCTIONS:
        raise ValueError(f"Unknown function: {function_name}")
    func = FUNCTIONS[function_name]
    return func(**args)

#
# 3. Main logic
#
def main():
    # For a local environment, you could just do:
    GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
    if not GEMINI_API_KEY:
        raise RuntimeError("Please set GEMINI_API_KEY in environment.")
    genai.configure(api_key=GEMINI_API_KEY)

    # 3b) Create a GenerativeModel that includes our Python tools
    model = genai.GenerativeModel(
        model_name="gemini-1.5-flash",  # or another Gemini model
        tools=FUNCTIONS.values(),      # pass the function objects
    )

    # 3c) We'll keep a conversation in a "messages" format
    # Each entry: {"role": "user"/"model", "parts": [...]}
    messages = [
        {
            "role": "user",
            "parts": ["Compute the answer to: 3 + 4 * 7. Then explain how you got that result."]
        }
    ]

    # 3d) We'll do a simple while loop:
    while True:
        # 1) Call the model with the conversation so far
        response = model.generate_content(messages)

        # 2) The model's top candidate
        parts = response.candidates[0].content.parts
        print(f"\n[INFO] Model response:\n{pretty_repr(parts)}")

        # 3) We store the model's reply in our conversation
        new_message = {"role": "model", "parts": parts}
        messages.append(new_message)

        # 4) Check for function calls or text
        function_calls = [p.function_call for p in parts if p.function_call]
        text_blocks = [p.text for p in parts if p.text]

        if function_calls:
            # The model wants to call one or more functions
            for fn_call in function_calls:
                fn_name = fn_call.name
                # fn_call.args is a dict-like object (MapComposite).
                # Convert it into a real Python dict:
                fn_args = dict(fn_call.args)  # or {k: v for k, v in fn_call.args.items()}

                print(f"[DEBUG] Model wants to call: {fn_name}({fn_args})")

                # Execute the function locally
                try:
                    result = call_function(fn_name, fn_args)
                except Exception as ex:
                    result = {"error": str(ex)}

                # 5) Send back a function_response
                s = Struct()
                # 'result' might be scalar, dict, etc. 
                # The .update() method requires a dict at top level, so wrap if needed:
                s.update({"result": result})

                function_response_part = genai.protos.Part(
                    function_response=genai.protos.FunctionResponse(
                        name=fn_name,
                        response=s
                    )
                )

                # We'll append this function_response as a new user message
                # (the "user" is providing the function result)
                user_msg = {
                    "role": "user",
                    "parts": [function_response_part]
                }
                messages.append(user_msg)

        else:
            # If there's no function_call, then presumably the model is done with tools
            # or just wants to provide a final text answer. 
            if text_blocks:
                print("[DEBUG] Model returned text:\n", "\n".join(text_blocks))

            # Let's now do the final pass to get a structured JSON:
            final_pass(messages, model)
            break

    print("\n[INFO] Done.")

def final_pass(messages, model):
    """
    4) We do one final request to get a CalculationResult as JSON using
       our 'return_calculation_result' function.
    """
    final_request = {
        "role": "user",
        "parts": [
            "Now call the `return_calculation_result(answer: float, explanation: str)` function. "
            "Please do not add any extra text. I want strictly JSON data for the final result."
        ]
    }
    messages.append(final_request)

    response = model.generate_content(messages)
    parts = response.candidates[0].content.parts

    # If we get a function_call to "return_calculation_result", parse it
    function_calls = [p.function_call for p in parts if p.function_call]
    if function_calls:
        fn_call = function_calls[0]
        if fn_call.name != "return_calculation_result":
            print(f"[ERROR] Expected return_calculation_result, got {fn_call.name}")
            return
        fn_args = dict(fn_call.args)
        try:
            # Attempt to parse into CalculationResult
            calc = CalculationResult(**fn_args)
            print("\n***** Final CalculationResult *****")
            print(calc.model_dump_json(indent=2))
        except ValidationError as e:
            print("[ERROR] Could not parse final JSON into CalculationResult:\n", e)
    else:
        # Possibly the model just returned text. Let's see:
        text_blocks = [p.text for p in parts if p.text]
        raw_text = "\n".join(text_blocks).strip()
        print("[WARNING] Model did not call return_calculation_result. Text returned:\n", raw_text)


if __name__ == "__main__":
    main()

Gunand3043 removed the status:awaiting user response label Dec 27, 2024
Gunand3043 commented Dec 27, 2024

I tried using the gemini-1.5-pro model with automatic function calling, and it worked as expected. However, with gemini-1.5-flash the results were inconsistent: sometimes it worked, sometimes it failed. Here is the colab gist link.

Gunand3043 added the status:awaiting user response label Dec 27, 2024