@sarojrout
Contributor

  • Add _is_pdf_part() helper function to detect PDF parts
  • Add PDF handling in part_to_message_block() function
  • PDFs are encoded as base64 and sent as document blocks to Anthropic API
  • Update return type annotation to include dict for PDF document blocks
  • Add test for PDF support

Fixes #3614

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

2. Or, if no issue exists, describe the change:

If applicable, please follow the issue templates to provide as much detail as
possible.

Problem:
Using Claude models (e.g., Claude Sonnet 4.5) in ADK with PDF files raises NotImplementedError: Not supported yet. The part_to_message_block() function in anthropic_llm.py handles text, images, function calls, and function responses, but not PDF documents. When a user uploads a PDF file (with mime_type="application/pdf"), the function falls through to the final NotImplementedError at line 150.

Solution:
Added PDF support by:

  1. Creating a _is_pdf_part() helper function (similar to _is_image_part()) that detects PDF parts by checking for mime_type == "application/pdf"
  2. Adding PDF handling to the part_to_message_block() function, which now:
    • Detects PDF parts using the new helper function
    • Encodes the PDF data as base64 (the same as for images)
    • Returns a document block dictionary with type="document" and the base64-encoded PDF data
  3. Updating the return type annotation to include dict[str, Any] for PDF document blocks
  4. Adding a unit test to verify that PDF handling works correctly

This solution follows the same pattern used for image handling and leverages Anthropic's API support for PDF documents as document blocks.

Testing Plan

Please describe the tests that you ran to verify your changes. This is required
for all PRs that are not small documentation or typo fixes.

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Please include a summary of passed pytest results.

Manual End-to-End (E2E) Tests:

Please provide instructions on how to manually test your changes, including any
necessary setup or configuration. Please provide logs or screenshots to help
reviewers better understand the fix.

Setup:

  1. Configure Claude model with proper Vertex AI credentials
  2. Set environment variables: GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION

Test Steps:

  1. Create an ADK agent using Claude model:

    from google.adk import Agent
    from google.adk.models.anthropic_llm import Claude
    
    agent = Agent(
        name="pdf_reader",
        model=Claude(model="claude-3-5-sonnet-v2@20241022"),
        instruction="Analyze PDF documents"
    )
  2. Upload a PDF file to the agent:

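    import asyncio

    from google.adk.runners import InMemoryRunner
    from google.genai import types

    async def main():
        # Assumption: an in-memory runner and session are enough for this
        # manual test; `agent` is the agent created in step 1.
        runner = InMemoryRunner(agent=agent, app_name="pdf_test")
        await runner.session_service.create_session(
            app_name="pdf_test", user_id="test-user", session_id="test-session"
        )

        # Read PDF file
        with open("document.pdf", "rb") as f:
            pdf_data = f.read()

        # Create content with PDF
        content = types.Content(
            role="user",
            parts=[
                types.Part(
                    inline_data=types.Blob(
                        mime_type="application/pdf",
                        data=pdf_data
                    )
                )
            ]
        )

        # Run agent - should now work without NotImplementedError
        async for event in runner.run_async(
            user_id="test-user",
            session_id="test-session",
            new_message=content
        ):
            print(event)

    asyncio.run(main())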

Expected Result:

  • No NotImplementedError is raised
  • PDF is successfully sent to Claude API as a document block
  • Claude can process and analyze the PDF content
  • Agent responds with analysis of the PDF

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end. (Note: Manual testing requires Vertex AI setup with Claude access)
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Code Changes Summary:

  • File: src/google/adk/models/anthropic_llm.py
    • Added _is_pdf_part() helper function (lines 79-85)
    • Added PDF handling in part_to_message_block() (lines 147-159)
    • Updated return type annotation (line 95)
  • File: tests/unittests/models/test_anthropic_llm.py
    • Added test_part_to_message_block_with_pdf() test (lines 467-496)

Technical Details:

  • PDFs are handled similarly to images: base64-encoded and sent as document blocks
  • The implementation follows Anthropic's API specification for document blocks
  • The fix is backward compatible - existing functionality (text, images, function calls) remains unchanged
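
To illustrate the backward-compatibility point, here is a simplified sketch of the dispatch order: text and image handling are unchanged, PDFs get a new branch, and anything else still falls through to NotImplementedError. The to_message_block function and its flat signature are hypothetical simplifications, not the signature of part_to_message_block() in ADK.

```python
import base64
from typing import Any, Optional


def to_message_block(
    text: Optional[str] = None,
    mime_type: Optional[str] = None,
    data: Optional[bytes] = None,
) -> dict[str, Any]:
    """Hypothetical, simplified dispatch; not the ADK function signature."""
    if text is not None:
        return {"type": "text", "text": text}
    if mime_type is not None and data is not None:
        # Images and PDFs share the same base64 source shape.
        source = {
            "type": "base64",
            "media_type": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        }
        if mime_type.startswith("image/"):
            return {"type": "image", "source": source}
        if mime_type == "application/pdf":
            # New branch: PDFs become document blocks.
            return {"type": "document", "source": source}
    # Unsupported inputs still fall through, as before the change.
    raise NotImplementedError("Not supported yet.")


text_block = to_message_block(text="hello")
image_block = to_message_block(mime_type="image/png", data=b"\x89PNG")
pdf_block = to_message_block(mime_type="application/pdf", data=b"%PDF-1.4")
```

In this sketch the image and document blocks differ only in their top-level type, which is why the PDF branch can reuse the image encoding path.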

usharout and others added 3 commits November 16, 2025 01:01
- Add _is_pdf_part() helper function to detect PDF parts
- Add PDF handling in part_to_message_block() function
- PDFs are encoded as base64 and sent as document blocks to Anthropic API
- Update return type annotation to include dict for PDF document blocks
- Add test for PDF support

Fixes google#3614
@gemini-code-assist
Contributor

Summary of Changes

Hello @sarojrout, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves the NotImplementedError raised when PDF files are used with Claude models in the ADK framework. It adds PDF support by base64-encoding PDF parts and sending them to Anthropic's API as document blocks, mirroring the existing image handling, so that ADK agents can analyze PDF content.

Highlights

  • PDF Part Detection: Introduced a new helper function _is_pdf_part() to identify PDF data within message parts by checking for the application/pdf mime type.
  • PDF Handling in Message Blocks: Implemented logic within part_to_message_block() to process PDF parts by base64 encoding their data and formatting them as Anthropic API document blocks, similar to how images are handled.
  • Type Annotation Update: Updated the return type annotation for part_to_message_block() to include dict[str, Any] to correctly reflect the new dictionary type used for PDF document blocks.
  • Comprehensive Unit Test: Added a dedicated unit test (test_part_to_message_block_with_pdf()) to verify the correct handling and base64 encoding of PDF documents.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@adk-bot adk-bot added the models [Component] Issues related to model support label Nov 23, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for PDF documents when using Claude models, which is a great feature enhancement. The implementation correctly handles PDF parts by encoding them as base64 and formatting them as document blocks for the Anthropic API. The addition of a unit test ensures the new logic is working as expected. I have one suggestion to improve the code's type safety and maintainability by using a TypedDict for the document block, which would align it with how other content types are handled in the file.

- Add DocumentBlockParam TypedDict for better type safety
- Check for anthropic_types.DocumentBlockParam and use it if available
- Fallback to local TypedDict when DocumentBlockParam not in anthropic library
- Replace dict[str, Any] with DocumentBlockParam in return type annotation
- Maintains backward compatibility while improving type safety

Addresses reviewer feedback on PR google#3614
…DocumentBlockParam directly

- Removed custom DocumentBlockParam TypedDict (not needed)
- Use anthropic_types.DocumentBlockParam which exists in the library
- Simplified code by removing hasattr check and fallback logic
- Updated testcase to explicitly import anthropic_types
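
The two follow-up commits above can be sketched as follows: prefer the DocumentBlockParam type from the anthropic SDK and fall back to a local TypedDict only when it cannot be imported. The fallback classes and the make_document_block helper are hypothetical illustrations. Per the final commit, the PR ended up using anthropic_types.DocumentBlockParam directly, without a fallback.

```python
from typing import Literal, TypedDict

try:
    # Available in recent releases of the anthropic SDK.
    from anthropic.types import DocumentBlockParam
except ImportError:
    # Hypothetical local fallback mirroring the base64 PDF document shape.
    class _Base64PDFSource(TypedDict):
        type: Literal["base64"]
        media_type: Literal["application/pdf"]
        data: str

    class DocumentBlockParam(TypedDict):
        type: Literal["document"]
        source: _Base64PDFSource


def make_document_block(encoded_pdf: str) -> DocumentBlockParam:
    """Wrap already base64-encoded PDF data in a typed document block."""
    return DocumentBlockParam(
        type="document",
        source={
            "type": "base64",
            "media_type": "application/pdf",
            "data": encoded_pdf,
        },
    )


block = make_document_block("JVBERi0xLjQ=")
```

A TypedDict is a plain dict at runtime, so the typed return value is still serialized like the earlier dict[str, Any] version, while type checkers can now flag a misspelled key or a wrong literal.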
@ryanaiagent ryanaiagent self-assigned this Nov 25, 2025
@sarojrout
Contributor Author

Can we review and merge this if it looks good, @ryanaiagent?

@jesse-aluiso

This is amazing to see in-person how coding at a high-level is done correctly!

@ryanaiagent
Collaborator

Hi @sarojrout, thank you for your contribution through this pull request! This PR has merge conflicts that require changes on your end. Could you please rebase your branch onto the latest main branch to resolve them? Once this is complete, please let us know so we can proceed with the review.

@ryanaiagent ryanaiagent added the request clarification [Status] The maintainer need clarification or more information from the author label Nov 30, 2025
@sarojrout
Contributor Author

@ryanaiagent, please review again. Thanks!

Labels

models [Component] Issues related to model support request clarification [Status] The maintainer need clarification or more information from the author

Development

Successfully merging this pull request may close these issues.

Claude imported from anthropic_llm throws NotImplementedError when using Anthropic LLM while uploading PDFs

5 participants