-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix(models): Add PDF support for Claude models #3671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add _is_pdf_part() helper function to detect PDF parts - Add PDF handling in part_to_message_block() function - PDFs are encoded as base64 and sent as document blocks to Anthropic API - Update return type annotation to include dict for PDF document blocks - Add test for PDF support Fixes google#3614
Summary of ChangesHello @sarojrout, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for PDF documents when using Claude models, which is a great feature enhancement. The implementation correctly handles PDF parts by encoding them as base64 and formatting them as document blocks for the Anthropic API. The addition of a unit test ensures the new logic is working as expected. I have one suggestion to improve the code's type safety and maintainability by using a TypedDict for the document block, which would align it with how other content types are handled in the file.
- Add DocumentBlockParam TypedDict for better type safety - Check for anthropic_types.DocumentBlockParam and use it if available - Fallback to local TypedDict when DocumentBlockParam not in anthropic library - Replace dict[str, Any] with DocumentBlockParam in return type annotation - Maintains backward compatibility while improving type safety Addresses reviewer feedback on PR google#3614
…DocumentBlockParam directly - Removed custom DocumentBlockParam TypedDict (not needed) - Use anthropic_types.DocumentBlockParam which exists in the library - Simplified code by removing hasattr check and fallback logic - Updated testcase to explicitly import anthropic_types
|
can we review and merge this if looks good @ryanaiagent ? |
|
This is amazing to see in-person how coding at a high-level is done correctly! |
|
Hi @sarojrout , Thank you for your contribution through this pull request! This PR has merge conflicts that require changes from your end. Could you please rebase your branch with the latest main branch to address these? Once this is complete, please let us know so we can proceed with the review. |
|
@ryanaiagent , pls review again. thanks! |
Fixes #3614
Please ensure you have read the contribution guide before creating a pull request.
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
2. Or, if no issue exists, describe the change:
If applicable, please follow the issue templates to provide as much detail as
possible.
Problem:
When using Claude models (e.g., Claude Sonnet 4.5) in ADK with PDF files, the code throws a
NotImplementedError: Not supported yeterror. Thepart_to_message_block()function inanthropic_llm.pyhandles text, images, function calls, and function responses, but does not handle PDF documents. When a user attempts to upload a PDF file (withmime_type="application/pdf"), the function falls through to the finalNotImplementedErrorat line 150.Solution:
Added PDF support by:
_is_pdf_part()helper function (similar to_is_image_part()) to detect PDF parts by checking formime_type == "application/pdf"part_to_message_block()function that:type="document"and the base64-encoded PDF datadict[str, Any]for PDF document blocksThis solution follows the same pattern used for image handling and leverages Anthropic's API support for PDF documents as document blocks.
Testing Plan
Please describe the tests that you ran to verify your changes. This is required
for all PRs that are not small documentation or typo fixes.
Unit Tests:
Please include a summary of passed
pytestresults.Manual End-to-End (E2E) Tests:
Please provide instructions on how to manually test your changes, including any
necessary setup or configuration. Please provide logs or screenshots to help
reviewers better understand the fix.
Setup:
GOOGLE_CLOUD_PROJECTandGOOGLE_CLOUD_LOCATIONTest Steps:
Create an ADK agent using Claude model:
Upload a PDF file to the agent:
Expected Result:
NotImplementedErroris raisedChecklist
Additional context
Code Changes Summary:
Technical Details: