Bedrock Rag is missing ppt, pptx text splitting support with knowledgebase queries #819
Comments
This is a big gap in functionality, and it should be quite a simple addition. Anthropic already supports it, so at least the Claude models should be able to support pptx very easily.
Indeed, but text splitting happens before any LLM query, so any model, including Titan, should be able to use it easily, the same as with PDFs.
Thanks for the feedback! I think this is a good addition to the Bedrock service API. I will move this issue to a cross-SDK issue and open a feature request with the service team! Thanks!
Added Product Feature Request, title - "Add ppt, pptx text splitting support in Bedrock Rag knowledge base query"
Thanks again for the feature request. The Bedrock team is continuing to track this in their backlog for consideration. We're going to close this on our end, as the service team would need to take the next steps here. Please refer to the blog or CHANGELOG for updates, or feel free to reach out through support if you have a support plan. Thanks!
This issue is now closed. Comments on closed issues are hard for our team to see.
Describe the feature
The currently supported formats lag behind other systems at the moment:
format: "pdf" || "csv" || "doc" || "docx" || "xls" || "xlsx" || "html" || "txt" || "md", // required
but for RAG you also need ppt and pptx (PowerPoint text splitting).
To be 100% complete you would also need mp4, mp3, YouTube URLs, YouTube channels, and JSON (someone implied JSON is already supported, but I have not seen it).
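To make the gap concrete, here is a minimal TypeScript sketch that checks a file name against the format list quoted above. `SUPPORTED_FORMATS` and `detectFormat` are hypothetical names used for illustration only; they are not part of the AWS SDK, and the list is simply copied from the `format` field shown in this issue.

```typescript
// Formats copied from the `format` field quoted in this issue.
// Assumption: this list is complete for the SDK version reported here.
const SUPPORTED_FORMATS = [
  "pdf", "csv", "doc", "docx", "xls", "xlsx", "html", "txt", "md",
] as const;

type SupportedFormat = (typeof SUPPORTED_FORMATS)[number];

// Hypothetical helper: maps a file name to a supported format,
// or returns undefined when the extension is not in the list.
function detectFormat(fileName: string): SupportedFormat | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return undefined; // no extension at all
  const ext = fileName.slice(dot + 1).toLowerCase();
  const known = SUPPORTED_FORMATS as readonly string[];
  return known.indexOf(ext) >= 0 ? (ext as SupportedFormat) : undefined;
}
```

With this list, `detectFormat("deck.pptx")` returns `undefined` while `detectFormat("Report.PDF")` returns `"pdf"`, which is the situation described here: PowerPoint files have to be filtered out or converted before they can be used.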
Use Case
You support everything else, including Word, Excel, txt, csv, and html, but not PowerPoint.
A lot of information lives in PowerPoint files: company info, results, and numerous training presentations. Ingesting them for RAG and using that information covers a substantial set of use cases.
Proposed Solution
Implement text extraction and embedding for PowerPoint files (as you already do for PDFs). If you are using LangChain in the background, it is a five-minute job to add PPT/PPTX conversion as a loader type, but I don't know your underlying implementation.
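As a rough sketch of what the extraction step could look like (this is not Bedrock's implementation, which is unknown here): a .pptx file is a ZIP archive of XML parts, and the visible text of each slide sits in `<a:t>` elements grouped into `<a:p>` paragraphs. The example below assumes the slide XML has already been read out of the archive with a ZIP library of your choice; `extractSlideText` and `splitIntoChunks` are hypothetical helper names, and the regex-based parsing is a simplification rather than a full XML parser.

```typescript
// Decode the five predefined XML entities. "&amp;" is decoded last so
// that "&amp;lt;" becomes "&lt;" rather than "<".
function decodeXmlEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Hypothetical helper: returns one line of text per non-empty <a:p> paragraph.
function extractSlideText(slideXml: string): string {
  const lines: string[] = [];
  // Matches <a:p> or <a:p attr="..."> but not <a:pPr> or <a:p/>.
  const paragraphRe = /<a:p(?:\s[^>]*)?>([\s\S]*?)<\/a:p>/g;
  let paragraph: RegExpExecArray | null;
  while ((paragraph = paragraphRe.exec(slideXml)) !== null) {
    // Matches <a:t> text runs but not <a:tbl>, <a:tc>, <a:tr>, or <a:tab/>.
    const runRe = /<a:t(?:\s[^>]*)?>([\s\S]*?)<\/a:t>/g;
    let text = "";
    let run: RegExpExecArray | null;
    while ((run = runRe.exec(paragraph[1])) !== null) {
      text += decodeXmlEntities(run[1]);
    }
    if (text.trim().length > 0) lines.push(text);
  }
  return lines.join("\n");
}

// Hypothetical helper: greedily packs whole lines into chunks of at most
// maxChars characters. A single line longer than maxChars is kept whole
// as its own oversized chunk rather than being cut mid-sentence.
function splitIntoChunks(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const line of text.split("\n")) {
    if (line.length === 0) continue;
    const candidate = current.length === 0 ? line : current + "\n" + line;
    if (candidate.length > maxChars && current.length > 0) {
      chunks.push(current);
      current = line;
    } else {
      current = candidate;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

In a .pptx archive the slides are stored as `ppt/slides/slide1.xml`, `ppt/slides/slide2.xml`, and so on, so a caller would run `extractSlideText` on each part and feed the result to `splitIntoChunks` before embedding. The legacy binary .ppt format is not covered by this sketch and would need a separate converter.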
Other Information
No response
Acknowledgements
SDK version used
3.651.1
Environment details (OS name and version, etc.)
Debian Linux, Node.js on EC2, or even directly on AWS Lambda