gpt4v
Here are 37 public repositories matching this topic...
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Updated Sep 26, 2024 - Python
Vision utilities for web interaction agents 👀
Updated Nov 25, 2024 - Jupyter Notebook
Control Any Computer Using LLMs
Updated Dec 13, 2024 - Python
Prompts for GPT-4V & DALL-E3 to fully utilize their multi-modal abilities. GPT4V Prompts, DALL-E3 Prompts.
Updated Oct 25, 2023
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Updated Jul 1, 2024 - Python
Convert different model APIs into the OpenAI API format out of the box.
Updated Feb 21, 2024 - Go
GPT-4V in Wonderland: LMMs as Smartphone Agents
Updated Jul 17, 2024 - Python
Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
Updated Nov 11, 2024 - Python
The ultimate sketch-to-code app, made using GPT-4o and serving 25k+ users. Choose your desired framework (React, Next, React Native, Flutter) for your app. It instantly generates code and a sandbox preview from a simple hand-drawn sketch on paper captured from a webcam.
Updated May 3, 2024
Early Alpha Release: Chat with Your Image - Leveraging GPT-4 Vision and Function Calls for AI-Powered Image Analysis and Description
Updated Nov 29, 2023 - TypeScript
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
Updated May 22, 2024 - Python
Video Voiceover with gpt-4o-mini
Updated Sep 27, 2024 - Jupyter Notebook
Monitor the performance of OpenAI's GPT-4V model over time.
Updated Dec 25, 2024 - HTML
This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion and enrichment flows, a RAG with Vision pipeline, and evaluation tools.
Updated Dec 9, 2024 - Python
Mark web pages for use with vision-language models
Updated Sep 22, 2024 - TypeScript
Language instructions to myCobot using GPT-4V
Updated Dec 11, 2023 - Python
Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2
Updated Nov 11, 2024 - Python