multi-modal-llm

Here are 2 public repositories matching this topic...

VIPL-VISMOD / UniPose

[CVPR 2025] UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

language-model human-pose text-driven multi-modal-llm

Updated Apr 7, 2025
Python

hemangjoshi37a / AIComputerInteractionLogger

Python tool for capturing and logging human-computer interactions. Generates rich datasets for training multi-modal LLMs for autonomous computer control. Records screenshots, mouse and keyboard input, and audio.

nlp machine-learning automation computer-vision screen-capture audio-recording dataset-generation human-computer-interaction computer-interaction ai-training ai-dataset autonomous-control multi-modal-llm input-logging

Updated Sep 16, 2024
Python
