# Add keypoint-detection task to Hub #870
Changes from all commits: 8608371, c34499f, e5e4969, e8551a4, dbe94fa
```diff
@@ -126,6 +126,7 @@ export const TASKS_MODEL_LIBRARIES: Record<PipelineType, ModelLibraryKey[]> = {
 	"image-to-image": ["diffusers", "transformers", "transformers.js"],
 	"image-to-text": ["transformers", "transformers.js"],
 	"image-to-video": ["diffusers"],
+	"keypoint-detection": ["transformers"],
 	"video-classification": ["transformers"],
 	"mask-generation": ["transformers"],
 	"multiple-choice": ["transformers"],
```

Review discussion on `"keypoint-detection": ["transformers"]`:

> What does this mean? Cause there's no keypoint detection pipeline in Transformers yet

> I think we should tag those models then before the PR is merged

> Nice (I opened https://github.com/huggingface-internal/moon-landing/issues/11015 internally)

> Yes, I see 4 models now 👍
```diff
@@ -205,6 +206,7 @@ export const TASKS_DATA: Record<PipelineType, TaskData | undefined> = {
 	"image-text-to-text": getData("image-text-to-text", imageTextToText),
 	"image-to-text": getData("image-to-text", imageToText),
 	"image-to-video": undefined,
+	"keypoint-detection": getData("keypoint-detection", placeholder),
 	"mask-generation": getData("mask-generation", maskGeneration),
 	"multiple-choice": undefined,
 	"object-detection": getData("object-detection", objectDetection),
```
New file (`@@ -0,0 +1,59 @@`):
## Task Variants

### Pose Estimation

Pose estimation is the process of determining the position and orientation of an object or a camera in 3D space. It is a fundamental task in computer vision and is widely used in applications such as robotics, augmented reality, and 3D reconstruction.
## Use Cases for Keypoint Detection

### Facial Landmark Estimation

Keypoint detection models can be used to estimate the positions of facial landmarks: points on the face such as the corners of the mouth, the outer corners of the eyes, and the tip of the nose. These landmarks can be used for applications such as facial expression recognition, 3D face reconstruction, and cinematic animation.
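As a small illustration of what can be built on top of facial landmarks, the eye aspect ratio (a common blink-detection signal) can be computed from six landmarks around an eye with plain NumPy. This is a sketch, not part of any specific model's API, and the landmark coordinates below are hypothetical.

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """Eye aspect ratio from six (x, y) landmarks ordered around the eye
    as p1..p6 (p1/p4 are the horizontal corners). Low values suggest a
    closed eye."""
    v1 = np.linalg.norm(eye[1] - eye[5])  # first vertical distance
    v2 = np.linalg.norm(eye[2] - eye[4])  # second vertical distance
    h = np.linalg.norm(eye[0] - eye[3])   # horizontal distance
    return float((v1 + v2) / (2.0 * h))

# hypothetical landmark coordinates for an open eye
open_eye = np.array([[0, 2], [2, 4], [4, 4], [6, 2], [4, 0], [2, 0]], dtype=float)
print(eye_aspect_ratio(open_eye))
```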
### Fitness Tracking

Keypoint detection models can be used to track the movement of the human body, e.g. the positions of the joints in 3D space. This can be used for applications such as fitness tracking, sports analysis, or virtual reality.
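As a sketch of the kind of downstream computation such tracking enables, the angle at a joint (e.g. the elbow, for counting repetitions) can be derived from three keypoints with plain NumPy. The coordinates here are hypothetical stand-ins for model output.

```python
import numpy as np

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle in degrees at keypoint b, formed by the segments b->a and b->c."""
    ba, bc = a - b, c - b
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    # clip guards against tiny floating-point excursions outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# hypothetical shoulder, elbow and wrist keypoints
shoulder = np.array([0.0, 0.0])
elbow = np.array([0.0, 1.0])
wrist = np.array([1.0, 1.0])
print(joint_angle(shoulder, elbow, wrist))  # a right angle at the elbow
```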
## Inference Code

Below you can find an example of how to use a keypoint detection model and how to visualize the results.
```python
from transformers import AutoImageProcessor, SuperPointForKeypointDetection
import torch
import matplotlib.pyplot as plt
from PIL import Image
import requests

url_image = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url_image, stream=True).raw)

# initialize the model and processor
processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint")
model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")

# infer
inputs = processor(image, return_tensors="pt").to(model.device, model.dtype)
outputs = model(**inputs)

# visualize the output
image_width, image_height = image.size
image_mask = outputs.mask  # marks which keypoint slots are valid when batching
image_indices = torch.nonzero(image_mask).squeeze()

image_scores = outputs.scores.squeeze()
image_keypoints = outputs.keypoints.squeeze()
keypoints = image_keypoints.detach().numpy()
scores = image_scores.detach().numpy()

plt.axis("off")
plt.imshow(image)
plt.scatter(
    keypoints[:, 0],
    keypoints[:, 1],
    s=scores * 100,
    c="cyan",
    alpha=0.4,
)
plt.show()
```
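Keypoint detectors typically emit many low-confidence points alongside the strong ones. A simple post-processing step is to keep only keypoints above a score threshold; the sketch below uses small stand-in arrays in place of the `keypoints` and `scores` produced above, and the threshold value is an arbitrary choice.

```python
import numpy as np

# stand-ins for the keypoints (x, y) and per-keypoint scores from the model
keypoints = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
scores = np.array([0.9, 0.05, 0.4])

threshold = 0.3          # arbitrary cutoff; tune per application
keep = scores > threshold  # boolean mask of confident detections
filtered_keypoints = keypoints[keep]
filtered_scores = scores[keep]
print(filtered_keypoints)  # only the points scored 0.9 and 0.4 remain
```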
New file (`@@ -0,0 +1,46 @@`):
```ts
import type { TaskDataCustom } from "..";

const taskData: TaskDataCustom = {
	datasets: [
		{
			description: "A dataset of hand keypoints of over 500k examples.",
			id: "Vincent-luo/hagrid-mediapipe-hands",
		},
	],
	demo: {
		inputs: [
			{
				filename: "keypoint-detection-input.png",
				type: "img",
			},
		],
		outputs: [
			{
				filename: "keypoint-detection-output.png",
				type: "img",
			},
		],
	},
	metrics: [],
	models: [
		{
			description: "A robust keypoint detection model.",
			id: "magic-leap-community/superpoint",
		},
		{
			description: "A strong keypoint detection model used to detect human pose.",
			id: "qualcomm/MediaPipe-Pose-Estimation",
		},
	],
	spaces: [
		{
			description: "An application that detects hand keypoints in real time.",
			id: "datasciencedojo/Hand-Keypoint-Detection-Realtime",
		},
	],
	summary: "Keypoint detection is the task of identifying meaningful distinctive points or features in an image.",
	widgetModels: [],
	youtubeId: "",
};

export default taskData;
```
New file (`@@ -0,0 +1 @@`):

```svg
<svg xmlns="http://www.w3.org/2000/svg" width="1em" height="1em" viewBox="0 0 32 32" {...$$props}><path fill="currentColor" d="m28.316 13.949l-.632-1.898L17 15.612V4h-2v11.612L4.316 12.051l-.632 1.898l10.684 3.561L7.2 27.066l1.6 1.201l7.2-9.6l7.2 9.6l1.6-1.201l-7.168-9.556z"/></svg>
```
Review discussion on the datasets entry:

> Any datasets for this on the Hub?

> actually there seems to be a lot, opening PRs now
> https://huggingface.co/datasets?sort=trending&search=mediapipe
> https://huggingface.co/datasets?sort=trending&search=keypoint
> https://huggingface.co/datasets?sort=trending&search=pose

> just set to false