Use ARCamera as input for MediaPipe? #343
Comments
In the following, I assume that you are using…
Not really.
MediaPipeUnityPlugin/Assets/Mediapipe/Samples/Scenes/Pose Tracking/PoseTrackingSolution.cs (lines 134 to 135 in 2994e44)
MediaPipeUnityPlugin/Assets/Mediapipe/Samples/Common/Scripts/Solution.cs (lines 88 to 108 in 2994e44)
So, if you have a Texture2D, you can read it into the TextureFrame like this:

// ReadFromImageSource(imageSource, textureFrame);
// Texture2D texture2d;
textureFrame.ReadTextureFromOnCPU(texture2d);

If you want to do it the right way, you will have to implement an …
I would also suggest trying out XRCpuImage with AR Foundation. Let me know if you get any further with this.
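For reference, here is a minimal sketch of that XRCpuImage route: convert the latest CPU image from ARCameraManager into an RGBA32 Texture2D, which could then be copied into a TextureFrame via ReadTextureFromOnCPU. It assumes the same usings as the full example further down in this thread; the field names (_cameraManager, _texture) and the method itself are illustrative, not part of the plugin.

// Sketch: grab the latest CPU image from AR Foundation and convert it to an RGBA32 Texture2D.
private ARCameraManager _cameraManager;
private Texture2D _texture;

private unsafe void UpdateCameraTexture()
{
  if (!_cameraManager.TryAcquireLatestCpuImage(out XRCpuImage image))
  {
    return;
  }
  using (image)
  {
    if (_texture == null || _texture.width != image.width || _texture.height != image.height)
    {
      _texture = new Texture2D(image.width, image.height, TextureFormat.RGBA32, false);
    }
    var conversionParams = new XRCpuImage.ConversionParams(image, TextureFormat.RGBA32);
    var rawData = _texture.GetRawTextureData<byte>();
    image.Convert(conversionParams, (IntPtr)NativeArrayUnsafeUtility.GetUnsafePtr(rawData), rawData.Length);
    _texture.Apply();
    // _texture now holds the camera frame, e.g. for textureFrame.ReadTextureFromOnCPU(_texture).
  }
}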
@homuler Thanks for the help! Sorry for the late response. I managed to find a solution that is somewhat hacky, but it works for now. I'm planning to clean it up soon and to look for a more low-level fix. For those interested:
I have also written minimal code to run the Face Detection solution, for those who have found their way to this issue.

// Copyright (c) 2021 homuler
//
// Use of this source code is governed by an MIT-style
// license that can be found in the LICENSE file or at
// https://opensource.org/licenses/MIT.
using Mediapipe;
using Mediapipe.Unity;
using System;
using System.Collections;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;
using Stopwatch = System.Diagnostics.Stopwatch;
public class ARCameraManagerTest : MonoBehaviour
{
[SerializeField] private ARCameraManager _cameraManager;
[SerializeField] private TextAsset _configText; // attach `face_detection_gpu.txt`
private CalculatorGraph _calculatorGraph;
private NativeArray<byte> _buffer;
private Stopwatch _stopwatch;
private ResourceManager _resourceManager;
private GpuResources _gpuResources;
private IEnumerator Start()
{
_cameraManager.frameReceived += OnCameraFrameReceived;
_stopwatch = new Stopwatch();
_resourceManager = new StreamingAssetsResourceManager();
yield return _resourceManager.PrepareAssetAsync("face_detection_short_range.bytes");
yield return _resourceManager.PrepareAssetAsync("face_detection_full_range_sparse.bytes");
_gpuResources = GpuResources.Create().Value();
_calculatorGraph = new CalculatorGraph(_configText.text);
_calculatorGraph.SetGpuResources(_gpuResources).AssertOk();
_calculatorGraph.ObserveOutputStream("face_detections", 0, OutputCallback, true).AssertOk();
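// The side packets configure the graph: how the input image should be rotated/flipped and which face detection model to use.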
var sidePacket = new SidePacket();
sidePacket.Emplace("input_rotation", new IntPacket(0));
sidePacket.Emplace("input_horizontally_flipped", new BoolPacket(false));
sidePacket.Emplace("input_vertically_flipped", new BoolPacket(true));
sidePacket.Emplace("model_type", new IntPacket(0));
_calculatorGraph.StartRun(sidePacket).AssertOk();
_stopwatch.Start();
}
private void OnDestroy()
{
_cameraManager.frameReceived -= OnCameraFrameReceived;
var status = _calculatorGraph.CloseAllPacketSources();
if (!status.Ok())
{
Debug.Log($"Failed to close packet sources: {status}");
}
status = _calculatorGraph.WaitUntilDone();
if (!status.Ok())
{
Debug.Log(status);
}
_calculatorGraph.Dispose();
_gpuResources.Dispose();
_buffer.Dispose();
}
private unsafe void OnCameraFrameReceived(ARCameraFrameEventArgs eventArgs)
{
if (_cameraManager.TryAcquireLatestCpuImage(out var image))
{
InitBuffer(image);
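// Convert the camera image (typically YUV on device) to RGBA32, writing straight into the reusable native buffer.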
var conversionParams = new XRCpuImage.ConversionParams(image, TextureFormat.RGBA32);
var ptr = (IntPtr)NativeArrayUnsafeUtility.GetUnsafePtr(_buffer);
image.Convert(conversionParams, ptr, _buffer.Length);
image.Dispose();
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, image.width, image.height, 4 * image.width, _buffer);
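// MediaPipe timestamps are in microseconds; TicksPerMillisecond / 1000 is the number of ticks per microsecond.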
var currentTimestamp = _stopwatch.ElapsedTicks / (TimeSpan.TicksPerMillisecond / 1000);
var imageFramePacket = new ImageFramePacket(imageFrame, new Timestamp(currentTimestamp));
_calculatorGraph.AddPacketToInputStream("input_video", imageFramePacket).AssertOk();
}
}
private void InitBuffer(XRCpuImage image)
{
var length = image.width * image.height * 4;
if (!_buffer.IsCreated || _buffer.Length != length) // NativeArray<byte> is a struct, so check IsCreated instead of comparing to null
{
_buffer = new NativeArray<byte>(length, Allocator.Persistent, NativeArrayOptions.UninitializedMemory);
}
}
[AOT.MonoPInvokeCallback(typeof(CalculatorGraph.NativePacketCallback))]
private static IntPtr OutputCallback(IntPtr graphPtr, int streamId, IntPtr packetPtr)
{
try
{
using (var packet = new DetectionVectorPacket(packetPtr, false))
{
var value = packet.IsEmpty() ? null : packet.Get();
if (value != null && value.Count > 0)
{
foreach (var detection in value)
{
Debug.Log(detection);
}
}
}
return Status.Ok().mpPtr;
}
catch (Exception e)
{
return Status.FailedPrecondition(e.ToString()).mpPtr;
}
}
}
@homuler Error at line 39. I'm using the latest MediaPipeUnityPlugin-all.zip.
@pinak1999 Please see #803 (comment).
I'm also getting the same error. @homuler, how is #803 (comment) related to this issue? It seems like that comment is using … Doesn't …?
(Line 16 in 01cdd56)
You may rewrite the …
I see, thanks! I tried the suggested approach, but I'm getting the following error related to the … Any thoughts? Here's the code I added:
If you want to use FaceDetection, see MediaPipeUnityPlugin/Assets/MediaPipeUnity/Samples/Scenes/Face Detection/FaceDetectionGraph.cs (lines 78 to 79 in 01cdd56).
Note that the output type of …
See also #803 (comment). Incidentally, the …
Thank you! I fixed the output type and now it works well for face detection :) I actually wanted to do this ARCore integration for object detection, but gave face detection a try as a first step. Now I want to make use of the object detector output by adding more lines into …
What I wanted to do is similar to #1037 (comment). Do you know how I could address this issue? :)
Ah, alright. Based on #935 (comment), I realized I should use a question mark (a nullable type), i.e., …
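For illustration, handling that nullable output might look like the snippet below. It mirrors the OutputCallback from the Face Detection example earlier in this thread, so the packet type and stream stay grounded in that code; the nullable annotation assumes nullable reference types are enabled, and System.Collections.Generic is needed for List<T>.

// Sketch: the output packet may be empty, so read it into a nullable list and guard before iterating.
using (var packet = new DetectionVectorPacket(packetPtr, false))
{
  List<Detection>? detections = packet.IsEmpty() ? null : packet.Get();
  if (detections?.Count > 0)
  {
    foreach (var detection in detections)
    {
      Debug.Log(detection);
    }
  }
}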
Hi! I just wanted to follow up on something related to this ARCore integration :) I realized that the object detector/classifier works much better if the phone is held at a certain orientation. For example, it works much better when I hold the phone in landscape mode than in portrait mode. Is there a way to set the orientation of the MediaPipe detection somewhere in the code?
@homuler I am trying to implement this, but I cannot get it working. Below are the steps I followed.
I now have a Texture2D from OnARCameraFrameReceived and pass it to textureFrame.ReadTextureFromOnCPU(OutputTextureFromARCamera); (this is not in an ImageSourceSolution, and I am using Holistic). From here on I am not sure whether I am doing it correctly. Please share your thoughts.
At least, you can rotate the input image on the Unity side (see MediaPipeUnityPlugin/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Face Detection/FaceDetectorRunner.cs, line 80 in 2d2863e)
or use ImageTransformationCalculator (see MediaPipeUnityPlugin/Assets/MediaPipeUnity/Samples/Scenes/Object Detection/object_detection_gpu.txt, lines 65 to 78 in 2d2863e).
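As a rough sketch of the first option, wired into the CalculatorGraph example earlier in this thread: pick a rotation from the current screen orientation and pass it through the input_rotation side packet before StartRun. Whether your graph expects degrees or an enum value here depends on how input_rotation is consumed, so treat the mapping below as an assumption to verify against your graph config.

// Sketch: choose an input rotation based on the current screen orientation.
int rotationDegrees;
switch (Screen.orientation)
{
  case ScreenOrientation.Portrait: rotationDegrees = 90; break;
  case ScreenOrientation.PortraitUpsideDown: rotationDegrees = 270; break;
  case ScreenOrientation.LandscapeRight: rotationDegrees = 180; break;
  default: rotationDegrees = 0; break; // LandscapeLeft and everything else
}
sidePacket.Emplace("input_rotation", new IntPacket(rotationDegrees));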
@KiranJodhani Will you create a new issue? I'm sorry, but I'm not sure what the problem is.
Super helpful reply (#343 (comment)), thank you @homuler! :) As a follow-up, I'm wondering if there is a way to access the originating input frame in an executed … Do you know if there is a way to get the successful camera frame when …?
The Task API is designed to receive input images through a callback (Line 27 in 2d2863e),
but OutputStream is not, so it may be difficult, if not impossible.
If that is acceptable, you can get the same result by executing the graph synchronously. In that case, you will probably still have a reference to the input image when you get the result.
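For what it's worth, here is a rough sketch of that synchronous pattern, loosely following the plugin's Getting Started wiki and reusing names from the Face Detection example above. Exact method names can differ between plugin versions, so treat this as an outline under those assumptions.

// Sketch: poll the output stream instead of observing it, so the result can be read
// right after pushing the frame that produced it.
var poller = _calculatorGraph.AddOutputStreamPoller<List<Detection>>("face_detections").Value();
_calculatorGraph.StartRun(sidePacket).AssertOk();

// ... then, for each camera frame ...
_calculatorGraph.AddPacketToInputStream("input_video", imageFramePacket).AssertOk();
_calculatorGraph.WaitUntilIdle().AssertOk();

var outputPacket = new DetectionVectorPacket();
if (poller.Next(outputPacket) && !outputPacket.IsEmpty())
{
  var detections = outputPacket.Get();
  // The input frame that was just pushed is still in scope here, so frame and detections stay paired.
}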
Oh great, good to know, thanks @homuler! Do you have any example code on how to use the Task API instead of …? Also, any pointers on how to execute the graph synchronously would be helpful too!
Note that ObjectDetector is not ported yet.
See https://github.com/homuler/MediaPipeUnityPlugin/wiki/Getting-Started#get-imageframe. |
Great, thanks @homuler! Any idea when ObjectDetector will be ported to the Task API? Thinking about it, I feel the synchronous approach might cause problems for real-time ARCore. Have you tried this with AR solutions in the past? Your idea of storing the last few frames in an array for reference, and later searching for the one with the same timestamp as the output timestamp in the …
Hi @homuler! :) I would appreciate it if you could share any insights you might have about the comment above ^^
Maybe when I feel motivated. If not pressured by others, it might be achievable by around next month.
Ah, it's designed not to pass the timestamp to the callback, so you'll need to modify the following line: MediaPipeUnityPlugin/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/OutputStream.cs (line 400 in 2d2863e).
P.S. Responding to closed issues is challenging, so if necessary, please create a new issue.
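Purely for illustration, the frame-buffering idea discussed above might look roughly like the sketch below, assuming the output timestamp can be recovered on the callback side (which, per the comment above, currently means modifying OutputStream.cs). Every name in the sketch is made up for the example, and it needs System.Collections.Generic in addition to the usings from the earlier code.

// Sketch: remember the last few input frames keyed by the timestamp they were sent with,
// then look up the matching frame once a result with a known timestamp arrives.
private readonly Dictionary<long, Texture2D> _recentFrames = new Dictionary<long, Texture2D>();
private readonly Queue<long> _recentTimestamps = new Queue<long>();
private const int MaxBufferedFrames = 5;

private void RememberFrame(long timestampMicrosec, Texture2D frame)
{
  _recentFrames[timestampMicrosec] = frame;
  _recentTimestamps.Enqueue(timestampMicrosec);
  while (_recentTimestamps.Count > MaxBufferedFrames)
  {
    _recentFrames.Remove(_recentTimestamps.Dequeue());
  }
}

private bool TryGetFrame(long timestampMicrosec, out Texture2D frame)
{
  return _recentFrames.TryGetValue(timestampMicrosec, out frame);
}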
Hi,
For a project I'm working on, I got the MediaPipe sample to work with the webcam, as is done in the sample project. However, in my project I use Unity's ARCameraManager to render the camera image to the screen. I need this camera because I am also trying to get depth from it.
This is currently causing issues, because MediaPipe tries to start and access the webcam while the AR camera is already using it. I tried to make the MediaPipe sample work with the AR camera but failed; for now it is tightly coupled to the webcam. Is there any input or help I could get on this? Perhaps someone has already managed to get it working with AR Foundation's ARCameraManager?
In short, what I'm trying to achieve: give MediaPipe the Texture2D I get from ARCameraManager (I have already managed to get this texture) and get the pose from that source.