Text To Speech to Facial BlendShapes #4428

Open
GeorgeS2019 opened this issue May 18, 2023 · 11 comments
Labels
legacy:face mesh Issues related to Face Mesh platform:unity MediaPipe Unity issues stat:awaiting googler Waiting for Google Engineer's Response type:feature Enhancement in the New Functionality or Request for a New Solution

Comments

@GeorgeS2019

GeorgeS2019 commented May 18, 2023

MediaPipe Solution (you are using)

Part: 2 => Face Blendshape: May 2023 ->?
Part: 1 => Done: ARKit 52 blendshapes support request. June 2022 to April 2023 Completed

Programming language

C#

Are you willing to contribute it

Yes

Describe the feature and the current behaviour/state

From the Modelling part using Godot
https://github.com/srcnalt/ReadyPlayerMe-Godot-Test/issues/1#issue-1713856035

Will this change the current API? How?

Yes. It adds a new, non-conflicting API alongside the existing one.

Who will benefit with this feature?

Anyone who uses MediaPipe blendshapes. It is the next step toward deep AI (integrating deep audio into MediaPipe).

Please specify the use cases for this feature

A user uses ChatGPT or something similar to generate replies, and this new feature translates the replies into speech with the corresponding avatar blendshape manipulation.

Any Other info

No response

@GeorgeS2019 GeorgeS2019 added the type:feature Enhancement in the New Functionality or Request for a New Solution label May 18, 2023
@GeorgeS2019
Author

GeorgeS2019 commented May 18, 2023

What would the API look like?

Given a text reply from ChatGPT (or a similar model from Google), the API receives that string and outputs:

  1. the corresponding facial blendshapes as a time-coordinated list of Dictionary[blendshapeName, blendshapeValueFloat]
  2. voice audio (MP3 or WAV) that aligns with the blendshape values
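
To make the requested output shape concrete, here is a minimal Python sketch of the two return values. Everything in it is an assumption for illustration: `BlendshapeFrame`, `SpeechWithBlendshapes`, `text_to_face`, and `sample` are hypothetical names and are not part of MediaPipe, and the body of `text_to_face` is a toy placeholder (a fixed duration per character and a synthetic `jawOpen` curve), not a real text-to-speech or animation model.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlendshapeFrame:
    time_s: float              # timestamp relative to the start of the audio
    weights: Dict[str, float]  # blendshape name -> value in [0, 1]


@dataclass
class SpeechWithBlendshapes:
    audio: bytes                   # encoded speech (WAV or MP3 in the request)
    frames: List[BlendshapeFrame]  # time-ordered blendshape frames


def text_to_face(text: str, fps: int = 30) -> SpeechWithBlendshapes:
    """Hypothetical entry point. The body is a stand-in, not a real model."""
    # Assumption: 0.08 s of speech per character, purely to get a duration.
    duration_s = 0.08 * len(text)
    n_frames = max(1, round(duration_s * fps))
    frames = []
    for i in range(n_frames):
        t = i / fps
        # Toy jaw motion: a triangle wave at 4 cycles per second.
        phase = (t * 4.0) % 1.0
        jaw = 1.0 - abs(2.0 * phase - 1.0)
        frames.append(BlendshapeFrame(time_s=t, weights={"jawOpen": jaw}))
    # No audio is synthesized in this sketch, so the payload is empty.
    return SpeechWithBlendshapes(audio=b"", frames=frames)


def sample(frames: List[BlendshapeFrame], t: float) -> Dict[str, float]:
    """Return the weights of the latest frame at or before playback time t."""
    times = [f.time_s for f in frames]
    i = bisect.bisect_right(times, t) - 1
    return frames[max(0, i)].weights
```

A renderer would call `sample(result.frames, playback_time)` once per drawn frame while the audio plays, which is what keeps the face aligned with the voice.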

@endink

endink commented May 18, 2023

I have implemented this feature in Unreal Engine, and it is easy to do. It uses PaddleLite + OvrLipSync. 😄

@GeorgeS2019
Author

@endink
This is just Part 2 of many parts ahead :-)

@FishWoWater

FishWoWater commented May 19, 2023

Agreed! It would be really exciting if blendshapes could be estimated and aligned with an input audio clip.

I am currently working on a pipeline: user voice -> speech recognition -> ChatGPT -> text to speech -> blendshapes. Mature solutions exist for every stage except the last one (speech2blendshapes). Lipsync and face good can possibly do this, but they have their own limitations or problems. This feature would benefit the MediaPipe community.
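
The pipeline described above can be sketched as a chain of pluggable stages. This is only an illustration in Python: `run_pipeline` and its parameter names are hypothetical, each stage is passed in as a plain callable, and no particular speech recognition, chat, or text-to-speech library is assumed.

```python
from typing import Callable, Dict, List, Tuple

# One frame is (time in seconds, blendshape name -> weight).
Frames = List[Tuple[float, Dict[str, float]]]


def run_pipeline(
    user_audio: bytes,
    recognize: Callable[[bytes], str],
    chat: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    speech_to_blendshapes: Callable[[bytes], Frames],
) -> Tuple[bytes, Frames]:
    """Run the four stages in order and return the reply audio with its frames."""
    question = recognize(user_audio)             # user voice -> text
    reply = chat(question)                       # text -> reply text
    reply_audio = synthesize(reply)              # reply text -> speech
    frames = speech_to_blendshapes(reply_audio)  # speech -> blendshapes
    return reply_audio, frames
```

Keeping the last stage behind its own callable means a lip-sync library or a future MediaPipe task could be swapped in without touching the rest of the chain.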

@ayushgdev ayushgdev added legacy:face mesh Issues related to Face Mesh platform:unity MediaPipe Unity issues labels May 22, 2023
@ayushgdev
Contributor

Hello @GeorgeS2019, thanks for raising this amazing feature request. We will discuss it internally and prioritise it in our roadmap. However, just a heads up: we are working on numerous fronts right now, so this might get delayed.

@ayushgdev ayushgdev added the stat:awaiting response Waiting for user response label May 22, 2023
@GeorgeS2019
Author

The blendshape part is now working in Godot, the 8th top-ranked open-source 3D game engine on GitHub:
@srcnalt
@kaiidams
@SpookyCorgi
@you-win
@j20001970
[Screenshot: Godot_v4.0.3-rc2_mono_win64]

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label May 27, 2023
@kuaashish kuaashish assigned lu-wang-g and unassigned ayushgdev Jun 6, 2023
@kuaashish
Collaborator

Hello @lu-wang-g,
Could you please look into this amazing feature request? Thank you!!

@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label Jun 6, 2023
@lu-wang-g
Contributor

At I/O 2023, Google released the demo app Talking Character (https://developers.googleblog.com/2023/05/generative-ai-talking-character.html), which IIUC fits exactly the use case described here. The Web demo is partially open sourced here. You can find useful components in the directory. There has also been a discussion of releasing the talking character pipeline through MediaPipe, but we don't have a concrete plan yet.

@ayushgdev and @kuaashish, do we have ways to track user requests like this?

@tiamy

tiamy commented Sep 20, 2023

+1

@kuaashish kuaashish assigned yichunk and unassigned lu-wang-g Jan 8, 2024
@GeorgeS2019
Author

We now have a C# wrapper for Godot MediaPipe.

@GeorgeS2019
Author

The Godot community will attempt Text to Face => follow here


8 participants