Animation via TTS instead of driving video #58

@xrtze

Your approach first extracts 1D facial motion embeddings (local facial dynamics) and 3D implicit keypoints (global pose, position, and scale) from the driver video. Would it be possible to replace this first step with an existing implementation that generates the animation cues from audio/TTS?
This would allow efficient portrait animation with audio or text+TTS as the driver.
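
To make the request concrete, here is a rough sketch of the kind of substitution I have in mind. Everything in it is hypothetical and not taken from this repository: the names `MotionCues`, `CueSource`, `DummyAudioCueSource`, and `animate`, as well as the dimensions used, are placeholders. The only point is that an audio/TTS front end would need to emit the same two kinds of cues (a 1D motion embedding and 3D implicit keypoints per frame) that the video driver currently provides.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class MotionCues:
    """Per-frame animation cues (hypothetical container, not the repo's type)."""
    motion_embedding: List[float]   # 1D facial motion embedding (local dynamics)
    keypoints: List[List[float]]    # 3D implicit keypoints, one [x, y, z] each


class CueSource(Protocol):
    """Anything that can produce per-frame cues: a driver video or audio/TTS."""
    def cues(self, num_frames: int) -> List[MotionCues]: ...


class DummyAudioCueSource:
    """Placeholder for an audio/TTS-driven cue generator.

    A real version would run some speech-to-motion model here. This stub only
    emits zeros in the expected shapes so the interface can be exercised.
    """

    def __init__(self, embed_dim: int, num_keypoints: int) -> None:
        self.embed_dim = embed_dim          # assumed size, not the repo's value
        self.num_keypoints = num_keypoints  # assumed count, not the repo's value

    def cues(self, num_frames: int) -> List[MotionCues]:
        return [
            MotionCues(
                motion_embedding=[0.0] * self.embed_dim,
                keypoints=[[0.0, 0.0, 0.0] for _ in range(self.num_keypoints)],
            )
            for _ in range(num_frames)
        ]


def animate(source: CueSource, num_frames: int) -> int:
    """Stand-in for the rendering stage: consumes cues, returns frames consumed."""
    return len(source.cues(num_frames))
```

With this shape, the rendering stage would not need to know whether the cues came from a video or from audio, for example `animate(DummyAudioCueSource(embed_dim=8, num_keypoints=5), num_frames=3)`, where 8 and 5 are arbitrary values chosen for illustration.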
