<!DOCTYPE html>
<!--
Editorial by HTML5 UP
html5up.net | @ajlkn
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<title>CS766 Project (LipGAN)</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
<!--[if lte IE 8]><script src="assets/js/ie/html5shiv.js"></script><![endif]-->
<link rel="stylesheet" href="css_js_files/main.css">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.13.0/css/all.css">
<!--[if lte IE 9]><link rel="stylesheet" href="assets/css/ie9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="assets/css/ie8.css" /><![endif]-->
</head>
<body class="">
<!-- Wrapper -->
<div id="wrapper">
<!-- Main -->
<div id="main">
<div class="inner">
<div></div>
<div></div>
<div></div>
<!-- Header -->
<h1>LipGAN: Speech to Lip Sync Generation </h1>
<h2>Proposal <a href="files/CS766_ProjectProposal.pdf"> (PDF) </a> </h2>
<section>
<header class="major">
<h2>Motivation</h2>
</header>
<div>
<article>
With the growing volume of multimedia content, a substantial body of research has focused on synthesizing accurate, realistic talking-face videos. This technique applies to many scenarios, such as realistic dubbing in the movie industry, conversational agents, virtual anchors, and gaming. The market for dubbing videos into foreign languages is growing: as the rate of video content creation increases, so does the need for accessibility for viewers across the globe. Natural lip movement and facial expression generation improve the user's experience in these applications.
Despite recent advances and wide applicability, synthesizing clear, accurate, and human-like output remains a challenging task. We want to explore the state-of-the-art techniques and their limitations. <br><br>
<p>
In this course project, we explore the problem of lip-syncing a talking-face video so that the lip movements and facial expressions of the person in the video match a target speech segment. The primary task is to achieve accurate audio-video synchronisation given a person's face and a target audio clip. We can extend this to a speaker-independent model that produces lip-sync in the "wild", where videos feature faces that are dynamic and unconstrained.
</p>
</article>
</div>
</section>
<section>
<header class="major">
<h2>State-of-the-art Model</h2>
</header>
<span>
<img src="files/a1_lip_GAN.png" alt="" width="100%">
<figcaption>Fig.1 LipGAN architecture. <a href="http://cdn.iiit.ac.in/cdn/cvit.iiit.ac.in/images/Projects/facetoface_translation/paper.pdf"> Paper Link </a>
</figcaption>
</span>
<br><br>
<span>
<img src="files/a1_lip_gan_output.gif" alt="" width="48%" class="center">
</span>
<!-- <br><br>
<h3> LipGAN Paper Video Demo
</h3> -->
<span>
<iframe width="460" height="315" src="https://www.youtube.com/embed/aHG6Oei8jF0" title="LipGAN Paper Video Demo" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen>
</iframe>
</span>
<figcaption>Fig.2 From left to right: (a)<a href="http://cdn.iiit.ac.in/cdn/cvit.iiit.ac.in/images/Projects/facetoface_translation/paper.pdf"> LipGAN output gif. </a> (b) <a href="https://www.youtube.com/embed/aHG6Oei8jF0"> LipGAN Paper Video Demo </a>
</figcaption>
</section>
<section>
<header class="major">
<h2>Time Table</h2>
</header>
<div>
<table class="table table-striped">
<thead>
<tr>
<th scope="col" style="width: 20%">Time</th>
<th scope="col">Work</th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row">Until Feb 24</th>
<td>Write the project proposal document and create the web page. <br> Discuss possible techniques and directions to explore.</td>
</tr>
<tr>
<th scope="row">Feb 25 - Mar 07</th>
<td>Set up the running environment for the state-of-the-art methods.<br> Start implementing an existing approach as a baseline. </td>
</tr>
<tr>
<th scope="row">Mar 08 - Mar 20</th>
<td>Have one working implementation of an existing approach (LipGAN).</td>
</tr>
<tr>
<th scope="row">Mar 21 - Mar 24</th>
<td>Write the mid-term report.</td>
</tr>
<tr>
<th scope="row">Mar 25 - Apr 11</th>
<td>Experiment with different loss formulations and a multi-task framework. <br> Explore alternative methods available in the literature.</td>
</tr>
<tr>
<th scope="row">Apr 12 - Apr 23 </th>
<td>Summarize results and ablation studies. Complete all experiments.</td>
</tr>
<tr>
<th scope="row">Apr 25 - May 05</th>
<td>Complete the course project web page.</td>
</tr>
</tbody>
</table>
</div>
</section>
</div>
</div>
<!-- Sidebar -->
<div id="sidebar">
<div class="inner" style="">
<!-- Menu -->
<nav id="menu">
<header class="major">
<h2>CS766 Project (LipGAN)</h2>
</header>
<ul>
<li><a href="proposal.html"><b>Proposal</b></a></li>
<li><a href="midterm.html">Mid-Term Report</a></li>
<li><a href="index.html">Final Report</a></li>
<!-- <li><a href="elements.html">Elements</a></li>-->
</ul>
</nav>
<!-- Team -->
<subheader class="major">
<h2>Team Members</h2>
</subheader>
<ul>
<li><a href="https://abhayk1201.github.io/">Abhay Kumar</a></li>
<li><a href="">Maryam Vazirabd</a></li>
<li><a href="">Elizabeth Murphy</a></li>
<!-- <li><a href="elements.html">Elements</a></li>-->
</ul>
<!-- Footer -->
<footer id="footer">
<p class="copyright">© Abhay Kumar, Maryam Vazirabd, Elizabeth Murphy. Design: <a href="https://html5up.net/">HTML5 UP</a>.</p>
</footer>
</div>
<a href="#sidebar" class="toggle">Toggle</a></div>
</div>
<!-- Scripts -->
<script src="css_js_files/jquery.js"></script>
<script src="css_js_files/skel.js"></script>
<script src="css_js_files/util.js"></script>
<!--[if lte IE 8]><script src="assets/js/ie/respond.min.js"></script><![endif]-->
<script src="css_js_files/main.js"></script>
</body></html>