<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="keywords" content="instance detection, object detection, long-tail, continual learning, lifelong learning, deep learning, computer vision, few-shot learning, zero-shot learning, active learning, transfer learning, catastrophic forgetting, self-supervised learning, weakly supervised learning, uncertainty, interpretability, anomaly detection, robustness, generalization, large-scale testing, simulation testing, evaluation metrics, train-test domain gap, streaming data, few-shot annotation, synthetic data, data bias, diverse annotations, multi-modality, AI safety, autonomous driving, medical image, auto diagnosis, real-world applications, inter-disciplinary research, carnegie mellon, machine learning, finetuning, foundation model, generative AI">
<!-- website icon-->
<link rel="short-icon" href="./static/site/logo.png">
<title>Object Instance Detection</title>
<meta name="description" content="instance detection for robotics and AR/VR">
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
<link rel="stylesheet" href="./static/css/bulma.min.css">
<link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
<link rel="stylesheet" href="./static/css/index.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script defer src="./static/js/fontawesome.all.min.js"></script>
<style>
.title-bar {
width: 100%;
height: 0px;
background-color: #ffffffd0;
background-repeat: no-repeat;
background-size: cover;
background-attachment: fixed;
}
.rcorners1 {
border-radius: 10px;
background: #ffffffd0;
padding: 5px;
font-size: 120%;
color: #5c5c5c;
}
.button {
border-radius: 10px;
background: #ffffffd0;
padding: 5px 15px 5px 15px;
}
</style>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<!-- <div class="title-bar">
<img alt="title-image" src="https://raw.githubusercontent.com/insdet/insdet.github.io/main/static/pics/site/objdet-insdet.png">
</div> -->
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">InsDet<br>
</h1>
<p class="title is-3 publication-title">
The 1<sup>st</sup> workshop on Object <font color="blue">Ins</font>tance <font color="blue">Det</font>ection
</p>
<h1 class="is-is-4"; style="color: #5c5c5c;">in conjunction with ACCV 2024, Hanoi, Vietnam.</h1>
<b>Location: TBD</b><br>
<b>Time: Dec 9</b><br>
</div>
</div>
</div>
</div>
</section>
<section class="hero teaser">
<div class="container is-max-desktop">
<div class="hero-body">
<h1 class="subtitle has-text-centered">
<a href="#overview" class="button"><b>Overview</b></a>
<a href="#schedule" class="button"><b>Schedule</b></a>
<a href="#speakers" class="button"><b>Invited Speakers</b></a>
<a href="#challenge" class="button"><b>Challenge</b></a>
<a href="#dates" class="button"><b>Important Dates</b></a>
<a href="#organizers" class="button"><b>Organizers</b></a>
</h1>
</div>
</div>
</section>
<section class="section" style="margin-top: -50px">
<div class="container is-max-desktop">
<section class="section" id="Overview">
<div class="container is-max-desktop content">
<h2 class="title">Overview</h2>
<div class="content has-text-justified">
Instance Detection (InsDet) is an important and fundamental problem in robotics and AR/VR applications.
Unlike Object Detection (ObjDet), which aims to detect all objects belonging to some predefined classes,
InsDet requires detecting specific object instances defined by a few examples that capture each instance from multiple views.
For example, in a daily scenario, when fetching a specific object instance (e.g., my-coffee-mug),
a robot must detect it at a distance, distinguishing it from similar objects (e.g., other mugs or cups)
in a cluttered scene before carrying out subsequent operations. An illustration of Object Detection vs. Instance Detection is shown below.
We invite researchers to the Challenge Workshop on Object Instance Detection, where we investigate multiple directions,
through a competition, to address the InsDet problem.
</div>
<br>
<div class="center">
<img alt="fig1" src="https://raw.githubusercontent.com/insdet/insdet.github.io/main/static/pics/site/objdet-insdet.png">
</div>
</div>
</section>
</div>
</section>
<section class="section" style="margin-top: -50px">
<div class="container is-max-desktop">
<section class="section" id="schedule">
<div class="container is-max-desktop content">
<h2 class="title">Schedule (tentative)</h2>
<div class="content has-text-justified">
<table class="table table-striped">
<tr>
<td width="130">09:00 - 09:15</td>
<td width="300" style="background-color:#e4ffc2">Opening remarks</td>
<td></td>
</tr>
<tr>
<td>09:15 - 09:55</td>
<td style="background-color:#cae1ff">Keynote 1 <b></b></td>
<td><a href="https://aimerykong.github.io/">Shu Kong</a></td>
</tr>
<tr>
<td>09:55 - 10:40</td>
<td style="background-color:#cae1ff">Keynote 2 <b></b></td>
<td><a href="https://yananlix1.github.io/">TBD</a></td>
</tr>
<tr>
<td>10:40 - 10:50</td>
<td style="background-color:#ffe0c6">Coffee break<b></b></td>
<td></td>
</tr>
<tr>
<td>10:50 - 11:05</td>
<td style="background-color:#cae1ff">Challenge Overview and Results<b></b></td>
<td><a href="https://insdet.github.io/">TBD</a></td>
</tr>
<tr>
<td>11:05 - 11:50</td>
<td style="background-color:#cae1ff">Challenge Winner Talks<b></b></td>
<td><a href="https://insdet.github.io/">TBD</a></td>
</tr>
<tr>
<td>11:50 - 12:00</td>
<td style="background-color:#e4ffc2">Closing remarks</td>
<td></td>
</tr>
</table>
</div>
</div>
</section>
</div>
</section>
<section class="section" id="Invited Speakers">
<div class="container is-max-desktop content">
<h2 class="title" id="speakers">Invited Speakers</h2>
<a href="https://aimerykong.github.io/" target="_blank">
<div class="card">
<div class="card-content">
<div class="columns is-vcentered">
<div class="column is-one-quarter">
<figure class="image is-128x128">
<img class="is-rounded" src="./static/pics/people/shu.jpg">
</figure>
</div>
<div class="column">
<p class="title is-4">Shu Kong</p>
<p class="subtitle is-6">UMacau, Texas A&M</p>
</div>
</div>
<div class="content has-text-justified">
Shu Kong is on the faculty of FST, University of Macau, and CSE, Texas A&M University. He leads the Computer Vision Lab. Before that, he spent two years as a postdoctoral researcher at the Robotics Institute, CMU. He received his PhD from UC Irvine. His research interests lie in computer vision, machine learning, and robotics, with a particular focus on visual perception and learning in an open world. He has published a number of papers addressing open-world problems and applying their solutions to interdisciplinary research. His paper on open-set recognition received an honorable mention for the Best Paper Award (Marr Prize) at ICCV 2021. He was the lead organizer of the workshops on "open-world vision" at CVPR 2021-2024, and "Dealing with the Novelty in Open Worlds" at WACV 2022 and 2023.
</div>
</div>
</div>
</a>
<!-- <a href="https://yananlix1.github.io/" target="_blank">
<div class="card">
<div class="card-content">
<div class="columns is-vcentered">
<div class="column is-one-quarter">
<figure class="image is-128x128">
<img class="is-rounded" src="./static/pics/people/yanan-li.jpg">
</figure>
</div>
<div class="column">
<p class="title is-4">Yanan Li</p>
<p class="subtitle is-6">Zhejiang Lab</p>
</div>
</div>
<div class="content has-text-justified">
Yanan Li is an associate researcher at Zhejiang Lab, leading a fundamental computer vision/machine learning research group. Before that, she completed her PhD in the Department of Computer Science and Technology from Zhejiang University. Her research interests focus on zero/few-shot learning, open-world learning, long-tailed recognition, object/instance detection and segmentation. She has published a number of papers addressing zero/few-shot learning problems and applying their solutions to interdisciplinary research. She was one of the organizers of the workshops on "open-world vision" at CVPR 2023 and 2024.
</div>
</div>
</div>
</a> -->
<a href="https://instdet.github.io" target="_blank">
<div class="card">
<div class="card-content">
<div class="columns is-vcentered">
<div class="column is-one-quarter">
<figure class="image is-128x128">
<img class="is-rounded" src="./static/pics/people/profile-dummy.jpg">
</figure>
</div>
<div class="column">
<p class="title is-4">TBD</p>
<p class="subtitle is-6">TBD</p>
</div>
</div>
<div class="content has-text-justified"></div>
</div>
</div>
</a>
</div>
</section>
<section class="section" style="margin-top: -50px">
<div class="container is-max-desktop">
<section class="section" id="Challenge">
<div class="container is-max-desktop content">
<h2 class="title" id="challenge">Challenge</h2>
<div class="content has-text-justified">
This year, we plan to run a competition on our InsDet dataset, the first instance detection benchmark dataset that is both larger
in scale and more challenging than existing InsDet datasets. The major strengths of our InsDet dataset over prior ones are
(1) high-resolution profile images of object instances together with high-resolution testing images from more realistic indoor scenes,
simulating real-world indoor robots that must locate and recognize object instances at a distance in cluttered indoor scenes;
and (2) a realistic unified InsDet protocol that fosters InsDet research.
<ul>
<li><b>A realistic unified InsDet protocol.</b>
In real-world indoor robotic applications, we consider the scenario in which assistive robots must locate and recognize instances in a cluttered indoor scene in order to fetch them.
For a given object instance, a robot sees it from only a few views at the training stage,
and must then accurately detect it at a distance in any scene at the testing stage.</li>
<li><b>InsDet in the closed world.</b>
InsDet has been explored in a closed-world setting, which allows access to profile images during model development.
While one can exploit profile images to train models, it remains unknown what testing images will look like when encountered in the open world.
Prevalent methods adopt a cut-paste-learn strategy [10] that cuts and pastes profile images onto random background photos
(sampled in the open world) to generate synthetic training data, and then uses such synthetic data to train a detector (see the sketch after this list).</li>
<li><b>InsDet in the open world.</b>
The challenge of InsDet lies in its open-world nature: one has no knowledge of the data distribution at test time,
which may include unknown testing scene imagery, unexpected scene clutter, and novel object instances specified only at testing.
Prevalent methods exploit the open world by leveraging foundation models and by pretraining InsDet models on diverse data.</li>
</ul>
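<p>
For concreteness, below is a minimal, illustrative Python sketch of the cut-paste-learn idea. It is not the official baseline code; the directory layout, file names, and augmentation ranges are assumptions made only for illustration.
</p>
<pre><code>
# Minimal cut-paste-learn sketch (illustrative; assumed directory layout and names).
# Paste masked profile crops of each instance onto random background photos to
# synthesize detector training images with bounding-box labels.
import random
from pathlib import Path
from PIL import Image

def synthesize(background_dir, profile_dir, num_images=1000, out_dir="synth"):
    backgrounds = list(Path(background_dir).glob("*.jpg"))
    # Each profile crop is assumed to be an RGBA image whose alpha channel masks the object.
    profiles = list(Path(profile_dir).glob("*/*.png"))
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    annotations = []
    for i in range(num_images):
        bg = Image.open(random.choice(backgrounds)).convert("RGB")
        boxes = []
        for crop_path in random.sample(profiles, k=random.randint(1, 5)):
            crop = Image.open(crop_path).convert("RGBA")
            scale = random.uniform(0.2, 0.6)  # random rescaling as a simple augmentation
            crop = crop.resize((int(crop.width * scale), int(crop.height * scale)))
            x = random.randint(0, max(0, bg.width - crop.width))
            y = random.randint(0, max(0, bg.height - crop.height))
            bg.paste(crop, (x, y), crop)  # alpha channel acts as the paste mask
            # Record (instance_id, x1, y1, x2, y2); the parent folder name is the instance id.
            boxes.append((crop_path.parent.name, x, y, x + crop.width, y + crop.height))
        bg.save(f"{out_dir}/{i:06d}.jpg")
        annotations.append(boxes)
    return annotations
</code></pre>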
<br>
We use <a href="https://eval.ai/web/challenges/challenge-page/2358/overview"><b>EvalAI</b></a> as the submission portal.
<b>The teams with the top-performing submissions will be invited to give short talks during the workshop.</b>
</div>
</div>
</section>
</div>
</section>
<section class="section" style="margin-top: -50px">
<div class="container is-max-desktop">
<section class="section" id="Dates">
<div class="container is-max-desktop content">
<h2 class="title" id="dates">Important Dates</h2>
<ul>
<li><b>Data Instructions &amp; Helper Scripts</b>: September 10, 2024</li>
<li><b>Dev Phase Start</b>: September 10, 2024</li>
<li><b>Submission Portal Start</b>: September 10, 2024</li>
<li><b>Test Phase Start</b>: October 20, 2024</li>
<li><b>Test Phase End</b>: November 30, 2024</li>
<li><b>Winner Notification</b>: December 01, 2024</li>
</ul>
</div>
</section>
</div>
</section>
<section class="section" id="Organizers">
<div class="container is-max-desktop content">
<h2 class="title" id="organizers">Organizers</h2>
<div class="columns is-centered is-variable is-0">
<div style="display: flex">
<div style="width:25%; justify-content: center">
<a href="https://shenqq377.github.io/">
<img alt="Qianqian Shen" src="static/pics/people/qianqian-shen.png" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://shenqq377.github.io/">Qianqian Shen</a><br>
Zhejiang University
</div>
<div style="width:7.5%">
</div>
<div style="width:25%; justify-content: center">
<a href="https://www.aminer.cn/profile/5617e32a45cedb3397c418c6/">
<img alt="Haishuai Wang" src="static/pics/people/haishuai-wang.jpg" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://www.aminer.cn/profile/5617e32a45cedb3397c418c6">Haishuai Wang</a><br>
Zhejiang University
</div>
<div style="width:7.5%">
</div>
<div style="width:25%; justify-content: center">
<a href="https://yananlix1.github.io/">
<img alt="Yanan Li" src="static/pics/people/yanan-li.jpg" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://yananlix1.github.io/">Yanan Li</a><br>
Zhejiang Lab
</div>
<div style="width:7.5%">
</div>
<div style="width:25%; justify-content: center">
<a href="https://ics.uci.edu/~yunhaz5/">
<img alt="Yunhan Zhao" src="static/pics/people/yunhan-zhao.png" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://ics.uci.edu/~yunhaz5/">Yunhan Zhao</a><br>
UC Irvine
</div>
<div style="width:7.5%">
</div>
<div style="width:25%; justify-content: center">
<a href="https://nahyunkwon.github.io/">
<img alt="Nahyun Kwon" src="static/pics/people/nahyun-kwon.png" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://nahyunkwon.github.io/">Nahyun Kwon</a><br>
Texas A&M
</div>
<div style="width:7.5%">
</div>
<div style="width:25%; justify-content: center">
<a href="https://insdet.github.io/">
<img alt="Kelu Yao" src="static/pics/people/kelu-yao.png" height="200" width ="200" style = "border-radius: 50%; object-fit: cover; ">
</a><br>
<a href="https://insdet.github.io/">Kelu Yao</a><br>
Zhejiang Lab
</div>
<div style="width:7.5%">
</div>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content">
<p>
This website borrows the source code of <a href="https://github.com/nerfies/nerfies.github.io">the Nerfies project page</a>.
We would like to thank Utkarsh Sinha and Keunhong Park.
</p>
</div>
</div>
</footer>
<script src="js/jquery-2.1.1.js"></script>
<script src="js/jquery.mobile.custom.min.js"></script>
<script src="js/main.js"></script>
</body>
</html>