Do you train and evaluate on the dataset by samples or by tasks? #18

Answered by xhluca
miyapeng asked this question in Q&A

The following intents are used for both training and evaluation: click, textinput, load, say, submit.

The training objective is to predict the next action the navigator should perform, given the current state (HTML, bounding boxes, and a screenshot for multimodal models) and the action/utterance history. Evaluation is done at the turn level: each turn is scored on its own, and the scores are then aggregated.
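
To illustrate what turn-level evaluation means, here is a minimal sketch, not the project's actual evaluation code. The `Turn` class, the `score_turn` function, the exact-match scoring on the intent, and the use of an unweighted mean for aggregation are all assumptions made for the example; the answer above only states that turns are scored individually and the scores aggregated.

```python
from dataclasses import dataclass
from statistics import mean

# Intents named in the answer above.
INTENTS = {"click", "textinput", "load", "say", "submit"}


@dataclass
class Turn:
    """One evaluation sample: a single turn within a demonstration (hypothetical schema)."""
    demo_id: str           # which demonstration (task) the turn came from
    target_intent: str     # intent of the reference action
    predicted_intent: str  # intent of the model's predicted next action


def score_turn(turn: Turn) -> float:
    """Assumed per-turn score: 1.0 if the predicted intent matches the target, else 0.0."""
    return 1.0 if turn.predicted_intent == turn.target_intent else 0.0


def evaluate(turns: list[Turn]) -> float:
    """Score each turn independently, then aggregate (assumed: unweighted mean)."""
    scores = [score_turn(t) for t in turns if t.target_intent in INTENTS]
    return mean(scores) if scores else 0.0


# Turns from two demonstrations are pooled; the unit of evaluation is the turn, not the task.
turns = [
    Turn("demo_a", "click", "click"),
    Turn("demo_a", "textinput", "textinput"),
    Turn("demo_b", "say", "click"),
    Turn("demo_b", "submit", "submit"),
]
result = evaluate(turns)
print(result)
```

In this sketch three of the four turns match, so the aggregate is the fraction of correct turns regardless of which demonstration each turn belongs to.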

Answer selected by xhluca

This discussion was converted from issue #17 on March 29, 2024 14:50.