-
Good job. Looking at the dataset, I'd like to know whether you train and evaluate by sample (such as a single click action) or by task (such as a whole dialog). I'd also like to know which actions don't need coordinates. Thank you.
Replies: 1 comment
-
The following intents are used for both training and evaluation: click, textinput, load, say, submit. The training objective is to predict the next action the navigator should perform, given the current state (HTML, bounding boxes, and a screenshot for multimodal models) and the action/utterance history. Evaluation is done at the turn level, and the per-turn scores are then aggregated.
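To illustrate what "turn-level evaluation, then aggregate" could look like, here is a minimal sketch. It is not the project's actual evaluation code: the `Turn` class, the `demo_id` field, the exact-match intent metric in `score_turn`, and the use of a plain mean in `aggregate` are all assumptions made for this example. The only thing taken from the reply is the list of intents and the idea that each turn is scored on its own rather than per dialog.

```python
from dataclasses import dataclass
from statistics import mean

# Intents listed in the reply as used for training and evaluation.
INTENTS = {"click", "textinput", "load", "say", "submit"}


@dataclass
class Turn:
    demo_id: str      # hypothetical: which dialog the turn belongs to
    gold_intent: str  # reference action for this turn
    pred_intent: str  # action predicted by the model


def score_turn(turn: Turn) -> float:
    """Hypothetical per-turn metric: 1.0 on an exact intent match, else 0.0."""
    return 1.0 if turn.pred_intent == turn.gold_intent else 0.0


def aggregate(turns: list[Turn]) -> float:
    """Score each turn independently, then average (assumed aggregation)."""
    scores = [score_turn(t) for t in turns if t.gold_intent in INTENTS]
    return mean(scores) if scores else 0.0


turns = [
    Turn("demo_a", "click", "click"),
    Turn("demo_a", "say", "click"),
    Turn("demo_b", "load", "load"),
    Turn("demo_b", "submit", "submit"),
]
overall = aggregate(turns)
print(overall)  # 0.75
```

Note that the dialog boundary (`demo_id`) plays no role in the score here: every turn counts as one sample, which is the sense in which evaluation is by sample rather than by whole task.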