crowd-deliberation/data

Crowd Deliberation

This is the official repository of the Crowd Deliberation data set. There are three different files with relevant data:

  • data.csv, which lists all 80 texts that were analyzed in the study: 40 from the Sarcasm data set and 40 from the Relation data set.
  • labels.csv, which lists all labels collected from crowd workers for the 80 texts listed in data.csv.
  • deliberations.csv, which lists all group deliberations held on individual cases. Each group deliberation has 3 members, who may or may not have been active in each of the Justify and Reconsider sessions.

Columns in data.csv


  • DATASET: the identifier of the data set that text belongs to; either "deliberation-sarcasm" or "deliberation-relation-person-place"
  • DATA_ID: the unique numeric ID of that text, used to match entries in labels.csv and deliberations.csv
  • TEXT: the actual text content for that text
  • GROUND_TRUTH_LABEL: the correct label for that text; only available for the first 25 cases of the "deliberation-relation-person-place" data set.
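
The columns above can be read with Python's standard csv module. The following is a minimal sketch, not an official loader: it assumes data.csv has a header row with exactly these column names and that a missing GROUND_TRUTH_LABEL is stored as an empty cell. The helper name load_texts and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def load_texts(handle):
    """Group the rows of data.csv by DATASET (hypothetical helper)."""
    by_dataset = defaultdict(list)
    for row in csv.DictReader(handle):
        # Assumption: a missing ground truth is stored as an empty cell.
        row["GROUND_TRUTH_LABEL"] = row.get("GROUND_TRUTH_LABEL") or None
        by_dataset[row["DATASET"]].append(row)
    return dict(by_dataset)


# Made-up sample rows, not taken from the real file.
SAMPLE = io.StringIO(
    "DATASET,DATA_ID,TEXT,GROUND_TRUTH_LABEL\n"
    "deliberation-sarcasm,1,Example text A,\n"
    "deliberation-relation-person-place,2,Example text B,relation_expressed\n"
)

texts = load_texts(SAMPLE)
for name, rows in texts.items():
    with_truth = sum(r["GROUND_TRUTH_LABEL"] is not None for r in rows)
    print(name, "texts:", len(rows), "with ground truth:", with_truth)
```

With the real file, the same helper would be called on an opened file handle, for example open("data.csv", newline="", encoding="utf-8"); the encoding is an assumption.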

Columns in labels.csv


  • DATA_ID: the numeric ID of the text that label was given for
  • DATASET: the data set this text belongs to
  • ANNOTATOR_ID: the numeric ID of the crowd worker who provided that label
  • LABEL_ID: the unique numeric ID of that label, used to match entries in deliberations.csv
  • ORIGINAL_LABEL: the actual label provided; one of "sarcastic" / "not_sarcastic" for the Sarcasm data set, and "relation_expressed" / "not_expressed" for the Relation Extraction data set
  • ORIGINAL_CONFIDENCE: the confidence for the label provided; one of 0 (Not Sure) / 0.5 (Sure) / 1 (Very Sure)
  • EVIDENCE: A list of text snippets highlighted as evidence for the provided label in JSON format: [ { "start": 123, "end": 456, "quote": "The highlight text snippet.", "comment": "An optional comment from the annotator." }, { ... }, { ... } ]; annotators were required to highlight at least one text snippet, but they could highlight as many as they wanted.
  • QUESTION_DO_YOU_THINK_OTHER_PEOPLE_MIGHT_CHOOSE_A_DIFFERENT_ANSWER_THAN_YOU_DID: One of "I expect most people to agree with me." / "I expect only about half of the people to agree with me." / "I expect most people to disagree with me."
  • QUESTION_WHY_DO_YOU_THINK_OTHER_PEOPLE_MIGHT_CHOOSE_A_DIFFERENT_ANSWER: Annotators were shown a list of six possible reasons and an "Other: ______" field for a free-form text answer. They had to check at least one reason, but could check several, if they had indicated that they expected some disagreement from other people. This column contains the list of all checked answers, including any free-form answer, in JSON format: [ "The text contains relevant details other people could easily miss.", "This is a case where a person's answer would depend heavily on their personal preferences and taste.", "Other: some other free-form answer." ]
  • QUESTION_PLEASE_ELABORATE_ON_YOUR_ANSWER_TO_THE_PREVIOUS_QUESTION_EXPLAINING_WHY_YOU_THINK_OTHER_PEOPLE_MIGHT_CHOOSE_A_DIFFERENT_ANSWER: Free-form text answer
  • QUESTION_IF_THERE_WERE_OTHER_PEOPLE_WHO_CHOSE_A_DIFFERENT_ANSWER_THAN_YOU_DID_DO_YOU_THINK_A_GROUP_DISCUSSION_WOULD_HELP_TO_RESOLVE_THE_CASE: One of "Yes, a group discussion would help to resolve the case." / "No, a group discussion would not help to resolve the case."
  • DELIBERATION_ID: the numeric ID of the deliberation this label was discussed in, if any. If this label was never part of a group discussion, this field is empty.
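
Several labels.csv columns need decoding before use: EVIDENCE is a JSON list, ORIGINAL_CONFIDENCE is a numeric code, and DELIBERATION_ID may be empty. The sketch below shows one way to do this for a single row, assuming the row was read with csv.DictReader so that all values arrive as strings. The helper name parse_label_row and the sample row are hypothetical.

```python
import json

# Mapping of the confidence codes to the names given in the column description.
CONFIDENCE_NAMES = {0.0: "Not Sure", 0.5: "Sure", 1.0: "Very Sure"}


def parse_label_row(row):
    """Decode the JSON and numeric columns of one labels.csv row (hypothetical helper)."""
    # Assumption: an empty EVIDENCE cell, if one occurs, means "no highlights".
    evidence = json.loads(row["EVIDENCE"]) if row["EVIDENCE"] else []
    confidence = float(row["ORIGINAL_CONFIDENCE"])
    return {
        "label_id": row["LABEL_ID"],
        "label": row["ORIGINAL_LABEL"],
        "confidence": CONFIDENCE_NAMES[confidence],
        "quotes": [item["quote"] for item in evidence],
        # Assumption: a label that was never deliberated has an empty cell here.
        "deliberation_id": row["DELIBERATION_ID"] or None,
    }


# Made-up row for illustration; not taken from the real file.
row = {
    "LABEL_ID": "7",
    "ORIGINAL_LABEL": "sarcastic",
    "ORIGINAL_CONFIDENCE": "0.5",
    "EVIDENCE": '[{"start": 0, "end": 5, "quote": "Great", "comment": ""}]',
    "DELIBERATION_ID": "",
}
parsed = parse_label_row(row)
print(parsed)
```

IDs are kept as strings here so that the sketch does not depend on how the numeric IDs are formatted in the file.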

Columns in deliberations.csv


  • DELIBERATION_ID: The unique numeric ID of that deliberation
  • DATA_ID: The numeric ID of the text discussed in that deliberation
  • WAS_RESOLVED: Whether this case was resolved; one of "True" / "False"
  • NUM_DISSENTING_OPINIONS_DISCUSSED: The number of dissenting opinions discussed in this deliberation; one of 0, 2, 3
  • NUM_DISSENTING_OPINIONS_RECONSIDERED: The number of dissenting opinions reconsidered in this deliberation; one of 0, 2, 3
  • NUM_DISSENTING_OPINIONS_DISCUSSED_AND_RECONSIDERED: The number of dissenting opinions discussed AND reconsidered in this deliberation; one of 0, 2, 3
  • NUM_DISSENTING_OPINIONS_DISCUSSED_RECONSIDERED_AND_CONCLUDED: The number of dissenting opinions discussed, reconsidered AND concluded in this deliberation; one of 0, 2, 3
  • MESSAGES: The chat history of messages exchanged in this deliberation in JSON format: [ { "user_id": 123, "user_pseudonym": "Happy Hippo", "content": "This is my message for the group." }, { ... }, { ... } ]
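
The MESSAGES column can be decoded with the standard json module. As a small sketch, the hypothetical helper below counts how many messages each pseudonym contributed to one deliberation; it assumes the cell holds a JSON list in the format shown above, and the sample chat is made up.

```python
import json
from collections import Counter


def messages_per_member(messages_json):
    """Count chat messages per pseudonym for one MESSAGES cell (hypothetical helper)."""
    # Assumption: an empty cell means that no messages were exchanged.
    messages = json.loads(messages_json) if messages_json else []
    return Counter(message["user_pseudonym"] for message in messages)


# Made-up chat history following the documented structure.
sample_cell = json.dumps([
    {"user_id": 123, "user_pseudonym": "Happy Hippo", "content": "I read it as a joke."},
    {"user_id": 456, "user_pseudonym": "Calm Camel", "content": "I took it literally."},
    {"user_id": 123, "user_pseudonym": "Happy Hippo", "content": "Look at the last sentence."},
])

counts = messages_per_member(sample_cell)
print(counts.most_common())
```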

The following fields exist once for each of the 3 members of the group discussion. In the names below, X stands for the member number, so the actual columns start with MEMBER_1_, MEMBER_2_, and MEMBER_3_ respectively:

  • MEMBER_X_USER_ID: The numeric ID of group member X
  • MEMBER_X_NAME: The human-readable pseudonym assigned to group member X
  • MEMBER_X_LABEL_ID: The numeric ID of group member X's label discussed in this deliberation
  • MEMBER_X_ORIGINAL_LABEL: Group member X's original label; one of "sarcastic" / "not_sarcastic" for the Sarcasm data set, and "relation_expressed" / "not_expressed" for the Relation Extraction data set
  • MEMBER_X_ORIGINAL_CONFIDENCE: Group member X's original confidence in their label; one of 0 (Not Sure) / 0.5 (Sure) / 1 (Very Sure)
  • MEMBER_X_DID_DISCUSS: Whether group member X came back for follow-up session 1 to discuss the case; one of "True" / "False"
  • MEMBER_X_DID_RECONSIDER: Whether group member X came back for follow-up session 2 to reconsider their position; one of "True" / "False"
  • MEMBER_X_RECONSIDERED_LABEL: Group member X's reconsidered label; one of "sarcastic" / "not_sarcastic" for the Sarcasm data set, and "relation_expressed" / "not_expressed" for the Relation Extraction data set
  • MEMBER_X_RECONSIDERED_CONFIDENCE: Group member X's reconsidered confidence in their label; one of 0 (Not Sure) / 0.5 (Sure) / 1 (Very Sure)
  • MEMBER_X_QUESTION_BASED_ON_YOUR_DELIBERATION_WHY_DO_YOU_THINK_THE_OTHER_PEOPLE_IN_THE_GROUP_CHOSE_A_DIFFERENT_ANSWER: Same format as in column QUESTION_WHY_DO_YOU_THINK_OTHER_PEOPLE_MIGHT_CHOOSE_A_DIFFERENT_ANSWER for labels.csv to make both answers easily comparable.
  • MEMBER_X_QUESTION_PLEASE_ELABORATE_ON_YOUR_ANSWER_TO_THE_PREVIOUS_QUESTION_FOR_EXAMPLE_IF_YOU_CHANGED_YOUR_MIND_ABOUT_THE_SOURCE_OF_DISAGREEMENT_PLEASE_EXPLAIN_WHY: Free-form text answer
  • MEMBER_X_DID_CONCLUDE: Whether group member X came back for follow-up session 3 to conclude the case and give their assessment on why the case could be/could not be resolved; one of "True" / "False"
  • MEMBER_X_QUESTION_WHY_DO_YOU_THINK_THIS_CASE_COULD_BE_RESOLVED: Free-form text answer
  • MEMBER_X_QUESTION_WHY_DO_YOU_THINK_THIS_CASE_COULD_NOT_BE_FULLY_RESOLVED: Free-form text answer
  • MEMBER_X_QUESTION_DID_SOMEBODY_MAKE_YOU_DOUBT_YOUR_ORIGINAL_ANSWER_WHY_OR_WHY_NOT: A text string starting with "Yes" / "No", followed by an explanation.
  • MEMBER_X_QUESTION_DID_SOMEBODY_MAKE_YOU_CHANGE_YOUR_ORIGINAL_ANSWER_WHY_OR_WHY_NOT: A text string starting with "Yes" / "No", followed by an explanation.
  • MEMBER_X_QUESTION_DID_YOU_MANAGE_TO_CONVINCE_SOMEONE_TO_CHANGE_THEIR_ANSWER_OR_CONFIDENCE_LEVEL_WHY_DO_YOU_THINK_YOU_WERE_ABLEUNABLE_TO_CONVINCE_THEM: A text string starting with "Yes" / "No", followed by an explanation.
  • MEMBER_X_QUESTION_DESCRIBE_HOW_YOU_FEEL_ABOUT_THE_DELIBERATION_PROCESS: Free-form text answer
  • MEMBER_X_QUESTION_DESCRIBE_HOW_YOU_FEEL_ABOUT_THE_DELIBERATION_OUTCOME: Free-form text answer
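
Because the member fields are stored in wide format (one set of columns per member), it is often easier to work with one record per member. The sketch below splits a deliberations.csv row on its MEMBER_1_, MEMBER_2_, MEMBER_3_ prefixes and then checks who changed their label. The helper names split_members and changed_label are hypothetical, the sample row is made up and contains only a few of the columns, and the boolean columns are assumed to hold the strings "True" / "False" as described above.

```python
def split_members(row, n_members=3):
    """Turn one wide deliberations.csv row into one dict per member (hypothetical helper)."""
    members = []
    for x in range(1, n_members + 1):
        prefix = f"MEMBER_{x}_"
        # Strip the prefix so every member dict uses the same keys.
        fields = {
            key[len(prefix):]: value
            for key, value in row.items()
            if key.startswith(prefix)
        }
        fields["DELIBERATION_ID"] = row["DELIBERATION_ID"]
        fields["MEMBER_INDEX"] = x
        members.append(fields)
    return members


def changed_label(member):
    """True if the member reconsidered and picked a different label."""
    return (
        member.get("DID_RECONSIDER") == "True"
        and member.get("RECONSIDERED_LABEL") != member.get("ORIGINAL_LABEL")
    )


# Made-up row with a subset of the documented columns.
row = {
    "DELIBERATION_ID": "1",
    "MEMBER_1_ORIGINAL_LABEL": "sarcastic",
    "MEMBER_1_DID_RECONSIDER": "True",
    "MEMBER_1_RECONSIDERED_LABEL": "not_sarcastic",
    "MEMBER_2_ORIGINAL_LABEL": "not_sarcastic",
    "MEMBER_2_DID_RECONSIDER": "True",
    "MEMBER_2_RECONSIDERED_LABEL": "not_sarcastic",
    "MEMBER_3_ORIGINAL_LABEL": "sarcastic",
    "MEMBER_3_DID_RECONSIDER": "False",
    "MEMBER_3_RECONSIDERED_LABEL": "",
}

members = split_members(row)
changed = [m["MEMBER_INDEX"] for m in members if changed_label(m)]
print("Members who changed their label:", changed)
```

Checking DID_RECONSIDER first avoids counting an empty RECONSIDERED_LABEL, from a member who did not return for that session, as a changed label.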
