fabianslife/BUGA_data

BUGA 23 Chatbot Interaction Dataset

Overview

Welcome to the repository of the BUGA 23 Chatbot Interaction Dataset. This dataset comprises conversations recorded from March to September 2023 during the Bundesgartenschau (BUGA) 2023 in Mannheim, Germany. It contains 4,423 interactions between visitors and a virtual medical chatbot, presented as an avatar inside a telephone booth setup.

Dataset Description

The dataset includes:

  • CSV Format Files: Each conversation is stored as a CSV file detailing the dialogue between the chatbot and the user.
  • Timeframe: Data was collected from March to September 2023.
  • Focus on Medical Queries: The chatbot was primarily prompted with medical questions, offering insight into public health-related inquiries and responses.
  • Languages: Users could choose to hold their conversation in German, English, or Spanish. The language is marked at the end of the file name.
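Because the language is encoded at the end of each file name, the files can be grouped by language before analysis. The following is a minimal sketch, not part of the dataset tooling. It assumes the file name ends with an underscore followed by a two-letter code (`de`, `en`, or `es`) before the `.csv` extension; the actual naming scheme may differ, so adjust `KNOWN_LANGUAGES` and the separator accordingly. The helper names are hypothetical, and no column names are assumed.

```python
import csv
from collections import defaultdict
from pathlib import Path

# ASSUMPTION: language codes and "_" separator are illustrative;
# check the real file names in the repository and adapt as needed.
KNOWN_LANGUAGES = {"de", "en", "es"}


def language_from_filename(path):
    """Return the language token at the end of the file name, or None."""
    stem = Path(path).stem                      # file name without ".csv"
    token = stem.rsplit("_", 1)[-1].lower()     # text after the last "_"
    return token if token in KNOWN_LANGUAGES else None


def group_by_language(data_dir):
    """Map each language token (or None) to the sorted CSV files carrying it."""
    groups = defaultdict(list)
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        groups[language_from_filename(csv_path)].append(csv_path)
    return dict(groups)


def read_conversation(csv_path):
    """Read one conversation as a list of raw rows (no column names assumed)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))
```

For example, `group_by_language("path/to/data")` returns a dictionary whose keys are the detected language codes; files whose names do not match the assumed pattern are collected under `None` so they can be inspected by hand.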

Data Collection Method

The data was collected in a controlled environment where participants interacted with a virtual doctor-avatar through a specially designed telephone booth. This unique setup provided a semi-private space, encouraging more open and detailed conversations.

Privacy and Ethical Considerations

  • Anonymization: All personal information has been removed or anonymized to protect the privacy of the participants.
  • Consent: Participants were informed about the data collection, and consent was obtained prior to the interactions.
  • Compliance: The dataset adheres to applicable privacy and data protection laws.

Potential Uses

This dataset can support research in fields such as:

  • Natural Language Processing (NLP)
  • Human-Computer Interaction (HCI)
  • Medical Informatics
  • Conversational AI Analysis

How to Use

  • Clone the Repository: git clone https://github.com/fabianslife/BUGA23_Chatbot_Dataset.git
  • Navigate to the Data Folder.
  • Data Analysis: Import the CSV files into your preferred data analysis tool.

Contributing

We welcome contributions to enhance the dataset's quality and documentation. Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

Issues

We used Microsoft Azure to detect the spoken language. Unfortunately, this frequently led to misunderstandings or cutoffs of the spoken user dialogue. We are currently working on recreating the parts of this dataset that might be unstructured.

License

This project is licensed under the MIT License.

Acknowledgments

  • Bundesgartenschau (BUGA) 2023 Mannheim
  • Participants and volunteers
  • Institute for AI in Medicine of the University Hospital Gießen and Marburg (UKGM)
