This document was originally created for the ICA 2022 Hackathon tutorial "Exploring Google Takeout's Search History as a Data Source." If you are using or adapting it for your class or your research, I would like to know. Please email menchent at american dot edu.
The motivation for this tutorial is as follows. The vast majority of people who use the Internet use search to find information and resources. Search queries and the resulting page visits, combined with self-reported information about a topic of interest (e.g., politics, content creation, sales funnels, fandom), are an under-explored and potentially rich source of behavioral information for many types of social science research. Google's 'Takeout' service allows users to download their own data, including the search terms they have used and the resulting web pages they have visited from any device logged into Google.
Furthermore, seeing this data can help students understand the detailed personal information that technology companies hold about them. It can also help students learn to analyze data using a dataset they know well: their own behavior.
The tutorial will show participants how to download and explore their own data. Social science researchers can then come together to think about how to incorporate this type of data into our research in an ethical and practical way.
- Introduction to the topic and each other
- Downloading
- Loading / Cleaning
- Exploring
- What kinds of questions would you like to answer with this type of dataset?
- How can we incorporate this type of data into our research in an ethical and practical way?
- What kind of profile do I think Google would construct about me?
- How can I use this data to help me understand my own behavior?
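
As a rough illustration of the Loading / Cleaning and Exploring steps listed above, here is a minimal Python sketch. It rests on several assumptions that are not stated in this document: that the search history was exported from Takeout in JSON format, that each record carries a `title` field beginning with "Searched for " or "Visited " (the English-language labels) and an ISO 8601 `time` field, and that the file can be read as a single JSON list. The sample records, the `parse_records` helper, and the prefix constants are all hypothetical and exist only for this example. Check them against your own export before relying on them.

```python
import json
from collections import Counter
from datetime import datetime

# Hypothetical sample, invented for illustration. It only imitates the
# assumed shape of a Takeout search-history export; it is not real data.
SAMPLE_JSON = """
[
  {"title": "Searched for climate policy", "time": "2022-05-01T14:03:22.123Z"},
  {"title": "Visited https://example.org/report", "time": "2022-05-01T14:04:10.000Z"},
  {"title": "Searched for hackathon ideas", "time": "2022-05-02T09:15:00.500Z"}
]
"""

# Assumed English-language prefixes; exports in other languages may differ.
SEARCH_PREFIX = "Searched for "
VISIT_PREFIX = "Visited "


def parse_records(raw_json):
    """Turn raw JSON text into a list of cleaned rows (kind, text, time)."""
    rows = []
    for record in json.loads(raw_json):
        title = record.get("title", "")
        if title.startswith(SEARCH_PREFIX):
            kind, text = "search", title[len(SEARCH_PREFIX):]
        elif title.startswith(VISIT_PREFIX):
            kind, text = "visit", title[len(VISIT_PREFIX):]
        else:
            continue  # skip record types this sketch does not handle
        # Older Python versions do not accept a trailing "Z" in
        # fromisoformat, so rewrite it as an explicit UTC offset.
        when = datetime.fromisoformat(record["time"].replace("Z", "+00:00"))
        rows.append({"kind": kind, "text": text, "time": when})
    return rows


rows = parse_records(SAMPLE_JSON)

# Exploring: how many searches versus page visits, and searches per day.
kind_counts = Counter(row["kind"] for row in rows)
searches_per_day = Counter(
    row["time"].date().isoformat() for row in rows if row["kind"] == "search"
)

print(kind_counts)
print(searches_per_day)
```

To try this on your own data, you could replace `SAMPLE_JSON` with the contents of your exported file, for example `open(path, encoding="utf-8").read()`, where `path` points to wherever you unpacked the Takeout archive.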