Getting Started with Wikidata (artists gender)

Lennart edited this page Jan 15, 2025 · 2 revisions

Reading time: 5 minutes

In this Getting Started, you will learn how to use Facepager to fetch a list of musical artists and their respective sex or gender as provided by Wikidata.

Wikidata is an open, collaborative knowledge base that serves as a central repository of structured data for Wikimedia projects, such as Wikipedia. It stores factual information about a wide range of topics and entities, such as people, places, and events, in a machine-readable format. Data is organised into items, each with a unique identifier (Q number), which are described through properties (P number). The data can be accessed, updated, and queried by anyone. When retrieving data from Wikidata, always comply with its User-Agent policy and be mindful of its query limits.
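
For readers who later want to query Wikidata outside of Facepager, the following minimal Python sketch shows one way to attach a descriptive User-Agent header to a request against the public SPARQL endpoint. The tool name, URL, and contact address in the header are placeholders made up for this example, not values prescribed by Wikidata or Facepager. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Placeholder User-Agent (assumption): replace the tool name, URL and
# contact address with your own details before sending real requests.
USER_AGENT = "ExampleGenderTutorial/0.1 (https://example.org/contact; mail@example.org)"


def build_request(query: str) -> Request:
    """Build (but do not send) a GET request for a SPARQL query."""
    params = urlencode({"query": query, "format": "json"})
    return Request(
        f"{WIKIDATA_SPARQL_ENDPOINT}?{params}",
        headers={
            "User-Agent": USER_AGENT,
            "Accept": "application/sparql-results+json",
        },
    )


request = build_request("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 1")
print(request.full_url)
print(request.get_header("User-agent"))
```

Passing the resulting object to `urllib.request.urlopen` would send the query. Keeping request construction separate makes it easy to check the header before any traffic reaches Wikidata.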

This tutorial makes central use of SPARQL, a semantic query language used to access and retrieve data stored in RDF (Resource Description Framework) databases, such as Wikidata. If you want to learn more about SPARQL, check out our ground-up introduction over at Getting Started with SPARQL. Learning the basics of its functionality, or diving deeper into its syntax, will not only allow you to read any SPARQL triple pattern but also equip you with the skills to transfer your knowledge to many useful applications. If you plan on working with Wikidata more often in the future, you might want to have a look at Wikidata's own introduction to SPARQL as well.

Before you start, please make sure to install the latest release of Facepager. Depending on your previous experience, it will take some time to familiarise yourself with the software. Getting accustomed to its many features will eventually enable you to fetch all kinds of intriguing data, and learning about the basic concepts will make it easier to proceed from here on.

How to fetch a musician's or band member's gender

The step-by-step guide below will briefly take you through the beginner-friendly process of fetching a solo artist's or band member's gender as stored in Wikidata. You could, of course, look up the same information on Wikidata directly. Facepager, however, allows you to retrieve the information for several artists in one go, ready for exporting and further processing. Let's dive right in:

  1. Create a database: Click New Database in the Menu Bar of Facepager to create a blank database. Save it in a directory of your choice.
  2. Set up the Generic module: From the Presets tab in the Menu Bar, select and Apply the Knowledge Graph preset "Wikidata: gender of solo artist/band members". The SPARQL module in the Query Setup will refresh automatically. Notice that the base path is now set to call Wikidata's SPARQL endpoint. Furthermore, the following SPARQL query is inserted into the Query box.
SELECT DISTINCT ?entityLabel ?genderLabel WHERE {
  # Search for solo artist or band
  ?entity rdfs:label "<Object ID>"@en.
  # Check if ?entity is solo artist or band (musical group)
  {
    # If solo artist, get sex or gender
    ?entity wdt:P31 wd:Q5.
    ?entity wdt:P21 ?gender.
  }
  UNION
  {
    # If band (musical group), first get all members then their sex or gender
    ?entity wdt:P31 wd:Q215380.
    ?entity wdt:P527 ?member.
    ?member wdt:P21 ?gender.
    ?member rdfs:label ?entityLabel.
    FILTER(LANG(?entityLabel) = "en")
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

It includes the standard <Object ID> seed node placeholder and features detailed comments (marked by #) explaining each of its sections. Head to our Getting Started with SPARQL to fully understand the syntax used. In short, the query first attempts to match each seed node to a Wikidata entity by its English label. If the entity is a human (Q5), the query directly returns the entity's 'sex or gender' (P21). If the entity is a musical group (Q215380), the query retrieves all associated band members (P527) and returns their respective genders.
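
To make the role of the placeholder concrete, the following sketch imitates the substitution in plain Python. It illustrates the idea only and is not Facepager's actual implementation. The function name `fill_placeholder` and the escaping step are assumptions added for this example.

```python
# The line of the preset query that contains the placeholder.
QUERY_TEMPLATE = '?entity rdfs:label "<Object ID>"@en.'


def fill_placeholder(template: str, seed_node: str) -> str:
    """Replace the <Object ID> placeholder with a seed node's label.

    Hypothetical helper for illustration; Facepager performs the
    substitution internally and may handle it differently.
    """
    # Escape backslashes and double quotes so the label remains a
    # valid SPARQL string literal.
    escaped = seed_node.replace("\\", "\\\\").replace('"', '\\"')
    return template.replace("<Object ID>", escaped)


for seed in ["Tyler, the Creator", "Coldplay"]:
    print(fill_placeholder(QUERY_TEMPLATE, seed))
```

Because the pattern compares the seed node with the exact English label, spelling and capitalisation have to match the label stored in Wikidata for the entity to be found.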

  3. Add nodes: Before fetching data, you will need to provide one or more seed nodes, which will fill in the placeholder upon fetching. To do so, select Add Nodes in the Menu Bar. In the dialogue box that opens, enter a solo artist's or band's English name or 'label' (e.g., "Tyler, the Creator" or "Coldplay"). Include as many nodes as you like.
  4. Fetch data: Select one or more seed nodes, then hit Fetch Data at the bottom of the Query Setup. Facepager will now fetch data based on your setup. Once finished, you can inspect the data by expanding your seed node or clicking Expand nodes in the Menu Bar. For more detail, select a child node and review the raw data displayed in the Data View to the right. If you want to fetch other information about a musician or band, such as their birth name, find the respective P number on Wikidata and adjust the query accordingly.
  5. Export data: Expand all nodes and select the ones you want to export. Hit Export Data to get a CSV file. Notice the options provided by the export dialogue. You can open CSV files with Excel or any statistics software you like.
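
If you ever need to process query results without Facepager's export dialogue, the sketch below flattens a response in the standard SPARQL JSON results layout into CSV rows using Python's standard library. It assumes the endpoint returns that layout, in which each result row is a "binding" mapping variable names to values. The sample response is hand-written with placeholder values and is not real Wikidata output. The helper names are made up for this example, and the nesting shown in Facepager's Data View may differ.

```python
import csv
import io

# Hand-written sample in the SPARQL JSON results layout (placeholder values,
# not fetched from Wikidata).
sample_response = {
    "head": {"vars": ["entityLabel", "genderLabel"]},
    "results": {
        "bindings": [
            {
                "entityLabel": {"type": "literal", "xml:lang": "en", "value": "Example Member A"},
                "genderLabel": {"type": "literal", "xml:lang": "en", "value": "female"},
            },
            {
                "entityLabel": {"type": "literal", "xml:lang": "en", "value": "Example Member B"},
                "genderLabel": {"type": "literal", "xml:lang": "en", "value": "male"},
            },
        ]
    },
}


def bindings_to_rows(response: dict) -> list[dict]:
    """Turn each binding into a flat dict of variable name -> value."""
    variables = response["head"]["vars"]
    return [
        # Variables missing from a binding become empty strings.
        {var: binding.get(var, {}).get("value", "") for var in variables}
        for binding in response["results"]["bindings"]
    ]


def rows_to_csv(rows: list[dict], fieldnames: list[str]) -> str:
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = bindings_to_rows(sample_response)
csv_text = rows_to_csv(rows, sample_response["head"]["vars"])
print(csv_text)
```

The column names come from the variables in the SELECT clause, so a query extended with further variables would yield additional columns without changes to the helpers.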

What's next?
