-
Commenting on structures in OpenRefine
-
Leaning more toward just getting the data from the PostgreSQL dump and reconciling from CSVs. Trying to merge JSON data seems like more work and more prone to error.
-
@dchiller I talked about the different JSON-LD structures and how to get them from MusicBrainz here, if you're interested.
-
After discussions with Alistair, he suggested using the MusicBrainz JSON dumps (https://data.metabrainz.org/pub/musicbrainz/data/json-dumps/) and, instead of using OpenRefine, working around its inability to handle structured data by calling the reconciliation service API (https://www.w3.org/community/reports/reconciliation/CG-FINAL-specs-0.2-20230410/) directly.
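To make the reconciliation service API suggestion concrete, here is a minimal Python sketch of how a batch of queries could be encoded and how the responses could be read. It assumes the request and response shapes from the 0.2 spec linked above (a form-encoded `queries` parameter, and a `result` list of candidates with `id` and `match`). The function names are hypothetical, and the endpoint URL is left to the caller.

```python
import json
import urllib.parse


def build_reconciliation_body(names, entity_type=None, limit=3):
    """Encode a batch of reconciliation queries as a form-encoded POST body.

    Assumption: the service accepts a `queries` form parameter holding a JSON
    object that maps query IDs to query objects, as in the 0.2 spec.
    """
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if entity_type is not None:
            query["type"] = entity_type
        queries[f"q{index}"] = query
    return urllib.parse.urlencode({"queries": json.dumps(queries)}).encode("utf-8")


def best_matches(response):
    """Map each query ID to the first candidate flagged as a match, or None."""
    matches = {}
    for query_id, payload in response.items():
        matched = [c for c in payload.get("result", []) if c.get("match")]
        matches[query_id] = matched[0]["id"] if matched else None
    return matches
```

The body can then be sent with any HTTP client as a POST to whichever reconciliation endpoint we settle on.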
-
MusicBrainz provides access to its JSON-LD files.
Curl Command
To access data from the MusicBrainz JSON-LD API, you can use the following `curl` command:
Entities of Interest
There are several entities available in MusicBrainz, including release, release group, event, and artist.
Most of the relevant information can be found in the release entity, especially when dealing with albums.
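As a rough Python counterpart to the curl request, the sketch below builds a JSON-LD request for any of the entity types listed above. It assumes that MusicBrainz entity pages return JSON-LD when the `Accept: application/ld+json` header is sent. The helper names and the User-Agent string are placeholders.

```python
import json
import urllib.request

BASE_URL = "https://musicbrainz.org"

# URL path segment for each entity type mentioned above.
ENTITY_PATHS = {
    "release": "release",
    "release group": "release-group",
    "event": "event",
    "artist": "artist",
}


def build_jsonld_request(entity_type, mbid):
    """Build (but do not send) a request for an entity's JSON-LD document.

    Assumption: content negotiation via the Accept header selects JSON-LD.
    """
    url = f"{BASE_URL}/{ENTITY_PATHS[entity_type]}/{mbid}"
    headers = {
        "Accept": "application/ld+json",
        "User-Agent": "jsonld-sketch/0.1 (placeholder contact)",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_jsonld(entity_type, mbid):
    """Send the request and parse the response body as JSON."""
    with urllib.request.urlopen(build_jsonld_request(entity_type, mbid)) as response:
        return json.load(response)
```

Calling `fetch_jsonld("release", mbid)` with a real MBID would perform the network request. Building the request separately makes it easy to inspect the URL and headers first.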
Release Entity Example
Here is an example of JSON-LD data for a release:
Track Information
When examining specific tracks within a release, the information is largely the same as in the `track` elements of the previous JSON-LD.
Release Group Entity
Another entity of interest is the release group, which has less detailed information than the release entity. It doesn't include track, region, release format, and other information. However, it has links to other databases.
Here is an example of the "sameAs" field in a release group:
Sometimes they even include links to Wikidata.
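A small sketch of how the Wikidata links could be pulled out of a "sameAs" field. It assumes the field may hold either a single URL or a list of URLs. The function name is hypothetical and the sample URLs in the usage note are made-up placeholders, not real MusicBrainz output.

```python
from urllib.parse import urlparse


def wikidata_links(document):
    """Return the sameAs URLs whose host is wikidata.org (or a subdomain)."""
    same_as = document.get("sameAs", [])
    if isinstance(same_as, str):
        # A lone URL is normalized to a one-element list.
        same_as = [same_as]
    links = []
    for url in same_as:
        host = urlparse(url).hostname or ""
        if host == "wikidata.org" or host.endswith(".wikidata.org"):
            links.append(url)
    return links
```

For example, a placeholder document such as `{"sameAs": ["https://www.discogs.com/master/0", "https://www.wikidata.org/wiki/Q0"]}` would yield only the second URL.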
Main Entity Proposal
My proposal is to use the MusicBrainz release entity as the main entity to retrieve JSON-LD data from the API. Additional information can be merged or expanded using the release group entity or any other necessary entities to obtain more comprehensive data.
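One way the merge could work, as a sketch only: the release document stays authoritative, the release group fills in keys the release lacks, and the "sameAs" links from both are combined. The function names and the choice to skip JSON-LD keywords (keys starting with `@`) are assumptions for illustration.

```python
def as_list(value):
    """Normalize a missing, single, or list value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def merge_release_with_group(release, release_group):
    """Expand a release document with data from its release group.

    The release is the main entity, so its values win on conflicts.
    """
    merged = dict(release)
    for key, value in release_group.items():
        if key.startswith("@") or key == "sameAs":
            continue  # keep the release's own identity; sameAs is handled below
        merged.setdefault(key, value)
    links = as_list(release.get("sameAs")) + as_list(release_group.get("sameAs"))
    if links:
        # dict.fromkeys drops duplicates while preserving order
        merged["sameAs"] = list(dict.fromkeys(links))
    return merged
```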
Reconciling with Wikidata
For reconciling values such as musicians, file formats, etc., we can use OpenRefine.
`@id` key).
Regarding keys, MusicBrainz follows the schema.org context, which doesn't link to Wikidata. We can consider overriding this context and adding our own properties where appropriate.
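A sketch of what overriding the context could look like. In JSON-LD, when `@context` is a list, later entries take precedence over earlier ones, so appending our own term definitions redefines the matching keys. The helper name is hypothetical, and the mapping in the usage note (`byArtist` to Wikidata's P175, "performer") is only an illustrative choice.

```python
def override_context(document, overrides):
    """Return a copy of the document with extra term definitions appended.

    Later entries in a JSON-LD @context list override earlier ones, so the
    original context is kept and our definitions are added after it.
    """
    existing = document.get("@context", [])
    if not isinstance(existing, list):
        existing = [existing]
    updated = dict(document)
    updated["@context"] = existing + [overrides]
    return updated
```

For instance, `override_context(doc, {"wdt": "http://www.wikidata.org/prop/direct/", "byArtist": "wdt:P175"})` would point the `byArtist` key at a Wikidata property while leaving every other key under the original context.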
To start, I'll collect a list of releases (I'll try to find overlaps with our other databases) and try to expand the JSON-LD at certain interesting metadata fields. I'll then reconcile with Wikidata using OpenRefine, overriding the MusicBrainz contexts where needed. For values like musicians, cities, and genres, I'll set the main @id to a Wikidata IRI and add the MusicBrainz ID as another field in the JSON document.
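The last step could look roughly like the sketch below: once a value has been reconciled to a Wikidata QID, its `@id` is replaced by the Wikidata entity IRI and the original MusicBrainz identifier is kept in a separate field. The field name `musicbrainzId` and the function name are hypothetical, and unreconciled nodes are returned unchanged.

```python
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def relink_to_wikidata(node, qid, mb_field="musicbrainzId"):
    """Return a copy of a JSON-LD node whose @id is a Wikidata IRI.

    The previous @id (the MusicBrainz URL) is preserved under `mb_field`,
    a hypothetical field name chosen for this sketch.
    """
    relinked = dict(node)
    if qid is None:
        return relinked  # no reconciliation match: leave the node as-is
    relinked[mb_field] = node.get("@id")
    relinked["@id"] = WIKIDATA_ENTITY_PREFIX + qid
    return relinked
```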