Emmanuel Blondel edited this page Oct 20, 2021 · 35 revisions

zen4R - R Interface to Zenodo REST API


Provides an Interface to Zenodo REST API, including management of depositions, attribution of DOIs by 'Zenodo' and upload of files.


If you wish to sponsor zen4R, do not hesitate to contact me

Many thanks to the following organizations that have provided funding for strengthening the zen4R package:


Table of contents

1. Overview
2. Package status
3. Credits
4. User guide
   4.1 Installation
   4.2 Connect to Zenodo REST API
   4.3 Query Zenodo deposited records
      4.3.1 Get Depositions
      4.3.2 Get Deposition By Concept DOI
      4.3.3 Get Deposition By DOI
      4.3.4 Get Deposition By Zenodo record ID
      4.3.5 Get Deposition versions
   4.4 Manage Zenodo record depositions
      4.4.1 Create an empty record
      4.4.2 Fill a record
      4.4.3 Deposit/Update a record
      4.4.4 Delete a record
      4.4.5 Publish a record
      4.4.6 Edit/Update a published record
      4.4.7 Discard changes of a draft record
      4.4.8 Create a new record version
   4.5 Manage Zenodo record deposition files
      4.5.1 Upload file
      4.5.2 Get files
      4.5.3 Delete file
   4.6 Export Zenodo record metadata
      4.6.1 Export Zenodo record metadata by format
      4.6.2 Export Zenodo record metadata - all formats
   4.7 Browse Zenodo controlled vocabularies
      4.7.1 Communities
      4.7.2 Licenses
      4.7.3 Funders
      4.7.4 Grants
   4.8 Query Zenodo published records
      4.8.1 Get Records
      4.8.2 Get Record By Concept DOI
      4.8.3 Get Record By DOI
      4.8.4 Get Record By ID
   4.9 Download files from Zenodo records
5. Issue reporting

1. Overview and vision


The zen4R package offers an R interface to the Zenodo e-infrastructure. It supports the creation of metadata records (including versioning), upload of files, and assignment of Digital Object Identifier(s) (DOIs).

zen4R is developed jointly with geoflow, which aims to facilitate and automate the production of geographic metadata documents and their associated data sources. In that workflow, zen4R is used to assign DOIs and to cross-reference these DOIs in other metadata documents, such as geographic metadata (ISO 19115/19139) hosted in metadata catalogues and open data portals.

2. Development status


  • January 2019: Inception. Source code managed on GitHub.
  • June 2019: Published on CRAN.

3. Credits


(c) 2019, Emmanuel Blondel

Package distributed under MIT license.

If you use zen4R, I would be very grateful if you could cite it in your published work. Beyond acknowledging the work, citing zen4R makes it more visible and helps ensure its growth and sustainability. For citation, please use the zen4R DOI.

4. User guide


4.1 How to install zen4R in R

The development version of the package can be installed from GitHub. First install the remotes package:

install.packages("remotes")

Once the remotes package is loaded, you can use install_github to install zen4R. By default, the package is installed from master, which is the current development version (likely to be unstable).

require("remotes")
install_github("eblondel/zen4R")

On Linux/OSX, make sure the libsodium system library (needed by the sodium package) is installed. On Debian/Ubuntu, for example:

sudo apt-get install -y libsodium-dev

4.2 Connect to Zenodo REST API

The main entry point of zen4R is the ZenodoManager. Some basic methods, such as listing licenses known by Zenodo, do not require the token.

zenodo <- ZenodoManager$new()

To use the deposit functions of zen4R, you need to specify a token. This personal access token is created from your Zenodo account.

zenodo <- ZenodoManager$new(
   token = "<your_token>",
   logger = "INFO" # use "DEBUG" to see detailed API operation logs, use NULL if you don't want logs at all
)

By default, the zen4R logger is deactivated. To enable it, specify the desired log level with the logger parameter, as in the R code above. Two logging levels are available:

  • INFO: prints the zen4R logs. Three types of messages can be distinguished: INFO, WARN, ERROR. The latter is generally associated with a stop and indicates a blocking error for the R method executed.
  • DEBUG: prints the above zen4R logs, and reports all logs from HTTP requests performed with cURL.

If you want to use the Zenodo sandbox to test record management before going with the production Zenodo e-infrastructure, you can specify the Zenodo sandbox URL (http://sandbox.zenodo.org/api) in the ZenodoManager.

Important: The Zenodo sandbox is a clone of Zenodo with its own accounts. To test your record management there, you need to set up a sandbox account separately from your Zenodo account, confirm it via the confirmation link, and create a separate personal access token for the sandbox API.

The code below shows how to connect to the Zenodo sandbox:

zenodo <- ZenodoManager$new(
   url = "http://sandbox.zenodo.org/api",
   token = "<your_zenodo_sandbox_token>",
   logger = "INFO"
)
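
To avoid hard-coding tokens in scripts, the two connection modes above can be wrapped as in the following minimal sketch. The helper names (read_zenodo_token, connect_zenodo) and the environment variable names (ZENODO_TOKEN, ZENODO_SANDBOX_TOKEN) are assumptions made for illustration; they are not part of zen4R. Only ZenodoManager$new and its url, token and logger parameters come from the guide above.

```r
# Hypothetical helper (not part of zen4R): read a token from an environment
# variable so that it never appears in a script or in version control.
read_zenodo_token <- function(var = "ZENODO_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable '", var, "' is not set", call. = FALSE)
  }
  token
}

# Hypothetical wrapper: connect either to production Zenodo or to the sandbox.
# The sandbox URL is the one given in this guide; the variable names are assumed.
connect_zenodo <- function(sandbox = FALSE, logger = "INFO") {
  if (sandbox) {
    zen4R::ZenodoManager$new(
      url = "http://sandbox.zenodo.org/api",
      token = read_zenodo_token("ZENODO_SANDBOX_TOKEN"),
      logger = logger
    )
  } else {
    zen4R::ZenodoManager$new(
      token = read_zenodo_token("ZENODO_TOKEN"),
      logger = logger
    )
  }
}

# Usage (requires zen4R and a valid sandbox token):
# zenodo <- connect_zenodo(sandbox = TRUE)
```

Keeping the sandbox and production tokens in separate variables reduces the risk of publishing test records on the production infrastructure by mistake.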

4.3 Query Zenodo deposited records

zen4R offers several methods to query Zenodo depositions.

4.3.1 Get Depositions

The generic way to query depositions is to use the method getDepositions. If specified with no parameter, all depositions will be returned:

my_zenodo_records <- zenodo$getDepositions()

It is also possible to specify an ElasticSearch query using the q parameter. For helpers and query examples, please consult this Zenodo Search guide.

Since the Zenodo API is paginated, an extra parameter size can be specified to indicate the number of records to be queried by page (default value is 10).

By default, the Zenodo API will return only the latest version of each record. It is possible to retrieve all versions of records by specifying all_versions = TRUE.
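
The three parameters described above (q, size and the all_versions flag) can be combined in a single call. The sketch below wraps that call in a hypothetical helper, list_deposition_ids, which is not part of zen4R; it assumes that getDepositions returns a list of records each exposing an id, and the query string in the usage comment is only an illustration.

```r
# Hypothetical helper (not part of zen4R): query depositions and return their ids.
# `zenodo` is assumed to be a ZenodoManager created with a valid token.
list_deposition_ids <- function(zenodo, query, page_size = 10,
                                all_versions = FALSE) {
  deps <- zenodo$getDepositions(
    q = query,                   # ElasticSearch query string
    size = page_size,            # number of records requested per page
    all_versions = all_versions  # flag controlling whether all versions are returned
  )
  # Collect the internal id of each returned record as a character vector
  vapply(deps, function(rec) as.character(rec$id), character(1))
}

# Usage (requires a token; the query string is illustrative only):
# ids <- list_deposition_ids(zenodo, "title:zen4R", page_size = 50)
```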

4.3.2 Get Deposition By Concept DOI

It is possible to interrogate and get a Zenodo record with its concept DOI (generic DOI common to all versions of a record):

my_rec <- zenodo$getDepositionByConceptDOI("<my_concept_doi>")

4.3.3 Get Deposition By DOI

It is possible to interrogate and get a Zenodo record with its DOI (record version-specific DOI):

my_rec <- zenodo$getDepositionByDOI("<my_doi>")

4.3.4 Get Deposition By Zenodo ID

It is possible to interrogate and get a Zenodo record with its internal ID:

my_rec <- zenodo$getDepositionById(id)

4.3.5 Get versions of a Zenodo record

For a given record, it's possible to get the list of versions of this record:

my_rec <- zenodo$getDepositionByConceptDOI("<some_concept_doi>")
my_rec$getVersions()

The list of versions is provided as a data.frame giving the date of publication, version number, and DOI of each version.

Note: This function is not provided through the Zenodo API. It relies on the Zenodo website instead, and has been added to zen4R to facilitate the browsing of record versions.

4.4 Manage Zenodo record depositions

4.4.1 Create an empty record

It is possible to create and deposit an empty record, ready for editing. For that, run the following R code:

myrec <- zenodo$createEmptyRecord()

This method will return an object of class ZenodoRecord for which an internal id and a DOI have been pre-defined by Zenodo. An alternate method is to create a local empty record (not deposited on Zenodo) doing:

myrec <- ZenodoRecord$new()

The next section explains how to fill the record with metadata elements.

4.4.2 Fill a record

Zenodo records are described by a set of metadata elements. For full documentation of these metadata elements, please consult the zen4R documentation with ?ZenodoRecord. The online Zenodo API documentation can be consulted as well.

Example of record filling with metadata elements:

myrec <- ZenodoRecord$new()
myrec$setTitle("my R package")
myrec$setDescription("A description of my R package")
myrec$setUploadType("software")
myrec$addCreator(firstname = "John", lastname = "Doe", affiliation = "Independent", orcid = "0000-0000-0000-0000")
myrec$setLicense("mit")
myrec$setAccessRight("open")
myrec$setDOI("mydoi") #use this method if your DOI has been assigned elsewhere, outside Zenodo
myrec$addCommunity("ecfunded")

4.4.3 Deposit/Update a record

Once the record is edited/updated, you can deposit it on Zenodo with the following code:

myrec <- zenodo$depositRecord(myrec)

To apply further methods to this record (e.g. upload a file, publish or delete the record), you need to capture the output of depositRecord as in the example above: after the deposition, Zenodo returns the record with the internal id required to identify it in further actions. This id can be inspected with myrec$id.

If you do not capture the output of depositRecord and instead try to upload files or publish/delete the record using the local record (built with ZenodoRecord$new()), this will not work. Because it is a local record, its id is still NULL, with no value assigned by Zenodo, and Zenodo cannot identify which record to act on.
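
The distinction between a local record and a deposited one can be checked before calling any method that needs an id. The following is a minimal sketch; is_deposited and require_deposited are hypothetical helper names, not zen4R functions, and they rely only on the id field described above.

```r
# Hypothetical helpers (not part of zen4R).
# A record counts as deposited once Zenodo has assigned it an id.
is_deposited <- function(rec) {
  !is.null(rec$id)
}

# Stop early with a clear message instead of sending a request without an id.
require_deposited <- function(rec) {
  if (!is_deposited(rec)) {
    stop("Record has no id: capture the output of depositRecord first",
         call. = FALSE)
  }
  invisible(rec)
}

# Usage (requires zen4R and a token):
# myrec <- ZenodoRecord$new()
# is_deposited(myrec)                  # local record, no id assigned yet
# myrec <- zenodo$depositRecord(myrec)
# require_deposited(myrec)             # passes once Zenodo has returned an id
```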

4.4.4 Delete a record

A record deposited on Zenodo but not yet published remains in the Upload area of Zenodo (a kind of staging area where draft records are being edited). As long as it is not published, a record can be deleted from the Zenodo Upload area using:

zenodo$deleteRecord(myrec$id)

4.4.5 Publish a record

To publish a deposited record and make it available publicly online on Zenodo, the following method can be run:

myrec <- zenodo$publishRecord(myrec$id)

A shortcut to publish a record is also available through the method depositRecord, by specifying publish = TRUE. This option should be used with caution, since the record will go straight online on Zenodo. By default, the parameter publish is set to FALSE:

myrec <- zenodo$depositRecord(myrec, publish = TRUE)

Publishing a record requires that at least one file has been uploaded for this record. See section 4.5.1 Upload file.
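
Because a record needs at least one file before it can be published, the deposit, upload and publish steps are usually chained. The sketch below shows one possible ordering using only the calls documented in this guide (depositRecord, uploadFile, publishRecord); the wrapper name deposit_and_publish is hypothetical.

```r
# Hypothetical wrapper (not part of zen4R): deposit a record, attach its files,
# then publish it. `zenodo` is assumed to be a ZenodoManager with a valid token.
deposit_and_publish <- function(zenodo, rec, files) {
  # Refuse to continue without files, since publication would fail anyway
  stopifnot(length(files) > 0, all(file.exists(files)))

  # Capture the returned record: it carries the id assigned by Zenodo
  rec <- zenodo$depositRecord(rec)

  # Upload each file to the deposited record
  for (f in files) {
    zenodo$uploadFile(f, rec$id)
  }

  # Publishing makes the record public, so run this on the sandbox first
  zenodo$publishRecord(rec$id)
}

# Usage (requires a token):
# myrec <- deposit_and_publish(zenodo, myrec, "path/to/your/file")
```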

4.4.6 Edit/Update a published record

It is possible to update the metadata of a published record, but not to modify the files associated with it. To update the metadata of a published record, the state of this record first has to be changed to make it editable. For that, use the editRecord function, giving the id of the record to edit:

myrec <- zenodo$editRecord(myrec$id)

Next, perform your metadata updates, and re-deposit the record:

myrec$setTitle("newtitle")
myrec <- zenodo$depositRecord(myrec, publish = FALSE)

Since the record has been switched back to draft state, the record has to be re-published otherwise it will remain a draft record in your Zenodo user session.
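
The full cycle (unlock, update, re-deposit, re-publish) can be sketched as follows. The wrapper name update_published_title is hypothetical; the calls inside it (editRecord, setTitle, depositRecord, publishRecord) are the ones shown in this guide.

```r
# Hypothetical wrapper (not part of zen4R): change the title of a published
# record and publish it again. `zenodo` is assumed to be a ZenodoManager
# created with a valid token.
update_published_title <- function(zenodo, record_id, new_title) {
  # Switch the published record back to an editable draft
  rec <- zenodo$editRecord(record_id)

  # Apply the metadata change, then push the draft back to Zenodo
  rec$setTitle(new_title)
  rec <- zenodo$depositRecord(rec, publish = FALSE)

  # Re-publish, otherwise the record stays a draft
  zenodo$publishRecord(rec$id)
}

# Usage (requires a token):
# myrec <- update_published_title(zenodo, myrec$id, "newtitle")
```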

4.4.7 Discard changes of a draft record

If you started editing a record and want to discard your changes, use the discardChanges method:

zenodo$discardChanges(myrec$id)

4.4.8 Create a new record version

To create a new record version, you should first retrieve the record for which you want to create a new version. You can retrieve this record with methods based on DOI such as getDepositionByConceptDOI (to get a record based on concept DOI) or getDepositionByDOI; or by Zenodo id with getDepositionById :

#get record by DOI
myrec <- zenodo$getDepositionByDOI("<some doi>")

#edit myrec with updated metadata for the new version
#...

#create new version
myrec <- zenodo$depositRecordVersion(myrec, delete_latest_files = TRUE, files = "newversion.csv", publish = FALSE)

The function depositRecordVersion creates a new version of the published record. The parameter delete_latest_files (default = TRUE) deletes the files of the latest version, since a new record version is expected to have different file(s) than the previous one. The files parameter lists the files to be uploaded. As with depositRecord, it is possible to publish the record with the publish parameter.

4.5 Manage Zenodo record deposition files

4.5.1 Upload file

With zen4R, it is very easy to upload a file to a record deposition. The record should first be deposited on Zenodo. To upload a file, a single line is enough, specifying the file path and the id of the record deposition to which the file should be uploaded:

zenodo$uploadFile("path/to/your/file", myrec$id)

4.5.2 Get files

To get the list of files attached to a record, call the following method with the record id:

zen_files <- zenodo$getFiles(myrec$id)

This retrieves a list of files. Each file has a unique id that can be used for file deletion.

4.5.3 Delete file

The following example shows how to delete the first file attached to the record defined earlier. To delete a file, we need to specify both the record and file identifiers:

zenodo$deleteFile(myrec$id, zen_files[[1]]$id)
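
The same two calls can be combined to remove every file attached to a draft record, for instance before uploading a fresh set. This is a minimal sketch; delete_all_files is a hypothetical helper name, and it assumes, as described above, that each element returned by getFiles exposes an id.

```r
# Hypothetical helper (not part of zen4R): delete every file attached to a record.
# `zenodo` is assumed to be a ZenodoManager created with a valid token.
delete_all_files <- function(zenodo, record_id) {
  zen_files <- zenodo$getFiles(record_id)
  for (f in zen_files) {
    # Both the record id and the file id are needed to delete a file
    zenodo$deleteFile(record_id, f$id)
  }
  # Return the number of files that were processed, invisibly
  invisible(length(zen_files))
}

# Usage (requires a token):
# n_deleted <- delete_all_files(zenodo, myrec$id)
```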

4.6 Export Zenodo record metadata

For a given Zenodo record, zen4R lets you export the metadata to a metadata file, with a series of exportAs* methods.

The metadata formats supported are: BibTeX, CSL, DataCite, DublinCore, DCAT, JSON, JSON-LD, GeoJSON, MARCXML

4.6.1 Export Zenodo record metadata by format

To export a record in a given format, there are two ways:

#using the generic exportAs
myrec$exportAs("BibTeX", filename = "myfilename")

#using the format-specific wrapper
myrec$exportAsBibTeX(filename = "myfilename")

The filename provided should not include the file extension: zen4R adds it according to the chosen format.

4.6.2 Export Zenodo record metadata - all formats

To export a record in all above metadata formats:

myrec$exportAsAllFormats(filename = "myfilename")

The filename provided should not include the file extension: zen4R adds it according to each format.

4.7 Browse Zenodo controlled vocabularies

4.7.1 Communities

communities <- zenodo$getCommunities()

4.7.2 Licenses

licenses <- zenodo$getLicenses()

4.7.3 Funders

funders <- zenodo$getFunders()

4.7.4 Grants

grants <- zenodo$getGrants()

4.8 Query Zenodo published records

zen4R offers several methods to query Zenodo records.

4.8.1 Get Records

The generic way to query records is to use the method getRecords:

my_zenodo_records <- zenodo$getRecords(q = "<my_elastic_search_query>")

The q parameter should be an ElasticSearch query. For helpers and query examples, please consult this Zenodo Search guide.

Since the Zenodo API is paginated, an extra parameter size can be specified to indicate the number of records to be queried by page (default value is 10).

By default, the Zenodo API will return only the latest version of each record. It is possible to retrieve all versions of records by specifying all_versions = TRUE.

4.8.2 Get Record By Concept DOI

It is possible to interrogate and get a Zenodo record with its concept DOI (generic DOI common to all versions of a record):

my_rec <- zenodo$getRecordByConceptDOI("<my_concept_doi>")

4.8.3 Get Record By DOI

It is possible to interrogate and get a Zenodo record with its DOI (record version-specific DOI):

my_rec <- zenodo$getRecordByDOI("<my_doi>")

4.8.4 Get Record By Zenodo ID

It is possible to interrogate and get a Zenodo record with its internal ID:

my_rec <- zenodo$getRecordById(id)

4.9 Download files from Zenodo records

zen4R offers methods to download files from published Zenodo records.

Published records and their files are accessible without any user token, using zenodo <- ZenodoManager$new(logger = "INFO"). Files can then be downloaded either from a Zenodo record object (fetched with getRecordByDOI):

  rec <- zenodo$getRecordByDOI("10.5281/zenodo.3378733")
  files <- rec$listFiles(pretty = TRUE)
  
  #create a folder where to download my files
  dir.create("download_zenodo")
  
  #download files
  rec$downloadFiles(path = "download_zenodo")
  downloaded_files <- list.files("download_zenodo")

or using the shortcut function download_zenodo:

  #create a folder where to download my files
  dir.create("download_zenodo")

  #download files with shortcut function 'download_zenodo'
  download_zenodo(path = "download_zenodo", "10.5281/zenodo.3378733")
  downloaded_files <- list.files("download_zenodo")

Downloads can also be done in parallel with the parallel package, depending on the platform. See the examples below:

  • For both Unix/Win OS (using clusters)
  library(parallel)
  #download files as parallel using a cluster approach (for both Unix/Win systems)
  download_zenodo("10.5281/zenodo.2547036", parallel = TRUE, parallel_handler = parLapply, cl = makeCluster(2))
  • For Unix OS (using mclapply )
  #download files as parallel using mclapply (for Unix systems)
  download_zenodo("10.5281/zenodo.2547036", parallel = TRUE, parallel_handler = mclapply, mc.cores = 2)

5. Issue reporting


Issues can be reported at https://github.com/eblondel/zen4R/issues
