Home
Provides an Interface to Zenodo REST API, including management of depositions, attribution of DOIs by 'Zenodo' and upload of files.
If you wish to sponsor zen4R, do not hesitate to contact me.
Many thanks to the following organizations that have provided funding for strengthening the zen4R package:
Table of contents
1. Overview
2. Package status
3. Credits
4. User guide
4.1 Installation
4.2 Connect to Zenodo REST API
4.3 Query Zenodo deposited records
4.3.1 Get Depositions
4.3.2 Get Deposition By Concept DOI
4.3.3 Get Deposition By DOI
4.3.4 Get Deposition By Zenodo record ID
4.3.5 Get Deposition versions
4.4 Manage Zenodo record depositions
4.4.1 Create an empty record
4.4.2 Fill a record
4.4.3 Deposit/Update a record
4.4.4 Delete a record
4.4.5 Publish a record
4.4.6 Edit/Update a published record
4.4.7 Discard changes of a draft record
4.4.8 Create a new record version
4.5 Manage Zenodo record deposition files
4.5.1 Upload file
4.5.2 Get files
4.5.3 Delete file
4.6 Export Zenodo record metadata
4.6.1 Export Zenodo record metadata by format
4.6.2 Export Zenodo record metadata - all formats
4.7 Browse Zenodo controlled vocabularies
4.7.1 Communities
4.7.2 Licenses
4.7.3 Funders
4.7.4 Grants
4.8 Query Zenodo published records
4.8.1 Get Records
4.8.2 Get Record By Concept DOI
4.8.3 Get Record By DOI
4.8.4 Get Record By ID
4.9 Download files from Zenodo records
5. Issue reporting
The zen4R package offers an R interface to the Zenodo e-infrastructure. It supports the creation of metadata records (including versioning), upload of files, and assignment of Digital Object Identifier(s) (DOIs).
zen4R is developed jointly with geoflow, which aims to facilitate and automate the production of geographic metadata documents and their associated data sources. In that context, zen4R is used to assign DOIs and to cross-reference these DOIs in other metadata documents such as geographic metadata (ISO 19115/19139) hosted in metadata catalogues and open data portals.
- January 2019: Inception. Code source managed on GitHub.
- June 2019: Published on CRAN.
(c) 2019, Emmanuel Blondel
Package distributed under MIT license.
If you use zen4R, I would be very grateful if you could add a citation in your published work. By citing zen4R, beyond acknowledging the work, you help make it more visible and support its growth and sustainability. For citation, please use the DOI:
For now, the package can be installed from GitHub:
install.packages("remotes")
Once the remotes package is loaded, you can use the install_github function to install zen4R. By default, the package will be installed from master, which is the current development version (likely to be unstable).
require("remotes")
install_github("eblondel/zen4R")
For Linux/OSX, make sure the libsodium system library required by the sodium package is installed. On Debian/Ubuntu-based systems, for example:
sudo apt-get install -y libsodium-dev
The main entry point of zen4R is the ZenodoManager. Some basic methods, such as listing licenses known by Zenodo, do not require the token.
zenodo <- ZenodoManager$new()
To use the deposit functions of zen4R, you will need to specify the token. This token can be created here.
zenodo <- ZenodoManager$new(
  token = "<your_token>",
  logger = "INFO" # use "DEBUG" to see detailed API operation logs, use NULL if you don't want logs at all
)
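To avoid hard-coding the token in scripts, one option is to read it from an environment variable. The sketch below assumes the token is stored in a variable named ZENODO_TOKEN; that name and the connect_zenodo helper are illustrative and not part of zen4R.

```r
# Hypothetical helper (not part of zen4R): build a ZenodoManager from a token
# stored in an environment variable, so the token never appears in the script.
connect_zenodo <- function(envvar = "ZENODO_TOKEN", logger = "INFO") {
  token <- Sys.getenv(envvar)
  if (!nzchar(token)) {
    stop("No Zenodo token found in environment variable '", envvar, "'")
  }
  # Same constructor arguments as in the example above
  zen4R::ZenodoManager$new(token = token, logger = logger)
}

# Usage, once the variable is set (e.g. in ~/.Renviron):
# zenodo <- connect_zenodo()
```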
By default, the zen4R logger is deactivated. To enable it, specify the desired log level through the logger parameter, as in the R code above. Two logging levels are available:
- INFO: will print the zen4R logs. Three types of messages can be distinguished: INFO, WARN, ERROR. The latter is generally associated with a stop and indicates a blocking error for the R method executed.
- DEBUG: will print the above zen4R logs, and report all logs from HTTP requests performed with cURL.
If you want to use the Zenodo sandbox to test record management before moving to the production Zenodo e-infrastructure, you can specify the Zenodo sandbox URL (https://sandbox.zenodo.org/api) in the ZenodoManager.
Important: To use the Zenodo sandbox, you need to set up a sandbox account separately from your Zenodo account, confirm this account via its confirmation link, and create a separate personal access token for the sandbox API. Only then will you be able to test your record management on this clone of Zenodo.
The code below shows how to connect to the Sandbox Zenodo e-infrastructure:
zenodo <- ZenodoManager$new(
  url = "https://sandbox.zenodo.org/api",
  token = "<your_zenodo_sandbox_token>",
  logger = "INFO"
)
zen4R offers several methods to query Zenodo depositions.
The generic way to query depositions is to use the method getDepositions. If called with no parameter, all depositions will be returned:
my_zenodo_records <- zenodo$getDepositions()
It is also possible to specify an ElasticSearch query using the q parameter. For helpers and query examples, please consult this Zenodo Search guide.
Since the Zenodo API is paginated, an extra parameter size can be specified to indicate the number of records to be queried per page (default value is 10).
By default, the Zenodo API will return only the latest version of each record. It is possible to retrieve all versions of records by specifying all_versions = TRUE.
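The three parameters described above can be combined in a single call. The following is a minimal sketch: the wrapper name find_my_depositions and the example query string are illustrative assumptions, while q, size and all_versions are the getDepositions parameters described in this section.

```r
# Hypothetical wrapper (not part of zen4R) around getDepositions.
# 'zenodo' is a ZenodoManager created with a token, as shown earlier.
find_my_depositions <- function(zenodo, query, page_size = 10, all_versions = FALSE) {
  zenodo$getDepositions(
    q = query,                  # ElasticSearch query string
    size = page_size,           # number of records queried per page
    all_versions = all_versions # TRUE to get every version, not only the latest
  )
}

# Usage (the query string is only an example):
# deps <- find_my_depositions(zenodo, "title:zen4R", page_size = 20, all_versions = TRUE)
```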
It is possible to interrogate and get a Zenodo record with its concept DOI (generic DOI common to all versions of a record):
my_rec <- zenodo$getDepositionByConceptDOI("<my_concept_doi>")
It is possible to interrogate and get a Zenodo record with its DOI (record version-specific DOI):
my_rec <- zenodo$getDepositionByDOI("<my_doi>")
It is possible to interrogate and get a Zenodo record with its internal ID:
my_rec <- zenodo$getDepositionById(id)
For a given record, it's possible to get the list of versions of this record:
my_rec <- zenodo$getDepositionByConceptDOI("<some_concept_doi>")
my_rec$getVersions()
The list of versions is provided as a data.frame giving the date of publication, version number, and DOI of each version.
Note: This function is not provided through the Zenodo API, but relies on the Zenodo website. It has been added to zen4R to facilitate the browsing of record versions.
It is possible to create and deposit an empty record, ready for editing. For that, run the following R code:
myrec <- zenodo$createEmptyRecord()
This method will return an object of class ZenodoRecord for which an internal id and a DOI have been pre-defined by Zenodo. An alternative is to create a local empty record (not deposited on Zenodo) with:
myrec <- ZenodoRecord$new()
The next section explains how to fill the record with metadata elements.
Zenodo records can be described by a set of metadata elements. For full documentation of these metadata elements, please consult the zen4R documentation with ?ZenodoRecord. The online Zenodo API documentation can be consulted as well here.
Example of record filling with metadata elements:
myrec <- ZenodoRecord$new()
myrec$setTitle("my R package")
myrec$setDescription("A description of my R package")
myrec$setUploadType("software")
myrec$addCreator(firstname = "John", lastname = "Doe", affiliation = "Independent", orcid = "0000-0000-0000-0000")
myrec$setLicense("mit")
myrec$setAccessRight("open")
myrec$setDOI("mydoi") #use this method if your DOI has been assigned elsewhere, outside Zenodo
myrec$addCommunity("ecfunded")
Once the record is edited/updated, you can deposit it on Zenodo with the following code:
myrec <- zenodo$depositRecord(myrec)
In order to apply further methods on this record (e.g. upload a file, publish/delete a record), you need to keep the output of the function depositRecord (see example above): after the deposition, Zenodo returns the record, which now contains the internal id required to identify it and apply further actions. This id can be inspected with myrec$id.
If you don't keep the output of depositRecord and instead try to upload files or publish/delete the record based on the local record you handle (built with ZenodoRecord$new()), this will not work. Because it is a local record, its id will still be NULL, with no value assigned by Zenodo, and Zenodo will be unable to identify which record needs to be handled.
A record deposited on Zenodo but not yet published remains in the Upload area of Zenodo (a kind of staging area where draft records are being edited). As long as it is not published, a record can be deleted from the Zenodo Upload area using:
zenodo$deleteRecord(myrec$id)
To publish a deposited record and make it available publicly online on Zenodo, the following method can be run:
myrec <- zenodo$publishRecord(myrec$id)
A shortcut to publish a record is also available through the method depositRecord, by specifying publish = TRUE. This option should be used with caution, given that the record will go straight online on Zenodo once published. By default, the parameter publish is set to FALSE:
myrec <- zenodo$depositRecord(myrec, publish = TRUE)
Publishing a record requires that at least one file has been uploaded for this record. See section 4.5.1 Upload file.
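Putting these steps together, a deposit-then-upload-then-publish sequence could look like the sketch below. The helper name deposit_with_file is hypothetical; it only chains the depositRecord, uploadFile and publishRecord calls with the arguments used in this guide, and the check on the returned id is an assumption about how a failed deposit would show up.

```r
# Hypothetical helper (not part of zen4R): deposit a record, attach one file,
# and optionally publish it.
deposit_with_file <- function(zenodo, record, file, publish = FALSE) {
  if (!file.exists(file)) stop("File not found: ", file)
  # Keep the returned record: it carries the id assigned by Zenodo
  deposited <- zenodo$depositRecord(record)
  if (is.null(deposited$id)) stop("The deposited record has no id")
  # A record needs at least one file before it can be published
  zenodo$uploadFile(file, deposited$id)
  if (publish) {
    deposited <- zenodo$publishRecord(deposited$id)
  }
  deposited
}

# Usage:
# myrec <- deposit_with_file(zenodo, myrec, "path/to/your/file")
```

Leaving publish = FALSE by default keeps the record as a draft, so it can still be checked or deleted before going online.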
It is possible to update the metadata of a published record, but not to modify the files associated with it. In order to update the metadata of a published record, the state of this record has to be modified to make it editable. For that, use the editRecord function, giving the id of the record to edit:
myrec <- zenodo$editRecord(myrec$id)
Next, perform your metadata updates, and re-deposit the record:
myrec$setTitle("newtitle")
myrec <- zenodo$depositRecord(myrec, publish = FALSE)
Since the record has been switched back to draft state, it has to be re-published, otherwise it will remain a draft record in your Zenodo user session.
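The complete edit cycle (unlock, update, re-deposit, re-publish) can be sketched as a single function. The name update_published_title is hypothetical; the function only chains the editRecord, setTitle, depositRecord and publishRecord calls shown in this guide.

```r
# Hypothetical helper (not part of zen4R): change the title of a published
# record and publish it again.
update_published_title <- function(zenodo, record_id, new_title) {
  # 1. switch the published record back to an editable draft
  rec <- zenodo$editRecord(record_id)
  # 2. update the metadata locally
  rec$setTitle(new_title)
  # 3. push the updated metadata to Zenodo, still as a draft
  rec <- zenodo$depositRecord(rec, publish = FALSE)
  # 4. re-publish, otherwise the record stays a draft
  zenodo$publishRecord(rec$id)
}

# Usage:
# myrec <- update_published_title(zenodo, myrec$id, "newtitle")
```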
If you started editing a record and want to discard your changes, you can do so with the discardChanges method:
zenodo$discardChanges(myrec$id)
To create a new record version, you should first retrieve the record for which you want to create a new version. You can retrieve this record with methods based on DOI, such as getDepositionByConceptDOI (to get a record based on its concept DOI) or getDepositionByDOI, or by Zenodo id with getDepositionById:
#get record by DOI
myrec <- zenodo$getDepositionByDOI("<some doi>")
#edit myrec with updated metadata for the new version
#...
#create new version
myrec <- zenodo$depositRecordVersion(myrec, delete_latest_files = TRUE, files = "newversion.csv", publish = FALSE)
The function depositRecordVersion will create a new version of the published record. The parameter delete_latest_files (default = TRUE) deletes the files inherited from the latest version (a new record version is expected to have different file(s) from the latest version). The files parameter lists the files to be uploaded. As with depositRecord, it is possible to publish the record with the publish parameter.
With zen4R, it is very easy to upload a file to a record deposition. The record should first be deposited on Zenodo. To upload a file, the following single line can be used, specifying the file path and the id of the record deposition to which the file should be uploaded:
zenodo$uploadFile("path/to/your/file", myrec$id)
To get the list of files attached to a record, call the following method with the record id:
zen_files <- zenodo$getFiles(myrec$id)
This retrieves a list of files. Each file has a unique id that can be used for file deletion.
The following example shows how to delete the first file attached to the record defined earlier. To delete a file, we need to specify both the record and file identifiers:
zenodo$deleteFile(myrec$id, zen_files[[1]]$id)
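The same two calls can be combined to remove every file attached to a draft record. The helper below is a hypothetical sketch, assuming getFiles returns a list whose elements each expose an id, as described above.

```r
# Hypothetical helper (not part of zen4R): delete all files of a record.
delete_all_files <- function(zenodo, record_id) {
  files <- zenodo$getFiles(record_id)
  for (f in files) {
    # both the record id and the file id are needed to delete a file
    zenodo$deleteFile(record_id, f$id)
  }
  # return the number of files that were processed
  invisible(length(files))
}

# Usage:
# delete_all_files(zenodo, myrec$id)
```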
For a given Zenodo record, zen4R lets you export the metadata to a metadata file, with a series of exportAs* methods.
The metadata formats supported are: BibTeX, CSL, DataCite, DublinCore, DCAT, JSON, JSON-LD, GeoJSON, MARCXML.
To export a record in a given format, there are two ways:
#using the generic exportAs
myrec$exportAs("BibTeX", filename = "myfilename")
#using the format-specific wrapper
myrec$exportAsBibTeX(filename = "myfilename")
The filename provided should not include the file extension, which is managed by zen4R depending on the chosen format.
To export a record in all above metadata formats:
myrec$exportAsAllFormats(filename = "myfilename")
The filename provided should not include the file extension, which is managed by zen4R depending on the format.
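The controlled vocabularies known by Zenodo (communities, licenses, funders, grants) can be retrieved with the following methods:
communities <- zenodo$getCommunities()
licenses <- zenodo$getLicenses()
funders <- zenodo$getFunders()
grants <- zenodo$getGrants()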
zen4R offers several methods to query Zenodo records.
The generic way to query records is to use the method getRecords:
my_zenodo_records <- zenodo$getRecords(q = "<my_elastic_search_query>")
The q parameter should be an ElasticSearch query. For helpers and query examples, please consult this Zenodo Search guide.
Since the Zenodo API is paginated, an extra parameter size can be specified to indicate the number of records to be queried per page (default value is 10).
By default, the Zenodo API will return only the latest version of each record. It is possible to retrieve all versions of records by specifying all_versions = TRUE.
It is possible to interrogate and get a Zenodo record with its concept DOI (generic DOI common to all versions of a record):
my_rec <- zenodo$getRecordByConceptDOI("<my_concept_doi>")
It is possible to interrogate and get a Zenodo record with its DOI (record version-specific DOI):
my_rec <- zenodo$getRecordByDOI("<my_doi>")
It is possible to interrogate and get a Zenodo record with its internal ID:
my_rec <- zenodo$getRecordById(id)
zen4R offers methods to download files from Zenodo published records.
Since these records are published, they and their files are accessible without any user token, using zenodo <- ZenodoManager$new(logger = "INFO"). Files can then be downloaded either from a Zenodo record object (fetched with getRecordByDOI):
rec <- zenodo$getRecordByDOI("10.5281/zenodo.3378733")
files <- rec$listFiles(pretty = TRUE)
#create a folder where to download my files
dir.create("download_zenodo")
#download files
rec$downloadFiles(path = "download_zenodo")
downloaded_files <- list.files("download_zenodo")
or using the shortcut function download_zenodo:
#create a folder where to download my files
dir.create("download_zenodo")
#download files with shortcut function 'download_zenodo'
download_zenodo(path = "download_zenodo", "10.5281/zenodo.3378733")
downloaded_files <- list.files("download_zenodo")
Downloads can also be done in parallel with the parallel package, depending on the platform. See the examples below:
- For both Unix/Win OS (using clusters)
library(parallel)
#download files as parallel using a cluster approach (for both Unix/Win systems)
download_zenodo("10.5281/zenodo.2547036", parallel = TRUE, parallel_handler = parLapply, cl = makeCluster(2))
- For Unix OS (using mclapply)
#download files as parallel using mclapply (for Unix systems)
download_zenodo("10.5281/zenodo.2547036", parallel = TRUE, parallel_handler = mclapply, mc.cores = 2)
Issues can be reported at https://github.com/eblondel/zen4R/issues