add new data set

Oliver Beckstein edited this page Oct 4, 2018 · 11 revisions

To add a new data set:

  1. Put the data on figshare (or another archive-grade repository such as zenodo or DataDryad; some universities also provide suitable digital repositories). The site must provide stable download links and must not alter the file content on download, because we store a SHA256 checksum.
  2. Add a Python module such as MDAnalysisData/adk_equilibrium.py; in many cases you can copy that module and adapt the following:
  • text
  • NAME: name of the data set; it is used as a file name, so do not use spaces or other characters that are awkward in file names
  • DESCRIPTION: filename of the description file (reStructuredText format, so it has the suffix .rst)
  • ARCHIVE: dictionary containing RemoteFileMetadata instances. Keys should describe the file type. Typical keys:
    • topology: topology file (PSF, TPR, ...)
    • trajectory: trajectory coordinate file (DCD, XTC, ...)
    • structure (optional): system with single frame of coordinates (typically PDB, GRO, CRD, ...)
  • name of the fetch_NAME function
  • docs of the fetch_NAME function
  3. Add a description file such as MDAnalysisData/descr/adk_equilibrium.rst; copy this file and adapt it. Make sure to add license information.
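
The checksum and ARCHIVE steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual code: the RemoteFileMetadata stand-in and its field names (filename, url, checksum), the helpers sha256sum and make_archive, and the values my_dataset and example.org are all hypothetical. Check MDAnalysisData/adk_equilibrium.py for the real layout before copying anything.

```python
import hashlib
from collections import namedtuple

# Stand-in for the package's RemoteFileMetadata so that this sketch runs on
# its own. ASSUMPTION: the real class carries filename, url and checksum.
RemoteFileMetadata = namedtuple(
    "RemoteFileMetadata", ["filename", "url", "checksum"]
)


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a local file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_archive(files):
    """Build an ARCHIVE-style dictionary (hypothetical helper).

    `files` maps a key such as "topology" or "trajectory" to a tuple
    (local_path, filename, url). The checksum is computed from the local
    copy of the file that was uploaded to the repository.
    """
    return {
        key: RemoteFileMetadata(
            filename=filename,
            url=url,
            checksum=sha256sum(local_path),
        )
        for key, (local_path, filename, url) in files.items()
    }


# Hypothetical module-level constants for a new data set module.
NAME = "my_dataset"              # used as a file name: no spaces
DESCRIPTION = "my_dataset.rst"   # description file in reStructuredText
```

A call might look like make_archive({"topology": ("local/topol.psf", "topol.psf", "https://example.org/files/topol.psf")}), with the resulting values pasted into the ARCHIVE dictionary of the new module. Computing the digest from the local copy and then comparing it with the digest of a fresh download is one way to confirm that the repository serves the file unchanged.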