<HTML><HEAD><TITLE>Data Hub LOD Validator - Help</TITLE><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"><script type="text/javascript" src="autocomplete/jquery.js"></script><script type='text/javascript' src='autocomplete/jquery.bgiframe.min.js'></script><script type='text/javascript' src='autocomplete/jquery.ajaxQueue.js'></script><script type='text/javascript' src='autocomplete/thickbox-compressed.js'></script><script type='text/javascript' src='autocomplete/jquery.autocomplete.js'></script><link rel="stylesheet" type="text/css" href="autocomplete/main.css" /><link rel="stylesheet" type="text/css" href="autocomplete/jquery.autocomplete.css" /><link rel="stylesheet" type="text/css" href="autocomplete/thickbox.css" /><script> $(document).ready(function() { $("#package").autocomplete("autocomplete.php", { matchContains: true, width: 300, selectFirst: false }); });</script><script type="text/javascript"> function KeyCode(ev) { if(ev){ TastenWert = ev.which } else { TastenWert = window.event.keyCode } if (TastenWert == 13) { document.form.submit(); } } document.onkeypress = KeyCode;</script><style type=text/css> body { background: white; color: black; font-family: sans-serif; line-height: 1.4em; padding: 2.5em 3em; margin: 0; } :link { color: #00c; } :visited { color: #609; } a:link img { border: none; } a:visited img { border: none; } .source { font-size: 10px; color: #800; } h1, h2, h3, h4 { background: white; color: #800; } h1 { font: 170% sans-serif; margin: 0; } h2 { clear: both; font: 140% sans-serif; margin: 1.5em 0 -0.5em 0; padding-bottom:20px; } h3 { font: 120% sans-serif; margin: 0.5em 0; } h4 { font: 110% sans-serif; margin: 1em 0 0.5em 0; } h5 { font: bold 100% sans-serif; margin: 0.5em 0 0.3em 0;} h6 { font: small-caps 100% sans-serif; } .hide { display: none; } p { margin: 0.5em 0;} pre { background: #fff6bb; font-family: monospace; line-height: 1.2em; padding: 1em 2em; } dt { font-weight: bold; margin-top: 0; margin-bottom: 0; } 
dd { margin-top: 0; margin-bottom: 0; } code, tt { font-family: monospace; } ul.toc { list-style-type: none; } ol.toc li a { text-decoration: none; } .note { color: red; } #header { border-bottom: 1px solid #ccc; } #logo { float: right; } #logo a img { padding-left: 20px; background: white; } #authors { clear: right; float: right; font-size: 80%; text-align: right; } #content { clear: both; margin: 2em auto 0 0; text-align: justify } #download, #demo { float: left; font-family: sans-serif; margin: 1em 0 1.5em; text-align: center; width: 50%; } #download h2, #demo h2 { font-size: 125%; margin: 1.5em 0 -0.2em 0; } #download small, #demo small { color: #888; font-size: 80%; } #footer { border-top: 1px solid #ccc; color: #aaa; margin: 2em 0 0; } th { text-align: left; font-weight: bold; font-size: 95%; margin: 0px; padding: 3px 3px 8px 3px; } .red { color: #800; font-weight: bold; } .info { padding-left: 15px; } ol, ul, ul.info { padding-left: 30px; } table { border-collapse:collapse; } table, th, td { border: 1px solid black; } td { vertical-align: top; padding: 3px 3px 8px 3px; } h4 img { vertical-align: top; } .screenshot { float: right; margin: 4px; } </style></HEAD><BODY><div id="logo"> <a href="http://www.fu-berlin.de/"><img src="http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/images/fu-logo.gif" alt="Freie Universität Berlin Logo" /></a></div><div id="header"> <h1 style="font-size: 250%">Data Hub LOD Validator</h1></div><div> <a href="index.php">LOD Datasets on Data Hub</a> | <a href="validate.php">Validate</a> | Help</div><div id="content"> <form action="validate.php" method="post" name="package"> <p style="margin-bottom:40px;">Search Data Hub dataset: <input style="width: 300px;" size="500" id="package" name="package" onKeyPress="KeyCode;"/> <input type="submit" name="submit" value=">"/></p> </form><h3>How do I add a data set to Data Hub or edit an existing data set?</h3><p><ol><li> Please register with <a href="http://www.ckan.net" class="external text" 
title="http://www.ckan.net">Data Hub</a> before editing or adding any packages.</li><li> Please confirm that the data set does not already exist on CKAN before adding a new data set.</li><li> Add or edit the data set and provide as much additional information as possible, as described below. This information is made available via the Data Hub API and can be used by search engines or data consumers to find new datasets to link to. Furthermore, through the <a href="http://www4.wiwiss.fu-berlin.de/lodcloud/state/">state-of-the-lod-cloud page</a>, it helps the community learn more about the state of development of the Web of Linked Data.</li><li> Please tag newly added data sets with <code><i>lod</i></code>. </li><li> If you are not aware of any incoming or outgoing links, tag the data set with <code><i>lodcloud.nolinks</i></code>.</li></ol></p> <h3>LOD Cloud Diagram Compliance Levels</h3> <h4><a name="level1"></a>Level 1 (basic) </h4> <p>Please provide the following basic information about the data set:</p> <h5>CKAN / Basic Information </h5><img src="images/basic.jpg" class="screenshot"/> <table> <tr> <th>Field name</th> <th>Description</th> <th>Format/Examples</th> </tr> <tr> <td>Name</td> <td>Unique ID for the data set on CKAN</td> <td><code>[a-z0-9-]+</code>, e.g. "my-dataset"</td> </tr> <tr> <td>Title</td> <td>Full name of the data set</td> <td>"My Dataset"</td> </tr> <tr> <td>URL</td> <td>Link to data set homepage</td> <td>http://example.com/my-ds</td> </tr> <tr> <td>Author</td><td>Name of the publishing organization and/or person </td><td>"Talis (Leigh Dodds)" </td> </tr> <tr> <td>Author email<br/>Maintainer email </td><td>Contact email </td><td>leigh@ldodds.com </td></tr> </table> <h5>Data Hub / Basic Information / Tags</h5> <img src="images/tags.jpg" class="screenshot"/> <table> <tr> <th>Tag</th> <th>Purpose</th> </tr> <tr> <td><code>lod</code></td> <td>Identifies the data set as Linked Data</td> </tr> </table> <h4><a name="level2"></a>Level 2 (minimal) </h4> <h5>Data Hub / Basic Information / Tags</h5> 
<p>Please provide a topic tag for the data set. We will later use the topic information to color the LOD cloud. </p> <table> <tr> <th>Tag</th> <th>Purpose</th> </tr> <tr> <td><code><i>&lt;topic&gt;</i></code> </td><td>One of: <ul><li> <code>media</code> </li><li> <code>geographic</code> </li><li> <code>lifesciences</code> </li><li> <code>publications</code> (including library and museum data) </li><li> <code>government</code> </li><li> <code>ecommerce</code> </li><li> <code>socialweb</code> (people and their activities) </li><li> <code>usergeneratedcontent</code> (blog posts, discussions, pictures, ...) </li><li> <code>schemata</code> (structural resources, including vocabularies, ontologies, classifications, thesauri) </li><li> <code>crossdomain</code> </li></ul> </td></tr></table> <h5>Data Hub / Resources</h5> <img src="images/resources.jpg" class="screenshot"/> <p>Provide a link to an example URI in the Resources section. Example URIs help people get a feel for your data before they decide to use it.</p> <table> <tr><th>What</th><th>Format</th><th>Description</th></tr> <tr> <td>RDF example link</td><td style="min-width: 150px">Any of: <ul><li><code>example/rdf+xml</code></li><li><code>example/turtle</code></li><li><code>example/ntriples</code></li><li><code>example/x-quads</code></li><li><code>example/rdfa</code></li><li><code>example/x-trig</code></li></ul></td><td>Link to an example data item within the data set in the corresponding format (e.g. RDF/XML)</td></tr> </table> <p>Provide links to the data set download files (dumps) or the SPARQL endpoint. Download files relieve your server of heavy crawling and querying by people interested in bulk loading (e.g. indexing) your dataset. 
SPARQL endpoints let people select the subset they are interested in through a query.</p> <table> <tr><th>URL</th><th>Format</th><th>Description</th></tr> <tr> <td>SPARQL endpoint </td><td><code>api/sparql</code> </td><td>SPARQL endpoint </td> </tr> <tr> <td>Direct link to each RDF download file (preferred)</td><td>Any of: <ul><li><code>application/rdf+xml</code></li><li><code>text/turtle</code></li><li><code>application/x-ntriples</code></li><li><code>application/x-nquads</code></li><li><code>application/x-trig</code></li></ul> </td><td>Download</td> </tr> <tr> <td>Download page with list of downloads (accepted) </td><td> - </td><td>Download (for multiple files) </td> </tr> </table> <h5>Data Hub / Extras - via "Add more information (Groups, authors etc)" </h5> <p>Please provide size and linkage information. These estimates will be compared with automatically estimated numbers for quality assessment and for displaying your dataset graphically in the cloud.</p> <img src="images/extras.jpg" class="screenshot"/> <table> <tr> <th>New key</th><th>With value</th><th>Format/Examples</th></tr> <tr> <td>triples </td><td>Approximate size of the data set in RDF triples </td><td>100000, 62345123 </td></tr> <tr> <td>links:xxx </td><td>Number of RDF links pointing at data set xxx. Please provide a separate <code>links:xxx</code> entry for each data set linked to </td><td>20000 </td></tr></table> <p>If your SPARQL endpoint uses a named graph, please also provide that information.</p> <table> <tr> <th>New key</th><th>With value</th><th>Format/Examples</th></tr> <tr> <td>sparql_graph_name </td><td>Named graph in SPARQL store (if used by the SPARQL endpoint) </td><td><a href="http://species.geospecies.org" class="external free" title="http://species.geospecies.org">http://species.geospecies.org</a> </td></tr> </table> <h4><a name="level3"></a>Level 3 (complete) </h4> <p>Please provide the following additional information about the data set. 
</p> <h5>Data Hub / Basic Information </h5> <img src="images/basic.jpg" class="screenshot"/> <table> <tr> <td><b>Field name</b> </td><td><b>Description</b> </td><td><b>Format/Examples</b> </td></tr> <tr> <td>Version </td><td>Last modification date or version of the data set </td><td>"2010-04 (3.5)", "2006", "beta" </td></tr> <tr> <td>Notes </td><td>Description of the data set </td><td>some free text </td></tr> <tr> <td>License </td><td>Standard license drop-down </td><td>OSI approved::MIT license </td></tr></table> <h5>Data Hub / Extras - via "Add more information (Groups, authors etc)"</h5> <img src="images/extras.jpg" class="screenshot"/> <table> <tr> <td><b>New key</b> </td><td><b>With value</b> </td><td><b>Format/examples</b> </td></tr> <tr> <td>shortname </td><td>Short name for LOD bubble </td><td>"NY Times" </td></tr> <tr> <td>license_link </td><td>Custom license link </td><td><a href="http://example.com/so-sue-me" class="external free" title="http://example.com/so-sue-me">http://example.com/so-sue-me</a> </td></tr> <tr> <td>namespace </td><td>Instance namespace </td><td><a href="http://dbpedia.org/resource/" class="external free" title="http://dbpedia.org/resource/">http://dbpedia.org/resource/</a> </td></tr></table> <h5>Data Hub / Resources</h5> <img class="screenshot" src="images/resources.jpg" /> <p>Links (other than dereferenceable URIs) that enable alternative access to the data set (e.g., via downloads or SPARQL endpoints) should be specified in the Resources section of the CKAN entry form. Please also provide links to the <a href="http://vocab.deri.ie/void/guide" class="external text" title="http://vocab.deri.ie/void/guide">voiD description</a> or <a href="http://sw.deri.org/2007/07/sitemapextension/" class="external text" title="http://sw.deri.org/2007/07/sitemapextension/">Semantic Web Sitemap</a> describing the data set. 
</p> <table> <tr> <td><b>Purpose</b> </td><td><b>Format</b> </td><td><b>Description</b> </td></tr> <tr> <td>voiD file</td> <td><code>meta/void</code></td> <td>voiD description</td> </tr> <tr> <td>XML Sitemap </td><td><code>meta/sitemap</code> </td><td>XML Sitemap </td></tr> <tr> <td>RDF Schema </td><td><code>meta/rdf-schema</code> </td><td>Download link to RDF/OWL Schema used by the data set (in addition to having dereferenceable vocabulary URIs) </td></tr> <tr> <td>Vocabulary Mappings, e.g., OWL, RDFS, RIF, R2R </td><td><code>mapping/<i>&lt;format&gt;</i></code> </td><td> If the data set provides vocabulary mappings to other vocabularies (<code>owl:equivalentClass</code>, <code>owl:equivalentProperty</code>, <code>rdfs:subClassOf</code>, and/or <code>rdfs:subPropertyOf</code> links), provide a link to the mapping file in the <em>Downloads &amp; Resources</em> section, using the following format: <em>mapping/&lt;format&gt;</em>. Replace <em>&lt;format&gt;</em> with the mapping/rule language used, like R2R or RIF.</td></tr></table> <h5>Data Hub / Basic Information / Tags</h5> <img src="images/tags.jpg" class="screenshot"/> <p>Please provide provenance and vocabulary metadata for this dataset. Please list the vocabularies used by the data set so that the community can get an overview of which vocabularies are commonly used on the Web of Linked Data. </p><p>Linked Data published on the Web should be as <a href="http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html" class="external text" title="http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html">self-describing</a> as possible in order to make it easier for clients to understand and use the data. 
Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in <a href="http://www.w3.org/TR/swbp-vocab-pub/" class="external text" title="http://www.w3.org/TR/swbp-vocab-pub/">Publishing RDF Vocabularies</a>, using terms from common vocabularies, and providing vocabulary mappings for proprietary vocabulary terms. To give the community an overview of which data sets implement these best practices, please tag your data set accordingly. </p> <table> <tr> <th style="min-width: 200px">Tag</th><th>Purpose</th></tr> <tr> <td>One of: <ul><li><code>no-proprietary-vocab</code></li><li><code>deref-vocab</code></li><li><code>no-deref-vocab</code></li></ul></td><td>The tag <code>no-proprietary-vocab</code> indicates that your data set does not use a proprietary vocabulary (defined within your top-level domain). The other two tags indicate that your dataset uses proprietary vocabulary terms (the ones that are defined within your top-level domain) and that these terms are (<code>deref-vocab</code>) or are not (<code>no-deref-vocab</code>) dereferenceable according to the best practices for <a href="http://www.w3.org/TR/swbp-vocab-pub/" class="external text" title="http://www.w3.org/TR/swbp-vocab-pub/">Publishing RDF Vocabularies</a>. </td></tr> <tr> <td><code>vocab-mappings</code> <p><code>no-vocab-mappings</code> </p> </td><td>Indicates whether mappings for proprietary vocabulary terms are provided (by setting <code>owl:equivalentClass</code>, <code>owl:equivalentProperty</code>, <code>rdfs:subClassOf</code>, and/or <code>rdfs:subPropertyOf</code> links, or by publishing mappings expressed as RIF rules or in the R2R Mapping Language). </td></tr> <tr> <td><code>provenance-metadata</code> <p><code>no-provenance-metadata</code> </p> </td><td>Indicates whether the data set provides provenance meta-information (creator of the data set, creation date, possibly creation method) as document meta-information or via a voiD description. 
Examples are the <code>dc:creator</code> and <code>dc:date</code> properties. </td></tr> <tr> <td><code>license-metadata</code> <p><code>no-license-metadata</code> </p> </td><td>Indicates whether the data set provides licensing meta-information as document meta-information or via a voiD description. An example is the <code>dc:rights</code> property. </td></tr> <tr> <td><code>published-by-producer</code> <p><code>published-by-third-party</code> </p> </td><td>Indicates whether the data set is published by the original data producer or a third party. </td></tr> <tr> <td><code>limited-sparql-endpoint</code> </td><td>Indicates that the SPARQL endpoint does not serve the whole data set. </td></tr> <tr><td><code>format-<i>&lt;prefix&gt;</i></code> </td><td>A <a href="http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies" class="external text" title="http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies">vocabulary</a> used by the data set, e.g., <code>format-skos</code>, <code>format-dc</code>, <code>format-foaf</code> </td></tr> <tr> <td><code>lodcloud.nolinks</code> </td><td>Data set has no external RDF links to other datasets. </td></tr> <tr> <td><code>lodcloud.unconnected</code> </td><td>Data set has no external RDF links to or from other datasets. </td></tr> <tr> <td><code>lodcloud.needsinfo</code> </td><td>The data provider or data set homepage does not provide the minimum information (and the information cannot be determined from the SPARQL endpoint or downloads). </td></tr> <tr> <td><code>lodcloud.needsfixing</code> </td><td>The dataset is currently broken. Provide details in the Notes. </td></tr></table> <h4><a name="level4"></a>Level 4 (reviewed and added to lodcloud group) </h4> <p>The data set has been reviewed and added to the lodcloud group.</p> <p>Please still check it for missing information and update it if needed.</p> </div> </BODY></HTML>