From 1eef3dd0f85480e6996cc5dd7d9dd5ad631c4489 Mon Sep 17 00:00:00 2001
From: Claus Stadler
Date: Tue, 17 Sep 2024 15:27:16 +0200
Subject: [PATCH] updated doc

---
 README.md                        | 30 ------------------------
 docs/getting-started/index.md    |  2 +-
 docs/index.md                    |  2 +-
 docs/integrate/canned-queries.md | 39 ++++++++++++++++++++++++++++++++
 4 files changed, 41 insertions(+), 32 deletions(-)
 create mode 100644 docs/integrate/canned-queries.md

diff --git a/README.md b/README.md
index dd9a244..2152d98 100644
--- a/README.md
+++ b/README.md
@@ -32,36 +32,6 @@ rpt ngs wc file.trig
 ./produce-graphs.sh | ngs head -n 3
 ```
 
-## Canned Queries
-RPT ships with several useful queries on its classpath. Classpath resources can be printed out using `cpcat`. The following snippet shows examples of invocations and their output:
-
-### Overview
-```bash
-$ rpt cpcat spo.rq
-CONSTRUCT WHERE { ?s ?p ?o }
-
-$ rpt cpcat gspo.rq
-CONSTRUCT WHERE { GRAPH ?g { ?s ?p ?o } }
-```
-
-Any resource (query or data) on the classpath can be used as an argument to the `integrate` command:
-
-```
-rpt integrate yourdata.nt spo.rq
-# When spo.rq is executed then the data is queried and printed out on STDOUT
-```
-
-### Reference
-
-The exact definitions can be viewed with `rpt cpcat resource.rq`.
-
-* `spo.rq`: Output triples from the default graph
-* `gspo.rq`: Output quads from the named graphs
-* `tree.rq`: Deterministically replaces all intermediate nodes with blank nodes. Intermediate nodes are those that appear both as subject and as objects. Useful in conjunction with `--out-format turtle/pretty` for formatting e.g. RML.
-* `gtree.rq`: Named graph version of `tree.rq`
-* `rename.rq`: Replaces all occurrences of an IRI in subject and object positions with a different one. Usage (using environment variables): `FROM='urn:from' TO='urn:to' rpt integrate data.nt rename.rq`
-* `count.rq`: Return the sum of the counts of triples in the default graph and quads in the named graphs. 
-* `s.rq`: List the distinct subjects in the default graph
 
 ## Example Use Cases
 
diff --git a/docs/getting-started/index.md b/docs/getting-started/index.md
index ae5d788..939efe1 100644
--- a/docs/getting-started/index.md
+++ b/docs/getting-started/index.md
@@ -13,7 +13,7 @@ layout: default
 
 You can download RPT as self-contained Debian or RPM packages from [RPT's GitHub release page](https://github.com/SmartDataAnalytics/RdfProcessingToolkit/releases).
 
-Note, that for running the JAR bundle with the `java` command yourself you need to add the appropriate `--add-opens` declarations. This is documented on the [Building from Source](getting-started/build.html) page.
+Note, that for running the JAR bundle with the `java` command yourself you need to add the appropriate `--add-opens` [JVM Options](build.html#jvm-options) page.
 
 ### Docker
 
diff --git a/docs/index.md b/docs/index.md
index 678ac66..2f37351 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -20,7 +20,7 @@ RPT is Java tool which comes with debian and rpm packaging. It is invoked using
 * [ngs](named-graph-streams): Processor for named graph streams (ngs) which enables processing for collections of named graphs in streaming fashion. Process huge datasets without running into memory issues.
 * [sbs](sparql-binding-streams): Processor for SPARQL binding streams (sbs) which enables processing of SPARQL result sets in streaming fashion. Most prominently for use in aggregating the output of a `ngs map` operation.
 * [rmltk](https://github.com/Scaseco/r2rml-api-jena/tree/jena-5.0.0#usage-of-the-cli-tool): These are the (sub-)commands of our (R2)RML toolkit. The full documentation is available [here](https://github.com/SmartDataAnalytics/r2rml-api-jena).
-* sansa: These are the (sub-)commands of our Semantic Analysis Stack (Stack) - a Big Data RDF Processing Framework. Features parallel execution of RML/SPARQL and TARQL (if the involved sources support it). 
+* [sansa]: These are the (sub-)commands of our Semantic Analysis Stack (Stack) - a Big Data RDF Processing Framework. Features parallel execution of RML/SPARQL and TARQL (if the involved sources support it).
 
 **Check this [documentation](doc) for the supported SPARQL extensions with many examples**
 
diff --git a/docs/integrate/canned-queries.md b/docs/integrate/canned-queries.md
new file mode 100644
index 0000000..083fcde
--- /dev/null
+++ b/docs/integrate/canned-queries.md
@@ -0,0 +1,39 @@
+---
+title: Canned Queries
+parent: RDF/SPARQL Processing
+nav_order: 10
+layout: default
+---
+
+## Canned Queries
+RPT ships with several useful queries on its classpath. Classpath resources can be printed out using `cpcat`. The following snippet shows examples of invocations and their output:
+
+### Overview
+```bash
+$ rpt cpcat spo.rq
+CONSTRUCT WHERE { ?s ?p ?o }
+
+$ rpt cpcat gspo.rq
+CONSTRUCT WHERE { GRAPH ?g { ?s ?p ?o } }
+```
+
+Any resource (query or data) on the classpath can be used as an argument to the `integrate` command:
+
+```
+rpt integrate yourdata.nt spo.rq
+# When spo.rq is executed then the data is queried and printed out on STDOUT
+```
+
+### Reference
+
+The exact definitions can be viewed with `rpt cpcat resource.rq`.
+
+* `spo.rq`: Output triples from the default graph.
+* `gspo.rq`: Output quads from the named graphs.
+* `spogspo.rq`: Output all triples followed by all quads.
+* `tree.rq`: Deterministically replaces all intermediate nodes with blank nodes. Intermediate nodes are those that appear both as subject and as objects. Useful in conjunction with `--out-format turtle/pretty` for formatting e.g. RML.
+* `gtree.rq`: Named graph version of `tree.rq`.
+* `rename.rq`: Replaces all occurrences of an IRI in subject and object positions with a different one. 
+Usage (using environment variables): `FROM='urn:from' TO='urn:to' rpt integrate data.nt rename.rq`
+* `count.rq`: Return the sum of the counts of triples in the default graph and quads in the named graphs.
+* `s.rq`: List the distinct subjects in the default graph.
+