Simplify documentation of datamodels and usage of plugins #977

Merged
merged 19 commits into from
Oct 28, 2024
Changes from 12 commits
77 changes: 0 additions & 77 deletions doc/user_guide/concepts.md
@@ -269,83 +269,6 @@ Relations are currently not explored in metadata, but are included because of
their generality.
However, relations are heavily used in [collections].


### Representing an entity
Let's start by making a "Person" entity, where we want to describe his/her name, age and skills.

```json
{
  "uri": "http://onto-ns.com/meta/0.1/Person",
  "meta": "http://onto-ns.com/meta/0.3/EntitySchema",
  "description": "A person.",
  "dimensions": [
    {
      "name": "N",
      "description": "Number of skills."
    }
  ],
  "properties": [
    {
      "name": "name",
      "type": "string",
      "description": "Full name."
    },
    {
      "name": "age",
      "type": "float",
      "unit": "years",
      "description": "Age of person."
    },
    {
      "name": "skills",
      "type": "string",
      "shape": ["N"],
      "description": "List of skills."
    }
  ]
}
```

First we have "uri" identifying the entity, "meta" telling that this is an instance of the entity schema (hence an entity) and a human description.
Then comes "dimensions".
In this case one dimension named "N", which is the number of skills the person has.
Finally we have the properties; "name", "age" and "skills".
We see that "name" is represented as a string, "age" as a floating point number with unit years and "skills" as an array of strings, one for each skill.


### SOFT7 representation
Based on input from [SOFT7], DLite also supports a slightly shortened representation of entities.
The "Person" entity from the above example will, in this representation, look like:

```json
{
  "uri": "http://onto-ns.com/meta/0.1/Person",
  "description": "A person.",
  "dimensions": {
    "N": "Number of skills."
  },
  "properties": {
    "name": {
      "type": "string",
      "description": "Full name."
    },
    "age": {
      "type": "float",
      "unit": "years",
      "description": "Age of person."
    },
    "skills": {
      "type": "string",
      "shape": ["N"],
      "description": "List of skills."
    }
  }
}
```

In this representation, the `meta` field defaults to the entity schema if it is left out.
Dimensions and properties are dictionaries (JSON objects) with the dimension or property name as key, instead of arrays.

references
----------

69 changes: 69 additions & 0 deletions doc/user_guide/datamodels.md
@@ -0,0 +1,69 @@
Representing a datamodel (entity)
----------------------------------

The underlying structure of DLite datamodels is described under [concepts].

Here, a set of rules on how to create a datamodel is presented.

Note that several other possibilities are available, as can be seen in the
examples and tests present in the repository.

We choose here to present only one method, as mixing representation methods might
be confusing. Note, however, that YAML and JSON representations are interchangeable.

A generic example with some comments for clarity can be seen below.

```yaml
uri: http://namespace/version/name
description: A description of what this datamodel represents.
dimensions:  # Named dimensions referred to in the property shapes. Simplest to represent as a dict; set to {} if there are no dimensions
  name_of_dimension: description of dimension
properties:
  name_of_property1:
    description: What is this property
    type: ref  # Can be any of string, float, double, int, ref ....
    unit: unit  # Can be omitted; not relevant with type ref
    shape: [name_of_dimension]  # Can be omitted if the property is a scalar
    $ref: http://namespace/version/name_of_referenceddatamodel  # only if type is ref
```

A slightly more realistic example is the "Person" entity, where we want to describe his/her name, age and skills.

```yaml
uri: http://onto-ns.com/meta/0.1/Person
description: A person.
dimensions:
  N: Number of skills.
properties:
  name:
    description: Full name.
    type: string
  age:
    description: Age of person.
    type: float
    unit: years
  skills:
    description: List of skills.
    type: string
    shape: [N]
```

First we have "uri" identifying the entity, and a human description.
Then comes "dimensions". In this case there is one dimension named "N", which is the number of skills the person has.
Finally we have the properties; "name", "age" and "skills".
We see that "name" is represented as a string, "age" as a floating point number with unit years and "skills" as an array of strings, one for each skill.
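
To make the role of the dimension "N" concrete, the relation between dimensions and property shapes can be sketched in plain Python. This is an illustration only and does not use the DLite API: the helper `infer_dimensions` and the example values are hypothetical, and only one-dimensional shapes are handled.

```python
# Illustration only: plain Python dictionaries, not the DLite API.
person_datamodel = {
    "uri": "http://onto-ns.com/meta/0.1/Person",
    "description": "A person.",
    "dimensions": {"N": "Number of skills."},
    "properties": {
        "name": {"type": "string", "description": "Full name."},
        "age": {"type": "float", "unit": "years", "description": "Age of person."},
        "skills": {"type": "string", "shape": ["N"], "description": "List of skills."},
    },
}


def infer_dimensions(datamodel, values):
    """Infer dimension sizes from array-valued properties (hypothetical helper)."""
    sizes = {}
    for name, prop in datamodel["properties"].items():
        shape = prop.get("shape", [])
        if len(shape) > 1:
            raise NotImplementedError("only one-dimensional shapes are handled here")
        for dim in shape:
            # Every name used in a shape must be a declared dimension.
            if dim not in datamodel["dimensions"]:
                raise KeyError(f"property {name!r} uses undeclared dimension {dim!r}")
            size = len(values[name])
            # All properties sharing a dimension must agree on its size.
            if sizes.setdefault(dim, size) != size:
                raise ValueError(f"inconsistent size for dimension {dim!r}")
    return sizes


# Made-up example values for one person.
holmes = {
    "name": "Sherlock Holmes",
    "age": 34.0,
    "skills": ["observing", "chemistry", "violin", "boxing"],
}
dims = infer_dimensions(person_datamodel, holmes)
print(dims)  # {'N': 4}
```

Scalar properties such as "name" and "age" have no shape and therefore do not contribute to any dimension.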


dlite-validate
==============
The dlite-validate tool can be used to check if a specific representation (in a file) is a valid DLite datamodel.

This can be run as follows:
```bash
dlite-validate filename.yaml # or json
```

It will then return a list of errors if it is not a valid representation.
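
When several datamodel files need checking, the command above can also be called from a script. The following is a minimal sketch using only the Python standard library. The helper names `validate_command` and `run_validation` are hypothetical, and since the meaning of the exit code is not stated here, the sketch simply returns it together with the captured output.

```python
import shutil
import subprocess


def validate_command(filename):
    """Return the command line shown above as an argument list."""
    return ["dlite-validate", filename]


def run_validation(filename):
    """Run dlite-validate on `filename` and return (returncode, output).

    Returns None if dlite-validate is not found on PATH.
    """
    if shutil.which("dlite-validate") is None:
        return None
    result = subprocess.run(
        validate_command(filename), capture_output=True, text=True
    )
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    # "Person.yaml" is a placeholder file name.
    print(run_validation("Person.yaml"))
```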
Collaborator:
I think this should be moved to doc/user_guide/tools.md. However, it would be great to have a note here saying that datamodels can be validated using the dlite-validate tool, with a reference to the correct section in doc/user_guide/tools.md.

Collaborator Author:
Done

Collaborator:
Have you pushed the changes to doc/user_guide/tools.md?

Collaborator Author:
Not yet, waiting for your answer on Teams.



[concepts]: https://sintef.github.io/dlite/user_guide/concepts.html
1 change: 1 addition & 0 deletions doc/user_guide/index.rst
@@ -6,6 +6,7 @@ User Guide
:caption: Contents

concepts
datamodels
type-system
exceptions
collections
35 changes: 33 additions & 2 deletions doc/user_guide/storage_plugins.md
@@ -1,5 +1,5 @@
Storage plugins
===============
Storage plugins / Drivers
=========================

Content
-------
@@ -28,6 +28,37 @@ It also comes with a specific `Blob` and `Image` storage plugin, that can load a
Storage plugins can be written in either C or Python.


How to make storage plugins available
-------------------------------------

As described below, it is possible (and most often advisable) to create specific drivers (storage plugins) for your data.
Additional storage plugins (drivers) can be made available by setting the environment variables
`DLITE_STORAGE_PLUGIN_DIRS` or `DLITE_PYTHON_STORAGE_PLUGIN_DIRS`, e.g.:
```bash
export DLITE_STORAGE_PLUGIN_DIRS=/path/to/new/folder:$DLITE_STORAGE_PLUGIN_DIRS
```

Within Python, the path to the directory containing plugins can be added as follows:

```python
import dlite
dlite.python_storage_plugin_path.append("/path/to/plugins/dir")
```
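
The environment variables can also be set from a Python script. The sketch below uses only the standard library. The helper `prepend_to_path_variable` is hypothetical, and it is an assumption (not confirmed here) that the variable must be set before `dlite` is imported, so doing it first is the conservative choice. Unlike the `export` line above, the helper avoids a dangling separator when the variable is not set yet.

```python
import os


def prepend_to_path_variable(name, directory, environ=None):
    """Prepend `directory` to the path list stored in environment variable `name`."""
    environ = os.environ if environ is None else environ
    current = environ.get(name, "")
    # Avoid a dangling separator when the variable is unset or empty.
    environ[name] = os.pathsep.join([directory, current]) if current else directory
    return environ[name]


# Demonstration on a plain dict, so the real environment is left untouched.
env = {}
prepend_to_path_variable("DLITE_STORAGE_PLUGIN_DIRS", "/path/to/new/folder", env)
prepend_to_path_variable("DLITE_STORAGE_PLUGIN_DIRS", "/another/folder", env)
print(env["DLITE_STORAGE_PLUGIN_DIRS"].split(os.pathsep))
# ['/another/folder', '/path/to/new/folder']
```

Calling the helper without the `environ` argument modifies `os.environ` of the running process.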

Drivers are often connected to very specific datamodels (entities).
DLite will find these datamodels if the path to their directory is set with the
environment variable `DLITE_STORAGES` or added within Python with `dlite.storage_path.append`, similarly to what is described above for drivers.


IMPORTANT:
Often, during development, DLite will fail unexpectedly. This is typically because of an error in either the
datamodel or the driver.
The variable `DLITE_PYDEBUG` can be set with `export DLITE_PYDEBUG=` to get Python debugging information.
This will give information about the driver.
However, it is advisable to first check that the datamodel is valid with the command `dlite-validate datamodelfilename`.


Using storages implicitly from Python
-------------------------------------
For convenience DLite also has an interface for creating storages implicitly.