Modeling patterns and/or gist updates for a magnitude range #1145

Open
rjyounes opened this issue Aug 8, 2024 · 25 comments

@rjyounes
Collaborator

rjyounes commented Aug 8, 2024

We recommend the following two patterns for modeling a range of magnitudes. Note that the class gist:Magnitude is not used, because a magnitude represents an exact value, whereas a range is a specification.

Pattern 1

:_Specification_1 
    a gist:Specification ; 
    gist:hasAspect gistd:_Aspect_length ;
    gist:hasUnitOfMeasure gistd:_UnitOfMeasure_inch ;
    :valueGreaterThan 40 ;
    :valueLessThan 50 ;
    .

To implement this model, define in your own namespace the predicates you need for greater than, less than, greater than or equal to, less than or equal to, equal to. Note that gist:numericValue, which is used for actual amounts, should not be the superproperty of these properties, else a reasoner will generate the following false inferences from the triples above:

:_Specification_1 
   gist:numericValue 40 ;
   gist:numericValue 50 ;
.

While gist:numericValue is not formally defined as a functional property, it is typically used that way, and therefore these inferences would likely wreak havoc with your data.
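
The false inference can be reproduced with a minimal sketch of the RDFS subproperty rule (if `p rdfs:subPropertyOf q` and `s p o`, then `s q o`). The tuple-based triples, the `subproperty_of` mapping, and the function name are illustrative assumptions; this is not a real reasoner and it applies only one level of the property hierarchy.

```python
# Minimal sketch of the RDFS subPropertyOf entailment rule.
# Triples are plain tuples; this is not a real reasoner.

def infer_superproperty_triples(triples, subproperty_of):
    """Apply: (s p o) and (p subPropertyOf q)  =>  (s q o)."""
    inferred = set()
    for s, p, o in triples:
        for q in subproperty_of.get(p, ()):
            inferred.add((s, q, o))
    return inferred

# Hypothetical modeling choice that the text warns against:
# the range predicates declared as subproperties of gist:numericValue.
subproperty_of = {
    ":valueGreaterThan": ["gist:numericValue"],
    ":valueLessThan": ["gist:numericValue"],
}

asserted = {
    (":_Specification_1", ":valueGreaterThan", 40),
    (":_Specification_1", ":valueLessThan", 50),
}

inferred = infer_superproperty_triples(asserted, subproperty_of)
for triple in sorted(inferred, key=lambda t: t[2]):
    print(triple)
```

The specification ends up with two gist:numericValue values, neither of which is an actual amount.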

You might consider defining a subclass of gist:Specification such as :SpecEntry, roughly as follows:

:SpecEntry a owl:Class ;
    owl:equivalentClass [
        a owl:Class ;
        owl:intersectionOf (
            gist:Specification
            [
                a owl:Restriction ;
                owl:onProperty gist:hasAspect ;
                owl:someValuesFrom gist:Aspect ;
            ]
            [
                a owl:Restriction ;
                owl:onProperty gist:hasUnitOfMeasure ;
                owl:someValuesFrom gist:UnitOfMeasure ;
            ]
            # If you have defined a superproperty above:
            [
                a owl:Restriction ;
                owl:onProperty :specifiedValue ;
                owl:someValuesFrom rdfs:Literal ; # or see the range of gist:numericValue
            ]
        )
    ] ;
.

Pattern 2

ex:_Specification_1 # or SpecEntry
   a gist:Specification ; # or SpecEntry  
   ops:hasValueGreaterThan :mag_1 ;
   ops:hasValueLessThan :mag_2 ;
.

:mag_1 a gist:Magnitude ;
   gist:hasAspect gistd:_Aspect_length ;
   gist:hasUnitOfMeasure gistd:_UnitOfMeasure_inch ;
   gist:numericValue 40 ;
.

:mag_2 a gist:Magnitude ;
   gist:hasAspect gistd:_Aspect_length ;
   gist:hasUnitOfMeasure gistd:_UnitOfMeasure_inch ;
   gist:numericValue 50 ;
.

This pattern uses predicates defined in Semantic Arts' Operators Ontology. Note that this ontology is entirely independent of gist and does not import it.

Pros and Cons

  • Pattern 1
    • Pro: Aspect and unit of measure are attributed to the specification and thus common to both amounts.
    • Con: Requires entirely new predicates defined in your namespace. (The Operators Ontology predicates are object properties and thus cannot be used here.)
  • Pattern 2
    • Con: Allows the aspect and unit of measure to differ between the two amounts.
    • Pro: Predicates are already defined in an Operators Ontology that can be imported. No additions to extension ontologies are required.

Introducing the SpecEntry class is already under consideration for gist 13.1.0. (Note that this issue is outdated, using old predicate names and the gist 12 and earlier definition of Magnitude. It would be updated to align with the current definition.)

We will also consider introducing into gist the datatype properties used in pattern 1. In this case we would consider renaming gist:numericValue to gist:valueEqualTo so that numericValue can become the superproperty. We would consider this a major change, despite no formal change to the definition, due to the shift in the meaning in the skos:definition and the use of the property in current implementations.

Pattern 2 is supported by gist out of the box along with the Operators Ontology.

@philblackwood
Contributor

As I suggested in the meeting, we could use numericValue, maxNumericValue, and minNumericValue to economize on the number of concepts used.

valueGreaterThanOrEqualTo 40 would be the same as minNumericValue 40.

Also, do we want a more specific name for the class of things that represent a required range of magnitudes, such as SpecifiedMagnitude?

@rjyounes
Collaborator Author

rjyounes commented Aug 9, 2024

> Also, do we want a more specific name for the class of things that represent a required range of magnitudes, such as SpecifiedMagnitude?

Yes, @Jamie-SA proposed this as well: "I would probably give a different name to the class that you defined as :SpecEntry because it currently only works for a Specification of numeric values. There are other specifications that are not related to Magnitudes (such as types of allowed metal alloys) that I would still consider a SpecEntry. So maybe :MeasurementSpecEntry or something else that qualifies it compared to other possible sub-classes."

And in fact, the original definition of SpecEntry from Michael does include category values as well as magnitudes. MagnitudeSpecification is another option.

> As I suggested in the meeting, we could use numericValue, maxNumericValue, and minNumericValue to economize on the number of concepts used. valueGreaterThanOrEqualTo 40 would be the same as minNumericValue 40.

What about valueGreaterThan 40? We don't know the min value unless we know the precision; e.g., with precision of 1 the min value is 41, with precision of 0.1 the min value is 40.1. Perhaps specifications should always have precision, though.
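
The dependence on precision can be made concrete with a small sketch. It assumes the bound is itself a multiple of the precision, and the function name is hypothetical.

```python
from decimal import Decimal

def smallest_value_above(bound, precision):
    """Smallest representable value strictly greater than bound.

    Assumes bound is a multiple of precision, so the next
    representable value is exactly one precision step higher.
    """
    return Decimal(bound) + Decimal(precision)

print(smallest_value_above("40", "1"))    # 41
print(smallest_value_above("40", "0.1"))  # 40.1
```

Without a stated precision there is no smallest value above 40, so "greater than 40" cannot be rewritten as a minimum value.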

Is there a concern about asymmetry in modeling a range specification and a single value that's part of a specification? For example, a product specification might specify a length of exactly 40 inches and a width between 20 and 30 inches (not a great example, but you get the idea). In the first case we have a magnitude, in the second a magnitude specification. I'm not claiming there's a problem with that, just raising the question.

@philblackwood
Contributor

Let's consider what queries should look like.

If a Thing has a magnitude and a specification for magnitude, a SPARQL ASK query could pull the data about both and check for conformance to the spec with a few statements like:

filter(?numericValue <= ?maxAcceptableValue)
filter(?numericValue >= ?minAcceptableValue)
filter(?numericValue = ?requiredValue)

A slight refinement is needed to adjust for different units of measure (multiply values by conversion factors, or in the case of temperatures convert to Kelvin using conversionOffset and conversionFactor).
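
The same conformance check, including the unit adjustment, might look roughly like this outside SPARQL. The class names, the `conforms` helper, and the convention `base = value * factor + offset` are assumptions made for illustration; check the definitions of conversionFactor and conversionOffset in the gist version you use before relying on that formula.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Unit:
    name: str
    conversion_factor: Decimal
    conversion_offset: Decimal = Decimal("0")

    def to_base(self, value):
        # Assumed convention: base = value * factor + offset
        return value * self.conversion_factor + self.conversion_offset

@dataclass(frozen=True)
class Magnitude:
    numeric_value: Decimal
    unit: Unit

    def in_base_unit(self):
        return self.unit.to_base(self.numeric_value)

def conforms(actual, min_acceptable=None, max_acceptable=None, required=None):
    """Mirror the three SPARQL filters after converting to base units."""
    value = actual.in_base_unit()
    if min_acceptable is not None and value < min_acceptable.in_base_unit():
        return False
    if max_acceptable is not None and value > max_acceptable.in_base_unit():
        return False
    if required is not None and value != required.in_base_unit():
        return False
    return True

# Base units here are meter and kelvin.
INCH = Unit("inch", Decimal("0.0254"))
CENTIMETER = Unit("centimeter", Decimal("0.01"))
KELVIN = Unit("kelvin", Decimal("1"))
CELSIUS = Unit("degree Celsius", Decimal("1"), Decimal("273.15"))

lower = Magnitude(Decimal("40"), INCH)
upper = Magnitude(Decimal("50"), INCH)

print(conforms(Magnitude(Decimal("110"), CENTIMETER), lower, upper))  # True
print(conforms(Magnitude(Decimal("130"), CENTIMETER), lower, upper))  # False
```

Decimal is used instead of float so that comparisons at the boundaries are exact.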

maxAcceptableValue, etc. also make sense for any comparable non-numeric values like release versions, alphas, etc.

Sketching out the query, it seemed to make sense to have a property hasRequiredMagnitude. It is somewhat specialized, but Magnitudes are a high-value, high-volume use case for any Enterprise.

@rjyounes
Collaborator Author

rjyounes commented Aug 9, 2024

> Sketching out the query, it seemed to make sense to have a property hasRequiredMagnitude. It is somewhat specialized, but Magnitudes are a high-value, high-volume use case for any Enterprise.

Maybe more generally requires? I've used that often.

@uscholdm
Contributor

uscholdm commented Aug 9, 2024

> As I suggested in the meeting, we could use numericValue, maxNumericValue, and minNumericValue to economize on the number of concepts used.

> valueGreaterThanOrEqualTo 40 would be the same as minNumericValue 40.

> Also, do we want a more specific name for the class of things that represent a required range of magnitudes, such as SpecifiedMagnitude?

It's true that min is the same as greater than or equal to, and max means less than or equal to.
Just different names, not more economy. What do you propose to call the properties for strictly greater than and strictly less than that users may need?

@MichaelSullivanArchitect

FWIW, we will probably be leaning towards pattern #2 as it aligns nicely with our current (GIST11) implementation -- our pipeline will validate the shape of the data so having the possibility of conflicting _Aspects and _UOMs is a non-issue.

@rjyounes
Collaborator Author

Jamie: Pattern 2 allows specifying hasAccuracy on each value.
Michael: the operators ontology includes variance

Perhaps we don't need accuracy on the specification. Instead of 40-50 inches plus or minus 1/16 inch you would say 39 15/16 - 50 1/16 inches.
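
The arithmetic behind folding the tolerance into the limits can be sketched with exact fractions. The helper names are hypothetical, and the formatter assumes non-negative values.

```python
from fractions import Fraction

def fold_tolerance(lower, upper, tolerance):
    """Widen a [lower, upper] range by a symmetric tolerance."""
    return lower - tolerance, upper + tolerance

def as_mixed_number(value):
    """Format a non-negative Fraction as a mixed number, e.g. 39 15/16."""
    whole, remainder = divmod(value, 1)
    return f"{whole} {remainder}" if remainder else str(whole)

low, high = fold_tolerance(Fraction(40), Fraction(50), Fraction(1, 16))
print(as_mixed_number(low), "-", as_mixed_number(high), "inches")
# 39 15/16 - 50 1/16 inches
```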

Jamie: add a note saying that if needed you could use the operators ontology with 2 magnitudes.

Do we want to introduce the predicates into gist, or just recommend a modeling pattern?

Proposals:

  • Add 4 (5?) datatype properties to gist
  • Do nothing in gist, offer both patterns as recommendations depending on use case

@philblackwood
Contributor

Attached are diagrams of examples from the current GitHub repository for operators (option 2), and the same diagrams refactored to use data properties (option 1). Using data properties makes the diagrams simpler for spec entries and also for version dependencies. For version dependencies, it eliminates some blank nodes.

spec entry and version dependency.pptx

@philblackwood
Contributor

philblackwood commented Aug 28, 2024

Could we use existing SHACL properties to define SpecEntries?

sh:minExclusive
sh:minInclusive
sh:maxExclusive
sh:maxInclusive

They have literal as an object. The SHACL spec does not say what the subject can be, so why not a spec entry?

Note: in SHACL, properties like sh:lessThan have properties as subject and object, e.g. ex:birthDate sh:lessThan ex:deathDate. So we would not want to use these for defining the values in a SpecEntry.

If we do want to "define our own" in gist, it might be helpful to people who use SHACL to use the min/max style instead of lessThan/greaterThan style so they don't have to switch contexts too much when going between gist and SHACL.

@rjyounes
Collaborator Author

@philblackwood We cannot use SHACL in gist because it makes gist non-DL compliant. See #1001 and the issues referenced there. I would also argue that your proposal is a misuse of the intent of the SHACL terms, but I don't need that to make my case.

@rjyounes closed this as not planned on Aug 29, 2024
@rjyounes
Collaborator Author

rjyounes commented Aug 29, 2024

I am leaning towards Pattern 2, using Magnitude objects rather than datatype properties. It is more consistent to always represent amounts using a Magnitude, rather than sometimes using magnitudes and sometimes something else. A magnitude in a specification is still an amount, it is just an amount specified rather than an actual amount. I don't believe the specified vs actual difference should change the modeling.

@rjyounes reopened this on Aug 29, 2024

@philblackwood
Contributor

Looking at the predicates ... here is an argument for using the min/max style rather than the lessThan/greaterThan style.

We want to be able to say that a certain aspect of a thing needs to fall within a range of values.

For example:

[set of "good" values] [some predicate] [highest acceptable value]

The predicate should relate a set of values to a specific value.

hasValueLessThan would typically mean "has a value that is less than" (but that's not what we want).

hasValuesLessThan would typically mean "has some values less than" (but that's not what we want).

As a statement about the set of good values, the predicate hasMaximumValue has a clearer meaning:

[the set of good values] [has a maximum of] [highest acceptable value] expresses the desired relationship between a set of values and a specific value.

In other words, the min/max style makes a clear statement about the set of values, while the lessThan/greaterThan style seems to require giving the predicates a meaning that is different from the typical plain English meaning we use with other predicates that start with "has".

@uscholdm
Contributor

uscholdm commented Sep 9, 2024

@philblackwood

> Looking at the predicates ... here is an argument for using the min/max style rather than the lessThan/greaterThan style.

> We want to be able to say that a certain aspect of a thing needs to fall within a range of values.

That is correct. The operators ontology was designed to be used with an object whose meaning is: "A specification of values for a particular aspect indicating what it means to be in spec for that aspect." Conceptually, there are two parts: the aspect, and the specification of the set of acceptable values. The latter is not a thing unto itself with an IRI; rather, it is specified by one or more triples that collectively define the range for the aspect, using LE, LT, EQ, GE, GT and a few others. So there might be an IRI that means 'the width is less than 10 inches': :_SpecEntry_WidthLt10inches, or :W_Lt_10in for short.

> For example:

> [set of "good" values] [some predicate] [highest acceptable value]

> The predicate should relate a set of values to a specific value.

No, there is no IRI that corresponds to the set of good values on its own. The triple that expresses that the width must be LT 10 inches is
:W_Lt_10in :hasValueLessThan :_Magnitude_10inches .
There might also be a triple that expresses that the width must be GE 4 inches:
:W_Ge_4in :hasValueGreaterOrEqualTo :_Magnitude_4inches .

Less than is sometimes called max exclusive. In mathematical as well as everyday terms greater than or equal to is identical in meaning to minimum. It was hard to come up with good definitions, but the current definition of hasValueLessThan is:
"Relates a specification to a value that an aspect must be less than."

The triple :W_Lt_10in :hasValueLessThan :_Magnitude_10inches says that the width must be less than 10 inches to be regarded as being in spec (in the allowable range).

> hasValueLessThan would typically mean "has a value that is less than" (but that's not what we want).

Why not? It is exactly what I want. 'Less than' is much more friendly than 'max exclusive'.

> As a statement about the set of good values, the predicate hasMaximumValue has a clearer meaning:
> [the set of good values] [has a maximum of] [highest acceptable value] expresses the desired relationship between a set of values and a specific value.

> In other words, the min/max style makes a clear statement about the set of values, while the lessThan/greaterThan style seems to require giving the predicates a meaning that is different from the typical plain English meaning we use with other predicates that start with "has".

It is true that the use of 'has' is not great; I could not think of a better alternative.
Two more things. First, max has the same meaning as 'less than or equal to' but a different meaning from 'less than'. Second, I do not want to say something about the set of good values; there is no IRI that corresponds exactly to that set. I am specifying what it means to BE a good value. This may be where we are talking at cross purposes.

@philblackwood
Contributor

An aggregate relates a set of values to a single value, e.g. the average of the scores is 87 or the max of the scores is 96.

We don't say "the scores less or equal to 96". The comparison relations <, <=, >, >= are used to compare two things of the same type, and we shouldn't use them in place of the aggregate relations that are defined as relating a set of values to a specific value.

@philblackwood
Contributor

philblackwood commented Sep 25, 2024

rewritten Oct 1, 2024 -- includes issue #527

No attempt is made to propose what belongs in gist vs. a submodule like the versioning ontology. (even if the examples might look like it)

General pattern:

image

Queries can compare the actual value against the acceptable values for the given common element. The common element is used to find the set of acceptable values that the actual value should be validated against.

The common element snaps together when the actual_value and the acceptable_values are created and both refer to it.

Case 1: values of a category

The way we typically represent categorical data is like this:

image

We want to specify the list of acceptable or available colors for a given thing.

image image

We want to be able to see if the color of a thing is one of the acceptable colors. To do this we need a way to find the correct set of acceptable values. What the actual color and the list of acceptable colors have in common is that they are both related to color.

image

In this pattern, we expect the thing to have a single set of acceptable colors.

In this simple scenario, if the thing has two colors, say red and green, we can infer that they are both valid. (If we want color combinations, we would have to create a different category.)

Since each category has a finite list of members, the simple approach is to always list the acceptable values using gist:isMemberOf.

Case 2: magnitudes

A thing can have many different magnitudes, each represented as:

image

We want to specify the acceptable or available magnitudes for a given thing.

image

If the specification is used as guidance to create something, we want to be able to say what the magnitude of the created thing should be.

image

If we want to identify a list of magnitudes a thing may have, we could perhaps list multiple targets. (tbd)

We want to be able to identify a range of acceptable values for the magnitude of the thing. Since the specification is a set of values, the most descriptive properties to identify the range are max and min (for example, min relates a set of values to a single value).

A magnitude specification may have any combination of target, max, and min (at least one of these).

image

In this approach, a validation should be done to ensure that all of the specified magnitudes have the same aspect.

The presentation layer can translate the max and min to percentages or to a tolerance (if the max and min are equally distant from the target value).
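
A rough sketch of that presentation-layer translation follows. The function names are hypothetical and the values are assumed to be already expressed in the same unit.

```python
from fractions import Fraction

def as_tolerance(target, minimum, maximum):
    """Return the symmetric tolerance, or None when min and max
    are not equally distant from the target."""
    below = target - minimum
    above = maximum - target
    return above if below == above else None

def as_percentages(target, minimum, maximum):
    """Return (lower, upper) deviations as exact percentages of the target."""
    lower = Fraction(minimum - target) / Fraction(target) * 100
    upper = Fraction(maximum - target) / Fraction(target) * 100
    return lower, upper

print(as_tolerance(45, 40, 50))  # 5
print(as_tolerance(45, 40, 52))  # None
```

When the limits are not symmetric, only the percentage form (or the raw min and max) can be shown.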

We want to be able to see if the magnitude of a thing is one of the acceptable values. To do this we need a way to find the correct set of acceptable values. What the actual magnitude and the set of acceptable magnitudes have in common is that they both are related to the same aspect.

image

In this pattern, we expect the thing to have a single actual magnitude and a single set of acceptable magnitudes.

Case 3: versions

It is common for a versioned thing to use another versioned thing:

image

The predicate "uses" will cover the case of imports for ontologies and also versions of software that may be related to other software in many different ways. It implies there is a dependency.

In this example, we want to specify which versions of gistCore are compatible with version 6.3.0 of the disease taxonomy.

image

The versions of gistCore are ordered, so we can define the compatible versions as:

image

In this example, version 12.1.0 and all its successors up to but not including version 13.0.0 are compatible.
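
The ordering this relies on can be sketched as follows, assuming purely numeric MAJOR.MINOR.PATCH version strings (no pre-release suffixes). The function names are hypothetical.

```python
def parse_version(text):
    """Turn '12.1.0' into (12, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def is_compatible(version, min_inclusive, max_exclusive):
    """True when min_inclusive <= version < max_exclusive."""
    return (
        parse_version(min_inclusive)
        <= parse_version(version)
        < parse_version(max_exclusive)
    )

print(is_compatible("12.1.0", "12.1.0", "13.0.0"))   # True
print(is_compatible("12.10.2", "12.1.0", "13.0.0"))  # True
print(is_compatible("13.0.0", "12.1.0", "13.0.0"))   # False
print(is_compatible("12.0.0", "12.1.0", "13.0.0"))   # False
```

Comparing tuples of integers rather than strings matters: as strings, "12.10.2" would sort before "12.2.0".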

We want to be able to see if the version being used by a thing is one of the compatible values. To do this we need a way to find the correct set of compatible versions. What the actual version being used and the set of compatible versions have in common is that they both are related to the same underlying ontology.

image

In this pattern, we expect that there should be a single set of compatible versions of gist for the given version of the taxonomy.

It might be possible to extend this to say that a range of versions of the disease taxonomy is compatible with a range of versions of gist.

These patterns are equally applicable to versioned software.

Note that in some organizations, governance is done by directly relating compatible versions (instead of creating ranges). We could use :isCompatibleWith to handle that case.

@MichaelSullivanArchitect

This is great work Phil. We will be leveraging these patterns for sure as our use-cases are all engineering oriented.

@MichaelSullivanArchitect

Does this suggest that gist:MagnitudeSpec will be a new class in the next release?

@MichaelSullivanArchitect

Also, does this pattern preclude using gist:hasAccuracy in addition to the ranges and target?

@philblackwood
Contributor

@MichaelSullivanArchitect Accuracy is best thought of as a characteristic of the measurement method, e.g. the scale is accurate to within a pound.

It can also be thought of as the accuracy of a measurement, e.g. the scale says I weigh 180 and it's accurate to within a pound, so I weigh somewhere between 179 and 181.

However, when specifying a required amount I think it's better to just set hard limits. It's fine to set a requirement on the accuracy, but I don't think there is a notion of accuracy of a required range because the required range is not a measurement.

@uscholdm
Contributor

> An aggregate relates a set of values to a single value, e.g. the average of the scores is 87 or the max of the scores is 96.

That sounds right.

> We don't say "the scores less or equal to 96".

Correct.

> The comparison relations <, <=, >, >= are used to compare two things of the same type, and we shouldn't use them in place of the aggregate relations that are defined as relating a set of values to a specific value.

I agree that they are two different things, but they are closely related. The operators ontology uses comparison operators to define a set of values implicitly without having an IRI that denotes that set of values. E.g. 6 <= x <= 8 defines the set of numbers between 6 & 8 including 6 & 8.
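
As a rough illustration of defining the set implicitly, a specification can be held as a list of (operator, bound) constraints and never materialized as a set. The operator codes follow the LE, LT, EQ, GE, GT abbreviations used earlier in this thread; the data layout and function name are assumptions, not the Operators Ontology itself.

```python
import operator

# Operator codes mapped to Python comparison functions.
OPERATORS = {
    "LT": operator.lt,
    "LE": operator.le,
    "EQ": operator.eq,
    "GE": operator.ge,
    "GT": operator.gt,
}

def in_spec(value, constraints):
    """True when value satisfies every (operator, bound) constraint."""
    return all(OPERATORS[op](value, bound) for op, bound in constraints)

# 6 <= x <= 8, expressed without naming the set of good values.
between_6_and_8 = [("GE", 6), ("LE", 8)]

print(in_spec(6, between_6_and_8))    # True
print(in_spec(8, between_6_and_8))    # True
print(in_spec(8.5, between_6_and_8))  # False
```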

@uscholdm
Contributor

> Does this suggest that gist:MagnitudeSpec will be a new class in the next release?

@MichaelSullivanArchitect It is a proposal. This is tightly linked with issue #1161 which calls the class SpecEntry. Other proposed names are: ValueSpec and ValueSpecification.

@rjyounes
Collaborator Author

> This is tightly linked with issue #1161 which calls the class SpecEntry

The classes are actually different. I would say this model uses MagnitudeSpec where the other uses SpecEntry (or ValueSpec(ification)). The latter allows for categorical values as well. Refer to issue #527 and especially this comment from our internal gist meeting for comparison.
