Commit d40ba53

Merge pull request #39 from crim-ca/fix-comments
2 parents 8046280 + 1c518c4 commit d40ba53

12 files changed, +574 -476 lines changed
CHANGELOG.md

Lines changed: 22 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -8,19 +8,37 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
88
## [Unreleased](https://github.com/crim-ca/mlm-extension/tree/main)
99

1010
### Added
11-
- n/a
11+
- Add `raster:bands` required property `name` for describing `mlm:input` bands
12+
(see [README - Bands and Statistics](README.md#bands-and-statistics) for details).
13+
- Add README warnings about new extension `eo` and `raster` versions.
1214

1315
### Changed
14-
- n/a
16+
- Split `ModelBands` and `AnyBandsRef` definitions in the JSON schema to allow them to be referenced individually.
17+
- Move `AnyBandsRef` definition explicitly to STAC Item JSON schema, rather than implicitly inferred via `mlm:input`.
18+
- Modified the JSON schema to use a `if` check of the `type` (STAC Item or Collection) prior to validating further
19+
properties. This allows some validators (e.g. `pystac`) to better report the *real* error that causes the schema
20+
to fail, rather than reporting the first mismatching `type` case with a poor error description to debug the issue.
1521

1622
### Deprecated
1723
- n/a
1824

1925
### Removed
20-
- n/a
26+
- Removed `$comment` entries from the JSON schema that are considered as invalid by some parsers.
27+
- When `mlm:input` objects do **NOT** define band references (i.e.: `bands: []` is used), the JSON schema will not
28+
fail if an Asset with the `mlm:model` role contains a band definition. This is to allow MLM model definitions to
29+
simultaneously use some inputs with `bands` reference names while others do not.
2130

2231
### Fixed
23-
- n/a
32+
- Band checks against [`eo`](https://github.com/stac-extensions/eo), [`raster`](https://github.com/stac-extensions/eo)
33+
or STAC Core 1.1 [`bands`](https://github.com/radiantearth/stac-spec/blob/master/commons/common-metadata.md#bands)
34+
when a `mlm:input` references names in `bands` are now properly validated.
35+
- Fix the examples using `raster:bands` incorrectly defined in STAC Item properties.
36+
The correct use is for them to be defined under the STAC Asset using the `mlm:model` role.
37+
- Fix the [EuroSAT ResNet pydantic example](./stac_model/examples.py) that incorrectly referenced some `bands`
38+
in its `mlm:input` definition without providing any definition of those bands. The `eo:bands` properties have
39+
been added to the corresponding `model` Asset using
40+
the [`pystac.extensions.eo`](https://github.com/stac-utils/pystac/blob/main/pystac/extensions/eo.py) utilities.
41+
- Fix various STAC Asset definitions erroneously employing `mlm:model` role instead of the intended `mlm:source_code`.
2442

2543
## [v1.2.0](https://github.com/crim-ca/mlm-extension/tree/v1.2.0)
2644
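
The band-reference rule described in the CHANGELOG entries above can be illustrated with a short Python sketch. It is not the extension's JSON schema or any `stac-model` API: the helper name `missing_band_names` and the sample Item are hypothetical, and the sketch assumes that each `mlm:input` band reference is either a plain name or an object with a `name` key, and that band definitions live under an Asset with the `mlm:model` role as `bands`, `eo:bands` or `raster:bands`.

```python
def missing_band_names(item: dict) -> set[str]:
    """Return band names referenced by mlm:input that no mlm:model Asset defines.

    Hypothetical helper for illustration only; the real validation is done
    by the extension's JSON schema.
    """
    referenced: set[str] = set()
    for model_input in item.get("properties", {}).get("mlm:input", []):
        for band in model_input.get("bands", []):
            # Assumption: a reference is a plain name or an object with "name".
            referenced.add(band if isinstance(band, str) else band["name"])

    defined: set[str] = set()
    for asset in item.get("assets", {}).values():
        if "mlm:model" not in asset.get("roles", []):
            continue  # only Assets with the mlm:model role are considered
        for key in ("bands", "eo:bands", "raster:bands"):
            for band in asset.get(key, []):
                # raster:bands entries need the extra "name" to be matched.
                if "name" in band:
                    defined.add(band["name"])

    return referenced - defined


# Hypothetical Item: three bands referenced, only two defined by name.
item = {
    "properties": {
        "mlm:input": [{"name": "rgb", "bands": ["B04", "B03", {"name": "B02"}]}]
    },
    "assets": {
        "weights": {
            "href": "https://example.com/model.pt",
            "roles": ["mlm:model"],
            "raster:bands": [{"name": "B04"}, {"name": "B03"}],
        }
    },
}

print(missing_band_names(item))  # {'B02'}
```

An input that declares `bands: []` contributes no references, so an Asset may still carry band definitions without the check reporting anything, which matches the relaxed behaviour listed in the changelog.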

CONTRIBUTING.md

Lines changed: 24 additions & 10 deletions
@@ -19,7 +19,7 @@ make install-dev
1919
make pre-commit-install
2020
```
2121

22-
## PR submittion
22+
## PR submission
2323

2424
Before submitting your code please do the following steps:
2525

@@ -41,7 +41,7 @@ make lint-all
4141
make test
4242
```
4343

44-
5. Upload your changes to your fork, then make a PR from there to the main repo:
44+
6. Upload your changes to your fork, then make a PR from there to the main repo:
4545

4646
```bash
4747
git checkout -b your-branch
@@ -53,11 +53,15 @@ git push -u origin your-branch
5353

5454
## Building and releasing
5555

56-
> :warning: <br>
56+
<!-- lint disable no-undefined-references -->
57+
58+
> [!WARNING]
5759
> There are multiple types of releases for this repository: <br>
5860
> 1. Release for MLM specification (usually, this should include one for `stac-model` as well to support it)
5961
> 2. Release for `stac-model` only
6062
63+
<!-- lint enable no-undefined-references -->
64+
6165
### Building a new version of MLM specification
6266

6367
- Checkout to the `main` branch by making sure the CI passed all previous tests.
@@ -69,9 +73,14 @@ git push -u origin your-branch
6973
- Make a commit to `GitHub` and push the corresponding auto-generated `v{MAJOR}.{MINOR}.{PATCH}` tag.
7074
- Validate that the CI validated everything once again.
7175
- Create a `GitHub release` with the created tag.
72-
> :warning: <br>
73-
> - Ensure the "Set as the latest release" option is selected :heavy_check_mark:.
74-
> - Ensure the diff ranges from the previous MLM version, and not an intermediate `stac-model` release.
76+
77+
<!-- lint disable no-undefined-references -->
78+
79+
> [!WARNING]
80+
> - Ensure the "Set as the latest release" option is selected :heavy_check_mark:.
81+
> - Ensure the diff ranges from the previous MLM version, and not an intermediate `stac-model` release.
82+
83+
<!-- lint enable no-undefined-references -->
7584

7685
### Building a new version of `stac-model`
7786

@@ -83,17 +92,22 @@ git push -u origin your-branch
8392
- Checkout to `main` branch that contais the freshly created merge commit.
8493
- Push the tag `stac-model-v{MAJOR}.{MINOR}.{PATCH}`. The CI should auto-publish it to PyPI.
8594
- Create a `GitHub release`
86-
> :warning: <br>
87-
> - Ensure the "Set as the latest release" option is deselected :x:.
88-
> - Ensure the diff ranges from the previous release of `stac-model`, not an intermediate MLM release.
95+
96+
<!-- lint disable no-undefined-references -->
97+
98+
> [!WARNING]
99+
> - Ensure the "Set as the latest release" option is deselected :x:.
100+
> - Ensure the diff ranges from the previous release of `stac-model`, not an intermediate MLM release.
101+
102+
<!-- lint enable no-undefined-references -->
89103

90104
## Other help
91105

92106
You can contribute by spreading a word about this library.
93107
It would also be a huge contribution to write
94108
a short article on how you are using this project.
95109
You can also share how the ML Model extension does or does
96-
not serve your needs with us in the Github Discussions or raise
110+
not serve your needs with us in the GitHub Discussions or raise
97111
Issues for bugs.
98112

99113
[poetry-install]: https://github.com/python-poetry/install.python-poetry.org

Makefile

Lines changed: 20 additions & 19 deletions
@@ -1,7 +1,8 @@
11
#* Variables
2-
SHELL := /usr/bin/env bash
3-
PYTHON := python
2+
SHELL ?= /usr/bin/env bash
3+
PYTHON ?= python
44
PYTHONPATH := `pwd`
5+
POETRY ?= poetry
56

67
#* Poetry
78
.PHONY: poetry-install
@@ -14,36 +15,36 @@ poetry-remove:
1415

1516
.PHONY: poetry-plugins
1617
poetry-plugins:
17-
poetry self add poetry-plugin-up
18+
$(POETRY) self add poetry-plugin-up
1819

1920
.PHONY: poetry-env
2021
poetry-env:
21-
poetry config virtualenvs.in-project true
22+
$(POETRY) config virtualenvs.in-project true
2223

2324
.PHONY: publish
2425
publish:
25-
poetry publish --build
26+
$(POETRY) publish --build
2627

2728
#* Installation
2829
.PHONY: install
2930
install: poetry-env
30-
poetry lock -n && poetry export --without-hashes > requirements-lock.txt
31-
poetry install -n
31+
$(POETRY) lock -n && poetry export --without-hashes > requirements-lock.txt
32+
$(POETRY) install -n
3233
-poetry run mypy --install-types --non-interactive ./
3334

3435
.PHONY: install-dev
3536
install-dev: poetry-env install
36-
poetry install -n --with dev
37+
$(POETRY) install -n --with dev
3738

3839
.PHONY: pre-commit-install
3940
pre-commit-install:
40-
poetry run pre-commit install
41+
$(POETRY) run pre-commit install
4142

4243

4344
#* Formatters
4445
.PHONY: codestyle
4546
codestyle:
46-
poetry run ruff format --config=pyproject.toml stac_model tests
47+
$(POETRY) run ruff format --config=pyproject.toml stac_model tests
4748

4849
.PHONY: format
4950
format: codestyle
@@ -61,29 +62,29 @@ check-all: check
6162

6263
.PHONY: mypy
6364
mypy:
64-
poetry run mypy --config-file pyproject.toml ./
65+
$(POETRY) run mypy --config-file pyproject.toml ./
6566

6667
.PHONY: check-mypy
6768
check-mypy: mypy
6869

6970
.PHONY: check-safety
7071
check-safety:
71-
poetry check
72-
poetry run safety check --full-report
73-
poetry run bandit -ll --recursive stac_model tests
72+
$(POETRY) check
73+
$(POETRY) run safety check --full-report
74+
$(POETRY) run bandit -ll --recursive stac_model tests
7475

7576
.PHONY: lint
7677
lint:
77-
poetry run ruff --config=pyproject.toml ./
78-
poetry run pydocstyle --count --config=pyproject.toml ./
79-
poetry run pydoclint --config=pyproject.toml ./
78+
$(POETRY) run ruff --config=pyproject.toml ./
79+
$(POETRY) run pydocstyle --count --config=pyproject.toml ./
80+
$(POETRY) run pydoclint --config=pyproject.toml ./
8081

8182
.PHONY: check-lint
8283
check-lint: lint
8384

8485
.PHONY: format-lint
8586
format-lint:
86-
poetry run ruff --config=pyproject.toml --fix ./
87+
$(POETRY) run ruff --config=pyproject.toml --fix ./
8788

8889
.PHONY: install-npm
8990
install-npm:
@@ -113,7 +114,7 @@ lint-all: lint mypy check-safety check-markdown
113114

114115
.PHONY: update-dev-deps
115116
update-dev-deps:
116-
poetry up --only=dev-dependencies --latest
117+
$(POETRY) up --only=dev-dependencies --latest
117118

118119
#* Cleaning
119120
.PHONY: pycache-remove

README.md

Lines changed: 48 additions & 15 deletions
@@ -224,14 +224,18 @@ It is recommended to define `accelerator` with one of the following values:
224224
- `intel-ipex-gpu` for models optimized with IPEX for Intel GPUs
225225
- `macos-arm` for models trained on Apple Silicon
226226

227-
> :warning: <br>
227+
<!-- lint disable no-undefined-references -->
228+
229+
> [!WARNING]
228230
> If `mlm:accelerator = amd64`, this explicitly indicates that the model does not (and will not try to) use any
229231
> accelerator, even if some are available from the runtime environment. This is to be distinguished from
230232
> the value `mlm:accelerator = null`, which means that the model *could* make use of some accelerators if provided,
231233
> but is not constrained by any specific one. To improve comprehension by users, it is recommended that any model
232234
> using `mlm:accelerator = amd64` also set explicitly `mlm:accelerator_constrained = true` to illustrate that the
233235
> model **WILL NOT** use accelerators, although the hardware resolution should be identical nonetheless.
234236
237+
<!-- lint enable no-undefined-references -->
238+
235239
When `mlm:accelerator = null` is employed, the value of `mlm:accelerator_constrained` can be ignored, since even if
236240
set to `true`, there would be no `accelerator` to contain against. To avoid confusion, it is suggested to set the
237241
`mlm:accelerator_constrained = false` or omit the field entirely in this case.
@@ -265,20 +269,37 @@ representing bands information, including notably the `nodata` value,
265269
the `data_type` (see also [Data Type Enum](#data-type-enum)),
266270
and [Common Band Names][stac-band-names].
267271

268-
> :information_source: <br>
272+
<!-- lint disable no-undefined-references -->
273+
274+
> [!WARNING]
275+
> Only versions `v1.x` of `eo` and `raster` are supported to provide `mlm:input` band references.
276+
> Versions `2.x` of those extensions rely on the [STAC 1.1 - Band Object][stac-1.1-band] instead.
277+
> If those versions are desired, consider migrating your MLM definition to use [STAC 1.1 - Band Object][stac-1.1-band]
278+
> as well for referencing `mlm:input` with band names.
279+
280+
> [!NOTE]
269281
> Due to how the schema for [`eo:bands`][stac-eo-band] is defined, it is not sufficient to *only* provide
270282
> the `eo:bands` property at the STAC Item level. The schema validation of the EO extension explicitly looks
271283
> for a corresponding set of bands under an Asset, and if none is found, it disallows `eo:bands` in the Item properties.
272284
> Therefore, `eo:bands` should either be specified *only* under the Asset containing the `mlm:model` role
273285
> (see [Model Asset](#model-asset)), or define them *both* under the Asset and Item properties. If the second
274286
> approach is selected, it is recommended that the `eo:bands` under the Asset contains only the `name` or the
275287
> `common_name` property, such that all other details about the bands are defined at the Item level.
288+
> An example of such representation is provided in
289+
> [examples/item_eo_bands_summarized.json](examples/item_eo_bands_summarized.json).
290+
> <br><br>
291+
> For an example where `eo:bands` are entirely defined in the Asset on their own, please refer to
292+
> [examples/item_eo_bands.json](examples/item_eo_bands.json) instead.
276293
> <br><br>
277294
> For more details, refer to [stac-extensions/eo#12](https://github.com/stac-extensions/eo/issues/12).
278295
> <br>
279-
> For an example, please refer to [examples/item_eo_bands.json](examples/item_eo_bands.json).
280-
> Notably in this example, the `assets.weights.eo:bands` property provides the `name` to fulfill the Asset requirement,
281-
> while all additional band details are provided in `properties.eo:bands`.
296+
297+
> [!NOTE]
298+
> When using `raster:bands`, and additional `name` parameter **MUST** be provided for each band. This parameter
299+
> is not defined in `raster` extension itself, but is permitted. This addition is required to ensure
300+
> that `mlm:input` bands referenced by name can be associated to their respective `raster:bands` definitions.
301+
302+
<!-- lint enable no-undefined-references -->
282303

283304
Only bands used as input to the model should be included in the MLM `bands` field.
284305
To avoid duplicating the information, MLM only uses the `name` of whichever "Band Object" is defined in the STAC Item.
@@ -294,12 +315,12 @@ to normalize all bands, rather than normalizing the values over a single product
294315
applied differently for distinct [Model Input](#model-input-object) definitions, in order to adjust for intrinsic
295316
properties of the model.
296317

297-
[stac-1.1-band]: https://github.com/radiantearth/stac-spec/pull/1254
298-
[stac-1.1-stats]: https://github.com/radiantearth/stac-spec/blob/bands/item-spec/common-metadata.md#statistics-object
299-
[stac-eo-band]: https://github.com/stac-extensions/eo?tab=readme-ov-file#band-object
300-
[stac-raster-band]: https://github.com/stac-extensions/raster?tab=readme-ov-file#raster-band-object
301-
[stac-raster-stats]: https://github.com/stac-extensions/raster?tab=readme-ov-file#statistics-object
302-
[stac-band-names]: https://github.com/stac-extensions/eo?tab=readme-ov-file#common-band-names
318+
[stac-1.1-band]: https://github.com/radiantearth/stac-spec/blob/v1.1.0/commons/common-metadata.md#bands
319+
[stac-1.1-stats]: https://github.com/radiantearth/stac-spec/blob/v1.1.0/commons/common-metadata.md#statistics-object
320+
[stac-eo-band]: https://github.com/stac-extensions/eo/tree/v1.1.0#band-object
321+
[stac-raster-band]: https://github.com/stac-extensions/raster/tree/v1.1.0#raster-band-object
322+
[stac-raster-stats]: https://github.com/stac-extensions/raster/tree/v1.1.0#statistics-object
323+
[stac-band-names]: https://github.com/stac-extensions/eo#common-band-names
303324

304325
#### Model Band Object
305326

@@ -309,10 +330,14 @@ properties of the model.
309330
| format | string | The type of expression that is specified in the `expression` property. |
310331
| expression | \* | An expression compliant with the `format` specified. The expression can be applied to any data type and depends on the `format` given. |
311332

312-
> :information_source: <br>
333+
<!-- lint disable no-undefined-references -->
334+
335+
> [!NOTE]
313336
> Although `format` and `expression` are not required in this context, they are mutually dependent on each other. <br>
314337
> See also [Processing Expression](#processing-expression) for more details and examples.
315338
339+
<!-- lint enable no-undefined-references -->
340+
316341
The `format` and `expression` properties can serve multiple purpose.
317342

318343
1. Applying a band-specific pre-processing step,
@@ -441,14 +466,18 @@ the following formats are recommended as alternative scripts and function refere
441466
| `docker` | string | An URI with image and tag to a Docker. | `ghcr.io/NAMESPACE/IMAGE_NAME:latest` |
442467
| `uri` | string | An URI to some binary or script. | `{"href": "https://raw.githubusercontent.com/ORG/REPO/TAG/package/cli.py", "type": "text/x-python"}` |
443468

444-
> :information_source: <br>
469+
<!-- lint disable no-undefined-references -->
470+
471+
> [!NOTE]
445472
> Above definitions are only indicative, and more can be added as desired with even more custom definitions.
446473
> It is left as an implementation detail for users to resolve how these expressions should be handled at runtime.
447474
448-
> :warning: <br>
475+
> [!WARNING]
449476
> See also discussion regarding additional processing expressions:
450477
> [stac-extensions/processing#31](https://github.com/stac-extensions/processing/issues/31)
451478
479+
<!-- lint enable no-undefined-references -->
480+
452481
[stac-proc-expr]: https://github.com/stac-extensions/processing#expression-object
453482

454483
### Model Output Object
@@ -543,10 +572,14 @@ In order to provide more context, the following roles are also recommended were
543572
| mlm:model | `model` | Required role for [Model Asset](#model-asset). |
544573
| mlm:source_code | `code` | Required role for [Model Asset](#source-code-asset). |
545574

546-
> :information_source: <br>
575+
<!-- lint disable no-undefined-references -->
576+
577+
> [!NOTE]
547578
> (*) These roles are offered as direct conversions from the previous extension
548579
> that provided [ML-Model Asset Roles][ml-model-asset-roles] to provide easier upgrade to the MLM extension.
549580
581+
<!-- lint enable no-undefined-references -->
582+
550583
[ml-model-asset-roles]: https://github.com/stac-extensions/ml-model?tab=readme-ov-file#asset-objects
551584

552585
### Model Asset
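
As a reading aid for the `mlm:accelerator` warning in the README diff above, the following Python sketch spells out how the described combinations might be interpreted. The function `describe_accelerator` is hypothetical and not part of `stac-model`. The `amd64` and `null` cases follow the warning text; the handling of other accelerator values with `mlm:accelerator_constrained` is an assumption about that field's intent.

```python
def describe_accelerator(properties: dict) -> str:
    """Summarize accelerator use from MLM properties (illustrative only)."""
    accelerator = properties.get("mlm:accelerator")
    constrained = properties.get("mlm:accelerator_constrained", False)

    if accelerator is None:
        # null: the model could use accelerators if provided; the
        # constrained flag can be ignored in this case.
        return "may use any available accelerator"
    if accelerator == "amd64":
        # amd64: the model does not, and will not try to, use an accelerator.
        return "will not use any accelerator"
    if constrained:
        # Assumption: constrained means only this accelerator is suitable.
        return f"requires {accelerator}"
    return f"intended for {accelerator}, others may work"


print(describe_accelerator({"mlm:accelerator": "amd64",
                            "mlm:accelerator_constrained": True}))
print(describe_accelerator({"mlm:accelerator": None}))
```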

README_DLM_LEGACY.md

Lines changed: 5 additions & 1 deletion
@@ -1,9 +1,13 @@
11
# Deep Learning Model (DLM) Extension
22

3-
> :information_source: <br>
3+
<!-- lint disable no-undefined-references -->
4+
5+
> [!NOTE]
46
> This is legacy documentation references of Deep Learning Model extension
57
> preceding the current Machine Learning Model (MLM) extension.
68
9+
<!-- lint enable no-undefined-references -->
10+
711
Check the original [Technical Report](https://github.com/crim-ca/CCCOT03/raw/main/CCCOT03_Rapport%20Final_FINAL_EN.pdf).
812

913
![Image Description](https://i.imgur.com/cVAg5sA.png)
