Merged
63 commits
a0b2478
current state
TejasMorbagal Dec 13, 2024
42367da
implemented dataset stac generator class and unit tests
TejasMorbagal Dec 23, 2024
76c7e82
refactor
TejasMorbagal Dec 23, 2024
79321d9
The parameters of build_stac_collection have been moved to the class …
TejasMorbagal Dec 23, 2024
b852315
modified logic for open_data
TejasMorbagal Dec 23, 2024
6830d27
reordered imports and made logger part of the class
TejasMorbagal Dec 27, 2024
e1eda9f
refactor
TejasMorbagal Dec 27, 2024
ef97cc0
updated git ignore to Stop tracking git.yaml and dataset-config.yaml
TejasMorbagal Dec 27, 2024
6d5d9af
updated constants
TejasMorbagal Dec 27, 2024
f8b139f
refactor
TejasMorbagal Dec 27, 2024
e4fe7f2
add main func for cli
TejasMorbagal Dec 27, 2024
3d35cd0
publish as api and cli
TejasMorbagal Dec 27, 2024
48a01de
refactor
TejasMorbagal Dec 27, 2024
e8a2457
updated env and pyproject.toml
TejasMorbagal Dec 27, 2024
16c245a
cli as module
TejasMorbagal Dec 27, 2024
15c63da
updated constants
TejasMorbagal Dec 27, 2024
de4c5c6
latest state
TejasMorbagal Dec 27, 2024
58630b8
latest state 30.12
TejasMorbagal Dec 30, 2024
aadcb07
refactor
TejasMorbagal Dec 30, 2024
5a40d76
support cf_parameter in stac generator
TejasMorbagal Dec 30, 2024
4f95d77
modify the get_schema_uri method to handle osc and cf ext uris
TejasMorbagal Dec 30, 2024
bfe04c8
refactor line wrap at 88 chars
TejasMorbagal Dec 30, 2024
f5ad7f1
python 3.10 typing updates and black code formatting
TejasMorbagal Dec 30, 2024
87bb0bd
updated copyright notices
TejasMorbagal Dec 30, 2024
bbd4746
updated environment.yml and pyproject.toml
TejasMorbagal Dec 30, 2024
a567981
updated environment.yml and pyproject.toml
TejasMorbagal Dec 30, 2024
76da5aa
refactor
TejasMorbagal Jan 2, 2025
c72044e
refactor imports
TejasMorbagal Jan 2, 2025
5e4b2ed
update environment.yml
TejasMorbagal Jan 2, 2025
23568fb
make directories module
TejasMorbagal Jan 2, 2025
b9ddbec
introduced logging
TejasMorbagal Jan 2, 2025
43cddde
unit test for publish api
TejasMorbagal Jan 2, 2025
be055af
unit test workflow
TejasMorbagal Jan 2, 2025
bdc8bf1
updated unit tests and workflow
TejasMorbagal Jan 2, 2025
b5863a6
updated workflow
TejasMorbagal Jan 2, 2025
94bad29
updated workflow
TejasMorbagal Jan 2, 2025
1ef4c9d
updated ENV
TejasMorbagal Jan 2, 2025
d03479c
updated unit test
TejasMorbagal Jan 2, 2025
1749ae8
upload codecov
TejasMorbagal Jan 2, 2025
ecc7951
badges
TejasMorbagal Jan 2, 2025
92537df
badges
TejasMorbagal Jan 2, 2025
d301f82
code formatting
TejasMorbagal Jan 2, 2025
40e94e0
updated README.md
TejasMorbagal Jan 6, 2025
3fda30d
extended get_spatial_extent to handle ds with latitude and longitude …
TejasMorbagal Jan 9, 2025
c39b4be
adapted doc strings to follow google style
TejasMorbagal Jan 9, 2025
0306001
adapted doc strings to follow google style
TejasMorbagal Jan 9, 2025
ead03d9
renamed api module to tools
TejasMorbagal Jan 9, 2025
0aab936
refactor doc string
TejasMorbagal Jan 9, 2025
021647d
update copywrite notices
TejasMorbagal Jan 9, 2025
4b2987e
refactor
TejasMorbagal Jan 9, 2025
0993540
update README.md
TejasMorbagal Jan 9, 2025
8238380
refactor
TejasMorbagal Jan 9, 2025
04428d8
refactor test case
TejasMorbagal Jan 9, 2025
abe73f9
black code formatting
TejasMorbagal Jan 9, 2025
737e3f3
Update README.md
TejasMorbagal Jan 9, 2025
f456821
Update README.md
TejasMorbagal Jan 9, 2025
b12cb84
refactor
TejasMorbagal Jan 9, 2025
1b7202e
removed .gitaccess as a option in the cli cmd
TejasMorbagal Jan 9, 2025
452c104
dataset_config is not an option but an argument
TejasMorbagal Jan 9, 2025
965674b
pin zarr version to fix failing ci
TejasMorbagal Jan 10, 2025
dc7d726
pin zarr version to fix failing ci
TejasMorbagal Jan 10, 2025
a4efe5a
pin zarr version to fix failing ci
TejasMorbagal Jan 10, 2025
e69805e
updated read me
TejasMorbagal Jan 10, 2025
36 changes: 36 additions & 0 deletions .github/workflows/unittest-workflow.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
name: Unittest deep-code

on:
  push:
  release:
    types: [published]

jobs:
  unittest:
    runs-on: ubuntu-latest
    steps:
      - name: checkout deep-code
        uses: actions/checkout@v4

      - name: Set up MicroMamba
        uses: mamba-org/setup-micromamba@v1
        with:
          environment-file: environment.yml

      - name: Install deep-code in editable mode
        shell: bash -l {0}
        run: |
          cd /home/runner/work/deep-code/deep-code
          pip install -e .

      - name: Run unit tests
        shell: bash -l {0}
        run: |
          cd /home/runner/work/deep-code/deep-code
          pytest --cov=deep_code --cov-report=xml

      - name: Upload coverage reports to Codecov
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          slug: deepesdl/deep-code
4 changes: 4 additions & 0 deletions .gitignore
@@ -160,3 +160,7 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# Exclude sensitive configuration files from version control
.gitaccess
dataset-config.yaml
88 changes: 87 additions & 1 deletion README.md
@@ -1 +1,87 @@
# deep-code
# deep-code

[![Build Status](https://github.com/deepesdl/deep-code/actions/workflows/unittest-workflow.yaml/badge.svg)](https://github.com/deepesdl/deep-code/actions/workflows/unittest-workflow.yaml)
[![codecov](https://codecov.io/gh/deepesdl/deep-code/graph/badge.svg?token=47MQXOXWOK)](https://codecov.io/gh/deepesdl/deep-code)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License](https://img.shields.io/github/license/deepesdl/deep-code)](https://github.com/deepesdl/deep-code/blob/main/LICENSE)

`deep-code` is a lightweight Python tool comprising a command-line interface (CLI)
and a Python API, providing utilities that aid the integration of DeepESDL datasets
and experiments with EarthCODE.

> Review comment (Contributor): Describe general purpose here, the concept and its features.

## Setup

## Install
`deep-code` will be available on PyPI and conda-forge. Until the stable release,
developers and contributors can follow the steps below to install deep-code.

## Installing from the repository for Developers

To install deep-code directly from the git repository, clone the repository, and execute the steps below:

```commandline
conda env create
conda activate deep-code
pip install -e .
```

This installs all the dependencies of `deep-code` into a fresh conda environment,
and installs deep-code from the repository into the same environment.

## Testing

To run the unit test suite:

```commandline
pytest
```

To analyze test coverage:

```shell
pytest --cov=deep_code
```

To produce an HTML coverage report:

```commandline
pytest --cov-report html --cov=deep_code
```

## deep_code usage

`deep_code` provides a command-line tool called `deep-code`, which has several subcommands
providing different utility functions.
Use the `--help` option with these subcommands to get more details on usage.

### deep-code publish-dataset

Publish a dataset which is a result of an experiment to the EarthCODE
open-science catalog.

```commandline
deep-code publish-dataset /path/to/dataset-config.yaml
```

#### .gitaccess example

```
github-username: your-git-user
github-token: personal access token
```
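
How `.gitaccess` is consumed is only visible through the unit tests in this PR, which expect a `ValueError` when the credentials are absent. The check can be sketched as follows, assuming a flat `key: value` file. `parse_flat_yaml` and `load_credentials` are hypothetical helpers for illustration, not part of `deep-code`; the real tool presumably reads the file with a YAML library.

```python
def parse_flat_yaml(text: str) -> dict[str, str]:
    """Parse flat 'key: value' lines (a stand-in for a real YAML loader)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not a key/value pair.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


def load_credentials(text: str) -> tuple[str, str]:
    """Return (username, token); raise if either entry is missing or empty."""
    config = parse_flat_yaml(text)
    username = config.get("github-username")
    token = config.get("github-token")
    if not username or not token:
        raise ValueError("GitHub credentials are missing in the `.gitaccess` file.")
    return username, token
```

Calling `load_credentials` on the example above would return the username and token as a tuple; an empty file raises the error.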

#### dataset-config.yaml example

```
dataset-id: hydrology-1D-0.009deg-100x60x60-3.0.2.zarr
collection-id: hydrology

#non-mandatory
documentation-link: https://deepesdl.readthedocs.io/en/latest/datasets/hydrology-1D-0-009deg-100x60x60-3-0-2-zarr/
access-link: s3://test
dataset-status: completed
dataset-region: global
dataset-theme: ["ocean", "environment"]
cf-parameter: [{"Name" : "hydrology"}]
```

`dataset-id` must be a valid dataset ID from the `deep-esdl-public` S3 bucket or your team bucket.
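
The split between mandatory and non-mandatory keys can be illustrated with a small check on the already-loaded configuration. This is a sketch only: `check_dataset_config` and `MANDATORY_KEYS` are hypothetical names, and the error message is taken from the unit tests in this PR rather than from the implementation itself.

```python
MANDATORY_KEYS = ("dataset-id", "collection-id")


def check_dataset_config(config: dict) -> dict:
    """Validate the mandatory IDs and return the remaining optional entries."""
    missing = [key for key in MANDATORY_KEYS if not config.get(key)]
    if missing:
        raise ValueError(
            "Dataset ID or Collection ID is missing in the dataset-config.yaml file."
        )
    # Everything except the two IDs is treated as optional metadata.
    return {key: value for key, value in config.items() if key not in MANDATORY_KEYS}
```

A configuration with both IDs passes and yields only its optional entries; one without `dataset-id` raises the `ValueError`.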
4 changes: 2 additions & 2 deletions deep_code/__init__.py
@@ -1,5 +1,5 @@
# The MIT License (MIT)
# Copyright (c) 2024 by the xcube development team and contributors
# Copyright (c) 2024 by DeepESDL and Brockmann Consult GmbH
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
@@ -21,4 +21,4 @@

from .version import version

__version__ = version
__version__ = version
3 changes: 0 additions & 3 deletions deep_code/api/__init__.py

This file was deleted.

Empty file removed deep_code/api/check_repository.py
Empty file.
1 change: 0 additions & 1 deletion deep_code/api/new.py

This file was deleted.

1 change: 0 additions & 1 deletion deep_code/api/publish_experiments.py

This file was deleted.

1 change: 0 additions & 1 deletion deep_code/api/publish_products.py

This file was deleted.

1 change: 0 additions & 1 deletion deep_code/api/setup_ci.py

This file was deleted.

Empty file removed deep_code/api/test.py
Empty file.
3 changes: 3 additions & 0 deletions deep_code/cli/__init__.py
@@ -0,0 +1,3 @@
# Copyright (c) 2025 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.
20 changes: 20 additions & 0 deletions deep_code/cli/main.py
@@ -0,0 +1,20 @@
#!/usr/bin/env python3

# Copyright (c) 2025 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.

import click

from deep_code.cli.publish import publish_dataset


@click.group()
def main():
    """Deep Code CLI."""
    pass


main.add_command(publish_dataset)
if __name__ == "__main__":
    main()
21 changes: 21 additions & 0 deletions deep_code/cli/publish.py
@@ -0,0 +1,21 @@
#!/usr/bin/env python3

# Copyright (c) 2025 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.

import click

from deep_code.tools.publish import DatasetPublisher


@click.command(name="publish-dataset")
@click.argument(
    "dataset_config",
    type=click.Path(exists=True)
)
def publish_dataset(dataset_config):
    """Request publishing a dataset to the open science catalogue.
    """
    publisher = DatasetPublisher()
    publisher.publish_dataset(dataset_config_path=dataset_config)
11 changes: 11 additions & 0 deletions deep_code/constants.py
@@ -0,0 +1,11 @@
#!/usr/bin/env python3

# Copyright (c) 2024 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.

OSC_SCHEMA_URI = "https://stac-extensions.github.io/osc/v1.0.0-rc.3/schema.json"
CF_SCHEMA_URI = "https://stac-extensions.github.io/cf/v0.2.0/schema.json"
OSC_REPO_OWNER = "ESA-EarthCODE"
OSC_REPO_NAME = "open-science-catalog-metadata-testing"
OSC_BRANCH_NAME = "add-new-collection"
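
In STAC, extension schemas are declared by listing their URIs in a collection's `stac_extensions` field. A minimal sketch of how the two schema constants might feed that list, assuming the CF extension is declared only when a `cf-parameter` is configured. This diff does not show how `deep-code` actually decides, and `extension_uris` is a hypothetical helper.

```python
OSC_SCHEMA_URI = "https://stac-extensions.github.io/osc/v1.0.0-rc.3/schema.json"
CF_SCHEMA_URI = "https://stac-extensions.github.io/cf/v0.2.0/schema.json"


def extension_uris(cf_parameter=None) -> list[str]:
    """Return the schema URIs to list under `stac_extensions` (assumed rule)."""
    uris = [OSC_SCHEMA_URI]
    # Assumption: the CF extension is only declared when CF parameters exist.
    if cf_parameter:
        uris.append(CF_SCHEMA_URI)
    return uris
```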
3 changes: 3 additions & 0 deletions deep_code/tests/tools/__init__.py
@@ -0,0 +1,3 @@
# Copyright (c) 2025 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.
122 changes: 122 additions & 0 deletions deep_code/tests/tools/test_publish.py
@@ -0,0 +1,122 @@
import pytest
from unittest.mock import patch, MagicMock, mock_open

from deep_code.tools.publish import DatasetPublisher


class TestDatasetPublisher:
    @patch("deep_code.tools.publish.fsspec.open")
    def test_init_missing_credentials(self, mock_fsspec_open):
        mock_fsspec_open.return_value.__enter__.return_value = mock_open(
            read_data="{}"
        )()

        with pytest.raises(
            ValueError, match="GitHub credentials are missing in the `.gitaccess` file."
        ):
            DatasetPublisher()

    @patch("deep_code.tools.publish.fsspec.open")
    def test_publish_dataset_missing_ids(self, mock_fsspec_open):
        git_yaml_content = """
        github-username: test-user
        github-token: test-token
        """
        dataset_yaml_content = """
        collection-id: test-collection
        """
        mock_fsspec_open.side_effect = [
            mock_open(read_data=git_yaml_content)(),
            mock_open(read_data=dataset_yaml_content)(),
        ]

        publisher = DatasetPublisher()

        with pytest.raises(
            ValueError,
            match="Dataset ID or Collection ID is missing in the "
            "dataset-config.yaml file.",
        ):
            publisher.publish_dataset("/path/to/dataset-config.yaml")

    @patch("deep_code.utils.github_automation.os.chdir")
    @patch("deep_code.utils.github_automation.subprocess.run")
    @patch("deep_code.utils.github_automation.os.path.expanduser", return_value="/tmp")
    @patch("requests.post")
    @patch("deep_code.utils.github_automation.GitHubAutomation")
    @patch("deep_code.tools.publish.fsspec.open")
    def test_publish_dataset_success(
        self,
        mock_fsspec_open,
        mock_github_automation,
        mock_requests_post,
        mock_expanduser,
        mock_subprocess_run,
        mock_chdir,
    ):

        # Mock the YAML reads
        git_yaml_content = """
        github-username: test-user
        github-token: test-token
        """
        dataset_yaml_content = """
        dataset-id: test-dataset
        collection-id: test-collection
        documentation-link: http://example.com/doc
        access-link: http://example.com/access
        dataset-status: ongoing
        dataset-region: Global
        dataset-theme: ["climate"]
        cf-parameter: []
        """
        mock_fsspec_open.side_effect = [
            mock_open(read_data=git_yaml_content)(),
            mock_open(read_data=dataset_yaml_content)(),
        ]

        # Mock GitHubAutomation methods
        mock_git = mock_github_automation.return_value
        mock_git.fork_repository.return_value = None
        mock_git.clone_repository.return_value = None
        mock_git.create_branch.return_value = None
        mock_git.add_file.return_value = None
        mock_git.commit_and_push.return_value = None
        mock_git.create_pull_request.return_value = "http://example.com/pr"
        mock_git.clean_up.return_value = None

        # Mock subprocess.run & os.chdir
        mock_subprocess_run.return_value = None
        mock_chdir.return_value = None

        # Mock STAC generator
        mock_collection = MagicMock()
        mock_collection.to_dict.return_value = {
            "type": "Collection",
            "id": "test-collection",
            "description": "A test STAC collection",
            "extent": {
                "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
                "temporal": {"interval": [["2023-01-01T00:00:00Z", None]]},
            },
            "links": [],
            "stac_version": "1.0.0",
        }
        with patch("deep_code.tools.publish.OSCProductSTACGenerator") as mock_generator:
            mock_generator.return_value.build_stac_collection.return_value = (
                mock_collection
            )

            # Instantiate & publish
            publisher = DatasetPublisher()
            publisher.publish_dataset("/fake/path/to/dataset-config.yaml")

        # Assert that git clone was called with /tmp/temp_repo.
        # Because expanduser("~") is patched to /tmp, the actual path is /tmp/temp_repo.
        auth_url = "https://test-user:test-token@github.com/test-user/open-science-catalog-metadata-testing.git"
        mock_subprocess_run.assert_any_call(
            ["git", "clone", auth_url, "/tmp/temp_repo"], check=True
        )

        # Also confirm we changed directories to /tmp/temp_repo
        mock_chdir.assert_any_call("/tmp/temp_repo")
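
The clone URL asserted in the test above embeds the GitHub username and token and points at the user's fork of the catalog repository. Its shape can be sketched as below. `build_auth_clone_url` is a hypothetical helper inferred from the asserted string, not a function shown in this diff.

```python
OSC_REPO_NAME = "open-science-catalog-metadata-testing"


def build_auth_clone_url(
    username: str, token: str, repo_name: str = OSC_REPO_NAME
) -> str:
    """Build an HTTPS clone URL for the user's fork with embedded credentials."""
    # The fork lives under the user's own account, hence `username` appears twice.
    return f"https://{username}:{token}@github.com/{username}/{repo_name}.git"
```

With the test credentials, `build_auth_clone_url("test-user", "test-token")` reproduces the `auth_url` string used in the assertion.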
3 changes: 3 additions & 0 deletions deep_code/tests/utils/__init__.py
@@ -0,0 +1,3 @@
# Copyright (c) 2025 by Brockmann Consult GmbH
# Permissions are hereby granted under the terms of the MIT License:
# https://opensource.org/licenses/MIT.