Repodata API #20

Merged
merged 12 commits into from
Jan 11, 2024

Conversation

schuylermartin45
Collaborator

@akabanovs @cbouss Adding you both for visibility. You two shouldn't feel required to review this one, unless you want to.

First pass at a new Repodata API that is type-safe and validates the JSON we pull in.

It also makes version-ordering comparisons between two packages easier. Comparison uses the `VersionOrder` class from `conda` and also takes into account the `build_number` pulled from the `repodata.json` file.
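
To illustrate the comparison idea, here is a minimal sketch rather than the code in this PR. It assumes a simplified `_version_key` helper (hypothetical, dotted integers only) in place of conda's `VersionOrder`, and a cut-down `PackageData` with only the fields needed for ordering:

```python
from __future__ import annotations

from dataclasses import dataclass
from functools import total_ordering


def _version_key(version: str) -> tuple[int, ...]:
    """Hypothetical stand-in for conda's VersionOrder; dotted integers only."""
    return tuple(int(part) for part in version.split("."))


@total_ordering
@dataclass(eq=False)
class PackageData:
    """Cut-down record: compared by version first, then by build_number."""

    name: str
    version: str  # kept as a string; parsed only when a comparison happens
    build_number: int

    def _sort_key(self) -> tuple[tuple[int, ...], int]:
        return (_version_key(self.version), self.build_number)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, PackageData):
            return NotImplemented
        return self._sort_key() == other._sort_key()

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, PackageData):
            return NotImplemented
        return self._sort_key() < other._sort_key()
```

With `total_ordering`, defining `__eq__` and `__lt__` is enough to get `<=`, `>` and `>=` as well, so two builds of the same version are ordered by `build_number` alone.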

Collaborator

@markan markan left a comment

A few quick comments, I'll take time to read this in more detail later.
This generally looks like a good starting point, but we may want to think about common access methods and indices.

anaconda_packaging_utils/api/repodata_api.py
sha256: str
name: str
size: int
version: str
Collaborator

I appreciate that you are deferring parsing the version here.

repodata_version: int


_REPODATA_JSON_SCHEMA: Final[SchemaType] = {
Collaborator

Is this derived from https://github.com/conda/schemas?

Collaborator Author

Don't know how I missed this comment initially. No, it isn't; I didn't know this existed. I reverse-engineered this schema from examples and then ran an automated test against multiple real-world examples.

There is a new schema in the latest code for `channeldata.json`, but I only validate the `subdirs` field, which I use to look up supported architectures.
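
For readers following along, a rough sketch of the kind of per-record check being discussed. This is not the `_REPODATA_JSON_SCHEMA` from the PR; it is a hand-rolled, standard-library-only stand-in, and `_PACKAGE_FIELDS`, `SchemaError` and `validate_package_record` are hypothetical names. The field list is limited to fields mentioned in this thread:

```python
from __future__ import annotations

from typing import Any, Final

# Hypothetical, heavily reduced schema: field name -> expected Python type.
_PACKAGE_FIELDS: Final[dict[str, type]] = {
    "name": str,
    "version": str,
    "build_number": int,
    "sha256": str,
    "size": int,
}


class SchemaError(ValueError):
    """Raised when a record does not match the expected shape."""


def validate_package_record(record: Any) -> None:
    """Raise SchemaError if a required field is missing or has the wrong type."""
    if not isinstance(record, dict):
        raise SchemaError("package record must be a JSON object")
    for field, expected in _PACKAGE_FIELDS.items():
        if field not in record:
            raise SchemaError(f"missing field: {field}")
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise SchemaError(f"{field} should be {expected.__name__}")
```

Running a check like this over several real `repodata.json` files is one way to test a reverse-engineered schema against real-world data, as described above.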



@dataclass
class Repodata:

@akabanovs akabanovs Dec 5, 2023

So this seems to be something similar to the Repodata class in anaconda-channel-scanner. The only difference is that there are more checks in this implementation, and that it processes all the info. Anaconda-channel-scanner just reads what it needs for the dependency analysis and also structures the package data into a few levels: package-name/python-version/package-version/build-number. Although I was going to add a way to serialise it too.
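
As a rough illustration of that layered layout (not the actual anaconda-channel-scanner code), the nesting could look like the sketch below. `build_index` is a hypothetical helper, and it assumes each record already carries a `python_version` key, which a real `repodata.json` record does not provide directly:

```python
from __future__ import annotations

from typing import Any


def build_index(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Nest records as name -> python_version -> version -> build_number."""
    index: dict[str, Any] = {}
    for rec in records:
        builds = (
            index.setdefault(rec["name"], {})
            .setdefault(rec["python_version"], {})  # assumed, pre-computed key
            .setdefault(rec["version"], {})
        )
        builds[rec["build_number"]] = rec
    return index
```

A lookup then reads `index["numpy"]["3.11"]["1.26.0"][0]`, at the cost of committing to one access pattern up front.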

Collaborator Author

Good to know. For Mark's needs (and I think percy's as well), all the data needs to be available at once. I wanted to create some more intelligent caching setup, but it also sounds like Cheng is on the warpath to replace these unscalable JSON files with something else.

So this is more of a stop-gap until we know what the "new and improved" version will be.

@schuylermartin45 schuylermartin45 force-pushed the smartin_repodata_api branch 2 times, most recently from 259b2e0 to e80a236 on December 14, 2023 22:20
- Minor changes to how the `ApiException` classes are maintained
- This will be useful for multiple applications for us.
- Also fixes static analyzer and linter issues
- Breaks out code that can be common to other API implementations
- Rearranges the unit tests accordingly.
- General updates to our latest style choices
- Clients can use 1 function to request a serialized version of a target
  `repodata.json` file
- Adds unit test coverage
- Increases the minimum unit test coverage to a final target of 80%
- This should allow for all equivalency comparisons between two `PackageData`
  objects by `version` and `build_number`
- Adds unit tests
- Adds `TODO` remarks to add type-checks back in when changes have been made
  to `conda`
- Adds coverage around `==` and `gt` as sanity checks
- Channel/Architecture data is no longer being provided by a hard-coded mapping
  and is instead being pulled from a remote source
- Fixes typo in the Makefile
- Adds unit testing
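
A minimal sketch of the architecture lookup mentioned in the commit notes above, assuming the `channeldata.json` text has already been downloaded. `supported_subdirs` is a hypothetical name, and only the `subdirs` field is checked, in line with the partial validation described earlier in this thread:

```python
from __future__ import annotations

import json


def supported_subdirs(channeldata_text: str) -> list[str]:
    """Return the sorted `subdirs` list from a channeldata.json document."""
    data = json.loads(channeldata_text)
    subdirs = data.get("subdirs") if isinstance(data, dict) else None
    # Only `subdirs` is validated; every other field is ignored.
    if not isinstance(subdirs, list) or not all(isinstance(s, str) for s in subdirs):
        raise ValueError("channeldata.json: `subdirs` must be a list of strings")
    return sorted(subdirs)
```

Keeping the download separate from the parsing keeps the function easy to unit test without network access.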
@schuylermartin45
Collaborator Author

Marek made me realize on another PR yesterday that I accidentally duplicated myself by not getting this in sooner. So I'm going to merge this so we don't lose track of the rest of this good work.

@schuylermartin45 schuylermartin45 merged commit 28a6166 into main Jan 11, 2024
3 checks passed
@schuylermartin45 schuylermartin45 deleted the smartin_repodata_api branch January 11, 2024 15:09
3 participants