Dataset Versions API: add version differences #11188

Open
ekraffmiller opened this issue Jan 24, 2025 · 8 comments
Labels: FY25 Sprint 16 (2025-01-29 - 2025-02-12); GREI Re-arch (Issues related to the GREI Dataverse rearchitecture); Original size: 30; Size: 30 (A percentage of a sprint. 21 hours. Formerly size:33); SPA.Q1.2 Dataset Page: Versions Tab; SPA (These changes are required for the Dataverse SPA); Type: Feature (a feature request)

Comments

@ekraffmiller
Contributor

Overview of the Feature Request
For the SPA, we need to extend the List Dataset Versions API so that it includes the differences between consecutive versions, for display in the Dataset Versions tab.
For example,

Image

There is already an endpoint for getting the differences between any two versions - https://guides.dataverse.org/en/latest/api/native-api.html#compare-versions-of-a-dataset. The same logic can be re-used to return the differences between versions.
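As a rough sketch of how the existing compare endpoint might be reused, the snippet below pairs each version with the one that follows it and builds the matching compare path. The path template is the one used by the compare endpoint; the function name `adjacent_compare_paths` and the assumption that version identifiers arrive ordered oldest first are illustrative only and not part of Dataverse.

```python
from typing import List


def adjacent_compare_paths(dataset_id: int, version_ids: List[str]) -> List[str]:
    """Build one compare path per pair of consecutive versions.

    `version_ids` is assumed to be ordered oldest first (an assumption of
    this sketch); each path compares a version with the one that follows it.
    """
    template = "/api/datasets/{id}/versions/{v1}/compare/{v2}"
    return [
        template.format(id=dataset_id, v1=older, v2=newer)
        for older, newer in zip(version_ids, version_ids[1:])
    ]


if __name__ == "__main__":
    # Hypothetical dataset id and version identifiers, for illustration only.
    for path in adjacent_compare_paths(42, ["1.0", "2.0", "3.0"]):
        print(path)
```

A dataset with N versions yields N-1 paths, so a single-version dataset produces an empty list.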

What kind of user is the feature intended for?
(Example users roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin)
API user

What inspired the request?
SPA: Dataset Versions Tab

What existing behavior do you want changed?
Add version difference information to List Dataset Versions API

@ekraffmiller ekraffmiller added the Type: Feature a feature request label Jan 24, 2025
@ekraffmiller ekraffmiller added SPA.Q1.2 Dataset Page: Versions Tab SPA These changes are required for the Dataverse SPA GREI Re-arch Issues related to the GREI Dataverse rearchitecture labels Jan 24, 2025
@johannes-darms
Contributor

Please reconsider this feature. We have datasets with >100 versions and quite a few differences between versions. I think the payload will be huge and calculating all those diffs will take some time, resulting in a slow response time. I'd rather suggest a separate request for each difference.
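One way the "separate request for each difference" idea could look on the client side is sketched below: a difference is fetched only when a particular version pair is asked for, and the result is cached so that repeated views cost nothing. The class `LazyVersionDiffs` and the injected `fetch` callable are hypothetical names for this illustration; a real client would put its HTTP call to the compare endpoint behind `fetch`.

```python
from typing import Callable, Dict, Tuple


class LazyVersionDiffs:
    """Fetch the difference for one version pair only when it is asked for."""

    def __init__(self, fetch: Callable[[str, str], dict]) -> None:
        # `fetch` is a hypothetical callable that performs one compare request.
        self._fetch = fetch
        self._cache: Dict[Tuple[str, str], dict] = {}

    def get(self, older: str, newer: str) -> dict:
        """Return the cached difference, fetching it on first use."""
        key = (older, newer)
        if key not in self._cache:
            self._cache[key] = self._fetch(older, newer)
        return self._cache[key]


if __name__ == "__main__":
    calls = []

    def fake_fetch(older: str, newer: str) -> dict:
        # Stand-in for a real HTTP call; records how often it is invoked.
        calls.append((older, newer))
        return {"from": older, "to": newer}

    diffs = LazyVersionDiffs(fake_fetch)
    diffs.get("1.0", "2.0")
    diffs.get("1.0", "2.0")  # second call is served from the cache
    print("requests made:", len(calls))
```

The trade-off is more round trips for users who expand many versions, in exchange for a small initial payload on datasets with a long history.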

@GPortas GPortas added Size: 30 A percentage of a sprint. 21 hours. (formerly size:33) Original size: 30 labels Jan 29, 2025
@sekmiller sekmiller moved this to SPRINT READY in IQSS Dataverse Project Jan 29, 2025
@GPortas GPortas moved this from SPRINT READY to This Sprint 🏃‍♀️ 🏃 in IQSS Dataverse Project Jan 29, 2025
@GPortas GPortas added the FY25 Sprint 16 FY25 Sprint 16 (2025-01-29 - 2025-02-12) label Jan 29, 2025
@sekmiller
Contributor

@johannes-darms, @ekraffmiller in the JSF version we use an asynchronous load/update of the dataset page, since the default tab is the file list. If the user clicks into the versions tab before the processing is finished, they may get a "Loading..." message. I don't think we've gotten too many complaints about this.
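The load-in-the-background pattern described above can be illustrated with a small, generic sketch. It is not the JSF implementation: `VersionsTab`, `load`, and the thread-pool approach are assumptions chosen only to show a tab that answers "Loading..." until its data is ready.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class VersionsTab:
    """Start loading the version summaries early; render a placeholder until done."""

    def __init__(self, load: Callable[[], str], executor: ThreadPoolExecutor) -> None:
        # `load` is a hypothetical callable that computes the tab's content.
        self._future: Future = executor.submit(load)

    def render(self) -> str:
        if not self._future.done():
            return "Loading..."
        return self._future.result()


if __name__ == "__main__":
    gate = threading.Event()

    def slow_load() -> str:
        # Simulates a slow difference calculation that finishes once released.
        gate.wait(timeout=5)
        return "version summaries"

    with ThreadPoolExecutor(max_workers=1) as pool:
        tab = VersionsTab(slow_load, pool)
        print(tab.render())  # the load has not been released yet
        gate.set()
    print(tab.render())      # the pool has shut down, so the load is complete
```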

@sekmiller sekmiller moved this from This Sprint 🏃‍♀️ 🏃 to In Progress 💻 in IQSS Dataverse Project Jan 30, 2025
@sekmiller sekmiller self-assigned this Jan 30, 2025
@sekmiller
Contributor

@ekraffmiller If the response is something like this will it work?

{
  "status": "OK",
  "data": {
    "302": {
      "versionNumber": "DRAFT",
      "summary": "Subject (1 Added, 1 Changed); Additional Citation Metadata: (2 Added); Terms of Use/Access Changed",
      "contributors": "Dataverse Admin",
      "publishedOn": ""
    },
    "301": {
      "versionNumber": "3.1",
      "summary": "Description (1 Changed); Title (Changed); Additional Citation Metadata: (1 Added); ",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-02-03"
    },
    "296": {
      "versionNumber": "3.0",
      "summary": "Files (Added: 1)",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-29"
    },
    "288": {
      "versionNumber": "2.0",
      "summary": "Files (Added: 1)",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-27"
    },
    "283": {
      "versionNumber": "1.0",
      "summary": "This is the first published version.",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-27"
    }
  }
}
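To make the shape of this response concrete, here is a minimal sketch of how a client might turn it into table rows. Because `data` is an object keyed by version id, the sketch sorts the keys itself instead of relying on key order. The function name `version_rows`, the "(not published)" placeholder, and the assumption that a higher id means a newer version are all illustrative.

```python
import json
from typing import Dict, List


def version_rows(response_text: str) -> List[Dict[str, str]]:
    """Flatten the keyed `data` object into a list of rows, newest first."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    rows = []
    # Keys are version ids as strings. Sorting them numerically in descending
    # order assumes that a higher id means a newer version (an assumption).
    for version_id in sorted(payload["data"], key=int, reverse=True):
        entry = payload["data"][version_id]
        rows.append({
            "id": version_id,
            "version": entry["versionNumber"],
            "summary": entry["summary"],
            "contributors": entry["contributors"],
            # The sample shows an empty publishedOn for the draft version.
            "publishedOn": entry["publishedOn"] or "(not published)",
        })
    return rows


if __name__ == "__main__":
    # Shortened from the sample response above.
    sample = json.dumps({
        "status": "OK",
        "data": {
            "283": {"versionNumber": "1.0",
                    "summary": "This is the first published version.",
                    "contributors": "Dataverse Admin",
                    "publishedOn": "2025-01-27"},
            "302": {"versionNumber": "DRAFT",
                    "summary": "Terms of Use/Access Changed",
                    "contributors": "Dataverse Admin",
                    "publishedOn": ""},
        },
    })
    for row in version_rows(sample):
        print(row["version"], "|", row["publishedOn"], "|", row["summary"])
```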

sekmiller added a commit that referenced this issue Feb 4, 2025
sekmiller added a commit that referenced this issue Feb 5, 2025
sekmiller added a commit that referenced this issue Feb 5, 2025
@sekmiller
Contributor

@ekraffmiller @qqmyers I have some oddball cases in this example, but I was already picking up File Metadata changes by basing the API on the dataset-versions tab:

{
  "status": "OK",
  "data": {
    "302": {
      "versionNumber": "DRAFT",
      "summary": "Subject (1 Added, 1 Changed); Additional Citation Metadata: (2 Added); Files (Changed File Metadata: 1)Terms of Use/Access Changed",
      "contributors": "Dataverse Admin",
      "publishedOn": ""
    },
    "301": {
      "versionNumber": "3.1",
      "summary": "Description (1 Changed); Title (Changed); Additional Citation Metadata: (1 Added); Files (Added: 1)",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-02-03"
    },
    "296": {
      "versionNumber": "3.0",
      "summary": "Deaccessioned Reason: There is identifiable data in one or more files. Something, something on a heap!",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-29"
    },
    "288": {
      "versionNumber": "2.0",
      "summary": "Due to the previous version being deaccessioned, there are no difference notes available for this published version.",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-27"
    },
    "283": {
      "versionNumber": "1.0",
      "summary": "Deaccessioned Reason: Not a valid dataset. Just cuz",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-01-27"
    }
  }
}

@qqmyers
Member

qqmyers commented Feb 5, 2025

Sorry - I was confusing this issue with #11198 for files. (For that - I was just seeing that the

public static Map<String,List<String>> compareFileMetadatas(FileMetadata fmdo, FileMetadata fmdn) {
creates a map of the differences and the code at turns that into Json which is sent back in the /api/datasets/{id}/versions/{versionId1}/compare/{versionId2} call.)

For this PR - would it make sense to just send the JSON differences as currently sent in /api/datasets/{id}/versions/{versionId1}/compare/{versionId2} and let the SPA create the text?

@sekmiller
Contributor

We already had the summary code written into the difference object, so it's already working. Most of what I'm doing here is adapting the JSF process that converts it for the versions tab so that it produces a JSON response instead.

@sekmiller
Contributor

Spoke to Ellen about this this morning. I will provide the elements of the summary objects as JSON and not do any translating/formatting.

sekmiller added a commit that referenced this issue Feb 6, 2025
@sekmiller
Contributor

Proposed new response:

{
  "status": "OK",
  "data": {
    "305": {
      "versionNumber": "1.0",
      "summary": "firstPublished",
      "contributors": "Dataverse Admin",
      "publishedOn": "2025-02-06"
    },
    "306": {
      "versionNumber": "DRAFT",
      "summary": {
        "dsDescription": {
          "added": 0,
          "deleted": 0,
          "changed": 1
        },
        "title": {
          "added": 0,
          "deleted": 0,
          "changed": 1
        },
        "geospatial": {
          "added": 1,
          "deleted": 0,
          "changed": 0
        },
        "citation": {
          "added": 2,
          "deleted": 0,
          "changed": 0
        },
        "files": {
          "added": 1,
          "removed": 0,
          "replaced": 0,
          "changedFileMetaData": 2,
          "changedVariableMetadata": 0
        },
        "termsAccessChanged": "false"
      },
      "contributors": "Dataverse Admin",
      "publishedOn": ""
    }
  }
}
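Since the proposal leaves translating and formatting to the client, here is a minimal sketch of how a consumer might turn one `summary` value into display text. The function name `summary_text` and the output wording are assumptions for illustration. The sketch prints the raw JSON keys, where a real client would presumably map them to localized labels.

```python
from typing import Union


def summary_text(summary: Union[str, dict]) -> str:
    """Render one `summary` value from the proposed response as a single line."""
    if isinstance(summary, str):
        # String summaries such as "firstPublished" are passed through as-is.
        return summary
    parts = []
    for name, counts in summary.items():
        if not isinstance(counts, dict):
            continue  # scalar flags such as termsAccessChanged are handled below
        # Keep only the counters that are non-zero.
        nonzero = [f"{label}: {count}" for label, count in counts.items() if count]
        if nonzero:
            parts.append(f"{name} ({', '.join(nonzero)})")
    # The proposal encodes this flag as the string "false", so compare as text.
    if str(summary.get("termsAccessChanged", "false")).lower() == "true":
        parts.append("termsAccessChanged")
    return "; ".join(parts)


if __name__ == "__main__":
    draft_summary = {
        "dsDescription": {"added": 0, "deleted": 0, "changed": 1},
        "files": {"added": 1, "removed": 0, "replaced": 0,
                  "changedFileMetaData": 2, "changedVariableMetadata": 0},
        "termsAccessChanged": "false",
    }
    print(summary_text("firstPublished"))
    print(summary_text(draft_summary))
```

Note that `termsAccessChanged` arrives as a string in the proposed response, so a naive truthiness check on it would treat "false" as true.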

sekmiller added a commit that referenced this issue Feb 6, 2025
Projects
Status: In Progress 💻
Development

No branches or pull requests

5 participants