-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Harvesting Client name to the Metadata Source facet #10464
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
The Metadata Source facet has been updated to show the name of the harvesting client rather than grouping all such datasets under 'harvested' | ||
|
||
TODO: for the v6.13 release note: Please add a full re-index using http://localhost:8080/api/admin/index to the upgrade instructions. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,9 +15,7 @@ | |
import static jakarta.ws.rs.core.Response.Status.ACCEPTED; | ||
import static jakarta.ws.rs.core.Response.Status.OK; | ||
import static org.hamcrest.CoreMatchers.equalTo; | ||
import static org.junit.jupiter.api.Assertions.assertEquals; | ||
import static org.junit.jupiter.api.Assertions.*; | ||
import static org.junit.jupiter.api.Assertions.assertTrue; | ||
|
||
/** | ||
* This class tests Harvesting Client functionality. | ||
|
@@ -272,6 +270,12 @@ private void harvestingClientRun(boolean allowHarvestingMissingCVV) throws Inte | |
} while (i<maxWait); | ||
|
||
System.out.println("Waited " + i + " seconds for the harvest to complete."); | ||
|
||
Response searchHarvestedDatasets = UtilIT.search("metadataSource:" + nickName, normalUserAPIKey); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's keep an eye on it - will not surprise me if it needs an extra second here, occasionally, for the indexing of all the harvested datasets to complete. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jp-tosca @stevenwinship There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. JP, this was a bad call on my part - I should've asked you to run Jenkins 3 times in a row on the PR before merging it. |
||
searchHarvestedDatasets.prettyPrint(); | ||
searchHarvestedDatasets.then().assertThat() | ||
.statusCode(OK.getStatusCode()) | ||
.body("data.total_count", equalTo(expectedNumberOfSetsHarvested)); | ||
|
||
// Fail if it hasn't completed in maxWait seconds | ||
assertTrue(i < maxWait); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Probably need a release note about this feature that also notes that (async/background) reindexing is needed to populate the facet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @qqmyers 👋🏼
Thanks! I will add the release note in a few just looking at some other PR right now but I will add it ASAP. Regarding the tests, I am not sure I will also look at this.
Best,
Juan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note added and test condition added to the harvesting test to search by the new collection name. 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting what I said during standup in writing:
Rather than using the HarvestingClient's nickname for this facet, we should probably use the name of the local collection into which the client is harvesting.
The clear advantages of doing that:
solrInputDocument.addField(SearchFields.METADATA_SOURCE, rootDataverse.getName());
oai_3
and that's what's going to show up in the facet, the only way for them to address it would be to delete the client (and all the content associated with it), and re-create it with a better-looking nickname, then re-harvest.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if someone has multiple clients harvested into the same collection? 🤔 should we consider this scenario! Also @DS-INRA made some comments a few minutes ago on the issue confirming that the name would be what they need which is as I understand is the other PR associated with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to harvest into the same collection, yes. But then one could make an argument that if the local admin wants their users to see these datasets from different OAI archives (or sets/clients) show as the same collection to their users, they may actually prefer to have them under the same facet too...
All that said, I would agree that it makes sense to implement it the way the original requestor wants it to work. But let's make sure everybody is on the same page. Let me ask some followup questions in the issue.