Commit

Merge branch 'develop' into 10022_upload_redirect_without_tagging #10022

Conflicts:
src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java
pdurbin committed Apr 16, 2024
2 parents be2b783 + 131e76c commit 41eb617
Showing 507 changed files with 23,188 additions and 9,237 deletions.
3 changes: 2 additions & 1 deletion .env
@@ -1,4 +1,5 @@
APP_IMAGE=gdcc/dataverse:unstable
-POSTGRES_VERSION=13
+POSTGRES_VERSION=16
DATAVERSE_DB_USER=dataverse
SOLR_VERSION=9.3.0
+SKIP_DEPLOY=0
101 changes: 101 additions & 0 deletions .github/workflows/maven_cache_management.yml
@@ -0,0 +1,101 @@
name: Maven Cache Management

on:
  # Every push to develop should trigger cache rejuvenation (dependencies might have changed)
  push:
    branches:
      - develop
  # According to https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy
  # all caches are deleted after 7 days of no access. Make sure we rejuvenate every 7 days to keep it available.
  schedule:
    - cron: '23 2 * * 0' # Run for 'develop' every Sunday at 02:23 UTC (3:23 CET, 21:23 ET)
  # Enable manual cache management
  workflow_dispatch:
  # Delete branch caches once a PR is merged
  pull_request:
    types:
      - closed

env:
  COMMON_CACHE_KEY: "dataverse-maven-cache"
  COMMON_CACHE_PATH: "~/.m2/repository"

jobs:
  seed:
    name: Drop and Re-Seed Local Repository
    runs-on: ubuntu-latest
    if: ${{ github.event_name != 'pull_request' }}
    permissions:
      # Write permission needed to delete caches
      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
      actions: write
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Determine Java version from Parent POM
        run: echo "JAVA_VERSION=$(grep '<target.java.version>' modules/dataverse-parent/pom.xml | cut -f2 -d'>' | cut -f1 -d'<')" >> ${GITHUB_ENV}
      - name: Set up JDK ${{ env.JAVA_VERSION }}
        uses: actions/setup-java@v4
        with:
          java-version: ${{ env.JAVA_VERSION }}
          distribution: temurin
      - name: Seed common cache
        run: |
          mvn -B -f modules/dataverse-parent dependency:go-offline dependency:resolve-plugins
      # This non-obvious order is due to the fact that the download via Maven above will take a very long time (7-8 min).
      # Jobs should not be left without a cache. Deleting and saving in one go leaves only a small chance for a cache miss.
      - name: Drop common cache
        run: |
          gh extension install actions/gh-actions-cache
          echo "🛒 Fetching list of cache keys"
          cacheKeys=$(gh actions-cache list -R ${{ github.repository }} -B develop | cut -f 1 )
          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "🗑️ Deleting caches..."
          for cacheKey in $cacheKeys
          do
            gh actions-cache delete $cacheKey -R ${{ github.repository }} -B develop --confirm
          done
          echo "✅ Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Save the common cache
        uses: actions/cache@v4
        with:
          path: ${{ env.COMMON_CACHE_PATH }}
          key: ${{ env.COMMON_CACHE_KEY }}
          enableCrossOsArchive: true

  # Let's delete feature branch caches once their PR is merged - we only have 10 GB of space before eviction kicks in
  deplete:
    name: Deplete feature branch caches
    runs-on: ubuntu-latest
    if: ${{ github.event_name == 'pull_request' }}
    permissions:
      # `actions:write` permission is required to delete caches
      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
      actions: write
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Cleanup caches
        run: |
          gh extension install actions/gh-actions-cache
          BRANCH=refs/pull/${{ github.event.pull_request.number }}/merge
          echo "🛒 Fetching list of cache keys"
          cacheKeysForPR=$(gh actions-cache list -R ${{ github.repository }} -B $BRANCH | cut -f 1 )
          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "🗑️ Deleting caches..."
          for cacheKey in $cacheKeysForPR
          do
            gh actions-cache delete $cacheKey -R ${{ github.repository }} -B $BRANCH --confirm
          done
          echo "✅ Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
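
Note on the "Determine Java version from Parent POM" step in the workflow above: it extracts the JDK version with a `grep`/`cut` pipeline. The following is a minimal sketch of that same pipeline run against a throwaway POM fragment, so it can be tried outside of CI. The helper name `extract_java_version`, the sample file, and the version value `17` are assumptions for illustration and are not taken from this commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper wrapping the pipeline used in the workflow step.
# It prints whatever sits between <target.java.version> and the closing tag.
extract_java_version() {
  local pom="$1"
  grep '<target.java.version>' "$pom" | cut -f2 -d'>' | cut -f1 -d'<'
}

# Throwaway POM fragment; the version value is an assumption.
sample_pom="$(mktemp)"
cat > "$sample_pom" <<'EOF'
<project>
  <properties>
    <target.java.version>17</target.java.version>
  </properties>
</project>
EOF

JAVA_VERSION="$(extract_java_version "$sample_pom")"
echo "JAVA_VERSION=${JAVA_VERSION}"
rm -f "$sample_pom"
```

The pipeline relies on the property appearing exactly once and on a single line. If the tag were split across lines or repeated, `grep` would return zero or several lines and the resulting value would not be a usable version string.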
2 changes: 2 additions & 0 deletions .github/workflows/maven_unit_test.yml
@@ -4,13 +4,15 @@ on:
     push:
         paths:
             - "**.java"
+            - "**.sql"
             - "pom.xml"
             - "modules/**/pom.xml"
             - "!modules/container-base/**"
             - "!modules/dataverse-spi/**"
     pull_request:
         paths:
             - "**.java"
+            - "**.sql"
             - "pom.xml"
             - "modules/**/pom.xml"
             - "!modules/container-base/**"
1 change: 1 addition & 0 deletions .gitignore
@@ -61,3 +61,4 @@ src/main/webapp/resources/images/dataverseproject.png.thumb140

# Docker development volumes
/docker-dev-volumes
+/.vs
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -56,12 +56,12 @@ If you are interested in working on the main Dataverse code, great! Before you s

Please read http://guides.dataverse.org/en/latest/developers/version-control.html to understand how we use the "git flow" model of development and how we will encourage you to create a GitHub issue (if it doesn't exist already) to associate with your pull request. That page also includes tips on making a pull request.

-After making your pull request, your goal should be to help it advance through our kanban board at https://github.com/orgs/IQSS/projects/2 . If no one has moved your pull request to the code review column in a timely manner, please reach out. Note that once a pull request is created for an issue, we'll remove the issue from the board so that we only track one card (the pull request).
+After making your pull request, your goal should be to help it advance through our kanban board at https://github.com/orgs/IQSS/projects/34 . If no one has moved your pull request to the code review column in a timely manner, please reach out. Note that once a pull request is created for an issue, we'll remove the issue from the board so that we only track one card (the pull request).

Thanks for your contribution!

[dataverse-community Google Group]: https://groups.google.com/group/dataverse-community
[Community Call]: https://dataverse.org/community-calls
[dataverse-dev Google Group]: https://groups.google.com/group/dataverse-dev
[community contributors]: https://docs.google.com/spreadsheets/d/1o9DD-MQ0WkrYaEFTD5rF_NtyL8aUISgURsAXSL7Budk/edit?usp=sharing
-[dev efforts]: https://github.com/orgs/IQSS/projects/2#column-5298405
+[dev efforts]: https://github.com/orgs/IQSS/projects/34/views/6
3 changes: 2 additions & 1 deletion README.md
@@ -3,7 +3,7 @@ Dataverse&#174;

Dataverse is an [open source][] software platform for sharing, finding, citing, and preserving research data (developed by the [Dataverse team](https://dataverse.org/about) at the [Institute for Quantitative Social Science](https://iq.harvard.edu/) and the [Dataverse community][]).

-[dataverse.org][] is our home on the web and shows a map of Dataverse installations around the world, a list of [features][], [integrations][] that have been made possible through [REST APIs][], our development [roadmap][], and more.
+[dataverse.org][] is our home on the web and shows a map of Dataverse installations around the world, a list of [features][], [integrations][] that have been made possible through [REST APIs][], our [project board][], our development [roadmap][], and more.

We maintain a demo site at [demo.dataverse.org][] which you are welcome to use for testing and evaluating Dataverse.

@@ -29,6 +29,7 @@ Dataverse is a trademark of President and Fellows of Harvard College and is regi
[Installation Guide]: https://guides.dataverse.org/en/latest/installation/index.html
[latest release]: https://github.com/IQSS/dataverse/releases
[features]: https://dataverse.org/software-features
+[project board]: https://github.com/orgs/IQSS/projects/34
[roadmap]: https://www.iq.harvard.edu/roadmap-dataverse-project
[integrations]: https://dataverse.org/integrations
[REST APIs]: https://guides.dataverse.org/en/latest/api/index.html
3 changes: 3 additions & 0 deletions conf/localstack/buckets.sh
@@ -0,0 +1,3 @@
#!/usr/bin/env bash
# https://stackoverflow.com/questions/53619901/auto-create-s3-buckets-on-localstack
awslocal s3 mb s3://mybucket
12 changes: 12 additions & 0 deletions conf/proxy/Caddyfile
@@ -0,0 +1,12 @@
# This configuration is intended to be used with Caddy, a very small high perf proxy.
# It will serve the application containers Payara Admin GUI via HTTP instead of HTTPS,
# avoiding the trouble of self signed certificates for local development.

:4848 {
reverse_proxy https://dataverse:4848 {
transport http {
tls_insecure_skip_verify
}
header_down Location "^https://" "http://"
}
}
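
Note on the `header_down Location "^https://" "http://"` line in the Caddyfile above: it applies a regular-expression substitution to the `Location` header of upstream responses, so redirects issued by the HTTPS admin console stay on plain HTTP. A minimal sketch of the same substitution follows, using `sed` as a stand-in for Caddy. The helper name `rewrite_location` and the sample URLs are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: emulates the header rewrite with sed (not Caddy itself).
# Only a leading "https://" is replaced; everything else passes through as-is.
rewrite_location() {
  printf '%s\n' "$1" | sed -E 's#^https://#http://#'
}

rewrite_location "https://localhost:4848/common/index.jsf"
# prints: http://localhost:4848/common/index.jsf

rewrite_location "/relative/path"
# prints: /relative/path
```

Because the pattern is anchored with `^`, relative redirects and URLs that merely contain `https://` later in the string are left untouched.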
10 changes: 6 additions & 4 deletions conf/solr/9.3.0/schema.xml
@@ -229,6 +229,8 @@

<!-- incomplete datasets issue 8822 -->
<field name="datasetValid" type="boolean" stored="true" indexed="true" multiValued="false"/>

+<field name="license" type="string" stored="true" indexed="true" multiValued="false"/>

<!--
METADATA SCHEMA FIELDS
@@ -327,7 +329,7 @@
<field name="keywordVocabularyURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="kindOfData" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="language" type="text_en" multiValued="true" stored="true" indexed="true"/>
-<field name="northLongitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
+<field name="northLatitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="notesText" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="originOfSources" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="otherDataAppraisal" type="text_en" multiValued="false" stored="true" indexed="true"/>
@@ -370,7 +372,7 @@
<field name="software" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="softwareName" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="softwareVersion" type="text_en" multiValued="true" stored="true" indexed="true"/>
-<field name="southLongitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
+<field name="southLatitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="state" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayCellType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayMeasurementType" type="text_en" multiValued="true" stored="true" indexed="true"/>
@@ -566,7 +568,7 @@
<copyField source="keywordVocabularyURI" dest="_text_" maxChars="3000"/>
<copyField source="kindOfData" dest="_text_" maxChars="3000"/>
<copyField source="language" dest="_text_" maxChars="3000"/>
-<copyField source="northLongitude" dest="_text_" maxChars="3000"/>
+<copyField source="northLatitude" dest="_text_" maxChars="3000"/>
<copyField source="notesText" dest="_text_" maxChars="3000"/>
<copyField source="originOfSources" dest="_text_" maxChars="3000"/>
<copyField source="otherDataAppraisal" dest="_text_" maxChars="3000"/>
@@ -609,7 +611,7 @@
<copyField source="software" dest="_text_" maxChars="3000"/>
<copyField source="softwareName" dest="_text_" maxChars="3000"/>
<copyField source="softwareVersion" dest="_text_" maxChars="3000"/>
-<copyField source="southLongitude" dest="_text_" maxChars="3000"/>
+<copyField source="southLatitude" dest="_text_" maxChars="3000"/>
<copyField source="state" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayCellType" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayMeasurementType" dest="_text_" maxChars="3000"/>
3 changes: 3 additions & 0 deletions conf/solr/9.3.0/solrconfig.xml
@@ -588,6 +588,7 @@
check for "Circuit Breakers tripped" in logs and the corresponding error message should tell
you what transpired (if the failure was caused by tripped circuit breakers).
-->

<!--
<str name="memEnabled">true</str>
<str name="memThreshold">75</str>
@@ -599,10 +600,12 @@
whether the circuit breaker is enabled and the average load over the last minute at which the
circuit breaker should start rejecting queries.
-->

<!--
<str name="cpuEnabled">true</str>
<str name="cpuThreshold">75</str>
-->

</circuitBreaker>

<!-- Request Dispatcher
13 changes: 0 additions & 13 deletions doc/release-notes/10001-datasets-files-api-user-permissions.md

This file was deleted.

1 change: 1 addition & 0 deletions doc/release-notes/10242-add-feature-dv-api
@@ -0,0 +1 @@
New api endpoints have been added to allow you to add or remove featured collections from a dataverse collection.
@@ -0,0 +1 @@
The API endpoint for getting the Dataset version has been extended to include latestVersionPublishingStatus.
3 changes: 3 additions & 0 deletions doc/release-notes/10339-workflow.md
@@ -0,0 +1,3 @@
The computational workflow metadata block has been updated to present a clickable link for the External Code Repository URL field.

Release notes should include the usual instructions, for those who have installed this optional block, to update the computational_workflow block. (PR#10441)
6 changes: 6 additions & 0 deletions doc/release-notes/10389-metadatablocks-api-extension.md
@@ -0,0 +1,6 @@
New optional query parameters added to ``api/metadatablocks`` and ``api/dataverses/{id}/metadatablocks`` endpoints:

- ``returnDatasetFieldTypes``: Whether or not to return the dataset field types present in each metadata block. If not set, the default value is false.
- ``onlyDisplayedOnCreate``: Whether or not to return only the metadata blocks that are displayed on dataset creation. If ``returnDatasetFieldTypes`` is true, only the dataset field types shown on dataset creation will be returned within each metadata block. If not set, the default value is false.

Added new ``displayOnCreate`` field to the MetadataBlock and DatasetFieldType payloads.
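
Note on the release note above: the two query parameters can be combined on the collection-level endpoint. The following is a minimal sketch that only builds the request URL. The helper name `metadatablocks_url`, the base URL `http://localhost:8080`, and the collection alias `root` are assumptions for illustration, and the request itself is left as a comment because it needs a running installation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: builds the URL for api/dataverses/{id}/metadatablocks.
# Both flags default to false, matching the defaults stated in the release note.
metadatablocks_url() {
  local base="$1" collection="$2"
  local return_types="${3:-false}" only_on_create="${4:-false}"
  printf '%s/api/dataverses/%s/metadatablocks?returnDatasetFieldTypes=%s&onlyDisplayedOnCreate=%s\n' \
    "$base" "$collection" "$return_types" "$only_on_create"
}

# Base URL and collection alias are assumptions for a local installation.
url="$(metadatablocks_url "http://localhost:8080" "root" true true)"
echo "$url"

# To send the request against a running installation (not executed here):
#   curl -s "$url"
```

With both flags set to true, the response should be limited to the blocks shown on dataset creation, each carrying only the field types shown on creation, as described in the release note.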
3 changes: 3 additions & 0 deletions doc/release-notes/10464-add-name-harvesting-client-facet.md
@@ -0,0 +1,3 @@
The Metadata Source facet has been updated to show the name of the harvesting client rather than grouping all such datasets under 'harvested'

TODO: for the v6.13 release note: Please add a full re-index using http://localhost:8080/api/admin/index to the upgrade instructions.
1 change: 1 addition & 0 deletions doc/release-notes/10468-doc-datalad-integration.md
@@ -0,0 +1 @@
DataLad has been integrated with Dataverse. For more information, see https://dataverse-guide--10470.org.readthedocs.build/en/10470/admin/integrations.html#datalad