Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keycloak SPI for Builtin Users Authentication #11193

Open
wants to merge 43 commits into
base: develop
Choose a base branch
from

Conversation

GPortas
Copy link
Contributor

@GPortas GPortas commented Jan 28, 2025

What this PR does / why we need it:

This PR implements changes to enable authentication with OIDC using Dataverse Builtin users. A User Storage SPI has been implemented in Keycloak to access the Dataverse database and its API.

API

Added an endpoint to validate credentials (email/username and password).

/builtin-users/{username}/canLoginWithGivenCredentials

AuthenticationServiceBean

Introduced a new feature flag, API_BEARER_AUTH_USE_BUILTIN_USER_ON_ID_MATCH, which enables a new behavior in the lookupUserByOIDCBearerToken method. This behavior first checks whether the username from the bearer token belongs to a built-in user in AuthenticatedUser before querying AuthenticatedUserLookup.

For tokens issued by Keycloak for users authenticated through the SPI, impersonation is not a concern since these users are directly linked. However, as we currently do not differentiate tokens based on the IdP they originate from, this new behavior could lead to impersonation issues. To mitigate this risk, the feature flag has been introduced, along with docs explaining the risk.

In a future iteration, this implementation could be evolved to better handle impersonation issues, potentially removing the feature flag if necessary.

SPI

The SPI implementation follows the Keycloak User Storage SPI documentation.

Installation and containers

A custom Keycloak Docker image has been created using a new Dockerfile that builds the SPI Maven artifact and installs it in Keycloak. Currently, the artifact is not pushed to any Maven registry, but this could be done in a future iteration to simplify installation in Keycloak instances. The image is used directly from docker-compose-dev, along with a new keycloak-initializer container that configures the SPI in the test realm, as no option was found to configure it through test-realm.json parameters.

Could we publish this image in the GDCC GitHub registry to enable its use in other development environments, such as dataverse-frontend? I would appreciate some assistance with this step to successfully register the image in the GDCC registry.

Which issue(s) this PR closes:

Special notes for your reviewer:

This PR is part of an incremental development process. We still need to better understand how to manage duplicate accounts for the same user across multiple identity providers, as well as potential impersonation issues. For now, and before moving further in that direction, I believe it is valuable to review the current implementation and merge it if no objections are found. There should be no risks of merging this PR, as the associated logic is only activated through a feature flag.

To use the implemented mechanism in the beta environment, we will need to configure a remote Keycloak instance.

Suggestions on how to test this:

Once the containers are up and running:

curl -X POST \
  http://localhost:8090/realms/test/protocol/openid-connect/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "client_id=test" \
  -d "client_secret=94XHrfNRwXsjqTqApRrwWmhDLDHpIYV8" \
  -d "grant_type=password" \
  -d "username=dataverseAdmin" \
  -d "password=admin1" \
  -d "scope=openid"

You should be able to successfully obtain the OIDC tokens.

You can also create a different builtin user and test the same login call for its associated credentials.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

No

Is there a release notes update needed for this change?:

Yes, attached.

Additional documentation:

https://www.keycloak.org/docs/latest/server_development/index.html#_user-storage-spi

@coveralls
Copy link

coveralls commented Jan 28, 2025

Coverage Status

coverage: 23.043% (+0.3%) from 22.747%
when pulling 3ca4940 on 11157-builtin-users-oidc-auth
into d3cba3f on develop.

This comment has been minimized.

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

1 similar comment

This comment has been minimized.

Copy link

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:11157-builtin-users-oidc-auth
ghcr.io/gdcc/configbaker:11157-builtin-users-oidc-auth

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@GPortas GPortas marked this pull request as ready for review February 17, 2025 14:28
@GPortas GPortas added Size: 10 A percentage of a sprint. 7 hours. SPA These changes are required for the Dataverse SPA Original size: 10 SPA.Q1.1 Authentication: OIDC for Builtin users GREI Re-arch Issues related to the GREI Dataverse rearchitecture labels Feb 17, 2025
@GPortas GPortas assigned GPortas and unassigned pdurbin, qqmyers and GPortas Feb 17, 2025
@GPortas GPortas removed their assignment Feb 17, 2025
Copy link
Member

@qqmyers qqmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good - glad to see that built-in users can get tied to Keycloak. In general, I think my comments are either minor/about docs, or probably out of scope (updating Keycloak version.) There's one place I think code can be simplified and one issue where we may want tighter security - that one may be a tech hour item to discuss.

<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>16</source>
<target>16</target>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

17 for these?

</build>

<properties>
<keycloak.version>22.0.0</keycloak.version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: This is pretty old now. 26.1.0 is current and I think versions earlier than 24.x have sec. issues. Perhaps we need an issue to update the Keycloak in use?


@Override
public UserModel getUserById(RealmModel realm, String id) {
logger.infof("Fetching user by ID: %s", id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these change to debugf?

public boolean isValid(RealmModel realm, UserModel user, CredentialInput input) {
logger.infof("Validating credentials for user: %s", user.getUsername());

if (!supportsCredentialType(input.getType()) || !(input instanceof UserCredentialModel userCredential)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(aside) weirdly conservative but not wrong - the second clause catches cases where there's a class implementing CredentialInput and claiming the same TYPE as PasswordCredentialModel that is not a subclass of UserCredentialModel (as PasswordCredentialModel is.) I guess it does give the userCredential variable easily though.


@Override
public boolean supportsCredentialType(String credentialType) {
return PasswordCredentialModel.TYPE.equals(credentialType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that PasswordCredentialModel is deprecated in the current Keycloak - but I didn't immediately see a forward compatible change to use with v22

authenticatedUser.setAuthProviderId(aul.getAuthenticationProviderId());


if (authenticatedUser != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if null the AuthenticatedUserLookup query isn't needed either.

if (FeatureFlags.API_BEARER_AUTH_USE_BUILTIN_USER_ON_ID_MATCH.enabled()) {
AuthenticatedUser builtinAuthenticatedUser = getBuiltinAuthenticatedUserByIdentifier(oAuth2UserRecord.getUsername());
return (builtinAuthenticatedUser != null) ? builtinAuthenticatedUser : lookupUser(oAuth2UserRecord.getUserRecordIdentifier());
}
return lookupUser(oAuth2UserRecord.getUserRecordIdentifier());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a simplification - just do a lookupUser(BuiltInAuthenticationProvider.PROVIDER_ID, oauth2Record.getUserRecordIdentifier().getUserIdInRepo()) here? I think that elimiates the need for the getBuiltinAuthenticatedUserByIdentifier method (which scans all providers before filtering to builtin).

I'm not sure what the priority should be when the flag is set should be - should the lookup by the real provider happen first? I guess it doesn't matter as only one of these can be true since ids are unique.

account when an identity match is found during OIDC API bearer token authentication.

This feature enables automatic association of an incoming IdP identity with an existing built-in user account, bypassing
the need for additional user registration steps.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest adding something high-level, e.g. there's a new dv-builtin-users-authenticator that allows use of Keycloak on instances that have builtin accounts, which enables use of the SPA on those instances. (Another aside - at some point it could also support mapping Shib users coming in through Keycloak to the existing Shib users (w/o changing the provider in the Dataverse db, but that would require changes to the storage provider to support more than built-in.)

Validate Builtin User Credentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Validates the provided credentials to determine if the user can log in with them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should address security here - which possibly depends on changes - see comment the API call.

@@ -122,6 +128,16 @@ public Response create(BuiltinUser user, @PathParam("password") String password,
public Response createWithNotification(BuiltinUser user, @PathParam("password") String password, @PathParam("key") String key, @PathParam("sendEmailNotification") Boolean sendEmailNotification) {
return internalSave(user, password, key, sendEmailNotification);
}

@GET
@Path("{username}/canLoginWithGivenCredentials")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this secure enough? I believe it is blocked by the default :BlockedApiEndpoints setting and restricted to localhost by default, but it is a way to scan passwords. Should it have its own Key, like the BuiltinUsers.KEY, or a throttle? Would it be safer to copy the validation logic from the authenticationService into the storage provider and just query the db (straight-forward except for the rarely/never used algorithm update code?)?

Separate q: Unless I missed it, this doesn't check for a user being deactivated - is that what's desired? I think w/o that you'd get logged in at Keycloak, but would then get errors when other API calls are made for the user? Should login just be denied here?

@GPortas GPortas self-assigned this Feb 18, 2025
@cmbz cmbz added the FY25 Sprint 17 FY25 Sprint 17 (2025-02-12 - 2025-02-26) label Feb 26, 2025
@cmbz cmbz added this to the 6.6 milestone Feb 26, 2025
@cmbz cmbz added the FY25 Sprint 18 FY25 Sprint 18 (2025-02-26 - 2025-03-12) label Feb 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FY25 Sprint 17 FY25 Sprint 17 (2025-02-12 - 2025-02-26) FY25 Sprint 18 FY25 Sprint 18 (2025-02-26 - 2025-03-12) GREI Re-arch Issues related to the GREI Dataverse rearchitecture Original size: 10 Size: 10 A percentage of a sprint. 7 hours. SPA.Q1.1 Authentication: OIDC for Builtin users SPA These changes are required for the Dataverse SPA
Projects
Status: In Review 🔎
Development

Successfully merging this pull request may close these issues.

Implement OIDC flow/integration for builtin Dataverse users
5 participants