@nuwang
Member

@nuwang nuwang commented Nov 2, 2025

This PR:

  1. Removes code that relies on defunct custos services (custos auth provider and the custos vault)
  2. Enhances the generic PSA oidc provider to support PKCE
  3. Reimplements keycloak as an extension of this generic PSA provider
  4. Reimplements cilogon as an extension of this generic PSA provider
  5. Adds additional tests
  6. Adds migration scripts for the CustosAuthnzToken model so that users do not need to re-associate accounts, plus tests

supersedes: #21090
closes: #20789

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@nuwang
Member Author

nuwang commented Nov 3, 2025

@jdavcs I made a commit which adds the migration scripts + tests but I'm not sure I got the alembic migration process right - would be great if you could check: 7f1a254

(I ran python scripts/run_alembic.py revision -m "migrate custos to psa tokens" --head=gxy@head)

@jdavcs jdavcs self-requested a review November 3, 2025 17:35
@nuwang nuwang force-pushed the combine_psa_keycloak branch from fcf60d1 to fcf9637 Compare November 3, 2025 18:19
@nuwang
Member Author

nuwang commented Nov 3, 2025

@jdavcs I changed the down_revision and rebased on dev, which seems to have solved the migration problem, but please let me know whether that's ok

@nuwang nuwang force-pushed the combine_psa_keycloak branch from fcf9637 to 870ad98 Compare November 3, 2025 18:50
@jdavcs
Member

jdavcs commented Nov 4, 2025

@nuwang Sorry for the delayed reply!

In my opinion, the migration should be a standalone script. We've called a data migration script from a db revision (aka db schema migration) a couple of times in the past. I didn't think that was the right approach in either case. However, in those cases the data migration was tightly coupled with the schema changes, and in the end we agreed it was the safer/more straightforward thing to do. But here, we have just the data migration with no change to the db schema. The Alembic (or SQLAlchemy) documentation does not offer a definitive best practice on how to do data migrations, but there are strong arguments against doing it as a migration (in the docs: https://alembic.sqlalchemy.org/en/latest/cookbook.html#data-migrations-general-techniques and in this discussion: sqlalchemy/alembic#972).

A standalone script, on the other hand, is straightforward. Here's one example: #18079. (Also, here's an example of a test for such a scenario: https://github.com/galaxyproject/galaxy/pull/18079/files#diff-8bd6ad37d9a5dd34e50119ed4467325c4c5a145629d41926c484c4c1efe9dc27 )

@nuwang
Member Author

nuwang commented Nov 5, 2025

Thanks for the feedback @jdavcs. There is a schema change here - the CustosAuthnzToken table has been dropped. I just didn't do it in the migration itself in order to be as non-destructive as possible. Do you think that changes anything, or should we go ahead with a separate data migration script? I'm happy to do it either way, but without an automatic migration, all oidc/cilogon accounts will become dissociated, and users will need to reassociate their accounts. How would that situation be handled with a data migration script? Would that be part of the release notes or something?

@jdavcs
Member

jdavcs commented Nov 5, 2025

> There is a schema change here - the CustosAuthnzToken table has been dropped. I just didn't do it in the migration itself in order to be as non-destructive as possible.

It definitely should be in the migration. Extra tables in the database that are not in the model will cause problems (here's a summary: #20809, and here's one example of the problems this may cause: #20614). There are galaxy utility scripts for that (create_table, drop_table). Also, you'd need to change the model definition: dropping the table must go together with dropping the model (CustosAuthnzToken) and, of course, all references to that model (and there are tests that will fail if the model does not mirror the db schema state as per migrations).

> How would that situation be handled with a data migration script? Would that be part of the release notes or something?

Sorry, I was going to mention it in my previous comment. If we had a separate data migration script, we'd definitely mention it (very prominently) in the admin notes section of the release notes.

That said, since a table is dropped, having the data migration logic referenced from the migration script makes sense here (otherwise, to preserve the data, admins would be required to run the script before the database upgrade - which, of course, is a recipe for disaster). We try not to put the actual script in the migration (to keep it clean and testable); instead you can place it in the data_fixes directory. Here's an example.

I hope I'm not missing anything (like any requirement that makes this a special case, etc.)
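
To illustrate the ordering described above (migrate the data, then drop the table, with the data logic kept in its own testable function), here is a minimal sketch using the standard-library sqlite3 module rather than Alembic. The table names, column names, and function names are simplified assumptions for illustration, not Galaxy's actual schema or the code in this PR.

```python
import json
import sqlite3


def migrate_custos_tokens(conn: sqlite3.Connection) -> int:
    """Data-fix step: copy legacy rows into the PSA-style table; return rows copied.

    Assumed (hypothetical) columns; the real models may differ.
    """
    rows = conn.execute(
        "SELECT user_id, external_user_id, provider, access_token, refresh_token "
        "FROM custos_authnz_token"
    ).fetchall()
    for user_id, external_user_id, provider, access_token, refresh_token in rows:
        extra_data = json.dumps(
            {"access_token": access_token, "refresh_token": refresh_token}
        )
        conn.execute(
            "INSERT INTO oidc_user_authnz_tokens (user_id, uid, provider, extra_data) "
            "VALUES (?, ?, ?, ?)",
            (user_id, external_user_id, provider, extra_data),
        )
    return len(rows)


def upgrade(conn: sqlite3.Connection) -> None:
    """Stand-in for a revision's upgrade(): data first, then the destructive step."""
    with conn:  # commits on success, rolls back if either step raises
        migrate_custos_tokens(conn)
        conn.execute("DROP TABLE custos_authnz_token")
```

Keeping `migrate_custos_tokens` separate from `upgrade` is what allows the data logic to be unit-tested without running a full database upgrade.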

@nuwang
Member Author

nuwang commented Nov 6, 2025

Thanks @jdavcs. I've moved the migration logic to data_fixes as you suggested, dropped the CustosAuthnzToken table from the model and adjusted the upgrade/downgrade script to drop/restore the table and migrate/restore the data.

@jdavcs
Member

jdavcs commented Nov 6, 2025

@nuwang thank you for addressing all the comments! I'll run some more tests tomorrow (I was able to break this once - so now I want to make sure my edge cases were reasonable). I suppose one potential concern is this: let's say a token is corrupted: how should the script behave? Currently, the error prevents the migration from proceeding. We could leave it as is (but maybe handle it more gracefully and print a helpful message), or we could skip the problem record. I think that depends, in part, on whom we want to accommodate more - the admin or the users.

EDIT: No, my edge case was not reasonable: statement execution broke due to invalid json, which is not going to happen with these field types, managed by SQLAlchemy.
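
The two policies discussed above (abort with a helpful message, or skip the problem record) can be sketched as follows. This is a hypothetical illustration only: as the EDIT notes, invalid JSON is not expected with these field types, and `convert_tokens` is not a function from this PR.

```python
import json
from typing import Any, Iterable, List, Tuple


def convert_tokens(
    records: Iterable[Tuple[int, str]], skip_invalid: bool = False
) -> Tuple[List[Tuple[int, Any]], List[int]]:
    """Parse each (record_id, raw_json) pair; return (converted, skipped_ids).

    With skip_invalid=False, the first bad record raises ValueError naming the
    offending id, which aborts the run. With skip_invalid=True, bad records
    are collected so they can be reported afterwards.
    """
    converted: List[Tuple[int, Any]] = []
    skipped: List[int] = []
    for record_id, raw in records:
        try:
            converted.append((record_id, json.loads(raw)))
        except json.JSONDecodeError as exc:
            if not skip_invalid:
                raise ValueError(
                    f"token record {record_id} is not valid JSON: {exc}"
                ) from exc
            skipped.append(record_id)
    return converted, skipped
```

Aborting favours the admin, who learns exactly which record to fix before anything changes. Skipping favours the majority of users, at the cost of leaving the skipped accounts dissociated.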

@bgruening
Member

@nuwang is it possible with this PR to have multiple keycloak providers?

@nuwang
Member Author

nuwang commented Nov 7, 2025

> @nuwang is it possible with this PR to have multiple keycloak providers?

No, I tried to avoid making additional changes, just so the PR doesn't balloon too much, but that item is definitely on the radar. Maybe better done in a follow-up PR?

@nuwang nuwang force-pushed the combine_psa_keycloak branch from 32dc1c6 to a16d1d0 Compare November 12, 2025 10:18
@nuwang
Member Author

nuwang commented Nov 12, 2025

I've tested upgrades and downgrades as follows:

  1. Use latest dev branch
  2. Set up a PSA idp for Google auth
  3. Set up a Custos idp for Keycloak running locally
  4. Verify that login via Google and Keycloak works
  5. Switch to combine_psa_keycloak branch
  6. Run sh manage_db.sh upgrade
  7. Verify that table content is migrated correctly by inspecting db (custos table dropped, and data migrated to psa)
  8. Run Galaxy and verify that Google and Keycloak continue to work (that is, logging in again with a Keycloak account does not reassociate the account, and instead reuses the existing account)
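
The database inspection in step 7 could be scripted along these lines. This is a sketch under stated assumptions: it targets SQLite (`sqlite_master` is SQLite-specific, and a PostgreSQL deployment would query its own catalog instead), and the table names and the `migration_looks_complete` helper are hypothetical.

```python
import sqlite3


def migration_looks_complete(
    conn: sqlite3.Connection, expected_rows: int, provider: str = "keycloak"
) -> bool:
    """Return True if the legacy table is gone and the expected rows were migrated."""
    legacy_table = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        ("custos_authnz_token",),
    ).fetchone()
    migrated = conn.execute(
        "SELECT COUNT(*) FROM oidc_user_authnz_tokens WHERE provider = ?",
        (provider,),
    ).fetchone()[0]
    return legacy_table is None and migrated == expected_rows
```

The expected row count would be taken from the legacy table before running the upgrade.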

@nuwang nuwang force-pushed the combine_psa_keycloak branch from 2eef9d4 to e7cc7e9 Compare December 2, 2025 16:43
@ahmedhamidawan
Member

On a related note, should we then update the list here too: https://galaxyproject.org/authnz/use/oidc/

Member

@ahmedhamidawan ahmedhamidawan left a comment

The client changes for removing custos look good. I fixed some redundancies (but couldn't push, since I don't have push access to your fork, @nuwang):

commit 08690857b0e49e91c2a9073bc9c4f0e3998bad52
Author: Ahmed Awan <qe66653@umbc.edu>
Date:   Sat Dec 6 11:30:30 2025 +0500

    remove redundant toggle between custos and cilogon

diff --git a/client/src/components/User/ExternalIdentities/ExternalIDHelper.ts b/client/src/components/User/ExternalIdentities/ExternalIDHelper.ts
index ee947c262d..bf3de12447 100644
--- a/client/src/components/User/ExternalIdentities/ExternalIDHelper.ts
+++ b/client/src/components/User/ExternalIdentities/ExternalIDHelper.ts
@@ -74,12 +74,11 @@ export async function submitOIDCLogon(idp: string, redirectParam: string | null
 
 /**
  * CILogon login.
- * @param idp        "cilogon"
  * @param useIDPHint If true, append ?idphint=
  * @param idpHint    The entityID to hint with (ignored when useIDPHint = false)
  */
-export async function submitCILogon(idp: string, useIDPHint = false, idpHint?: string): Promise<string | null> {
-    let url = withPrefix(`/authnz/${idp}/login/`);
+export async function submitCILogon(useIDPHint = false, idpHint?: string): Promise<string | null> {
+    let url = withPrefix("/authnz/cilogon/login/");
     if (useIDPHint && idpHint) {
         url += `?idphint=${encodeURIComponent(idpHint)}`;
     }
@@ -109,7 +108,7 @@ export async function redirectToSingleProvider(config: OIDCConfig): Promise<stri
     }
 
     if (idp === "cilogon") {
-        const redirectUri = await submitCILogon(idp, false);
+        const redirectUri = await submitCILogon(false);
         return redirectUri;
     } else {
         const redirectUri = await submitOIDCLogon(idp, "");
diff --git a/client/src/components/User/ExternalIdentities/ExternalLogin.vue b/client/src/components/User/ExternalIdentities/ExternalLogin.vue
index 1fe05849ae..8bac792fbd 100644
--- a/client/src/components/User/ExternalIdentities/ExternalLogin.vue
+++ b/client/src/components/User/ExternalIdentities/ExternalLogin.vue
@@ -17,7 +17,6 @@ import { errorMessageAsString } from "@/utils/simple-error";
 import { capitalizeFirstLetter } from "@/utils/strings";
 
 import GButton from "@/components/BaseComponents/GButton.vue";
-import GButtonGroup from "@/components/BaseComponents/GButtonGroup.vue";
 import VerticalSeparator from "@/components/Common/VerticalSeparator.vue";
 import LoadingSpan from "@/components/LoadingSpan.vue";
 
@@ -50,33 +49,22 @@ const messageVariant = ref<string | null>(null);
 const cILogonIdps = ref<Idp[]>([]);
 const selected = ref<Idp | null>(null);
 const rememberIdp = ref(false);
-const cilogon = ref<"cilogon" | null>(null);
-const toggleCilogon = ref(false);
 
 const oIDCIdps = computed<OIDCConfig>(() => (isConfigLoaded.value ? config.value.oidc : {}));
 
 const filteredOIDCIdps = computed(() => getFilteredOIDCIdps(oIDCIdps.value, props.excludeIdps));
 
-const cilogonListShow = computed(() => getNeedShowCilogonInstitutionList(oIDCIdps.value));
-
-const cILogonEnabled = computed(() => oIDCIdps.value.cilogon);
+const cILogonConfigured = computed(() => getNeedShowCilogonInstitutionList(oIDCIdps.value));
 
 onMounted(async () => {
     rememberIdp.value = getIdpPreference() !== null;
 
     // Only fetch CILogonIDPs if cilogon configured
-    if (cilogonListShow.value) {
+    if (cILogonConfigured.value) {
         await getCILogonIdps();
     }
 });
 
-function toggleCILogon(idp: "cilogon") {
-    if (cilogon.value === idp || cilogon.value === null) {
-        toggleCilogon.value = !toggleCilogon.value;
-    }
-    cilogon.value = toggleCilogon.value ? idp : null;
-}
-
 async function clickOIDCLogin(idp: string) {
     if (loading.value) {
         return;
@@ -98,7 +86,7 @@ async function clickOIDCLogin(idp: string) {
     }
 }
 
-async function clickCILogin(idp: string | null) {
+async function clickCILogonLogin() {
     if (loading.value) {
         return;
     }
@@ -106,7 +94,7 @@ async function clickCILogin(idp: string | null) {
         setIdpPreference();
     }
 
-    if (!selected.value || !idp) {
+    if (!selected.value) {
         messageVariant.value = "danger";
         messageText.value = "Please select an institution.";
         return;
@@ -115,9 +103,9 @@ async function clickCILogin(idp: string | null) {
     loading.value = true;
 
     try {
-        const redirectUri = await submitCILogon(idp, true, selected.value.EntityID);
+        const redirectUri = await submitCILogon(true, selected.value.EntityID);
 
-        localStorage.setItem("galaxy-provider", idp);
+        localStorage.setItem("galaxy-provider", "cilogon");
 
         if (redirectUri) {
             window.location.href = redirectUri;
@@ -182,11 +170,9 @@ function getIdpPreference() {
         </BAlert>
 
         <div :class="{ 'd-flex h-100': !props.columnDisplay }">
-            <!-- OIDC login-->
-            <BForm v-if="cilogonListShow" id="externalLogin" class="cilogon">
-                <div v-if="props.loginPage">
-                    <!--Only Display if CILogon is configured-->
-                    <BFormGroup label="Use existing institutional login">
+            <BForm v-if="cILogonConfigured" id="externalLogin" class="cilogon">
+                <div>
+                    <BFormGroup :label="`Use ${props.loginPage ? `existing` : ``} institutional login`">
                         <Multiselect
                             v-model="selected"
                             placeholder="Select your institution"
@@ -204,46 +190,12 @@ function getIdpPreference() {
                         </BFormCheckbox>
                     </BFormGroup>
 
-                    <GButton
-                        v-if="cILogonEnabled"
-                        :disabled="loading || selected === null"
-                        @click="clickCILogin('cilogon')">
+                    <GButton :disabled="loading || selected === null" @click="clickCILogonLogin">
                         <LoadingSpan v-if="loading" message="Signing In" />
                         <span v-else>Sign in with Institutional Credentials*</span>
                     </GButton>
                 </div>
 
-                <div v-else>
-                    <GButtonGroup class="w-100">
-                        <GButton
-                            v-if="cILogonEnabled"
-                            :pressed="cilogon === 'cilogon'"
-                            @click="toggleCILogon('cilogon')">
-                            Sign in with Institutional Credentials*
-                        </GButton>
-                    </GButtonGroup>
-
-                    <BFormGroup v-if="toggleCilogon" class="mt-1">
-                        <Multiselect
-                            v-model="selected"
-                            placeholder="Select your institution"
-                            :options="cILogonIdps"
-                            label="DisplayName"
-                            select-label=""
-                            deselect-label=""
-                            :allow-empty="false"
-                            track-by="EntityID" />
-
-                        <GButton
-                            v-if="toggleCilogon"
-                            class="mt-1"
-                            :disabled="loading || selected === null"
-                            @click="clickCILogin(cilogon)">
-                            Login via CILogon *
-                        </GButton>
-                    </BFormGroup>
-                </div>
-
                 <p class="mt-3">
                     <small class="text-muted">
                         * Galaxy uses CILogon to enable you to log in from this organization. By clicking 'Sign In', you
@@ -254,7 +206,7 @@ function getIdpPreference() {
                 </p>
             </BForm>
 
-            <template v-if="cilogonListShow && Object.keys(filteredOIDCIdps).length > 0">
+            <template v-if="cILogonConfigured && Object.keys(filteredOIDCIdps).length > 0">
                 <VerticalSeparator v-if="!props.columnDisplay">
                     <span v-localize>or</span>
                 </VerticalSeparator>

@nuwang
Member Author

nuwang commented Dec 6, 2025

Thanks for reviewing @ahmedhamidawan. And thanks for catching the list of providers - will update.

I'm puzzled by why you don't have push access.
I've added you as a collaborator just in case.

@ahmedhamidawan
Member

Fair point, maybe I need to fix something locally 🤔

@nuwang
Member Author

nuwang commented Dec 6, 2025

Have managed to manually apply your patch.

@nuwang
Member Author

nuwang commented Dec 6, 2025

Removed custos references in: galaxyproject/galaxy-hub#3484

Development

Successfully merging this pull request may close these issues.

Migrate Keycloak integration to PSA

4 participants