Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve TGT deletion behavior #6156

Closed
wants to merge 3 commits into from
Closed

Improve TGT deletion behavior #6156

wants to merge 3 commits into from

Conversation

leleuj
Copy link
Contributor

@leleuj leleuj commented Sep 20, 2024

Currently, there is a flaw in the current deletion behavior of the expired TGT: depending on the settings and the user browsing, the expired TGT can be removed by the ticket registry during a "get" operation or by the ticket registry cleaner.
Both cases are not the same: in the first case, the TGT only is deleted while in the second case, a whole single logout is performed.

So I propose that the ticket registry does not delete an expired TGT on a "get" operation if the cleaner is enabled, letting the cleaner eventually removes the TGT and performs a full SLO.

@apereocas-bot apereocas-bot added this to the 7.2.0-RC1 milestone Sep 20, 2024
@mmoayyed
Copy link
Member

Thank you for the patch. Unfortunately, this has many side effects.

In most deployments, cleaner is turned off to assist with performance; this means the deployment relies on the ticket storage backend itself to remove tickets. This only works in scenarios where the ticket expiration policy defined in CAS is or can be directly supported by the ticket registry (i.e. max-time-to-live). If the policy is not supported directly by the registry, then CAS needs to evaluate the expiration criteria for a ticket itself, and then delete it if it sees it expired, on demand. That is, a ticket could be perfectly valid from a registry technology point of view, but expired from the CAS point of view. So we cannot let the expired ticket just sit there, specially if you're paying for storage costs.

My suggestion would be that we publish an event when we delete the TGT (this might already exist in fact). An event listener can capture the event, and then optionally invoke SLO when necessary, and if turned on. If SLO is disabled, then we simply delete the ticket and nothing further happens. The configuration of the cleaner is of no consequence and should not be taken into account.

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

My bad: I forgot that the cleaner must be considered disabled at the ticket registry level for the NoOpTicketRegistryCleaner: a2393f6

I don't understand your comment : "In most deployments, cleaner is turned off to assist with performance...".

If the cleaner is turned off, then, the behavior of the ticket registry remains the same as it was previously: the TGT (like any other ticket) is removed if expired.

My change only occurs when the cleaner is enabled: in that case, the expired TGT is not removed from the ticket registry to give the opportunity to the cleaner to perform a SLO for this TGT.

The most relevant part of the PR (https://github.com/apereo/cas/pull/6156/files#diff-87c2fc38c2877ea256bebb5ab765351cc1d4352f72c99069359f6645da58927cR112):

                val isTgt = ticket instanceof TicketGrantingTicket;
                if (!isTgt || !cleanerEnabled) {
                    deleteSingleTicket(ticket);
                }
GitHub
Currently, there is a flaw in the current deletion behavior of the expired TGT: depending on the settings and the user browsing, the expired TGT can be removed by the ticket registry during a "...

@mmoayyed
Copy link
Member

My change only occurs when the cleaner is enabled: in that case, the expired TGT is not removed from the ticket registry to give the opportunity to the cleaner to perform a SLO for this TGT.

Yes, that is exactly the problem that we should address based on your original explanation. At the moment, the function of SLO is very much tried to the cleaner running active. We need to aim for and cover the following scenario:

  1. Cleaner is turned off if someone does not care to keep it on for performance reasons.
  2. Expired tickets are deleted by the registry on demand as necessary as they are today
  3. But when deleted, SLO should continue to trigger regardless of cleaner status. We are missing this. This is what I was suggesting, that if there is an async event, a piece of code can react to that event and trigger SLO if it's enabled (Naturally, we can only support backchannel SLOs in this model but we support those nonetheless)

Conflating SLO with the cleaner status is the problem here.

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

OK. I get it. I will adjust the PR.

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

I have studied the source code: we already have a CasTicketGrantingTicketDestroyedEvent triggered after a logoutManager.performLogout call and before a ticketRegistry.deleteTicket call in the DefaultSingleLogoutRequestExecutor.

I guess I could add a CasRequestSingleLogoutEvent event triggered from the DefaultTicketRegistryCleaner component and from the AbstractTicketRegistry.getTicket method. This would perform a logout manager call in the new CasAuthenticationAuthenticationEventListener.handleCasRequestSingleLogoutEvent method.

That said, we have another issue in the source code: if we call the deleteSingleTicket method for an expired TGT in the AbstractTicketRegistry.getTicket method, only the TGT is removed, the result is not the same as calling the deleteTicket method which will remove all the tickets related to the TGT (this is what is called in the cleaner or for an explicit logout in the DefaultSingleLogoutRequestExecutor).

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

I forgot to mention that the CasTicketGrantingTicketDestroyedEvent is not published from the cleaner, nor from the AbstractTicketRegistry.getTicket method although we removed the TGT: is this done on purpose? Is this event reserved to explicit TGT removal?

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

In the DefaultTicketRegistryCleaner, I would change:

    @Override
    public int cleanTicket(final Ticket ticket) {
        return lockRepository.execute(ticket.getId(), () -> {
            try {
                if (ticket instanceof final TicketGrantingTicket tgt) {
                    applicationContext.publishEvent(new CasRequestSingleLogoutEvent(this, tgt, null));
                    applicationContext.publishEvent(new CasTicketGrantingTicketDestroyedEvent(this, tgt, null));
                }
            } catch (final Throwable e) {
                LoggingUtils.error(LOGGER, e);
            }

            try {
                LOGGER.debug("Cleaning up expired ticket [{}]", ticket.getId());
                return ticketRegistry.deleteTicket(ticket);
            } catch (final Throwable e) {
                LoggingUtils.error(LOGGER, e);
                return 0;
            }
        }).orElse(0);
    }

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

In the AbstractTicketRegistry, I would change:

    @Override
    public Ticket getTicket(final String ticketId) {
        val returnTicket = getTicket(ticketId, ticket -> {
            if (ticket.isExpired()) {
                val ticketAgeSeconds = getTicketAgeSeconds(ticket);
                LOGGER.debug("Ticket [{}] has expired according to policy [{}] after [{}] seconds and [{}] uses and will be removed from the ticket registry",
                    ticketId, ticket.getExpirationPolicy().getName(), ticketAgeSeconds, ticket.getCountOfUses());
                if (ticket instanceof final TicketGrantingTicket tgt) {
                    val clientInfo = ClientInfoHolder.getClientInfo();
                    applicationContext.publishEvent(new CasRequestSingleLogoutEvent(this, tgt, clientInfo));
                    applicationContext.publishEvent(new CasTicketGrantingTicketDestroyedEvent(this, tgt, clientInfo));
                }
                try {
                    deleteTicket(ticket);
                } catch (final Exception e) {
                    LOGGER.warn("Deletion failed for ticket [{}]", ticket.getId());
                }
                return false;
            }
            return true;
        });
        if (returnTicket != null) {
            val ticketAgeSeconds = getTicketAgeSeconds(returnTicket);
            if (ticketAgeSeconds < -1) {
                LOGGER.warn("Ticket created [{}] second(s) in the future. Check time synchronization on all servers.", ticketAgeSeconds * -1);
            }
        }
        return returnTicket;
    }

What do you think?

@leleuj
Copy link
Contributor Author

leleuj commented Sep 23, 2024

As our discussion led us to a new approach, I opened a new PR: #6161

Some tests may fail. Waiting for the build to pass and give a better overview.

@leleuj leleuj closed this Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants