
Adapter config cache miss #38

Open
c0c0n3 opened this issue Apr 23, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@c0c0n3
Member

c0c0n3 commented Apr 23, 2020

We cache the adapter's Istio config to be able to request an ID token to add to HTTP requests originating from the mesh. This is a consequence of #25, which implemented a workaround for #24. However, if there's a cache miss, we won't be able to add the ID token to the outbound HTTP request at hand.

Notice the Mixer passes the current config to the adapter on each mesh inbound HTTP call and the adapter caches it every time, replacing any old cached config with the most recent one. So a cache miss can only happen if

  1. The adapter process starts.
  2. An outbound HTTP request is intercepted by the Envoy filter---think Orion notification.
  3. The filter calls the adapter to get an ID token to add to the request.

and no mesh inbound HTTP request gets intercepted before (3). My gut tells me this isn't very likely to happen in practice, especially for high-volume sites, and also considering that client requests will typically hit the mesh before a mesh service tries to call an external service---e.g. a subscription/entity update has to happen before Orion sends out a notification. However unlikely though, the issue is there, so we might occasionally fail to inject ID tokens into outbound Orion notifications.
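For what it's worth, here's a minimal Go sketch of the caching behaviour described above---type and function names are made up for illustration, not the adapter's actual code:

```go
package cache

import "sync"

// AdapterConfig stands in for the Istio config the Mixer hands the adapter.
type AdapterConfig struct {
	IDTokenEndpoint string
}

// ConfigCache keeps only the most recently seen config.
type ConfigCache struct {
	mu     sync.RWMutex
	latest *AdapterConfig
}

// OnInboundCall runs on every mesh inbound HTTP call: the Mixer passes the
// current config and we overwrite whatever was cached before.
func (c *ConfigCache) OnInboundCall(cfg *AdapterConfig) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.latest = cfg
}

// OnOutboundCall runs when the Envoy filter asks for an ID token for an
// outbound request, e.g. an Orion notification. If no inbound call has been
// seen since the adapter process started, this returns false: that's the
// cache miss this issue is about.
func (c *ConfigCache) OnOutboundCall() (*AdapterConfig, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.latest, c.latest != nil
}
```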

We should look for an easy way to solve this. In principle, we should be able to request the current configuration from Galley or the Mixer itself, so that if there's a cache miss we'd have a fallback.
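Building on the sketch above, the fallback could be as simple as a hook we call on a miss---the loader below is just a placeholder for whatever Galley/Mixer lookup we end up with, since we still have to figure out that API:

```go
// ConfigLoader stands in for a "fetch the current config from Galley or the
// Mixer" call; its real shape is still to be determined.
type ConfigLoader func() (*AdapterConfig, error)

// GetOrLoad reuses ConfigCache from the previous snippet: try the cache
// first, fall back to the loader on a miss, and cache whatever it returns.
func (c *ConfigCache) GetOrLoad(fallback ConfigLoader) (*AdapterConfig, error) {
	if cfg, ok := c.OnOutboundCall(); ok {
		return cfg, nil
	}
	cfg, err := fallback()
	if err != nil {
		return nil, err // still no config: we can't inject the ID token
	}
	c.OnInboundCall(cfg) // cache it so the next outbound call doesn't miss
	return cfg, nil
}
```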

c0c0n3 added the bug label Apr 23, 2020
@c0c0n3
Member Author

c0c0n3 commented Apr 23, 2020

Come to think of it, I'm not sure we should spend too much time on this. In fact, the current architecture doesn't cater for reliable delivery of notifications anyway---that would take something like message queues. So we might actually fail to inject ID tokens into outbound Orion notifications occasionally for reasons that seem more likely to me:

  • DAPS server is down
  • Galley or Mixer call fails

and there's no easy way to recover from those in the current synchronous message-passing setting.

@gboege
Collaborator

gboege commented Apr 28, 2020

We'll have to see how this plays out.

  • The DAPS server being down is bumpy in the current test scenarios but should be better in production.
  • Galley or Mixer call failures (I think we can tune this with Istio's resilience features.)

But I can think of many situations where you have a one-time subscription and then ONLY notifications for days/weeks/months.

Maybe we should install a trigger to call the /version endpoint every 100ms... or so... then we only "lose" data for some time... but even there we might be able to stabilize things a bit with resilience features, unless our own Mixer constructions hinder us there.

But yes... notifications might fail, because they are only webhooks.

@c0c0n3
Member Author

c0c0n3 commented Apr 29, 2020

situations where you have a one-time subscription and then ONLY notifications for days/weeks/months.

Um, that shouldn't be a problem though since before a notification goes out some new data must come in---e.g. from an agent. The adapter would catch the HTTP request to update the entity and cache the current config as it gets it from the Mixer. Am I missing something here?

Maybe we should install a trigger to call the /version endpoint every 100ms... or so

Yes, why not. It should be easy enough to do and would give us some peace of mind :-)
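For example, something along these lines should do---a Go sketch, where the URL, the interval and where the poller actually lives are all placeholders:

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// keepAlive polls a cheap endpoint (e.g. /version) so that some inbound mesh
// call keeps refreshing the adapter's config cache.
func keepAlive(url string, every time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			resp, err := http.Get(url)
			if err != nil {
				log.Printf("keep-alive call failed: %v", err)
				continue
			}
			resp.Body.Close() // we only care about the side effect on the cache
		}
	}
}

func main() {
	stop := make(chan struct{})
	// 100ms as suggested above; in practice a longer interval is probably fine.
	go keepAlive("http://orion:1026/version", 100*time.Millisecond, stop)
	select {} // block forever in this toy example
}
```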
