Service binding doesn't work reliably ever since support for inactive datasources has been introduced #44297

Open
michalvavrik opened this issue Nov 4, 2024 · 19 comments
Labels
area/hibernate-orm Hibernate ORM area/hibernate-reactive Hibernate Reactive area/kubernetes kind/bug Something isn't working triage/needs-feedback We are waiting for feedback.

Comments

@michalvavrik
Member

Describe the bug

Now, this is very hard to reproduce for me, but our periodic builds testing service binding fail pretty often now that #41929 is merged. They fail in both JVM and native mode, though native mode seems to fail more often. Both Hibernate ORM and Hibernate Reactive are affected.

Expected behavior

This makes using service binding in production tricky, because you never know how many attempts it will take to deploy a pod (e.g. what the constellation of starts will be and so on). So IMO if service binding is in place, the datasource should not be deactivated.

Actual behavior

There are 2 different behaviors, one for Hibernate ORM, one for Hibernate Reactive. Both result in an application startup failure.

Hibernate ORM (native mode logs):

03:59:49,778 INFO  [app] 03:59:48,321 Failed to start application: java.lang.RuntimeException: Failed to start quarkus
03:59:49,778 INFO  [app] 	at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
03:59:49,778 INFO  [app] 	at io.quarkus.runtime.Application.start(Application.java:101)
03:59:49,778 INFO  [app] 	at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:121)
03:59:49,778 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
03:59:49,778 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
03:59:49,778 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
03:59:49,778 INFO  [app] 	at io.quarkus.runner.GeneratedMain.main(Unknown Source)
03:59:49,778 INFO  [app] Caused by: io.quarkus.runtime.configuration.ConfigurationException: Unable to find datasource '<default>' for persistence unit '<default>': Bean is not active: SYNTHETIC bean [class=io.agroal.api.AgroalDataSource, id=sqqLi56D50iCdXmOjyjPSAxbLu0]
03:59:49,778 INFO  [app] Reason: Datasource '<default>' was deactivated automatically because its URL is not set. To activate the datasource, set configuration property 'quarkus.datasource.jdbc.url'. Refer to https://quarkus.io/guides/datasource for guidance.
03:59:49,778 INFO  [app] To avoid this exception while keeping the bean inactive:
03:59:49,778 INFO  [app] 	- Configure all extensions consuming this bean as inactive as well, if they allow it, e.g. 'quarkus.someextension.active=false'
03:59:49,781 INFO  [app] 	- Make sure that custom code only accesses this bean if it is active
03:59:49,781 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.PersistenceUnitUtil.unableToFindDataSource(PersistenceUnitUtil.java:115)
03:59:49,781 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.FastBootHibernatePersistenceProvider.injectDataSource(FastBootHibernatePersistenceProvider.java:392)
03:59:49,781 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.FastBootHibernatePersistenceProvider.buildRuntimeSettings(FastBootHibernatePersistenceProvider.java:209)
03:59:49,781 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.FastBootHibernatePersistenceProvider.getEntityManagerFactoryBuilderOrNull(FastBootHibernatePersistenceProvider.java:180)
03:59:49,781 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.FastBootHibernatePersistenceProvider.createEntityManagerFactory(FastBootHibernatePersistenceProvider.java:66)
03:59:49,782 INFO  [app] 	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:80)
03:59:49,782 INFO  [app] 	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:55)
03:59:49,782 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.JPAConfig$LazyPersistenceUnit.get(JPAConfig.java:154)
03:59:49,782 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.JPAConfig$1.run(JPAConfig.java:61)
03:59:49,782 INFO  [app] 	at java.base@21.0.5/java.lang.Thread.runWith(Thread.java:1596)
03:59:49,782 INFO  [app] 	at java.base@21.0.5/java.lang.Thread.run(Thread.java:1583)
03:59:49,782 INFO  [app] 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:896)
03:59:49,782 INFO  [app] 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:872)
03:59:49,782 INFO  [app] Caused by: io.quarkus.arc.InactiveBeanException: Bean is not active: SYNTHETIC bean [class=io.agroal.api.AgroalDataSource, id=sqqLi56D50iCdXmOjyjPSAxbLu0]
03:59:49,782 INFO  [app] Reason: Datasource '<default>' was deactivated automatically because its URL is not set. To activate the datasource, set configuration property 'quarkus.datasource.jdbc.url'. Refer to https://quarkus.io/guides/datasource for guidance.
03:59:49,782 INFO  [app] To avoid this exception while keeping the bean inactive:
03:59:49,782 INFO  [app] 	- Configure all extensions consuming this bean as inactive as well, if they allow it, e.g. 'quarkus.someextension.active=false'
03:59:49,782 INFO  [app] 	- Make sure that custom code only accesses this bean if it is active
03:59:49,782 INFO  [app] 	at io.agroal.api.AgroalDataSource_sqqLi56D50iCdXmOjyjPSAxbLu0_Synthetic_Bean.doCreate(Unknown Source)
03:59:49,782 INFO  [app] 	at io.agroal.api.AgroalDataSource_sqqLi56D50iCdXmOjyjPSAxbLu0_Synthetic_Bean.create(Unknown Source)
03:59:49,782 INFO  [app] 	at io.agroal.api.AgroalDataSource_sqqLi56D50iCdXmOjyjPSAxbLu0_Synthetic_Bean.create(Unknown Source)
03:59:49,782 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
03:59:49,782 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
03:59:49,783 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
03:59:49,783 INFO  [app] 	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c9(Unknown Source)
03:59:49,783 INFO  [app] 	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
03:59:49,784 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
03:59:49,784 INFO  [app] 	at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
03:59:49,784 INFO  [app] 	at io.agroal.api.AgroalDataSource_sqqLi56D50iCdXmOjyjPSAxbLu0_Synthetic_ClientProxy.arc$delegate(Unknown Source)
03:59:49,784 INFO  [app] 	at io.agroal.api.AgroalDataSource_sqqLi56D50iCdXmOjyjPSAxbLu0_Synthetic_ClientProxy.arc_contextualInstance(Unknown Source)
03:59:49,784 INFO  [app] 	at io.quarkus.arc.ClientProxy.unwrap(ClientProxy.java:52)
03:59:49,784 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.FastBootHibernatePersistenceProvider.injectDataSource(FastBootHibernatePersistenceProvider.java:390)
03:59:49,784 INFO  [app] 	... 11 more
03:59:49,785 INFO  [app] [app-5cbd9d6b76-pzxdf-app] __  ____  __  _____   ___  __ ____  ______ 
03:59:49,794 INFO  [app]  --/ __ \/ / / / _ | / _ \/ //_/ / / / __/ 
03:59:49,795 INFO  [app]  -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \   
03:59:49,795 INFO  [app] --\___\_\____/_/ |_/_/|_/_/|_|\____/___/   
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.host" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.port" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.dbname" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.tls.crt" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.password" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.database" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,795 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.user" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.ca.crt" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.username" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.verifier" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.jdbc-uri" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.tls.key" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo
03:59:49,796 INFO  [app] 03:59:47,706 Unrecognized configuration key "quarkus.service-binding.app-postgresql.uri" was provided; it will be ignored; verify that the dependency extension for this configuration is set or that you did not make a typo

Hibernate Reactive (native mode logs):

04:07:27,906 INFO  [app] 04:07:26,176 Failed to start application: java.lang.RuntimeException: Failed to start quarkus
04:07:27,906 INFO  [app] 	at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
04:07:27,906 INFO  [app] 	at io.quarkus.runtime.Application.start(Application.java:101)
04:07:27,906 INFO  [app] 	at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:121)
04:07:27,906 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
04:07:27,906 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
04:07:27,906 INFO  [app] 	at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
04:07:27,906 INFO  [app] 	at io.quarkus.runner.GeneratedMain.main(Unknown Source)
04:07:27,906 INFO  [app] Caused by: io.quarkus.runtime.configuration.ConfigurationException: Unable to find datasource '<default>' for persistence unit 'default-reactive': Bean is not active: SYNTHETIC bean [class=io.vertx.pgclient.PgPool, id=WVL9cdM2vfa8AHSEpmajClhheoQ]
04:07:27,906 INFO  [app] Reason: Datasource '<default>' was deactivated automatically because its URL is not set. To activate the datasource, set configuration property 'quarkus.datasource.reactive.url'. Refer to https://quarkus.io/guides/datasource for guidance.
04:07:27,906 INFO  [app] To avoid this exception while keeping the bean inactive:
04:07:27,906 INFO  [app] 	- Configure all extensions consuming this bean as inactive as well, if they allow it, e.g. 'quarkus.someextension.active=false'
04:07:27,906 INFO  [app] 	- Make sure that custom code only accesses this bean if it is active
04:07:27,907 INFO  [app] 	- Inject the bean with 'Instance<io.vertx.pgclient.PgPool>' instead of 'io.vertx.pgclient.PgPool'
04:07:27,907 INFO  [app] This bean is injected into:
04:07:27,907 INFO  [app] 	- 
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.PersistenceUnitUtil.unableToFindDataSource(PersistenceUnitUtil.java:115)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.reactive.runtime.FastBootHibernateReactivePersistenceProvider.registerVertxAndPool(FastBootHibernateReactivePersistenceProvider.java:294)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.reactive.runtime.FastBootHibernateReactivePersistenceProvider.rewireMetadataAndExtractServiceRegistry(FastBootHibernateReactivePersistenceProvider.java:218)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.reactive.runtime.FastBootHibernateReactivePersistenceProvider.getEntityManagerFactoryBuilderOrNull(FastBootHibernateReactivePersistenceProvider.java:195)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.reactive.runtime.FastBootHibernateReactivePersistenceProvider.createEntityManagerFactory(FastBootHibernateReactivePersistenceProvider.java:91)
04:07:27,907 INFO  [app] 	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:80)
04:07:27,907 INFO  [app] 	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:55)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.JPAConfig$LazyPersistenceUnit.get(JPAConfig.java:154)
04:07:27,907 INFO  [app] 	at io.quarkus.hibernate.orm.runtime.JPAConfig$1.run(JPAConfig.java:61)
04:07:27,907 INFO  [app] 	at java.base@21.0.5/java.lang.Thread.runWith(Thread.java:1596)
04:07:27,907 INFO  [app] 	at java.base@21.0.5/java.lang.Thread.run(Thread.java:1583)
04:07:27,907 INFO  [app] 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:896)
04:07:27,908 INFO  [app] 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:872)
04:07:27,908 INFO  [app] Caused by: io.quarkus.arc.InactiveBeanException: Bean is not active: SYNTHETIC bean [class=io.vertx.pgclient.PgPool, id=WVL9cdM2vfa8AHSEpmajClhheoQ]
04:07:27,908 INFO  [app] Reason: Datasource '<default>' was deactivated automatically because its URL is not set. To activate the datasource, set configuration property 'quarkus.datasource.reactive.url'. Refer to https://quarkus.io/guides/datasource for guidance.
04:07:27,908 INFO  [app] To avoid this exception while keeping the bean inactive:
04:07:27,908 INFO  [app] 	- Configure all extensions consuming this bean as inactive as well, if they allow it, e.g. 'quarkus.someextension.active=false'
04:07:27,908 INFO  [app] 	- Make sure that custom code only accesses this bean if it is active
04:07:27,908 INFO  [app] 	- Inject the bean with 'Instance<io.vertx.pgclient.PgPool>' instead of 'io.vertx.pgclient.PgPool'
04:07:27,908 INFO  [app] This bean is injected into:
04:07:27,908 INFO  [app] 	- 
04:07:27,908 INFO  [app] 	at io.vertx.pgclient.PgPool_WVL9cdM2vfa8AHSEpmajClhheoQ_Synthetic_Bean.doCreate(Unknown Source)
04:07:27,908 INFO  [app] 	at io.vertx.pgclient.PgPool_WVL9cdM2vfa8AHSEpmajClhheoQ_Synthetic_Bean.create(Unknown Source)
04:07:27,908 INFO  [app] 	at io.vertx.pgclient.PgPool_WVL9cdM2vfa8AHSEpmajClhheoQ_Synthetic_Bean.create(Unknown Source)
04:07:27,908 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
04:07:27,908 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
04:07:27,908 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
04:07:27,908 INFO  [app] 	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c4(Unknown Source)
04:07:27,909 INFO  [app] 	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
04:07:27,909 INFO  [app] 	at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
04:07:27,909 INFO  [app] 	at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
04:07:27,909 INFO  [app] 	at io.vertx.pgclient.PgPool_WVL9cdM2vfa8AHSEpmajClhheoQ_Synthetic_ClientProxy.arc$delegate(Unknown Source)
04:07:27,909 INFO  [app] 	at io.vertx.pgclient.PgPool_WVL9cdM2vfa8AHSEpmajClhheoQ_Synthetic_ClientProxy.arc_contextualInstance(Unknown Source)
04:07:27,909 INFO  [app] 	at io.quarkus.arc.ClientProxy.unwrap(ClientProxy.java:52)
04:07:27,909 INFO  [app] 	at io.quarkus.hibernate.reactive.runtime.FastBootHibernateReactivePersistenceProvider.registerVertxAndPool(FastBootHibernateReactivePersistenceProvider.java:292)
04:07:27,909 INFO  [app] 	... 11 more

How to Reproduce?

Steps to reproduce the behavior:

  1. Log in to an OpenShift cluster (both 4.12 and 4.17 are failing)
  2. git clone git@github.com:michalvavrik/quarkus-test-suite.git -b feature/sb-reactive-reproducer
  3. cd quarkus-test-suite/service-binding/postgresql-crunchy-reactive (for Hibernate Reactive) or cd quarkus-test-suite/service-binding/postgresql-crunchy-classic (for Hibernate ORM)
  4. mvn clean verify -Dopenshift

Output of uname -a or ver

RHEL 8

Output of java -version

Temurin 21

Quarkus version or git rev

999-SNAPSHOT

Build tool (ie. output of mvnw --version or gradlew --version)

Maven 3.9.4

Additional information

No response

@michalvavrik michalvavrik added area/hibernate-orm Hibernate ORM area/hibernate-reactive Hibernate Reactive area/kubernetes kind/bug Something isn't working labels Nov 4, 2024

quarkus-bot bot commented Nov 4, 2024

/cc @geoand (openshift), @iocanel (openshift)

@michalvavrik
Member Author

/cc @yrodiere @radcortez

@yrodiere
Member

yrodiere commented Nov 4, 2024

Bear with me, I have never used service binding, but:

Expected behavior

This makes using service binding in production tricky, because you never know how many attempts it will take to deploy a pod (e.g. what the constellation of starts will be and so on). So IMO if service binding is in place, the datasource should not be deactivated.

You are expecting the datasource to not be deactivated, even though no URL was configured? But that means the datasource won't work... Do you just want the app to start without error, but fail on first request?

@yrodiere yrodiere added the triage/needs-feedback We are waiting for feedback. label Nov 4, 2024
@michalvavrik
Member Author

michalvavrik commented Nov 4, 2024

You are expecting the datasource to not be deactivated, even though no URL was configured? But that means the datasource won't work... Do you just want the app to start without error, but fail on first request?

No, I expect that the environment variables set by the service binding operator are used and that they contain the datasource URL (or they contain a link to where the ds URL value is, I don't remember how it works). There is some race.

Regarding the triage/needs-feedback label, @yrodiere, this test worked for years (not sure how long, 2+?). So I think it's a bug.

@michalvavrik
Member Author

michalvavrik commented Nov 4, 2024

For context, @yrodiere, our test requests an entity from the database and receives it, so no failure is expected.

@yrodiere
Member

yrodiere commented Nov 4, 2024

No, I expect that the environment variables set by the service binding operator are used and that they contain the datasource URL (or they contain a link to where the ds URL value is, I don't remember how it works). There is some race.

Okay. The check of whether a datasource is active (and its initialization, when active) now happens on CDI startup. If service binding is manipulating config concurrently with CDI startup, or as part of CDI startup (CDI bean startup order is undefined), that could explain the problem.
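
To make the ordering hypothesis above concrete, here is a minimal sketch in plain Java. It is not Quarkus code: the `config` map and the `urlIsSet`, `eagerCheck`, and `lazyCheck` names are hypothetical and exist only for illustration. The only assumption is that the outcome of an "is the URL set?" check depends on when the check actually runs relative to the moment the URL becomes visible in config.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class ActiveCheckOrdering {
    // Stand-in for runtime configuration. This is NOT the Quarkus config API.
    public static final Map<String, String> config = new HashMap<>();

    // Hypothetical check: the datasource counts as active only if a URL is present.
    public static boolean urlIsSet() {
        return config.containsKey("quarkus.datasource.jdbc.url");
    }

    // Runs the check right away; the returned supplier only replays the captured result.
    public static Supplier<Boolean> eagerCheck() {
        boolean snapshot = urlIsSet();
        return () -> snapshot;
    }

    // Defers the check until the supplier is actually called.
    public static Supplier<Boolean> lazyCheck() {
        return ActiveCheckOrdering::urlIsSet;
    }

    public static void main(String[] args) {
        Supplier<Boolean> eager = eagerCheck();
        Supplier<Boolean> lazy = lazyCheck();

        // Simulate a config source that contributes the URL only after the checks were created.
        config.put("quarkus.datasource.jdbc.url", "jdbc:postgresql://db.example:5432/app");

        System.out.println("eager: " + eager.get()); // prints "eager: false"
        System.out.println("lazy: " + lazy.get());   // prints "lazy: true"
    }
}
```

If something like the eager variant ran before the service binding values were visible, the datasource would be reported as inactive even though the URL shows up a moment later. Whether that is what actually happens here is unconfirmed.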

@iocanel could you please point us to how/where the datasource URL is set in the service binding extension(s)?

Regarding the triage/needs-feedback label, @yrodiere, this test worked for years (not sure how long, 2+?). So I think it's a bug.

That label was just because I was waiting for your answer... Nobody is questioning whether it's a bug, I'm not sure why you're bringing this up.

@michalvavrik
Member Author

That label was just because I was waiting for your answer... Nobody is questioning whether it's a bug, I'm not sure why you're bringing this up.

Sorry.

@geoand
Contributor

geoand commented Nov 4, 2024

could you please point us to how/where the datasource URL is set in the service binding extension(s)?

It looks for that information in a specific file on the file system - see this for example
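
As a rough illustration of the "files on the file system" idea, here is a minimal sketch in plain Java. It is not the Quarkus service binding implementation: `BindingDirReader`, `readBinding`, and `jdbcUrl` are hypothetical names, the key names (`host`, `port`, `database`, `dbname`, `jdbc-uri`) are taken from the warnings in the logs above, and the default port and URL format are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

public class BindingDirReader {
    // Reads each regular file in a binding directory as one key/value pair:
    // the file name is the key, the trimmed file content is the value.
    public static Map<String, String> readBinding(Path bindingDir) {
        Map<String, String> props = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(bindingDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    props.put(entry.getFileName().toString(), Files.readString(entry).trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Hypothetical mapping from binding keys to a JDBC URL (assumed, not the real logic).
    // Returns null when there is not enough information to build a URL.
    public static String jdbcUrl(Map<String, String> props) {
        String explicit = props.get("jdbc-uri");
        if (explicit != null) {
            return explicit;
        }
        String host = props.get("host");
        String database = props.getOrDefault("database", props.get("dbname"));
        if (host == null || database == null) {
            return null;
        }
        String port = props.getOrDefault("port", "5432"); // assumed PostgreSQL default
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway binding directory so the sketch runs anywhere.
        Path dir = Files.createTempDirectory("binding");
        Files.writeString(dir.resolve("host"), "db.example\n");
        Files.writeString(dir.resolve("port"), "5432\n");
        Files.writeString(dir.resolve("database"), "app\n");

        System.out.println(jdbcUrl(readBinding(dir))); // prints "jdbc:postgresql://db.example:5432/app"
    }
}
```

In a sketch like this, a null result would correspond to the "URL is not set" situation, for example if the files were read before the binding had been projected into the container.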

@michalvavrik
Member Author

I think they are just config sources like any other (judging by KubernetesConfigSourceFactoryBuilder and io.quarkus.kubernetes.service.binding.runtime.DatasourceServiceBindingConfigSourceFactory). Is there a chance that the DS code that determines whether the bean is active or inactive is executed before the runtime config is ready? I'll leave debugging to you. Thanks for looking into it. I cannot reproduce it locally, but I have 5+ failures (counted so far) in Jenkins, so if you come up with a potential fix, I suppose I can run it several times and see if it helped.

@iocanel
Contributor

iocanel commented Nov 4, 2024

I haven't touched that code since the service binding operator got deprecated.
So, I'll have to dig to refresh my memory.
Also, I am wondering if we should just deprecate and drop the extension ourselves. @maxandersen thoughts?

@geoand
Contributor

geoand commented Nov 4, 2024

I think they are just config sources like any other

Correct

@geoand
Contributor

geoand commented Nov 4, 2024

Also, I am wondering if we should just deprecate and drop the extension ourselves. @maxandersen thoughts?

I was wondering the same...

@maxandersen
Member

Let's make sure we don't conflate the deprecation of the service binding operator with support for the service binding APIs.

I'll check where service binding API support actually stands outside the OpenShift setup.

But that still leaves a concern about why the new "inactive" somehow causes race conditions - that shouldn't happen?

@geoand geoand removed the triage/needs-feedback We are waiting for feedback. label Nov 5, 2024
@michalvavrik
Member Author

@yrodiere I can help you run it against OCP if you want (give you temporary access to an OCP cluster inside the VPN), but as I said, it's flaky, it won't happen every time, so you will need patience.

@yrodiere
Member

yrodiere commented Nov 6, 2024

But that still leaves a concern about why the new "inactive" somehow causes race conditions - that shouldn't happen?

I don't see how the new "inactive" stuff could be the root cause.

Though introducing that feature required changing when exactly we check the JDBC URL (we do it earlier, I think, during CDI initialization)... so maybe that made some pre-existing race condition between CDI init and config more visible? Seems unlikely, but at this point it's all I have :/

@yrodiere I can help you run it against OCP if you want (give you temporary access to an OCP cluster inside the VPN), but as I said, it's flaky, it won't happen every time, so you will need patience.

I'm afraid I'll also need an ungodly amount of time... And without a debugger I'm not even sure it would help investigate.

Could you perhaps enable more logging so that we can compare what happens on your CI when the bug appears and when it doesn't? I'm thinking of enabling TRACE or at least DEBUG for io.quarkus.datasource, io.quarkus.agroal, io.quarkus.config, io.quarkus.deployment.configuration, io.quarkus.runtime.configuration, io.quarkus.kubernetes.service.

EDIT: Oh, and also io.quarkus.arc, io.quarkus.deployment.bean, io.quarkus.runtime.bean.
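
For reference, the request above could be expressed roughly like this in `application.properties`. This is a sketch, not a verified setup: the category names are copied from the comment, DEBUG is picked as the lower-risk option, and going to TRACE may additionally require raising the minimum log level at build time.

```properties
# Sketch only: per-category log levels for the categories listed above.
quarkus.log.category."io.quarkus.datasource".level=DEBUG
quarkus.log.category."io.quarkus.agroal".level=DEBUG
quarkus.log.category."io.quarkus.config".level=DEBUG
quarkus.log.category."io.quarkus.deployment.configuration".level=DEBUG
quarkus.log.category."io.quarkus.runtime.configuration".level=DEBUG
quarkus.log.category."io.quarkus.kubernetes.service".level=DEBUG
quarkus.log.category."io.quarkus.arc".level=DEBUG
quarkus.log.category."io.quarkus.deployment.bean".level=DEBUG
quarkus.log.category."io.quarkus.runtime.bean".level=DEBUG
```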

@yrodiere yrodiere self-assigned this Nov 6, 2024
@yrodiere
Member

yrodiere commented Nov 6, 2024

Hey @Ladicek @radcortez , can you confirm configuration doesn't depend on CDI at all and thus should be initialized completely once CDI starts initializing beans?

I.e. do you know if the relative order between these two events is clearly defined?

  1. Initialization of runtime CDI beans, in particular active checks:
    public Supplier<ActiveResult> agroalDataSourceCheckActiveSupplier(String dataSourceName) {
  2. Definition of runtime config sources, in particular this one:

@Ladicek
Contributor

Ladicek commented Nov 6, 2024

Config does not depend on CDI in any way, yes, but I'm not sure if we have a defined ordering between runtime config init and runtime CDI init.

@radcortez
Member

Yes, Config does not depend on CDI in any way, but we had cases where we added such dependencies, which I've removed. I think we don't have such cases anymore.

Remember that Arc does some things during STATIC_INIT, so only the STATIC_INIT configuration is available.

In the presented code, the AgroalRecorder method executes in RUNTIME, and the KubernetesConfigSourceFactoryBuilder is only set to execute for the RUNTIME config, so it should be fine. If the recorder method was for STATIC_INIT, then the issue could be there. Maybe some other previous build step in STATIC_INIT is expecting the Kubernetes config?

@michalvavrik
Member Author

Could you perhaps enable more logging so that we can compare what happens on your CI when the bug appears and when it doesn't? I'm thinking of enabling TRACE or at least DEBUG for io.quarkus.datasource, io.quarkus.agroal, io.quarkus.config, io.quarkus.deployment.configuration, io.quarkus.runtime.configuration, io.quarkus.kubernetes.service.
EDIT: Oh, and also io.quarkus.arc, io.quarkus.deployment.bean, io.quarkus.runtime.bean.

It's a bit problematic because if QE periodic builds are going to have failures, I need to get the message across to the team that it's expected. Perhaps I'll just create a job that runs in many iterations so that I can get the required logs. I'll report back next week.

@yrodiere yrodiere added the triage/needs-feedback We are waiting for feedback. label Dec 4, 2024