-
Notifications
You must be signed in to change notification settings - Fork 627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NumberFormatException while getting sender sequence number #601
Comments
Looks like the file was existing but empty? Could this be related to #540 ? Are you sure that the access to the store is only done by one thread, I.e. could it be that you started up two services for the same session? |
Yes, file is empty for the members who have not yet done session login. I can't relate this issue with #540, in that issue during session reset StreamClosed would occur and server itself might be running as expected. Switching to other available session would solve our problem. In this case, fix engine itself did not start correctly (Happened only once until now) We have a failover mechanism which is running in active-active mode. In case, one node goes down. users request will be redirected to node2 via WAF where they can do session logon again and continue. So, yes, they share common store files. But application itself is deployed on different machines. |
Do you happen to have logs from both instances during the time of session test and startup? |
Attached are logs from both instances, for security reasons I have masked SenderCompId |
Hi @chrjohn - are the logs from @ajinkyatankhiwale1510 what you need to see? His firm is working with Connamara -- if you need something different, we can help him get what is needed to see to progress the issue. |
Hi @gbirchmeier , I will take another look tomorrow but I think the logs are the correct ones. I'm just unsure at the moment how the problem could happen. To me it looks like a concurrency issue. |
Hi @chrjohn , you are correct, both this machines have a central storage maintained on NFS file share. |
Actually I think I cannot give you a recommendation on this. I don't have any experience using the Is #540 also happening on this infrastructure? That might explain the |
For both nodes, start time and end time is 00:00:00, at this time only initialization should happen. Apart from that, in case initiator sends sequence reset flag then store gets re initialized or system is restarted to load new cfg file other wise during the normal operation store continues as it should be. In fact, node2(erps0jb02) is just a failover node, request comes at this node only in case node 1(erps0jb01) goes down. In that case too, user session is terminated and they have to make session login again on node 2. For performance reasons we decided to go to FileStore. You are correct, #540 is happening on the same node. For Stream Closed, we have custom implementation of Message Store in place currently which is under test. I would update in #540 with the results to close the discussion. |
Do you have a log file from both nodes at that time at the problematic day?
Just out of own interest: NFS |
Yes, I have uploaded logs for both nodes with my earlier comment Just out of own interest: NFS FileStore was faster than using data base? |
I have asked for a log around the time of 00:00. There should be something like "session state not current" in the log. |
server-node 1.log |
I did not find any occurrence of "session state not current". Could you check with grep or similar if it appears at some time? |
I checked event and messages logs but could not find this statement. Any suggestion how we can mitigate this problem? We are anyways implementing MessageStore to handle StreamClosed error with our upcoming release. |
Without knowing exactly what caused the problem, it is impossible to know how to prevent it. I asked already here #601 (comment) : how do you prevent concurrent access to the file store when you have two instances that could possibly initialize the store concurrently? Even when only one node has an active connection, on startup the store is initializing its files. |
Until now, we don't have any mechanism which would prevent concurrent file store access. I got your point, that after restarting both nodes will individually will initialize the file store. There would be couple of scenario (restart, session start/end etc) where concurrency would happen and we will have to live with it. In any case, our expectation is that, file store should not get corrupt. |
The |
Describe the bug
After restarting fix acceptor, we saw an unusual issue, where config was not loaded as expected and saw below error in server logs
11-01-2023 21:14:04,920 INFO [quickfix.ThreadedSocketAcceptor] (ServerService Thread Pool -- 79) Logging out all sessions
11-01-2023 21:14:04,921 ERROR [org.apache.camel.impl.engine.AbstractCamelContext] (ServerService Thread Pool -- 79) Error starting CamelContext (camel-1) due to exception thrown: quickfix.ConfigError: error during session initialization: org.apache.camel.RuntimeCamelException: quickfix.ConfigError: error during session initialization
at org.apache.camel.RuntimeCamelException.wrapRuntimeException(RuntimeCamelException.java:66) [camel-api-3.9.0.jar:3.9.0]
at org.apache.camel.support.service.BaseService.doFail(BaseService.java:409) [camel-api-3.9.0.jar:3.9.0]
at org.apache.camel.support.service.BaseService.fail(BaseService.java:338) [camel-api-3.9.0.jar:3.9.0]
at org.apache.camel.support.service.BaseService.start(BaseService.java:128) [camel-api-3.9.0.jar:3.9.0]
at org.apache.camel.component.quickfixj.QuickfixjComponent.startQuickfixjEngine(QuickfixjComponent.java:147) [camel-quickfix-3.9.0.jar:3.9.0]
at org.apache.camel.component.quickfixj.QuickfixjComponent.onCamelContextStarted(QuickfixjComponent.java:226) [camel-quickfix-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.InternalRouteStartupManager.safelyStartRouteServices(InternalRouteStartupManager.java:196) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.InternalRouteStartupManager.doStartOrResumeRoutes(InternalRouteStartupManager.java:147) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.AbstractCamelContext.doStartCamel(AbstractCamelContext.java:3177) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.AbstractCamelContext.doStartContext(AbstractCamelContext.java:2861) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.AbstractCamelContext.doStart(AbstractCamelContext.java:2812) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.support.service.BaseService.start(BaseService.java:115) [camel-api-3.9.0.jar:3.9.0]
at org.apache.camel.impl.engine.AbstractCamelContext.start(AbstractCamelContext.java:2507) [camel-base-engine-3.9.0.jar:3.9.0]
at org.apache.camel.impl.DefaultCamelContext.start(DefaultCamelContext.java:174) [camel-core-engine-3.9.0.jar:3.9.0]
at com.eurexrepo.ntp.fix.acceptor.StartAcceptor.startQuickFIxCamelContext(StartAcceptor.java:141) [classes:]
at com.eurexrepo.ntp.fix.acceptor.StartAcceptor.initInstance(StartAcceptor.java:78) [classes:]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_322]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_322]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_322]
at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_322]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:389) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:333) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:157) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:422) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1778) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:602) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:524) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:944) [spring-beans-5.3.5.jar:5.3.5]
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918) [spring-context-5.3.5.jar:5.3.5]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583) [spring-context-5.3.5.jar:5.3.5]
at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:401) [spring-web-5.2.22.RELEASE.jar:5.2.22.RELEASE]
at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:292) [spring-web-5.2.22.RELEASE.jar:5.2.22.RELEASE]
at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103) [spring-web-5.2.22.RELEASE.jar:5.2.22.RELEASE]
at io.undertow.servlet.core.ApplicationListeners.contextInitialized(ApplicationListeners.java:187) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at io.undertow.servlet.core.DeploymentManagerImpl$1.call(DeploymentManagerImpl.java:219) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at io.undertow.servlet.core.DeploymentManagerImpl$1.call(DeploymentManagerImpl.java:187) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:42) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1530)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1530)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1530)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1530)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1530)
at io.undertow.servlet.core.DeploymentManagerImpl.deploy(DeploymentManagerImpl.java:255) [undertow-servlet-2.2.5.Final-redhat-00001.jar:2.2.5.Final-redhat-00001]
at org.wildfly.extension.undertow.deployment.UndertowDeploymentService.startContext(UndertowDeploymentService.java:96)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentService$1.run(UndertowDeploymentService.java:78)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_322]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_322]
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
at java.lang.Thread.run(Thread.java:750) [rt.jar:1.8.0_322]
at org.jboss.threads.JBossThread.run(JBossThread.java:513)
Caused by: quickfix.ConfigError: error during session initialization
at quickfix.mina.acceptor.AbstractSocketAcceptor.createSessions(AbstractSocketAcceptor.java:251) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.mina.acceptor.AbstractSocketAcceptor.startAcceptingConnections(AbstractSocketAcceptor.java:98) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.ThreadedSocketAcceptor.start(ThreadedSocketAcceptor.java:103) [quickfixj-core-2.3.0.jar:2.3.0]
at org.apache.camel.component.quickfixj.QuickfixjEngine.doStart(QuickfixjEngine.java:237) [camel-quickfix-3.9.0.jar:3.9.0]
at org.apache.camel.support.service.BaseService.start(BaseService.java:115) [camel-api-3.9.0.jar:3.9.0]
... 55 more
Caused by: java.lang.RuntimeException: java.lang.NumberFormatException: For input string: ""
at quickfix.FileStoreFactory.create(FileStoreFactory.java:82) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.Session.(Session.java:526) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.DefaultSessionFactory.create(DefaultSessionFactory.java:224) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.mina.acceptor.AbstractSocketAcceptor.setupSession(AbstractSocketAcceptor.java:214) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.mina.acceptor.AbstractSocketAcceptor.createSessions(AbstractSocketAcceptor.java:245) [quickfixj-core-2.3.0.jar:2.3.0]
... 59 more
Caused by: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) [rt.jar:1.8.0_322]
at java.lang.Integer.parseInt(Integer.java:592) [rt.jar:1.8.0_322]
at java.lang.Integer.parseInt(Integer.java:615) [rt.jar:1.8.0_322]
at quickfix.FileStore.initializeSequenceNumbers(FileStore.java:160) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.FileStore.initializeCache(FileStore.java:118) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.FileStore.initialize(FileStore.java:112) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.FileStore.(FileStore.java:96) [quickfixj-core-2.3.0.jar:2.3.0]
at quickfix.FileStoreFactory.create(FileStoreFactory.java:80) [quickfixj-core-2.3.0.jar:2.3.0]
... 63 more
Log suggests that issue might be during initializing sequence store, below is the snippet from where exception was thrown.
In statement cache.setNextSenderMsgSeqNum(Integer.parseInt(s));
private void initializeSequenceNumbers() throws IOException {
senderSequenceNumberFile.seek(0);
if (senderSequenceNumberFile.length() > 0) {
final String s = senderSequenceNumberFile.readUTF();
cache.setNextSenderMsgSeqNum(Integer.parseInt(s));
}
targetSequenceNumberFile.seek(0);
if (targetSequenceNumberFile.length() > 0) {
final String s = targetSequenceNumberFile.readUTF();
cache.setNextTargetMsgSeqNum(Integer.parseInt(s));
}
}
To Reproduce
Happened once, not able to reproduce.
Expected behavior
When server is started, sequence should be reset as expected and should not throw numberformat exception
system information:
fix-acceptor-settings.txt
The text was updated successfully, but these errors were encountered: