-
Notifications
You must be signed in to change notification settings - Fork 624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stale file handle and Stream Closed issues after session end time #540
Comments
Seems to be related to #399 Any idea how this could be reproduced without two nodes since I don't have such a setup? |
Node 1(machine 1) fix-acceptor-settings.cfg [SESSION] Node 2(machine 2 failover) fix-acceptor-settings.cfg [SESSION] Without two nodes its difficult to reproduce, on single node, starting two instances of acceptor and giving different SocketAcceptPort would be the identical scenario in this case. e.g. node 2's SocketAcceptPort can be 22002 |
Are you able to reproduce this in a test environment? If yes, could you test a pull request branch in your test environment? |
Yes, we were able to replicate the scenario in test environment, we used two separate machines hosted our acceptor on both machines and had FileStore shared between machines. However, I'm not sure if I could test any pull request in our environment. could you please elaborate your solution of ignoring the IOException and how it will help us to establish FIX session again? |
Currently the |
Thanks for your explanation, could you please let me know how can we fix this issue? Generally, in case of IOException there should be a finally block which must close this files or this is something which is currently missing? For time being, do you suggest any settings which would not reset the session, so such IOException do not occur? let's say using NonStopSession |
Currently the Using |
Sequence reset can be done by initiator at SessionLogon, and that is perfectly fine for us. I could try NonStopSession and see if that helps our case. Otherwise, we will have to somehow handle the exception to ignore IOException. Do you have any idea in that case what can be done? |
I am not sure, but you could try to extend the |
do you see any issues with NonStopSession if the sequence number are not reset? |
At some point (depending on how much messages you will send/receive) the sequence number will probably overflow (AFAIR we have a bug there). Moreover, the larger your store gets, the slower it will become. Potentially the counterparty could ask for a resend of all of your messages which could be billions of messages. |
I do agree with your approach of having our own implementation of FileStore, on that lines I have managed to implement MessageStore and in reset method of my implementation I have handled exceptions instead of throwing, do you think this would be a fine enough? |
Describe the bug
We have two nodes(different physical machines) working in active-active mode, where only one node is supposed to handle all incoming traffic. In case of failover, F5 would transfer all the traffic to node 2 and then node 2 will refresh its state and continue processing. Both nodes share the same store settings and files. Its observed that during daily end time at 00:00:00 UTC. Node 1 first throws stale file handle error and after some time throws stream closed error indefinitely. End user then cannot establish FIX session, until both node 1 and node 2 are restarted. It's our requirement to have file based store.
20220824-00:00:00.582: Error reading/writing in MessageStore
java.io.IOException: Stale file handle
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at quickfix.FileStore.set(FileStore.java:373)
at quickfix.SessionState.set(SessionState.java:301)
at quickfix.Session.persist(Session.java:2563)
at quickfix.Session.sendRaw(Session.java:2622)
at quickfix.Session.generateLogout(Session.java:1479)
at quickfix.Session.generateLogout(Session.java:1448)
at quickfix.Session.reset(Session.java:884)
at quickfix.Session.resetIfSessionNotCurrent(Session.java:3022)
at quickfix.Session.next(Session.java:1921)
at quickfix.mina.SessionConnector$SessionTimerTask.run(SessionConnector.java:347)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
and then
20220824-00:06:50.046: Error during message processing
quickfix.RuntimeError: quickfix.RuntimeError: java.io.IOException: Stream Closed
at quickfix.Session.next(Session.java:1181)
at quickfix.Session.next(Session.java:1228)
at quickfix.mina.ThreadPerSessionEventHandlingStrategy$MessageDispatchingThread.doRun(ThreadPerSessionEventHandlingStrategy.java:222)
at quickfix.mina.ThreadPerSessionEventHandlingStrategy$ThreadAdapter.run(ThreadPerSessionEventHandlingStrategy.java:146)
at java.lang.Thread.run(Thread.java:750)
Caused by: quickfix.RuntimeError: java.io.IOException: Stream Closed
at quickfix.SessionState.reset(SessionState.java:387)
at quickfix.Session.resetState(Session.java:2665)
at quickfix.Session.reset(Session.java:887)
at quickfix.Session.resetIfSessionNotCurrent(Session.java:3022)
at quickfix.Session.nextLogon(Session.java:2119)
at quickfix.Session.next(Session.java:1050)
... 4 more
Caused by: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at quickfix.FileStore.close(FileStore.java:218)
at quickfix.FileStore.close(FileStore.java:209)
at quickfix.FileStore.closeAndDeleteFiles(FileStore.java:223)
at quickfix.FileStore.initialize(FileStore.java:101)
at quickfix.FileStore.reset(FileStore.java:405)
at quickfix.SessionState.reset(SessionState.java:385)
... 9 more
Suppressed: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
... 15 more
Cause: java.io.IOException: Stream Closed
quickfix.RuntimeError: java.io.IOException: Stream Closed
at quickfix.SessionState.reset(SessionState.java:387)
at quickfix.Session.resetState(Session.java:2665)
at quickfix.Session.reset(Session.java:887)
at quickfix.Session.resetIfSessionNotCurrent(Session.java:3022)
at quickfix.Session.nextLogon(Session.java:2119)
at quickfix.Session.next(Session.java:1050)
at quickfix.Session.next(Session.java:1228)
at quickfix.mina.ThreadPerSessionEventHandlingStrategy$MessageDispatchingThread.doRun(ThreadPerSessionEventHandlingStrategy.java:222)
at quickfix.mina.ThreadPerSessionEventHandlingStrategy$ThreadAdapter.run(ThreadPerSessionEventHandlingStrategy.java:146)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at quickfix.FileStore.close(FileStore.java:218)
at quickfix.FileStore.close(FileStore.java:209)
at quickfix.FileStore.closeAndDeleteFiles(FileStore.java:223)
at quickfix.FileStore.initialize(FileStore.java:101)
at quickfix.FileStore.reset(FileStore.java:405)
at quickfix.SessionState.reset(SessionState.java:385)
... 9 more
Suppressed: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
... 15 more
To Reproduce
Node 1 and Node 2 has Start Time and End Time as 00:00:00-UTC - 00:00:00-UTC sharing same FileStorePath (configured for failover scenario) and an active FIX connection on node 1, during session end time, it's sometimes seen that file resetting does not work as expected.
Expected behavior
In case of failover setup, if members are connected to FIX environment. Session reset should happen seamlessly without any error.
system information:
Additional context
It works well with single node, if both nodes are up and running then the issue is seen.
The text was updated successfully, but these errors were encountered: