Description
Describe the bug
We use the Kinesis client at Databricks as part of one of our Structured Streaming sources. When we execute multiple parallel getRecords calls on a single Amazon Kinesis client, a deadlock occurs if one of the calls fails while holding a lock on the connection pool, which leaves the other calls waiting indefinitely on that lock. For more context: we create a thread pool per core, each of which performs a fetch, and if one of the threads hits an error, we shut down the thread pool. Is this an issue you have seen before?
Expected Behavior
Each getRecords call should either complete successfully or release the connection pool lock in case of failure, allowing other calls to proceed.
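To make the expectation concrete, here is a minimal sketch of the lock-release pattern we have in mind. It uses a plain `ReentrantLock` as a hypothetical stand-in for the connection pool lock; the class and method names are ours for illustration and are not the SDK's actual internals.

```java
import java.util.concurrent.locks.ReentrantLock;

public class PoolLockSketch {
    // Hypothetical stand-in for the connection pool lock (an assumption,
    // not how the SDK or its HTTP client is actually implemented).
    private final ReentrantLock poolLock = new ReentrantLock();

    /** Runs a simulated request under the lock and always releases the lock. */
    public String callWithLock(boolean fail) {
        poolLock.lock();
        try {
            if (fail) {
                throw new IllegalStateException("simulated getRecords failure");
            }
            return "records";
        } finally {
            // Runs on both the success path and the failure path.
            poolLock.unlock();
        }
    }

    /** Reports whether any thread currently holds the lock. */
    public boolean isLocked() {
        return poolLock.isLocked();
    }
}
```

With this shape, a failed call leaves `isLocked()` returning false, so other callers can still acquire the lock.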
Current Behavior
One of the getRecords calls fails and exits while holding the connection pool lock, causing a deadlock. Other getRecords calls wait indefinitely for that lock to be released.
Reproduction Steps
Create an Amazon Kinesis client.
Execute multiple parallel getRecords calls using the same client.
Simulate a failure in one of the getRecords calls after it has acquired a connection pool lock.
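The steps above can be sketched roughly as the harness below. It is only an approximation of our setup: each `Callable` is a hypothetical stand-in for one getRecords call on the shared client (no SDK classes are used here), and the class name, outcome strings, and timeout parameter are assumptions made for illustration. The timeout is there so that a hung call shows up as "timed out" instead of blocking the harness forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParallelFetchHarness {
    /**
     * Runs the fetches in parallel (the list must be non-empty) and reports,
     * in submission order, how each ended: "ok", "failed", "timed out",
     * or "interrupted".
     */
    public static List<String> run(List<Callable<String>> fetches, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(fetches.size());
        List<String> outcomes = new ArrayList<>();
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Callable<String> fetch : fetches) {
                futures.add(pool.submit(fetch));
            }
            for (Future<String> future : futures) {
                try {
                    future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    outcomes.add("ok");
                } catch (ExecutionException e) {
                    outcomes.add("failed");
                    // Mirrors the described behavior: one error shuts the pool down.
                    pool.shutdownNow();
                } catch (TimeoutException e) {
                    // The call did not finish in time, which may indicate a hang.
                    outcomes.add("timed out");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    outcomes.add("interrupted");
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return outcomes;
    }
}
```

In a real reproduction, each `Callable` would wrap a getRecords call against the same client instance, and a "timed out" outcome for the surviving calls would correspond to the hang described in this report.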
Possible Solution
Possibly release any held locks before the connection pool is shut down.
Additional Information/Context
No response
AWS Java SDK version used
1.12
JDK version used
8
Operating System and version
Linux
Activity
Title changed from "(short issue description)" to "Deadlock on Kinesis Client connection pool when making parallel requests"
github-actions commented on Oct 23, 2023
Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.