
Does elbencho have retry or timeout parameters for HDFS IO, or are there any internally? #101


Description

@panghubaobao777

Hello, thank you very much for providing such a comprehensive tool as elbencho. I have a question.
While using elbencho to issue HDFS IO, the HDFS service on the server side was restarted (restart hdfs), which interrupted the client-side workload. Are there any client-side parameters for retrying the workload, such as timeout settings?
Also, does the '65000 millis timeout' in the client log represent how long an IO request is held before it fails?
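
For reference, my understanding is that the 65000 ms value comes from the HDFS client library rather than from elbencho itself: presumably dfs.client.socket-timeout (default 60000 ms) plus a per-datanode extension for the write pipeline. If that is right, and assuming elbencho's libhdfs-based client picks up the standard Hadoop configuration from the CLASSPATH, it should be tunable in hdfs-site.xml on the client host. A sketch, with the 120000 value purely illustrative:

<!-- hdfs-site.xml on the elbencho client host (value illustrative) -->
<property>
  <!-- Socket read timeout of the HDFS client, in milliseconds.
       The logged 65000 ms is presumably this default (60000)
       plus a per-datanode extension for the write pipeline. -->
  <name>dfs.client.socket-timeout</name>
  <value>120000</value>
</property>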

Step1:
elbencho --hdfs /test1122 -n 0 -N 1000000 -s 100m -w -t 120 --hosts=node8,client171

Step2:
server: systemctl restart hdfs

Result:
2026-01-20 20:49:24,788 WARN hdfs.DataStreamer: Exception for BP-514898428-170.254.21.33-1768826004374:blk_1141293069631736_1
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.56.8:49150 remote=/192.168.56.61:9866]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:71)
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:71)
at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:548)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1086)
FSDataOutputStream#close error:
IOException: All datanodes [DatanodeInfoWithStorage[192.168.56.61:9866,DS-41e62dd2-9004-493f-a826-a999c45af618,DISK]] are bad. Aborting...
java.io.IOException: All datanodes [DatanodeInfoWithStorage[192.168.56.61:9866,DS-41e62dd2-9004-493f-a826-a999c45af618,DISK]] are bad. Aborting...
at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1561)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1495)
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
2026-01-20 20:49:26,604 WARN hdfs.DataStreamer: Exception for BP-514898428-170.254.21.33-1768826004374:blk_1131397464981796_1
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.56.8:49170 remote=/192.168.56.61:9866]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
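
For what it's worth, the "All datanodes ... are bad. Aborting..." error looks like the standard HDFS write-pipeline recovery failing after the restart, not an elbencho-level retry. If so, the relevant client-side knobs would be the generic HDFS client ones rather than anything elbencho-specific. A sketch, assuming elbencho's HDFS client honors these settings (values illustrative):

<!-- hdfs-site.xml on the elbencho client host (values illustrative) -->
<property>
  <!-- Number of retries when writing a block to the datanode pipeline. -->
  <name>dfs.client.block.write.retries</name>
  <value>10</value>
</property>
<property>
  <!-- Keep writing even if no replacement datanode can be found during
       pipeline recovery, e.g. on clusters with very few datanodes. -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>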

Metadata

Labels: question (further information is requested)
