Attempting to connect to wrong port (PG post), hangs #370

matriv · 2022-06-27T11:22:01Z

crash --host "localhost:5433" --verbose

Runs for long time, without returning any error.

The text was updated successfully, but these errors were encountered:

amotl · 2022-06-27T13:14:51Z

Hi Marios,

thank you for reporting this. I can confirm it by using, for example:

nc -l -p 5433
crash --host "localhost:5433" --verbose

Because crash uses timeout=None, the default timeout value for "connect" and "read" will be used, which is "infinite". The requests library behaves equally. There is some discussion around the topic at urllib3/urllib3#655.

In different spots, I observed others to use default timeout values like those:

from urllib3.util.timeout import Timeout
Timeout(connect=2, read=7)
Timeout(connect=3.05, read=27)
Timeout(connect=5.0, read=?)
Timeout(connect=10.0, read=?)

Do you think it is a good idea to also set reasonable non-infinite default timeout values for crash? If so, what do you think which values we should use?

With kind regards,
Andreas.

matriv · 2022-06-27T14:36:12Z

I think yes we should. It seems to me that a user would expect to see and error message if connection cannot be established.
I would say that 5 or even 10 seconds is a reasonable value, and probably it would be nice to be able to configure it on demand.

proddata · 2022-06-27T14:39:09Z

Hm, I am wondering how this would affect long-running queries, considering crash uses the http-endpoint

amotl · 2022-06-27T14:44:48Z

@proddata: Long-running queries would fall into the category of the read timeout. We should probably leave this set to "infinite" and only adjust the connect timeout?

Otherwise, do you think we should also configure the read timeout to a maximum value by default? If yes, what do think would be an optimal default value? 24 hours?

mfussenegger · 2022-06-28T08:40:39Z

+1 for connect timeout. E.g. 5sec by default
-1 for read timeout.

matriv · 2022-06-28T09:09:06Z

Could we also have some "keep-alive" test that can happen in the background, so even if there is a long running query, we can still get an error that the server(s) is/are disconnected?

proddata · 2022-06-28T09:12:57Z

Otherwise, do you think we should also configure the read timeout to a maximum value by default? If yes, what do think would be an optimal default value? 24 hours?

I think there should be none. 24h might be even worse

matriv · 2022-07-01T09:20:34Z

Could we also have some "keep-alive" test that can happen in the background, so even if there is a long running query, we can still get an error that the server(s) is/are disconnected?

This would address the issue that a client issues a long-run query and the hosts it's connected go down, but the client still "thinks" the query is still running. WDYT?

amotl · 2022-07-01T10:57:12Z

Dear Marios,

sure we can think about such an "availability background check" feature. However, let's track it on behalf of a different issue instead? I think the scope of this one would be to only adjust the connect timeout accordingly. I also think five seconds as a default value, with an option to adjust it, would be fine.

With kind regards,
Andreas.

matriv · 2022-07-01T11:59:06Z

@amotl Of course, I just wanted to ask here and get opinions, since it's somehow relevant, so we can open a separate issue.

matriv · 2022-07-05T10:44:24Z

@amotl Created a new issue for that proposal: #371

amotl · 2024-01-12T01:44:27Z

Report

Hi again. GH-421 adds the --timeout command-line argument, for the purpose outlined here. As suggested, it also configures the default connect timeout to five seconds, while still using an infinite read timeout.

On this particular case, at least when emulating a server through nc -l -p 5433, this does not help much, because the client actually connects, but would need to limit the read timeout to not block forever.

You can exercise it like this:

# Blocks, because read timeout is still infinite.
crash --host "localhost:5433" --timeout=0.42

# Does not block, because read timeout is decreased.
crash --host "localhost:5433" --timeout=0.42,0.42

Conclusion

In this way, GH-421 does not improve the situation as reported by @matriv.

Outlook

I have no other idea how this can be improved when not adjusting the default read timeout, which I am also -1 about.

Maybe it would be a good idea to check if the user aims to unintentionally connect to a port in the 543x range, and yield a corresponding WARNING message only, in the same spirit as the Homebrew package manager is doing it:

Figure out how to make error handling and messaging as informative as possible. The more you can help your users help themselves, [the better].

-- https://dev.to/maximwheatley/how-to-build-great-open-source-devtools-with-max-howell-creator-of-homebrew-5eoc

seut · 2024-02-13T10:42:49Z

How are other clients like e.g. psql behave?

One idea could be to do a fast probing query with a short read timeout at the connection instantiation (SELECT 1), but not sure if this is really good (what if the cluster is overloaded? other issues?)

amotl · 2024-03-26T04:00:00Z

How do other clients like e.g. psql behave?

It looks like psql behaves in the same way. Running that on macOS, it does not return for the time of writing this.

nc -l -p 5433
time psql postgresql://postgres@localhost:5433/

One idea could be to do a fast probing query with a short read timeout at the connection instantiation (SELECT 1).

That will not work. What would "fast" actually mean? What if this "quick/fast probe check" with a lower timeout than the configured one would be below the value needed to actually make the connection work, because the server is far away, or, as you may be referring to, overloaded, or such?

amotl · 2024-03-27T17:34:22Z

Final summary and report, cheers.

Problem Report

crash --host "localhost:5433" --verbose  
Runs for long time, without returning any error.

Evaluation

When connecting to a TCP endpoint "that does not talk the right protocol" ¹, and partners can't establish a good conversation, so the request sent by the client will not get answered by the server at all, the conversation on this TCP connection will naturally stall. Unfortunately, the network connection will not be terminated, because there are no measures in place that could make that happen, other than client- and server-side timeout settings.

By default, the behaviour is the same with psql, we've confirmed it on localhost both on Linux and macOS.

psql postgresql://postgres@localhost:5433/

To reproduce it with both psql and crash, a server that does not talk the right protocol can easily be emulated using netcat ².

Conclusion and Thoughts

Therefore, we are closing this ticket. Please re-open if you feel your issue has not been addressed properly. Thank you for GH-421, which probably would not have happened without your request.

Long version: Where an application is bound to the corresponding TCP socket in LISTEN mode, which implements certain application-level protocols. ↩
nc -l -p 5433. It works well for both HTTP and the PostgreSQL wire protocol, because such a dummy server understands neither of both. ↩

matriv added the bug label Jun 27, 2022

amotl mentioned this issue Jan 12, 2024

Add command-line argument --timeout, to configure network timeout values #421

Merged

amotl self-assigned this Jan 12, 2024

amotl removed the bug label Mar 26, 2024

amotl closed this as completed Mar 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Attempting to connect to wrong port (PG post), hangs #370

Attempting to connect to wrong port (PG post), hangs #370

matriv commented Jun 27, 2022

amotl commented Jun 27, 2022

matriv commented Jun 27, 2022

proddata commented Jun 27, 2022

amotl commented Jun 27, 2022 •

edited

Loading

mfussenegger commented Jun 28, 2022

matriv commented Jun 28, 2022

proddata commented Jun 28, 2022 •

edited by matriv

Loading

matriv commented Jul 1, 2022

amotl commented Jul 1, 2022

matriv commented Jul 1, 2022

matriv commented Jul 5, 2022

amotl commented Jan 12, 2024 •

edited

Loading

seut commented Feb 13, 2024 •

edited

Loading

amotl commented Mar 26, 2024

amotl commented Mar 27, 2024 •

edited

Loading

Attempting to connect to wrong port (PG post), hangs #370

Attempting to connect to wrong port (PG post), hangs #370

Comments

matriv commented Jun 27, 2022

amotl commented Jun 27, 2022

matriv commented Jun 27, 2022

proddata commented Jun 27, 2022

amotl commented Jun 27, 2022 • edited Loading

mfussenegger commented Jun 28, 2022

matriv commented Jun 28, 2022

proddata commented Jun 28, 2022 • edited by matriv Loading

matriv commented Jul 1, 2022

amotl commented Jul 1, 2022

matriv commented Jul 1, 2022

matriv commented Jul 5, 2022

amotl commented Jan 12, 2024 • edited Loading

Report

Conclusion

Outlook

seut commented Feb 13, 2024 • edited Loading

amotl commented Mar 26, 2024

amotl commented Mar 27, 2024 • edited Loading

Problem Report

Evaluation

Conclusion and Thoughts

Footnotes

amotl commented Jun 27, 2022 •

edited

Loading

proddata commented Jun 28, 2022 •

edited by matriv

Loading

amotl commented Jan 12, 2024 •

edited

Loading

seut commented Feb 13, 2024 •

edited

Loading

amotl commented Mar 27, 2024 •

edited

Loading