Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TransportException: connection to server 'myServer.com:10011' lost #204

Open
MatthiasHeinz opened this issue Jul 25, 2023 · 17 comments
Open
Labels
bug Confirmed as a valid bug and requires fix.

Comments

@MatthiasHeinz
Copy link

When calling

$tsServer = // some object containing the server config
$options = array("host" => $tsServer->getHost(), "port" => $tsServer->getQueryPort(), "timeout" => 10, "blocking" => true);
$object = new ServerQuery($options);

the ServerQuery::__destruct() and thus this->request("quit") is being called (c.f. https://github.com/planetteamspeak/ts3phpframework/blob/dev/src/Adapter/ServerQuery.php#L106).

This line used to catch Exception instead of AdapterException (c.f.

) which used to catch the TransportException thrown in readLine(...) (c.f. https://github.com/planetteamspeak/ts3phpframework/blob/dev/src/Transport/TCP.php#L162), when calling this->request("quit").

For that reason, the TransportException is thrown and not caught anymore.

@MatthiasHeinz
Copy link
Author

2 possible quick resolutions come to mind:

  1. Revert to catching Exception instead of AdapterException in ServerQuery::__destruct().
  2. Add an additional catch-clause in ServerQuery::__destruct() for TransportException.

@MatthiasHeinz
Copy link
Author

PR #195 might also resolve this issue.

@Sebbo94BY
Copy link
Collaborator

Nice analysis!

Adding a catch (TransportException) { return; } should also solve the issue, yeah.

I would need to double-check this, but it makes sense in theory.

But great. Then we have now two options to solve this issue:

Do you want to create a PR for your mentioned solution @MatthiasHeinz or shall I?

Then we only need to decide, which one the better is. 😅

I'm still waiting for feedback from @ronindesign regarding this (and all other open PRs). 😢

If nobody reviews my PR soon, I can also simply merge them and then everyone can at least test the DEV branch.

@ronindesign
Copy link
Collaborator

Appreciate the patience, #195 has been merged. Please reopen this if not resolved!

@MatthiasHeinz
Copy link
Author

I'll probably test this next week and give you guys a nudge, if the issue still persists by then; because I'm not allowed to reopen this issue.

@MatthiasHeinz
Copy link
Author

Unfortunately, #195 does not seem to resolve the issue entirely; instead the CPU spins in an endless loop.

When calling ServerQuery::__destruct() and thus this->request("quit"), the following loop in ServerQuery.php probably does not terminate anymore, which previously was broken out of by the thrown TransportException.

do {
    $str = $this->getTransport()->readLine();
    $rpl[] = $str;
} while ($str->section(TeamSpeak3::SEPARATOR_CELL) != TeamSpeak3::ERROR);

Now though, $this->getTransport()->readLine() terminates gracefully and thus does not break out of this loop.

Beware: The current issue probably concerns more than just the previously mentioned use-case!

@MatthiasHeinz
Copy link
Author

@ronindesign Would you mind to reopen this issue based on my previous post?

@Sebbo94BY Sebbo94BY reopened this Aug 15, 2023
Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 15, 2023
This change should help to exit stuck loops, when the library is not connected to the remote server anymore.

- Adds a new function to check if a connection is established or not
- Adds a PHPUnit test for the new function
- Adds the connection status as condition to loops in order to abort, when it disconnected
Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 15, 2023
This change should help to exit stuck loops, when the library is not connected to the remote server anymore.

- Adds a new function to check if a connection is established or not
- Adds a PHPUnit test for the new function
- Adds the connection status as condition to loops in order to abort, when it disconnected
@Sebbo94BY
Copy link
Collaborator

@MatthiasHeinz I've created a pull request to hopefully fix this issue: #206

Can you please let me know, if this solves the issue for you? I'll also try to reproduce and test this change on my side.

Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 15, 2023
This change should help to exit stuck loops, when the library is not connected to the remote server anymore.

- Adds a new function to check if a connection is established or not
- Adds a PHPUnit test for the new function
- Adds the connection status as condition to loops in order to abort, when it disconnected
@MatthiasHeinz
Copy link
Author

In order to recreate the endless-loop, I simply had to call this custom function (- obviously, you might want to adjust the hostand port):

public function getUserCount() {
	$options = array("host" => "localhost", "port" => "10011", "timeout" => 10, "blocking" => true);
	$object = new ServerQuery($options);
	$node = $object->getHost();
	$node->login($tsServer->getQueryUser(), $tsServer->getQueryPass());
	$node->setExcludeQueryClients(true);
	$node = $node->serverGetByPort($tsServer->getPort());
	assert($node instanceof Server);
	return $node->clientCount();
}

As far as the reasoning goes, when the getUserCount() function returns, the ServerQuery::__destruct() is being called, because the $object variable gets disposed of. Thus triggering the chain of events I've stated in my previous post.

Hope, all of this information helps in your testing efforts.

I'll be testing the fix probably next week, if it's merged by then.

Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 16, 2023
This change should help to exit stuck loops, when the library is not connected to the remote server anymore.

- Fixes the existing `isConnected()` function to properly return TRUE or FALSE
- Adds a PHPUnit test for the `isConnected()` function
- Adds the connection status as condition to loops in order to abort, when it disconnected
Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 16, 2023
When connected with `blocking=0` (non-blocking mode), the PHP script usually wants to stay connected, so the library shouldn't automatically disconnect it all the time. Instead, the user needs to manually disconnect.
@Sebbo94BY
Copy link
Collaborator

I believe we have two problems here:

  1. Loops do not properly break / exit, when the remote server connection is gone: 5d86f29
  2. Running in non-blocking mode (blocking=0) should not disconnect at all except the project code wants it explicitly: 19e2e7e

I will test this further the next days.

Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 16, 2023
This change should help to exit stuck loops, when the library is not connected to the remote server anymore.

- Fixes the existing `isConnected()` function to properly return TRUE or FALSE
- Adds a PHPUnit test for the `isConnected()` function
- Adds the connection status as condition to loops in order to abort, when it disconnected
Sebbo94BY pushed a commit to Sebbo94BY/ts3phpframework that referenced this issue Aug 16, 2023
When connected with `blocking=0` (non-blocking mode), the PHP script usually wants to stay connected, so the library shouldn't automatically disconnect it all the time. Instead, the user needs to manually disconnect.
Sebbo94BY added a commit that referenced this issue Aug 16, 2023
Issue #204: Fix high CPU utilization after disconnects / connection losses
@Sebbo94BY
Copy link
Collaborator

I was able to reproduce the high CPU utilization. I was also able to fix it with the above merged pull request #206 .

Please let me know, if this also solved the issue for you @MatthiasHeinz . :)

@MatthiasHeinz
Copy link
Author

MatthiasHeinz commented Aug 29, 2023

Using $options with "blocking" => false yields an expected CPU utilization. On the other hand using $options with "blocking" => true still yields a blasting CPU fan.

I'm honestly not quite sure, what value I should ideally set blocking to. I just want a short request to the server reading the current user-count. I'm fine with changing the value from trueto false on my end, if that's what I should use anyways.

Just wanted to let you know, that there's still some excessive CPU utilization.

@Sebbo94BY
Copy link
Collaborator

@MatthiasHeinz

I'm honestly not quite sure, what value I should ideally set blocking to.

This depends on the use-case, so your PHP script and its purpose.

By default, this option is set to true, which is the standard for a short running PHP script, which connects to your TeamSpeak server, does something (e. g. fetching data) and disconnects afterwards again as the PHP script also exits then.

However, if you build a bot for a TeamSpeak server, this is usually a long running PHP script, which should stay connected 24x7 on your TeamSpeak server for regular tasks (e. g. fetching data, moving clients, granting server groups, etc.). Repeatingly connecting and disconnecting would result in flood messages on your TeamSpeak server (spam the server chat about regular reconnects of this bot). In this case you set the option to false and use the wait() function as described here: https://docs.planetteamspeak.com/ts3/php/framework/index.html#example12

And then your code logic needs to handle the proper disconnect. Otherwise, your client will time out when the PHP script exits.

I just want a short request to the server reading the current user-count.

This can be archived with both explained scenarios above.

Depending on how often you want to fetch this data it maybe makes sense to build a bot for it to stay connected in order to avoid spaming the server chat about the regular reconnects of this client / PHP script.

Just wanted to let you know, that there's still some excessive CPU utilization.

I've tested a few things as normal PHP script and bot PHP script, but I could not reproduce the high CPU utilization anymore after I've applied the above recent changes / fixes.

This can be still caused by your personal code, which I don't know in detail. But your code excerpt above seems to always establish a new connection, so I guess it also closes it all the time. Not sure, what the rest of your code does all the time.

If you're having a class with functions, I would recommend you to create one main function, where you setup your connection and establish it and afterwards you call your respective functions / code, which should do the actual things. AFTER calling all necessary functions, tell your code to disconnect and exit the script.

@MatthiasHeinz
Copy link
Author

I'm still trying to fiddle with the previous code snippet.

  • When setting "blocking" = false", then the cpu utilization is acceptable and does NOT skyrocket. => Okay.
  • When setting "blocking" = true", I do suspect running into an endless loop, because the respective CPU-core/process runs on max. To test, I'm still using the same script as before:
// my code
public function getUserCount() {
        $port = "10011";
	$options = array("host" => "localhost", "port" => $port, "timeout" => 10, "blocking" => true);
	$object = new ServerQuery($options);
	$node = $object->getHost();
	$node->login("my_bot_username", "my_bot_password");
	$node->setExcludeQueryClients(true);
	$node = $node->serverGetByPort($port);
	assert($node instanceof Server);
	//$object->getTransport()->disconnect(); // This line fixes the symptom, but not the underlying issue.
	return $node->clientCount();
}

To debug the underlying issue, I did add a couple of error_log statements (my debugger currently isn't hooked up properly...):

// ServerQuery.php
    public function __destruct()
    {
        error_log("ServerQuery::__destruct 0");
        // do not disconnect, when acting as bot in non-blocking mode
        if (! $this->getTransport()->getConfig("blocking")) {
            return;
        }
        error_log("ServerQuery::__destruct 1");
        if ($this->getTransport() instanceof Transport && $this->transport->isConnected()) {
            try {
                error_log("ServerQuery::__destruct 1.1");
                $this->request("quit");
                error_log("ServerQuery::__destruct 1.2");
            } catch (AdapterException) {
                error_log("ServerQuery::__destruct 1.3");
                return;
            }
        }
        error_log("ServerQuery::__destruct 2");
    }

// Transport.php
    public function __destruct()
    {
        error_log("Transport::__destruct");
        if ($this->adapter instanceof Adapter) {
            $this->adapter->__destruct();
        }

        $this->disconnect();
    }

When running the script once, the apache log yields:

[Tue Sep 05 21:09:03.188126 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 0, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.188126 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 1, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.188126 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 1.1, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.212125 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 1.2, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.212125 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 2, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.212125 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] Transport::__destruct, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.212125 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 0, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.212125 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 1, referer: http://localhost/src/public/
[Tue Sep 05 21:09:03.213126 2023] [php:notice] [pid 9264:tid 1860] [client 127.0.0.1:65192] ServerQuery::__destruct 1.1, referer: http://localhost/src/public/

Note, that the second time ServerQuery::__destruct is called, it does not log the checkpoint "ServerQuery::__destruct 2" and thus there has to be an endless loop!

Combined with the fact, that prior to the second call to ServerQuery::__destruct, (according to the logs) we're calling Transport::__destruct, I'm assuming, that the ServerQuery::__destruct triggers a Transport::__destruct when being destroyed. But ServerQuery::__destruct also calls $this->request("quit");, which in turn tries to gracefully terminate the connection, thus I'd assume there's some misbehavior.

Hopefully this information helps you to track the underlying issue.

@MatthiasHeinz
Copy link
Author

@Sebbo94BY Did you have any chance to take an other look at the endless loop issue? Did you manage to reproduce?

@Sebbo94BY
Copy link
Collaborator

Hi @MatthiasHeinz, I had a quick look at it in the past, but since I was on vacation and am currently busy with a few other topics, I had not the time and head to check this further yet.

But I'll definitley take a deeper look at it within the next weeks.

@Sebbo94BY Sebbo94BY added the bug Confirmed as a valid bug and requires fix. label Sep 21, 2023
@MatthiasHeinz
Copy link
Author

@Sebbo94BY I'm still interested to see a fix for the issue, if possible. Hopefully, your schedule has cleared a bit in the meantime. ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed as a valid bug and requires fix.
Projects
None yet
Development

No branches or pull requests

3 participants