Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help needed with connectionStatusCallback not triggered when sending control packet data failed #2627

Closed
pcoll-jcm opened this issue Jun 26, 2024 · 2 comments
Labels

Comments

@pcoll-jcm
Copy link

pcoll-jcm commented Jun 26, 2024

We are encountering an issue with the Azure IoT SDK for ESP32.

SDK version: LTS_03_2024

When the following error trace appears:

Error: Time: File:./components/esp-azure/port/src/tlsio_esp_tls.c Func:tlsio_esp_tls_send_async Line:552 tlsio_esp_tls_send_async without a prior successful open

Error: Time: File:./components/esp-azure/azure-iot-sdk-c/umqtt/src/mqtt_client.c Func:sendPacketItem Line:393 Failure sending control packet data

the connectionStatusCallback with the result IOTHUB_CLIENT_CONNECTION_UNAUTHENTICATED is not being triggered, and the error appears indefinitely. In all other cases of connection and disconnection, the callback is triggered correctly. (so , in general, the callback worked excepting this case)

In this case, the hub sees the client as disconnected, but our software cannot take any action to reconnect without the callback, as it is our only tool to get connectivity feedback.

What other method can we use to confirm that the periodic control packet has been sent successfully?

It's crucial for us to keep data consumption minimal. We know that if any telemetry is sent in this state, your SDK reports that the message wasn't delivered and our software can start the reconnection procedure.. However, constantly sending messages for connectivity checks isn't feasible.

Essentially, we need a method to detect and handle this situation effectively.

Thank you.

@pcoll-jcm pcoll-jcm changed the title Help needed with connectionStatusCallback not triggered when sending control packet data Help needed with connectionStatusCallback not triggered when sending control packet data failed Jun 28, 2024
@pcoll-jcm
Copy link
Author

Good morning,

We've been investigating this issue on our devices. Our theory is that the problem comes from the MQTT connection between the device and IoT Hub.

Below there's the expected SDK behaviour based on the code review:

  1. We connect our device to IoT Hub
  2. Once connected, in order to check the IoT Hub connection the device executes a ping to the server every 240 seconds (4 minutes).
  3. If the ping is successful, the device interprets that it is connected, does nothing and waits for another 240 seconds to ping again. This is done by setting the TimeToPing counter variable again to 240 seconds.
  4. If the ping is unsuccessful, it waits for a ping response for 80 seconds and after that, the callback error is catched by the SDK.

Here's our steps to reproduce the behaviour at SDK level (We've been doing the tests with LTS_08_2024 version of SDK):

  1. We connect our device to IoT Hub
  2. The device starts the process to execute a ping to the server every 240 seconds (4 minutes).
  3. During a TimeToPing waiting time, we disable the device from IoTHub.
  4. After doing that, the IoTHub closes the MQTT connection to the device (the only trace we see in our ESP32 is esp-tls-mbedtls: read error :-0x004c). So in this state the connection is closed.
  5. Once the ping is executed again, an error appears 'Failure Sending control packet data', and the TimeToPing is setted again to 240 seconds, entering into an infinite loop and avoiding to catch the callback error and starting the reconnection process again. From this status, if we enable the device again, the reconnection process is never started and the device needs to be rebooted.

image (1)

From there, we started checking the code:

  • On the mqtt SDK module there is a process called mqtt_client_dowork(), this process is the responsible to check if the connection is alive and in case that it is not alive it generates a callback indication MQTT_CLIENT_NO_PING_RESPONSE. The ping workflow is as described above and if the answer from Ping does not arrive after more than 80 seconds it raises set_error_callback(mqtt_client, MQTT_CLIENT_NO_PING_RESPONSE);
  • After the MQTT connection has been closed, when time from last request makes the requirement of a PING, a new PING is executed-> sendPacketItem(mqtt_client, BUFFER_u_char(pingPacket), size);

The ping is sended from the highlighted lines below (see mqtt_client.c -> mqtt_client_dowork

image (2)

The implementation of sendPacket controls some causes of failure when trying to execute. If it fails, it generates a MU_FAILURE that does not update the packetSendTimeMs.

image (3)

Coming back to the mqtt_client_dowork routine, the result of the sendPacketItem is not catched. Because of that, the tickcounter_get_current_ms is never executed.

image (4)

The code that generates the callback is never executed, so the device does not know it is not connected. And does not try to connect again.

image (5)

After reviewing the code, we applied a patch on the mqtt module in order to avoid this infinite loop (file mqtt_client_modified attached):

In the file mqtt_client_modified.c we have modified the mqtt_client.c to add some logs on our screen and apply the patch

If you define
#define PATCHED_MQTT_PING_JCM
the patch is applied

If you define
#define LOG_TEST_PINGREQ
the log to see what is happening is activated

Here is the behaviour with the patch applied, the device did not enter into the infinite loop and started the reconnection process successfully.

image (6)

In the following zip there are the original and modified files. The modified includes the patch code.

mqtt_client - original & modified.zip

@pcoll-jcm
Copy link
Author

We opened a new issue as a bug as a continuation of this thread. You can find it here:
#2649

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant