-
-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(device): prevent exception on network failure #291
Conversation
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes enhance the Changes
Assessment against linked issues
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #291 +/- ##
==========================================
+ Coverage 37.79% 39.97% +2.17%
==========================================
Files 84 84
Lines 7477 7515 +38
==========================================
+ Hits 2826 3004 +178
+ Misses 4651 4511 -140 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- midealocal/device.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- midealocal/device.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- midealocal/device.py (3 hunks)
Additional comments not posted (1)
midealocal/device.py (1)
204-232
: Approve changes in theconnect
method.The modifications to the
connect
method enhance its robustness by handling various exceptions and logging them appropriately. The socket timeout is set to a reasonable value, which is suitable for network operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
midealocal/device.py
Outdated
def _check_state(self) -> ParseMessageResult: | ||
if not self._socket: | ||
_LOGGER.warning("[%s] Socket closed", self._device_id) | ||
return ParseMessageResult.ERROR | ||
now = time.time() | ||
self._check_refresh(now) | ||
self._check_heartbeat(now) | ||
msg = self._socket.recv(512) | ||
if len(msg) == 0: | ||
if self._is_run: | ||
_LOGGER.warning( | ||
"[%s] Socket error - Connection closed by peer", | ||
self._device_id, | ||
) | ||
self.close_socket() | ||
return ParseMessageResult.ERROR | ||
return self.parse_message(msg) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review of the _check_state
method with suggestions for enhanced error handling.
The _check_state
method has been significantly modified to handle socket state checks and message parsing more robustly. The method now includes checks for socket closure and empty messages, which are correctly handled by logging warnings and closing the socket if necessary.
However, the method could benefit from additional error handling, particularly in managing exceptions that may occur during message reception or parsing.
Consider adding exception handling around the message reception and parsing to prevent the thread from exiting unexpectedly on exceptions:
- msg = self._socket.recv(512)
- if len(msg) == 0:
+ try:
+ msg = self._socket.recv(512)
+ if len(msg) == 0:
+ if self._is_run:
+ _LOGGER.warning(
+ "[%s] Socket error - Connection closed by peer",
+ self._device_id,
+ )
+ self.close_socket()
+ return ParseMessageResult.ERROR
+ except OSError as e:
+ _LOGGER.warning("[%s] Failed to receive message: %s", self._device_id, str(e))
+ self.close_socket()
+ return ParseMessageResult.ERROR
+ except Exception as e:
+ _LOGGER.exception("[%s] Unexpected error during message reception: %s", self._device_id, str(e))
+ self.close_socket()
+ return ParseMessageResult.ERROR
This change ensures that all potential exceptions during message reception are caught and handled appropriately, preventing the thread from terminating unexpectedly and allowing for more graceful error recovery.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _check_state(self) -> ParseMessageResult: | |
if not self._socket: | |
_LOGGER.warning("[%s] Socket closed", self._device_id) | |
return ParseMessageResult.ERROR | |
now = time.time() | |
self._check_refresh(now) | |
self._check_heartbeat(now) | |
msg = self._socket.recv(512) | |
if len(msg) == 0: | |
if self._is_run: | |
_LOGGER.warning( | |
"[%s] Socket error - Connection closed by peer", | |
self._device_id, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
return self.parse_message(msg) | |
def _check_state(self) -> ParseMessageResult: | |
if not self._socket: | |
_LOGGER.warning("[%s] Socket closed", self._device_id) | |
return ParseMessageResult.ERROR | |
now = time.time() | |
self._check_refresh(now) | |
self._check_heartbeat(now) | |
try: | |
msg = self._socket.recv(512) | |
if len(msg) == 0: | |
if self._is_run: | |
_LOGGER.warning( | |
"[%s] Socket error - Connection closed by peer", | |
self._device_id, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except OSError as e: | |
_LOGGER.warning("[%s] Failed to receive message: %s", self._device_id, str(e)) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except Exception as e: | |
_LOGGER.exception("[%s] Unexpected error during message reception: %s", self._device_id, str(e)) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
return self.parse_message(msg) |
midealocal/device.py
Outdated
try: | ||
self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
self._socket.settimeout(10) | ||
_LOGGER.debug( | ||
"[%s] Connecting to %s:%s", | ||
self._device_id, | ||
self._ip_address, | ||
self._port, | ||
) | ||
self._socket.connect((self._ip_address, self._port)) | ||
_LOGGER.debug("[%s] Connected", self._device_id) | ||
connected = True | ||
except TimeoutError: | ||
_LOGGER.debug("[%s] Connection timed out", self._device_id) | ||
except OSError: | ||
_LOGGER.debug("[%s] Connection error", self._device_id) | ||
except Exception as e: | ||
file = None | ||
lineno = None | ||
if e.__traceback__: | ||
file = e.__traceback__.tb_frame.f_globals["__file__"] # pylint: disable=E1101 | ||
lineno = e.__traceback__.tb_lineno | ||
_LOGGER.exception( | ||
"[%s] Unknown error : %s, %s", | ||
self._device_id, | ||
file, | ||
lineno, | ||
) | ||
self.set_available(connected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review of the connect
method with suggestions for error handling improvements.
The connect
method has been updated to handle various exceptions more gracefully by logging them. However, the handling of generic exceptions could be improved by providing more specific error messages or handling specific cases differently.
Consider refining the exception handling to provide more actionable insights or recover from specific errors when possible. For example, network-related exceptions could trigger a re-attempt or a different recovery mechanism.
Here's a proposed change to enhance the exception handling:
- except Exception as e:
- file = None
- lineno = None
- if e.__traceback__:
- file = e.__traceback__.tb_frame.f_globals["__file__"] # pylint: disable=E1101
- lineno = e.__traceback__.tb_lineno
- _LOGGER.exception(
- "[%s] Unknown error : %s, %s",
- self._device_id,
- file,
- lineno,
- )
+ except Exception as e:
+ _LOGGER.exception("[%s] Unexpected error occurred: %s", self._device_id, str(e))
This change simplifies the logging and focuses on the exception message, which is typically sufficient for debugging unexpected errors.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try: | |
self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | |
self._socket.settimeout(10) | |
_LOGGER.debug( | |
"[%s] Connecting to %s:%s", | |
self._device_id, | |
self._ip_address, | |
self._port, | |
) | |
self._socket.connect((self._ip_address, self._port)) | |
_LOGGER.debug("[%s] Connected", self._device_id) | |
connected = True | |
except TimeoutError: | |
_LOGGER.debug("[%s] Connection timed out", self._device_id) | |
except OSError: | |
_LOGGER.debug("[%s] Connection error", self._device_id) | |
except Exception as e: | |
file = None | |
lineno = None | |
if e.__traceback__: | |
file = e.__traceback__.tb_frame.f_globals["__file__"] # pylint: disable=E1101 | |
lineno = e.__traceback__.tb_lineno | |
_LOGGER.exception( | |
"[%s] Unknown error : %s, %s", | |
self._device_id, | |
file, | |
lineno, | |
) | |
self.set_available(connected) | |
try: | |
self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | |
self._socket.settimeout(10) | |
_LOGGER.debug( | |
"[%s] Connecting to %s:%s", | |
self._device_id, | |
self._ip_address, | |
self._port, | |
) | |
self._socket.connect((self._ip_address, self._port)) | |
_LOGGER.debug("[%s] Connected", self._device_id) | |
connected = True | |
except TimeoutError: | |
_LOGGER.debug("[%s] Connection timed out", self._device_id) | |
except OSError: | |
_LOGGER.debug("[%s] Connection error", self._device_id) | |
except Exception as e: | |
_LOGGER.exception("[%s] Unexpected error occurred: %s", self._device_id, str(e)) | |
self.set_available(connected) |
@rokam add some changes to improve it .
one more comments:
|
update: add try/except to all the socket send/recv, try to catch the exception and close socket, the next loop will reconnect when socket is none. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- midealocal/device.py (5 hunks)
Additional comments not posted (1)
midealocal/device.py (1)
281-289
: LGTM!The exception handling in
send_message_v2
looks good. It logs the exception and closes the socket, which is appropriate for handling unexpected errors during message sending.
midealocal/device.py
Outdated
connection_retries = 0 | ||
while self._is_run: | ||
if not self.connect(): | ||
raise CannotConnect | ||
if not self._socket: | ||
raise SocketException | ||
self._authenticate_refresh_capabilities() | ||
while self._socket is None: | ||
_LOGGER.debug("[%s] No open socket, try to connect", self._device_id) | ||
if not self.connect(): | ||
self.close_socket() | ||
connection_retries += 1 | ||
sleep_time = min(60 * connection_retries, 600) | ||
_LOGGER.warning( | ||
"[%s] Unable to connect, sleep %s seconds", | ||
self._device_id, | ||
sleep_time, | ||
) | ||
time.sleep(sleep_time) | ||
continue | ||
self._authenticate_refresh_capabilities() | ||
connection_retries = 0 | ||
timeout_counter = 0 | ||
start = time.time() | ||
self._previous_refresh = self._previous_heartbeat = start | ||
self._socket.settimeout(1) | ||
while True: | ||
try: | ||
now = time.time() | ||
self._check_refresh(now) | ||
self._check_heartbeat(now) | ||
msg = self._socket.recv(512) | ||
if len(msg) == 0: | ||
if self._is_run: | ||
_LOGGER.error( | ||
"[%s] Socket error - Connection closed by peer", | ||
self._device_id, | ||
) | ||
self.close_socket() | ||
break | ||
result = self.parse_message(msg) | ||
result = self._check_state() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve the modifications in run
method with a suggestion for improving connection retry logic.
The modifications to the run
method significantly enhance its robustness by implementing a retry mechanism with exponential backoff. This is crucial for maintaining connectivity under network instability.
However, the method could be further refined to handle exceptions more gracefully and to ensure that the connection retries do not lead to resource exhaustion or overly long wait times.
Consider refining the exponential backoff mechanism to include a maximum retry limit and to handle exceptions more gracefully:
- while self._is_run:
- while self._socket is None:
- _LOGGER.debug("[%s] No open socket, try to connect", self._device_id)
- if not self.connect():
- self.close_socket()
- connection_retries += 1
- sleep_time = min(60 * connection_retries, 600)
- _LOGGER.warning(
- "[%s] Unable to connect, sleep %s seconds",
- self._device_id,
- sleep_time,
- )
- time.sleep(sleep_time)
- continue
- self._authenticate_refresh_capabilities()
+ while self._is_run and connection_retries < MAX_RETRIES:
+ if self._socket is None and not self.connect():
+ connection_retries += 1
+ sleep_time = min(60 * connection_retries, 600)
+ _LOGGER.warning(
+ "[%s] Unable to connect, retry %d/%d, sleeping for %s seconds",
+ self._device_id,
+ connection_retries,
+ MAX_RETRIES,
+ sleep_time,
+ )
+ time.sleep(sleep_time)
+ else:
+ connection_retries = 0
+ self._authenticate_refresh_capabilities()
+ # Handle socket operations here
This change introduces a maximum retry limit (MAX_RETRIES
) to prevent infinite retry loops and provides more detailed logging that includes the current retry count and the maximum allowed retries. It also simplifies the control flow by reducing nested conditions.
Committable suggestion was skipped due to low confidence.
except Exception as e: | ||
file = None | ||
lineno = None | ||
if e.__traceback__: | ||
file = e.__traceback__.tb_frame.f_globals["__file__"] # pylint: disable=E1101 | ||
lineno = e.__traceback__.tb_lineno | ||
_LOGGER.exception( | ||
"[%s] Unknown error : %s, %s", | ||
self._device_id, | ||
file, | ||
lineno, | ||
) | ||
# set _socket to None when connect exception matched | ||
self._socket = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider refining the exception handling to provide more actionable insights or recover from specific errors when possible.
The handling of generic exceptions could be improved by providing more specific error messages or handling specific cases differently. For example, network-related exceptions could trigger a re-attempt or a different recovery mechanism.
Here's a proposed change to enhance the exception handling:
- except Exception as e:
- file = None
- lineno = None
- if e.__traceback__:
- file = e.__traceback__.tb_frame.f_globals["__file__"] # pylint: disable=E1101
- lineno = e.__traceback__.tb_lineno
- _LOGGER.exception(
- "[%s] Unknown error : %s, %s",
- self._device_id,
- file,
- lineno,
- )
+ except Exception as e:
+ _LOGGER.exception("[%s] Unexpected error occurred: %s", self._device_id, str(e))
This change simplifies the logging and focuses on the exception message, which is typically sufficient for debugging unexpected errors.
Committable suggestion was skipped due to low confidence.
def _check_state(self) -> ParseMessageResult: | ||
if not self._socket: | ||
_LOGGER.warning("[%s] Socket closed", self._device_id) | ||
return ParseMessageResult.ERROR | ||
now = time.time() | ||
self._check_refresh(now) | ||
self._check_heartbeat(now) | ||
try: | ||
msg = self._socket.recv(512) | ||
if len(msg) == 0: | ||
if self._is_run: | ||
_LOGGER.warning( | ||
"[%s] Socket error - Connection closed by peer", | ||
self._device_id, | ||
) | ||
self.close_socket() | ||
return ParseMessageResult.ERROR | ||
return self.parse_message(msg) | ||
except TimeoutError: | ||
_LOGGER.debug("[%s] _check_state Socket timed out", self._device_id) | ||
self.close_socket() | ||
return ParseMessageResult.ERROR | ||
except Exception as e: | ||
_LOGGER.exception( | ||
"[%s] Unexpected socket error", | ||
self._device_id, | ||
exc_info=e, | ||
) | ||
self.close_socket() | ||
return ParseMessageResult.ERROR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve the new _check_state
method with suggestions for enhanced error handling.
The _check_state
method significantly improves the code by consolidating the logic for socket state checks and message parsing. The method includes appropriate checks for socket closure and empty messages, which are handled by logging warnings and closing the socket if necessary.
However, the method could benefit from additional error handling, particularly in managing exceptions that may occur during message reception or parsing.
Consider adding exception handling around the message reception and parsing to prevent the thread from exiting unexpectedly on exceptions:
- msg = self._socket.recv(512)
- if len(msg) == 0:
+ try:
+ msg = self._socket.recv(512)
+ if len(msg) == 0:
+ if self._is_run:
+ _LOGGER.warning(
+ "[%s] Socket error - Connection closed by peer",
+ self._device_id,
+ )
+ self.close_socket()
+ return ParseMessageResult.ERROR
+ except OSError as e:
+ _LOGGER.warning("[%s] Failed to receive message: %s", self._device_id, str(e))
+ self.close_socket()
+ return ParseMessageResult.ERROR
+ except Exception as e:
+ _LOGGER.exception("[%s] Unexpected error during message reception: %s", self._device_id, str(e))
+ self.close_socket()
+ return ParseMessageResult.ERROR
This change ensures that all potential exceptions during message reception are caught and handled appropriately, preventing the thread from terminating unexpectedly and allowing for more graceful error recovery.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _check_state(self) -> ParseMessageResult: | |
if not self._socket: | |
_LOGGER.warning("[%s] Socket closed", self._device_id) | |
return ParseMessageResult.ERROR | |
now = time.time() | |
self._check_refresh(now) | |
self._check_heartbeat(now) | |
try: | |
msg = self._socket.recv(512) | |
if len(msg) == 0: | |
if self._is_run: | |
_LOGGER.warning( | |
"[%s] Socket error - Connection closed by peer", | |
self._device_id, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
return self.parse_message(msg) | |
except TimeoutError: | |
_LOGGER.debug("[%s] _check_state Socket timed out", self._device_id) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except Exception as e: | |
_LOGGER.exception( | |
"[%s] Unexpected socket error", | |
self._device_id, | |
exc_info=e, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
def _check_state(self) -> ParseMessageResult: | |
if not self._socket: | |
_LOGGER.warning("[%s] Socket closed", self._device_id) | |
return ParseMessageResult.ERROR | |
now = time.time() | |
self._check_refresh(now) | |
self._check_heartbeat(now) | |
try: | |
try: | |
msg = self._socket.recv(512) | |
if len(msg) == 0: | |
if self._is_run: | |
_LOGGER.warning( | |
"[%s] Socket error - Connection closed by peer", | |
self._device_id, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except OSError as e: | |
_LOGGER.warning("[%s] Failed to receive message: %s", self._device_id, str(e)) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except Exception as e: | |
_LOGGER.exception("[%s] Unexpected error during message reception: %s", self._device_id, str(e)) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
return self.parse_message(msg) | |
except TimeoutError: | |
_LOGGER.debug("[%s] _check_state Socket timed out", self._device_id) | |
self.close_socket() | |
return ParseMessageResult.ERROR | |
except Exception as e: | |
_LOGGER.exception( | |
"[%s] Unexpected socket error", | |
self._device_id, | |
exc_info=e, | |
) | |
self.close_socket() | |
return ParseMessageResult.ERROR |
mark to draft as the timeout and process is error, need to continue work on it. will finish it ASAP and mark to ready for review |
Replaced by #296 |
Fixes: #290
Summary by CodeRabbit
New Features
Bug Fixes