Skip to content

Commit cd975cf

Browse files
Fix unfinished future in Python Thrift client
Summary: noideadog2 My team has a Python library that uses Thrift to talk to a WWW service that my team also maintains. This Python library is used in Dataswarm to process lots of data. The problem that we are seeing is that, ocasionally, a Thrift deserialization error happens. When this error happens, what we also observe is that our entire Dataswarm pipeline freezes. It is as if something within the Thrift client becomes frozen and it ends up blocking our entire Dataswarm pipeline. What do we see in our logs? What we see in our logs is that, ocasionally, these Dataswarm pipelines log the following error: ``` Traceback (most recent call last): File "thrift.python/serializer.pyx", line 59, in thrift.python.serializer.deserialize File "thrift.python/serializer.pyx", line 55, in thrift.python.serializer.deserialize_with_length thrift.python.exceptions.Error: Encountered invalid field/element type (166) during skipping Exception ignored in: 'thrift.python.client.async_client._async_client_send_request_callback' ``` After we see this message, the Dataswarm pipeline freezes, and we can see that our Python code does not move forward. So, it seems that something is frozen, somewhere. Given the above error message, we suspect it's somewhere in the Thrift infra. After looking at that error message what I did is decided to look at the `_async_client_send_request_callback` function that the stack trace is mentioning. That function is here https://fburl.com/code/nwufgygd According to that stack trace, what seems to be happening is that this call to deserialize here https://fburl.com/code/zajanbby is throwing an exception, and this exception is escaping the `_async_client_send_request_callback` call. Now, this is ** JUST A GUESS **, but I'm wondering: could it be that, because the exception is escaping the `_async_client_send_request_callback` call, we're just leaving the pyfuture unfinalized, and thus everything depending on that future will freeze? This would explain what we're seeing on our end (that is, our Python code in Dataswarm not moving forward). This diff is an attempt at fixing that. Reviewed By: Filip-F Differential Revision: D68503986 fbshipit-source-id: 7e8c2c5f3931fb0008b49ac2c99e6e2ab2a54d12
1 parent cc8868a commit cd975cf

File tree

1 file changed

+14
-7
lines changed

1 file changed

+14
-7
lines changed

third-party/thrift/src/thrift/lib/python/client/async_client.pyx

Lines changed: 14 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ from libcpp.utility cimport move as cmove
3030
from thrift.python.client.omni_client cimport cOmniClientResponseWithHeaders, RpcKind, cOmniInteractionClient, createOmniInteractionClient, cData, FunctionQualifier, InteractionMethodPosition
3131
from thrift.python.client.request_channel cimport RequestChannel
3232
from thrift.python.exceptions cimport create_py_exception
33-
from thrift.python.exceptions import ApplicationError, ApplicationErrorType
33+
from thrift.python.exceptions import ApplicationError, ApplicationErrorType, ProtocolError, ProtocolErrorType
3434
from thrift.python.serializer import serialize_iobuf, deserialize
3535
from thrift.python.mutable_serializer import (
3636
serialize_iobuf as serialize_iobuf_mutable,
@@ -255,12 +255,19 @@ cdef void _async_client_send_request_callback(
255255
stream_cls,
256256
protocol,
257257
)
258-
py_resp = (
259-
deserialize(response_cls, response_iobuf, protocol=protocol)
260-
if not is_mutable_types
261-
else deserialize_mutable(response_cls, response_iobuf, protocol=protocol)
262-
)
263-
pyfuture.set_result(py_resp if py_stream is None else (py_resp, py_stream))
258+
try:
259+
py_resp = (
260+
deserialize(response_cls, response_iobuf, protocol=protocol)
261+
if not is_mutable_types
262+
else deserialize_mutable(response_cls, response_iobuf, protocol=protocol)
263+
)
264+
265+
pyfuture.set_result(py_resp if py_stream is None else (py_resp, py_stream))
266+
except Exception as e:
267+
pyfuture.set_exception(ProtocolError(
268+
ProtocolErrorType.UNKNOWN,
269+
"Deserialization failed, following exception thrown: " + str(e)
270+
))
264271

265272
cdef void _interaction_client_callback(cFollyTry[unique_ptr[cOmniInteractionClient]]&& result,
266273
PyObject* userData,) noexcept:

0 commit comments

Comments
 (0)