Async support. #149

camlee · 2021-10-08T04:48:34Z

I'd like to use statsd for an async codebase I have. Hoping to start adding proper support. I believe this can be done without doubling the API surface area (not every part of the API should be async) and without dropping support for Python 2 (although if we only had to support Python 3.5+ as per milestone 4.0, the code would be a bit cleaner).

The first commit just starts simple by allowing timing of coroutine functions. Still outstanding is actually doing asynchronous IO or synchronous IO on a background thread. I'll work on that next. Submitting this PR now to potentially kick off any required discussion and review the general approach.

Doc changes included in this PR.

I couldn't figure out how the tests work so haven't run or touched those. Maybe someone can help.

jsocol · 2021-10-12T13:21:59Z

4.0 seems like a great time to drop support for Python <3.5.

What do you have in mind for the pieces of the API that should have async versions? Timing coroutines like you've done here is definitely important.

I couldn't figure out how the tests work so haven't run or touched those. Maybe someone can help.

Ah I've never moved this package over to using GitHub Actions for tests. In the meantime, running tox or nosetests statsd should work.

camlee · 2021-10-13T00:38:34Z

So should I change this PR to be Python 3.5 or greater? Things would be simpler but a lot is not async specific (ex. super() vs super(StatsClient, self), from __future__ import division, etc...).

I just added a commit for what I'm thinking for an async version. It's basically... not async. We could have async versions of StatsClient._send(), StatsClient._send_stat(), incr(), decr(), timing(), etc... This would mean non-blocking IO, but it's only non-blocking from the perspective of the event loop. It would still block the code being written. Or could block: in practice, particularly with UDP, you're not going to see any blocking. One of the key considerations for async code is back-pressure: when you have something producing network traffic (like pystatsd does) and something else is consuming it (the statsd server in this case), you generally don't want your producer to go faster than your consumer. That will fill up network buffers, application memory, increase latency, eventually crash servers, etc... Generally, the producer should slow down to not outpace the consumer. But for metrics, this isn't really what we want: we're talking about slowing down the application so as to not overwhelm the metrics infrastructure. For metrics, you likely want to keep the application running full-speed even if it means dropping individual data points.
So what I've written up is a new StatsdClient class: ThreadStatsClient. This can safely be used in both synchronous and asynchronous code and does all the network activity in a background thread (one without an event loop running). Simpler interface: it's all synchronous and non-blocking which can be used in either sync or async functions. If there happens to be some sort of network blocking (even for UDP, if that can happen), it'll just queue the stats up and send when it can: no risk of blocking the event loop. Of course, it's possible the queue will fill up, in which case metrics are dropped (or someone prefers not to risk losing metrics, they can set no_fail=True in which case they'll get a queue.Full exception just like the TCP StatsClient can raise socket exceptions right now.
Let me know what you think of this approach after checking the code.

Got the tests running. Thanks. Doesn't look like I've broken anything but I'll try adding some tests for this new stuff.

orsinium · 2021-11-01T14:26:17Z

If performance isn't a consideration here, just out of curiosity, how this change affects the performance? And how would it affect the performance if the network requests also use asyncio?

camlee · 2021-11-01T23:16:47Z

Not sure I follow 100%. I think performance is a consideration, it's what impact we want a performance degradation to have that I was trying to describe. Specifically, this change means that in an async environment, if there is a performance issue in doing network IO related to metrics, that will only impact metrics and NOT block the event loop (which would typically be considered really bad).

There's also the performance-related question of whether it's faster to put something in a Python queue or to send data over a socket. I think for the default (UDP) client, it's not a significant difference.

If statsd did it's network requests using asyncio (or trio or anyio: worth keeping in mind that asyncio isn't the only game in town) on the same thread, that could negatively impact application performance. It wouldn't block the event loop, but the particular statsd call would slow down the coroutine calling it. Like it does right now for synchronous code, hence why UDP is desirable.
If on the other hand you're asking about if network requests were doing both in a thread and using asyncio, that's a possibility that could make sense. I haven't implemented that here.

* Clone use cases from the use case store as draft * Upgrade requirements and precommits

Properly timing coroutines. Resolves jsocol#119.

7b50f7a

Resolves jsocol#119: Added ThreadStatsClient for safe use in async code.

031fc66

orsinium mentioned this pull request Nov 2, 2021

Drop Python 2.7 support #150

Closed

camlee changed the title ~~WIP: Async support.~~ Async support. Apr 24, 2022

csestelo pushed a commit to Destygo/pystatsd that referenced this pull request Jul 18, 2022

Clone use cases from the use case store as draft (jsocol#149)

c00c11d

* Clone use cases from the use case store as draft * Upgrade requirements and precommits

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Async support. #149

Async support. #149

camlee commented Oct 8, 2021

jsocol commented Oct 12, 2021

camlee commented Oct 13, 2021

orsinium commented Nov 1, 2021

camlee commented Nov 1, 2021

Async support. #149

Are you sure you want to change the base?

Async support. #149

Conversation

camlee commented Oct 8, 2021

jsocol commented Oct 12, 2021

camlee commented Oct 13, 2021

orsinium commented Nov 1, 2021

camlee commented Nov 1, 2021