Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async support. #149

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Async support. #149

wants to merge 2 commits into from

Conversation

camlee
Copy link

@camlee camlee commented Oct 8, 2021

I'd like to use statsd for an async codebase I have. Hoping to start adding proper support. I believe this can be done without doubling the API surface area (not every part of the API should be async) and without dropping support for Python 2 (although if we only had to support Python 3.5+ as per milestone 4.0, the code would be a bit cleaner).

The first commit just starts simple by allowing timing of coroutine functions. Still outstanding is actually doing asynchronous IO or synchronous IO on a background thread. I'll work on that next. Submitting this PR now to potentially kick off any required discussion and review the general approach.

Doc changes included in this PR.

I couldn't figure out how the tests work so haven't run or touched those. Maybe someone can help.

@jsocol
Copy link
Owner

jsocol commented Oct 12, 2021

4.0 seems like a great time to drop support for Python <3.5.

What do you have in mind for the pieces of the API that should have async versions? Timing coroutines like you've done here is definitely important.

I couldn't figure out how the tests work so haven't run or touched those. Maybe someone can help.

Ah I've never moved this package over to using GitHub Actions for tests. In the meantime, running tox or nosetests statsd should work.

@camlee
Copy link
Author

camlee commented Oct 13, 2021

So should I change this PR to be Python 3.5 or greater? Things would be simpler but a lot is not async specific (ex. super() vs super(StatsClient, self), from __future__ import division, etc...).

I just added a commit for what I'm thinking for an async version. It's basically... not async. We could have async versions of StatsClient._send(), StatsClient._send_stat(), incr(), decr(), timing(), etc... This would mean non-blocking IO, but it's only non-blocking from the perspective of the event loop. It would still block the code being written. Or could block: in practice, particularly with UDP, you're not going to see any blocking. One of the key considerations for async code is back-pressure: when you have something producing network traffic (like pystatsd does) and something else is consuming it (the statsd server in this case), you generally don't want your producer to go faster than your consumer. That will fill up network buffers, application memory, increase latency, eventually crash servers, etc... Generally, the producer should slow down to not outpace the consumer. But for metrics, this isn't really what we want: we're talking about slowing down the application so as to not overwhelm the metrics infrastructure. For metrics, you likely want to keep the application running full-speed even if it means dropping individual data points.
So what I've written up is a new StatsdClient class: ThreadStatsClient. This can safely be used in both synchronous and asynchronous code and does all the network activity in a background thread (one without an event loop running). Simpler interface: it's all synchronous and non-blocking which can be used in either sync or async functions. If there happens to be some sort of network blocking (even for UDP, if that can happen), it'll just queue the stats up and send when it can: no risk of blocking the event loop. Of course, it's possible the queue will fill up, in which case metrics are dropped (or someone prefers not to risk losing metrics, they can set no_fail=True in which case they'll get a queue.Full exception just like the TCP StatsClient can raise socket exceptions right now.
Let me know what you think of this approach after checking the code.

Got the tests running. Thanks. Doesn't look like I've broken anything but I'll try adding some tests for this new stuff.

@orsinium
Copy link

orsinium commented Nov 1, 2021

If performance isn't a consideration here, just out of curiosity, how this change affects the performance? And how would it affect the performance if the network requests also use asyncio?

@camlee
Copy link
Author

camlee commented Nov 1, 2021

Not sure I follow 100%. I think performance is a consideration, it's what impact we want a performance degradation to have that I was trying to describe. Specifically, this change means that in an async environment, if there is a performance issue in doing network IO related to metrics, that will only impact metrics and NOT block the event loop (which would typically be considered really bad).

There's also the performance-related question of whether it's faster to put something in a Python queue or to send data over a socket. I think for the default (UDP) client, it's not a significant difference.

If statsd did it's network requests using asyncio (or trio or anyio: worth keeping in mind that asyncio isn't the only game in town) on the same thread, that could negatively impact application performance. It wouldn't block the event loop, but the particular statsd call would slow down the coroutine calling it. Like it does right now for synchronous code, hence why UDP is desirable.
If on the other hand you're asking about if network requests were doing both in a thread and using asyncio, that's a possibility that could make sense. I haven't implemented that here.

@orsinium orsinium mentioned this pull request Nov 2, 2021
@camlee camlee changed the title WIP: Async support. Async support. Apr 24, 2022
csestelo pushed a commit to Destygo/pystatsd that referenced this pull request Jul 18, 2022
* Clone use cases from the use case store as draft

* Upgrade requirements and precommits
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants