| 1 | +--- |
| 2 | +title: Global dependencies in FastAPI, done correctly |
| 3 | +description: Globals are the root of all evil, and Python is more than happy to let you indulge. FastAPI provides an alternative mechanism that is not very well documented. |
| 4 | +tags: |
| 5 | + - observations |
| 6 | +--- |
| 7 | +As much as I love Python, it also makes you fight *hard* to avoid doing the wrong things. The wrong thing in this case is global state. |
| 8 | + |
| 9 | +FastAPI implicitly encourages the use of globals through its [Dependency](https://fastapi.tiangolo.com/tutorial/dependencies/) system. You define a global, wrap it in a getter function, declare that getter as a dependency in your handlers, and FastAPI will solve the tree for you, ensuring you don't get race conditions. As much as I appreciate the power and the ergonomics, I really don't like this. There's no way to validate the correct behavior until runtime. It also makes testing hard, usually requiring you to manually override the dependency at runtime. |
| 10 | +# The anti-pattern |
| 11 | + |
| 12 | +Imagine you have a global dependency, say, a database engine. Instead of defining it as a global, let's define it as a function: |
| 13 | + |
| 14 | +```python |
| 15 | +# this is pseudocode, based on async SQLAlchemy off the top of my head |
| 16 | +async def get_engine() -> AsyncGenerator[AsyncEngine, None]: |
| 17 | +    engine = create_async_engine(...)  # a new engine on every call |
| 18 | +    try: |
| 19 | +        yield engine |
| 20 | +    finally: |
| 21 | +        await engine.dispose() |
| 22 | + |
| 23 | +async def get_session(engine: Annotated[AsyncEngine, Depends(get_engine)]) -> AsyncGenerator[AsyncSession, None]: |
| 24 | +    async with AsyncSession(engine) as session: |
| 25 | +        yield session |
| 26 | +``` |
| 27 | + |
| 28 | +Using FastAPI's dependency system, you would use this as follows: |
| 29 | + |
| 30 | +```python |
| 31 | +@app.get("/handle") |
| 32 | +async def my_handler(session: Annotated[AsyncSession, Depends(get_session)]) -> dict[str, str]: |
| 33 | +    my_object = {"my": "thing"}  # stand-in for a mapped ORM object |
| 34 | +    session.add(my_object) |
| 35 | +    await session.commit()  # AsyncSession.commit() is a coroutine |
| 36 | +    return my_object |
| 37 | +``` |
| 38 | +When this endpoint is hit with a `GET` request, FastAPI will solve the dependency tree, finding that `get_session` depends on `get_engine`. It calls `get_engine`, provides the value to `get_session`, and then we have a database session. Simple! |
| 39 | + |
| 40 | +This code has a problem. If you were to keep calling this endpoint, FastAPI would spin up a database engine _per_ request. It's best practice to keep one engine for the lifetime of your application, as it handles all the complicated database pooling nonsense. Creating one per request simply encourages poor performance, as database I/O is likely the main bottleneck for your application. |
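To see the per-request cost concretely, here is a stdlib-only toy model. `FakeEngine` and `simulate_requests` are hypothetical stand-ins, not SQLAlchemy or FastAPI APIs. The sketch assumes only that a `yield` dependency is entered and exited once per request, which is roughly what FastAPI does when it resolves one.

```python
import asyncio
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager


class FakeEngine:
    """Hypothetical stand-in for a database engine; counts its own lifecycle."""

    created = 0
    disposed = 0

    def __init__(self) -> None:
        FakeEngine.created += 1

    async def dispose(self) -> None:
        FakeEngine.disposed += 1


async def get_engine() -> AsyncGenerator[FakeEngine, None]:
    # Same shape as the dependency above: build, yield, dispose.
    engine = FakeEngine()
    try:
        yield engine
    finally:
        await engine.dispose()


async def simulate_requests(count: int) -> None:
    # Each simulated request enters and exits the dependency exactly once.
    for _ in range(count):
        async with asynccontextmanager(get_engine)() as engine:
            assert isinstance(engine, FakeEngine)


asyncio.run(simulate_requests(3))
print(FakeEngine.created, FakeEngine.disposed)
```

Three simulated requests build and tear down three engines, so no connection pool ever survives long enough to be reused.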
| 41 | + |
| 42 | +There are a bunch of ways you can solve this. You can define a global inside your module: |
| 43 | + |
| 44 | +```python |
| 45 | +__engine: AsyncEngine | None = None # I have multiple underscores, pweese do not import me |
| 46 | + |
| 47 | +async def get_engine() -> AsyncGenerator[AsyncEngine, None]: |
| 48 | +    global __engine |
| 49 | +    if __engine is None: |
| 50 | +        __engine = create_async_engine(...)  # built lazily, on first use |
| 51 | + |
| 52 | +    yield __engine |
| 53 | +``` |
| 54 | + |
| 55 | +I don't like this, and neither should you. Another way we can solve this is by using the `functools.cache` decorator (or `functools.lru_cache` if you're on an ancient version of Python). Just throw it on, and now, |
| 56 | + |
| 57 | +```python |
| 58 | +from functools import cache |
| 59 | + |
| 60 | +@cache |
| 61 | +def get_engine() -> AsyncEngine: |
| 62 | +    # a plain function: @cache memoizes the return value, and caching an |
| 63 | +    # async generator would hand every caller the same one-shot generator |
| 64 | +    return create_async_engine(...) |
| 67 | +``` |
| 68 | + |
| 69 | +When this engine is created, our application now has one engine. Problem solved! |
| 70 | + |
| 71 | +Truthfully, this is a suboptimal solution. Our application only creates the engine when a handler that requires the dependency is called. Your application could start up, and things _seem_ alright, but it could then crash if you failed to get a connection for some reason. With the engine tied outside the lifecycle of the application, we don't get predictable teardowns, which has all the potential for side-effects.^[It's like unplugging a hard drive without first ejecting. Sure, you've done it for years and nothing bad has happened. But do you really want to rely on that?] |
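The lazy behaviour is easy to check with the standard library alone. This is a minimal sketch in which `FakeEngine` is a hypothetical stand-in for a real engine. It only illustrates what `functools.cache` does to a zero-argument factory.

```python
from functools import cache


class FakeEngine:
    """Hypothetical stand-in for a database engine."""

    created = 0

    def __init__(self) -> None:
        FakeEngine.created += 1
        self.disposed = False


@cache
def get_engine() -> FakeEngine:
    return FakeEngine()


created_at_startup = FakeEngine.created  # nothing has been built yet
first = get_engine()   # the first caller pays for construction
second = get_engine()  # later callers get the same cached object
```

Nothing is constructed until the first call, and nothing in this arrangement ever disposes of the engine unless you do it by hand.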
| 72 | + |
| 73 | +Some people attempt to solve this conundrum using [`contextvars`](https://github.com/fastapi/fastapi/discussions/8628). Contextvars scare me and I avoid them wherever possible. |
| 74 | + |
| 75 | +Our database should live _immediately before_ and _immediately after_ FastAPI, like an outer layer. We initialize it when FastAPI starts up, and when we hit CTRL-C (aka `SIGINT`), our database has the opportunity to clean itself up. It would be convenient if we could tie it to, say, the _lifespan_ of FastAPI... |
| 76 | +# The right way with ASGI Lifespan |
| 77 | + |
| 78 | +FastAPI supports the aptly named `ASGI Lifespan` protocol, which replaces the deprecated startup and shutdown events. For example, here's a lifespan modified directly [from FastAPI's docs](https://fastapi.tiangolo.com/advanced/events/#lifespan). |
| 79 | + |
| 80 | +```python |
| 81 | +from contextlib import asynccontextmanager |
| 82 | + |
| 83 | +engine: AsyncEngine | None = None |
| 84 | + |
| 85 | +async def get_engine() -> AsyncGenerator[AsyncEngine, None]: |
| 86 | +    yield engine |
| 87 | + |
| 88 | +@asynccontextmanager |
| 89 | +async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]: |
| 90 | +    global engine |
| 91 | +    engine = create_async_engine(...)  # created once, at startup |
| 92 | +    yield |
| 93 | +    await engine.dispose()  # runs at shutdown |
| 94 | +``` |
| 95 | + |
| 96 | +Pretty cool! A big improvement on our old code, as we can properly handle clean-ups. But it's still not optimal, as our handler still relies on the global state. Is it possible to make it, _not_? |
| 97 | + |
| 98 | +Nested in the [ASGI spec](https://asgi.readthedocs.io/en/latest/specs/lifespan.html), there's an interesting trait of lifespans: when you `yield`, you can `yield` stuff from it. Instead of defining a global, you can just, |
| 99 | + |
| 100 | +```python |
| 101 | +class AppState(TypedDict): |
| 102 | +    engine: AsyncEngine |
| 103 | + |
| 104 | +@asynccontextmanager |
| 105 | +async def lifespan(app: FastAPI) -> AsyncGenerator[AppState, None]: |
| 106 | +    engine = create_async_engine(...) |
| 107 | +    yield {"engine": engine}  # becomes the lifespan state |
| 108 | +    await engine.dispose() |
| 109 | +``` |
| 110 | + |
| 111 | +And now, our engine is part of our application! To be more specific, it's part of the `ASGI Scope`. You can access it by simply defining our new session dependency like: |
| 112 | + |
| 113 | +```python |
| 114 | +from fastapi import Request |
| 115 | + |
| 116 | +async def get_session(request: Request) -> AsyncGenerator[AsyncSession, None]: |
| 117 | +    engine = request.scope["state"]["engine"] |
| 118 | +    async with AsyncSession(engine) as session:  # open a session on the shared engine |
| 119 | +        yield session |
| 120 | +``` |
| 121 | + |
| 122 | +And now, inside that same handler, we get a new session built on an engine that's tied to the lifespan of our FastAPI app. Each request receives a shallow copy of the lifespan state, so every request shares the same engine object rather than getting a new one (important for performance). No dependency solving required, as the engine is associated with every request. |
| 123 | + |
| 124 | +When you ask FastAPI to shut down, FastAPI will clean itself up, and then the lifespan will pass its `yield` point, allowing the engine to `dispose` of itself. |
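The whole flow can be modelled without a server. The sketch below is a toy, not Starlette's actual implementation: `FakeEngine` and `serve` are hypothetical names, and the only assumption is that the server enters the lifespan once, copies the yielded state into each request scope, and exits the lifespan at shutdown.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from typing import TypedDict


class FakeEngine:
    """Hypothetical stand-in for a database engine."""

    def __init__(self) -> None:
        self.disposed = False

    async def dispose(self) -> None:
        self.disposed = True


class AppState(TypedDict):
    engine: FakeEngine


@asynccontextmanager
async def lifespan(app: object) -> AsyncIterator[AppState]:
    engine = FakeEngine()
    yield {"engine": engine}
    await engine.dispose()


async def serve(request_count: int) -> tuple[list[FakeEngine], FakeEngine, bool]:
    seen = []
    async with lifespan(None) as state:
        for _ in range(request_count):
            # Each request scope gets its own shallow copy of the state dict.
            scope = {"type": "http", "state": dict(state)}
            seen.append(scope["state"]["engine"])
        disposed_while_serving = state["engine"].disposed
    # Leaving the `async with` resumes the lifespan past its yield point.
    return seen, state["engine"], disposed_while_serving


seen, engine, disposed_while_serving = asyncio.run(serve(3))
```

Every request sees the same engine object, it stays alive for as long as requests are being served, and it is disposed exactly once when the lifespan exits.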
| 125 | + |
| 126 | +ASGI Lifespans are powerful! I wish more people knew about them. In general, you should associate stuff with your application rather than letting it live external to it. I throw pretty much everything into it, including my application settings (for which I use `pydantic_settings`), and all my dependencies are just wrappers that pull directly from the ASGI scope. It also has the benefit of being far more testable, as you can just mock the underlying object injected into the lifespan rather than overriding the dependency itself. It also encourages you to think deeply about what the lifecycle of your application is, which I find has led to more maintainable code. |
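As a rough illustration of the testability point, a dependency that only reads from the scope can be exercised with nothing but a fake request. This is a sketch under assumptions: `get_engine` here is a hypothetical wrapper, and `SimpleNamespace` stands in for a real `Request`. In a real suite you would more likely run the app with a test lifespan that yields the fake object.

```python
from types import SimpleNamespace
from typing import Any


def get_engine(request: Any) -> Any:
    """Dependency wrapper: pull the engine out of the lifespan state."""
    return request.scope["state"]["engine"]


# Any object can play the engine; the wrapper does not care what it is.
fake_engine = object()
fake_request = SimpleNamespace(scope={"state": {"engine": fake_engine}})
resolved = get_engine(fake_request)
```

The wrapper hands back whatever the state contains, so swapping the real engine for a fake is a one-line change where the state is built.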