Home
- What Is Nexa?
- What Is Nexa For?
- What Problem Does Nexa Solve?
- What Design Principles Underlie Nexa?
- How Does Nexa Accomplish Its Goals?
In a distributed environment, inevitably some of the many service dependencies will fail. Nexa is a library that helps you control the interactions between these distributed services by adding latency tolerance and fault tolerance logic. Nexa does this by isolating points of access between the services, stopping cascading failures across them, and providing fallback options, all of which improve your system’s overall resiliency.
Nexa evolved out of resilience engineering work that the Netflix API team began in 2011. In 2012, Nexa continued to evolve and mature, and many teams within
Nexa is designed to do the following:
- Give protection from and control over latency and failure from dependencies accessed (typically over the network) via third-party client libraries.
- Stop cascading failures in a complex distributed system.
- Fail fast and rapidly recover.
- Fall back and degrade gracefully when possible.
- Enable near real-time monitoring, alerting, and operational control.
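One common way to "fail fast and rapidly recover" is a circuit breaker that stops calling a dependency after repeated failures and lets a trial call through after a cool-down. The sketch below illustrates that general technique only; it is not Nexa's API. The class name `CircuitBreaker`, its parameters (`failure_threshold`, `reset_after_s`, `clock`) and the single-threaded design are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical, single-threaded circuit breaker (not Nexa's API)."""

    def __init__(
        self,
        failure_threshold: int,
        reset_after_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._consecutive_failures = 0
        self._opened_at: Optional[float] = None

    def call(self, action: Callable[[], T], fallback: Callable[[], T]) -> T:
        # While the circuit is open, fail fast: skip the dependency entirely.
        if self._is_open():
            return fallback()
        try:
            result = action()
        except Exception:
            self._record_failure()
            return fallback()
        # A success closes the circuit and clears the failure count.
        self._consecutive_failures = 0
        self._opened_at = None
        return result

    def _is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once the cool-down has elapsed, allow a trial call through.
        return self._clock() - self._opened_at < self._reset_after_s

    def _record_failure(self) -> None:
        self._consecutive_failures += 1
        if self._consecutive_failures >= self._failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self._opened_at = self._clock()
```

A caller would wrap each dependency call as `breaker.call(fetch_live_data, use_cached_data)`, where both function names are placeholders. A production version would also need locking, since this sketch is not thread-safe.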
Applications in complex distributed architectures have dozens of dependencies, each of which will inevitably fail at some point. If the host application is not isolated from these external failures, it risks being taken down with them.
For example, for an application that depends on 30 services where each service has 99.99% uptime, here is what you can expect:
99.99%^30 = 99.7% uptime
0.3% of 1 billion requests = 3,000,000 failures
2+ hours downtime/month even if all dependencies have excellent uptime.
Reality is generally worse.
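The figures above can be reproduced with a few lines of arithmetic. This sketch assumes that the 30 dependencies fail independently, that every request touches all of them, and that a month has 30 days; the constant and function names are illustrative.

```python
DEPENDENCIES = 30
UPTIME_PER_DEPENDENCY = 0.9999   # 99.99%
REQUESTS = 1_000_000_000
HOURS_PER_MONTH = 30 * 24        # assumption: a 30-day month


def combined_uptime(per_dependency: float, count: int) -> float:
    """Probability that all `count` independent dependencies are up at once."""
    return per_dependency ** count


uptime = combined_uptime(UPTIME_PER_DEPENDENCY, DEPENDENCIES)
failure_rate = 1.0 - uptime
failed_requests = failure_rate * REQUESTS
downtime_hours = failure_rate * HOURS_PER_MONTH

print(f"combined uptime:  {uptime:.1%}")
print(f"failed requests:  {failed_requests:,.0f}")
print(f"downtime / month: {downtime_hours:.2f} hours")
```

The unrounded failure rate is slightly under 0.3%, so the exact failure count comes out a little below the rounded 3,000,000 quoted above.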
Nexa works by:
- Preventing any single dependency from using up all container (such as Tomcat) user threads.
- Shedding load and failing fast instead of queueing.
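The two principles above can be sketched with a bounded semaphore per dependency: a fixed number of permits caps how many caller threads one dependency can occupy, and a non-blocking acquire rejects the excess immediately instead of queueing it. This illustrates the general bulkhead idea under stated assumptions and is not Nexa's implementation; `Bulkhead`, `max_concurrent` and the `fallback` argument are hypothetical names.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class Bulkhead:
    """Hypothetical per-dependency concurrency cap (not Nexa's API)."""

    def __init__(self, max_concurrent: int) -> None:
        self._permits = threading.BoundedSemaphore(max_concurrent)

    def call(self, action: Callable[[], T], fallback: Callable[[], T]) -> T:
        # Non-blocking acquire: if the dependency is already saturated,
        # shed this request right away rather than queueing behind it.
        if not self._permits.acquire(blocking=False):
            return fallback()
        try:
            # Exceptions raised by `action` propagate to the caller.
            return action()
        finally:
            self._permits.release()
```

With one `Bulkhead` per dependency, a slow dependency can tie up at most `max_concurrent` caller threads, so the remaining container threads stay free to serve other requests.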
Nexa does this by: