IO, resource contention notes, docs and tools

FrodeI/kubernaughty

 
 

kubernaughty

This is a collection of documentation, how-tos, tools and other information on identifying and debugging Kubernetes/container workload failures, along with performance and reliability considerations.

Initially, this investigation started with user-reported failures at the DNS, networking and application levels. However, the analysis showed that the actual causes of these failures were severe resource saturation and contention, IO throttling, kernel panics, etc. For an overview, see Part 1: Summary.

Through the investigation, I've discovered a lack of operational / systems knowledge, tracking and general awareness of the worker nodes / Linux hosts that comprise Kubernetes clusters (including filesystem incompatibility).

There are many gotchas, mud pits and blind spots in running distributed systems, and Kubernetes is no different. My goal with this is to step through the past 20 years of my career (i.e., showing everyone my mistakes and learnings from the past).

Hopefully, this stuff helps you and your team.

This is an ongoing project / labor of love. It is not complete by any means.

Roadmap

Contents:

Screencasts

Kubernaughty 1: IO saturation and throttling

