Merge pull request #1 from SumoLogic-Labs/adi-launch
Open source `token-refresher`
Showing 15 changed files with 1,274 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
FROM --platform=$BUILDPLATFORM golang:1.22.0 AS builder | ||
|
||
ARG TARGETOS | ||
ARG TARGETARCH | ||
|
||
WORKDIR /token-refresher | ||
|
||
COPY . . | ||
|
||
RUN go vet ./... && \ | ||
go test -v -race ./... && \ | ||
CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -o token-refresher | ||
|
||
FROM alpine | ||
|
||
WORKDIR /token-refresher | ||
|
||
RUN addgroup token-refresher \ | ||
&& adduser -u 1000 -S -G token-refresher token-refresher \ | ||
&& chown -R token-refresher:token-refresher /token-refresher | ||
|
||
USER token-refresher | ||
|
||
COPY --from=builder /token-refresher/token-refresher /usr/local/bin/token-refresher | ||
|
||
ENTRYPOINT ["token-refresher"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
# Overview | ||
|
||
`service-account-token-refresher` is a lightweight sidecar that keeps the service account token projected by kubelet valid. It renews the token when kubelet doesn't, working around a known Kubernetes issue where kubelet stops refreshing the token once a pod enters the termination phase. | ||
|
||
For more details on the issue, visit: [Kubelet stops rotating service account tokens when pod is terminating, breaking preStop hooks](https://github.com/kubernetes/kubernetes/issues/116481) | ||
|
||
This issue particularly affects pods that require a significant amount of time, potentially hours or days, to shut down gracefully after Kubernetes sends a termination signal. | ||
|
||
![Working](assets/token-refresher.png) | ||
|
||
# Deployment Instructions | ||
|
||
1. Build and push the Docker image using the provided Dockerfile to your preferred container registry. | ||
2. Deploy the tool to your cluster using the sample Kubernetes manifest found here: [examples/token-refresher.yaml](./examples/token-refresher.yaml) | ||
|
||
# How It Works | ||
|
||
The token refresher runs as a sidecar container alongside the main application that depends on the service account token. It writes a new token to a custom location, and the main container is configured to read the token from that path (the sample manifest does this through `AWS_WEB_IDENTITY_TOKEN_FILE`). The refresher goes through several key phases: | ||
|
||
1. **Initialization** | ||
|
||
Initially, the refresher checks for the existence of a custom token. If it's missing, it sets up a symlink to the default projected token. | ||
|
||
2. **Monitoring** | ||
|
||
Then it enters a passive state where it periodically checks if the current token is expiring soon while waiting for a termination signal from Kubernetes. | ||
|
||
If it receives a shutdown signal (either from Kubernetes or the application) or it detects that the token is about to expire, it transitions to the active state. | ||
|
||
3. **Refreshing** | ||
|
||
In the active state, the refresher regularly requests a new token from the Kubernetes API server before the current one expires (see `--refresh_interval` and `--expiration_duration`). If a request fails, it retries up to `--max_attempts` times, sleeping for `--sleep` between attempts. This process continues until the application signals the refresher to stop. | ||
|
||
# Usage | ||
|
||
```sh | ||
$ token-refresher --help | ||
A sidecar which starts auto-refreshing the service account token when the default one is close to expiry or container receives a shutdown signal. | ||
|
||
Usage: | ||
token-refresher [flags] | ||
|
||
Flags: | ||
--default_token_file string path to default service account token file (default "/var/run/secrets/eks.amazonaws.com/serviceaccount/token") | ||
--expiration_duration duration token expiry duration (default 2h0m0s) | ||
-h, --help help for token-refresher | ||
--kubeconfig string (optional) absolute path to the kubeconfig file (default "/home/token-refresher/.kube/config") | ||
--max_attempts int max retries on token refresh failure (default 3) | ||
-n, --namespace string current namespace | ||
--refresh_interval duration token refresh interval (default 1h0m0s) | ||
-s, --service_account string name of service account to issue token for | ||
--sleep duration sleep duration between retries (default 20s) | ||
--token_audience strings comma separated token audience (default [sts.amazonaws.com]) | ||
--token_file string path to self-managed service account token file (default "/var/run/secrets/token-refresher/token") | ||
``` | ||
|
||
# Backstory | ||
|
||
While moving a microservice to Kubernetes, we encountered a scenario where the service required over 24 hours to fully drain. We set up a `preStop` hook and extended `terminationGracePeriodSeconds` to accommodate this. However, we soon faced `ExpiredTokenException` errors. | ||
|
||
Investigation led us to a bug in Kubernetes, still unresolved as of March 2024, detailed here: [Kubelet stops rotating service account tokens when pod is terminating, breaking preStop hooks](https://github.com/kubernetes/kubernetes/issues/116481). | ||
|
||
We attempted a workaround by extending the token expiration using the `eks.amazonaws.com/token-expiration` annotation, but it couldn't exceed 24 hours, as discussed [here](https://github.com/aws/amazon-eks-pod-identity-webhook#amazon-eks-pod-identity-webhook). We then looked at the cluster-level `service-account-max-token-expiration` flag, only to be blocked by an open feature request that prevented us from adjusting it ourselves: [Allow user to modify the kube-apiserver flag --service-account-max-token-expiration](https://github.com/aws/containers-roadmap/issues/1836). | ||
|
||
We also considered using long-lived tokens, but they were incompatible due to a hardcoded issuer in the tokens, which was not accepted as per the error: `An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Issuer must be a valid URL`. We needed the issuer to match the cluster's OIDC HTTP URL. | ||
|
||
After exhausting all other available options, we decided to build our own token-refresher. It began as a simple shell script to fetch new tokens from the API server, but as complexity grew with retries and error handling, and with the need for better testing, we developed this Go-based service. | ||
|
||
During testing, we encountered another hiccup where the refresher would start after the main container, causing errors due to the missing token. To resolve this, we added an init container to create the necessary symlink from the custom token to the default one at startup. | ||
|
||
This refresher has proven to be very effective for us, and we hope it will be beneficial to you as well! |
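
The initialization step described in the README (link the custom token path to the default projected token when no custom token exists yet) can be sketched in Go roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names `ensureTokenLink` and `demo` are hypothetical, and the temporary directory stands in for the real token paths.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// ensureTokenLink (hypothetical helper) makes tokenFile usable before the
// first refresh: if nothing exists at tokenFile, it symlinks tokenFile to
// defaultTokenFile. It reports whether a new symlink was created.
func ensureTokenLink(defaultTokenFile, tokenFile string) (bool, error) {
	// Lstat does not follow symlinks, so an existing link counts as present.
	if _, err := os.Lstat(tokenFile); err == nil {
		return false, nil
	} else if !errors.Is(err, fs.ErrNotExist) {
		return false, err
	}
	if err := os.MkdirAll(filepath.Dir(tokenFile), 0o755); err != nil {
		return false, err
	}
	if err := os.Symlink(defaultTokenFile, tokenFile); err != nil {
		return false, err
	}
	return true, nil
}

// demo exercises ensureTokenLink twice in a temporary directory and reads the
// token back through the custom path.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "token-refresher-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	defaultToken := filepath.Join(dir, "default", "token")
	customToken := filepath.Join(dir, "custom", "token")
	if err := os.MkdirAll(filepath.Dir(defaultToken), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(defaultToken, []byte("projected-token"), 0o600); err != nil {
		return "", err
	}

	created, err := ensureTokenLink(defaultToken, customToken)
	if err != nil {
		return "", err
	}
	again, err := ensureTokenLink(defaultToken, customToken)
	if err != nil {
		return "", err
	}
	data, err := os.ReadFile(customToken)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("created=%t again=%t token=%s", created, again, data), nil
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Because the second call finds the existing link and leaves it alone, the same check could run both in an init container and at sidecar startup without one undoing the other.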
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
package cmd | ||
|
||
import ( | ||
"fmt" | ||
"os" | ||
"path/filepath" | ||
"time" | ||
|
||
"github.com/SumoLogic-Labs/service-account-token-refresher/pkg/signals" | ||
tokenrefresher "github.com/SumoLogic-Labs/service-account-token-refresher/pkg/token-refresher" | ||
|
||
"github.com/spf13/cobra" | ||
"github.com/spf13/viper" | ||
"k8s.io/client-go/util/homedir" | ||
) | ||
|
||
type config struct { | ||
tokenrefresher.TokenRefresher `mapstructure:",squash"` | ||
} | ||
|
||
var conf *config | ||
|
||
var rootCmd = &cobra.Command{ | ||
Use: "token-refresher", | ||
Short: "Automatic token refresher for terminating pods", | ||
Long: `A sidecar which starts auto-refreshing the service account token when the default one is close to expiry or container receives a shutdown signal.`, | ||
Run: func(cmd *cobra.Command, args []string) { | ||
stopCh := signals.SignalShutdown() | ||
refresher := conf.TokenRefresher | ||
if err := refresher.Run(stopCh); err != nil { | ||
fmt.Fprintf(os.Stderr, "unable to run: %v\n", err) | ||
os.Exit(2) | ||
} | ||
fmt.Println("Exiting") | ||
}, | ||
} | ||
|
||
func Execute() { | ||
err := rootCmd.Execute() | ||
if err != nil { | ||
os.Exit(1) | ||
} | ||
} | ||
|
||
func init() { | ||
cobra.OnInitialize(initConfig) | ||
|
||
// The flag names must match those from conf.TokenRefresher | ||
rootCmd.Flags().StringP("namespace", "n", "", "current namespace") | ||
rootCmd.Flags().StringP("service_account", "s", "", "name of service account to issue token for") | ||
rootCmd.Flags().String("default_token_file", "/var/run/secrets/eks.amazonaws.com/serviceaccount/token", "path to default service account token file") | ||
rootCmd.Flags().String("token_file", "/var/run/secrets/token-refresher/token", "path to self-managed service account token file") | ||
rootCmd.Flags().StringSlice("token_audience", []string{"sts.amazonaws.com"}, "comma separated token audience") | ||
rootCmd.Flags().Duration("expiration_duration", time.Hour*2, "token expiry duration") | ||
rootCmd.Flags().Duration("refresh_interval", time.Hour*1, "token refresh interval") | ||
rootCmd.Flags().Int("max_attempts", 3, "max retries on token refresh failure") | ||
rootCmd.Flags().Duration("sleep", time.Second*20, "sleep duration between retries") | ||
|
||
if home := homedir.HomeDir(); home != "" { | ||
rootCmd.Flags().String("kubeconfig", filepath.Join(home, ".kube", "config"), "(optional) absolute path to the kubeconfig file") | ||
} else { | ||
rootCmd.Flags().String("kubeconfig", "", "absolute path to the kubeconfig file") | ||
} | ||
|
||
viper.BindPFlags(rootCmd.LocalFlags()) | ||
} | ||
|
||
func initConfig() { | ||
viper.AutomaticEnv() // read in upper-cased env vars corresponding to above CLI flags | ||
conf = new(config) | ||
err := viper.Unmarshal(conf) | ||
if err != nil { | ||
fmt.Fprintf(os.Stderr, "unable to decode into config struct: %v\n", err) | ||
os.Exit(1) | ||
} | ||
} |
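
The `max_attempts` and `sleep` flags defined above suggest a bounded retry loop around each token refresh. The sketch below shows one way such a loop might look. It is an assumption-laden illustration rather than the code in `pkg/token-refresher`: `refreshWithRetry` and `flakyRefresher` are hypothetical names, and `max_attempts` is interpreted here as the total number of attempts.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// refreshWithRetry (hypothetical) calls refresh until it succeeds or
// maxAttempts calls have failed, sleeping between failed attempts.
// It returns the number of calls made.
func refreshWithRetry(refresh func() error, maxAttempts int, sleep time.Duration) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = refresh(); lastErr == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(sleep) // back off before the next attempt
		}
	}
	return maxAttempts, fmt.Errorf("token refresh failed after %d attempts: %w", maxAttempts, lastErr)
}

// flakyRefresher returns a stand-in refresh function that fails its first n
// calls, simulating a temporarily unavailable API server.
func flakyRefresher(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return errors.New("api server unavailable")
		}
		return nil
	}
}

func main() {
	attempts, err := refreshWithRetry(flakyRefresher(2), 3, 10*time.Millisecond)
	fmt.Printf("attempts=%d err=%v\n", attempts, err) // prints: attempts=3 err=<nil>
}
```

Wrapping the last error with `%w` keeps the underlying cause available to callers through `errors.Is` and `errors.As`.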
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
apiVersion: v1 | ||
kind: ServiceAccount | ||
metadata: | ||
  name: app | ||
  namespace: app | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRole | ||
metadata: | ||
  name: token-refresher | ||
  namespace: app | ||
rules: | ||
  - apiGroups: [""] | ||
    resources: ["serviceaccounts/token"] | ||
    verbs: ["create"] | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRoleBinding | ||
metadata: | ||
  name: token-refresher | ||
  namespace: app | ||
subjects: | ||
  - kind: ServiceAccount | ||
    name: app | ||
    namespace: app | ||
roleRef: | ||
  kind: ClusterRole | ||
  name: token-refresher | ||
  apiGroup: rbac.authorization.k8s.io | ||
--- | ||
apiVersion: v1 | ||
kind: Pod | ||
metadata: | ||
  name: long-draining-pod | ||
  namespace: app | ||
spec: | ||
  containers: | ||
    - image: service-account-token-refresher:latest # update this | ||
      imagePullPolicy: Always | ||
      name: token-refresher | ||
      env: | ||
        - name: DEFAULT_TOKEN_FILE | ||
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token | ||
        - name: TOKEN_FILE | ||
          value: /var/run/secrets/token-refresher/token | ||
        - name: EXPIRATION_DURATION | ||
          value: 10m | ||
        - name: REFRESH_INTERVAL | ||
          value: 1m | ||
        - name: NAMESPACE | ||
          valueFrom: | ||
            fieldRef: | ||
              apiVersion: v1 | ||
              fieldPath: metadata.namespace | ||
        - name: SERVICE_ACCOUNT | ||
          value: app | ||
        - name: AWS_WEB_IDENTITY_TOKEN_FILE | ||
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token | ||
      volumeMounts: | ||
        - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount | ||
          name: aws-iam-token | ||
          readOnly: true | ||
        - mountPath: /var/run/secrets/token-refresher | ||
          name: token-refresher | ||
    - name: long-draining-app | ||
      image: alpine | ||
      command: | ||
        - sh | ||
        - -c | ||
        - | | ||
          for i in `seq 1 10` | ||
          do | ||
            # prints the token's expiry | ||
            echo "Now: $(date)" | ||
            EXPIRY=$(awk -F . '{if (length($2) % 4 == 3) print $2"="; else if (length($2) % 4 == 2) print $2"=="; else print $2; }' $AWS_WEB_IDENTITY_TOKEN_FILE | tr -- '-_' '+/' | base64 -d | awk -F , '{print $2}' | awk -F : '{print "@"$2}' | xargs date -d) | ||
            echo "Exp: $EXPIRY" | ||
            echo | ||
            sleep 20s | ||
          done | ||
      lifecycle: | ||
        preStop: | ||
          exec: | ||
            command: | ||
              - sh | ||
              - -c | ||
              - # custom draining logic here | ||
                sleep 30s && | ||
                touch /var/run/secrets/token-refresher/shutdown | ||
      env: | ||
        - name: AWS_WEB_IDENTITY_TOKEN_FILE | ||
          value: /var/run/secrets/token-refresher/token | ||
      volumeMounts: | ||
        - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount | ||
          name: aws-iam-token | ||
          readOnly: true | ||
        - name: token-refresher | ||
          mountPath: /var/run/secrets/token-refresher | ||
          readOnly: false | ||
  terminationGracePeriodSeconds: 180 | ||
  serviceAccountName: app | ||
  volumes: | ||
    - name: token-refresher | ||
      emptyDir: {} | ||
    - name: aws-iam-token | ||
      projected: | ||
        defaultMode: 420 | ||
        sources: | ||
          - serviceAccountToken: | ||
              audience: sts.amazonaws.com | ||
              path: token |
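
The sample container above prints the token's expiry with an `awk`/`base64` pipeline that relies on `exp` being the second field of the JWT payload. For comparison, here is a small Go sketch that reads the `exp` claim regardless of claim order and checks whether the token is close to expiry. It is illustrative only and not taken from the project: `tokenExpiry`, `expiresWithin`, and `fakeToken` are hypothetical names, and the token's signature is deliberately not verified.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// tokenExpiry extracts the "exp" claim from a JWT without verifying its
// signature. That is enough for scheduling a refresh, not for authentication.
func tokenExpiry(token string) (time.Time, error) {
	parts := strings.Split(strings.TrimSpace(token), ".")
	if len(parts) != 3 {
		return time.Time{}, errors.New("token is not a three-part JWT")
	}
	// JWT segments use unpadded URL-safe base64.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return time.Time{}, fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Exp float64 `json:"exp"` // seconds since the Unix epoch
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return time.Time{}, fmt.Errorf("parse claims: %w", err)
	}
	if claims.Exp == 0 {
		return time.Time{}, errors.New("token has no exp claim")
	}
	return time.Unix(int64(claims.Exp), 0).UTC(), nil
}

// expiresWithin reports whether expiry falls at or before now+d.
func expiresWithin(expiry, now time.Time, d time.Duration) bool {
	return !expiry.After(now.Add(d))
}

// fakeToken builds an unsigned, JWT-shaped string for demonstration only.
func fakeToken(exp int64) string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"none"}`))
	payload := enc([]byte(fmt.Sprintf(`{"aud":["sts.amazonaws.com"],"exp":%d}`, exp)))
	return header + "." + payload + ".sig"
}

func main() {
	expiry, err := tokenExpiry(fakeToken(time.Now().Add(10 * time.Minute).Unix()))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Exp:", expiry.Format(time.RFC3339))
	fmt.Println("expires within 1h:", expiresWithin(expiry, time.Now(), time.Hour))
}
```

A check like `expiresWithin` is the kind of test a monitoring loop could run on each tick to decide when to move from the passive state to the active one.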
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
module github.com/SumoLogic-Labs/service-account-token-refresher | ||
|
||
go 1.22.0 | ||
|
||
require ( | ||
github.com/spf13/cobra v1.8.0 | ||
github.com/spf13/viper v1.18.2 | ||
k8s.io/api v0.29.2 | ||
k8s.io/apimachinery v0.29.2 | ||
k8s.io/client-go v0.29.2 | ||
) | ||
|
||
require ( | ||
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect | ||
github.com/emicklei/go-restful/v3 v3.11.3 // indirect | ||
github.com/evanphx/json-patch v4.12.0+incompatible // indirect | ||
github.com/fsnotify/fsnotify v1.7.0 // indirect | ||
github.com/go-logr/logr v1.4.1 // indirect | ||
github.com/go-openapi/jsonpointer v0.21.0 // indirect | ||
github.com/go-openapi/jsonreference v0.21.0 // indirect | ||
github.com/go-openapi/swag v0.23.0 // indirect | ||
github.com/gogo/protobuf v1.3.2 // indirect | ||
github.com/golang/protobuf v1.5.4 // indirect | ||
github.com/google/gnostic-models v0.6.8 // indirect | ||
github.com/google/gofuzz v1.2.0 // indirect | ||
github.com/google/uuid v1.6.0 // indirect | ||
github.com/hashicorp/hcl v1.0.0 // indirect | ||
github.com/imdario/mergo v0.3.16 // indirect | ||
github.com/inconshreveable/mousetrap v1.1.0 // indirect | ||
github.com/josharian/intern v1.0.0 // indirect | ||
github.com/json-iterator/go v1.1.12 // indirect | ||
github.com/magiconair/properties v1.8.7 // indirect | ||
github.com/mailru/easyjson v0.7.7 // indirect | ||
github.com/mitchellh/mapstructure v1.5.0 // indirect | ||
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect | ||
github.com/modern-go/reflect2 v1.0.2 // indirect | ||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect | ||
github.com/pelletier/go-toml/v2 v2.1.1 // indirect | ||
github.com/pkg/errors v0.9.1 // indirect | ||
github.com/sagikazarmark/locafero v0.4.0 // indirect | ||
github.com/sagikazarmark/slog-shim v0.1.0 // indirect | ||
github.com/sourcegraph/conc v0.3.0 // indirect | ||
github.com/spf13/afero v1.11.0 // indirect | ||
github.com/spf13/cast v1.6.0 // indirect | ||
github.com/spf13/pflag v1.0.5 // indirect | ||
github.com/subosito/gotenv v1.6.0 // indirect | ||
go.uber.org/multierr v1.11.0 // indirect | ||
golang.org/x/exp v0.0.0-20240222234643-814bf88cf225 // indirect | ||
golang.org/x/net v0.22.0 // indirect | ||
golang.org/x/oauth2 v0.18.0 // indirect | ||
golang.org/x/sys v0.18.0 // indirect | ||
golang.org/x/term v0.18.0 // indirect | ||
golang.org/x/text v0.14.0 // indirect | ||
golang.org/x/time v0.5.0 // indirect | ||
google.golang.org/appengine v1.6.8 // indirect | ||
google.golang.org/protobuf v1.33.0 // indirect | ||
gopkg.in/inf.v0 v0.9.1 // indirect | ||
gopkg.in/ini.v1 v1.67.0 // indirect | ||
gopkg.in/yaml.v2 v2.4.0 // indirect | ||
gopkg.in/yaml.v3 v3.0.1 // indirect | ||
k8s.io/klog/v2 v2.120.1 // indirect | ||
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 // indirect | ||
k8s.io/utils v0.0.0-20240102154912-e7106e64919e // indirect | ||
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect | ||
sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect | ||
sigs.k8s.io/yaml v1.4.0 // indirect | ||
) |