buivision/agent-sleep

TFE Agent Debugging Workspace

This repository contains a Terraform configuration designed to help debug Terraform Enterprise (TFE) agents.

Its sole purpose is to run a job on a TFE agent and then pause the agent by initiating a sleep command. This keeps the agent's container alive for a set period, allowing an administrator to connect to it (e.g., via docker exec, kubectl exec, or SSH) to inspect its state, environment variables, and filesystem.


How It Works

This configuration uses a null_resource to run a local-exec provisioner.

  1. null_resource: This is an empty resource that acts as a container for provisioners.
  2. triggers: A timestamp() trigger changes on every run, which forces the resource to be replaced and the provisioner to run again on every terraform apply.
  3. local-exec: This provisioner runs a shell script inside the TFE agent container.
  4. trap Command: The script registers a handler with trap '...' EXIT. The shell runs this handler whenever the script exits, whether the main job succeeds or fails. (The handler does not run if the shell is killed with SIGKILL.)
  5. final_sleep Function: This function, called by the trap, prints a final message and then executes sleep ${var.sleep_duration_seconds}, pausing the run and keeping the agent container alive.
  6. var.sleep_duration_seconds: The sleep time is controlled by a Terraform variable, allowing you to set it from the TFE UI without changing code.
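
The trap-and-sleep pattern from steps 4 and 5 can be sketched as a standalone shell snippet. This is a minimal illustration, not the repository's actual provisioner script: the SLEEP_DURATION_SECONDS environment variable stands in for the interpolated ${var.sleep_duration_seconds}, and the run_with_final_sleep wrapper and the message text are assumptions made for the example.

```shell
#!/bin/sh
# final_sleep: print a message, then pause so the container stays alive.
# SLEEP_DURATION_SECONDS is a hypothetical stand-in for the Terraform
# variable; it falls back to 360 seconds when unset.
final_sleep() {
  echo "Sleeping for ${SLEEP_DURATION_SECONDS:-360} seconds..."
  sleep "${SLEEP_DURATION_SECONDS:-360}"
}

# run_with_final_sleep: run the given command in a subshell whose EXIT trap
# calls final_sleep, so the pause happens whether the command succeeds or
# fails. The command's exit status is passed through to the caller.
run_with_final_sleep() {
  (
    trap final_sleep EXIT
    "$@"
    exit $?
  )
}
```

After exporting SLEEP_DURATION_SECONDS=1800, calling run_with_final_sleep with any command would run that command, print the message, and then hold the shell open for 30 minutes before returning the command's exit status.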

Configuration

Sleep Duration

You can control the sleep duration by setting a variable in your TFE workspace.

Variable Name: sleep_duration_seconds

Type: Terraform Variable

Value: The number of seconds you want the agent to sleep (e.g., 3600 for 1 hour).

Default: If not set, the default is 360 (6 minutes), as defined in variables.tf.


How to Use

  1. Connect to TFE: Connect this repository to a workspace in your TFE instance through the workspace's VCS settings.
  2. Set Sleep Time (Optional): In the TFE workspace UI, go to Variables. Add a Terraform Variable with the key sleep_duration_seconds and set the value to your desired time in seconds (e.g., 1800 for 30 minutes). If you don't set this, it will use the default (360 seconds / 6 minutes).
  3. Queue Plan: Start a new plan in the TFE workspace. Note the Run ID from the TFE UI (e.g., run-vCQbYPYy7K2sfR7Z).
  4. Run Apply: Approve the plan and run the apply.
  5. Connect to Agent: Once the TFE UI shows the run as "Applying" and the run log shows the "Sleeping for..." message, proceed to the "How to Connect to the Agent" section below.

How to Connect to the Agent (using Podman)

When the run is paused, you can find and access the specific agent container on your host machine using the Run ID from the TFE UI.

1. Find the Container ID

Open a terminal on the host machine where your TFE agents are running. Use the podman ps command to filter by the run_id label.

Replace run-vCQbYPYy7K2sfR7Z with your actual Run ID.

podman ps --filter "label=run_id=run-vCQbYPYy7K2sfR7Z"

This will give you an output like this:

CONTAINER ID   IMAGE                                  COMMAND   CREATED         STATUS         PORTS   NAMES
a4e9b1d72f1a   docker.io/hashicorp/tfc-agent:latest   ...       3 minutes ago   Up 3 minutes           inspiring_murdock

Copy the CONTAINER ID (e.g., a4e9b1d72f1a).

2. Get a Shell Inside the Container

Use the podman exec command with the container ID to get an interactive shell.

podman exec -it a4e9b1d72f1a /bin/sh

You will now have a shell prompt (/ # or $) inside the TFE agent, and you are free to debug for the remainder of your sleep duration.
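
The two steps above can be combined into a small helper. This is a sketch that relies on the run_id container label described in this section; the function name agent_shell is hypothetical and not part of this repository.

```shell
#!/bin/sh
# agent_shell RUN_ID: open an interactive shell in the agent container
# labelled with the given run ID. The function name is illustrative.
agent_shell() {
  run_id="$1"
  if [ -z "$run_id" ]; then
    echo "usage: agent_shell RUN_ID" >&2
    return 2
  fi

  # --quiet prints only container IDs; take the first match.
  container_id=$(podman ps --quiet --filter "label=run_id=${run_id}" | head -n 1)
  if [ -z "$container_id" ]; then
    echo "no running container found for ${run_id}" >&2
    return 1
  fi

  podman exec -it "$container_id" /bin/sh
}
```

For example, agent_shell run-vCQbYPYy7K2sfR7Z would look up the container for that run and drop you into /bin/sh. If the run has already finished and the container is gone, the helper prints an error and returns 1 instead of calling podman exec.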

About

used to troubleshoot remote agents
