Commit 2757ea4

📝 (posts): publish full article on why workflow is failing and update request-lifecycle metadata
- updated: content/tech-journal/post-04-why-workflow-failing/index.md (final article)
- added: all related images (auth-results.png, feature.png, kanban-board.png, project-token.png, runner-error.png, runner-warning.png, workflow-debug-journey.png)
- updated: content/tech-journal/post-05-request-lifecycle/index.md (metadata)
1 parent eb4828e commit 2757ea4

File tree

9 files changed

+72
-96
lines changed

content/tech-journal/post-04-why-workflow-failing/index.md

Lines changed: 65 additions & 94 deletions
Original file line numberDiff line numberDiff line change
@@ -2,131 +2,107 @@
22
title: "Why Your Workflow Isn’t Failing Where You Think It Is"
33
slug: "why-workflow-failing"
44
date: 2025-05-04
5-
description: "-----"
6-
summary: "------"
5+
description: "Troubleshooting a failed GitHub Actions workflow, revealing an expired PAT behind an 'unknown owner type' error."
6+
summary: "Expired PAT caused GitHub Actions workflow failure; a debugging journey from runner changes to token refresh."
77
categories: ["Automation & Devops"]
88
tags: ["github-actions", "CI/CD", "gh CLI", "debugging"]
9-
featureAlt: "----"
10-
draft: true
9+
featureAlt: "Magnifying glass inspecting a checklist with a cracked bug, CI/CD symbol, and padlock on a blue-purple gradient background."
10+
draft: false
1111
---
1212

1313
When a _GitHub Actions workflow_ that had been working fine for months suddenly failed, I went down a familiar rabbit hole of false assumptions, vague errors, and misleading logs. This post details the troubleshooting journey, the kind that initially screams ***“runner environment change,”*** but ends in a quiet whisper: ***“your token expired.”***
1414

15-
## Context: Automation That Moved 🐛 Issues
16-
For context, I had a GitHub Actions workflow using `gh` (GitHub CLI) to automatically move issues labeled `bug` into a "Bugs" column on my GitHub Project board (#6). This workflow ran fine for months—until mid-April 2025, when it silently failed.
15+
## Failing Automation That Moved 🐛 Issues
1716

18-
![Context Timeline Infographic](context-timeline.png "Context Timeline Infographic")
17+
The automated _GitHub Actions workflow_ intelligently ***managed bug-related issues*** by moving them to a dedicated "Bugs" column in the [project management board](https://github.com/users/socrabytes/projects/6/views/3).
1918

20-
- **Workflow Name:** `🐛 Auto Bug Column Management`
21-
- **Purpose:** Move issues labeled `bug` to a "Bugs" column in GitHub Projects (Project #6)
22-
- **Tools:** GitHub Actions, `gh` CLI, shell scripting, PAT-based auth
23-
- **Initial State:** Everything worked smoothly until mid-April 2025
19+
![Kanban Board, Project View](kanban-board.png "Kanban Board (Project View)")
2420

25-
---
21+
- **Workflow Name:** [🐛 Auto Bug Column Management](https://github.com/socrabytes/youtube-digest/blob/main/.github/workflows/auto-bug-column.yml)
22+
- **Purpose:** Move issues labeled `bug` to a "Bugs" column in GitHub Projects
23+
- **Tools:** GitHub Actions, `gh` CLI, shell scripting, PAT-based auth
24+
- **Trigger Event:** Issues labeled with `bug`
25+
26+
![Context Timeline Infographic](context-timeline.png "Context Timeline Infographic")
2627

2728
## Initial Suspect: A Changing Environment
2829

29-
Like many others, I use `ubuntu-latest` for my GitHub Actions runners for convenience. Around the time my workflow failed (mid-to-late April 2025), I noticed warnings appearing in my Actions logs about `ubuntu-latest` preparing to point to the new `ubuntu-24.04` LTS, updating from `ubuntu-22.04`.
30+
Like many others, I use `ubuntu-latest` for my GitHub Actions runners for convenience. The timing was suspicious. Around the failure window, `ubuntu-latest` was shifting to `ubuntu-24.04`, and warnings started appearing in my logs. That seemed like the obvious issue—new OS, new CLI versions, maybe breaking changes.
3031

31-
{{< screenshot src="ubuntu-latest-warning.png" alt="GitHub Actions warning about ubuntu-latest" >}}
32+
![GitHub Actions warning about ubuntu-latest](runner-warning.png "GitHub Actions warning about ubuntu-latest")
3233

33-
This seemed like the obvious culprit. Runner environment changes are a common source of workflow failures. I also checked the runner image software lists (like those tracked in actions/runner-images issues, e.g., #10636) and noted potential differences in pre-installed software, including the gh CLI version itself (my local gh 2.68.1 vs. runner versions potentially being 2.69.0 or 2.70.0).
34+
Runner environment changes are a common source of workflow failures, so I also checked the runner image software lists (like those tracked in actions/runner-images issues, e.g., [#10636](https://github.com/actions/runner-images/issues/10636)) and noted potential differences in pre-installed software, including the `gh` CLI version itself (my local `gh 2.68.1` vs. the runner's `gh 2.70.0`).
3435

35-
My first logical step was to eliminate this variable. I updated my workflow YAML:
36+
My first logical step was to eliminate this variable by pinning the runner to `ubuntu-22.04`.
3637

37-
```YAML
38+
```YAML {linenos=false hl_lines=[5] style="emacs"}
3839
jobs:
3940
move_bug_issues:
40-
# ...
41+
# ...
4142
# runs-on: ubuntu-latest # Changed from this
42-
runs-on: ubuntu-22.04 # To this
43-
# ...
43+
runs-on: ubuntu-22.04 # To this
44+
# ...
4445
```
45-
I reran the workflow, confident this would likely resolve the issue.
46+
The result? No change. The failure persisted.
4647

47-
## Hitting a Wall: The Cryptic Error
48-
49-
Pinning the runner to `ubuntu-22.04` didn't fix it. The workflow failed again, specifically at the step designed to find the project item ID associated with the labeled issue:
50-
51-
```YAML
52-
- name: Retrieve Project Item
53-
id: get-item
54-
run: |
55-
ITEM_ID=$(
56-
gh project item-list "6" \ # Project Number 6
57-
--owner "socrabytes" \ # My username
58-
--limit 100 \
59-
--format json \
60-
--jq ".items[] | select(.content.number == $ISSUE_NUMBER) | .id"
61-
)
62-
# ... rest of script ...
63-
env:
64-
GH_TOKEN: ${{ secrets.PROJECT_TOKEN }}
65-
OWNER: "socrabytes"
66-
ISSUE_NUMBER: ${{ github.event.issue.number }}
67-
PROJECT_NUMBER: "6"
68-
```
48+
## Hitting a 🧱: The Cryptic Error
49+
The crash point was a `gh project item-list` call meant to fetch the associated project card for the labeled issue.
6950

70-
{{< screenshot src="github-actions-error.png" alt="GitHub Actions error message" >}}
51+
![GitHub Actions error message](runner-error.png "GitHub Actions error message")
7152

72-
The error message wasn't immediately helpful regarding the runner environment: unknown owner type. Why would it suddenly not know the owner type for "socrabytes"? This didn't feel like a gh version compatibility issue on the surface.
53+
The error? `unknown owner type` with exit code 1.
7354

74-
## Isolating the Variable: Local 🆚 Remote Testing
55+
### Eliminating the Environment as a Variable
7556

76-
If it wasn't the runner OS or (maybe) the `gh` version difference, I needed to confirm the command itself was still valid.
57+
This wasn’t an obvious environment problem, and it didn’t *look* like a CLI version incompatibility. I checked anyway:
7758

78-
1. **Test Locally (Current Version):** I ran the equivalent `gh project item-list` command on my local machine, which had `gh version 2.68.1` installed via Homebrew. **Result: It worked perfectly.**
79-
2. **Test Locally (Upgraded Version):** To further rule out a breaking change in newer `gh` versions, I upgraded my local CLI (`brew upgrade gh`) to `gh version 2.71.1`. I ran the command again locally. **Result: It *still* worked perfectly.**
59+
- Local `gh 2.68.1`: ✅ Worked
60+
- Upgraded local `gh 2.71.1`: ✅ Still worked
61+
- Actions runner: ❌ Failed
8062

81-
This was a critical finding. If the command worked locally with both the older version *and* a version newer than the one on the runner, the `gh` version number itself was highly unlikely to be the direct cause. The problem had to be specific to the GitHub Actions execution **context**.
63+
The `gh` version itself was ruled out. The problem had to be specific to the GitHub Actions execution context.
8264

83-
## Digging Deeper: Checking Authentication in Actions
65+
## Refocusing: Authentication in CI/CD
8466

85-
My workflow uses a Personal Access Token (PAT) stored as a secret (`secrets.PROJECT_TOKEN`) to authenticate `gh` commands, allowing it to modify my project board. Although I knew the PAT *should* be valid (it hadn't been changed recently), the next logical step was to explicitly verify authentication *within the runner environment*.
67+
My workflow uses a **Personal Access Token (PAT)** stored as a secret (`secrets.PROJECT_TOKEN`) to authenticate `gh` commands, allowing it to modify my project board. Although I knew the PAT *should* be valid (it hadn't been changed recently), the next logical step was to explicitly ***verify authentication within the runner environment***.
8668

8769
I added a simple debug command to the failing step: `gh auth status`.
8870

89-
```YAML
71+
```YAML {linenos=false hl_lines=["4-6"] style="emacs"}
9072
- name: Retrieve Project Item
9173
id: get-item
9274
run: |
93-
echo "gh cli version: $(gh --version)" # Added for good measure
94-
echo "Debugging OWNER: $OWNER"
95-
echo "Checking auth status:" # <-- Added this line
96-
gh auth status # <-- Added this line
75+
echo "gh cli version: $(gh --version)"
76+
echo "Checking auth status:"
77+
gh auth status
9778
9879
# Original command follows...
9980
ITEM_ID=$(
100-
gh project item-list "$PROJECT_NUMBER" # ... etc
81+
gh project item-list "$PROJECT_NUMBER" # ... remaining flags unchanged
10182
)
102-
# ...
83+
# ...
10384
env:
10485
GH_TOKEN: ${{ secrets.PROJECT_TOKEN }}
10586
```
106-
## The "Aha!" Moment: The Real Culprit
107-
108-
The output from this debug step in the Actions log was crystal clear:
109-
```CLI
110-
gh cli version: gh version 2.70.0 (2025-04-15)
111-
https://github.com/cli/cli/releases/tag/v2.70.0
112-
Debugging OWNER: socrabytes
113-
Checking auth status:
114-
github.com
115-
X Failed to log in to github.com using token (GH_TOKEN)
116-
- Active account: true
117-
- The token in GH_TOKEN is invalid.
118-
Error: Process completed with exit code 1.
119-
```
12087
121-
There it was: **"The token in GH\_TOKEN is invalid."** The `unknown owner type` error was simply a downstream effect of `gh` failing to authenticate properly *before* it could even process the project and owner details.
88+
### The Real Culprit 🧨
89+
90+
{{< lead >}}
91+
The token in GH\_TOKEN is invalid.
92+
{{< /lead >}}
12293
123-
**The Resolution: A Simple Token Refresh**
94+
![`gh auth status` output](auth-results.png "gh auth status output")
12495

125-
Why was the token invalid? I checked my repository secrets – the `PROJECT_TOKEN` secret itself showed "Last updated 4 months ago".
96+
The `unknown owner type` error was simply a downstream effect of `gh` failing to authenticate properly *before* it could even process the project and owner details.
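Because the misleading error only surfaced after authentication had already failed, one option is to probe authentication before the real work runs. The following is a minimal sketch, not part of the original workflow: `preflight_auth` is a hypothetical helper name, and it assumes the probe command (for example `gh auth status`) exits non-zero when the token is invalid, which is what the Actions log in this post suggests.

```bash
# Hypothetical helper: run an auth probe first so an invalid token
# fails loudly instead of surfacing later as an unrelated error.
# Usage: preflight_auth <command...>   e.g. preflight_auth gh auth status
preflight_auth() {
  if "$@" >/dev/null 2>&1; then
    echo "auth ok"
  else
    echo "auth failed: check that the token in GH_TOKEN is valid and unexpired" >&2
    return 1
  fi
}
```

Called as `preflight_auth gh auth status || exit 1` at the top of a `run:` block, this would stop the step before `gh project item-list` ever runs.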
12697

127-
**(Optional: Insert Image `image_597009.png` here, showing the secrets list)**
98+
### Root Cause: An Expired PAT 🔐
99+
The PAT used in `PROJECT_TOKEN` had simply expired. GitHub’s UI still showed “last updated 4 months ago,” which was misleading: <mark>this reflects when the secret was last updated, not the PAT’s expiration.</mark>
128100

129-
However, the "last updated" time for the *secret storage* doesn't reflect the *PAT's expiration date*. PATs are generated with specific lifetimes (e.g., 30, 60, 90 days, or custom). It was almost certain my PAT, likely created with a 90-day expiry, had simply expired.
101+
![project-token](project-token.png "Repository Secrets: PROJECT_TOKEN")
102+
103+
PATs are generated with specific lifetimes (e.g., 30, 60, 90 days, or custom). It was almost certain my PAT, likely created with a 90-day expiry, had simply expired.
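The lifetime arithmetic can be checked rather than guessed. The sketch below is illustrative only and rests on a few assumptions: GNU `date` is available (as on Ubuntu runners), the helper names `days_between` and `token_expiry_from_headers` are hypothetical, and GitHub's REST API reports a token's expiry in a `github-authentication-token-expiration` response header for tokens that have one. The exact timestamp format of that header may differ from the example used here.

```bash
# Whole days from $1 to $2 (any format GNU date understands), computed in UTC.
days_between() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 86400 ))
}

# Read HTTP response headers on stdin and print the token expiration value.
# Prints nothing if the header is absent (e.g. a token with no expiry).
token_expiry_from_headers() {
  grep -i '^github-authentication-token-expiration:' \
    | head -n 1 \
    | cut -d: -f2- \
    | tr -d '\r' \
    | sed 's/^ *//'
}
```

For example, `days_between 2025-01-01 2025-04-01` prints `90`, so a 90-day token created on January 1 would lapse around April 1. With a still-valid token, `gh api --include /user | token_expiry_from_headers` should print the expiry timestamp, which a scheduled job could compare against today's date.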
104+
105+
## The Resolution: A Simple Token Refresh
130106

131107
The fix was straightforward:
132108

@@ -137,31 +113,26 @@ The fix was straightforward:
137113
5. Go back to the `youtube-digest` repository Settings -> Secrets and variables -> Actions.
138114
6. Update the `PROJECT_TOKEN` secret with the new token value.
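The last two steps can also be done from a terminal instead of the web UI. This is a sketch under stated assumptions, not the method used in this post: `rotate_secret` is a hypothetical wrapper, and it relies on `gh secret set` reading the secret value from standard input when no `--body` flag is given.

```bash
# Hypothetical helper: store a new token value, read from stdin,
# as a repository Actions secret.
# Usage: rotate_secret <secret-name> <owner/repo> < file-containing-token
rotate_secret() {
  local name="$1" repo="$2"
  if [ -z "$name" ] || [ -z "$repo" ]; then
    echo "usage: rotate_secret <secret-name> <owner/repo>" >&2
    return 2
  fi
  # With no --body flag, gh secret set reads the value from standard input.
  gh secret set "$name" --repo "$repo"
}
```

Something like `rotate_secret PROJECT_TOKEN socrabytes/youtube-digest < new-token.txt` would then replace the secret. Passing the token on stdin keeps it out of shell history and the process list.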
139115

140-
After updating the secret, I re-ran the workflow, and it executed perfectly.
116+
After updating the secret, I re-ran the workflow, and it functioned as designed.
141117

142-
**Lessons Learned & Takeaways**
118+
## 🧭 Lessons Learned
143119

144120
This half-day troubleshooting journey reinforced several key points:
145121

122+
146123
- **Debug Systematically:** Don't get locked onto the first hypothesis, even if initial evidence seems strong (like runner update warnings). Methodically eliminate variables.
147-
- **Leverage Local Testing:** Comparing behavior locally versus in CI/CD is crucial for pinpointing environment-specific issues.
124+
- **Test Locally + Remotely:** Validate CLI commands across both local and CI environments to isolate failure context.
148125
- **Verify Authentication Early:** When CI/CD tools interact with APIs, especially if encountering strange errors, explicitly check the authentication status (`gh auth status` in this case) early in the debugging process.
149126
- **Error Messages Can Mislead:** The `unknown owner type` error initially sent me down the wrong path. The real error was hidden until authentication was explicitly checked.
150-
- **Manage Credential Lifecycles:** PATs expire! This incident highlighted the need for proactive management. Setting calendar reminders or documenting expiration dates is crucial, even for solo projects.
127+
- **Manage Credential Lifecycles:** <mark>PATs expire!</mark> This incident highlighted the need for proactive management. Setting calendar reminders or documenting expiration dates is crucial, even for solo projects.
151128

152-
**Conclusion**
129+
## Final Thoughts 💭
153130

154-
While the root cause – an expired PAT – was operationally simple, the path to diagnosing it involved navigating misleading clues and systematically ruling out other potential causes. It was a valuable reminder that sometimes the most obvious environmental changes aren't the culprit, and checking foundational aspects like authentication is key. Hopefully, sharing this journey helps someone else who encounters a similarly confusing workflow failure!
131+
This wasn’t a code issue. It wasn’t a config mistake. It was an invisible clock on an auth token—masked by a misleading error. Sharing this isn’t just about fixing a one-off 🐛 bug. It’s about how to think like a debugger in CI/CD land, where context is everything and logs don’t always tell the truth.
155132

156-
---
157-
158-
159-
## 🐛 **The Problem**
160-
161-
{{< lead >}}
162-
When a _GitHub Actions workflow_ that had been working fine for months suddenly failed
163-
{{< /lead >}}
133+
![Workflow Debug Journey](workflow-debug-journey.png "Workflow Debug Journey")
164134

165-
{{< screenshot src="github-actions-error.png" alt="GitHub Actions error message" >}}
135+
Hopefully, this helps someone else avoid losing half a day chasing ghosts.
136+
Feel free to check out the 👇 `workflow file` if you're curious how it's wired.
166137

167-
The error message was vague: "The workflow is not authorized to run a workflow file."
138+
{{< button href="https://github.com/socrabytes/youtube-digest/blob/main/.github/workflows/auto-bug-column.yml" target="_blank" >}} {{< icon "github" >}} Workflow File {{< /button >}}

content/tech-journal/post-05-request-lifecycle/index.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,13 @@
11
---
22
title: "Request Lifecycle: From URL to AI-Generated Digest"
3-
date: 2025-05-02T09:00:00-05:00
3+
slug: "request-lifecycle"
4+
date: 2025-05-11
5+
description: ""
6+
summary: ""
7+
categories: ["AI & Machine Learning"]
8+
tags: ["FastAPI", "PostgreSQL", "Background Processing", "Asynchronous Architecture"]
9+
featureAlt: ""
410
draft: true
5-
tags: ["youtube-digest"]
611
---
712

813
## System Architecture & Flow
