Commit 4eff966

📝 (tech-journal): finalize 'Behind the Build' post and add project images
- Update content/tech-journal/post-03-behind-the-build/index.md with final draft and metadata
- Add project-view-feature.png and ytd-gh-project-view.png for visual context
1 parent aa95a22 commit 4eff966

3 files changed: +90 -12 lines changed

@@ -1,42 +1,120 @@
11
---
22
title: "Behind the Build: Foundations, Tradeoffs & What’s Ahead"
3-
date: 2025-04-25T09:00:00-05:00
4-
draft: true
5-
tags: ["youtube-digest"]
3+
slug: "Behind-the-build-foundations-tradeoffs-what-s-ahead"
4+
date: 2025-04-27T09:00:00-05:00
5+
description: "A behind-the-scenes look at the architecture, tradeoffs, and roadmap of YouTube Digest, an AI-powered summarization tool."
6+
summary: "How I built YouTube Digest to be fast, scalable, and ready for what’s next—from containerized infrastructure to async processing and cost-aware design."
7+
categories: ["Automation & Devops"]
8+
tags: ["youtube-digest", "github-projects", "architecture", "system-design"]
9+
featureAlt: "GitHub Project View showing task phases and progress for the YouTube Video Digest project on a purple background."
10+
draft: false
611
---
712

8-
## Behind the Build: Foundations, Tradeoffs & What's Ahead
13+
{{< lead >}}
14+
Built with one goal: create a lightweight, scalable system that could adapt and evolve without rewrites.
15+
{{< /lead >}}
916

10-
From the outset, YouTube Digest was designed as more than a simple demo; it serves as a solid foundation for a more comprehensive tool. The following design choices were made not just to deliver initial functionality, but to ensure the application can evolve effectively.
17+
[YouTube Digest]({{< relref "/projects/youtube-digest/index.md" >}}) isn't just functional.
18+
It’s a system designed for growth, built on async-first patterns, containerization, and modular AI integration.
19+
This article breaks down the key architectural decisions behind it.
1120

12-
### 🧱 **Foundations That Scale**
21+
---
22+
23+
## 🧱 **Foundations That Scale**
24+
25+
{{< mermaid >}}
26+
flowchart LR
27+
A[Next.js UI] --> B[FastAPI API]
28+
B --> C[PostgreSQL]
29+
B --> D[Background Workers]
30+
D --> E[yt-dlp]
31+
D --> F[OpenAI]
32+
E --> C
33+
F --> C
34+
C --> A
35+
subgraph Frontend
36+
A
37+
end
38+
subgraph Backend
39+
B
40+
D
41+
E
42+
F
43+
end
44+
subgraph DB
45+
C
46+
end
47+
{{< /mermaid >}}
1348

1449
- **Containerized from Day One:** The entire stack runs in Docker Compose—frontend (Next.js), backend (FastAPI), and database (PostgreSQL). That makes it reproducible, portable, and ready for production. A single `docker-compose up` is all it takes to spin up the full environment.
1550
- **Mounted Volumes for Fast Dev:** To move fast, I mapped local volumes to my containers. No rebuild loops—just save and refresh. This cut iteration time drastically during early development and testing.
1651

1752
* * *
1853

19-
### 🔁 **Smart Tradeoffs (Not Shortcuts)**
54+
## 🔁 **Strategic Tradeoffs, Not Shortcuts**
2055

21-
- **yt-dlp &gt; YouTube API:** I bypassed the YouTube Data API completely. Instead, I use `yt-dlp` to extract metadata and transcripts reliably—no quota limits, no credential headaches. It’s battle-tested (100k+ stars) and does exactly what I need.
22-
- **PostgreSQL as Cache + Source of Truth:** Every transcript and summary is stored, so nothing gets recomputed unnecessarily. This avoids repeated calls to the OpenAI API, which cuts latency and saves money.
56+
- **yt-dlp &gt; YouTube API:** I bypassed the YouTube Data API completely. Instead, I use `yt-dlp` to extract metadata and transcripts reliably, with no quota limits and no credential headaches. It’s battle-tested (100k+ GitHub ⭐s) and does exactly what I need.
57+
- **PostgreSQL as Cache + Source of Truth:** Every transcript and summary is stored, so nothing gets recomputed unnecessarily. Avoiding repeated OpenAI calls reduces latency and keeps costs down—critical for scaling responsibly.
2358
- **Async-First, Always:** Heavy lifting (like fetching transcripts or summarizing long videos) happens in background tasks. The frontend stays responsive. If something takes 60 seconds, it won’t block anything else.
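
The cache-first and async-first points above can be sketched together in a few lines of Python. This is a minimal illustration, not the project's code: `SummaryStore` is a hypothetical in-memory stand-in for the PostgreSQL table, and `_summarize` is a placeholder for the slow, billable model request.

```python
import asyncio


class SummaryStore:
    """Hypothetical in-memory stand-in for the PostgreSQL summaries table."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}
        self.model_calls = 0  # counts simulated model calls

    async def _summarize(self, transcript: str) -> str:
        # Placeholder for the slow, billable model request (an assumption).
        self.model_calls += 1
        await asyncio.sleep(0.01)
        return f"summary of {len(transcript.split())} words"

    async def get_or_create(self, video_id: str, transcript: str) -> str:
        # Serve the stored summary when one exists; otherwise compute and store it.
        if video_id in self._rows:
            return self._rows[video_id]
        summary = await self._summarize(transcript)
        self._rows[video_id] = summary
        return summary


async def main() -> None:
    store = SummaryStore()
    # Schedule the work as a background task so the caller is not blocked.
    task = asyncio.create_task(store.get_or_create("abc123", "a long transcript"))
    print("request accepted, still responsive")
    print(await task)
    # A repeat request is served from the store without a second model call.
    print(await store.get_or_create("abc123", "a long transcript"))
    print("model calls:", store.model_calls)


if __name__ == "__main__":
    asyncio.run(main())
```

In a real deployment the dictionary would be a database query and the background task would likely run in a worker process, but the shape is the same: look up first, compute only on a miss, and never make the caller wait on the slow path.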
2459

2560
* * *
2661

27-
### 📈 **Performance and Observability**
62+
## 📈 **Performance, Observability, and Cost Control**
2863

2964
- **Digest Polling & Status Updates:** The frontend polls the backend for video and digest processing status, so users see current progress without the UI ever blocking.
3065
- **Token Usage & Cost Tracking:** Every OpenAI request logs tokens in/out. Right now it’s just internal, but the groundwork is there for per-user quotas, cost dashboards, or even billing in the future.
3166

32-
* * *
67+
| Video ID | Tokens In | Tokens Out | Cost (USD) |
68+
|----------|----------:|-----------:|-----------:|
69+
| abc123 | 1,200| 800| 0.016|
70+
| xyz789 | 2,500| 1,600| 0.032|
71+
| **Total**| 3,700| 2,400| 0.048|
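
A rough sketch of how per-request token logs like the rows above could be rolled up into a cost figure. The per-1K-token prices here are hypothetical placeholders, so the dollar amounts will not reproduce the table; `UsageRecord` and `totals` are illustrative names, not part of the project.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical per-1K-token prices in USD; real prices depend on the model.
PRICE_IN_PER_1K = Decimal("0.01")
PRICE_OUT_PER_1K = Decimal("0.03")


@dataclass(frozen=True)
class UsageRecord:
    """One logged model request (illustrative shape)."""

    video_id: str
    tokens_in: int
    tokens_out: int

    @property
    def cost_usd(self) -> Decimal:
        # Price input and output tokens separately, then scale from per-1K.
        billed = (
            Decimal(self.tokens_in) * PRICE_IN_PER_1K
            + Decimal(self.tokens_out) * PRICE_OUT_PER_1K
        )
        return billed / 1000


def totals(records: list[UsageRecord]) -> tuple[int, int, Decimal]:
    """Sum tokens in, tokens out, and cost across all records."""
    tokens_in = sum(r.tokens_in for r in records)
    tokens_out = sum(r.tokens_out for r in records)
    cost = sum((r.cost_usd for r in records), Decimal("0"))
    return tokens_in, tokens_out, cost


if __name__ == "__main__":
    log = [UsageRecord("abc123", 1200, 800), UsageRecord("xyz789", 2500, 1600)]
    for record in log:
        print(record.video_id, record.cost_usd)
    print(totals(log))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift, which matters once the same numbers feed quotas or billing.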
72+
73+
74+
75+
## 🏗️ Built with Intent
76+
77+
Development was structured with long-term maintainability in mind—from backend schema design to the sprint process behind each feature.
78+
Task planning was managed using [GitHub Projects](https://github.com/users/socrabytes/projects/6/views/7), organized across clearly defined phases: `infrastructure`, `video processing`, and `UX`.
79+
80+
![Development Sprints in GitHub Projects](ytd-gh-project-view.png "Task management structured across phases using GitHub Projects")
3381

3482
### 🛠️ **Engineered for What's Next**
3583

84+
{{< mermaid >}}
85+
erDiagram
86+
users {
87+
int id PK
88+
string email
89+
}
90+
videos {
91+
int id PK
92+
string url
93+
}
94+
summaries {
95+
int id PK
96+
int video_id FK
97+
text content
98+
datetime created_at
99+
}
100+
users ||--o{ summaries: "creates"
101+
videos ||--o{ summaries: "has"
102+
{{< /mermaid >}}
103+
36104
- **Overbuilt Schema (On Purpose):** I designed the database with user accounts, content tracking, and digest history in mind—even though none of it’s visible in the UI yet. No rewrites later—just feature toggles when I need them.
37105
- **Library View (WIP):** Digest persistence is live. You can already retrieve previous summaries. The “library view” isn’t fully polished yet, but the backend is ready for when it is.
38106
- **Model-Agnostic Summarization:** The app started on GPT-4-turbo and now runs `o3-mini`. As new models drop, upgrades are plug-and-play. The MVP is quite literally the worst these summaries will ever be.
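
One common way to keep summarization model-agnostic is to code against a small interface and pick the backend from configuration. The sketch below assumes that approach; the `Summarizer` protocol, the two toy backends, and the `SUMMARIZERS` registry are all hypothetical and stand in for real model clients.

```python
from typing import Protocol


class Summarizer(Protocol):
    """Anything with a summarize() method can serve as a backend."""

    def summarize(self, transcript: str) -> str: ...


class FirstSentenceSummarizer:
    """Toy backend: returns the first sentence of the transcript."""

    def summarize(self, transcript: str) -> str:
        return transcript.split(".")[0].strip() + "."


class TruncatingSummarizer:
    """Toy backend: keeps only the first few words."""

    def __init__(self, max_words: int = 3) -> None:
        self.max_words = max_words

    def summarize(self, transcript: str) -> str:
        return " ".join(transcript.split()[: self.max_words])


# Hypothetical registry; in practice the key might come from an env variable.
SUMMARIZERS: dict[str, Summarizer] = {
    "first-sentence": FirstSentenceSummarizer(),
    "truncate": TruncatingSummarizer(),
}


def build_digest(transcript: str, backend: str) -> str:
    """Produce a digest using whichever backend the configuration names."""
    return SUMMARIZERS[backend].summarize(transcript)


if __name__ == "__main__":
    text = "Docker keeps the stack portable. Workers run in the background."
    print(build_digest(text, "first-sentence"))
    print(build_digest(text, "truncate"))
```

Swapping models then means registering one new class; the calling code in `build_digest` does not change.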
39107

40108
* * *
41109

42-
This isn’t just a working prototype—it’s a system designed to grow without crumbling. The early effort was intentional, and it sets the stage for rapid iteration without technical debt. Next stop: polish, UX upgrades, and user-facing features.
110+
111+
112+
## 🧠 Closing Thoughts
113+
114+
This isn’t just a working prototype—it’s a system designed to grow without crumbling. The early effort was intentional, and it sets the stage for rapid iteration without technical debt.
115+
116+
With the core foundation resilient and scalable, future work will focus on surface-level improvements: UX enhancements, personalized digest libraries, user dashboards, and streamlined model swapping as newer capabilities emerge.
117+
118+
See how the pieces fit together—or build on it yourself:
119+
120+
{{< button href="https://github.com/socrabytes/youtube-digest" target="_self" >}} {{< icon "github" >}} GitHub Repo {{< /button >}}