feat(logstream): Log streaming for argocd agent #569
Conversation
Force-pushed from 4d5e132 to 30aab14
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit 2a08301)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit d07df62)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit 161f2a4)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit 30aab14)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit f8a6666)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com> (cherry picked from commit e820c35)
Force-pushed from e820c35 to 3e1c2cd
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #569      +/-   ##
==========================================
+ Coverage   45.52%   45.54%   +0.02%
==========================================
  Files          90       92       +2
  Lines        9957    10488     +531
==========================================
+ Hits         4533     4777     +244
- Misses       4959     5226     +267
- Partials      465      485      +20
```
@Mangaal I see this error intermittently on the UI. Works fine after requesting the logs again. I guess we are not handling
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
@chetan-rns, Thanks for reviewing my PR. I’ve updated it and addressed your suggestions. Please take a look when you get a chance.
chetan-rns left a comment:
@Mangaal Sorry for the delay. Added a few questions around simplifying the agent logic. IMO, the agent should only propagate the options from the argocd server to the client-go's GetLogs(). Read the bytes from the reader until EOF, forward them back in chunks, and return any errors. We can avoid extracting timestamps to modify the sinceTime dynamically. I think we can rely on the argocd server to handle the chunks. This way the agent doesn't have to do any extra work. WDYT @jannfis
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
chetan-rns left a comment.
@coderabbitai full review
✅ Actions performed: Full review triggered.
Walkthrough

Adds end-to-end container log streaming: agents accept ContainerLogRequest events and stream pod logs (static and resumable live) to the principal via a new gRPC LogStream service; the principal exposes HTTP streaming endpoints, protobufs, a server implementation, tests, mocks, and a backoff dependency.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as HTTP Client
    participant Principal as Principal
    participant Agent as Agent
    participant K8s as Kubernetes API
    Client->>Principal: GET /logs (follow or static)
    Principal->>Principal: RegisterHTTP(requestUUID) → open writer/session
    Principal->>Agent: Emit ContainerLogRequest event
    Agent->>Agent: startLogStreamIfNew (dedupe via inflightLogs)
    alt Static (follow=false)
        Agent->>K8s: PodLogs (follow=false)
        K8s-->>Agent: log bytes + EOF
        Agent->>Principal: StreamLogs (send chunks, EOF)
        Principal->>Client: write & flush, then close
    else Live (follow=true)
        Agent->>Agent: streamLogsWithResume (background goroutine)
        loop resume/retry with backoff
            Agent->>K8s: PodLogs (follow=true, SinceTime)
            K8s-->>Agent: new log bytes
            Agent->>Principal: StreamLogs (data + timestamps)
            Principal->>Client: write & flush
        end
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)

- `go.sum` is excluded by `!**/*.sum`
- `pkg/api/grpc/logstreamapi/logstream.pb.go` is excluded by `!**/*.pb.go`
- `pkg/api/grpc/logstreamapi/logstream_grpc.pb.go` is excluded by `!**/*.pb.go`

📒 Files selected for processing (15)

- `agent/agent.go` (1 hunks)
- `agent/inbound.go` (1 hunks)
- `agent/log.go` (1 hunks)
- `agent/log_test.go` (1 hunks)
- `go.mod` (1 hunks)
- `internal/event/event.go` (5 hunks)
- `principal/apis/logstream/logstream.go` (1 hunks)
- `principal/apis/logstream/logstream.proto` (1 hunks)
- `principal/apis/logstream/logstream_test.go` (1 hunks)
- `principal/apis/logstream/mock/mock.go` (1 hunks)
- `principal/listen.go` (2 hunks)
- `principal/resource.go` (2 hunks)
- `principal/server.go` (3 hunks)
- `test/e2e/fixture/argoclient.go` (1 hunks)
- `test/e2e/logs_test.go` (1 hunks)
🧰 Additional context used
🪛 Buf (1.59.0)
principal/apis/logstream/logstream.proto
17-17: Files with package "logstreamapi" must be within a directory "logstreamapi" relative to root but were in directory "principal/apis/logstream".
(PACKAGE_DIRECTORY_MATCH)
```proto
package logstreamapi;

option go_package = "github.com/argoproj-labs/argocd-agent/pkg/api/grpc/logstreamapi";
```
Fix proto package directory mismatch
Line 17 declares package logstreamapi, but the file lives under principal/apis/logstream/…. Buf 1.59.0 enforces PACKAGE_DIRECTORY_MATCH, so buf lint (and any proto codegen driven by it) will fail until the directory matches the declared package. Please either relocate the proto to principal/apis/logstreamapi/logstream.proto (matching the existing go_package) or adjust the package name/directory structure so Buf’s rule passes. Based on static analysis hints.
🤖 Prompt for AI Agents
In principal/apis/logstream/logstream.proto around lines 17 to 19, the declared
proto package "logstreamapi" does not match the file's directory
(principal/apis/logstream/...), causing Buf's PACKAGE_DIRECTORY_MATCH lint to
fail; either move this .proto into principal/apis/logstreamapi/logstream.proto
to match the package and the go_package, or update the package name and
go_package to reflect the current directory (e.g., change package to "logstream"
and adjust option go_package accordingly), then run buf lint to verify the
mismatch is resolved.
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
…og-streaming Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
principal/apis/logstreamapi/logstream.proto (1)
17-20: Fix Buf PACKAGE_DIRECTORY_MATCH failure.
`buf lint` still reports `PACKAGE_DIRECTORY_MATCH` because the proto package `logstreamapi` must live directly under a `logstreamapi` directory relative to the module root, while the current path `principal/apis/logstreamapi` still violates that rule. Adjust either the package declaration (e.g. `package principal.apis.logstreamapi;`) or the directory/module configuration so the package path and filesystem layout align; otherwise, Buf-based lint/codegen will continue to fail. Based on static analysis hints
🧹 Nitpick comments (1)
principal/apis/logstreamapi/logstream.go (1)
289-293: Lower the per-chunk log level to avoid flooding principal logs.

Writing an `Info` entry for every chunk sent to the HTTP client will spam logs during live streams and adds measurable overhead. Please drop this log or demote it to `Trace`, keeping high-volume traffic out of the default log level.

```diff
-	logCtx.WithFields(logrus.Fields{
-		"data_length": len(data),
-		"request_id":  reqID,
-	}).Info("HTTP write and flush successful")
+	logCtx.WithFields(logrus.Fields{
+		"data_length": len(data),
+		"request_id":  reqID,
+	}).Trace("HTTP write and flush successful")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- `go.sum` is excluded by `!**/*.sum`

📒 Files selected for processing (12)

- `agent/agent.go` (2 hunks)
- `agent/inbound.go` (1 hunks)
- `agent/log.go` (1 hunks)
- `agent/log_test.go` (1 hunks)
- `go.mod` (1 hunks)
- `internal/event/event.go` (5 hunks)
- `principal/apis/logstreamapi/logstream.go` (1 hunks)
- `principal/apis/logstreamapi/logstream.proto` (1 hunks)
- `principal/apis/logstreamapi/logstream_test.go` (1 hunks)
- `principal/apis/logstreamapi/mock/mock.go` (1 hunks)
- `principal/server.go` (3 hunks)
- `test/e2e/fixture/argoclient.go` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- go.mod
- agent/agent.go
🧰 Additional context used
🧬 Code graph analysis (6)
principal/apis/logstreamapi/logstream_test.go (2)
- principal/apis/logstreamapi/logstream.go (1): NewServer (66-71)
- principal/apis/logstreamapi/mock/mock.go (4): NewMockHTTPResponseWriter (119-123), MockWriterWithoutFlusher (168-172), NewMockLogStreamServer (41-46), PanicFlusher (161-161)

principal/server.go (1)
- principal/apis/logstreamapi/logstream.go (2): Server (32-36), NewServer (66-71)

principal/apis/logstreamapi/logstream.go (2)
- principal/server.go (2): Server (71-164), NewServer (183-388)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (1): LogStreamService_StreamLogsServer (104-108)

agent/log_test.go (5)
- principal/apis/logstreamapi/mock/mock.go (2): MockLogStreamServer (30-39), NewMockLogStreamServer (41-46)
- pkg/api/grpc/logstreamapi/logstream.pb.go (6): LogStreamData (24-37, 52-52, 67-69), LogStreamResponse (100-109, 124-124, 139-141)
- agent/agent.go (1): Agent (62-116)
- test/fake/kube/kubernetes.go (1): NewKubernetesFakeClientWithResources (69-78)
- internal/event/event.go (1): ContainerLogRequest (959-973)

agent/inbound.go (1)
- internal/event/event.go (1): TargetContainerLog (85-85)

agent/log.go (4)
- agent/agent.go (1): Agent (62-116)
- internal/event/event.go (1): ContainerLogRequest (959-973)
- pkg/api/grpc/logstreamapi/logstream.pb.go (3): LogStreamData (24-37, 52-52, 67-69)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (1): LogStreamService_StreamLogsClient (46-50)
🪛 Buf (1.59.0)
principal/apis/logstreamapi/logstream.proto
17-17: Files with package "logstreamapi" must be within a directory "logstreamapi" relative to root but were in directory "principal/apis/logstreamapi".
(PACKAGE_DIRECTORY_MATCH)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Actionable comments posted: 2
🧹 Nitpick comments (2)
agent/log.go (2)
238-246: Consider propagating actual error details to the principal.

Line 244 sends a generic error message `"log stream read failed"` to the principal. Sending the actual error message (e.g., `err.Error()`) would aid debugging on the principal/UI side, similar to how line 119 and line 294 already do this for other error paths.

Apply this diff if you want more specific error messages:

```diff
 if errors.Is(err, io.EOF) {
 	_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Eof: true})
 	return nil
 }
 logCtx.WithError(err).Error("Error reading log stream")
-_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Error: "log stream read failed"})
+_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Error: err.Error()})
 return err
```
268-272: Consider extracting the hardcoded overlap duration to a named constant.

The `-100ms` overlap at line 270 ensures no log lines are lost during resume, which is important. However, it's hardcoded and lacks explanation. Extracting it to a named constant would improve readability and make the intent clear.

For example, add this constant near line 251:

```go
const (
	waitForReconnect = 10 * time.Second // how long we poll IsConnected() after Unauthenticated
	pollEvery        = 1 * time.Second
	resumeOverlap    = 100 * time.Millisecond // overlap to prevent losing log lines during resume
)
```

Then update line 270:

```diff
 resumeReq := *logReq
 if lastTimestamp != nil {
-	t := lastTimestamp.Add(-100 * time.Millisecond)
+	t := lastTimestamp.Add(-resumeOverlap)
 	resumeReq.SinceTime = t.Format(time.RFC3339)
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `agent/log.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
agent/log.go (4)
- agent/agent.go (1): Agent (62-116)
- internal/event/event.go (1): ContainerLogRequest (959-973)
- pkg/api/grpc/logstreamapi/logstream.pb.go (3): LogStreamData (24-37, 52-52, 67-69)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (2): LogStreamService_StreamLogsClient (46-50), NewLogStreamServiceClient (33-35)
🔇 Additional comments (10)
agent/log.go (10)
17-34: LGTM! The imports are well-organized and all dependencies are appropriate for the log streaming functionality.

36-58: LGTM! Clean entry point with appropriate error handling and structured logging.

60-97: LGTM! The duplicate detection and lifecycle management logic is correct. The cleanup function properly invokes `cancel()` and removes the inflight entry, addressing the critical issues from previous reviews.

148-165: LGTM! The goroutine correctly defers `cleanup()` at line 156, ensuring the inflight entry is always released when the live stream ends. The panic handler only logs the panic, relying on the deferred cleanup for resource management; this is the correct pattern.

167-176: LGTM! Simple and correct helper for creating the gRPC log stream.

178-201: LGTM! The Kubernetes log stream configuration is correct. Setting `Timestamps` to `true` is essential for the resume capability, and the `SinceTime` parsing properly supports retry logic.

282-286: Verify that the initial empty data message is necessary. The code sends an empty `LogStreamData` message at the start of each attempt for "health checks" (lines 282-286). While the comment explains this, it's worth confirming whether the principal actually requires this initial message or if it could be omitted to simplify the protocol. If the principal doesn't strictly require this empty message, consider removing it to simplify the streaming flow.

322-346: LGTM, but consider simplifying the reconnection wait loop if possible. The authentication failure handling is sophisticated and correct: it pauses exponential backoff and waits for reconnection by polling `IsConnected()`. The manual ticker loop (lines 327-344) works correctly, though it could potentially be simplified. Given the complexity is warranted for reliable auth handling, this is acceptable as-is.

402-408: Verify EOF handling is intentional for live streaming. The code does NOT send an EOF message when `rc.Read()` returns `io.EOF` (lines 402-408), instead treating it as an error. In contrast, `streamLogsToCompletion` (line 239) handles EOF as clean termination. For live streaming with `follow=true`, is it intentional to treat EOF as an error and retry?

This might be related to the intermittent EOF errors reported in the PR comments. If the Kubernetes stream returns EOF prematurely (e.g., during reconnection), treating it as an error would trigger a retry with the resume logic, which seems correct. However, sending `err.Error()` (which would be "EOF") to the principal as an error message might surface in the UI.

Consider adding explicit EOF handling to distinguish between expected EOF (if any) and unexpected errors:

```go
if err != nil {
	if errors.Is(err, io.EOF) {
		// For follow=true, EOF from k8s might indicate temporary disruption.
		// Let the resume logic handle it.
		logCtx.Info("EOF from k8s stream, will resume")
		return lastTimestamp, err
	}
	logCtx.WithError(err).Error("Error reading log stream")
	_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Error: err.Error()})
	return lastTimestamp, err
}
```

412-438: LGTM! The timestamp extraction logic is well-optimized with appropriate guards and early exits. Trying both `RFC3339Nano` and `RFC3339` formats ensures compatibility with Kubernetes timestamp variations.
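The dual-format parsing discussed above can be illustrated in isolation. This `extractTimestamp` is an assumption about the helper's general shape, not the PR's actual implementation: it omits the fast-path guards (token length checks etc.) and only shows the "split on first space, try RFC3339Nano then RFC3339" idea.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// extractTimestamp parses the leading timestamp from a Kubernetes log line.
// With Timestamps=true the API emits "<timestamp> <message>", where the
// timestamp may carry nanoseconds (RFC3339Nano) or not (RFC3339).
func extractTimestamp(line string) *time.Time {
	sp := strings.IndexByte(line, ' ')
	if sp <= 0 {
		return nil // no "<timestamp> <message>" separator
	}
	token := line[:sp]
	for _, layout := range []string{time.RFC3339Nano, time.RFC3339} {
		if ts, err := time.Parse(layout, token); err == nil {
			return &ts
		}
	}
	return nil // first token is not a recognizable timestamp
}

func main() {
	if ts := extractTimestamp("2024-01-02T03:04:05.123456789Z some log line"); ts != nil {
		fmt.Println(ts.UTC().Format(time.RFC3339Nano))
	}
}
```

In practice Go's `time.RFC3339Nano` layout also parses fraction-less timestamps, so the second layout is mostly a defensive fallback.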
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Actionable comments posted: 0
♻️ Duplicate comments (1)
agent/log.go (1)
99-146: Remove redundant `defer rc.Close()` at line 123.

The `rc.Close()` is deferred at both line 123 and inside `streamLogsToCompletion` at line 213, resulting in a double-close. While most `io.ReadCloser` implementations handle this gracefully, calling `Close()` twice is a code smell and not guaranteed safe across all implementations.

Apply this diff to remove the redundant defer:

```diff
 // Create Kubernetes log stream
 rc, err := a.createKubernetesLogStream(ctx, logReq)
 if err != nil {
 	_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Eof: true, Error: err.Error()})
 	_, _ = stream.CloseAndRecv()
 	return err
 }
-defer rc.Close()

 err = a.streamLogsToCompletion(ctx, stream, rc, logReq, logCtx)
```
🧹 Nitpick comments (2)
agent/log.go (2)
280-286: Clarify the purpose of the initial empty message. The comment "Used for health checks" may be misleading. This initial message appears to establish the stream connection rather than perform a health check. Consider clarifying the comment to accurately reflect its purpose (e.g., "Send initial message to establish stream connection").

383-392: Consider adding a comment to explain the timestamp extraction logic. The logic for extracting the timestamp from the last complete line in the buffer is correct but somewhat dense. A brief comment explaining why we search backwards for the last complete line (to enable resume from the most recent known timestamp) would improve maintainability.
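The backward scan described above can be sketched on its own. `lastCompleteLine` is a hypothetical helper mirroring the buffer logic under review, not the PR's actual code: it returns the last newline-terminated line in a read buffer, ignoring any trailing partial line.

```go
package main

import (
	"bytes"
	"fmt"
)

// lastCompleteLine finds the last newline-terminated line in b so its
// timestamp (the most recent known point) can seed a resume. Bytes after
// the final '\n' belong to a partial line and are skipped.
func lastCompleteLine(b []byte) []byte {
	end := bytes.LastIndexByte(b, '\n')
	if end < 0 {
		return nil // buffer holds no complete line yet
	}
	// Scan backwards for the newline that opens this line (or the start).
	start := bytes.LastIndexByte(b[:end], '\n') + 1
	line := b[start:end]
	// Trim a trailing '\r' so CRLF-terminated lines parse cleanly.
	if n := len(line); n > 0 && line[n-1] == '\r' {
		line = line[:n-1]
	}
	return line
}

func main() {
	fmt.Printf("%q\n", lastCompleteLine([]byte("first\nsecond\npartial")))
}
```

Scanning backwards from the end is O(buffer) in the worst case but touches only the tail in the common case, which matters at 64KB chunk sizes.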
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)

- `pkg/api/grpc/logstreamapi/logstream.pb.go` is excluded by `!**/*.pb.go`
- `pkg/api/grpc/logstreamapi/logstream_grpc.pb.go` is excluded by `!**/*.pb.go`

📒 Files selected for processing (4)

- `agent/log.go` (1 hunks)
- `hack/generate-proto.sh` (1 hunks)
- `principal/apis/logstreamapi/logstream.proto` (1 hunks)
- `principal/server.go` (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- principal/server.go
🧰 Additional context used
🧬 Code graph analysis (1)
agent/log.go (4)
- agent/agent.go (1): Agent (62-116)
- internal/event/event.go (1): ContainerLogRequest (959-973)
- pkg/api/grpc/logstreamapi/logstream.pb.go (3): LogStreamData (38-51, 66-66, 81-83)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (1): LogStreamService_StreamLogsClient (46-50)
🔇 Additional comments (7)
hack/generate-proto.sh (1)
26-26: Proto code generation target addition looks good. The new logstreamapi entry follows the established pattern and will be processed identically to existing API targets by the generation loop.

principal/apis/logstreamapi/logstream.proto (2)

17-19: Proto package-to-directory alignment appears to resolve the prior Buf lint issue. The file location `principal/apis/logstreamapi/logstream.proto` now correctly matches the declared package `principal.apis.logstreamapi` and the go_package option, satisfying Buf's `PACKAGE_DIRECTORY_MATCH` rule that was flagged in the previous review.

22-44: Proto schema structure aligns with the unidirectional client-streaming design. LogStreamData messages (with request_uuid correlation, data payload, eof marker, and error field) enable the agent to push log chunks to the principal, which the principal writes directly to HTTP response streams. The single LogStreamResponse returned after stream closure provides final status and statistics. The StreamLogs RPC's client-streaming semantics correctly implement the agent→principal push model described in the PR objectives.

agent/log.go (4)

60-97: LGTM: Cleanup and duplicate detection logic is correct. The in-flight tracking with cleanup function properly calls `cancel()` before deleting the map entry, and the separation between static (with deferred cleanup) and live streaming (with cleanup passed to goroutine) correctly handles resource lifecycle for both paths.

148-165: LGTM: Live streaming goroutine properly defers cleanup. The goroutine correctly defers `cleanup()` at the top, ensuring the in-flight entry and context cancellation are released when streaming ends, regardless of how it terminates (success, error, or panic).

250-360: LGTM: Resume logic with exponential backoff is well-designed. The error handling properly distinguishes between terminal errors (Canceled/NotFound), authentication errors requiring reconnection wait, and transient errors handled with exponential backoff. The 100ms timestamp overlap prevents log loss during resume.

411-437: LGTM: Timestamp extraction is robust and efficient. The function properly guards against edge cases (short lines, missing spaces, long tokens) and handles both RFC3339 and RFC3339Nano formats. The fast-path checks prevent unnecessary parsing attempts.
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `agent/log.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
agent/log.go (4)
- agent/agent.go (1): Agent (62-116)
- internal/event/event.go (1): ContainerLogRequest (959-973)
- pkg/api/grpc/logstreamapi/logstream.pb.go (3): LogStreamData (38-51, 66-66, 81-83)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (1): LogStreamService_StreamLogsClient (46-50)
Signed-off-by: Mangaal <angommeeteimangaal@gmail.com>
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `agent/log.go` (1 hunks)
- `agent/log_test.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
agent/log_test.go (5)
- principal/apis/logstreamapi/mock/mock.go (2): MockLogStreamServer (30-39), NewMockLogStreamServer (41-46)
- pkg/api/grpc/logstreamapi/logstream.pb.go (6): LogStreamData (38-51, 66-66, 81-83), LogStreamResponse (114-123, 138-138, 153-155)
- agent/agent.go (1): Agent (62-116)
- test/fake/kube/kubernetes.go (1): NewKubernetesFakeClientWithResources (69-78)
- internal/event/event.go (1): ContainerLogRequest (959-973)

agent/log.go (4)
- agent/agent.go (1): Agent (62-116)
- internal/event/event.go (1): ContainerLogRequest (959-973)
- pkg/api/grpc/logstreamapi/logstream.pb.go (3): LogStreamData (38-51, 66-66, 81-83)
- pkg/api/grpc/logstreamapi/logstream_grpc.pb.go (2): LogStreamService_StreamLogsClient (46-50), NewLogStreamServiceClient (33-35)
```go
const chunkMax = 64 * 1024 // 64KB chunks
var lastTimestamp *time.Time
readBuf := make([]byte, chunkMax)

for {
	select {
	case <-ctx.Done():
		return lastTimestamp, ctx.Err()
	case <-stream.Context().Done():
		return lastTimestamp, stream.Context().Err()
	default:
	}
	n, err := rc.Read(readBuf)
	if n > 0 {
		b := readBuf[:n]
		// Extract timestamp from the last complete line in the buffer to enable resume capability.
		if end := bytes.LastIndexByte(b, '\n'); end >= 0 {
			start := bytes.LastIndexByte(b[:end], '\n') + 1
			line := b[start:end]
			if len(line) > 0 && line[len(line)-1] == '\r' {
				line = line[:len(line)-1]
			}
			if ts := extractTimestamp(string(line)); ts != nil {
				lastTimestamp = ts
			}
		}
		if sendErr := stream.Send(&logstreamapi.LogStreamData{
			RequestUuid: logReq.UUID,
			Data:        readBuf[:n],
		}); sendErr != nil {
			// For client side streaming, the actual gRPC error may only surface
			// after stream closure. Attempt to close and return the final error.
			if _, closedErr := stream.CloseAndRecv(); closedErr != nil {
				return lastTimestamp, closedErr
			}
			return lastTimestamp, sendErr
		}
	}
	if err != nil {
		if errors.Is(err, io.EOF) {
			_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Eof: true})
			return lastTimestamp, nil
		}
		_ = stream.Send(&logstreamapi.LogStreamData{RequestUuid: logReq.UUID, Error: err.Error()})
		return lastTimestamp, err
	}
}
```
Ensure the Kubernetes log ReadCloser is closed
streamLogs never closes the rc it reads from, so every successful or failed pass leaks the Kubernetes HTTP response. Over time the agent will exhaust the underlying sockets/descriptors and degrade streaming/retry behaviour. Please close the reader as soon as the function starts so all exit paths release the resource.
```diff
 func (a *Agent) streamLogs(ctx context.Context, stream logstreamapi.LogStreamService_StreamLogsClient, rc io.ReadCloser, logReq *event.ContainerLogRequest, logCtx *logrus.Entry) (*time.Time, error) {
 	const chunkMax = 64 * 1024 // 64KB chunks
+	defer rc.Close()
 	var lastTimestamp *time.Time
 	readBuf := make([]byte, chunkMax)
```

🤖 Prompt for AI Agents
In agent/log.go around lines 348 to 395, the Kubernetes log ReadCloser (rc) is
never closed causing a file descriptor/socket leak; immediately close it on
function entry by adding a nil-checked defer rc.Close() (e.g., if rc != nil {
defer func(){ _ = rc.Close() }() }) so every exit path (normal, EOF, errors,
context cancel, or after CloseAndRecv) releases the underlying HTTP response; do
not remove existing stream.CloseAndRecv logic—just ensure rc is always closed.
This PR introduces a unidirectional (agent → principal) log streaming service and wires it into the resource-proxy path so the Principal can serve Kubernetes pod logs to the Argo CD UI. The Agent handles both static logs (follow=false) and live streaming (follow=true) with resume support.
What’s included:
Key feature:
Assisted-by: Cursor/Gemini etc
logs.mov