
enable wsh file cross-remote copy/move #1725

Merged · 155 commits · Jan 31, 2025

Conversation

@esimkowitz (Member) commented Jan 13, 2025

This adds the ability to stream tar archives over channels between wsh instances. The main use cases for this are remote copy and move operations.

It also completes the wavefs implementation of the FileShare interface to allow copy/move interoperability between wavefiles and other storage types.

The tar streaming functionality has been broken out into the new tarcopy package for easy reuse.

New fileshare functions are added: CopyInternal, which copies files within a single filesystem and bypasses the expensive interop layer, and MoveInternal, which does the same for moves within a filesystem. Copying between remotes is now handled by CopyRemote, which accepts the source FileShareClient as a parameter. wsh connections use the same implementation for CopyInternal and CopyRemote because they need to request the channel on the remote destination; we don't offer a way to pass channels as a parameter to a remote call.

This also adds a recursive -r flag to wsh file rm to allow for deleting a directory and all its contents.

S3 support will be addressed in a future PR.

coderabbitai bot (Contributor) commented Jan 13, 2025

Warning

Rate limit exceeded

@esimkowitz has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 24 minutes and 36 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 0b9bdd5 and 4749943.

📒 Files selected for processing (2)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (7 hunks)

Walkthrough

The pull request introduces several significant changes across multiple packages in the Wave Terminal project. The modifications primarily focus on enhancing file operation capabilities, particularly for file deletion and streaming. A new iochantypes package is introduced to support a more structured approach to data handling, with a Packet type that includes data and a checksum.

Key changes include adding a recursive deletion option for file removal commands, updating method signatures to use a new CommandDeleteFileData type, and modifying streaming methods to return iochantypes.Packet instead of raw byte slices. The ListBuckets function's implementation is simplified by removing pagination, and AWS-related functionality is removed from the file share package. Additionally, several methods across various components have been renamed or modified to enhance functionality and error handling. These changes span frontend and backend components, affecting file share clients, RPC interfaces, and web handling mechanisms, thereby providing more flexible and robust file management capabilities across different file system types and remote connections.
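As a rough illustration of the Packet idea described above — the field names and hash choice are assumptions, not the actual iochantypes definition:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Packet pairs a data chunk with a checksum so the receiving side of a
// streamed channel can detect corruption in transit.
type Packet struct {
	Data     []byte
	Checksum [sha256.Size]byte
}

// NewPacket computes the checksum at send time.
func NewPacket(data []byte) Packet {
	return Packet{Data: data, Checksum: sha256.Sum256(data)}
}

// Verify recomputes the checksum on the receiving side.
func (p Packet) Verify() bool {
	return p.Checksum == sha256.Sum256(p.Data)
}

func main() {
	p := NewPacket([]byte("tar chunk"))
	fmt.Println(p.Verify()) // true

	p.Data = []byte("tampered")
	fmt.Println(p.Verify()) // false
}
```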



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

🧹 Nitpick comments (5)
pkg/util/tarcopy/tarcopy.go (2)

71-108: Path bounding in TarCopyDest function.

While TarCopyDest focuses on reading tar entries, it processes each header without verifying whether the extracted file paths remain within the intended directory structure. In a multi-tenant or security-sensitive environment, malicious tar headers could overwrite unintended files.

Consider adding a validation step to confirm that the final paths of extracted files remain within the intended target directory.
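One way to implement such a check — illustrative only; `securePath` is not part of the tarcopy package:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// securePath joins a tar header name onto destDir and verifies the cleaned
// result stays inside destDir, guarding against entries like "../../etc/passwd".
func securePath(destDir, headerName string) (string, error) {
	dest := filepath.Join(destDir, headerName) // Join also cleans the path
	// After cleaning, the destination must be destDir itself or a child of it.
	if dest != destDir && !strings.HasPrefix(dest, destDir+string(filepath.Separator)) {
		return "", fmt.Errorf("tar entry %q escapes destination directory", headerName)
	}
	return dest, nil
}

func main() {
	if p, err := securePath("/tmp/extract", "docs/readme.md"); err == nil {
		fmt.Println(p)
	}
	if _, err := securePath("/tmp/extract", "../../etc/passwd"); err != nil {
		fmt.Println("rejected traversal entry")
	}
}
```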


110-124: Capture final error on closure failure.

gracefulClose logs each close attempt’s error, but if all attempts fail, the final message doesn’t include the last error details. In complex workflows, losing this error context can complicate debugging.

Consider tracking and logging the last encountered error in the final log.Printf statement to provide more comprehensive diagnostic information.
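A sketch of the suggested change — names and retry constants here are illustrative, not the tarcopy package's actual API:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"time"
)

const (
	maxRetries = 3
	retryDelay = 10 * time.Millisecond
)

// gracefulCloseWithErr retries closer.Close and, unlike a version that only
// logs per-attempt errors, surfaces the last error when every attempt fails.
func gracefulCloseWithErr(closer io.Closer, name string) error {
	var lastErr error
	for i := 0; i < maxRetries; i++ {
		if lastErr = closer.Close(); lastErr == nil {
			return nil
		}
		log.Printf("error closing %s (attempt %d/%d): %v", name, i+1, maxRetries, lastErr)
		time.Sleep(retryDelay)
	}
	// Wrap the final error so diagnostic context is not lost.
	return fmt.Errorf("unable to close %s after %d attempts: %w", name, maxRetries, lastErr)
}

// failingCloser simulates a resource whose Close always fails.
type failingCloser struct{}

func (failingCloser) Close() error { return errors.New("resource busy") }

func main() {
	if err := gracefulCloseWithErr(failingCloser{}, "tar writer"); err != nil {
		fmt.Println(err)
	}
}
```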

pkg/remote/fileshare/wavefs/wavefs.go (3)

105-124: Short deadline may cause unexpected cancellation.

A default timeout of 100ms (line 115) might be too short for large file listings or slow networks. If large directories are encountered, this can result in partial tar data.

Consider increasing the default timeout or making it clearly configurable to avoid premature cancellation.


178-185: Enhanced logging for ListEntries.

Line 178 logs connection details, which can be beneficial for debugging. Ensure sensitive or private information is not leaked in logs.


388-399: Partial failures in MoveInternal.

If CopyInternal succeeds but Delete fails, the operation leaves a partially moved file. Depending on your needs, consider an all-or-nothing approach or an explicit rollback.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f149473 and 9dc8ebf.

📒 Files selected for processing (2)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pkg/util/tarcopy/tarcopy.go

56-56: S1017: should replace this if statement with an unconditional strings.TrimPrefix

(gosimple)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
pkg/util/tarcopy/tarcopy.go (1)

1-33: Proper package documentation & import usage.

The overall package documentation is clear, and imports are well-structured. No immediate issues are spotted within these lines.

pkg/remote/fileshare/wavefs/wavefs.go (5)

7-24: Imports adjusted for tar-based copying.

The newly added import statements (e.g., "archive/tar", "github.com/wavetermdev/waveterm/pkg/util/tarcopy") appear appropriate for the cross-remote copy/move feature described in the PR objectives. No immediate concerns.


127-157: Directory handling in ReadTarStream.

When encountering directories (line 139), the loop simply continues without recursion. This is likely intentional but consider clarifying with a comment if directory recursion is out of scope.


385-385: Unsupported Mkdir.

Line 385 returns errors.ErrUnsupported, ensuring clarity that wavefs does not support directories. This is consistent with the overall wavefile approach.


403-439: CopyInternal only supports same-host transfers.

Line 437 raises an error if hosts differ. This is consistent with wavefs’s design, but confirm the higher-level caller gracefully handles this limitation.


569-577: getPathPrefix edge cases.

getPathPrefix attempts to preserve everything up to the last slash. Verify that this logic aligns with how wavefs is intended to locate files, especially for edge cases where the path may be just '/' or contain trailing slashes.
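To make the edge cases concrete, the described logic can be reimplemented on a plain URI string — the real function takes a `*connparse.Connection`, so this is only an approximation:

```go
package main

import (
	"fmt"
	"strings"
)

// pathPrefix mirrors the described getPathPrefix behavior: keep everything
// up to and including the last slash, but leave the URI unchanged when the
// last slash is part of "scheme://host" (index <= 10) or is a trailing slash.
func pathPrefix(fullURI string) string {
	prefix := fullURI
	lastSlash := strings.LastIndex(fullURI, "/")
	if lastSlash > 10 && lastSlash < len(fullURI)-1 {
		prefix = fullURI[:lastSlash+1]
	}
	return prefix
}

func main() {
	fmt.Println(pathPrefix("wavefile://client/dir/file.txt")) // keeps "wavefile://client/dir/"
	fmt.Println(pathPrefix("wavefile://client/"))             // trailing slash: unchanged
}
```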

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (14)
pkg/util/fileutil/fileutil.go (2)

168-198: Add documentation for the FsFileInfo type and its methods.

The implementation correctly satisfies the fs.FileInfo interface, but would benefit from documentation explaining its purpose and usage.

Add documentation comments:

+// FsFileInfo implements fs.FileInfo interface for remote file information
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }

+// Name returns the base name of the file
 func (f FsFileInfo) Name() string {
     return f.NameInternal
 }

202-210: Add nil check in ToFsFileInfo.

The conversion logic is correct, but should handle nil input gracefully.

Add nil check:

 func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
+    if fi == nil {
+        return FsFileInfo{}
+    }
     return FsFileInfo{
         NameInternal:    fi.Name,
         ModeInternal:    fi.Mode,
         SizeInternal:    fi.Size,
         ModTimeInternal: fi.ModTime,
         IsDirInternal:   fi.IsDir,
     }
 }
pkg/util/tarcopy/tarcopy.go (3)

52-55: Enhance path validation for better security.

While the current implementation checks for directory traversal after cleaning the path, consider validating the path before cleaning to prevent any potential normalization bypasses.

-header.Name = filepath.Clean(strings.TrimPrefix(file, pathPrefix))
-if err := validatePath(header.Name); err != nil {
+trimmedPath := strings.TrimPrefix(file, pathPrefix)
+if err := validatePath(trimmedPath); err != nil {
+  return err
+}
+header.Name = filepath.Clean(trimmedPath)
+if err := validatePath(header.Name); err != nil {
   return err
 }

68-76: Strengthen path validation for cross-platform security.

Consider adding checks for:

  1. Windows-style paths (e.g., C:\)
  2. Special characters that might be interpreted differently across platforms
  3. Null bytes that could lead to string truncation
 func validatePath(path string) error {
+    if strings.ContainsRune(path, '\x00') {
+        return fmt.Errorf("invalid path containing null byte: %s", path)
+    }
+    if strings.ContainsAny(path, "\\:*?\"<>|") {
+        return fmt.Errorf("invalid path containing special characters: %s", path)
+    }
     if strings.Contains(path, "..") {
         return fmt.Errorf("invalid path containing directory traversal: %s", path)
     }
     if strings.HasPrefix(path, "/") {
         return fmt.Errorf("invalid path starting with /: %s", path)
     }
+    if len(path) >= 3 && path[1] == ':' && path[2] == '\\' {
+        return fmt.Errorf("invalid Windows-style absolute path: %s", path)
+    }
     return nil
 }

94-116: Optimize the loop structure for better readability.

The current loop structure with a select statement can be simplified since the context check is only needed before reading the next tar entry.

-    for {
-        select {
-        case <-ctx.Done():
-            return nil
-        default:
-            next, err := tarReader.Next()
-            if err != nil {
-                // Do one more check for context error before returning
-                if ctx.Err() != nil {
-                    return nil
-                }
-                if errors.Is(err, io.EOF) {
-                    return nil
-                } else {
-                    return fmt.Errorf("cannot read tar stream: %w", err)
-                }
-            }
-            err = readNext(next, tarReader)
-            if err != nil {
-                return err
-            }
-        }
-    }
+    for ctx.Err() == nil {
+        next, err := tarReader.Next()
+        if err != nil {
+            if errors.Is(err, io.EOF) {
+                return nil
+            }
+            return fmt.Errorf("cannot read tar stream: %w", err)
+        }
+        if err := readNext(next, tarReader); err != nil {
+            return err
+        }
+    }
+    return nil
pkg/remote/fileshare/wavefs/wavefs.go (5)

105-158: Consider increasing the default timeout value.

The default timeout of 100ms might be too short for large files or slow network conditions. Consider increasing it to a more reasonable default, like 5 seconds (5000ms).

-	timeout := time.Millisecond * 100
+	timeout := time.Second * 5

388-399: Consider implementing atomic move operation.

The current implementation could leave the system in an inconsistent state if the delete operation fails after a successful copy. Consider implementing a transaction-like approach or adding cleanup on failure.


403-439: Consider using buffered reads for large files.

For large files, consider using a buffer pool or streaming approach to reduce memory usage during the copy operation.

-		_, dataBuf, err := filestore.WFS.ReadFile(ctx, host, srcFileName)
+		const bufferSize = 32 * 1024 // 32KB buffer
+		buffer := make([]byte, bufferSize)
+		reader, _, err := filestore.WFS.ReadStream(ctx, host, srcFileName)

498-533: Improve error aggregation in recursive deletion.

Consider using errors.Join (Go 1.20+) for better error aggregation and formatting:

-			return fmt.Errorf("error deleting blockfiles: %v", errs)
+			return fmt.Errorf("error deleting blockfiles: %w", errors.Join(errs...))

574-582: Replace magic number with named constant.

The magic number 10 in the path length check should be replaced with a named constant for better maintainability and clarity.

+const minPathLength = 10 // minimum length for scheme://host/
 func getPathPrefix(conn *connparse.Connection) string {
 	fullUri := conn.GetFullURI()
 	pathPrefix := fullUri
 	lastSlash := strings.LastIndex(fullUri, "/")
-	if lastSlash > 10 && lastSlash < len(fullUri)-1 {
+	if lastSlash > minPathLength && lastSlash < len(fullUri)-1 {
 		pathPrefix = fullUri[:lastSlash+1]
 	}
 	return pathPrefix
 }
pkg/wshrpc/wshremote/wshremote.go (4)

262-265: Consider extracting the default timeout value into a constant.

The default timeout of 100ms is hardcoded. Consider extracting this into a named constant for better maintainability.

+const defaultTarStreamTimeout = 100 * time.Millisecond

 func (impl *ServerImpl) RemoteTarStreamCommand(ctx context.Context, data wshrpc.CommandRemoteStreamTarData) <-chan wshrpc.RespOrErrorUnion[iochantypes.Packet] {
-    timeout := time.Millisecond * 100
+    timeout := defaultTarStreamTimeout
     if opts.Timeout > 0 {
         timeout = time.Duration(opts.Timeout) * time.Millisecond
     }

290-293: Consider using io.CopyBuffer for potential performance improvement.

When copying file contents, using io.CopyBuffer with a pre-allocated buffer might improve performance for large files.

+    buffer := make([]byte, 32*1024)
-    if _, err := io.Copy(fileWriter, data); err != nil {
+    if _, err := io.CopyBuffer(fileWriter, data, buffer); err != nil {
         log.Printf("error copying file %q: %v\n", file, err)
         return err
     }

Line range hint 356-438: Consider breaking down the large copy operation into smaller functions.

The file copy implementation is quite long and handles multiple concerns. Consider extracting the tar copy logic into separate functions for better maintainability.

+func (impl *ServerImpl) handleTarCopy(ctx context.Context, destPathCleaned string, srcUri string, destUri string, opts *wshrpc.FileCopyOpts) error {
+    timeout := time.Millisecond * 100
+    if opts.Timeout > 0 {
+        timeout = time.Duration(opts.Timeout) * time.Millisecond
+    }
+    readCtx, cancel := context.WithCancelCause(ctx)
+    readCtx, timeoutCancel := context.WithTimeoutCause(readCtx, timeout, fmt.Errorf("timeout copying file %q to %q", srcUri, destUri))
+    defer timeoutCancel()
+    
+    copyStart := time.Now()
+    ioch := wshclient.FileStreamTarCommand(wshfs.RpcClient, wshrpc.CommandRemoteStreamTarData{Path: srcUri, Opts: opts}, &wshrpc.RpcOpts{Timeout: opts.Timeout})
+    
+    stats := &copyStats{}
+    err := tarcopy.TarCopyDest(readCtx, cancel, ioch, impl.processTarEntry(destPathCleaned, stats, opts))
+    if err != nil {
+        return fmt.Errorf("cannot copy %q to %q: %w", srcUri, destUri, err)
+    }
+    
+    log.Printf("RemoteFileCopyCommand: done; %d files copied in %dms, total of %d bytes, %d files skipped\n",
+        stats.numFiles, time.Since(copyStart).Milliseconds(), stats.totalBytes, stats.numSkipped)
+    return nil
+}

755-775: Consider improving error handling for better clarity.

The error handling could be more structured and provide clearer messages.

 func (*ServerImpl) RemoteFileDeleteCommand(ctx context.Context, data wshrpc.CommandDeleteFileData) error {
     expandedPath, err := wavebase.ExpandHomeDir(data.Path)
     if err != nil {
-        return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
+        return fmt.Errorf("cannot expand path %q: %w", data.Path, err)
     }
     cleanedPath := filepath.Clean(expandedPath)
 
-    err = os.Remove(cleanedPath)
+    finfo, err := os.Stat(cleanedPath)
     if err != nil {
-        finfo, _ := os.Stat(cleanedPath)
-        if finfo != nil && finfo.IsDir() {
-            if !data.Recursive {
-                return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
-            }
-            err = os.RemoveAll(cleanedPath)
-            if err != nil {
-                return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
-            }
-        } else {
-            return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
+        if os.IsNotExist(err) {
+            return fmt.Errorf("path %q does not exist", data.Path)
         }
+        return fmt.Errorf("cannot access path %q: %w", data.Path, err)
+    }
+
+    if finfo.IsDir() {
+        if !data.Recursive {
+            return fmt.Errorf("cannot delete directory %q: recursive option not specified", data.Path)
+        }
+        if err := os.RemoveAll(cleanedPath); err != nil {
+            return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
+        }
+        return nil
+    }
+
+    if err := os.Remove(cleanedPath); err != nil {
+        return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
     }
     return nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc8ebf and 7121868.

📒 Files selected for processing (4)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (8 hunks)
🧰 Additional context used
📓 Learnings (1)
pkg/remote/fileshare/wavefs/wavefs.go (1)
Learnt from: esimkowitz
PR: wavetermdev/waveterm#1725
File: pkg/remote/fileshare/wavefs/wavefs.go:441-494
Timestamp: 2025-01-29T04:21:11.649Z
Learning: The `CopyRemote` function in WaveTerm's file operations has proper error handling that prevents partial writes by failing fast and using context cancellation. Each step (path cleaning, file operations, tar reading/writing) is guarded by error checks that prevent proceeding with writes on error.
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (12)
pkg/util/fileutil/fileutil.go (3)

15-15: LGTM! Required imports added for new functionality.

The new imports are necessary for timestamp handling and RPC type conversion.

Also applies to: 18-18


22-22: LGTM! Improved error handling in FixPath.

The error variable is now properly scoped, and the path normalization logic remains correct.

Also applies to: 26-26


200-200: LGTM! Good practice using type assertion.

The type assertion ensures the struct properly implements the fs.FileInfo interface at compile time.

pkg/util/tarcopy/tarcopy.go (2)

1-32: LGTM! Well-structured package setup.

The package declaration, imports, and constants are well-organized. The retry parameters and debug names are clearly defined and follow Go conventions.


119-134: LGTM! Well-implemented resource cleanup.

The gracefulClose function provides robust error handling with:

  • Proper retry logic with configurable parameters
  • Informative logging
  • Clear success/failure indication
pkg/remote/fileshare/wavefs/wavefs.go (4)

7-24: LGTM! Well-organized imports.

The new imports are properly organized and necessary for implementing the tar streaming and file operations functionality.


178-185: LGTM! Improved logging and nil safety.

The added logging and nil check for opts enhance debugging capabilities and prevent potential nil pointer dereferences.


385-385: LGTM! Better error handling.

Returning errors.ErrUnsupported is more explicit and helps clients handle the unsupported operation correctly.


441-494: LGTM! Robust implementation with proper error handling.

The implementation includes proper error handling, context cancellation, and prevents partial writes by failing fast. The tar streaming approach is well-suited for remote copy operations.

pkg/wshrpc/wshremote/wshremote.go (3)

21-28: LGTM! Import changes align with new functionality.

The new imports support the tar streaming functionality and enhanced file operations.


670-670: LGTM! Clear error message for cross-host move operations.

The error message clearly indicates that moving files between different hosts is not supported.


274-277: 🛠️ Refactor suggestion

Improve error handling in filepath.Walk.

The error handling for the Walk function could be more robust by checking the error before proceeding.

 err := filepath.Walk(path, func(file string, fi os.FileInfo, err error) error {
     if readerCtx.Err() != nil {
         return readerCtx.Err()
     }
+    if err != nil {
+        return fmt.Errorf("error accessing %q: %w", file, err)
+    }
-    if err != nil {
-        return err
-    }

Likely invalid or redundant comment.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
pkg/wshrpc/wshremote/wshremote.go (3)

297-304: Improve error handling in the goroutine.

The error handling could be improved by checking for context cancellation before sending the error.

Apply this diff to enhance error handling:

-       err = nil
-       if finfo.IsDir() {
-           err = filepath.Walk(path, walkFunc)
-       } else {
-           err = walkFunc(path, finfo, nil)
-       }
-       if err != nil {
-           rtn <- wshutil.RespErr[iochantypes.Packet](err)
-       }
+       err = nil
+       if finfo.IsDir() {
+           err = filepath.Walk(path, walkFunc)
+       } else {
+           err = walkFunc(path, finfo, nil)
+       }
+       if err != nil && readerCtx.Err() == nil {
+           rtn <- wshutil.RespErr[iochantypes.Packet](err)
+       }

428-436: Add deferred close for file operations.

File operations should use deferred close to ensure proper cleanup.

Apply this diff:

     file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
     if err != nil {
         return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
     }
+    defer file.Close()
     _, err = io.Copy(file, reader)
     if err != nil {
         return fmt.Errorf("cannot write file %q: %w", nextPath, err)
     }
-    file.Close()

777-779: Enhance error message for recursive deletion.

The error message could be more descriptive to guide users on how to enable recursive deletion.

Apply this diff:

-               return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
+               return fmt.Errorf("cannot delete directory %q: directory not empty, use -r flag for recursive deletion", data.Path)
pkg/util/tarcopy/tarcopy.go (5)

4-5: Enhance package documentation.

Consider expanding the package documentation to include:

  • Security considerations when handling tar streams
  • Example usage patterns
  • Relationship with other packages like iochan and wshrpc

38-38: Consider adding chunkSize parameter for flexibility.

The function uses a hardcoded wshrpc.FileChunkSize but could benefit from a configurable chunk size parameter for different use cases.


52-55: Strengthen path validation.

While validatePath checks for basic path traversal, consider additional checks:

  • Normalize path separators
  • Check for null bytes
  • Validate against a maximum path length
-			header.Name = filepath.Clean(strings.TrimPrefix(file, pathPrefix))
-			if err := validatePath(header.Name); err != nil {
+			normalizedPath := filepath.ToSlash(filepath.Clean(strings.TrimPrefix(file, pathPrefix)))
+			if err := validatePath(normalizedPath); err != nil {
+				return err
+			}
+			if len(normalizedPath) > 256 {
+				return fmt.Errorf("path too long: %s", normalizedPath)
+			}
+			if strings.Contains(normalizedPath, "\x00") {
+				return fmt.Errorf("path contains null byte: %s", normalizedPath)
+			}
+			header.Name = normalizedPath

68-76: Enhance path validation with additional checks.

Consider adding checks for:

  • Empty paths
  • Windows-style paths (e.g., C:\)
  • Unicode normalization
  • Maximum path segments
 func validatePath(path string) error {
+	if path == "" {
+		return fmt.Errorf("empty path")
+	}
 	if strings.Contains(path, "..") {
 		return fmt.Errorf("invalid tar path containing directory traversal: %s", path)
 	}
 	if strings.HasPrefix(path, "/") {
 		return fmt.Errorf("invalid tar path starting with /: %s", path)
 	}
+	if strings.Contains(path, ":") {
+		return fmt.Errorf("invalid tar path containing drive letter: %s", path)
+	}
+	if len(strings.Split(path, "/")) > 100 {
+		return fmt.Errorf("too many path segments: %s", path)
+	}
 	return nil
 }

112-114: Improve error message for tar stream reading failures.

Add more context to the error return.

-					return err
+					return fmt.Errorf("failed to read tar stream: %w", err)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7121868 and eadc3dd.

📒 Files selected for processing (3)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (7 hunks)
  • pkg/wshrpc/wshrpctypes.go (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/wshrpc/wshrpctypes.go
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (5)
pkg/wshrpc/wshremote/wshremote.go (3)

21-28: LGTM! Import changes are appropriate.

The new imports support the tar streaming functionality and enhanced file operations.


262-267: LGTM! Timeout handling is well-implemented.

The timeout configuration with fallback to default value and proper context management is correct.


274-282: ⚠️ Potential issue

Fix error variable shadowing in walkFunc.

The error variable is shadowed in the walkFunc, which could mask errors from filepath.Walk.

Apply this diff to fix the error handling:

-           if err = writeHeader(info, path); err != nil {
+           if writeErr := writeHeader(info, path); writeErr != nil {
-               return err
+               return writeErr
            }

Likely invalid or redundant comment.

pkg/util/tarcopy/tarcopy.go (2)

24-32: LGTM! Well-organized constants.

The constants are well-named and their values are appropriate for their purposes.


124-139: LGTM! Robust resource cleanup implementation.

The gracefulClose function has:

  • Well-implemented retry logic
  • Good error logging
  • Appropriate use of constants

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (11)
pkg/remote/connparse/connparse_test.go (2)

193-215: Add t.Parallel() for consistent test parallelization.

The test follows the established pattern and correctly validates Unix-style relative paths. However, it's missing the t.Parallel() call that's present in other test functions.

 func TestParseURI_WSHCurrentPath(t *testing.T) {
+    t.Parallel()
     cstr := "./Documents/path/to/file"

217-239: Add t.Parallel() for consistent test parallelization.

The test correctly handles Windows-style path separators which is crucial for cross-platform compatibility. However, it's missing the t.Parallel() call that's present in other test functions.

 func TestParseURI_WSHCurrentPathWindows(t *testing.T) {
+    t.Parallel()
     cstr := ".\\Documents\\path\\to\\file"
pkg/remote/fileshare/wavefs/wavefs.go (3)

109-162: LGTM! Well-implemented tar streaming with proper error handling.

The implementation includes:

  • Proper context management with timeouts
  • Detailed logging
  • Error handling at each step

Consider making file mode configurable.

The file mode is currently hardcoded to 0644 (line 136). Consider making this configurable through the options parameter.

-			file.Mode = 0644
+			if opts.FileMode != 0 {
+				file.Mode = opts.FileMode
+			} else {
+				file.Mode = 0644
+			}

502-537: Document partial failure behavior in recursive deletion.

The implementation collects errors during recursive deletion but continues with remaining files. Consider documenting this behavior in the function comment to make it clear that:

  1. The operation is not atomic
  2. Some files may be deleted even if the operation returns an error
+// Delete removes a file or directory. When recursive is true, it attempts to delete
+// all entries under the given path. The operation is not atomic - if an error occurs
+// during recursive deletion, some files may have been deleted while others remain.
 func (c WaveClient) Delete(ctx context.Context, conn *connparse.Connection, recursive bool) error {

578-586: Add validation in getPathPrefix.

Consider adding validation for the connection parameter and documenting the function's behavior.

+// getPathPrefix returns the directory prefix for the given connection.
+// For a file path, it returns the directory containing the file.
+// For a directory path, it returns the directory path with a trailing slash.
 func getPathPrefix(conn *connparse.Connection) string {
+	if conn == nil {
+		return ""
+	}
 	fullUri := conn.GetFullURI()
 	pathPrefix := fullUri
 	lastSlash := strings.LastIndex(fullUri, "/")
pkg/util/tarcopy/tarcopy.go (3)

38-38: Consider adding error return for potential pipe creation failures.

The function signature could be improved by returning an error to handle potential failures in pipe creation or tar writer initialization.

-func TarCopySrc(ctx context.Context, pathPrefix string) (outputChan chan wshrpc.RespOrErrorUnion[iochantypes.Packet], writeHeader func(fi fs.FileInfo, file string) error, writer io.Writer, close func()) {
+func TarCopySrc(ctx context.Context, pathPrefix string) (outputChan chan wshrpc.RespOrErrorUnion[iochantypes.Packet], writeHeader func(fi fs.FileInfo, file string) error, writer io.Writer, close func(), err error) {

68-76: Consider additional path validations.

While the current validation is good, consider adding checks for:

  1. Empty paths
  2. Windows-style absolute paths (e.g., C:\)
 func validatePath(path string) error {
+    if path == "" {
+        return fmt.Errorf("empty tar path")
+    }
     if strings.Contains(path, "..") {
         return fmt.Errorf("invalid tar path containing directory traversal: %s", path)
     }
     if strings.HasPrefix(path, "/") {
         return fmt.Errorf("invalid tar path starting with /: %s", path)
     }
+    if filepath.VolumeName(path) != "" {
+        return fmt.Errorf("invalid tar path with volume name: %s", path)
+    }
     return nil
 }

103-113: Improve error specificity in tar reading.

The error handling could be more specific about the type of error encountered during tar reading.

             if err != nil {
                 // Do one more check for context error before returning
                 if ctx.Err() != nil {
                     return context.Cause(ctx)
                 }
                 if errors.Is(err, io.EOF) {
                     return nil
                 } else {
-                    return err
+                    return fmt.Errorf("failed to read tar entry: %w", err)
                 }
             }
pkg/wshrpc/wshremote/wshremote.go (1)

777-791: Optimize error handling in delete operation.

The current implementation makes an unnecessary os.Stat call after os.Remove fails. We can improve this by checking if the error indicates that the path is a directory.

 err = os.Remove(cleanedPath)
 if err != nil {
-  finfo, _ := os.Stat(cleanedPath)
-  if finfo != nil && finfo.IsDir() {
+  if errors.Is(err, syscall.EISDIR) {
     if !data.Recursive {
       return fmt.Errorf("cannot delete directory %q, recursive option not specified", data.Path)
     }
     err = os.RemoveAll(cleanedPath)
     if err != nil {
       return fmt.Errorf("cannot delete directory %q: %w", data.Path, err)
     }
   } else {
     return fmt.Errorf("cannot delete file %q: %w", data.Path, err)
   }

Note: Add "syscall" to the imports.

pkg/util/fileutil/fileutil.go (2)

Line range hint 23-33: Consider using debug level for path-related logs.

The logging statements are helpful for debugging but might be too verbose for production. Consider using a debug log level or making the logging configurable.

-		log.Printf("FixPath: path is not absolute: %s", path)
+		log.Debug().Msgf("FixPath: path is not absolute: %s", path)
-		log.Printf("FixPath: fixed path: %s", path)
+		log.Debug().Msgf("FixPath: fixed path: %s", path)

171-201: Add documentation for the FsFileInfo type and its methods.

While the implementation is correct, adding documentation would improve maintainability. Consider documenting:

  1. The purpose of the type
  2. The meaning of the "Internal" suffix in field names
  3. Why Sys() returns nil
+// FsFileInfo implements fs.FileInfo interface to provide file information
+// for remote file operations. Field names are suffixed with "Internal" to
+// avoid confusion with method names.
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }

+// Name returns the base name of the file
 func (f FsFileInfo) Name() string {
     return f.NameInternal
 }

+// Size returns the length in bytes for regular files
 func (f FsFileInfo) Size() int64 {
     return f.SizeInternal
 }

+// Mode returns the file mode bits
 func (f FsFileInfo) Mode() os.FileMode {
     return f.ModeInternal
 }

+// ModTime returns the modification time
 func (f FsFileInfo) ModTime() time.Time {
     return time.Unix(0, f.ModTimeInternal)
 }

+// IsDir returns whether the file is a directory
 func (f FsFileInfo) IsDir() bool {
     return f.IsDirInternal
 }

+// Sys returns nil as underlying data source is not available
 func (f FsFileInfo) Sys() interface{} {
     return nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eadc3dd and 3b28f01.

📒 Files selected for processing (6)
  • pkg/remote/connparse/connparse.go (2 hunks)
  • pkg/remote/connparse/connparse_test.go (1 hunks)
  • pkg/remote/fileshare/wavefs/wavefs.go (5 hunks)
  • pkg/util/fileutil/fileutil.go (2 hunks)
  • pkg/util/tarcopy/tarcopy.go (1 hunks)
  • pkg/wshrpc/wshremote/wshremote.go (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/remote/connparse/connparse.go
🧰 Additional context used
📓 Learnings (1)
pkg/remote/fileshare/wavefs/wavefs.go (1)
Learnt from: esimkowitz
PR: wavetermdev/waveterm#1725
File: pkg/remote/fileshare/wavefs/wavefs.go:441-494
Timestamp: 2025-01-29T04:21:11.649Z
Learning: The `CopyRemote` function in WaveTerm's file operations has proper error handling that prevents partial writes by failing fast and using context cancellation. Each step (path cleaning, file operations, tar reading/writing) is guarded by error checks that prevent proceeding with writes on error.
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (13)
pkg/remote/connparse/connparse_test.go (1)

193-239: LGTM! Well-structured tests for cross-platform path handling.

The new tests comprehensively validate URI parsing for both Unix and Windows relative paths, which is essential for the PR's objective of enabling cross-remote copy/move operations. The tests maintain consistent behavior with the existing test suite while properly handling platform-specific path formats.

pkg/remote/fileshare/wavefs/wavefs.go (5)

7-34: LGTM! Well-structured imports and timeout constant.

The new imports support the tar streaming functionality, and the DefaultTimeout constant provides a good default value for operations.


182-189: LGTM! Improved robustness with nil checks.

Good addition of logging and nil checks for the opts parameter.


389-389: LGTM! Clear indication of unsupported operation.

Good change to return errors.ErrUnsupported instead of nil.


392-443: LGTM! Well-structured internal move and copy operations.

Good separation of concerns between internal and remote operations. Proper error handling and host validation.


445-498: LGTM! Robust implementation of remote copy with proper error handling.

The implementation includes:

  • Context cancellation for proper cleanup
  • Detailed logging
  • Proper error handling at each step
  • Event publishing for file system changes
pkg/util/tarcopy/tarcopy.go (4)

1-32: Well-structured package with clear organization!

The package follows Go best practices with proper license headers, organized imports, and well-named constants.


52-55: Robust path validation implementation!

The combination of filepath.Clean, strings.TrimPrefix, and validatePath provides strong protection against path traversal attacks.


97-99: Remove duplicate context error check.

The context error check is duplicated. The second check at lines 105-107 is redundant.

Also applies to: 105-107


122-137: Well-implemented resource cleanup with retry logic!

The gracefulClose function provides robust resource cleanup with:

  • Clear retry logic with configurable limits
  • Informative debug logging
  • Proper status reporting
pkg/wshrpc/wshremote/wshremote.go (2)

686-686: LGTM!

The error message for cross-host move operations is clear and appropriate.


278-299: ⚠️ Potential issue

Fix error variable shadowing and improve error handling in walkFunc.

The error variable is shadowed in the walkFunc, which could mask errors from filepath.Walk. Also, the error handling can be improved.

-walkFunc := func(path string, info fs.FileInfo, err error) error {
-  if readerCtx.Err() != nil {
-    return readerCtx.Err()
-  }
-  if err != nil {
-    return err
-  }
-  if err = writeHeader(info, path); err != nil {
+walkFunc := func(path string, info fs.FileInfo, walkErr error) error {
+  if readerCtx.Err() != nil {
+    return readerCtx.Err()
+  }
+  if walkErr != nil {
+    return fmt.Errorf("walk error at %q: %w", path, walkErr)
+  }
+  if err := writeHeader(info, path); err != nil {

Likely invalid or redundant comment.

pkg/util/fileutil/fileutil.go (1)

16-16: LGTM!

The new imports are properly organized and necessary for the added functionality.

Also applies to: 19-19

Comment on lines +432 to 440
file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
if err != nil {
return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
}
-			_, err = io.Copy(file, tarReader)
+			_, err = io.Copy(file, reader)
if err != nil {
return fmt.Errorf("cannot write file %q: %w", nextPath, err)
}
file.Chmod(finfo.Mode())
file.Close()

🛠️ Refactor suggestion

Improve file handling with deferred close.

The file.Close() call should be deferred immediately after the file is opened to prevent resource leaks in case of errors.

 file, err := os.OpenFile(nextPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, finfo.Mode())
 if err != nil {
   return fmt.Errorf("cannot create new file %q: %w", nextPath, err)
 }
+defer file.Close()
 _, err = io.Copy(file, reader)
 if err != nil {
   return fmt.Errorf("cannot write file %q: %w", nextPath, err)
 }
-file.Close()
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/util/fileutil/fileutil.go (1)

202-210: 🛠️ Refactor suggestion

Add nil check and documentation for ToFsFileInfo.

The conversion function should validate its input and be documented.

+// ToFsFileInfo converts wshrpc.FileInfo to FsFileInfo.
+// It panics if fi is nil.
 func ToFsFileInfo(fi *wshrpc.FileInfo) FsFileInfo {
+    if fi == nil {
+        panic("ToFsFileInfo: nil FileInfo")
+    }
     return FsFileInfo{
         NameInternal:    fi.Name,
         ModeInternal:    fi.Mode,
         SizeInternal:    fi.Size,
         ModTimeInternal: fi.ModTime,
         IsDirInternal:   fi.IsDir,
     }
 }
🧹 Nitpick comments (1)
pkg/util/fileutil/fileutil.go (1)

168-174: Add documentation for FsFileInfo type.

Please add documentation explaining the purpose of this type and its relationship to fs.FileInfo interface.

+// FsFileInfo implements fs.FileInfo interface and provides a way to represent file information
+// in a format that can be easily serialized and transmitted between different parts of the system.
 type FsFileInfo struct {
     NameInternal    string
     ModeInternal    os.FileMode
     SizeInternal    int64
     ModTimeInternal int64
     IsDirInternal   bool
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b28f01 and 317dae4.

📒 Files selected for processing (1)
  • pkg/util/fileutil/fileutil.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
pkg/util/fileutil/fileutil.go (4)

15-15: LGTM! Required imports added.

The new imports support the added functionality - time for ModTime operations and wshrpc for FileInfo type conversion.

Also applies to: 18-18


22-22: LGTM! Improved error handling.

Good changes to prevent variable shadowing and properly handle filepath.Abs errors.

Also applies to: 26-29


200-200: LGTM! Good compile-time interface check.

The type assertion ensures FsFileInfo properly implements fs.FileInfo interface.


188-190: Add validation for ModTime conversion.

The ModTime conversion from nanoseconds should handle potential overflow cases.

 func (f FsFileInfo) ModTime() time.Time {
+    if f.ModTimeInternal < 0 {
+        return time.Time{}
+    }
     return time.Unix(0, f.ModTimeInternal)
 }

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/wshutil/wshutil.go (1)

487-488: LGTM! Consider documenting the cleanup sequence.

The addition of channel closing in the cleanup phase is correct and follows Go's best practices for resource management. This ensures proper cleanup of communication channels when the connection is terminated.

Consider adding a comment to document the cleanup sequence:

 defer func() {
     conn.Close()
+    // Close communication channels to prevent resource leaks
     close(proxy.FromRemoteCh)
     close(proxy.ToRemoteCh)
pkg/wshutil/wshrpcio.go (1)

25-42: Consider documenting the cleanup pattern for reuse.

This cleanup pattern is valuable for streaming operations. Consider:

  1. Adding comments explaining the pattern for future maintainers
  2. Creating a reusable helper if this pattern is needed elsewhere in the codebase
  3. Documenting this as a best practice for handling streaming errors
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 317dae4 and 9265cd0.

📒 Files selected for processing (2)
  • pkg/wshutil/wshrpcio.go (1 hunks)
  • pkg/wshutil/wshutil.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build for TestDriver.ai
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
pkg/wshutil/wshrpcio.go (2)

36-37: LGTM! Comprehensive error handling with proper cleanup.

The error handling is well-implemented:

  • Sets cleanup flag before returning
  • Wraps errors with descriptive context
  • Consistently handles both data and newline write errors

Also applies to: 41-42


25-33: LGTM! Clean error handling with proper resource cleanup.

The deferred cleanup mechanism is well-implemented:

  • Prevents goroutine leaks by only starting cleanup goroutine when needed
  • Ensures channel is drained on errors, preventing blocked goroutines
  • No race conditions as the drain flag is properly scoped

Let's verify the cleanup behavior:

✅ Verification successful

Cleanup pattern is consistently implemented across the codebase

The implementation follows the established pattern in the codebase:

  • Similar cleanup mechanism already exists in pkg/web/web.go
  • Both use flag-controlled deferred cleanup
  • Both prevent goroutine leaks in streaming operations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that this cleanup pattern is consistently used across similar streaming operations

# Look for other streaming operations that might benefit from this pattern
rg -l "chan \[\]byte.*io\.Writer" --type go

# Check for potential goroutine leaks in error handling
ast-grep --pattern 'go func() {
  for range $_ {
  }
}()'

Length of output: 412


Script:

#!/bin/bash
# Get context around the cleanup pattern in web.go
rg "for range rtnCh" -B 10 -A 5 pkg/web/web.go

# Check error handling patterns around both cleanup locations
ast-grep --pattern 'if err != nil {
  $$$
  go func() {
    for range $_ {
    }
  }()
  $$$
}'

Length of output: 466

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (5)
pkg/util/iochan/iochan.go (5)

21-22: Consider using a buffered channel with dynamic size.

The channel buffer size is hardcoded to 32, which might not be optimal for all scenarios. Consider making it configurable based on the expected data size or system resources.


37-50: Optimize buffer allocation and enhance error handling.

Several improvements could be made to this section:

  1. The buffer is always allocated with full chunkSize even for smaller reads
  2. Consider using sync.Pool for buffer reuse in high-throughput scenarios
  3. The checksum computation happens after data is sent, which could lead to inconsistency

Here's a suggested improvement:

-       buf := make([]byte, chunkSize)
+       // Use buffer pool for better memory management
+       buf := make([]byte, min(chunkSize, 32*1024)) // Start with smaller buffer
        if n, err := r.Read(buf); err != nil {
            if errors.Is(err, io.EOF) {
                ch <- wshrpc.RespOrErrorUnion[iochantypes.Packet]{Response: iochantypes.Packet{Checksum: sha256Hash.Sum(nil)}}
                return
            }
            ch <- wshutil.RespErr[iochantypes.Packet](fmt.Errorf("ReaderChan: read error: %v", err))
            return
        } else if n > 0 {
+           // Compute checksum before sending data
            if _, err := sha256Hash.Write(buf[:n]); err != nil {
                ch <- wshutil.RespErr[iochantypes.Packet](fmt.Errorf("ReaderChan: error writing to sha256 hash: %v", err))
                return
            }
            ch <- wshrpc.RespOrErrorUnion[iochantypes.Packet]{Response: iochantypes.Packet{Data: buf[:n]}}
        }

59-64: Consider adding timeout handling for slow writers.

The context handling is good, but consider adding a timeout mechanism for slow writers to prevent resource exhaustion.

+       writeTimeout := time.NewTimer(30 * time.Second)
+       defer writeTimeout.Stop()
        for {
            select {
            case <-ctx.Done():
                return
+           case <-writeTimeout.C:
+               cancel(fmt.Errorf("WriterChan: write timeout exceeded"))
+               return
            case resp, ok := <-ch:
+               writeTimeout.Reset(30 * time.Second)

101-105: Consider adding completion notification for channel draining.

The current implementation launches a goroutine but provides no way to know when draining is complete. This could be important for cleanup operations.

-func drainChannel(ch <-chan wshrpc.RespOrErrorUnion[iochantypes.Packet]) {
+func drainChannel(ch <-chan wshrpc.RespOrErrorUnion[iochantypes.Packet]) <-chan struct{} {
+       done := make(chan struct{})
        go func() {
                for range ch {
                }
+               close(done)
        }()
+       return done
}

Line range hint 1-105: Overall solid implementation with good error handling and data integrity checks.

The changes successfully implement the streaming functionality needed for remote copy/move operations with proper checksum verification. The error handling using context cancellation is well thought out.

A few architectural considerations for future improvements:

  1. Buffer pooling for better memory management in high-throughput scenarios
  2. Progressive checksum verification for large streams
  3. Configurable timeouts and buffer sizes
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9265cd0 and 0b9bdd5.

📒 Files selected for processing (1)
  • pkg/util/iochan/iochan.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build for TestDriver.ai
🔇 Additional comments (2)
pkg/util/iochan/iochan.go (2)

8-17: LGTM! Well-structured imports for the new functionality.

The addition of crypto/sha256 for checksums and the new iochantypes package shows good separation of concerns.


84-91: Enhance checksum verification process.

The checksum verification happens only after processing all data. Consider implementing progressive verification or intermediate checksums for large streams.
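One way to sketch progressive verification is to attach a running digest to each chunk so the receiver can fail fast on corruption (illustrative only; the actual iochantypes.Packet carries a single final checksum):

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// chunk pairs a data slice with the running SHA-256 over all bytes so
// far, letting the receiver verify integrity per chunk instead of only
// at end of stream.
type chunk struct {
	data    []byte
	running []byte // sha256 of the stream up to and including data
}

// sendChunks splits p into size-byte chunks and attaches a running
// digest to each one. hash.Hash.Sum appends without resetting state, so
// the same hasher keeps accumulating across chunks.
func sendChunks(p []byte, size int) []chunk {
	h := sha256.New()
	var out []chunk
	for len(p) > 0 {
		n := size
		if n > len(p) {
			n = len(p)
		}
		h.Write(p[:n])
		out = append(out, chunk{
			data:    append([]byte(nil), p[:n]...),
			running: h.Sum(nil),
		})
		p = p[n:]
	}
	return out
}

// verifyChunks recomputes the running digest and fails on the first
// mismatch, so corruption is detected without buffering the whole stream.
func verifyChunks(chunks []chunk) error {
	h := sha256.New()
	for i, c := range chunks {
		h.Write(c.data)
		if !bytes.Equal(h.Sum(nil), c.running) {
			return fmt.Errorf("checksum mismatch at chunk %d", i)
		}
	}
	return nil
}

func main() {
	cs := sendChunks([]byte("progressive verification"), 8)
	fmt.Println("ok:", verifyChunks(cs) == nil)
}
```

The trade-off is one 32-byte digest per chunk of overhead in exchange for bounding how much bad data is ever written before a mismatch is caught.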

Run this script to analyze the typical data sizes being processed:

esimkowitz and others added 4 commits January 30, 2025 17:33
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@esimkowitz esimkowitz merged commit 902ff9b into main Jan 31, 2025
5 of 6 checks passed
@esimkowitz esimkowitz deleted the evan/combine-wsh-file-commands branch January 31, 2025 18:42