feat(websocket): add storage selection feature to enhance flexibility #626

Merged
merged 10 commits into from
Nov 13, 2024

Conversation

kasugamirai
Member

@kasugamirai kasugamirai commented Nov 12, 2024

Overview

What I've done

What I haven't done

How I tested

Screenshot

Which point I want you to review particularly

Memo

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced support for Google Cloud Storage (GCS) and local storage through conditional compilation.
    • Added new features: default, gcs-storage, and local-storage across various components.
    • Enhanced repository management for project storage with flexible configuration options.
    • Added a new GcsClient for GCS operations, including upload, download, and version management.
    • Updated example services to utilize the new storage repository initialization logic.
  • Bug Fixes

    • Improved error handling for GCS and local storage operations.
    • Added a new error variant for GCS-related issues in the application's error handling system.

These updates enhance the application's storage capabilities and provide more flexibility for users managing project data.

@kasugamirai kasugamirai requested a review from pyshx November 12, 2024 08:52

netlify bot commented Nov 12, 2024

Deploy Preview for reearth-flow ready!

🔨 Latest commit: a88e787
🔍 Latest deploy log: https://app.netlify.com/sites/reearth-flow/deploys/6733a2e929b84f0008b4da4b
😎 Deploy Preview: https://deploy-preview-626--reearth-flow.netlify.app

Contributor

coderabbitai bot commented Nov 12, 2024

Walkthrough

The pull request introduces several modifications across multiple files, primarily focusing on the implementation of storage features within a WebSocket application. Key changes include the addition of new features in Cargo.toml files for both the application and infrastructure, along with the introduction of a GcsClient for Google Cloud Storage. The storage mechanism in the application state has been updated to use a more flexible ProjectStorageRepository, allowing for conditional compilation based on selected storage features. Additionally, the .gitignore file has been updated to exclude .DS_Store files.

Changes

  • websocket/.gitignore: Added entry for .DS_Store to exclude macOS-specific files.
  • websocket/app/Cargo.toml: Introduced new [features] section with default, gcs-storage, and local-storage. Updated flow-websocket-infra and flow-websocket-services dependencies to disable default features.
  • websocket/app/src/state.rs: Replaced ProjectLocalRepository with ProjectStorageRepository. Updated new method to initialize storage based on the local-storage feature.
  • websocket/crates/infra/Cargo.toml: Added new [features] section with default, gcs-storage, and local-storage. Updated google-cloud-storage dependency to be optional.
  • websocket/crates/infra/src/persistence/gcs/gcs_client.rs: Introduced GcsClient struct and methods for GCS operations. Extended GcsError enum for GCS-specific errors.
  • websocket/crates/infra/src/persistence/project_repository.rs: Added conditional compilation for ProjectStorageRepository. Introduced ProjectGcsRepository and ProjectLocalRepository for managing snapshots.
  • websocket/crates/services/Cargo.toml: Added new [features] section with default, gcs-storage, and local-storage. Updated flow-websocket-infra dependency to disable default features. Added example edit_session_service.
  • websocket/crates/services/examples/edit_session_service.rs: Replaced ProjectLocalRepository with ProjectStorageRepository for storage management.
  • websocket/crates/services/examples/project_service.rs: Replaced ProjectLocalRepository with ProjectStorageRepository for project storage management.
  • websocket/crates/infra/src/persistence/mod.rs: Added conditional exports for ProjectGcsRepository and ProjectLocalRepository based on feature flags.
  • websocket/app/src/errors.rs: Added new error variant GcsStorage to WsError enum for handling GCS-related errors.
  • websocket/crates/infra/src/persistence/redis/flow_project_redis_data_manager.rs: Enhanced push_update method to include immediate merge updates after pushing to the Redis stream.
  • websocket/Cargo.toml: Updated flow-websocket-infra and flow-websocket-services dependencies to disable default features.

Possibly related PRs

Suggested labels

engine

Suggested reviewers

  • pyshx

🐇 In the code, we hop and play,
New features bloom, brightening the way.
With GCS and local storage in sight,
Our project grows, taking flight!
.DS_Store is gone, a clean slate,
Hooray for changes, oh, isn’t it great? 🐇✨



@kasugamirai kasugamirai force-pushed the feat/storage branch 2 times, most recently from 955c6d6 to d54b935 Compare November 12, 2024 09:07
Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (10)
websocket/app/Cargo.toml (1)

12-15: LGTM! Well-structured feature flags for storage backends.

The feature flags are well-organized, making the storage backend selection flexible while maintaining local storage as the default. This aligns with the PR's objective of enhancing flexibility.

Consider documenting these features in the README to help other developers understand:

  • Available storage options
  • How to enable different storage backends
  • Default configuration
websocket/crates/infra/Cargo.toml (1)

13-16: Consider adding feature documentation.

The feature flag structure looks good and follows Rust best practices. However, consider adding documentation comments above the [features] section to explain the purpose and implications of each storage option.

Add documentation like this:

+# Storage features:
+# - local-storage: Use local filesystem for storage (default)
+# - gcs-storage: Enable Google Cloud Storage integration
 [features]
 default = ["local-storage"]  
 gcs-storage = ["dep:google-cloud-storage"] 
 local-storage = []
websocket/crates/services/Cargo.toml (1)

40-42: Consider translating the Japanese comment to English

The example configuration is correct, but there's a Japanese comment that should be in English for consistency and broader understanding.

 [[example]]
 name = "edit_session_service"
-required-features = ["local-storage"]  # 指定示例需要的特性
+required-features = ["local-storage"]  # Specify required features for the example
websocket/app/src/state.rs (1)

Storage implementation needs test coverage and documentation improvements

Based on the verification results:

  • No tests exist for the storage implementation
  • Storage features (local-storage, gcs-storage) lack configuration documentation
  • The websocket README is minimal and doesn't cover storage configuration

Action items:

  • Add unit tests for ProjectStorageRepository implementation
  • Add integration tests covering both storage features
  • Update websocket/README.md with:
    • Storage feature configuration guide
    • Example configurations for both local and GCS storage
    • Feature flag documentation
  • Add logging statements during storage initialization
🔗 Analysis chain

Line range hint 44-72: Consider adding comprehensive testing and documentation

Given that storage selection is a critical feature:

  1. Add integration tests covering different storage configurations
  2. Document the storage feature selection process in the project's README
  3. Consider adding a configuration guide for each storage option
  4. Add logging for storage initialization to aid in debugging

Let's check for existing tests and documentation:


Let me check the content of relevant README files and search for any existing storage-related code to make a more informed decision.


Let me check for the storage implementation details and any existing test patterns in the codebase to make a final assessment.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for storage-related tests
rg -l "test.*storage" websocket/app/src/
rg -l "test.*storage" websocket/app/tests/

# Check for documentation
fd -g "README*" .
fd -g "CONTRIBUTING*" .
fd -g "CONFIGURATION*" .

Length of output: 473


Script:

#!/bin/bash
# Check content of websocket README
cat websocket/README.md

# Look for storage-related code and tests
rg -g "*.rs" "storage" websocket/app/src/
rg -g "*.rs" "#\[test\]" websocket/app/src/

# Check if there's a tests directory in the root
fd "tests" .

Length of output: 1585


Script:

#!/bin/bash
# Check storage implementation
ast-grep --pattern 'struct ProjectStorageRepository' 

# Look for test patterns in the codebase
rg -g "*.rs" "mod tests" websocket/app/src/
rg -g "*.rs" -B 2 -A 5 "ProjectStorageRepository" websocket/app/src/

# Check for any configuration or feature documentation
rg -g "*.{rs,toml}" "feature.*storage" websocket/app/

Length of output: 2747

websocket/crates/services/examples/project_service.rs (1)

Line range hint 14-14: Enhance example documentation

The example would benefit from additional documentation explaining:

  • Available storage features and their configuration
  • Required environment variables or configuration for each storage type
  • Prerequisites for running the example

Consider adding a documentation block like this:

 ///RUST_LOG=debug cargo run --example project_service
+/// 
+/// # Storage Configuration
+/// This example supports multiple storage backends:
+/// 
+/// ## Local Storage
+/// Enable with `--features local-storage`
+/// Stores data in ./local_storage directory
+/// 
+/// ## GCS Storage
+/// Enable with `--features gcs-storage`
+/// Requires:
+/// - GCS bucket configuration
+/// - Appropriate GCP credentials
websocket/crates/infra/src/persistence/gcs/gcs_client.rs (4)

Line range hint 83-127: Potential race conditions when updating metadata in upload_versioned

In the upload_versioned method, the metadata file (metadata_path) is read and written without any synchronization mechanisms. If multiple instances or threads invoke this method concurrently, it could lead to race conditions, resulting in inconsistent or corrupted metadata.

Consider implementing concurrency control strategies to handle simultaneous updates safely:

  • Optimistic Concurrency Control: Use conditional updates based on versioning or ETags provided by GCS to ensure that the metadata is only updated if it hasn't changed since it was last read.
  • Distributed Locking: Implement a distributed lock mechanism using GCS or an external service like Redis to serialize access to the metadata file.
  • Atomic Operations: Leverage atomic operations or transactions if supported by GCS to ensure that read-modify-write cycles are atomic.

Addressing this issue will enhance the robustness and reliability of the versioning system.
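
As a rough illustration of the first option, here is a minimal sketch of an optimistic read-modify-write loop. The VersionedStore trait and both of its methods are hypothetical placeholders for whatever generation/ETag precondition mechanism the GCS client ends up exposing; only the retry structure is the point.

use std::collections::BTreeMap;

use async_trait::async_trait;

struct VersionMetadata {
    latest_version: String,
    version_history: BTreeMap<i64, String>,
}

// Hypothetical precondition-aware accessors; a real implementation would back
// these with GCS generation preconditions (ifGenerationMatch) or ETags.
#[async_trait]
trait VersionedStore {
    type Error;
    // Returns the metadata together with the generation it was read at.
    async fn read_with_generation(&self, path: &str) -> Result<(VersionMetadata, i64), Self::Error>;
    // Writes only if the object is still at `expected_generation`; returns false otherwise.
    async fn write_if_generation_matches(
        &self,
        path: &str,
        metadata: &VersionMetadata,
        expected_generation: i64,
    ) -> Result<bool, Self::Error>;
}

// Optimistic-concurrency update: re-read and retry whenever a concurrent
// writer changed the metadata between our read and our conditional write.
async fn record_new_version<S: VersionedStore>(
    store: &S,
    path: &str,
    timestamp: i64,
    version: String,
) -> Result<(), S::Error> {
    loop {
        let (mut metadata, generation) = store.read_with_generation(path).await?;
        metadata.latest_version = version.clone();
        metadata.version_history.insert(timestamp, version.clone());
        if store
            .write_if_generation_matches(path, &metadata, generation)
            .await?
        {
            return Ok(());
        }
        // Precondition failed: another writer won the race; loop and retry.
    }
}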


Line range hint 50-127: Leverage GCS's native object versioning for simplified management

Google Cloud Storage supports Object Versioning, which can automatically keep track of object generations without manual metadata management. By enabling this feature on your bucket, you can:

  • Automatically retain previous versions of objects.
  • Simplify the code by removing the need for the VersionMetadata struct and manual version tracking.
  • Reduce potential errors related to concurrency and metadata corruption.

Adjusting your implementation to utilize GCS's native versioning could improve performance and reduce complexity.


47-47: Use Vec instead of BTreeMap for version history if ordering is sufficient

The version_history field uses a BTreeMap<i64, String> to maintain ordered versions based on timestamps. If the primary requirement is to maintain the order of versions and access them sequentially, a Vec<(i64, String)> might be more efficient:

  • Simpler Data Structure: A vector is simpler and has less overhead than a map when keys are sequentially ordered timestamps.
  • Efficient Iteration: Sequential access patterns are faster with vectors.
  • Reduced Complexity: Simplifies code for adding and removing versions.

Consider whether a vector meets your needs and if it provides performance benefits in your use case.
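
For comparison, a small sketch of what a vector-backed history could look like; the cap mirrors the 100-version limit used elsewhere in this client and the type is purely illustrative.

// Append-only history of (timestamp, version) pairs. Entries arrive in time
// order, so the newest version is always last and pruning drains from the front.
struct VersionHistory {
    entries: Vec<(i64, String)>,
    limit: usize,
}

impl VersionHistory {
    fn new(limit: usize) -> Self {
        Self { entries: Vec::new(), limit }
    }

    fn push(&mut self, timestamp: i64, version: String) {
        self.entries.push((timestamp, version));
        if self.entries.len() > self.limit {
            // Equivalent to removing the smallest keys from the BTreeMap.
            let excess = self.entries.len() - self.limit;
            self.entries.drain(0..excess);
        }
    }

    fn latest(&self) -> Option<&(i64, String)> {
        self.entries.last()
    }
}

The trade-off is that out-of-order timestamps would no longer be sorted automatically, which the BTreeMap currently gives for free.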


244-248: Testing code conditioned on gcs-storage feature may reduce test coverage

The test module and mock implementations are under the #[cfg(feature = "gcs-storage")] attribute. This means that tests will only run when the gcs-storage feature is enabled:

#[cfg(test)]
mod tests {
    #[cfg(feature = "gcs-storage")]
    use super::*;
    // ...
}

Consider adjusting the conditional compilation to ensure tests are always included:

  • Always Include Tests: Remove the #[cfg(feature = "gcs-storage")] from the test module to ensure tests run regardless of feature flags.
  • Conditional Imports within Tests: Use conditional statements inside tests if certain code only exists with specific features.

Ensuring that tests run consistently improves code reliability and helps catch issues early.

websocket/crates/infra/src/persistence/project_repository.rs (1)

142-207: Suggestion: Refactor to eliminate code duplication between storage implementations

The implementations of ProjectSnapshotImpl for both ProjectGcsRepository and ProjectLocalRepository are similar, with methods performing analogous operations. Consider abstracting the common logic into a generic implementation or utilizing traits to reduce code duplication and enhance maintainability.

Also applies to: 211-272
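
One possible shape for that abstraction, sketched with a hypothetical StorageClient trait and a repository generic over it; the method names mirror the upload/download operations both backends already perform but are not taken from the actual code.

use serde::{de::DeserializeOwned, Serialize};

use async_trait::async_trait;

// Hypothetical minimal surface both the GCS client and the local filesystem
// client could implement; error types stay backend-specific.
#[async_trait]
trait StorageClient: Send + Sync {
    type Error;
    async fn put<T: Serialize + Send + Sync>(&self, path: String, value: &T) -> Result<(), Self::Error>;
    async fn get<T: DeserializeOwned + Send>(&self, path: String) -> Result<T, Self::Error>;
}

// A single repository generic over the backend, so the snapshot logic
// (path layout, versioning policy) is written exactly once.
struct SnapshotRepository<C> {
    client: C,
}

impl<C: StorageClient> SnapshotRepository<C> {
    fn snapshot_path(project_id: &str) -> String {
        format!("snapshot/{}", project_id)
    }

    async fn save<T: Serialize + Send + Sync>(&self, project_id: &str, snapshot: &T) -> Result<(), C::Error> {
        self.client.put(Self::snapshot_path(project_id), snapshot).await
    }

    async fn load<T: DeserializeOwned + Send>(&self, project_id: &str) -> Result<T, C::Error> {
        self.client.get(Self::snapshot_path(project_id)).await
    }
}

With something like that in place, ProjectGcsRepository and ProjectLocalRepository shrink to StorageClient implementations, and the feature flags only choose which client gets constructed.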

🛑 Comments failed to post (3)
websocket/app/src/state.rs (1)

44-48: ⚠️ Potential issue

Several improvements needed in storage initialization

  1. The hardcoded path "./local_storage" should be configurable via environment variables for flexibility across different environments.
  2. There's no error handling for the case when neither storage feature is enabled.
  3. The GCS configuration should use proper feature gates instead of comments.
  4. Missing documentation about available storage features and their configuration.

Consider applying these changes:

+ // Storage features:
+ // - "local-storage": Uses local filesystem storage
+ // - "gcs-storage": Uses Google Cloud Storage
+ 
+ const DEFAULT_LOCAL_STORAGE_PATH: &str = "./local_storage";
+ 
  // Initialize storage based on feature
+ let storage = {
+     #[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
+     compile_error!("At least one storage feature must be enabled");
+ 
      #[cfg(feature = "local-storage")]
-     let storage = Arc::new(ProjectStorageRepository::new("./local_storage".into()).await?);
+     {
+         let path = std::env::var("LOCAL_STORAGE_PATH")
+             .unwrap_or_else(|_| DEFAULT_LOCAL_STORAGE_PATH.to_string());
+         Arc::new(ProjectStorageRepository::new(path).await?)
+     }
 
-     // #[cfg(feature = "gcs-storage")]
-     // let storage = Arc::new(ProjectStorageRepository::new("your-gcs-bucket".to_string()).await?);
+     #[cfg(feature = "gcs-storage")]
+     {
+         let bucket = std::env::var("GCS_BUCKET")
+             .expect("GCS_BUCKET environment variable must be set when using gcs-storage feature");
+         Arc::new(ProjectStorageRepository::new(bucket).await?)
+     }
+ };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        // Storage features:
        // - "local-storage": Uses local filesystem storage
        // - "gcs-storage": Uses Google Cloud Storage
        
        const DEFAULT_LOCAL_STORAGE_PATH: &str = "./local_storage";
        
        // Initialize storage based on feature
        let storage = {
            #[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
            compile_error!("At least one storage feature must be enabled");

            #[cfg(feature = "local-storage")]
            {
                let path = std::env::var("LOCAL_STORAGE_PATH")
                    .unwrap_or_else(|_| DEFAULT_LOCAL_STORAGE_PATH.to_string());
                Arc::new(ProjectStorageRepository::new(path).await?)
            }

            #[cfg(feature = "gcs-storage")]
            {
                let bucket = std::env::var("GCS_BUCKET")
                    .expect("GCS_BUCKET environment variable must be set when using gcs-storage feature");
                Arc::new(ProjectStorageRepository::new(bucket).await?)
            }
        };
websocket/crates/services/examples/edit_session_service.rs (1)

37-41: 🛠️ Refactor suggestion

Add documentation and improve feature configuration

Several improvements needed for the storage initialization:

  1. Add documentation about required feature flags
  2. Provide error handling for when no storage feature is enabled
  3. Use environment variable for GCS bucket name instead of hardcoding

Consider this implementation:

+/// Storage initialization requires either `local-storage` or `gcs-storage` feature to be enabled.
+/// Configure through environment variables:
+/// - GCS_BUCKET: Google Cloud Storage bucket name (required when using gcs-storage feature)
 #[cfg(feature = "local-storage")]
 let storage = ProjectStorageRepository::new("./local_storage".into()).await?;
-// #[cfg(feature = "gcs-storage")]
-// let storage = ProjectStorageRepository::new("your-gcs-bucket".to_string()).await?;
+#[cfg(feature = "gcs-storage")]
+let storage = ProjectStorageRepository::new(
+    std::env::var("GCS_BUCKET")
+        .expect("GCS_BUCKET environment variable must be set")
+).await?;
+#[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
+compile_error!("Either 'local-storage' or 'gcs-storage' feature must be enabled");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

/// Storage initialization requires either `local-storage` or `gcs-storage` feature to be enabled.
/// Configure through environment variables:
/// - GCS_BUCKET: Google Cloud Storage bucket name (required when using gcs-storage feature)
#[cfg(feature = "local-storage")]
let storage = ProjectStorageRepository::new("./local_storage".into()).await?;
#[cfg(feature = "gcs-storage")]
let storage = ProjectStorageRepository::new(
    std::env::var("GCS_BUCKET")
        .expect("GCS_BUCKET environment variable must be set")
).await?;
#[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
compile_error!("Either 'local-storage' or 'gcs-storage' feature must be enabled");
websocket/crates/infra/src/persistence/project_repository.rs (1)

24-26: ⚠️ Potential issue

Critical: Resolve conflicting pub use statements for ProjectStorageRepository

Unconditionally importing ProjectLocalRepository as ProjectStorageRepository on line 26 conflicts with the gcs-storage path: when that feature is enabled, both ProjectGcsRepository and ProjectLocalRepository end up aliased to ProjectStorageRepository, which leads to compilation errors or unintended behavior when multiple features are enabled simultaneously.

Apply this diff to fix the issue:

+#[cfg(feature = "local-storage")]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

#[cfg(feature = "gcs-storage")]
pub use self::gcs::ProjectGcsRepository as ProjectStorageRepository;
#[cfg(feature = "local-storage")]
pub use self::local::ProjectLocalRepository as ProjectStorageRepository;

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (11)
websocket/app/src/state.rs (2)

27-27: Consider restricting storage field visibility

While the change to use ProjectStorageRepository is good, consider making the storage field private (pub(crate)) and providing specific public methods for storage operations. This would better encapsulate the storage implementation details and make future changes easier.
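
A small sketch of what that encapsulation might look like; the field set shown here is trimmed down and the accessor name is illustrative, since only part of AppState is visible in this diff.

use std::sync::Arc;

// Stand-in for whichever repository the active storage feature selects.
struct ProjectStorageRepository;

pub struct AppState {
    // Private: handlers can no longer reach into the concrete storage type.
    storage: Arc<ProjectStorageRepository>,
}

impl AppState {
    // Narrow, crate-visible accessor for the few call sites that need it.
    pub(crate) fn storage(&self) -> Arc<ProjectStorageRepository> {
        Arc::clone(&self.storage)
    }
}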


Line range hint 44-77: Document storage implementation details

This architectural change introduces significant flexibility in storage backends. Consider:

  1. Adding documentation comments explaining:
    • Available storage backends
    • Configuration requirements for each backend
    • Feature flag usage
  2. Creating a README.md in the websocket directory describing the storage architecture
  3. Adding examples for different storage configurations

Would you like me to help create this documentation?

websocket/crates/services/examples/edit_session_service.rs (1)

19-19: Enhance example documentation with storage configuration.

The example's documentation should be expanded to include:

  1. Available storage features and how to enable them
  2. Required environment variables for each storage type
  3. Example commands for running with different storage backends

Update the documentation comment:

-///export REDIS_URL="redis://default:my_redis_password@localhost:6379/0"
-///RUST_LOG=debug cargo run --example edit_session_service
+/// Example demonstrating edit session service with configurable storage backend
+/// 
+/// Environment variables:
+/// - REDIS_URL: Redis connection string (default: redis://localhost:6379/0)
+/// - STORAGE_PATH: Path for local storage (when using local-storage feature)
+/// - RUST_LOG: Logging level
+/// 
+/// Example usage:
+/// ```bash
+/// # Using local storage
+/// export REDIS_URL="redis://default:my_redis_password@localhost:6379/0"
+/// export STORAGE_PATH="./my_storage"
+/// RUST_LOG=debug cargo run --example edit_session_service --features local-storage
+/// ```
websocket/crates/infra/src/persistence/gcs/gcs_client.rs (6)

Line range hint 24-35: Enhance error messages for better debugging

While the error handling is comprehensive, consider adding more context to error messages to aid in debugging.

 #[derive(Error, Debug)]
 pub enum GcsError {
     #[cfg(feature = "gcs-storage")]
-    #[error(transparent)]
+    #[error("GCS authentication failed: {0}")]
     Auth(#[from] google_cloud_storage::client::google_cloud_auth::error::Error),
     #[cfg(feature = "gcs-storage")]
-    #[error(transparent)]
+    #[error("GCS HTTP request failed: {0}")]
     Http(#[from] google_cloud_storage::http::Error),
-    #[error("Serialization error: {0}")]
+    #[error("Failed to serialize/deserialize GCS object: {0}")]
     Serialization(#[from] serde_json::Error),
-    #[error("UTF-8 conversion error: {0}")]
+    #[error("Failed to convert GCS object content to UTF-8: {0}")]
     Utf8(#[from] std::string::FromUtf8Error),
 }

Line range hint 37-42: Add bucket name validation

The GcsClient struct should validate the bucket name format during initialization.

 pub struct GcsClient {
     client: Client,
     bucket: String,
 }
+
+impl GcsClient {
+    fn validate_bucket_name(bucket: &str) -> Result<(), GcsError> {
+        if bucket.is_empty() || !bucket.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') {
+            return Err(GcsError::InvalidBucketName(bucket.to_string()));
+        }
+        Ok(())
+    }
+}

Line range hint 108-157: Address potential race conditions in version management

The version management implementation could have race conditions when multiple clients update versions simultaneously. Consider using optimistic locking or a distributed lock mechanism.

Suggested approaches:

  1. Implement optimistic locking using object generation numbers
  2. Use a distributed lock service
  3. Implement CAS (Compare-and-Swap) operations

Would you like me to provide a detailed implementation for any of these approaches?


Line range hint 43-48: Make version limit configurable

The version history limit is hardcoded to 100. Consider making this configurable through the client initialization.

 pub struct GcsClient {
     client: Client,
     bucket: String,
+    version_limit: usize,
 }

 #[derive(Serialize, Deserialize)]
 struct VersionMetadata {
     latest_version: String,
     version_history: BTreeMap<i64, String>,
 }

Line range hint 243-324: Enhance test coverage

The current test suite covers basic functionality but lacks important scenarios:

  1. Error cases (network failures, invalid data)
  2. Concurrent operations
  3. Boundary conditions (empty data, maximum version limit)

Consider adding these test cases:

#[cfg(test)]
mod tests {
    // ... existing tests ...

    #[cfg(feature = "gcs-storage")]
    #[tokio::test]
    async fn test_upload_versioned_error_cases() {
        let mut mock = MockGcsClientMock::new();
        mock.expect_upload()
            .returning(|_, _| Err(GcsError::Http(/* mock HTTP error */)));
        
        let result = mock.upload_versioned("test_path".to_string(), &"test_data".to_string()).await;
        assert!(matches!(result, Err(GcsError::Http(_))));
    }

    #[cfg(feature = "gcs-storage")]
    #[tokio::test]
    async fn test_version_limit_boundary() {
        // Test behavior when version limit is reached
    }

    #[cfg(feature = "gcs-storage")]
    #[tokio::test]
    async fn test_concurrent_updates() {
        // Test concurrent version updates
    }
}

Line range hint 233-241: Enhance client security and reliability

The client initialization should include:

  1. Request timeouts
  2. Retry policies for transient failures
  3. Optional authentication configuration
 impl GcsClient {
     pub async fn new(bucket: String) -> Result<Self, GcsError> {
-        let config = ClientConfig::default().with_auth().await?;
+        let config = ClientConfig::default()
+            .with_auth()
+            .await?
+            .with_timeout(std::time::Duration::from_secs(30))
+            .with_retry_config(RetryConfig::default()
+                .with_max_retries(3)
+                .with_initial_delay(std::time::Duration::from_millis(100)));
         let client = Client::new(config);
         Ok(GcsClient { client, bucket })
     }
 }
websocket/crates/infra/src/persistence/project_repository.rs (2)

Line range hint 2-31: LGTM! Consider adding documentation for feature flags.

The feature flag setup correctly ensures mutual exclusivity of storage backends. The compile-time checks prevent invalid configurations.

Consider adding documentation comments explaining:

  • Available feature flags (gcs-storage, local-storage)
  • Their mutual exclusivity requirement
  • How to enable them in Cargo.toml

166-197: Consider adding path validation and simplifying version handling.

The snapshot path construction and version handling could be improved:

  1. Path construction should validate project_id for invalid characters
  2. Version handling in update_latest_snapshot has duplicate code

Consider refactoring update_latest_snapshot:

 async fn update_latest_snapshot(&self, snapshot: ProjectSnapshot) -> Result<(), Self::Error> {
-    let latest_version = self
-        .client
-        .get_latest_version(&format!("snapshot/{}", snapshot.metadata.project_id))
-        .await?;
-    if let Some(_version) = latest_version {
-        let path = format!("snapshot/{}", snapshot.metadata.project_id);
-        self.client.update_latest_versioned(path, &snapshot).await?;
-    } else {
-        let path = format!("snapshot/{}", snapshot.metadata.project_id);
-        self.client.upload_versioned(path, &snapshot).await?;
-    }
+    let path = format!("snapshot/{}", snapshot.metadata.project_id);
+    match self.client.get_latest_version(&path).await? {
+        Some(_) => self.client.update_latest_versioned(path, &snapshot).await?,
+        None => self.client.upload_versioned(path, &snapshot).await?,
+    }
     Ok(())
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 955c6d6 and 0ee111f.

⛔ Files ignored due to path filters (1)
  • websocket/.DS_Store is excluded by !**/.DS_Store
📒 Files selected for processing (9)
  • websocket/.gitignore (1 hunks)
  • websocket/app/Cargo.toml (2 hunks)
  • websocket/app/src/state.rs (4 hunks)
  • websocket/crates/infra/Cargo.toml (1 hunks)
  • websocket/crates/infra/src/persistence/gcs/gcs_client.rs (9 hunks)
  • websocket/crates/infra/src/persistence/project_repository.rs (3 hunks)
  • websocket/crates/services/Cargo.toml (2 hunks)
  • websocket/crates/services/examples/edit_session_service.rs (4 hunks)
  • websocket/crates/services/examples/project_service.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • websocket/.gitignore
  • websocket/app/Cargo.toml
  • websocket/crates/infra/Cargo.toml
  • websocket/crates/services/Cargo.toml
  • websocket/crates/services/examples/project_service.rs
🔇 Additional comments (8)
websocket/app/src/state.rs (3)

6-6: LGTM: Clean type abstraction for storage flexibility

The changes to imports and type definitions properly abstract the storage implementation, allowing for different storage backends while maintaining type safety.

Also applies to: 19-19


56-56: LGTM: Proper service initialization

The service initialization correctly uses the new storage abstraction.


72-72: LGTM: Verify feature flag configuration

The AppState construction correctly includes the storage field.

Let's verify the feature flag configuration in Cargo.toml:

✅ Verification successful

Let me gather more information about the storage configuration by checking the Cargo.toml content and related dependencies.


Feature flag configuration is correctly set up and implemented

The storage configuration is properly implemented with:

  • Default feature set to "local-storage"
  • Both storage options available: "local-storage" and "gcs-storage"
  • Correct conditional compilation using #[cfg(feature = "local-storage")]
  • Storage initialization properly integrated into AppState
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check feature flags configuration in Cargo.toml

# Find and display storage-related feature flags
echo "Checking websocket/app/Cargo.toml for storage features:"
rg -A 5 'features.*=.*\[' websocket/app/Cargo.toml

# Verify that at least one storage feature is enabled by default
echo -e "\nChecking for default features:"
rg 'default-features|default.*=.*\[' websocket/app/Cargo.toml

Length of output: 414


Script:

#!/bin/bash
# Get full content of Cargo.toml to understand storage configuration
echo "=== Cargo.toml content ==="
cat websocket/app/Cargo.toml

# Check for storage-related dependencies and their features
echo -e "\n=== Storage-related dependencies ==="
rg "storage|Storage" websocket/app/Cargo.toml -A 2

# Check for any storage-related code usage
echo -e "\n=== Storage usage in code ==="
rg "storage|Storage" websocket/app/src/ --type rust

Length of output: 2239

websocket/crates/services/examples/edit_session_service.rs (3)

6-6: LGTM: Import changes align with storage abstraction.

The addition of ProjectStorageRepository to the imports is consistent with the architectural changes for flexible storage selection.


51-51: LGTM: Service initialization correctly uses the new storage.

The storage is properly wrapped in Arc for thread-safe sharing across the service.


38-42: 🛠️ Refactor suggestion

Add feature selection documentation and improve error handling.

The storage initialization has several areas for improvement:

  1. Add documentation about available storage features and how to enable them.
  2. Handle the case when no storage feature is enabled.
  3. Consider making storage paths/bucket names configurable via environment variables.

Consider this implementation:

+ // Available storage features:
+ // - local-storage: Use local filesystem storage
+ // - gcs-storage: Use Google Cloud Storage (coming soon)
+ 
+ // Storage configuration from environment
+ let storage_path = std::env::var("STORAGE_PATH").unwrap_or_else(|_| "./local_storage".to_string());
+ 
+ #[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
+ compile_error!("At least one storage feature must be enabled");
+ 
 #[cfg(feature = "local-storage")]
-let storage = ProjectStorageRepository::new("./local_storage".into()).await?;
+let storage = ProjectStorageRepository::new(storage_path.into()).await?;
 
 // #[cfg(feature = "gcs-storage")]
 // let storage = ProjectStorageRepository::new("your-gcs-bucket".to_string()).await?;

Let's verify the feature configuration:

websocket/crates/infra/src/persistence/gcs/gcs_client.rs (1)

1-21: LGTM! Well-organized imports with proper feature gates

The imports are properly organized and feature-gated, ensuring clean conditional compilation.

websocket/crates/infra/src/persistence/project_repository.rs (1)

Line range hint 34-50: LGTM! Error handling is comprehensive.

The error handling implementation properly:

  • Uses feature flags for storage-specific errors
  • Implements proper error conversion and propagation
  • Includes descriptive error messages

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (6)
websocket/app/src/state.rs (1)

Direct usage of concrete storage types detected

The storage features are not fully abstracted. Direct references to ProjectLocalRepository and ProjectGcsRepository were found in multiple files, including websocket/app/src/state.rs.

🔗 Analysis chain

Line range hint 1-95: Verify storage feature compatibility

Let's verify that the storage features are properly configured in the project's dependencies and that no direct usage of concrete types exists.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check Cargo.toml configurations
echo "Checking Cargo.toml configurations..."
rg -l 'features.*=.*\["(gcs|local)-storage"\]' 

# Check for direct usage of concrete types
echo "Checking for direct usage of concrete types..."
rg -l 'ProjectLocalRepository|ProjectGcsRepository' --type rust

# Check for proper feature declarations
echo "Checking for proper feature declarations..."
rg -l '^\[features\]' 

Length of output: 822

websocket/crates/services/examples/edit_session_service.rs (2)

41-44: Consider moving feature logging to storage initialization.

The feature logging would be more useful if moved closer to the storage initialization logic, making it clearer which storage backend is being used when the storage is actually created.

-    #[cfg(feature = "local-storage")]
-    debug!("local-storage feature is enabled");
-    #[cfg(feature = "gcs-storage")]
-    debug!("gcs-storage feature is enabled");

     // Initialize storage
     #[cfg(feature = "local-storage")]
+    debug!("Initializing local storage backend");
     let storage = ProjectLocalRepository::new("./local_storage".into()).await?;
     #[cfg(feature = "gcs-storage")]
+    debug!("Initializing GCS storage backend");
     let storage = ProjectGcsRepository::new("your-gcs-bucket".to_string()).await?;

23-23: Document environment variables in the example header.

Update the example documentation to include all required environment variables.

-///export REDIS_URL="redis://default:my_redis_password@localhost:6379/0"
-///RUST_LOG=debug cargo run --example edit_session_service
+/// Required environment variables:
+/// - REDIS_URL: Redis connection string (default: "redis://localhost:6379/0")
+/// - LOCAL_STORAGE_PATH: Path for local storage (when using local-storage feature)
+/// - GCS_BUCKET_NAME: GCS bucket name (required when using gcs-storage feature)
+/// - RUST_LOG: Log level (recommended: debug)
+///
+/// Example usage:
+/// ```bash
+/// export REDIS_URL="redis://default:my_redis_password@localhost:6379/0"
+/// export LOCAL_STORAGE_PATH="./local_storage"
+/// export GCS_BUCKET_NAME="my-bucket"
+/// RUST_LOG=debug cargo run --example edit_session_service
+/// ```
websocket/crates/infra/src/persistence/project_repository.rs (3)

Line range hint 26-43: Add documentation for error variants

Consider adding documentation comments for each error variant to explain when they occur and how to handle them. This will help developers understand and handle errors appropriately.

Example:

 #[derive(Error, Debug)]
 pub enum ProjectRepositoryError {
+    /// Errors from Google Cloud Storage operations
     #[cfg(feature = "gcs-storage")]
     #[error(transparent)]
     Gcs(#[from] GcsError),
 
+    /// Errors from local filesystem storage operations
     #[cfg(feature = "local-storage")]
     #[error(transparent)]
     Local(#[from] LocalStorageError),

160-163: Extract path construction to a private method

The path construction logic format!("snapshot/{}", ...) is duplicated across multiple methods. Consider extracting it to a private method for better maintainability and consistency.

 impl ProjectGcsRepository {
+    fn get_snapshot_path(&self, project_id: &str) -> String {
+        format!("snapshot/{}", project_id)
+    }
+
     async fn create_snapshot(&self, snapshot: ProjectSnapshot) -> Result<(), Self::Error> {
-        let path = format!("snapshot/{}", snapshot.metadata.project_id);
+        let path = self.get_snapshot_path(&snapshot.metadata.project_id);
         self.client.upload_versioned(path, &snapshot).await?;
         Ok(())
     }
     // Apply similar changes to other methods
 }

Also applies to: 166-173, 175-191, 193-197, 199-206


139-209: Consider adding monitoring and logging

The storage implementations would benefit from:

  1. Metrics for operation latency and error rates
  2. Structured logging for debugging and auditing
  3. Health checks to verify storage connectivity

This will help with operational monitoring and troubleshooting.

Also applies to: 210-272
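
On the logging point, a sketch of what instrumenting a storage call could look like, assuming the tracing crate is available (the debug! calls in the examples suggest it is); the wrapped function and its types are stand-ins for the real snapshot operations.

use std::time::Instant;
use tracing::{info, instrument, warn};

// Illustrative error and snapshot types standing in for the real ones.
#[derive(Debug)]
struct StorageError(String);
struct ProjectSnapshot {
    project_id: String,
}

// Wraps a storage call with structured logging and a coarse latency metric.
// `skip(snapshot)` keeps potentially large payloads out of the span.
#[instrument(skip(snapshot), fields(project_id = %snapshot.project_id))]
async fn create_snapshot_logged(snapshot: &ProjectSnapshot) -> Result<(), StorageError> {
    let started = Instant::now();
    let result = create_snapshot_inner(snapshot).await;
    let elapsed_ms = started.elapsed().as_millis() as u64;
    match &result {
        Ok(()) => info!(elapsed_ms, "snapshot stored"),
        Err(err) => warn!(elapsed_ms, error = %err.0, "snapshot store failed"),
    }
    result
}

async fn create_snapshot_inner(_snapshot: &ProjectSnapshot) -> Result<(), StorageError> {
    // The real implementation would delegate to the selected storage backend.
    Ok(())
}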

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0ee111f and a4b7c52.

⛔ Files ignored due to path filters (1)
  • websocket/Cargo.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (8)
  • websocket/app/Cargo.toml (2 hunks)
  • websocket/app/src/state.rs (4 hunks)
  • websocket/crates/infra/Cargo.toml (1 hunks)
  • websocket/crates/infra/src/persistence/mod.rs (1 hunks)
  • websocket/crates/infra/src/persistence/project_repository.rs (4 hunks)
  • websocket/crates/services/Cargo.toml (2 hunks)
  • websocket/crates/services/examples/edit_session_service.rs (4 hunks)
  • websocket/crates/services/examples/project_service.rs (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • websocket/crates/infra/src/persistence/mod.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • websocket/app/Cargo.toml
  • websocket/crates/infra/Cargo.toml
  • websocket/crates/services/Cargo.toml
  • websocket/crates/services/examples/project_service.rs
🔇 Additional comments (5)
websocket/app/src/state.rs (4)

24-24: LGTM! Clean abstraction of storage types

The type and field updates properly reflect the new storage abstraction while maintaining a clean interface.

Also applies to: 32-32


61-61: LGTM! Proper service initialization

The service initialization correctly uses the new storage abstraction.


77-77: LGTM! Consistent state initialization

The AppState initialization properly reflects the storage field update.


49-53: ⚠️ Potential issue

Previous configuration concerns still apply

The current implementation still uses hard-coded values and lacks proper configuration handling.

Additional GCS-specific concerns:

  1. The GCS bucket name should be configurable via environment variables
  2. GCS credentials need to be properly configured
  3. Error handling should be specific to GCS errors

Add proper GCS configuration:

#[cfg(feature = "gcs-storage")]
let storage = {
    let bucket = std::env::var("GCS_BUCKET")
        .map_err(|_| WsError::Configuration("GCS_BUCKET not set".into()))?;
    let credentials = std::env::var("GOOGLE_APPLICATION_CREDENTIALS")
        .map_err(|_| WsError::Configuration("GOOGLE_APPLICATION_CREDENTIALS not set".into()))?;
    Arc::new(ProjectStorageRepository::new(bucket).await?)
};
websocket/crates/services/examples/edit_session_service.rs (1)

60-60: LGTM: Service initialization looks correct.

The storage is properly wrapped in an Arc and passed to the service.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
websocket/crates/infra/src/persistence/project_repository.rs (1)

Line range hint 26-42: Consider adding a dedicated error variant for validation failures

The error enum could benefit from a specific variant for validation failures, making it clearer when errors are due to invalid input rather than storage issues.

 #[derive(Error, Debug)]
 pub enum ProjectRepositoryError {
+    #[error("Validation error: {0}")]
+    Validation(String),
     #[cfg(feature = "gcs-storage")]
     #[error(transparent)]
     Gcs(#[from] GcsError),
     // ... other variants
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a4b7c52 and 395c18d.

⛔ Files ignored due to path filters (1)
  • websocket/Cargo.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (8)
  • websocket/app/Cargo.toml (2 hunks)
  • websocket/app/src/state.rs (4 hunks)
  • websocket/crates/infra/Cargo.toml (1 hunks)
  • websocket/crates/infra/src/persistence/mod.rs (1 hunks)
  • websocket/crates/infra/src/persistence/project_repository.rs (4 hunks)
  • websocket/crates/services/Cargo.toml (2 hunks)
  • websocket/crates/services/examples/edit_session_service.rs (4 hunks)
  • websocket/crates/services/examples/project_service.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • websocket/app/Cargo.toml
  • websocket/app/src/state.rs
  • websocket/crates/infra/Cargo.toml
  • websocket/crates/infra/src/persistence/mod.rs
  • websocket/crates/services/Cargo.toml
  • websocket/crates/services/examples/edit_session_service.rs
  • websocket/crates/services/examples/project_service.rs
🔇 Additional comments (3)
websocket/crates/infra/src/persistence/project_repository.rs (3)

219-224: Add path validation in constructor

The constructor should validate the base path and ensure it exists with proper permissions.
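
A sketch of that validation using tokio::fs; the error variant and the trimmed-down struct here are illustrative rather than the crate's real types.

use std::path::PathBuf;
use tokio::fs;

#[derive(Debug, thiserror::Error)]
enum LocalRepositoryError {
    #[error("invalid base path {0:?}: {1}")]
    InvalidBasePath(PathBuf, String),
    #[error(transparent)]
    Io(#[from] std::io::Error),
}

struct ProjectLocalRepository {
    base_path: PathBuf,
}

impl ProjectLocalRepository {
    // Creates the base directory if needed and rejects paths that exist but
    // are not directories, so later snapshot writes fail early and clearly.
    async fn new(base_path: PathBuf) -> Result<Self, LocalRepositoryError> {
        match fs::metadata(&base_path).await {
            Ok(meta) if !meta.is_dir() => {
                return Err(LocalRepositoryError::InvalidBasePath(
                    base_path,
                    "exists but is not a directory".into(),
                ))
            }
            Ok(_) => {}
            Err(_) => fs::create_dir_all(&base_path).await?,
        }
        Ok(Self { base_path })
    }
}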


231-235: 🛠️ Refactor suggestion

Fix inconsistent path prefixes and add validation

The local storage uses "snapshots/" while GCS uses "snapshot/" as the path prefix. This inconsistency could cause issues when migrating between storage backends.

Additionally, consider centralizing the path construction and validation:

 impl ProjectLocalRepository {
+    const SNAPSHOT_PREFIX: &'static str = "snapshot/";  // Changed to match GCS
+
+    fn validate_and_build_path(&self, project_id: &str) -> Result<String, ProjectRepositoryError> {
+        if project_id.is_empty() || project_id.contains('/') {
+            return Err(ProjectRepositoryError::Validation(
+                "Invalid project_id format".to_string(),
+            ));
+        }
+        Ok(format!("{}{}", Self::SNAPSHOT_PREFIX, project_id))
+    }
 }

Also applies to: 241-247, 253-256, 259-262, 268-271


139-271: Add test coverage for storage implementations

Both GCS and local storage implementations lack test coverage for critical functionality.
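
As a starting point for the local backend, a minimal test sketch; it assumes the crate has tokio's test macro available and that ProjectLocalRepository::new takes a path the way the examples above call it.

#[cfg(all(test, feature = "local-storage"))]
mod local_storage_tests {
    use super::*;

    #[tokio::test]
    async fn new_initializes_repository_for_a_fresh_directory() {
        // Keep test data under target/ so it never pollutes the workspace.
        let storage = ProjectLocalRepository::new("./target/test_local_storage".into())
            .await
            .expect("local repository should initialize");

        // Once snapshot fixtures exist, round-trip one here and compare what
        // comes back with what was stored.
        let _ = storage;
    }
}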

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (8)
websocket/app/src/errors.rs (1)

31-32: LGTM! Well-structured error handling for GCS storage.

The new GcsStorage variant follows Rust error handling best practices with proper use of transparent and from attributes, maintaining consistency with other variants.

Since this is part of a storage selection feature, ensure that this error variant is properly guarded by the gcs-storage feature flag mentioned in the summary. You might want to use:

#[cfg(feature = "gcs-storage")]
#[error(transparent)]
GcsStorage(#[from] GcsError),

This would prevent the variant from being compiled when GCS storage is not enabled.

websocket/crates/infra/src/persistence/gcs/gcs_client.rs (7)

1-15: Simplify conditional imports by grouping under a single #[cfg] attribute

To enhance readability and reduce repetition, consider grouping the conditional imports under a single #[cfg(feature = "gcs-storage")] block.

Apply this change:

#[cfg(feature = "gcs-storage")]
use {
    async_trait::async_trait,
    chrono::{DateTime, Utc},
    google_cloud_storage::client::{Client, ClientConfig},
    google_cloud_storage::http::objects::delete::DeleteObjectRequest,
    google_cloud_storage::http::objects::download::Range,
    google_cloud_storage::http::objects::get::GetObjectRequest,
    google_cloud_storage::http::objects::upload::{Media, UploadObjectRequest, UploadType},
};

Line range hint 55-70: Specify the object name in UploadObjectRequest for correct uploads

In the upload method, the UploadObjectRequest lacks the object field, which specifies the object's name in the bucket. Omitting this field may result in uploads failing or being stored with incorrect names.

Include the object field to ensure the object is correctly named in GCS:

let _uploaded = self
    .client
    .upload_object(
        &UploadObjectRequest {
            bucket: self.bucket.clone(),
+           object: path.clone(),
            ..Default::default()
        },
        bytes,
        &upload_type,
    )
    .await?;

Line range hint 55-70: Use serde_json::to_vec and from_slice to optimize serialization

To improve performance and avoid unnecessary string conversions, use serde_json::to_vec when serializing data and serde_json::from_slice when deserializing. This eliminates the intermediate String conversion, working directly with bytes.

Apply the following changes to the upload method:

async fn upload<T: Serialize + Send + Sync + 'static>(
    &self,
    path: String,
    data: &T,
) -> Result<(), GcsError> {
    let upload_type = UploadType::Simple(Media::new(path.clone()));
-   let bytes = serde_json::to_string(data)?;
+   let bytes = serde_json::to_vec(data)?;
    let _uploaded = self
        .client
        .upload_object(
            &UploadObjectRequest {
                bucket: self.bucket.clone(),
+               object: path,
                ..Default::default()
            },
            bytes,
            &upload_type,
        )
        .await?;
    Ok(())
}

And to the download method:

async fn download<T: for<'de> Deserialize<'de> + Send + 'static>(
    &self,
    path: String,
) -> Result<T, GcsError> {
    let bytes = self
        .client
        .download_object(
            &GetObjectRequest {
                bucket: self.bucket.clone(),
                object: path,
                ..Default::default()
            },
            &Range::default(),
        )
        .await?;
-   let src = String::from_utf8(bytes)?;
-   let data = serde_json::from_str(&src)?;
+   let data = serde_json::from_slice(&bytes)?;
    Ok(data)
}

Line range hint 89-121: Address potential race conditions in upload_versioned method

The upload_versioned method reads, modifies, and writes the VersionMetadata. Concurrent executions of this method could lead to race conditions, causing metadata inconsistency or loss of version history.

Implement concurrency control to ensure atomic updates to the metadata. Consider using GCS object versioning with precondition checks (ifGenerationMatch) to prevent overwriting concurrent updates. Here's how you might adjust the metadata update:

// Retrieve the current metadata along with its generation number
let (metadata, generation) = match self.download_with_generation::<VersionMetadata>(metadata_path.clone()).await {
    Ok(data) => data,
    Err(_) => (VersionMetadata { /* ... */ }, 0),
};

// Update metadata...

// When uploading, specify the generation number to ensure the metadata hasn't changed since read
self.upload_with_generation(metadata_path, &metadata, generation).await?;

Implementing methods like download_with_generation and upload_with_generation will allow you to handle the conditional requests properly.


Line range hint 89-121: Remove the redundant length check around version history pruning

In upload_versioned, the outer if duplicates the condition that the while loop below it already enforces, so it adds noise without changing behavior; the loop alone is enough to cap the history at 100 versions.

Apply this change:

// Limit version history to last 100 versions
-if metadata.version_history.len() > 100 {
     while metadata.version_history.len() > 100 {
         if let Some(oldest) = metadata.version_history.keys().next().cloned() {
             metadata.version_history.remove(&oldest);
         } else {
             break;
         }
     }
-}

Line range hint 233-239: Handle authentication errors explicitly in GcsClient::new

When creating a new GcsClient, authentication errors might occur. Providing a more descriptive error message or handling could improve debugging and usability.

Consider wrapping the authentication error with a custom message:

pub async fn new(bucket: String) -> Result<Self, GcsError> {
    let config = ClientConfig::default()
-       .with_auth()
-       .await?;
+       .with_auth()
+       .await
+       .map_err(|e| GcsError::Auth(e))?;
    let client = Client::new(config);
    Ok(GcsClient { client, bucket })
}

244-252: Remove unnecessary #[cfg] attributes in test module

The entire test module is already conditionally compiled with #[cfg(test)]. Additional #[cfg(feature = "gcs-storage")] attributes within the test module may be redundant and reduce code readability.

Consider removing the inner #[cfg(feature = "gcs-storage")] attributes if the tests are only relevant when the gcs-storage feature is enabled. Alternatively, wrap the entire test module with both attributes:

#[cfg(all(test, feature = "gcs-storage"))]
mod tests {
    // Test code here...
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 395c18d and 745b6de.

📒 Files selected for processing (5)
  • websocket/Cargo.toml (1 hunks)
  • websocket/app/Cargo.toml (2 hunks)
  • websocket/app/src/errors.rs (2 hunks)
  • websocket/app/src/state.rs (4 hunks)
  • websocket/crates/infra/src/persistence/gcs/gcs_client.rs (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • websocket/app/Cargo.toml
  • websocket/app/src/state.rs
🔇 Additional comments (2)
websocket/app/src/errors.rs (1)

1-3: LGTM! Clean import organization.

The import statement is well-organized using nested paths, maintaining good code organization.

websocket/Cargo.toml (1)

36-37: LGTM! Verify build configuration across features.

The changes to disable default features align well with the PR objective of enhancing storage flexibility. This modification allows for more granular control over which storage features are included in the final build.

Let's verify the build configuration and feature dependencies:

✅ Verification successful

LGTM! Build configurations confirmed to handle feature flags appropriately.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check feature configurations and dependencies across workspace

# Check features defined in app/Cargo.toml
echo "Checking app features configuration..."
rg -A 10 '^\[features\]' app/Cargo.toml

# Check for any feature-gated code in infra crate
echo "Checking feature usage in infra crate..."
rg -g '*.rs' '#\[cfg\(feature' crates/infra/

# Verify no direct dependencies on storage implementations without feature flags
echo "Checking for unguarded storage dependencies..."
rg -g 'Cargo.toml' 'google-cloud-storage|local-storage' crates/

Length of output: 590


Script:

#!/bin/bash
# Description: Locate all Cargo.toml files and inspect feature configurations

echo "Searching for all Cargo.toml files in the repository..."
fd 'Cargo.toml' . -x echo "Found Cargo.toml at: {}"

echo "Inspecting feature configurations in all Cargo.toml files..."
rg -H '^\[features\]' **/Cargo.toml

echo "Checking for feature-gated storage dependencies in all crates..."
rg -g 'Cargo.toml' 'google-cloud-storage|local-storage' .

Length of output: 2549

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (11)
websocket/crates/services/examples/project_service.rs (2)

22-22: Add example command for gcs-storage feature

For completeness, consider adding the command to run with gcs-storage feature as well:

///RUST_LOG=debug cargo run --example project_service
/// RUST_LOG=debug cargo run --example project_service  --features local-storage
+/// RUST_LOG=debug cargo run --example project_service  --features gcs-storage

39-45: Add compile-time check for storage feature selection

Consider adding a compile-time check to ensure at least one storage feature is enabled:

+    #[cfg(not(any(feature = "local-storage", feature = "gcs-storage")))]
+    compile_error!("At least one storage feature must be enabled: 'local-storage' or 'gcs-storage'");
+
     #[cfg(feature = "local-storage")]
     #[allow(unused_variables)]
     let storage = ProjectLocalRepository::new("./local_storage".into()).await?;
websocket/crates/infra/src/persistence/gcs/gcs_client.rs (7)

1-21: Consider grouping related imports

The imports are correctly feature-gated but could be organized better by grouping:

  1. External crate imports (async_trait, chrono)
  2. Google Cloud Storage related imports
  3. Standard library imports
  4. Local crate imports
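
A rough sketch of that grouping; the concrete paths below are assumptions based on what is visible in this file, not the full import list:

// 1. External crates
use async_trait::async_trait;
use chrono::{DateTime, Utc};

// 2. Google Cloud Storage
use google_cloud_storage::client::{Client, ClientConfig};

// 3. Standard library
use std::collections::BTreeMap;

// 4. Local crate
// use crate::...;  (whatever local items this module actually needs)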

Line range hint 23-34: Consider adding more specific error variants

The error handling could be enhanced by adding specific variants for:

  • NotFound - when bucket or object doesn't exist
  • VersionMetadataError - for version metadata operations failures
  • InvalidPath - for malformed object paths

This would provide better error context to consumers of this client.
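
A minimal sketch of what those variants could look like, assuming the enum keeps using thiserror as elsewhere in this crate; the names and payloads are illustrative, not the actual definition:

use thiserror::Error;

#[derive(Error, Debug)]
pub enum GcsError {
    #[error("object or bucket not found: {0}")]
    NotFound(String),
    #[error("failed to read or write version metadata: {0}")]
    VersionMetadata(String),
    #[error("invalid object path: {0}")]
    InvalidPath(String),
    // ...existing variants such as Http and Auth would stay here...
}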


43-48: Enhance version metadata structure documentation and fields

Consider:

  1. Adding documentation explaining the versioning strategy
  2. Including additional metadata fields like:
    • creation_time
    • last_modified
    • content_type
    • size
  3. Adding derive Debug for better debugging support

This would make the versioning system more maintainable and feature-rich.
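
A sketch of a richer metadata entry, assuming serde and chrono (with its serde feature) are available in this crate; the field names are suggestions, not the current definition:

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

/// One entry in an object's version history.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ObjectVersion {
    /// Version identifier (e.g. a UUID or a monotonically increasing counter).
    pub version_id: String,
    /// GCS object path that stores this version's payload.
    pub object_path: String,
    /// When this version was first written.
    pub created_at: DateTime<Utc>,
    /// When the payload was last overwritten, if ever.
    pub last_modified: DateTime<Utc>,
    /// MIME type of the stored payload.
    pub content_type: String,
    /// Payload size in bytes.
    pub size: u64,
}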


Line range hint 50-231: Address potential race conditions in version metadata updates

The version metadata updates are not atomic, which could lead to race conditions in concurrent scenarios. Consider:

  1. Using optimistic locking with a version counter
  2. Implementing retry logic for concurrent updates
  3. Adding transaction-like semantics for metadata operations
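
One way to express the optimistic-locking idea: carry the object's GCS generation number alongside the metadata and only write back when it is unchanged. Everything named below — read_metadata_with_generation, write_metadata_if_generation_matches, is_precondition_failure, and the VersionMetadata type — is an illustrative helper, not an existing API on GcsClient:

const MAX_CAS_RETRIES: usize = 5;

async fn update_metadata_with_retry<F>(
    client: &GcsClient,
    path: &str,
    mut apply: F,
) -> Result<(), GcsError>
where
    F: FnMut(&mut VersionMetadata),
{
    let mut last_err = None;
    for _ in 0..MAX_CAS_RETRIES {
        // Read the current metadata together with its generation number.
        let (mut metadata, generation) = client.read_metadata_with_generation(path).await?;
        apply(&mut metadata);
        // Write back only if the generation is unchanged (if-generation-match);
        // a precondition failure means another writer got there first, so retry.
        match client.write_metadata_if_generation_matches(path, &metadata, generation).await {
            Ok(()) => return Ok(()),
            Err(e) if e.is_precondition_failure() => last_err = Some(e),
            Err(e) => return Err(e),
        }
    }
    Err(last_err.expect("at least one attempt was made"))
}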

Line range hint 114-146: Improve version management strategy

The current implementation has several limitations:

  1. No cleanup of old version files when they're removed from version history
  2. Fixed limit of 100 versions might not be suitable for all use cases
  3. Linear scanning of version history could be inefficient

Consider:

  1. Implementing garbage collection for old versions
  2. Making version limit configurable
  3. Using a more efficient data structure for version lookup
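
A sketch of making the limit configurable and pruning in one place; the 100-version default and the map-keyed history mirror what the current code appears to do, so treat the exact types as assumptions:

use std::collections::BTreeMap;

pub struct VersioningConfig {
    /// Maximum number of versions to keep per object; None means unbounded.
    pub max_versions: Option<usize>,
}

impl Default for VersioningConfig {
    fn default() -> Self {
        Self { max_versions: Some(100) }
    }
}

/// Drops the oldest entries until the history fits the configured limit,
/// returning the removed keys so their payload objects can be garbage-collected.
fn prune_history<V>(history: &mut BTreeMap<String, V>, config: &VersioningConfig) -> Vec<String> {
    let mut removed = Vec::new();
    if let Some(limit) = config.max_versions {
        while history.len() > limit {
            if let Some(oldest) = history.keys().next().cloned() {
                history.remove(&oldest);
                removed.push(oldest);
            } else {
                break;
            }
        }
    }
    removed
}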

Line range hint 233-241: Enhance client configuration and reliability

Consider adding:

  1. Configurable retry policy for transient failures
  2. Timeout settings
  3. Connection pooling configuration
  4. Custom endpoint support for testing

This would make the client more robust and configurable for different environments.
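
A sketch of an options struct the constructor could accept; none of these knobs exist on GcsClient today, and whether they map onto google-cloud-storage's ClientConfig or onto a retry wrapper of our own is an open design question:

use std::time::Duration;

#[derive(Debug, Clone)]
pub struct GcsClientOptions {
    /// Maximum number of attempts for transient failures.
    pub max_retries: usize,
    /// Per-request timeout applied by our own wrapper.
    pub request_timeout: Duration,
    /// Override the GCS endpoint, e.g. to point at a local emulator in tests.
    pub endpoint: Option<String>,
}

impl Default for GcsClientOptions {
    fn default() -> Self {
        Self {
            max_retries: 3,
            request_timeout: Duration::from_secs(30),
            endpoint: None,
        }
    }
}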


Line range hint 242-324: Enhance test coverage

The current tests are basic. Consider adding:

  1. Error case tests
  2. Integration tests with GCS emulator
  3. Concurrent operation tests
  4. Edge cases (empty paths, large objects)
  5. Version management tests

Example test for error cases:

#[cfg(feature = "gcs-storage")]
#[tokio::test]
async fn test_upload_versioned_error() {
    let mut mock = MockGcsClientMock::new();
    mock.expect_upload_versioned()
        .with(eq("test_path".to_string()), always())
        .returning(|_: String, _: &String| {
            Err(GcsError::Http(google_cloud_storage::http::Error::InvalidRequest))
        });

    let result = mock
        .upload_versioned("test_path".to_string(), &"test_data".to_string())
        .await;
    assert!(result.is_err());
}
websocket/crates/infra/src/persistence/redis/flow_project_redis_data_manager.rs (1)

298-300: Enhance error logging for merge failures.

While the error is properly propagated, adding more detailed error logging would help with debugging merge failures in production.

Consider adding more detailed error logging:

-        let _ = self.execute_merge_updates(project_id).await?;
-        debug!("Updates merged after push");
+        match self.execute_merge_updates(project_id).await {
+            Ok(_) => debug!("Updates merged successfully after push"),
+            Err(e) => {
+                debug!(
+                    error = %e,
+                    project_id = %project_id,
+                    "Failed to merge updates after push"
+                );
+                return Err(e);
+            }
+        };
websocket/crates/infra/src/persistence/project_repository.rs (1)

Line range hint 26-42: Consider adding a dedicated validation error variant

The error handling is comprehensive, but consider adding a dedicated Validation variant for input validation errors (e.g., invalid project IDs, paths) instead of using Custom.

 #[derive(Error, Debug)]
 pub enum ProjectRepositoryError {
+    #[error("Validation error: {0}")]
+    Validation(String),
     #[cfg(feature = "gcs-storage")]
     #[error(transparent)]
     Gcs(#[from] GcsError),
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 745b6de and a88e787.

⛔ Files ignored due to path filters (2)
  • websocket/.DS_Store is excluded by !**/.DS_Store
  • websocket/Cargo.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (13)
  • websocket/.gitignore (1 hunks)
  • websocket/Cargo.toml (1 hunks)
  • websocket/app/Cargo.toml (2 hunks)
  • websocket/app/src/errors.rs (2 hunks)
  • websocket/app/src/state.rs (4 hunks)
  • websocket/crates/infra/Cargo.toml (1 hunks)
  • websocket/crates/infra/src/persistence/gcs/gcs_client.rs (8 hunks)
  • websocket/crates/infra/src/persistence/mod.rs (1 hunks)
  • websocket/crates/infra/src/persistence/project_repository.rs (4 hunks)
  • websocket/crates/infra/src/persistence/redis/flow_project_redis_data_manager.rs (1 hunks)
  • websocket/crates/services/Cargo.toml (2 hunks)
  • websocket/crates/services/examples/edit_session_service.rs (4 hunks)
  • websocket/crates/services/examples/project_service.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • websocket/.gitignore
  • websocket/Cargo.toml
  • websocket/app/Cargo.toml
  • websocket/app/src/errors.rs
  • websocket/app/src/state.rs
  • websocket/crates/infra/Cargo.toml
  • websocket/crates/infra/src/persistence/mod.rs
  • websocket/crates/services/Cargo.toml
  • websocket/crates/services/examples/edit_session_service.rs
🔇 Additional comments (9)
websocket/crates/services/examples/project_service.rs (3)

4-7: LGTM: Clean feature-gated imports

The conditional imports for storage repositories are well-organized and properly feature-gated.


53-53: LGTM: Proper Arc wrapping for storage

The storage is correctly wrapped in Arc for shared ownership when passed to the service.


39-45: 🛠️ Refactor suggestion

Use environment variable for GCS bucket name

The GCS bucket name is currently hardcoded. Consider using an environment variable for better flexibility across different environments:

-    let storage = ProjectGcsRepository::new("your-gcs-bucket".to_string()).await?;
+    let storage = ProjectGcsRepository::new(
+        std::env::var("GCS_BUCKET").unwrap_or_else(|_| "your-gcs-bucket".to_string())
+    ).await?;

Verify the need for #[allow(unused_variables)]

The presence of #[allow(unused_variables)] on both storage initializations suggests potential dead code.

✅ Verification successful

Remove unnecessary #[allow(unused_variables)] attributes

The storage variable is actively used in service initialization, making the #[allow(unused_variables)] attributes unnecessary. Removing them will help catch any truly unused variables.

  • File: websocket/crates/services/examples/project_service.rs
    • Lines: 39-45
  • File: websocket/crates/services/examples/edit_session_service.rs
    • Lines: 60-65
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the storage variable is actually used in the codebase
# Expected: The variable should be used in service initialization

# Search for direct usage of storage variable
rg -p "storage\s*[,)]" websocket/crates/services/examples/

# Search for ProjectService::new usage pattern
ast-grep --pattern 'ProjectService::new($$$)'

Length of output: 1204

websocket/crates/infra/src/persistence/redis/flow_project_redis_data_manager.rs (1)

298-300: Consider performance implications of immediate merging.

Merging updates immediately after a push is a good choice for data consistency. However, it could add latency for clients waiting for push confirmation.

Let's verify the performance impact:

websocket/crates/infra/src/persistence/project_repository.rs (5)

149-153: Add retry mechanism for cloud operations

The past review comment about adding a retry mechanism for cloud operations is still valid and should be addressed.
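
As a reminder of the shape that mechanism could take, here is a minimal exponential-backoff wrapper; it assumes tokio is available (it already is in this workspace) and leaves the "is this error transient?" decision to the caller:

use std::future::Future;
use std::time::Duration;

/// Retry an async operation with exponential backoff between attempts.
async fn with_retry<T, E, Fut, Op>(mut op: Op, max_attempts: usize) -> Result<T, E>
where
    Op: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(100);
    let mut attempt = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                attempt += 1;
                tokio::time::sleep(delay).await;
                delay *= 2; // simple exponential backoff
            }
        }
    }
}

Individual GCS calls could then be wrapped at their call sites (for example around a download or upload), though which operations are actually safe to retry still needs to be decided per operation.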


160-163: Centralize path construction and add validation

The past review comment about centralizing path construction and adding validation is still valid. The path construction is duplicated across methods and lacks proper validation.

Also applies to: 179-191
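
One possible shape for that shared helper, using the Validation variant suggested above; both the variant and the exact validation rules are assumptions:

fn snapshot_path(project_id: &str) -> Result<String, ProjectRepositoryError> {
    // Reject IDs that would escape the snapshots/ prefix or produce odd object keys.
    if project_id.is_empty()
        || project_id.contains('/')
        || project_id.contains("..")
    {
        return Err(ProjectRepositoryError::Validation(format!(
            "invalid project id: {project_id:?}"
        )));
    }
    Ok(format!("snapshots/{project_id}"))
}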


219-224: Add path validation in constructor

The past review comment about adding path validation in the constructor is still valid and should be addressed.


232-233: Fix inconsistent path prefixes between storage implementations

The past review comment about inconsistent path prefixes between storage implementations is still valid. The local storage uses "snapshots/" while GCS uses "snapshot/" as the path prefix.

Also applies to: 247-248
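
A small way to make the prefixes impossible to diverge is to share one constant (and ideally the path helper above) between both repository implementations, for example:

/// Prefix used for snapshot objects by both the local and GCS repositories.
pub(crate) const SNAPSHOT_PREFIX: &str = "snapshots";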


139-271: Verify usage of storage implementations

Let's verify how these storage implementations are being used in the codebase to ensure proper initialization and error handling.

#!/bin/bash
# Check usage of storage implementations
echo "=== GCS Storage Usage ==="
rg "ProjectGcsRepository::new|GcsClient::new" -A 5

echo -e "\n=== Local Storage Usage ==="
rg "ProjectLocalRepository::new|LocalClient::new" -A 5

let path = format!("snapshots/{}", project_id);
let versions = self.client.list_versions(&path, None).await?;
Ok(versions.iter().map(|(_, v)| v.clone()).collect())
}
}
}
Copy link
Contributor


💡 Codebase verification

Test coverage is still commented out

  • websocket/crates/infra/src/persistence/project_repository.rs
  • engine/runtime/storage/src/storage.rs
🔗 Analysis chain

Uncomment and update test coverage

The tests for these storage implementations are currently commented out. Let's verify their status:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for test coverage of storage implementations
echo "=== Test Files ==="
fd -e rs test | grep -i "storage\|repository"

echo -e "\n=== Test Coverage ==="
rg "mod tests" -A 5 $(fd -e rs | grep -i "storage\|repository")

Length of output: 3850


Script:

#!/bin/bash
# Check for test coverage of storage implementations

echo "=== Test Files ==="
fd -e rs test | grep -i 'storage\|repository'

echo -e "\n=== Test Coverage ==="
rg "mod tests" -A 5 $(fd -e rs | grep -i 'storage\|repository')

Length of output: 1440

@pyshx pyshx merged commit 62d02a5 into main Nov 13, 2024
17 checks passed
@pyshx pyshx deleted the feat/storage branch November 13, 2024 03:48