feat(ci,server,worker): implement schemata and items copy #1331
base: main
Conversation
Walkthrough
This pull request introduces a comprehensive feature for copying models in the ReEarth CMS system. The changes span multiple components, including server-side API endpoints, worker infrastructure, and testing suites. A new workflow for building a copier Docker image has been added, and the system now supports copying models with their associated schemas and items. The implementation includes modifications to repositories, use cases, and integration API specifications to enable this functionality across different layers of the application.
Sequence Diagram
sequenceDiagram
participant Client
participant ModelController
participant ModelInteractor
participant ModelRepository
participant TaskRunner
participant CopierWorker
Client->>ModelController: POST /models/{modelId}/copy
ModelController->>ModelInteractor: Copy model
ModelInteractor->>ModelRepository: Create copied model
ModelInteractor->>ModelRepository: Copy schema
ModelInteractor->>ModelRepository: Copy items
ModelInteractor->>TaskRunner: Trigger copy event
TaskRunner->>CopierWorker: Execute copy task
CopierWorker->>ModelRepository: Copy documents
CopierWorker-->>TaskRunner: Task completed
ModelInteractor-->>ModelController: Return copied model
ModelController-->>Client: 200 OK with copied model
✅ Deploy Preview for reearth-cms canceled.
Actionable comments posted: 20
🧹 Nitpick comments (32)
worker/internal/app/main.go (2)
29-30: Combine comments for clarity or remove extraneous comment.
The comment “// repos and gateways” adds context, but consider grouping lines 29-30 under a single explanatory comment if more logic is added in the future.
42-42: Use a descriptive field name in ServerConfig if “Repos” might expand.
Currently, “Repos” is clear enough, but if different repository sets are introduced down the line, consider distinguishing them or grouping them in nested fields.

worker/internal/app/repo.go (1)
19-19: Confirm naming convention for init functions.
“initReposAndGateways” is descriptive. Ensure it’s not confused with Go’s built-in init functions.

worker/internal/infrastructure/mongo/copier.go (3)
44-60: Index listing logic might become expensive for large collections.
Listing and decoding each index can be costly if this function runs frequently. Consider caching or a more direct approach if frequent re-invocation is expected.
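If caching is adopted, a minimal sketch could look like this (the cache and helper names are illustrative, not from the PR):

import (
	"context"
	"sync"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

var (
	indexCache   = map[string][]bson.M{} // collection name -> decoded index specs
	indexCacheMu sync.Mutex
)

// cachedIndexes lists a collection's indexes once and serves later calls
// from memory, avoiding repeated List/decode round-trips.
func cachedIndexes(ctx context.Context, coll *mongo.Collection) ([]bson.M, error) {
	indexCacheMu.Lock()
	defer indexCacheMu.Unlock()
	if specs, ok := indexCache[coll.Name()]; ok {
		return specs, nil
	}
	cur, err := coll.Indexes().List(ctx)
	if err != nil {
		return nil, err
	}
	var specs []bson.M
	if err := cur.All(ctx, &specs); err != nil {
		return nil, err
	}
	indexCache[coll.Name()] = specs
	return specs, nil
}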
61-73: Index creation for expiring documents.
SetExpireAfterSeconds suggests these documents expire after 24 hours. Verify that this meets business requirements or consider making it configurable.
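For reference, a TTL index of this kind is created with the Go mongo-driver roughly as follows (a sketch; the createdAt field name and the injected expireAfter value are assumptions):

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// ensureTTLIndex makes documents expire expireAfter seconds after createdAt.
// Passing the value in from configuration addresses the review concern.
func ensureTTLIndex(ctx context.Context, coll *mongo.Collection, expireAfter int32) error {
	_, err := coll.Indexes().CreateOne(ctx, mongo.IndexModel{
		Keys:    bson.D{{Key: "createdAt", Value: 1}},
		Options: options.Index().SetExpireAfterSeconds(expireAfter),
	})
	return err
}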
86-93: Check memory usage with large result sets.
Using Find and reading all documents could be memory-intensive if the filter is broad. Consider streaming or chunk-based copying if applicable.
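A chunked variant could stream the cursor and insert fixed-size batches, bounding memory regardless of how broad the filter is (a sketch; batch size and collection wiring are assumptions):

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

// copyInBatches streams matching documents from src and inserts them into
// dst in batches of batchSize instead of loading the full result set.
func copyInBatches(ctx context.Context, src, dst *mongo.Collection, filter bson.M, batchSize int) error {
	cur, err := src.Find(ctx, filter)
	if err != nil {
		return err
	}
	defer cur.Close(ctx)

	batch := make([]any, 0, batchSize)
	for cur.Next(ctx) {
		var doc bson.M
		if err := cur.Decode(&doc); err != nil {
			return err
		}
		batch = append(batch, doc)
		if len(batch) == batchSize {
			if _, err := dst.InsertMany(ctx, batch); err != nil {
				return err
			}
			batch = batch[:0]
		}
	}
	if len(batch) > 0 {
		if _, err := dst.InsertMany(ctx, batch); err != nil {
			return err
		}
	}
	return cur.Err()
}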
server/internal/usecase/interactor/model.go (4)
5-5: Import of encoding/json.
Appropriate for copy logic. Ensure that repeated JSON marshalling/unmarshalling does not hamper performance in tight loops.
7-7: Ensure fmt usage is minimal in production code.
We do see usage in triggerCopyEvent. Evaluate if structured logging might be more consistent in some scenarios.
390-413: Metadata schema creation.
The logic to clone the old meta schema is well structured and updates the new model’s metadata pointer. Confirm no references to the old model’s ID remain in the new schema.
440-458: Asynchronous copy triggered.
The gateway’s TaskRunner is used to queue a copy event. Logging is minimal, but sufficient. If copy processes are large, consider adding monitoring or progress logs.
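A sketch of that instrumentation (the TaskRunner.Run call mirrors the mock expectations in model_test.go; the Payload() wrapper and the log calls are assumptions):

// triggerCopyEvent with basic observability: log before queueing and wrap
// the error with the collection name so failed copy tasks are traceable.
func (i Model) triggerCopyEvent(ctx context.Context, collection, filter, changes string) error {
	payload := task.CopyPayload{Collection: collection, Filter: filter, Changes: changes}
	log.Infof("model: queueing copy task for collection %q", collection)
	if err := i.gateways.TaskRunner.Run(ctx, payload.Payload()); err != nil {
		return fmt.Errorf("copy task for %q failed to queue: %w", collection, err)
	}
	return nil
}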
server/e2e/integration_model_test.go (4)
60-62: Add concise test documentation.
A brief comment at the start of the test function about what scenarios it validates would help future maintainers quickly discern its purpose.
63-78: Increase negative test coverage.
In addition to unauthorized requests, consider adding test cases for invalid payloads or non-existent model IDs to cover edge scenarios (e.g., model ID is syntactically valid but does not exist).
79-86: Move test data retrieval closer to usage.
For clarity and maintainability, consider retrieving the old model ID and object just before it’s used, keeping the test logic more localized. This helps future maintainers see data context in a single glance.
107-140: Ensure metadata schema relationships remain consistent.
You correctly validate that the copied metadata schema differs from the original. However, for completeness, you might also ensure that any references (if any) are updated or resolved consistently in your system (e.g., if the metadata schema references yet another entity).

worker/internal/usecase/repo/copier.go (1)
1-9: Establish richly documented interface.
Add doc comments to clarify possible implementations of the “Copier” interface, explaining expectations regarding data integrity, performance constraints, and error handling.
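For instance, a documented version might read as follows (a sketch; the Copy signature is inferred from the interactor call site and may differ):

import "context"

// Copier duplicates documents within a collection.
//
// Implementations receive a JSON filter selecting the source documents and a
// set of changes to apply to each copy (for example, regenerated IDs). They
// should be safe to re-run, since a retried task must not produce duplicate
// copies, and must return errors rather than swallow them so the task runner
// can retry or alert.
type Copier interface {
	Copy(ctx context.Context, filter string, changes string) error
}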
worker/internal/usecase/interactor/copier.go (1)
7-9: Add contextual logging or tracing.
When calling “u.repos.Copier.Copy”, consider adding logs or distributed tracing to help diagnose potential data consistency issues or performance bottlenecks, especially as copy operations can be data-intensive.

worker/internal/usecase/interactor/usecase.go (1)
10-10: Rename field for clarity.
The field “repos” is clear in context, but to improve readability, you might consider a name like “repositories” or “repoContainer” to make the usage more self-descriptive.

worker/internal/adapter/http/copy.go (1)
19-22: Add documentation for CopyInput fields
The Filter and Changes fields lack documentation explaining their purpose, format, and expected values. This is important for maintainability and API understanding.

 type CopyInput struct {
+	// Filter specifies the MongoDB query filter in JSON format
 	Filter  string `json:"filter"`
+	// Changes specifies the modifications to be applied in JSON format
 	Changes string `json:"changes"`
 }

worker/internal/usecase/interactor/webhook.go (1)
Line range hint 13-19: Consider adding retry mechanism for concurrent webhook processing
The GetAndSet operation might have race conditions in a distributed environment. Consider implementing a retry mechanism with exponential backoff:

+	maxRetries := 3
+	for i := 0; i < maxRetries; i++ {
 		found, err := u.repos.Webhook.GetAndSet(ctx, eid)
 		if err != nil {
 			log.Errorf("webhook usecase: failed to get webhook sent: %v", err)
+			continue
 		}
 		if found {
 			return nil
 		}
+		break
+	}

server/pkg/task/task.go (1)
60-64: Consider using structured Changes type
The Changes field in CopyPayload is a string, but the Changes type is defined as a structured map. Consider using the structured type directly.
 type CopyPayload struct {
 	Collection string
 	Filter     string
-	Changes    string
+	Changes    Changes
 }
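If the structured type is adopted, the producer side could also drop its json.Marshal round-trip and build the payload directly (a sketch using the Changes and Change types introduced in this PR):

payload := task.CopyPayload{
	Collection: "item",
	Filter:     string(filter),
	Changes: task.Changes{
		"id":     {Type: task.ChangeTypeNew, Value: "item"},
		"schema": {Type: task.ChangeTypeSet, Value: newSchemaID.String()},
	},
}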
worker/cmd/copier/main.go (1)
67-70: Avoid hardcoding database name
The database name "reearth_cms" is hardcoded. Consider making it configurable through environment variables.
-	db := client.Database("reearth_cms")
+	dbName := os.Getenv("REEARTH_CMS_DB_NAME")
+	if dbName == "" {
+		dbName = "reearth_cms" // fallback to default
+	}
+	db := client.Database(dbName)

worker/internal/infrastructure/mongo/webhook.go (1)
30-34: Consider adding context timeout for index initialization
While the implementation is correct, it would be better to use a context with timeout for the index initialization to prevent long-running operations.
 func (r *Webhook) Init() error {
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
-	return r.InitIndex(
-		context.Background(),
-	)
+	return r.InitIndex(ctx)
 }

server/internal/usecase/interfaces/model.go (1)
23-26: Add validation tags to CopyModelParam struct
Consider adding validation tags to ensure ModelId is not empty and Name follows any required format/length constraints.
 type CopyModelParam struct {
-	ModelId id.ModelID
-	Name    *string
+	ModelId id.ModelID `validate:"required"`
+	Name    *string    `validate:"omitempty,min=1,max=100"`
 }
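If these tags are adopted, enforcement would typically come from a validator library such as go-playground/validator (an assumption; the project may prefer another mechanism):

import "github.com/go-playground/validator/v10"

var validate = validator.New()

// validateCopyParam rejects a CopyModelParam whose struct tags fail validation.
func validateCopyParam(p interfaces.CopyModelParam) error {
	return validate.Struct(p)
}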
server/pkg/schema/schema.go (1)
176-182: Add error return for validation failures
Consider modifying the method to return an error when validation fails.
-func (s *Schema) CopyFrom(s2 *Schema) {
+func (s *Schema) CopyFrom(s2 *Schema) error {
 	if s == nil || s2 == nil {
-		return
+		return errors.New("source or destination schema is nil")
 	}
 	s.fields = slices.Clone(s2.fields)
 	s.titleField = s2.TitleField().CloneRef()
+	return nil
 }

.github/workflows/build_copier.yml (2)
33-41: Improve shell script syntax and readability
Use -n instead of ! -z for better shell script readability and maintainability.

-          if [[ ! -z "$TAG" ]]; then
+          if [[ -n "$TAG" ]]; then
93-95: Optimize multiple redirects to GITHUB_OUTPUT
Consider using a grouped redirect for better efficiency.
-          echo "platforms=$PLATFORMS" >> "$GITHUB_OUTPUT"
-          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
-          echo "tags=$TAGS" >> "$GITHUB_OUTPUT"
+          {
+            echo "platforms=$PLATFORMS"
+            echo "version=$VERSION"
+            echo "tags=$TAGS"
+          } >> "$GITHUB_OUTPUT"

server/internal/infrastructure/gcp/taskrunner.go (1)
161-162: Extract duplicated timeout values
The timeout values are duplicated across the file. Consider extracting them to constants.
+const (
+	buildTimeout  = "86400s" // 1 day
+	buildQueueTtl = "86400s" // 1 day
+)
 build := &cloudbuild.Build{
-	Timeout:  "86400s", // 1 day
-	QueueTtl: "86400s", // 1 day
+	Timeout:  buildTimeout,
+	QueueTtl: buildQueueTtl,

server/internal/adapter/integration/model.go (1)
115-128: Consider enhancing error handling with more specific error types.
The error handling could be more granular to provide better feedback to API consumers. Consider handling specific error cases like validation errors, permission errors, etc., separately.
 func (s *Server) CopyModel(ctx context.Context, request CopyModelRequestObject) (CopyModelResponseObject, error) {
 	uc := adapter.Usecases(ctx)
 	op := adapter.Operator(ctx)
 	m, err := uc.Model.Copy(ctx, interfaces.CopyModelParam{
 		ModelId: request.ModelId,
 		Name:    request.Body.Name,
 	}, op)
 	if err != nil {
-		if errors.Is(err, rerror.ErrNotFound) {
-			return CopyModel404Response{}, err
-		}
-		return CopyModel500Response{}, err
+		switch {
+		case errors.Is(err, rerror.ErrNotFound):
+			return CopyModel404Response{}, err
+		case errors.Is(err, rerror.ErrInvalidParams):
+			return CopyModel400Response{}, err
+		case errors.Is(err, rerror.ErrPermissionDenied):
+			return CopyModel403Response{}, err
+		default:
+			return CopyModel500Response{}, err
+		}
 	}

server/internal/usecase/interactor/model_test.go (3)
576-629: Add more test cases for comprehensive coverage.
The test suite should include additional cases:
- Invalid model name (empty or too long)
- Permission denied scenarios
- Concurrent copy operations
- Copy with metadata validation
 tests := []struct {
 	name      string
 	param     interfaces.CopyModelParam
 	setupMock func()
 	wantErr   bool
 	validate  func(t *testing.T, got *model.Model)
 }{
+	{
+		name: "empty name",
+		param: interfaces.CopyModelParam{
+			ModelId: m.ID(),
+			Name:    lo.ToPtr(""),
+		},
+		setupMock: func() {
+			mRunner.EXPECT().Run(ctx, gomock.Any()).Times(0)
+		},
+		wantErr: true,
+		validate: func(t *testing.T, got *model.Model) {
+			assert.Nil(t, got)
+		},
+	},
+	{
+		name: "permission denied",
+		param: interfaces.CopyModelParam{
+			ModelId: m.ID(),
+			Name:    lo.ToPtr("Copied Model"),
+		},
+		setupMock: func() {
+			mRunner.EXPECT().Run(ctx, gomock.Any()).Times(1).Return(rerror.ErrPermissionDenied)
+		},
+		wantErr: true,
+		validate: func(t *testing.T, got *model.Model) {
+			assert.Nil(t, got)
+		},
+	},
546-555: Consider using test helper functions for setup.
The test setup code could be refactored into helper functions to improve readability and reusability.
+func setupTestSchema(t *testing.T, wid accountdomain.WorkspaceID, pid id.ProjectID) (*schema.Schema, *schema.Schema) {
+	fId1 := id.NewFieldID()
+	sfKey1 := id.RandomKey()
+	sf1 := schema.NewField(schema.NewBool().TypeProperty()).ID(fId1).Key(sfKey1).MustBuild()
+	s1 := schema.New().NewID().Workspace(wid).Project(pid).Fields([]*schema.Field{sf1}).MustBuild()
+
+	fId2 := id.NewFieldID()
+	sfKey2 := id.RandomKey()
+	sf2 := schema.NewField(schema.NewBool().TypeProperty()).ID(fId2).Key(sfKey2).MustBuild()
+	s2 := schema.New().NewID().Workspace(wid).Project(pid).Fields([]*schema.Field{sf2}).MustBuild()
+
+	return s1, s2
+}
567-574: Add error handling for database setup.
The test setup should handle database errors more gracefully.
-	err := db.Project.Save(ctx, p.Clone())
-	assert.NoError(t, err)
-	err = db.Model.Save(ctx, m.Clone())
-	assert.NoError(t, err)
-	err = db.Schema.Save(ctx, s1.Clone())
-	assert.NoError(t, err)
-	err = db.Schema.Save(ctx, s2.Clone())
-	assert.NoError(t, err)
+	for _, err := range []error{
+		db.Project.Save(ctx, p.Clone()),
+		db.Model.Save(ctx, m.Clone()),
+		db.Schema.Save(ctx, s1.Clone()),
+		db.Schema.Save(ctx, s2.Clone()),
+	} {
+		if err != nil {
+			t.Fatalf("failed to setup test database: %v", err)
+		}
+	}

.github/workflows/build_worker.yml (1)
144-167: Optimize script by combining redirects.
The script logic is correct, but we can improve it by combining the redirects to $GITHUB_OUTPUT. Here's the optimized version:

 run: |
   if [[ -n $TAG ]]; then
     PLATFORMS=linux/amd64,linux/arm64
     VERSION=$TAG
     TAGS=$IMAGE_NAME:$TAG
     if [[ ! $TAG =~ '-' ]]; then
       TAGS+=,${IMAGE_NAME}:${TAG%.*}
       TAGS+=,${IMAGE_NAME}:${TAG%%.*}
       TAGS+=,${IMAGE_NAME}:latest
     fi
   else
     PLATFORMS=linux/amd64
     VERSION=$SHA
     TAGS=$IMAGE_NAME:$NAME
   fi
-  echo "platforms=$PLATFORMS" >> "$GITHUB_OUTPUT"
-  echo "version=$VERSION" >> "$GITHUB_OUTPUT"
-  echo "tags=$TAGS" >> "$GITHUB_OUTPUT"
+  {
+    echo "platforms=$PLATFORMS"
+    echo "version=$VERSION"
+    echo "tags=$TAGS"
+  } >> "$GITHUB_OUTPUT"

🧰 Tools
🪛 actionlint (1.7.4)
150-150: shellcheck reported issue in this script: SC2129:style:15:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (34)
- .github/workflows/build_copier.yml (1 hunks)
- .github/workflows/build_worker.yml (1 hunks)
- server/e2e/integration_model_test.go (1 hunks)
- server/internal/adapter/integration/model.go (1 hunks)
- server/internal/adapter/integration/server.gen.go (7 hunks)
- server/internal/infrastructure/gcp/config.go (1 hunks)
- server/internal/infrastructure/gcp/taskrunner.go (4 hunks)
- server/internal/infrastructure/gcp/taskrunner_test.go (2 hunks)
- server/internal/usecase/interactor/model.go (3 hunks)
- server/internal/usecase/interactor/model_test.go (2 hunks)
- server/internal/usecase/interfaces/model.go (2 hunks)
- server/pkg/integrationapi/types.gen.go (3 hunks)
- server/pkg/schema/schema.go (1 hunks)
- server/pkg/schema/schema_test.go (1 hunks)
- server/pkg/task/task.go (2 hunks)
- server/schemas/integration.yml (2 hunks)
- worker/Dockerfile (0 hunks)
- worker/cmd/copier/main.go (1 hunks)
- worker/copier.Dockerfile (1 hunks)
- worker/internal/adapter/http/copy.go (1 hunks)
- worker/internal/adapter/http/main.go (1 hunks)
- worker/internal/app/main.go (4 hunks)
- worker/internal/app/repo.go (2 hunks)
- worker/internal/infrastructure/mongo/common_test.go (1 hunks)
- worker/internal/infrastructure/mongo/container.go (1 hunks)
- worker/internal/infrastructure/mongo/copier.go (1 hunks)
- worker/internal/infrastructure/mongo/copier_test.go (1 hunks)
- worker/internal/infrastructure/mongo/webhook.go (1 hunks)
- worker/internal/infrastructure/mongo/webhook_test.go (0 hunks)
- worker/internal/usecase/interactor/copier.go (1 hunks)
- worker/internal/usecase/interactor/usecase.go (1 hunks)
- worker/internal/usecase/interactor/webhook.go (2 hunks)
- worker/internal/usecase/repo/container.go (1 hunks)
- worker/internal/usecase/repo/copier.go (1 hunks)
💤 Files with no reviewable changes (2)
- worker/internal/infrastructure/mongo/webhook_test.go
- worker/Dockerfile
✅ Files skipped from review due to trivial changes (3)
- worker/internal/infrastructure/mongo/common_test.go
- worker/internal/usecase/repo/container.go
- worker/copier.Dockerfile
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/build_copier.yml
31-31: shellcheck reported issue in this script: SC2236:style:4:9: Use -n instead of ! -z
(shellcheck)
78-78: shellcheck reported issue in this script: SC2129:style:15:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
.github/workflows/build_worker.yml
150-150: shellcheck reported issue in this script: SC2129:style:15:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
🔇 Additional comments (34)
worker/internal/app/main.go (3)
14-14: Adopt consistent import ordering if required by linter.
No functional issues, but check if there's a project-wide rule about import grouping or ordering.
55-55: Field addition is valid.
This newly introduced field, Repos *repo.Container, seamlessly integrates with the earlier changes. No issues found.
33-33: Validate concurrency or dependency injection patterns.
Here, we pass both gateways and repos to initialize the usecases. Ensure that no concurrency or data-race issues arise if these containers are used in parallel.
worker/internal/app/repo.go (5)
9-9: Import alias usage seems appropriate.
Nice use of an alias (rmongo) to differentiate from the standard mongo package.
11-15: Imports provide needed references.
All newly introduced dependencies (repo, mongo, otelmongo, etc.) align well with the new repository approach.
71-71: Return signature updated.
Returning both gateways and repos aligns with the updated use case. Implementation details are consistent with the function’s name.
5-5: Check for potential time package usage in other files.
If the project already has a custom time wrapper, consider whether to use it for consistency.
✅ Verification successful
Based on the search results, I can see that the time package is used extensively throughout the codebase for various purposes like:
- Timestamps in tests
- Setting update times for models, views, projects, etc.
- Handling datetime values in items and fields
- Managing expiration times for uploads
- Authentication timing
There is no evidence of a custom time wrapper being used - all usages directly reference the standard time package. The codebase consistently uses the standard Go time package for all time-related operations.
Standard time package usage is appropriate here
The standard time package is being used consistently across the codebase, and there's no custom time wrapper to maintain consistency with.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Searching for any custom time wrapper usage
rg -A 3 "time.Now|customTime"
Length of output: 52794
53-70: Validate MongoDB connection error handling.
The code handles connection failure by logging a fatal error, which is acceptable for a worker service. Check if graceful retries or backoff are desired if DB connectivity is crucial.
✅ Verification successful
Let me search for any retry mechanisms or connection handling patterns in the codebase.
MongoDB connection handling is appropriate for the worker service.
The current implementation using log.Fatalf for MongoDB connection errors is suitable for a worker service because:
- The code sets a reasonable connection timeout of 10 seconds
- The worker service requires a functioning database to operate, and failing fast is appropriate
- The service is likely managed by container orchestration (as seen in docker-compose files) which will handle restarts
- The codebase shows a pattern of using fatal errors for critical infrastructure initialization failures
While retry mechanisms exist in other parts of the codebase (e.g., task runners, locks), they are used for operational tasks rather than infrastructure initialization. The current approach aligns with the application's architecture and deployment strategy.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Searching for repeated usage or definition of DB connection logic that might conflict with this approach
rg -A 10 "mongo.Connect"
Length of output: 3274
Script:
#!/bin/bash
# Search for retry-related code or connection handling patterns
rg -i "retry|backoff|reconnect" -A 5
# Also check for any MongoDB connection configuration in config files
fd -e yaml -e yml -e json -e toml | xargs rg -i "mongo|database"
Length of output: 17573
worker/internal/infrastructure/mongo/copier.go (3)
39-42: Fallback error handling for uninitialized collection.
Returning rerror.ErrInternal is a good practice. Ensure higher-level calls handle or log the error.
133-144: ID generation strategy.
generateId function only supports “item”, “schema”, “model”. For unrecognized types, it returns an empty string. If the calling function is not prepared for that scenario, it may be an issue.
✅ Verification successful
Ignoring the boolean return value is safe in this context
The code is actually safe. The generateId function is only called in the ChangeTypeNew case where change.Value is expected to be one of the supported types. The boolean return value is intentionally ignored (newId, _ := generateId(change.Value)) as the empty string returned for unsupported types is an acceptable fallback value for the result[k] = newId assignment.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for usage of generateId results to see unhandled false returns
rg -A 3 "generateId("
Length of output: 96
Script:
#!/bin/bash
# Search for usage of generateId function with proper escaping
rg -A 3 "generateId\("
Length of output: 632
Script:
#!/bin/bash
# Get more context around the usage of generateId to understand the error handling
rg -B 5 -A 10 "generateId\("
Length of output: 2162
75-84: Verify filter parsing.
JSON unmarshal from a string is convenient but can raise runtime errors if the input is not verified. Confirm that upstream logic sanitizes or validates the filter string.
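One cheap guard at the worker boundary would be to reject filters that do not parse as a JSON object before they reach the driver; a minimal sketch:

import (
	"encoding/json"
	"fmt"

	"go.mongodb.org/mongo-driver/bson"
)

// parseFilter validates that the incoming filter string is a JSON object
// before it is handed to the MongoDB driver.
func parseFilter(raw string) (bson.M, error) {
	var f bson.M
	if err := json.Unmarshal([]byte(raw), &f); err != nil {
		return nil, fmt.Errorf("invalid copy filter: %w", err)
	}
	return f, nil
}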
server/internal/usecase/interactor/model.go (6)
16-16: Task package usage for copying.
The introduction of “task” references is consistent with the worker approach. No immediate issues.
355-375: copyModel logic.
We see the creation of a new model with a random key. This is correct to avoid key collisions. Confirm uniqueness constraints are adequately tested.
415-438: Building filter and changes for the item copy event.
The JSON marshalling ensures it’s compatible with the worker’s copying logic. It’s good that “id” is re-generated. Confirm that no references to the old item ID remain.
22-22: Import of go.mongodb.org/mongo-driver/bson
Used for building filter objects. Ensure no direct DB logic is creeping into the business logic layer.
✅ Verification successful
Let me search for any direct MongoDB usage in the business logic layer.
MongoDB usage is limited to filter serialization only
The bson package is only used to serialize a filter object to JSON before passing it to a task. This is not direct database logic but rather a data structure conversion, which is acceptable in the business logic layer.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Look for direct calls to Mongo driver in the model interactor beyond tasks
rg -A 5 "mongo\."
Length of output: 16017
Script:
#!/bin/bash
# Look for bson usage in the usecase/interactor directory
rg -A 5 "bson\." server/internal/usecase/interactor/
Length of output: 474
321-353: Full copy operation entry point.
The transaction-based approach is good, as it ensures partial results are rolled back. Thoroughly test large copies or concurrency.
✅ Verification successful
Model copy operation is properly implemented with transaction safety
The code analysis reveals that the Copy operation is well-implemented with proper transaction handling and no concurrent access issues:
- The operation is wrapped in a transaction using Run1 with proper rollback on failure
- No concurrent goroutines or mutex locks are used/needed as the operation is already transaction-protected
- Copy operations are properly isolated through the repository layer
- The implementation includes proper error handling at each step (model, schema, items, metadata)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Identify concurrency usage around 'Copy' in the codebase
rg -A 7 "i\.Copy\(.*context\.Context"
Length of output: 39
Script:
#!/bin/bash
# Let's try a broader search for Copy method usage and concurrent operations
rg -i "Copy.*context" -A 5
# Also search for any goroutine or concurrent operations near model operations
rg "go\s+func.*model" -A 5
# Check for mutex usage in model-related files
fd "model.*\.go" --exec rg "sync\.Mutex|Lock\(\)|Unlock\(\)" {} \;
Length of output: 8609
377-388: Schema duplication.
We see newSchema.CopyFrom(oldSchema). Confirm that relations, fields, or constraints are properly replicated if they exist.
✅ Verification successful
Let me check the actual implementation of CopyFrom to verify what exactly is being copied.
Let me check the Field type to understand what's being cloned along with the schema.
Schema duplication is properly implemented
The implementation of CopyFrom in server/pkg/schema/schema.go shows that it correctly handles the duplication of schema components:
- Fields are deep copied using slices.Clone
- Title field reference is cloned using CloneRef()
- The Field struct contains all necessary schema components including id, name, description, key, constraints (unique, multiple, required), type properties, and default values
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Searching for calls to 'CopyFrom' to see if there are additional constraints/fields
rg -A 3 "CopyFrom"
Length of output: 1440
Script:
#!/bin/bash
# Let's look at the Schema struct and CopyFrom implementation
ast-grep --pattern 'type Schema struct {
$$$
}'
# Also get the full implementation of CopyFrom
ast-grep --pattern 'func (s *Schema) CopyFrom($_) {
$$$
}'
Length of output: 3522
Script:
#!/bin/bash
# Get the Field type definition
ast-grep --pattern 'type Field struct {
$$$
}'
# Also check if Field has any Clone method
ast-grep --pattern 'func (f *Field) Clone() {
$$$
}'
Length of output: 1884
server/e2e/integration_model_test.go (1)
87-106: Validate additional properties after copy.
While the “id”, “projectId”, “schemaId”, etc. are verified, consider confirming that fields like “key” or “public” are handled properly (especially if the copy logic changes these fields).
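For example, assuming the test decodes both models into map[string]any values (an assumption about the test's response handling), two extra testify assertions would cover those fields:

// The key must be regenerated because model keys are unique per project,
// while visibility settings should carry over unchanged.
assert.NotEqual(t, oldModel["key"], copiedModel["key"])
assert.Equal(t, oldModel["public"], copiedModel["public"])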
worker/internal/usecase/interactor/usecase.go (1)
13-17: Validate container dependencies.
Before building the Usecase, ensure that “r” is not nil and that required repository fields within the container are initialized. Consider adding a quick nil check to avoid runtime panics.
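A fail-fast guard at construction time could look like the following sketch (the constructor and field names are assumptions, not code from the PR):

func NewUsecase(r *repo.Container) (*Usecase, error) {
	// Reject a nil container or missing Copier repository up front instead
	// of panicking later inside a request handler.
	if r == nil || r.Copier == nil {
		return nil, errors.New("usecase: repo container or Copier repository is nil")
	}
	return &Usecase{repos: r}, nil
}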
worker/internal/adapter/http/main.go (1)
8-8: LGTM!
The addition of CopyController follows the existing pattern and is properly initialized in the constructor.
Also applies to: 12-15
server/internal/infrastructure/gcp/taskrunner_test.go (2)
17-18: LGTM: Clear variable naming for image configurations
The separation of decompressorImage and copierImage variables improves clarity and aligns with the single responsibility principle.
36-39: Verify copier configuration integration
The configuration looks good, but we should ensure the copier topic is properly configured for the copy operations.
Consider adding a CopierTopic configuration similar to DecompressorTopic for consistency and future extensibility.
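A sketch of what that could look like (field placement inside TaskConfig is an assumption):

type TaskConfig struct {
	// ... existing fields ...
	DecompressorTopic string
	CopierTopic       string // hypothetical: topic used to trigger copy tasks
}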
server/pkg/task/task.go (1)
67-70: LGTM: Well-defined change types
The ChangeType constants are clearly defined and provide good type safety for copy operations.
server/internal/usecase/interfaces/model.go (1)
64-64: LGTM! Interface method signature is well-defined
The Copy method signature properly includes context and operator parameters, maintaining consistency with other interface methods.
server/pkg/schema/schema.go (1)
176-182: 🛠️ Refactor suggestion
Verify deep copy of field references
While the implementation handles basic copying, there might be issues with field references not being properly deep copied. Consider the following improvements:
- Add validation to ensure field types are compatible
- Implement proper deep copy for field references
func (s *Schema) CopyFrom(s2 *Schema) {
if s == nil || s2 == nil {
return
}
- s.fields = slices.Clone(s2.fields)
+ s.fields = make([]*Field, len(s2.fields))
+ for i, f := range s2.fields {
+ s.fields[i] = f.Clone() // Assuming Field has a Clone method
+ }
s.titleField = s2.TitleField().CloneRef()
}
✅ Verification successful
Based on the gathered evidence, I can now provide the final response:
Field references are properly deep copied
The implementation correctly handles deep copying of field references:
- Field.Clone() properly clones all fields including typeProperty and defaultValue
- TypeProperty.Clone() performs a deep copy of all its fields, including the reference field
- Each field type (text, number, reference, etc.) has its own Clone method
The current implementation in CopyFrom using slices.Clone is sufficient since Field objects are properly cloned through their Clone() method.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for field reference usage in schemas
ast-grep --pattern 'type Field struct {
$$$
TypeProperty() $_ {
$$$
}
$$$
}'
Length of output: 90
Script:
#!/bin/bash
# Let's check for Field struct definition and its methods
rg "type Field struct" -A 10
# Check for Clone or similar methods on Field
rg "func.*Field.*Clone" -A 5
# Check for field-related methods in Schema
rg "func.*Schema.*field" -i -A 5
Length of output: 84186
Script:
#!/bin/bash
# Let's check the Field.Clone() implementation
rg "func \(f \*Field\) Clone\(\)" -A 10 server/pkg/schema/field.go
Length of output: 274
Script:
#!/bin/bash
# Let's see the complete Field.Clone() implementation and check if it properly clones all fields
rg "func \(f \*Field\) Clone\(\)" -B 5 -A 20 server/pkg/schema/field.go
# Also check if TypeProperty has a Clone method
rg "func.*TypeProperty.*Clone" -A 5 server/pkg/schema/
Length of output: 940
Script:
#!/bin/bash
# Let's check the TypeProperty struct and its Clone implementation
rg "type TypeProperty struct" -A 10 server/pkg/schema/type_property.go
# And see the complete Clone implementation
rg "func \(t \*TypeProperty\) Clone\(\)" -B 5 -A 20 server/pkg/schema/type_property.go
Length of output: 1159
server/pkg/integrationapi/types.gen.go (2)
708-711: LGTM! Simple and focused request body for model copy operation.
The request body only contains an optional name field, keeping the API minimal and focused.
542-542: Verify API compatibility for TitleField JSON tag change.
The JSON tag for TitleField has been changed from PascalCase to camelCase. While this aligns with JSON naming conventions, it's a breaking change that could affect API consumers.
Check existing API consumers for usages of the old PascalCase tag before merging.
server/schemas/integration.yml (1)
452-484: LGTM! Well-defined endpoint with comprehensive error handling.
The endpoint definition is complete with:
- Clear path parameters
- Bearer token security
- Proper error responses (400, 401, 404, 500)
- Reuse of existing model schema for response
server/internal/adapter/integration/server.gen.go (3)
78-80: LGTM! Clear method documentation and consistent interface definition.
The CopyModel method is well-documented and follows the same pattern as other API methods.
513-529: LGTM! Robust parameter handling and error checking.
The handler implementation includes:
- Proper parameter binding
- Standard error handling
- Security scope setting
2278-2324: LGTM! Complete response type implementations.
The response types are well-defined with:
- Proper content type headers
- Error response implementations
- Consistent with OpenAPI spec
.github/workflows/build_worker.yml (3)
127-131: LGTM! Job definition follows established patterns.
The new docker_copier job is well-structured and consistent with existing jobs in the workflow.
132-143: LGTM! Setup steps are properly configured.
The setup steps use the latest stable versions of actions and follow the same secure pattern for Docker authentication.
168-178: Verify the existence of copier.Dockerfile.
The build configuration looks correct, but we should verify that ./worker/copier.Dockerfile exists and contains the necessary configuration for the copier service.
✅ Verification successful
Dockerfile exists and follows best practices
The ./worker/copier.Dockerfile exists and follows recommended Docker practices:
- Uses multi-stage build to minimize image size
- Copies only the necessary binary from builder
- Uses scratch as base image for runtime
- Sets proper ENTRYPOINT
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the Dockerfile exists and show its content
if [ -f "./worker/copier.Dockerfile" ]; then
echo "Dockerfile exists. Content:"
cat "./worker/copier.Dockerfile"
else
echo "Error: ./worker/copier.Dockerfile not found"
exit 1
fi
Length of output: 302
Actionable comments posted: 5
♻️ Duplicate comments (2)
server/internal/infrastructure/gcp/taskrunner.go (2)
167-174: 🛠️ Refactor suggestion
Extract hardcoded build arguments to configuration
Similar to the decompressAsset function, these build arguments should be configurable.
+type CopierConfig struct {
+	ThreadCount string
+	GCLimit     string
+	ChunkSize   string
+	DiskLimit   string
+	SkipTop     bool
+	OldWindows  bool
+}
 type TaskConfig struct {
 	// ... existing fields ...
+	CopierConfig *CopierConfig
 }
-Args: []string{
-	"-v",
-	"-n=192",
-	"-gc=5000",
-	"-chunk=1m",
-	"-disk-limit=20g",
-	"-skip-top",
-	"-old-windows",
-},
+args := []string{"-v"}
+if conf.CopierConfig != nil {
+	if conf.CopierConfig.ThreadCount != "" {
+		args = append(args, "-n="+conf.CopierConfig.ThreadCount)
+	}
+	// ... add other configurable arguments
+}
176-178: ⚠️ Potential issue
Add input validation for environment variables
The Copy payload fields are used directly in environment variables without validation. This could lead to injection vulnerabilities.
+func sanitizeEnvValue(value string) string {
+	// Remove any characters that could cause issues in environment variables
+	// This is a basic example - implement proper sanitization based on your requirements
+	return strings.Map(func(r rune) rune {
+		if unicode.IsLetter(r) || unicode.IsNumber(r) || r == '-' || r == '_' || r == '.' {
+			return r
+		}
+		return -1
+	}, value)
+}
 Env: []string{
-	"REEARTH_CMS_COPIER_COLLECTION=" + p.Copy.Collection,
-	"REEARTH_CMS_COPIER_FILTER=" + p.Copy.Filter,
-	"REEARTH_CMS_COPIER_CHANGES=" + p.Copy.Changes,
+	"REEARTH_CMS_COPIER_COLLECTION=" + sanitizeEnvValue(p.Copy.Collection),
+	"REEARTH_CMS_COPIER_FILTER=" + sanitizeEnvValue(p.Copy.Filter),
+	"REEARTH_CMS_COPIER_CHANGES=" + sanitizeEnvValue(p.Copy.Changes),
 },
🧹 Nitpick comments (5)
server/internal/infrastructure/gcp/taskrunner.go (1)
182-184: Consider making disk size configurable for copy operations
The copy operation uses a hardcoded defaultDiskSizeGb value. Consider making this configurable similar to the decompressor.
 Options: &cloudbuild.BuildOptions{
-	DiskSizeGb: defaultDiskSizeGb,
+	DiskSizeGb: conf.CopierDiskSizeGb,
 },

server/pkg/task/task.go (2)
64-68: Add documentation for Changes and Change types
The types lack documentation explaining their purpose and usage.
Add documentation:
+// Changes represents a map of field changes to be applied during copy operations
 type Changes map[string]Change
+
+// Change represents a single field modification with its type and new value
 type Change struct {
 	Type  ChangeType
 	Value string
 }
71-74: Consider adding validation for ChangeType values
The ChangeType constants should be validated when used.
Add validation:
 const (
 	ChangeTypeSet ChangeType = "set"
 	ChangeTypeNew ChangeType = "new"
 )
+
+// IsValid checks if the ChangeType is a valid value
+func (ct ChangeType) IsValid() bool {
+	switch ct {
+	case ChangeTypeSet, ChangeTypeNew:
+		return true
+	default:
+		return false
+	}
+}

.github/workflows/build_copier.yml (1)
7-10: Consider adding timeout to concurrency configuration
The concurrency configuration lacks a timeout, which could lead to stuck workflows.
Add timeout:
 concurrency:
   group: ${{ github.workflow }}-${{ github.event.workflow_run.head_branch }}
   cancel-in-progress: true
+  timeout-minutes: 60
server/internal/usecase/interactor/model.go (1)
415-439: Add retry mechanism for copyItems
The copyItems method should handle transient failures during the copy operation.
Add retry logic:
 func (i Model) copyItems(ctx context.Context, oldSchemaID, newSchemaID id.SchemaID, newModelID id.ModelID) error {
+	maxRetries := 3
+	var lastErr error
+
+	for attempt := 1; attempt <= maxRetries; attempt++ {
 	collection := "item"
 	filter, err := json.Marshal(bson.M{"schema": oldSchemaID.String()})
 	if err != nil {
 		return err
 	}
 	changes, err := json.Marshal(task.Changes{
 		"id": {
 			Type:  task.ChangeTypeNew,
 			Value: "item",
 		},
 		"schema": {
 			Type:  task.ChangeTypeSet,
 			Value: newSchemaID.String(),
 		},
 		"model": {
 			Type:  task.ChangeTypeSet,
 			Value: newModelID.String(),
 		},
 	})
 	if err != nil {
 		return err
 	}
-	return i.triggerCopyEvent(ctx, collection, string(filter), string(changes))
+	if err := i.triggerCopyEvent(ctx, collection, string(filter), string(changes)); err != nil {
+		lastErr = err
+		log.Warnf("attempt %d failed: %v", attempt, err)
+		continue
+	}
+	return nil
+	}
+	return fmt.Errorf("failed after %d attempts: %v", maxRetries, lastErr)
 }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/build_copier.yml (1 hunks)
- server/internal/infrastructure/gcp/taskrunner.go (4 hunks)
- server/internal/usecase/interactor/model.go (3 hunks)
- server/pkg/task/task.go (2 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/build_copier.yml
31-31: shellcheck reported issue in this script: SC2236:style:4:9: Use -n instead of ! -z
(shellcheck)
78-78: shellcheck reported issue in this script: SC2129:style:15:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
🔇 Additional comments (3)
server/internal/infrastructure/gcp/taskrunner.go (1)
119-120: 🛠️ Refactor suggestion
Extract hardcoded build step arguments to configuration
The build step arguments contain multiple hardcoded values that should be configurable. This makes the application less flexible and harder to maintain.
+type DecompressorConfig struct {
+ ThreadCount string
+ GCLimit string
+ ChunkSize string
+ DiskLimit string
+ SkipTop bool
+ OldWindows bool
+}
type TaskConfig struct {
// ... existing fields ...
+ DecompressorConfig *DecompressorConfig
}
-Args: []string{"-v", "-n=192", "-gc=5000", "-chunk=1m", "-disk-limit=20g", "-gzip-ext=" + conf.DecompressorGzipExt, "-skip-top", "-old-windows", src, dest},
+args := []string{"-v"}
+if conf.DecompressorConfig != nil {
+ if conf.DecompressorConfig.ThreadCount != "" {
+ args = append(args, "-n="+conf.DecompressorConfig.ThreadCount)
+ }
+ // ... add other configurable arguments
+}
+args = append(args, "-gzip-ext="+conf.DecompressorGzipExt, src, dest)
Likely invalid or redundant comment.
server/pkg/task/task.go (1)
48-52: 🛠️ Refactor suggestion
Consider enhancing CopyPayload validation
The CopyPayload struct contains critical fields for copy operations but lacks comprehensive validation.
Consider adding field-level validation:
 type CopyPayload struct {
 	Collection string
 	Filter     string
 	Changes    string
 }
+
+// Validate checks if the payload fields are valid
+func (p *CopyPayload) Validate() error {
+	if p == nil {
+		return errors.New("payload cannot be nil")
+	}
+	if p.Collection == "" {
+		return errors.New("collection is required")
+	}
+	if p.Filter == "" {
+		return errors.New("filter is required")
+	}
+	if !json.Valid([]byte(p.Changes)) {
+		return errors.New("changes must be valid JSON")
+	}
+	return nil
+}
Likely invalid or redundant comment.
.github/workflows/build_copier.yml (1)
97-106: 🛠️ Refactor suggestion
Add resource limits to Docker build step
The Docker build step lacks resource constraints, which could impact CI/CD performance.
Add resource limits:
uses: docker/build-push-action@v6
with:
context: ./worker
file: ./worker/copier.Dockerfile
platforms: ${{ steps.options.outputs.platforms }}
push: true
build-args: VERSION=${{ steps.options.outputs.version }}
tags: ${{ steps.options.outputs.tags }}
cache-from: type=gha
cache-to: type=gha,mode=max
+ timeout-minutes: 60
+ memory: 6G
+ cpu-limit: 2
Likely invalid or redundant comment.
Overview
This PR implements copying of a model's schemata and items.
Summary by CodeRabbit
Release Notes
New Features
Improvements
Bug Fixes
Tests