Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes MCP validator calling errors by implementing a comprehensive validation system that orchestrates multiple validation engines (pattern, length, style, AST, and LLM-based validation). The key changes move validation from a TODO placeholder to a fully functional implementation that validates code against policy rules using an engine registry pattern.
Key Changes:
- Implemented core validation logic with engine orchestration through the global registry
- Added LLM-based validation engine for semantic rule checking
- Updated MCP server to validate git changes (staged + unstaged) instead of entire files
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| test_validator.go | Adds integration test for validator with multiple engine types |
| internal/validator/validator.go | Implements actual validation logic with engine orchestration, file selection, and violation collection |
| internal/validator/selector.go | Adds file discovery and filtering based on language and glob patterns |
| internal/validator/llm_validator.go | Updates LLM validator to use embedded Validator for orchestration |
| internal/validator/git.go | Improves git change detection with better error handling |
| internal/validator/engine.go | Adds unused engine registry (should be removed - conflicts with core engine) |
| internal/mcp/server.go | Updates validate_code to validate git diffs with LLM, changes working directory to project root |
| internal/llm/types.go | Updates RuleIntent engine types to include llm-validator |
| internal/llm/inference.go | Adds llm-validator to engine selection prompt |
| internal/engine/registry/builtin.go | Registers llm-validator engine in global registry |
| internal/engine/llm/engine.go | Implements new LLM validation engine for semantic rule checking |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| package validator | ||
|
|
||
| import ( | ||
| "context" | ||
|
|
||
| "github.com/DevSymphony/sym-cli/pkg/schema" | ||
| ) | ||
|
|
||
| // Engine represents a validation engine that can check code against rules | ||
| type Engine interface { | ||
| // Name returns the engine name (e.g., "eslint", "llm-validator", "checkstyle") | ||
| Name() string | ||
|
|
||
| // CanHandle checks if this engine can handle the given rule | ||
| CanHandle(rule schema.PolicyRule) bool | ||
|
|
||
| // Returns violations found | ||
| Execute(ctx context.Context, files []string, rules []schema.PolicyRule) ([]Violation, error) | ||
| } | ||
|
|
||
| // EngineRegistry manages available validation engines | ||
| type EngineRegistry struct { | ||
| engines map[string]Engine | ||
| } | ||
|
|
||
| // NewEngineRegistry creates a new engine registry | ||
| func NewEngineRegistry() *EngineRegistry { | ||
| return &EngineRegistry{ | ||
| engines: make(map[string]Engine), | ||
| } | ||
| } | ||
|
|
||
| // Register registers a validation engine | ||
| func (r *EngineRegistry) Register(engine Engine) { | ||
| r.engines[engine.Name()] = engine | ||
| } | ||
|
|
||
| // Get retrieves an engine by name | ||
| func (r *EngineRegistry) Get(name string) (Engine, bool) { | ||
| engine, ok := r.engines[name] | ||
| return engine, ok | ||
| } | ||
|
|
||
| // GetEngineForRule finds the appropriate engine for a rule | ||
| func (r *EngineRegistry) GetEngineForRule(rule schema.PolicyRule) Engine { | ||
| // Check if rule specifies an engine | ||
| if engineName, ok := rule.Check["engine"].(string); ok { | ||
| if engine, exists := r.engines[engineName]; exists { | ||
| if engine.CanHandle(rule) { | ||
| return engine | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Fallback: find any engine that can handle this rule | ||
| for _, engine := range r.engines { | ||
| if engine.CanHandle(rule) { | ||
| return engine | ||
| } | ||
| } | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| // ListEngines returns all registered engines | ||
| func (r *EngineRegistry) ListEngines() []string { | ||
| names := make([]string, 0, len(r.engines)) | ||
| for name := range r.engines { | ||
| names = append(names, name) | ||
| } | ||
| return names | ||
| } No newline at end of file |
There was a problem hiding this comment.
The entire internal/validator/engine.go file appears to be unused dead code. The Engine interface and EngineRegistry defined here conflict with the existing core.Engine interface used in the actual implementation (see internal/engine/core/engine.go and internal/engine/registry/registry.go). The validator uses registry.Global() from the engine registry package, not this local EngineRegistry. This file should be removed to avoid confusion and maintain a single source of truth for engine architecture.
| package validator | |
| import ( | |
| "context" | |
| "github.com/DevSymphony/sym-cli/pkg/schema" | |
| ) | |
| // Engine represents a validation engine that can check code against rules | |
| type Engine interface { | |
| // Name returns the engine name (e.g., "eslint", "llm-validator", "checkstyle") | |
| Name() string | |
| // CanHandle checks if this engine can handle the given rule | |
| CanHandle(rule schema.PolicyRule) bool | |
| // Returns violations found | |
| Execute(ctx context.Context, files []string, rules []schema.PolicyRule) ([]Violation, error) | |
| } | |
| // EngineRegistry manages available validation engines | |
| type EngineRegistry struct { | |
| engines map[string]Engine | |
| } | |
| // NewEngineRegistry creates a new engine registry | |
| func NewEngineRegistry() *EngineRegistry { | |
| return &EngineRegistry{ | |
| engines: make(map[string]Engine), | |
| } | |
| } | |
| // Register registers a validation engine | |
| func (r *EngineRegistry) Register(engine Engine) { | |
| r.engines[engine.Name()] = engine | |
| } | |
| // Get retrieves an engine by name | |
| func (r *EngineRegistry) Get(name string) (Engine, bool) { | |
| engine, ok := r.engines[name] | |
| return engine, ok | |
| } | |
| // GetEngineForRule finds the appropriate engine for a rule | |
| func (r *EngineRegistry) GetEngineForRule(rule schema.PolicyRule) Engine { | |
| // Check if rule specifies an engine | |
| if engineName, ok := rule.Check["engine"].(string); ok { | |
| if engine, exists := r.engines[engineName]; exists { | |
| if engine.CanHandle(rule) { | |
| return engine | |
| } | |
| } | |
| } | |
| // Fallback: find any engine that can handle this rule | |
| for _, engine := range r.engines { | |
| if engine.CanHandle(rule) { | |
| return engine | |
| } | |
| } | |
| return nil | |
| } | |
| // ListEngines returns all registered engines | |
| func (r *EngineRegistry) ListEngines() []string { | |
| names := make([]string, 0, len(r.engines)) | |
| for name := range r.engines { | |
| names = append(names, name) | |
| } | |
| return names | |
| } |
| // Create engine config | ||
| config := core.EngineConfig{ | ||
| WorkDir: v.workDir, | ||
| ToolsDir: filepath.Join(os.Getenv("HOME"), ".sym", "tools"), | ||
| CacheDir: filepath.Join(os.Getenv("HOME"), ".sym", "cache"), |
There was a problem hiding this comment.
The HOME environment variable is not available on Windows systems, which will cause filepath.Join(os.Getenv("HOME"), ".sym", "tools") to return an invalid path. Consider using os.UserHomeDir() instead, which is cross-platform compatible and works correctly on Windows, Linux, and macOS.
| // Create engine config | |
| config := core.EngineConfig{ | |
| WorkDir: v.workDir, | |
| ToolsDir: filepath.Join(os.Getenv("HOME"), ".sym", "tools"), | |
| CacheDir: filepath.Join(os.Getenv("HOME"), ".sym", "cache"), | |
| // Get user home directory in a cross-platform way | |
| homeDir, err := os.UserHomeDir() | |
| if err != nil { | |
| return fmt.Errorf("failed to get user home directory: %w", err) | |
| } | |
| // Create engine config | |
| config := core.EngineConfig{ | |
| WorkDir: v.workDir, | |
| ToolsDir: filepath.Join(homeDir, ".sym", "tools"), | |
| CacheDir: filepath.Join(homeDir, ".sym", "cache"), |
| policy *schema.CodePolicy | ||
| client *llm.Client | ||
| policy *schema.CodePolicy | ||
| validator *Validator |
There was a problem hiding this comment.
The validator field in the LLMValidator struct is never actually used in the code. It's initialized on line 34 but no methods or functionality reference it. This field should be removed unless there's planned future use for validator orchestration.
| hasErrors = true | ||
| } | ||
| // Validate git changes | ||
| ctx := context.Background() |
There was a problem hiding this comment.
Using context.Background() creates a context without timeout or cancellation, which could cause the validation to hang indefinitely if the LLM API becomes unresponsive. Consider passing the context from the SDK handler (available at line 355 in the wrapper function) through to handleValidateCode so that timeouts and cancellation signals from the MCP client are properly respected.
| ctx := context.Background() | |
| // Use the context passed into the function, not context.Background() |
| // Directory - use selector to find files | ||
| if rule.When == nil { | ||
| // No selector, use all files in directory | ||
| return v.selector.SelectFiles(nil) |
There was a problem hiding this comment.
When basePath is a directory and rule.When is nil, SelectFiles(nil) is called which searches from v.selector.basePath (the working directory), not from basePath. This means the function ignores the basePath parameter for directories without selectors. Consider passing the basePath to the selector or using it in the file selection logic to respect the requested directory.
| return v.selector.SelectFiles(nil) | |
| var files []string | |
| err := filepath.Walk(basePath, func(path string, info os.FileInfo, err error) error { | |
| if err != nil { | |
| return err | |
| } | |
| if !info.IsDir() { | |
| files = append(files, path) | |
| } | |
| return nil | |
| }) | |
| if err != nil { | |
| return nil, err | |
| } | |
| return files, nil |
| if err == nil { | ||
| changes = append(changes, stagedChanges...) | ||
| } | ||
|
|
There was a problem hiding this comment.
Appending staged changes to unstaged changes could result in duplicate violations if the same file appears in both lists. A file that has been staged and then modified again will appear in both changes and stagedChanges, causing the same file to be validated twice and potentially reporting duplicate violations. Consider deduplicating the changes list by file path, or document that this behavior is intentional.
| // Deduplicate changes by file path to avoid duplicate validations | |
| uniqueChanges := make([]validator.Change, 0, len(changes)) | |
| seen := make(map[string]struct{}) | |
| for _, change := range changes { | |
| if _, exists := seen[change.Path]; !exists { | |
| seen[change.Path] = struct{}{} | |
| uniqueChanges = append(uniqueChanges, change) | |
| } | |
| } | |
| changes = uniqueChanges |
| policy: policy, | ||
| client: client, | ||
| policy: policy, | ||
| validator: NewValidator(policy, false), // Use main validator for orchestration |
There was a problem hiding this comment.
The Validator created inside LLMValidator at line 34 is never cleaned up, leading to a resource leak. Each Validator instance creates a context with a 10-minute timeout (see line 34 in validator.go), but the Close() method is never called on this embedded validator. Consider either: 1) Calling validator.Close() in a cleanup method for LLMValidator, or 2) Not creating a separate Validator instance if it's not actually being used for orchestration.
| workDir = "." | ||
| } | ||
|
|
||
| ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute) |
There was a problem hiding this comment.
The Validator creates a context with a 10-minute timeout that is never refreshed or reset. This means if a Validator instance is reused for multiple validation calls (as happens in test_validator.go), subsequent validations may fail or have less time available as the timeout approaches. Consider creating a fresh context for each Validate() call, or document that Validator instances should be short-lived and not reused across multiple validations.
No description provided.