4 changes: 2 additions & 2 deletions .claude/skills/create-eval/SKILL.md
@@ -17,7 +17,7 @@ When creating a full eval from scratch, you will need to:
1. Create one or more tasks, see [tasks.md](tasks.md)
2. Create a mcp config file, see [mcpConfig.md](mcpConfig.md)
3. Create an agent file, see [agent.md](agent.md)
-4. Create a top-level eval file that references the rest of the files, see [eval.yaml](eval.yaml)
+4. Create a top-level eval file that references the rest of the files, see [eval.md](eval.md)

However, in most cases you will not be creating an entirely new set of evals from scratch - you will just be modifying or
extending an existing eval. In this case, you will only need to modify some of these files.
@@ -27,7 +27,7 @@ extending an existing eval. In this case, you will only need to modify some of t
To run the evals, use:

```bash
-gevals run <path to eval yaml file>
+gevals eval <path to eval yaml file>
```

Review comment (Contributor), on the `gevals eval` line:

Nice catch, I forgot to update this when we added #19.

On that note, would you mind adding a line here explaining the `-run` flag? I assume Claude will often want to run only a subset of an eval suite.

The `gevals` binary may or may not be in the `$PATH`. If it is not in the path, ask the user where it is.
4 changes: 3 additions & 1 deletion README.md
@@ -69,10 +69,12 @@ kind: Agent
metadata:
name: "claude-code"
commands:
useVirtualHome: false
argTemplateMcpServer: "--mcp-config {{ .File }}"
+argTemplateAllowedTools: "mcp__{{ .ServerName }}__{{ .ToolName }}"
+allowedToolsJoinSeparator: ","
runPrompt: |-
-claude {{ .McpServerFileArgs }} --print "{{ .Prompt }}"
+claude {{ .McpServerFileArgs }} --strict-mcp-config --allowedTools "{{ .AllowedToolArgs }}" --print "{{ .Prompt }}"
```

**tasks/create-pod.yaml** - Test task:
2 changes: 1 addition & 1 deletion pkg/agent/config.go
@@ -45,7 +45,7 @@ type AgentCommands struct {
// A template command to run the agent with a prompt and some mcp servers
// the prompt will be in {{ .Prompt }}
// the servers will be in {{ .McpServerFileArgs }}
-// the allowed tools will be in {{ .AllowedTools }}
+// the allowed tools will be in {{ .AllowedToolArgs }}
RunPrompt string `json:"runPrompt"`

// An optional command to get the version of the agent