Add repo-memory tool for git-based persistent storage with threat detection integration #5433

Merged
pelikhan merged 23 commits into main from copilot/add-repo-memory-tool
Dec 5, 2025

Copilot AI commented Dec 3, 2025

Implementation Plan for repo-memory Tool ✅

This PR adds a "repo-memory" tool that stores persistent data in git branches rather than GitHub Actions cache, enabling version-controlled memory with automatic conflict resolution and threat detection integration.

Requirements Checklist:

  • Add repo-memory configuration to frontmatter schema
  • Create RepoMemoryConfig and RepoMemoryToolConfig types
  • Parse repo-memory configuration from tools section
  • Add compiler logic to generate git clone steps
  • Generate prompt section informing agent about repo-memory location
  • Add post-run steps to push changes back to repo-memory branch
  • Handle merge conflicts (current version wins)
  • Support all configuration options
  • Files stored under memory/{id}/ directory
  • Add validation for file size and count limits
  • Write unit tests and integration tests
  • All tests passing ✅
  • Test workflow created ✅
  • Deep-report and daily-firewall-report workflows updated ✅
  • Default max file size changed to 10KB ✅
  • Merged main branch ✅
  • Updated dev.md to use repo-memory for poems ✅
  • Separate push_repo_memory job implemented ✅
  • Threat detection integration - only push if safe ✅
  • JavaScript implementation using github-script action ✅
  • Reuse compiler helper for git configuration ✅
  • Copy files from artifact to git repo before pushing ✅
  • Sparse checkout optimization ✅
  • COMPLETE: Handle sparse checkout in JavaScript ✅

Latest Changes (Sparse Checkout Handling):

JavaScript Update for Sparse Checkout:

  • Added git sparse-checkout disable command before branch operations
  • Ensures full repository access after sparse checkout was configured
  • Gracefully handles cases where sparse checkout wasn't enabled (catches and logs error)
  • All subsequent git operations work correctly with the repository

Why This Is Needed:
When sparse-checkout: . is used in actions/checkout, only git metadata is initialized without checking out file contents. Before we can work with the memory branch, we need to disable sparse checkout so we can:

  1. Checkout the memory branch with full file access
  2. Copy files from artifacts to the repository
  3. Stage, commit, and push changes

Implementation Details:

// execSync comes from Node's child_process; `core` is supplied by actions/github-script
const { execSync } = require("child_process");

// Disable sparse checkout to work with full branch content
core.info("Disabling sparse checkout...");
try {
  execSync("git sparse-checkout disable", { stdio: "pipe" });
} catch (error) {
  // Ignore if sparse checkout wasn't enabled
  core.info("Sparse checkout was not enabled or already disabled");
}

Benefits:

  • Workflow starts faster (no unnecessary file downloads from main branch)
  • Git operations work correctly on the memory branch
  • Robust handling of both sparse and full checkout configurations
  • Clear logging for debugging

Complete Workflow:

  1. Checkout with sparse-checkout: . (fast, no file downloads)
  2. Configure git with credentials and user
  3. Disable sparse checkout in JavaScript
  4. Checkout memory branch (now works correctly)
  5. Copy and validate files from artifact
  6. Commit and push changes
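
The steps above can be sketched as an ordered list of git invocations. This is a minimal illustration, not the contents of push_repo_memory.cjs: the helper name `planGitCommands`, the commit message, and the exact git arguments are assumptions. Files from the artifact would be copied between the `prepare` and `publish` phases.

```javascript
// Hypothetical planner: returns git argument lists for the push job, in order.
// Nothing is executed here, so the plan can be inspected or tested.
function planGitCommands(branch, branchExists) {
  const checkout = branchExists
    ? [
        ["fetch", "origin", `${branch}:${branch}`],
        ["checkout", branch],
      ]
    : [
        // No memory branch yet: start an orphan branch with an empty index.
        ["checkout", "--orphan", branch],
        ["rm", "-r", "-f", "--ignore-unmatch", "."],
      ];

  const publish = [
    ["add", "--all"],
    ["commit", "-m", `Update repo memory (${branch})`],
  ];
  if (branchExists) {
    // On conflict, "-X ours" keeps the current run's version.
    publish.push(["pull", "--no-rebase", "-X", "ours", "origin", branch]);
  }
  publish.push(["push", "origin", `HEAD:${branch}`]);

  return {
    prepare: [["sparse-checkout", "disable"], ...checkout],
    publish,
  };
}
```

Each entry could be passed to `child_process.execFileSync("git", args)`. A real script would also need to tolerate a failing `sparse-checkout disable` and skip the commit when `git status --porcelain` reports no changes.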

All tests passing, workflows compiled successfully.

Original prompt

Add a "repo-memory" tool that acts similarly to the cache-memory tool but is stored in a git branch.

  • add to front matter
  • use current repository by default but allow specifying target-repo
  • memory-id is the workflow id or a user provided identifier
  • file glob filter to specify the type of files allowed
  • maximum size per file is 1mb
  • maximum file count per commit is 100
  • use memory/{memory-id} as the default branch name but it can be specified
  • if branch is missing, creates orphaned branch
  • files go under the "memory/{id}"

The compiler adds a step to clone the repo/branch/folder into a /tmp/ location (make sure git is configured to access it).
A prompt is added to explain where the files are and that it is a memory repo.
After the agent run, the changes are pushed to the branch. If there are merge conflicts, the current version wins.
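
The defaults requested in the prompt above could be resolved roughly as follows. This is a sketch only: the function name and the camelCase option names are hypothetical, and the values mirror the original prompt (the merged PR later lowered the default max file size to 10KB).

```javascript
// Hypothetical resolver for the options listed in the prompt.
// `user` holds whatever the workflow author specified; missing keys fall back
// to the defaults described above.
function resolveRepoMemoryConfig(workflowId, user = {}) {
  const memoryId = user.memoryId ?? workflowId;
  return {
    memoryId,
    targetRepo: user.targetRepo ?? null, // null stands for "current repository"
    branch: user.branch ?? `memory/${memoryId}`,
    fileGlob: user.fileGlob ?? [], // empty list: no file-type restriction
    maxFileSize: user.maxFileSize ?? 1024 * 1024, // 1 MB, per the prompt
    maxFileCount: user.maxFileCount ?? 100,
    directory: `memory/${memoryId}`, // where files live inside the branch
  };
}
```

For example, `resolveRepoMemoryConfig("deep-report")` yields the branch `memory/deep-report` and a limit of 100 files per commit.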


Copilot AI and others added 3 commits December 3, 2025 16:13
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add repo-memory tool for file caching in git branches" to "Add repo-memory tool for git-based persistent storage" Dec 3, 2025
Copilot AI requested a review from pelikhan December 3, 2025 16:35
@pelikhan pelikhan marked this pull request as ready for review December 4, 2025 07:18

@pelikhan pelikhan left a comment

@copilot

  • add test workflow in test workflow folder that uses the repo-memory
  • update deepreport agentic workflow with deep memory to store long duration insights and notes. Only allow markdown files.

…ry with markdown-only constraint

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 4, 2025

@copilot

  • add test workflow in test workflow folder that uses the repo-memory
  • update deepreport agentic workflow with deep memory to store long duration insights and notes. Only allow markdown fil...

Done! Changes in commit dc19331:

Test workflow added:

  • Created .github/workflows/tests/test-repo-memory.md that tests repo-memory persistence across runs
  • Validates file accumulation, counter increments, and timestamp tracking

Deep-report workflow updated:

  • Changed from cache-memory to repo-memory with branch memory/deep-report
  • Added file-glob: ["*.md"] constraint - only markdown files allowed
  • Updated all memory paths from /tmp/gh-aw/cache-memory/ to /tmp/gh-aw/repo-memory-default/memory/default/
  • Updated instructions to use .md extensions for all memory files (patterns, trends, timestamps, flagged items)

The workflow now stores long-term insights in version-controlled markdown files on the memory/deep-report branch.
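
A file-glob constraint such as `["*.md"]` can be checked with a small matcher. The sketch below is an assumption about how such a filter might behave, not the PR's implementation: `globToRegExp` and `isAllowed` are hypothetical names, only the `*` wildcard is supported, and `*` is not allowed to cross a `/`.

```javascript
// Convert a simple glob (only "*" is special) into an anchored RegExp.
function globToRegExp(glob) {
  // Escape regex metacharacters, leaving "*" for the next step.
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, "[^/]*")}$`);
}

// A file is allowed when no globs are configured or when any glob matches.
function isAllowed(fileName, globs) {
  return globs.length === 0 || globs.some(g => globToRegExp(g).test(fileName));
}
```

With `["*.md"]`, `notes.md` is accepted while `notes.txt` and `sub/notes.md` are rejected.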

Copilot AI requested a review from pelikhan December 4, 2025 13:03

pelikhan commented Dec 4, 2025

@copilot move test-repo-memory to pkg/cli/workflows

@pelikhan pelikhan left a comment

@copilot set default max file size to 10kb

…ws and rename to test-copilot-repo-memory

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 4, 2025

@copilot move test-repo-memory to pkg/cli/workflows

Done in commit 61f5a6c. Moved test workflow from .github/workflows/tests/test-repo-memory.md to pkg/cli/workflows/test-copilot-repo-memory.md following the naming convention of other test workflows in that directory.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 4, 2025

@copilot set default max file size to 10kb

Done in commit 6be5281. Changed default max file size from 1MB (1048576 bytes) to 10KB (10240 bytes). Updated in:

  • repo_memory.go: All 4 default value locations
  • repo_memory_test.go: Test expectations
  • Both schema files: main_workflow_schema.json and included_file_schema.json

All tests pass. Generated workflows now display "10240 bytes (0.01 MB)" in agent prompts and validate with "max: 10240 bytes" in push steps.
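
The "10240 bytes (0.01 MB)" text follows from dividing by 1024 × 1024 and rounding to two decimals. A minimal sketch of that arithmetic, with `describeLimit` as a hypothetical helper name:

```javascript
// Format a byte limit the way the generated prompt text is described above.
function describeLimit(bytes) {
  const megabytes = (bytes / (1024 * 1024)).toFixed(2);
  return `${bytes} bytes (${megabytes} MB)`;
}
```

Here 10240 / 1048576 is about 0.0098, which rounds to 0.01.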

pelikhan commented Dec 4, 2025

@copilot update daily firewall report agentic workflow to use repo-memory

…script action

- Created push_repo_memory.cjs JavaScript implementation
- Removed push_repo_memory.sh shell script
- Updated repo_memory.go to use github-script action instead of inline bash
- Added embed directive for push_repo_memory.cjs in js.go
- Removed shell script embed from sh.go
- JavaScript implementation provides same functionality:
  - Git user configuration as github-actions[bot]
  - File pattern validation with glob filters
  - File size and count validation
  - Merge conflict resolution with ours strategy
  - Environment variable-based configuration
- All tests passing, code formatted and linted
- Workflows compile successfully with new JavaScript-based implementation

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 5, 2025

Just as a heads up, I was blocked by some firewall rules while working on your feedback.

Warning: Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/user
    • Triggering command: /usr/bin/gh gh api user --jq .login (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from pelikhan December 5, 2025 07:01

pelikhan commented Dec 5, 2025

@copilot reuse the helper code that configures the GitHub user from other jobs in the compiler; do not set it up in the JavaScript

…avaScript

- Added call to generateGitConfigurationSteps() in buildPushRepoMemoryJob
- Removed git user configuration from push_repo_memory.cjs JavaScript file
- Git user now configured globally as github-actions[bot] via compiler helper
- Fixed FILE_GLOB_FILTER YAML quoting to prevent alias interpretation
- Updated test expectations from shell variables to JavaScript template literals
- All tests passing, code formatted and linted

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 5, 2025

@copilot reuse the helper code that configures the GitHub user from other jobs in the compiler; do not set it up in the JavaScript

Done in commit b11490b.

Changes:

  • Added call to generateGitConfigurationSteps() after checkout in push_repo_memory job
  • Removed git user configuration from push_repo_memory.cjs JavaScript
  • Git now configured using standard compiler helper (same as create-pull-request, push-to-pull-request-branch, etc.)
  • Fixed FILE_GLOB_FILTER quoting to prevent YAML alias issues
  • Updated test expectations

The job now uses the shared git configuration helper that sets up github-actions[bot] globally, consistent with all other git operations in the codebase.
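
On the quoting fix: in YAML, a plain scalar that begins with `*` is read as an alias reference, so an unquoted `*.md` does not parse as a string. The compiler itself is written in Go; the JavaScript sketch below only illustrates the idea, and `envLine` is a hypothetical helper. It relies on a JSON string also being a valid YAML double-quoted scalar.

```javascript
// Emit a YAML "key: value" line with the value always double-quoted,
// so a leading "*" (alias) or "&" (anchor) is treated as plain text.
function envLine(name, value) {
  return `${name}: ${JSON.stringify(value)}`;
}
```

For instance, `envLine("FILE_GLOB_FILTER", "*.md")` produces `FILE_GLOB_FILTER: "*.md"`.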

github-actions bot commented Dec 5, 2025

⚠️ Dev Hawk Report - Failure Analysis

Workflow Run: #3038

  • Status: failure
  • Commit: b11490b
  • Duration: ~92 seconds
  • Failed Job: push_repo_memory

Root Cause Analysis

The workflow failed in the push_repo_memory job during step 5: "Push repo-memory changes (default)". This job runs after the main agent completes to persist memory data to a git branch.

Key Issue: The recent refactoring removed git user configuration from the JavaScript file (push_repo_memory.cjs) and expected it to be configured via the compiler helper's generateGitConfigurationSteps(). However, the generated workflow shows the git configuration step was successfully added (step 3: "Configure Git credentials" completed successfully), which means the failure occurred during the actual push operation in the JavaScript logic.

Changes in Failed Commit

The commit modified:

  • .github/workflows/dev.lock.yml - Added git configuration step
  • .github/workflows/daily-firewall-report.lock.yml - Added git configuration step
  • .github/workflows/deep-report.lock.yml - Added git configuration step
  • ⚠️ pkg/workflow/js/push_repo_memory.cjs - Removed 9 lines (git config logic)
  • pkg/workflow/repo_memory.go - Added compiler helper call
  • pkg/workflow/repo_memory_integration_test.go - Updated test expectations

Error Pattern

Looking at the job execution timeline:

  1. ✅ Checkout repository (succeeded)
  2. ✅ Configure Git credentials (succeeded)
  3. ✅ Download repo-memory artifact (succeeded)
  4. ❌ Push repo-memory changes - FAILED IMMEDIATELY (< 1 second)

The immediate failure suggests a JavaScript runtime error rather than a git operation failure. The removed git configuration code may have been referenced elsewhere in the script, or the script may be attempting git operations that still expect inline configuration.

Recommended Actions

  • Check JavaScript syntax: The removal of git config lines may have left orphaned code or broken the control flow
  • Verify git operations in JS: Ensure the JavaScript file doesn't still try to configure git or reference the removed code
  • Review execSync calls: Check if any git commands in the JS file assume local git config that was removed
  • Add error handling: The JavaScript should have try-catch blocks to report clearer error messages
  • Test locally: Run the compiled workflow locally or trigger a test with better error visibility

Investigation Notes

What we know:

  • Git credentials were configured successfully via the compiler helper
  • The failure occurred within the github-script action itself
  • The job completed in < 1 second, indicating an early failure (not a git network issue)
  • This is the first run after the refactoring that removed git config from JavaScript

Most likely cause:
The JavaScript file (push_repo_memory.cjs) likely has:

  1. A reference to removed code/variables, causing a ReferenceError
  2. Missing error handling that would show the actual error
  3. An assumption that git config would be set differently than the global config now provides

Next steps:
Review the actual JavaScript file content to identify what was removed and what remains, ensuring no broken references.
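
The error-handling recommendation could look something like the wrapper below. It is a sketch under assumptions: `runWithReporting` is a hypothetical name, and `core` is expected to be the object provided by actions/github-script, whose `setFailed` marks the step as failed with a message.

```javascript
// Run the push logic and surface any exception as a readable step failure.
function runWithReporting(core, task) {
  try {
    task();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    core.setFailed(`push_repo_memory failed: ${message}`);
  }
}
```

With this in place, an early ReferenceError would appear in the step output with its message instead of as an unexplained failure.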

🦅 Observed from above by Dev Hawk

pelikhan commented Dec 5, 2025

@copilot the push to repo memory is missing the step that copies the files from artifact folder into the git repository checkout folder. Then the push must be done from the git folder.

Use javascript to copy each file and validate file name, size restrictions on each file.

…ushing

Major architectural fix for push_repo_memory job:
- Download artifacts to temp location
- Checkout/create memory branch in git repo
- Copy files from artifact to git repo with validation
- Validate file names, sizes, and counts before copying
- Commit and push from git repository (not artifact directory)

JavaScript implementation changes:
- Added ARTIFACT_DIR and MEMORY_ID environment variables
- Removed MEMORY_DIR (was confusing - artifact vs repo)
- Files copied from artifact/memory/{id}/ to repo/memory/{id}/
- Validation happens during copy (not after git operations)
- Git operations now work correctly on actual repository

Go compiler changes:
- Updated buildPushRepoMemoryJob to pass ARTIFACT_DIR and MEMORY_ID
- Changed from MEMORY_DIR to ARTIFACT_DIR in environment variables

Test updates:
- Updated integration test expectations for new validation messages
- All unit tests passing

This fixes the critical issue where git operations were being performed on
the artifact directory instead of the checked-out repository.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 5, 2025

@copilot the push to repo memory is missing the step that copies the files from artifact folder into the git repository checkout folder. Then the push must be done from the git folder.

Use javascript...

Done in commit 107d9ab.

Fixed the workflow:

  1. Downloads artifacts to temp location (/tmp/gh-aw/repo-memory-{id}/)
  2. Checks out the repository (already done via actions/checkout)
  3. Fetches/creates the memory branch in the repository
  4. Copies files from artifact to repository with JavaScript validation:
    • File name validation against glob patterns
    • File size validation (fails if > MAX_FILE_SIZE)
    • File count validation (fails if > MAX_FILE_COUNT)
  5. Commits and pushes from the repository directory

Environment variables changed:

  • MEMORY_DIR → ARTIFACT_DIR + MEMORY_ID
  • Files read from: {ARTIFACT_DIR}/memory/{MEMORY_ID}/
  • Files written to: {GITHUB_WORKSPACE}/memory/{MEMORY_ID}/

All validations happen in JavaScript during the copy operation, before any git operations.
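
The copy-with-validation step might be sketched as below. This is not the code from push_repo_memory.cjs: the function name, the options object, and the error wording are assumptions, only top-level files are considered, and the `isAllowed` callback stands in for the glob filter. This sketch validates every file before copying any, so a violation leaves the repository untouched.

```javascript
const fs = require("fs");
const path = require("path");

// Copy files from {artifactDir}/memory/{memoryId}/ to
// {workspace}/memory/{memoryId}/, rejecting anything that breaks the limits.
function copyMemoryFiles({ artifactDir, workspace, memoryId, maxFileSize, maxFileCount, isAllowed }) {
  const source = path.join(artifactDir, "memory", memoryId);
  const target = path.join(workspace, "memory", memoryId);

  const files = fs
    .readdirSync(source, { withFileTypes: true })
    .filter(entry => entry.isFile())
    .map(entry => entry.name);

  if (files.length > maxFileCount) {
    throw new Error(`Too many files: ${files.length} exceeds max ${maxFileCount}`);
  }
  for (const name of files) {
    if (!isAllowed(name)) {
      throw new Error(`File name not allowed by glob filter: ${name}`);
    }
    const size = fs.statSync(path.join(source, name)).size;
    if (size > maxFileSize) {
      throw new Error(`File ${name} (${size} bytes) exceeds max ${maxFileSize} bytes`);
    }
  }

  // Everything passed validation: perform the copy.
  fs.mkdirSync(target, { recursive: true });
  for (const name of files) {
    fs.copyFileSync(path.join(source, name), path.join(target, name));
  }
  return files;
}
```

In the job, `artifactDir` and `memoryId` would come from the ARTIFACT_DIR and MEMORY_ID environment variables and `workspace` from GITHUB_WORKSPACE; git staging and the push would follow from the repository directory.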

pelikhan commented Dec 5, 2025

@copilot configure actions/checkout to only configure git, but not checkout the repo content in the push repo memory job

…sh_repo_memory job

- Added sparse-checkout: . parameter to actions/checkout in push_repo_memory job
- This avoids downloading repository files since we immediately checkout the memory branch
- Reduces unnecessary data transfer and speeds up the workflow
- Git is properly configured but no repository content is checked out
- All tests passing, workflows recompiled successfully

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 5, 2025

@copilot configure actions/checkout to only configure git, but not checkout the repo content in the push repo memory job

Done in commit 4da5496. Added sparse-checkout: . parameter to actions/checkout step in push_repo_memory job. This configures git without downloading repository file contents, since we immediately checkout the memory branch anyway. Reduces data transfer and improves workflow performance.

github-actions bot commented Dec 5, 2025

⚠️ Dev Hawk Report - Failure Analysis

Workflow Run: #3040

  • Status: failure
  • Commit: 4da5496
  • Duration: ~1m 23s

Root Cause Analysis

The workflow failed in the push_repo_memory job during the "Push repo-memory changes (default)" step. This failure appears to be related to the sparse checkout optimization that was just implemented in this commit.

What Happened

  1. activation job completed successfully
  2. agent job completed successfully (created poem and saved to repo memory)
  3. push_repo_memory job failed at step 5

The Sparse Checkout Change

The commit message indicates:

Configure checkout to only setup git without checking out files in push_repo_memory job

  • Added sparse-checkout: . parameter to actions/checkout in push_repo_memory job

The workflow now uses:

- name: Checkout repository
  uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
  with:
    persist-credentials: false
    sparse-checkout: .

Likely Issue

The push_repo_memory job's JavaScript script (step 5) attempts to:

  1. Fetch and checkout the memory branch
  2. Copy files from the artifact to the repository
  3. Commit and push changes

Problem: With sparse-checkout: ., the git repository is initialized but has no working tree content. The script then tries to execute git commands that may expect a proper working directory structure. The error likely occurred when:

  • Running git checkout "${branchName}" after fetching
  • Running git status --porcelain to check for changes
  • Or during the git push operation

Recommended Actions

  • Review the sparse-checkout configuration: The sparse-checkout: . setting may be too restrictive for this use case. The push job needs to work with git branches and may need at least minimal working tree setup.

  • Alternative approach: Instead of using sparse-checkout: ., consider:

    • Using fetch-depth: 1 without sparse-checkout (faster but still checks out files)
    • Or modifying the JavaScript to handle the sparse checkout state properly
  • Check workflow logs: Access the full logs for step 5 to see the exact error message (logs may not be available immediately due to GitHub's log processing delay)

  • Test the fix: The optimization is sound in theory, but the implementation needs to account for git operations that require a working tree

Investigation Notes

Timing: The logs show this happened very quickly (within seconds), suggesting an immediate failure rather than a timeout or resource issue.

Pattern: This is a configuration error, not a code logic error. The sparse checkout optimization conflicts with how the git operations are performed in the push script.

Context: This optimization makes sense for the agent job (where we immediately switch to the memory branch anyway), but the push_repo_memory job has different requirements since it needs to:

  1. Start on the default branch (or a clean slate)
  2. Checkout/create the memory branch
  3. Manipulate files in the working tree
  4. Commit and push

Dev Hawk is monitoring your workflow runs and will update this analysis when more information becomes available. 🦅

🦅 Observed from above by Dev Hawk

pelikhan commented Dec 5, 2025

@copilot update the JavaScript that pushes the changes to assume a sparse checkout and run git commands accordingly

- Added git sparse-checkout disable before branch operations
- Ensures repository is fully accessible after sparse checkout setup
- Handles case where sparse checkout wasn't enabled (ignores error)
- All git operations now work correctly with sparse checkout configuration
- Updated compiled workflows with new JavaScript implementation
- All tests passing

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Dec 5, 2025

@copilot update the JavaScript that pushes the changes to assume a sparse checkout and run git commands accordingly

Done in commit 2d675bd. Added git sparse-checkout disable command in JavaScript before checking out the memory branch. This ensures the repository is fully accessible for branch operations after sparse checkout was configured. The command gracefully handles cases where sparse checkout wasn't enabled by catching and logging the error.

github-actions bot commented Dec 5, 2025

✅ Dev Hawk Report - Success

Workflow Run: #3041

  • Status: ✅ success
  • Commit: 2d675bd
  • Duration: 1m 35s
  • Triggered: 2025-12-05T15:38:21Z

The Dev workflow completed successfully! 🎉

What Was Validated

This run validated the latest sparse checkout handling improvements:

  • Git sparse checkout properly disabled before branch operations
  • Memory branch checkout works correctly after sparse checkout configuration
  • File copying from artifacts to git repository functions as expected
  • Commit and push operations succeed with proper repository access

The workflow confirms that all build, test, and compilation steps pass with the updated push_repo_memory JavaScript implementation.

🦅 Observed from above by Dev Hawk

github-actions bot commented Dec 5, 2025

✅ Dev Hawk Report - Success

Workflow Run: #3042

  • Status: ✅ success
  • Commit: 2d675bd
  • Duration: ~1 minute 34 seconds
  • Branch: copilot/add-repo-memory-tool

The Dev workflow completed successfully! 🎉

Summary

This workflow run validated the sparse checkout handling improvements for the repo-memory tool:

  • Updated push_repo_memory JavaScript to properly disable sparse checkout before branch operations
  • Ensures repository is fully accessible after sparse checkout setup
  • All git operations now work correctly with sparse checkout configuration
  • All tests passing ✅

The implementation correctly handles the case where sparse-checkout: . is used in actions/checkout, ensuring that the memory branch can be properly accessed, modified, and pushed.

🦅 Observed from above by Dev Hawk

@pelikhan pelikhan merged commit 7f04a2a into main Dec 5, 2025
11 checks passed
@pelikhan pelikhan deleted the copilot/add-repo-memory-tool branch December 5, 2025 16:18