@nandastone nandastone commented Jan 8, 2026

Adds support for .stashignore files that allow users to exclude files and directories from scanning using gitignore-style patterns. Closes #1139.

Some LLM usage as Go is not my preferred language.

Changes

  • New StashIgnoreFilter in pkg/file/stashignore.go that checks for .stashignore files during scans
  • Integrated into scanFilter alongside existing exclusion logic. The scanner doesn't appear to expose plugin hooks for file filtering.
  • Supports nested .stashignore files (patterns cascade from library root to subdirectories)
  • Uses go-gitignore library for pattern matching
  • Added user documentation in Tasks.md

Usage

Place a .stashignore file in any directory within your library:

# Ignore temp files
*.tmp
temp/

# But keep this one
!important.tmp
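To illustrate how an example like the one above is evaluated, here is a minimal sketch of gitignore-style matching in which the last matching rule wins. It is not the go-gitignore library and not the code in this PR: the `rule` type and the `parseRules` and `ignored` functions are hypothetical names, and the sketch only handles slash-free globs, `!` negation, and trailing-`/` directory patterns. It does not model anchored patterns, `**`, or the gitignore rule that a file cannot be re-included once its parent directory is excluded.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// rule is one parsed line of an ignore file (hypothetical type for this sketch).
type rule struct {
	pattern string // glob with any leading "!" and trailing "/" removed
	negate  bool   // line began with "!": re-include matching paths
	dirOnly bool   // line ended with "/": match directories only
}

// parseRules turns file content into rules, skipping blank lines and "#" comments.
func parseRules(content string) []rule {
	var rules []rule
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		var r rule
		if strings.HasPrefix(line, "!") {
			r.negate = true
			line = strings.TrimPrefix(line, "!")
		}
		if strings.HasSuffix(line, "/") {
			r.dirOnly = true
			line = strings.TrimSuffix(line, "/")
		}
		r.pattern = line
		rules = append(rules, r)
	}
	return rules
}

// ignored reports whether rel, a slash-separated file path relative to the
// directory holding the ignore file, is excluded. Rules are applied in order,
// so the last rule that matches decides the outcome.
func ignored(rules []rule, rel string) bool {
	segments := strings.Split(rel, "/")
	result := false
	for _, r := range rules {
		for i, seg := range segments {
			if r.dirOnly && i == len(segments)-1 {
				continue // the final segment is the file itself, not a directory
			}
			if ok, _ := path.Match(r.pattern, seg); ok {
				result = !r.negate
				break
			}
		}
	}
	return result
}

func main() {
	rules := parseRules("# Ignore temp files\n*.tmp\ntemp/\n\n# But keep this one\n!important.tmp\n")
	for _, p := range []string{"scratch.tmp", "important.tmp", "temp/clip.mp4", "videos/clip.mp4"} {
		fmt.Printf("%-16s ignored=%v\n", p, ignored(rules, p))
	}
}
```

Because rules are applied in file order, `!important.tmp` only takes effect when it appears after `*.tmp`.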

Test Plan

  • 16 unit tests covering pattern syntax (wildcards, negation, nested files, etc.)
  • 4 integration tests with real scanner and database
  • All existing tests pass

@feederbox826
Collaborator

Why not extend or interface with it from the existing ignores menu?

@nandastone
Author

nandastone commented Jan 9, 2026

Why not extend or interface with it from the existing ignores menu?

Different mental models:

  • Global excludes = "never scan X anywhere" (admin policy).
  • .stashignore = "in this folder, skip these" (local override).

Both are valid, complementary use cases.

The .stashignore file lives with your content. You see what's ignored by looking at the folder. You can share it, back it up, version control it, etc.

And finally, glob syntax (as opposed to regex) is a familiar convention: .gitignore, .dockerignore, .npmignore, etc. The different syntax is why the PR doesn't "append" the discovered file rules to the existing regex rules system.

We also wouldn't want to try to "sync" the .stashignore rules to the regex rules; a two-way sync would be brittle and complex.

@WithoutPants WithoutPants added the feature Pull requests that add a new feature label Jan 12, 2026
@WithoutPants WithoutPants added this to the Version 0.31.0 milestone Jan 12, 2026
Collaborator

@WithoutPants WithoutPants left a comment

Static code review only

Comment on lines 43 to 46
// Always accept .stashignore files themselves so they can be read.
if filepath.Base(path) == stashIgnoreFilename {
	return true
}
Collaborator

Is this true? I don't think we actually need to accept the files themselves.

Author

Good catch. We load .stashignore files directly from the file system, not via our scanner, so this exception isn't needed. Removed.

Comment on lines 153 to 158
patterns, err := f.loadIgnoreFile(stashIgnorePath)
if err != nil || patterns == nil {
	// Cache negative result (file doesn't exist or has no patterns).
	f.cache.Store(dir, &ignoreEntry{patterns: nil, dir: dir})
	return nil
}
Collaborator

Shouldn't we log something if the file fails to be loaded?

Author

Now logging unexpected file system errors (e.g. failing to open the file). "Expected" errors such as "file doesn't exist" are still ignored.
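For illustration, the expected/unexpected split can be sketched as follows. This is a self-contained approximation rather than the PR's actual code: `loadPatterns`, `patternsFor`, and `demo` are hypothetical names, and the standard `log` package stands in for whatever logger the project uses.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// loadPatterns reads an ignore file and returns its non-blank, non-comment
// lines. A missing file is the expected case and yields (nil, nil); any other
// failure is returned so the caller can log it.
func loadPatterns(file string) ([]string, error) {
	data, err := os.ReadFile(file)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, fmt.Errorf("reading %s: %w", file, err)
	}
	var patterns []string
	for _, line := range strings.Split(string(data), "\n") {
		line = strings.TrimSpace(line)
		if line != "" && !strings.HasPrefix(line, "#") {
			patterns = append(patterns, line)
		}
	}
	return patterns, nil
}

// patternsFor loads the ignore file in dir. Unexpected errors are logged and
// treated as "no patterns" so that a scan can continue.
func patternsFor(dir string) []string {
	patterns, err := loadPatterns(filepath.Join(dir, ".stashignore"))
	if err != nil {
		log.Printf("warning: skipping unreadable ignore file: %v", err)
		return nil
	}
	return patterns
}

// demo exercises the three cases in a temporary directory.
func demo() error {
	dir, err := os.MkdirTemp("", "stashignore-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	// Case 1: no ignore file. This is expected, so nothing is logged.
	if p := patternsFor(dir); p != nil {
		return fmt.Errorf("missing file: got %v, want nil", p)
	}

	// Case 2: a readable ignore file with one comment and one pattern.
	file := filepath.Join(dir, ".stashignore")
	if err := os.WriteFile(file, []byte("# temp files\n*.tmp\n"), 0o644); err != nil {
		return err
	}
	if p := patternsFor(dir); len(p) != 1 || p[0] != "*.tmp" {
		return fmt.Errorf("readable file: got %v, want [*.tmp]", p)
	}

	// Case 3: the path exists but is a directory, so reading it fails with an
	// error other than "not exist" and must be surfaced.
	blocked := filepath.Join(dir, "sub", ".stashignore")
	if err := os.MkdirAll(blocked, 0o755); err != nil {
		return err
	}
	if _, err := loadPatterns(blocked); err == nil {
		return errors.New("directory in place of file: expected an error")
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("all cases behaved as expected")
}
```

Using `errors.Is` with `fs.ErrNotExist` keeps the check correct even when the underlying error is wrapped.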

// collectIgnoreEntries gathers all ignore entries from library root to the given directory.
// It walks up the directory tree from dir to libraryRoot and returns entries in order
// from root to most specific.
func (f *StashIgnoreFilter) collectIgnoreEntries(dir string, libraryRoot string) []*ignoreEntry {
Collaborator

Doing this for every scanned file is likely to cause non-trivial overhead. We would probably benefit from caching the ignoreEntry list for a given directory path. It needn't be exhaustive: an LRU cache would probably do and would save recalculating everything. It could also be used to shortcut the calculation for sub-directories.

Author

Great point, I've implemented both optimisations:

  1. LRU cache for collected entries, keyed by dir + libraryRoot. Files in the same directory no longer re-walk the tree.
  2. If the parent's entries are cached, we only check whether the child dir has a .stashignore and, if it does, extend the parent's list, which avoids walking the tree.

All existing tests pass, but I didn't add tests specifically for the optimisations. Is that OK? If not, how might I approach those tests?
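One possible approach to testing the optimisations, sketched under assumptions: inject the function that reads an ignore file, count how often it is called per directory, and assert that repeated or sibling lookups do not trigger extra reads. Everything below is hypothetical (`collector`, `loader`, `countingLoader` are illustrative names, not the PR's types), a plain map stands in for the LRU cache so eviction is not modelled, and slash-separated paths via `path` are used instead of `filepath` to keep the example deterministic. Directories are assumed to lie under the library root.

```go
package main

import (
	"fmt"
	"path"
)

// loader returns the patterns from the ignore file in dir, or nil if none exists.
type loader func(dir string) []string

// collector memoises the root-to-dir list of pattern sets per directory.
type collector struct {
	root  string
	load  loader
	cache map[string][][]string // a plain map stands in for the LRU cache
}

func newCollector(root string, load loader) *collector {
	return &collector{root: root, load: load, cache: map[string][][]string{}}
}

// collect returns the pattern sets that apply to dir, ordered from root to dir.
func (c *collector) collect(dir string) [][]string {
	if entries, ok := c.cache[dir]; ok {
		return entries // same-directory hit: no tree walk at all
	}
	var entries [][]string
	parent := path.Dir(dir)
	if dir != c.root && parent != dir {
		// Parent shortcut: reuse the parent's list, computing it at most once.
		entries = append(entries, c.collect(parent)...)
	}
	if patterns := c.load(dir); patterns != nil {
		entries = append(entries, patterns)
	}
	c.cache[dir] = entries
	return entries
}

// countingLoader serves patterns from an in-memory map and records how often
// each directory is consulted, so a test can assert on the number of lookups.
func countingLoader(files map[string][]string, calls map[string]int) loader {
	return func(dir string) []string {
		calls[dir]++
		return files[dir]
	}
}

func main() {
	files := map[string][]string{"/lib": {"*.tmp"}, "/lib/a/b": {"!keep.tmp"}}
	calls := map[string]int{}
	c := newCollector("/lib", countingLoader(files, calls))

	c.collect("/lib/a/b") // cold: loads /lib, /lib/a and /lib/a/b
	c.collect("/lib/a/b") // warm: served from the cache
	c.collect("/lib/a/c") // sibling: reuses /lib/a, loads only /lib/a/c

	fmt.Println("pattern sets for /lib/a/b:", len(c.collect("/lib/a/b")))
	fmt.Println("loader calls per directory:", calls)
}
```

In a real test the same idea would apply to the actual filter: swap the file read for a counting stub, scan several files in one directory and in a sibling directory, and assert that each directory's ignore file is consulted once.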

Development

Successfully merging this pull request may close these issues.

[Feature] add ignore file to ignore a folder during scanning
