Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not rely on IFS from env when loading combined patterns #197

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

michaelfindlater
Copy link

Issue #, if available:
No issue #, but discovered this bug in the course of writing and testing a custom provider.

A secret provider is an executable that when invoked outputs prohibited patterns separated by new lines.

Regexes seemed to be broken up in a strange way, even when the provider output each correctly on a newline as per the docs (above).

Description of changes:

What?

  • load patterns delimited by '\n' rather than default IFS ' \t\n' (space, tab, newline)
  • use while instead of for to avoid modifying/relying on the global value of IFS

Why?

  • combined_patterns() concatenates patterns with | for use in a single grep command
  • using the default value of IFS means patterns containing a space character delimited by space, rather than newline
  • this causes unexpected behavior (false matches from words split by space)

Expected

For example, consider the following list of patterns:

^my-regex$
^my regex with spaces$

load_combined_patterns() should transform them into: ^my-regex$|^my regex with spaces$

Actual

The above example ends up as ^my-regex$|^my|regex|with|spaces$.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@creswick
Copy link
Contributor

Thanks @michaelfindlater! I'm concerned that this may break existing workflows, and may pose some cross-platform incompatibilities.

Could you add some tests to verify that (i) this change doesn't regress due to future changes, and (ii) confirm that the behavior is as expected when the IFS separator will work with both standard line ending sequences? (\r\n and \n)?

(Git-secrets uses bats for tests -- there is some general guidance in the README here https://github.com/sstephenson/bats )

@michaelfindlater
Copy link
Author

Hi @creswick! Thank you. As requested I have added some tests to help prevent this change regressing in the future.

Summary:

  • Windows \r\n shouldn't be a problem because \r is stripped within load_patterns(), which is called within load_combined_patterns()
  • As extra protection, I added \r to the IFS used within load_combined_patterns()
  • The tests call --scan with no arguments which invokes git-grep and, with arguments, invokes grep. The issue was only present when git-grep was invoked. I could reproduce the issue by running the tests with the old for loop.

I have tested this on macOS, Linux and WSL2.

If there's anything I can do further please do not hesitate to ask.

@amine-bee
Copy link

amine-bee commented Jan 4, 2023

Hello @michaelfindlater

I don't know if you also tested it but Windows \r\n are not stripped even if the code says so.
The patterns loaded using a custom provider on Windows are all messed up.

Feeding windows git secrets with patterns generated from an unix system - made the thing work. (\n)

@sparr
Copy link
Contributor

sparr commented Jun 21, 2023

@michaelfindlater Test coverage for @amine-bee's failure mode are also needed here, and a fix for whatever failure is produced by having a provider output \r\n-delimited patterns (or a confirmation that it's no longer happening).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants