Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The invocations to git grep fails if the total length of filepaths causes the command line be over 32767 #192

Open
lemonade-dm opened this issue Apr 14, 2021 · 2 comments · May be fixed by #153

Comments

@lemonade-dm
Copy link

lemonade-dm commented Apr 14, 2021

So in the git-secrets script below there is an invocation to git grep where it supplies the list of source controlled files to the command

GREP_OPTIONS= LC_ALL=C git grep -nwHEI ${options} "${combined_patterns}" -- "${files[@]}"

If the aggregate list of the file paths supplied to the git grep command causes the comamnd line length to be larger than 32767 characters on Windows which is the max command line length
The invocation fails with the following error

C:\Users\<user name>\git-secrets\git-secrets: line 116: /mingw64/libexec/git-core/git: Argument list too long

This can occur with a merge from a large code base where the number of files and paths to those files causes the file list that would be supplied to git grep to be large.
Having several hundred files that are part of the merge that has a relative path similar to filepaths as Code/CoreLibrary/EditorViewport/ViewportFile.ext will eventually cause the single the git grep command to be over the 32767 line length

Now using "echo" and "xargs" allows the git-grep command to be invoked multiple times without issue
GREP_OPTIONS= LC_ALL=C printf "%s\0" "${files[@]}" | xargs -0 git grep -nwHEI ${options} "${combined_patterns}"

Now the caveat with using xargs, is that when any of invocations that it spawns off returns a return code between 1-125, xargs returns the return code of 123
This is explained in the xargs man page here

Now the process_output() function is checking for a return code of 1 to determine if the pattern wasn't matched.

git-secrets/git-secrets

Lines 131 to 145 in 80230af

process_output() {
local status="$1" output="$2"
local allowed=$(load_allowed)
case "$status" in
0)
[ -z "${allowed}" ] && echo "${output}" >&2 && return 1
# Determine with a negative grep if the found matches are allowed
echo "${output}" | GREP_OPTIONS= LC_ALL=C grep -Ev "${allowed}" >&2 \
&& return 1 || return 0
;;
1) return 0 ;;
*) exit $status
esac
}

Because of that the return code of 123 from xargs needs to be mapped to 1, to make sure that process_output
still function successfully

Therefore the full set of changes to the git_grep functions need to be the following

if (( ${#files[@]} )); then
  GREP_OPTIONS= LC_ALL=C printf "%s\0" "${files[@]}" | xargs -0 git grep -nwHEI ${options} "${combined_patterns}"
else
  GREP_OPTIONS= LC_ALL=C git grep -nwHEI ${options} "${combined_patterns}"
fi
rc=$?
[ $rc -eq 123 ] && return 1 # xargs returns 123 when the invocations return a value between 1-125
[ $rc -eq 0 ] && return 0
return $rc
@jrn
Copy link

jrn commented Apr 19, 2021

I'm guessing #153 would help with this.

@lemonade-dm
Copy link
Author

Yes, after making this PR, I found out that PR #153 already existed.
This PR only covers git_grep function, where #153 covers regular_grep as well.

@sparr sparr linked a pull request Jun 21, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants