Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Unicode Character Prop fix, update appveyor, travis #37

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

MSP-Greg
Copy link

@MSP-Greg MSP-Greg commented Dec 3, 2017

I ended up here while trying to test the rdoc repo under windows. It uses kpeg to build two files.

There is a quirk in windows regexp where a Unicode character property like \p{**} can be used in a hard coded regexp, but one can't create one unless the encoding of it is UTF-8. So, this commit fixes that, in addition to several minor fixes:

  1. .travis.yml - added current 2.2 thru 2.4.

  2. appveyor.yml - added it, passes 2.0 thru trunk. See here. Going forward, if you don't want to set it up but would like a check, ping me and I can run one.

  3. lib/kpeg/format_parser.rb - commented out two unused variables.

  4. lib/kpeg/grammar.rb - this contains the windows patch re encoding. Tried to keep the constraint on it pretty tight.

  5. test/test_kpeg.rb, test/test_kpeg_code_generator.rb - changed some tests over to assert_nil from assert_equal.

  6. test/test_kpeg_string_escape.rb - ends with two blank lines, removed one...

Thanks, Greg

@MSP-Greg
Copy link
Author

MSP-Greg commented Dec 3, 2017

Update - with this PR, RDoc builds on Windows for all versions tested (2.2 - trunk). Without the patch, all versions fail.

@aycabta
Copy link

aycabta commented Dec 27, 2017

@evanphx This Pull Request is very important for RDoc because of test on AppVeyor fails. Please review this.

@graaff
Copy link

graaff commented May 20, 2018

This is actually a wider issue that does not affect only Windows. Trying to use kpeg on Linux with a non-UTF-8 locale results in similar issues. This breaks building the rdoc literals. Downstream bug report: https://bugs.gentoo.org/640150

Simple reproducer based on the rdoc source code:

LANG=C ruby -S kpeg -fsv -o lib/rdoc/markdown/literals.rb lib/rdoc/markdown/literals.kpeg

/usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `initialize': invalid character property name {Zl}: /\n|\r\n?|\p{Zl}|\p{Zp}/ (RegexpError)
	from /usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `new'
	from /usr/lib64/ruby/gems/2.3.0/gems/kpeg-1.1.0/lib/kpeg/grammar.rb:133:in `initialize'

I have generalized the patch to exclude the windows test and instead force encoding when it is not Encoding::UTF_8. This works as expected and allows things to work with LANG=C.

@evanphx
Copy link
Owner

evanphx commented Nov 1, 2022

I realize it's been absolutely forever since this PR was opened. I'm happy to fix it up if we still need it, please let me know @MSP-Greg.

@MSP-Greg
Copy link
Author

MSP-Greg commented Nov 2, 2022

No problem. Obviously, GitHub provides a pretty good UI for reviewing one's open PR's, and I need to use it.

This came out of running CI for Windows in RDoc, which is being done currently. Let me look into it. If it's still needed, maybe change to GitHub Actions?

@evanphx
Copy link
Owner

evanphx commented Nov 2, 2022

Let me know, happy to help figure it out.

@graaff
Copy link

graaff commented Nov 4, 2022

I've just checked that my reproducer mentioned earlier still causes the same issue with kpeg 1.3.2 and the rdoc 6.4.0 sources on ruby 3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants