text wrapping fixes #69
When handling a string (say "abcde") that exceeded the specified width (say 2), wrapi_text() uses two values to put the final string together:

 * start: a string that serves as frozen/good start. It was calculated during the last iteration that exceeded the maximum width ("ab\nc", for the example above when splitting on each character and at the iteration for "e").

 * add: a string representing the characters after `start` through the character of the current iteration ("d\ne").

When wrapi_text() joins these two values, it includes a newline, leading to a spurious newline between "c" and "d". Drop the newline, leaving just the original character that was used to split the string ("" in the example above).
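To make the effect of that join concrete, here is a small illustration using the values quoted in the commit message. The exact call that wrapi_text() uses to join the two values is not shown in this thread, so the `paste0()` calls below are an assumption standing in for it.

```r
# Values taken from the commit message's "abcde" / width 2 example.
start <- "ab\nc"   # frozen start from the last iteration that exceeded the width
add   <- "d\ne"    # characters after `start` through the current iteration
sep   <- ""        # the string was split on each character

# Assumed shape of the old join: separator plus an extra newline.
with_newline <- paste0(start, sep, "\n", add)
with_newline
#> "ab\nc\nd\ne"   (spurious newline between "c" and "d")

# Assumed shape of the new join: only the original separator.
without_newline <- paste0(start, sep, add)
without_newline
#> "ab\ncd\ne"
```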
This is already used in two spots, and the next commit will introduce a third. (While touching these lines, switch to more conventional spacing.)
wrapi_text() splits a string into pieces and then builds it back up, inserting a newline before the next piece if it would lead to the string exceeding the specified width.

The `shift` variable is set whenever extending the string with the next piece would exceed the maximum width. It marks the last position that did _not_ exceed the bound. The logic for incrementing `shift`, however, is flawed. It's calculated by adding the current `shift` value to the current iteration's index, resulting in skipped values and NAs (from out-of-bounds indexing).

One way to correct the value would be to set `shift` to the index for the current iteration. Instead set it to the index for the _next_ iteration, redefining `shift` to be the first piece to consider when extending the string. Defining it this way lets us use `shift` rather than `1 + shift` when indexing.
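The index drift can be seen with a toy example. This only illustrates the arithmetic described above, not the package's loop: the `pieces` vector, the starting value of 0, and the `overflow_at` iterations are all made up for the sketch.

```r
pieces <- c("a", "b", "c", "d", "e")
overflow_at <- c(2, 3, 4)  # hypothetical iterations where the width was exceeded

# Flawed update: add the iteration index to the running `shift` value.
shift <- 0
flawed <- numeric(0)
for (i in overflow_at) {
  shift <- shift + i
  flawed <- c(flawed, shift)
}
flawed
#> 2 5 9            (values are skipped, then run past the end)
pieces[flawed]
#> "b" "e" NA       (out-of-bounds indexing yields NA)

# Update described in the commit: point at the next iteration's piece.
fixed <- overflow_at + 1
pieces[fixed]
#> "c" "d" "e"
```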
wrapi_text() splits a string into pieces and then builds it back up, inserting a newline before the next piece if it would lead to the string exceeding the specified width.

The result is stored in the `start` variable, which gets extended each time the width is exceeded. To extend `start`, wrapi_text() appends the value of

    {pieces that did not exceed width}{sep}{newline}{piece that exceeded width}

wrapi_text() incorrectly assumes that there is at least one piece that did _not_ exceed the width since the last time `start` was extended, leading to a malformed result with the last piece repeated in an incorrect position.

Update the logic to account for the fact that successive pieces can exceed the specified width.
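For comparison, here is a minimal greedy wrapper that copes with successive pieces each exceeding the width. It is a sketch under stated assumptions and not the package's wrapi_text(): `wrap_sketch()` and its arguments are hypothetical, it handles a single separator, and it ignores trailing separators (which `strsplit()` drops).

```r
# Hypothetical helper for illustration only.
wrap_sketch <- function(x, sep, width) {
  pieces <- strsplit(x, sep, fixed = TRUE)[[1]]
  n <- length(pieces)
  if (n < 2) {
    return(x)
  }
  # Re-attach the separator to every piece except the last.
  pieces[-n] <- paste0(pieces[-n], sep)

  lines <- character(0)  # completed lines
  current <- ""          # line being built
  for (piece in pieces) {
    if (nzchar(current) && nchar(current) + nchar(piece) > width) {
      # Adding this piece would exceed the width: close the current line.
      # This branch can be taken on consecutive iterations.
      lines <- c(lines, current)
      current <- piece
    } else {
      current <- paste0(current, piece)
    }
  }
  paste(c(lines, current), collapse = "\n")
}

wrap_sketch("foo/bar/baz", sep = "/", width = 4)
#> "foo/\nbar/\nbaz"
wrap_sketch("abcde", sep = "", width = 2)
#> "ab\ncd\ne"
```

Because each line is tracked separately and closed only when non-empty, the sketch never needs to assume that some piece fit since the last break.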
Regression test failures when executed with
expect_identical(
  wrap_text("foo/bar/baz", width = 4),
  "foo/\nbar/\nbaz"
)
Note that all these cases involve multiple splits on the same separator. Although I haven't looked into it closely, I think that is why none of these issues were noticed when looking at the bbr example in gh-57. For example, `man/check_nonmem_table_output.Rd` is split to `man/NEWLINEcheck_nonmem_table_NEWLINEoutput.Rd`. However, despite that having two newlines inserted, that's one split on "/" and then another on "_". For any given separator, `paste_pieces()` doesn't get far enough into its logic to hit the bugs.
The exact cause of the issue was a bit confusing, but the fixes and regression tests look good.
There are still some aspects I'm not sure about, but fwiw here's how I understand the link between b4ecbd1 and the "Unable to read an entire line..." error. The out-of-bounds indexing can produce very long lines due to all the NAs. Trying to render mlr, for example, errors with
And here are the long lines in that tex file:
If I try to increase the value in
That, I think, provides a clear indication that NA garbage is behind the issue. It'd be possible to dig deeper and figure out, given the bad indexing logic, what features of these particular packages are leading to so many NAs, but I haven't done so.
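As a generic R illustration of how out-of-bounds indexing turns into extra text (the `pieces` vector and index range are made up, not taken from the package): indexing past the end of a character vector gives `NA`, and `paste0()` converts each `NA` to the literal string "NA".

```r
pieces <- c("man/", "check_", "nonmem")  # made-up pieces
idx <- 2:6                               # runs past length(pieces)

pieces[idx]
#> "check_" "nonmem" NA NA NA

bad <- paste0(pieces[idx], collapse = "")
bad
#> "check_nonmemNANANA"
nchar(bad)
#> 18
```

Each stray index adds two characters, so enough of them can push a line well past any intended width.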
Thanks for reviewing. I was getting pretty turned around while stepping through the
This series fixes three `wrap_text()` bugs. I started looking into this because rendering the scorecards for six packages on MPN (gsDesign, mlr, pals, r2rtf, tidyposterior, and vcr) failed with a TeX error related to long lines ("Unable to read an entire line..."). The failure was introduced by 773c9ee (bug fix and refactor: tracability matrix, 2024-03-19, gh-57).

That failure is resolved by the fourth commit ("wrapi_text: fix indexing logic"). The second and fifth commits have the other two fixes.