Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUGFIX] Fix comment parsing to support multiple comments #663

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ziegenberg
Copy link
Contributor

Because of an eager consumption of whitespace, the rule parsing would swallow a trailing comment, meaning the comment for the next rule would be affected. This patch addresses this by only consuming real whitespace without comments after a rule.

Fixes #173

@ziegenberg
Copy link
Contributor Author

It would be nice to have this in an 8.6.1 release.

Because of an eager consumption of whitespace, the rule parsing would
swallow a trailing comment, meaning the comment for the next rule would
be affected. This patch addresses this by only consuming real whitespace
without comments after a rule.

Fixes MyIntervals#173

Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
Copy link
Contributor

@oliverklee oliverklee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this PR, @ziegenberg! ❤️

Looks generally good to me. 👍 I've added some review comments.

Once this is merged, I'm all for backporting this to the 8.x branch and release the 8.x branch once support for PHP 8.4 is merged as well.

@@ -220,18 +220,26 @@ public function parseCharacter($bIsForIdentifier)
}

/**
* @return array<int, Comment>|void
* Consumes whitespace and comments and returns the comments.
* If $withoutComment is true, only whitespace is consumed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put code in comments in backticks so it gets displayed as code.

*
* @param bool $withoutComment Do not consume comments, only whitespace.
*
* @return array<int, Comment>|void List of comments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be even more specific (and omit the void return type, which is incorrect for this method and also is not allowed in union types):

Suggested change
* @return array<int, Comment>|void List of comments.
* @return list<Comment> list of comments

*
* @throws UnexpectedEOFException
* @throws UnexpectedTokenException
*/
public function consumeWhiteSpace(): array
public function consumeWhiteSpace(bool $withoutComment = false): array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of boolean function parameters that switch behavior. Instead, please let's have a separate, well-named method, e.g., comsumeWhitespaceWithoutComment or something alike.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can then also move the new method to Rule. (I'm not 100% sure about this and would welcome your thoughts on this, @ziegenberg @sabberworm @JakeQZ.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go even further and rename the current function comsumeWhitespace to comsumeWhitespaceWithComment and create a new function comsumeWhitespace which precisely only consumes white space. I was baffled to see comsumeWhitespace also consume comments and return them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this would probably be a breaking change. So, I introduced the boolean function parameter to stay within the current wording (which I deem wrong) and still make the change non-breaking.

I would consider having comsumeWhitespace and a new comsumeWhitespaceWithoutComment a not-so-good solution. That would cement the semantically wrong situation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of renaming this method!

Would you be willing to do this as a (or more specific), two pre-patch PRs?

  1. one (to be backported) that renames the method, adds the specific type annotation, and re-adds the old method to redirect to the new method, and marks it as @deprecated will be removed in version 9.0 (including a corresponding line in the changelog
  2. another (not to be backported) that removes the old method (including a corresponding entry in the changelog)

I'd be willing to review these two pre-PRs quickly so we can rebase on them.

We also might want to mark the renamed method as @internal as it needs to be public for technical reasons, but is not expected to be called from outside this library. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of comsumeWhitespaceWithoutComment, having a function name telling you what it does not do. But I'll create the two pre-patch PRs as asked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alle consume* functions are in the class ParserState, so why move it to Rule?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it seems I misread your comment and got confused. 🤯

Okay, so I'm fine with this plan:

I would go even further and rename the current function comsumeWhitespace to comsumeWhitespaceWithComment and create a new function comsumeWhitespace which precisely only consumes white space.

We'll need to do this in a non-breaking way for the parts that we backport, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All consume* functions are in the class ParserState, so why move it to Rule?

I've thought about this some more and now think having it in ParserState is perfectly fine. :-)

@@ -1161,13 +1161,16 @@ public function commentExtracting(): void
*/
public function flatCommentExtracting(): void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're planning to split the tests and testcases into smaller, more focused tests. To make this easier for us, please don't expand the test. Instead, could you please duplicate it into two separate tests, one with a single comment and one with multiple? Thanks!

@oliverklee
Copy link
Contributor

oliverklee commented Aug 25, 2024

Battle plan as discussed in our call right now:

  1. 8.x We add a consumeWhiteSpaceWithComments
  2. 8.x consumeWhiteSpace redirects to consumeWhiteSpaceWithComments, redirect it to the new method and mark it as deprecated
  3. 8.x change the respective call to not call the consume* method and have the corresponding lines directly instead
  4. 9.0 remove/change the deprecated method

@JakeQZ
Copy link
Contributor

JakeQZ commented Aug 26, 2024

Battle plan as discussed in our call right now:

1. 8.x We add a `consumeWhiteSpaceWithComments`
2. 8.x `consumeWhiteSpace` redirects to `consumeWhiteSpaceWithComments`, redirect it to the new method and mark it as deprecated
3. 8.x change the respective call to not call the `consume*` method and have the corresponding lines directly instead
4. 9.0 remove/change the deprecated method

I don't think we should ever reintroduce a method with the same name as one that previously existed, with different default funcionality. It will catch out anyone doing a major update through several major versions, which happens even (and probably moreso) in sizeable companies as they defer the work required to support the update, deeming it low priority when updating requires some other changes.

So I think we may as well come up with a new method name now; maybe consumeWhitespaceOnly. (Unfortunately, PHP method names are not case sensitive, so we can't use consumeWhitespace.)

@oliverklee
Copy link
Contributor

@ziegenberg Do we still need this PR now that #671/#672 have been merged?

@oliverklee
Copy link
Contributor

I propose we move the discussion of the next steps here: #679

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug with parsing multiple comments
3 participants