Correct parsing for f-string string literal concatenation #91

Open
knutwannheden opened this issue Oct 19, 2024 · 0 comments · May be fixed by #92

Python has a feature called string literal concatenation, which allows splitting a literal into two or more parts (typically on separate lines) that are then joined into a single string literal at compile time. This is what differentiates them from string literals concatenated using +. Currently, the parser parses string literal concatenation into a single J.Literal and makes sure that the concatenation is reflected by the valueSource property. F-strings, on the other hand, are parsed into a Py.FormattedString node, where the parts are stored as Expressions in the parts property and the start delimiter (e.g. f") in the delimiter property.
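To illustrate the difference, here is a minimal sketch using only the standard `ast` module. The helper name `top_level_expr` and the sample expressions are made up for this illustration and are not part of the parser.

```python
import ast


def top_level_expr(source: str) -> ast.expr:
    """Parse a single expression and return its root AST node (hypothetical helper)."""
    return ast.parse(source, mode="eval").body


# Adjacent literals are merged before the AST is built: one Constant node.
implicit = top_level_expr('"foo" "bar"')

# Concatenation with + stays visible in the AST as a BinOp.
explicit = top_level_expr('"foo" + "bar"')

# Mixing plain literals with an f-string collapses everything into one JoinedStr.
mixed = top_level_expr('"foo" f"{x}" "bar"')

print(type(implicit).__name__, repr(implicit.value))
print(type(explicit).__name__)
print(type(mixed).__name__, [type(v).__name__ for v in mixed.values])
```

In all three cases the AST no longer records where one literal ended and the next began, which is why the original source text has to be consulted.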

Now, when string literal concatenation is combined with f-strings, this breaks down. Instead, the parser should try to produce Py.Binary nodes with the operator set to Py.Binary.Type.StringConcatenation and the literals assigned to the left and right properties. The difficulty here is that the Python AST has already abstracted this away: the individual literals have already been merged into one ast.Constant or ast.JoinedStr, so the visitor needs to use the tokenize() function to extract the information from the source at the correct offset. This in turn can also be problematic, because the tokenizer complains if the INDENT and DEDENT tokens don't match up when tokenization is started mid-stream.
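As a rough sketch of the tokenizer-based approach (not the actual visitor code), the individual literals can be recovered from the source text with the standard `tokenize` module. The function name `string_literal_spans` and the sample source are assumptions for illustration. The sketch also assumes that Python 3.12+ splits f-strings into `FSTRING_START` ... `FSTRING_END` tokens, while older versions emit a single `STRING` token.

```python
import io
import tokenize

# These token types only exist on Python 3.12+; fall back to None on older versions.
FSTRING_START = getattr(tokenize, "FSTRING_START", None)
FSTRING_END = getattr(tokenize, "FSTRING_END", None)

Position = tuple[int, int]  # (1-based line, 0-based column), as reported by tokenize


def string_literal_spans(source: str) -> list[tuple[Position, Position]]:
    """Return (start, end) positions of each top-level string or f-string literal."""
    spans = []
    depth = 0      # nesting depth of f-strings (they can nest in replacement fields)
    start = None   # start position of the outermost f-string currently open
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == FSTRING_START:
            if depth == 0:
                start = tok.start
            depth += 1
        elif tok.type == FSTRING_END:
            depth -= 1
            if depth == 0:
                spans.append((start, tok.end))
        elif tok.type == tokenize.STRING and depth == 0:
            # Plain literal (or a whole f-string on Python versions before 3.12).
            spans.append((tok.start, tok.end))
    return spans


source = '("foo"\n f"{x}"  # comment\n "bar")\n'
spans = string_literal_spans(source)
print(spans)
```

Each span could then be mapped to its own literal or formatted-string node. This sketch tokenizes a complete, well-formed snippet, so it sidesteps the INDENT/DEDENT mismatch described above; starting at an arbitrary offset inside an indented block would need extra care.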

@knutwannheden knutwannheden added the bug Something isn't working label Oct 19, 2024
@knutwannheden knutwannheden changed the title from "Correct parsing for f-string literal concatenation" to "Correct parsing for f-string string literal concatenation" Oct 19, 2024
knutwannheden added a commit that referenced this issue Oct 19, 2024
Fixes #91

Add support for f-string literal concatenation in the Python parser.

* **Py.java**
  - Add a new nested `StringLiteralConcatenation` type to handle string literal concatenation.
  - Implement `getType()` to return `JavaType.Primitive.String`.
  - Implement `setType()` to return `this`.

* **_parser_visitor.py**
  - Update the `visit_Constant` method to handle string literal concatenation combined with f-strings.
  - Add logic to produce `Py.StringLiteralConcatenation` nodes for string literal concatenation.
  - Update the `__map_fstring` method to handle the difficult piece of logic for f-string concatenation.

* **fstring_test.py**
  - Add tests to verify correct parsing of string literal concatenation combined with f-strings.
  - Add tests to verify correct parsing of f-string literal concatenation with comments.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/openrewrite/rewrite-python/issues/91?shareId=XXXX-XXXX-XXXX-XXXX).
Projects
Status: In Progress