Python has a feature called string literal concatenation, which allows a literal to be split into two or more adjacent literals (typically on separate lines) that are then joined into a single string literal at compile time. This is what differentiates it from string literals concatenated using `+`. Currently, the parser parses string literal concatenation into a single `J.Literal` and makes sure that the concatenation is reflected by the `valueSource` property. F-strings, on the other hand, are parsed into a `Py.FormattedString` node, where the parts are stored as `Expression`s in the `parts` property and the start delimiter (e.g. `f"`) in the `delimiter` property.
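To illustrate the difference on the Python side, here is a minimal sketch using only the standard `ast` module. The helper name `top_level_node` is made up for this example and is not part of the parser.

```python
import ast


def top_level_node(source):
    """Parse a single expression and return its root AST node."""
    return ast.parse(source, mode="eval").body


implicit = top_level_node('"foo" "bar"')     # string literal concatenation
explicit = top_level_node('"foo" + "bar"')   # concatenation with +
mixed = top_level_node('"foo" f"{x}bar"')    # plain literal next to an f-string

# The adjacent literals are already merged into one node by the time we see the AST.
print(type(implicit).__name__, repr(implicit.value))
# The + form keeps both operands as separate nodes.
print(type(explicit).__name__)
# Mixing in an f-string yields one JoinedStr; the literal boundary is gone as well.
print(type(mixed).__name__, [type(v).__name__ for v in mixed.values])
```

In the first and third case nothing in the AST indicates where one literal ended and the next began, which is why the original source text has to be consulted.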
Now, when string literal concatenation is combined with f-strings, this approach breaks down. So instead the parser should try to produce `Py.Binary` nodes with the `operator` set to `Py.Binary.Type.StringConcatenation` and the literals assigned to the `left` and `right` properties. The difficulty here is that the Python AST has already abstracted this away: the individual literals have already been merged into one `ast.Constant` or `ast.JoinedStr`, so the visitor needs to use the `tokenize()` function to extract the information from the source at the correct offset. This in turn can also be problematic, because the tokenizer complains if the `INDENT` and `DEDENT` tokens don't match up when tokenization is started mid-stream.
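The tokenizer problem can be reproduced with the standard `tokenize` module alone. The sketch below is not the parser's actual implementation: `tokenizes_cleanly` and `split_literals` are hypothetical helpers, and wrapping the segment in parentheses is just one possible workaround (an assumption of this example), chosen because the tokenizer emits no `INDENT`/`DEDENT` tokens inside brackets. Note that on Python 3.12+ an f-string is tokenized as `FSTRING_START` ... `FSTRING_END`, while older versions emit a single `STRING` token, so both cases are handled.

```python
import io
import token
import tokenize

# These token types only exist on Python 3.12+; on older versions they are None.
FSTRING_START = getattr(token, "FSTRING_START", None)
FSTRING_END = getattr(token, "FSTRING_END", None)


def tokenizes_cleanly(text):
    """Return True if `text` can be tokenized to the end without an error."""
    try:
        list(tokenize.generate_tokens(io.StringIO(text).readline))
        return True
    except (SyntaxError, tokenize.TokenError):  # IndentationError is a SyntaxError
        return False


def split_literals(segment):
    """List (is_fstring, (row, col)) for each top-level literal in `segment`.

    Hypothetical helper: the segment is wrapped in parentheses so that the
    tokenizer ignores indentation between the literals.
    """
    parts = []
    depth = 0  # > 0 while inside an f-string on Python 3.12+
    wrapped = "(" + segment + ")"
    for tok in tokenize.generate_tokens(io.StringIO(wrapped).readline):
        row, col = tok.start
        if row == 1:
            col -= 1  # compensate for the added "("
        if tok.type == FSTRING_START:
            if depth == 0:
                parts.append((True, (row, col)))
            depth += 1
        elif tok.type == FSTRING_END:
            depth -= 1
        elif tok.type == token.STRING and depth == 0:
            # Everything before the first quote character is the string prefix.
            quote_at = min(i for i in (tok.string.find("'"), tok.string.find('"')) if i >= 0)
            parts.append(("f" in tok.string[:quote_at].lower(), (row, col)))
    return parts


# Source text of a concatenation as it might appear inside parentheses, cut
# out mid-stream: the third line dedents to a level the tokenizer never saw.
segment = '"a"\n        f"{x}b"\n    "c"'

print(tokenizes_cleanly(segment))              # raw segment trips the tokenizer
print(tokenizes_cleanly("(" + segment + ")"))  # wrapped segment does not
print(split_literals(segment))
```

The returned positions are relative to the segment, so a visitor would still have to add the node's own `lineno` and `col_offset` to map them back into the file.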
knutwannheden changed the title from "Correct parsing for f-string literal concatenation" to "Correct parsing for f-string string literal concatenation" on Oct 19, 2024
Fixes #91
Add support for f-string literal concatenation in the Python parser.
* **Py.java**
- Add a new nested `StringLiteralConcatenation` type to handle string literal concatenation.
- Implement `getType()` to return `JavaType.Primitive.String`.
- Implement `setType()` to return `this`.
* **_parser_visitor.py**
- Update the `visit_Constant` method to handle string literal concatenation combined with f-strings.
- Add logic to produce `Py.StringLiteralConcatenation` nodes for string literal concatenation.
- Update the `__map_fstring` method to handle the difficult piece of logic for f-string concatenation.
* **fstring_test.py**
- Add tests to verify correct parsing of string literal concatenation combined with f-strings.
- Add tests to verify correct parsing of f-string literal concatenation with comments.
---
For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/openrewrite/rewrite-python/issues/91?shareId=XXXX-XXXX-XXXX-XXXX).