Correct parsing for f-string string literal concatenation #91

Open
@knutwannheden

Description

Python has a feature called string literal concatenation, which allows a literal to be split into two or more adjacent literals (typically on separate lines) that are then joined into a single string literal at compile time; this is what differentiates them from string literals concatenated using +. Currently, the parser parses string literal concatenation into a single J.Literal and makes sure that the concatenation is reflected by the valueSource property. F-strings, on the other hand, are parsed into a Py.FormattedString node, where the parts are stored as Expressions in the parts property and the start delimiter (e.g. f") in the delimiter property.
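To illustrate the difference at the AST level, here is a small sketch using only the standard `ast` module. The variable names (`plain`, `added`, `mixed`) are made up for the example; it only shows how CPython represents the three cases and says nothing about how the parser in this project handles them.

```python
import ast

# Two adjacent plain literals: merged into a single ast.Constant at compile time.
plain = ast.parse('x = "foo" "bar"').body[0].value

# Explicit concatenation with +: stays an ast.BinOp with two operands.
added = ast.parse('x = "foo" + "bar"').body[0].value

# A plain literal adjacent to an f-string: merged into a single ast.JoinedStr.
mixed = ast.parse('x = "foo" f"{y}bar"').body[0].value

print(type(plain).__name__, repr(plain.value))
print(type(added).__name__)
print(type(mixed).__name__)
```

In the first and third cases the boundaries between the original literals are no longer visible in the AST, which is why they have to be recovered from the source text.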

Now, when string literal concatenation is combined with f-strings, this breaks down. Instead, the parser should try to produce Py.Binary nodes with the operator set to Py.Binary.Type.StringConcatenation and the literals assigned to the left and right properties. The difficulty here is that the Python AST has already abstracted this away: the individual literals have already been merged into one ast.Constant or ast.JoinedStr, so the visitor needs to use the tokenize() function to extract the information from the source at the correct offset. This in turn can also be problematic, because the tokenizer complains if the INDENT and DEDENT tokens don't match up when tokenization is started mid-stream.
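One possible way to recover the individual literals is sketched below. It is not the project's implementation: the helper name `split_concatenated_literals` is hypothetical, and wrapping the segment in parentheses is an assumed workaround so that the tokenizer emits no INDENT or DEDENT tokens for continuation lines. The `FSTRING_START` and `FSTRING_END` token types only exist on Python 3.12 and later, so the sketch looks them up defensively; on older versions an f-string arrives as a single `STRING` token.

```python
import ast
import io
import tokenize


def split_concatenated_literals(segment: str) -> list[str]:
    """Return the source text of each literal in an implicitly concatenated string."""
    # Assumption: inside parentheses the tokenizer does no indentation tracking,
    # so a segment cut out of the middle of a file cannot trigger a DEDENT error.
    wrapped = "(" + segment + ")"
    lines = io.StringIO(wrapped).readlines()
    fstring_start = getattr(tokenize, "FSTRING_START", None)  # Python 3.12+
    fstring_end = getattr(tokenize, "FSTRING_END", None)  # Python 3.12+

    def text_between(start, end):
        # Slice the wrapped source between two (row, col) token positions.
        (srow, scol), (erow, ecol) = start, end
        if srow == erow:
            return lines[srow - 1][scol:ecol]
        middle = lines[srow:erow - 1]
        return "".join([lines[srow - 1][scol:], *middle, lines[erow - 1][:ecol]])

    pieces = []
    depth = 0  # nesting depth of f-strings currently open
    open_pos = None
    for tok in tokenize.generate_tokens(io.StringIO(wrapped).readline):
        if fstring_start is not None and tok.type == fstring_start:
            if depth == 0:
                open_pos = tok.start
            depth += 1
        elif fstring_end is not None and tok.type == fstring_end:
            depth -= 1
            if depth == 0:
                pieces.append(text_between(open_pos, tok.end))
        elif tok.type == tokenize.STRING and depth == 0:
            # Strings nested inside a replacement field are skipped (depth > 0).
            pieces.append(tok.string)
    return pieces


source = 'x = ("foo"\n        f"{y}bar"\n    "baz")\n'
node = ast.parse(source).body[0].value
print(split_concatenated_literals(ast.get_source_segment(source, node)))
```

Each returned piece could then become one operand of a left-nested chain of Py.Binary nodes. Prefix and whitespace handling between the pieces is left out of the sketch.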

Labels: bug, python

Status: In Progress