It must be possible to get the number of prefix spaces of all parent iterators, so that block-quoted content in logic is indented correctly.
Unimarkup block content in the logic part starts with """.
To keep indentation consistent, the prefix for all enclosed lines must be set so that the content starts at the same "visual depth" as the leftmost double quote (in left-to-right flow).
let block = """
# Heading
Paragraph
""";
Knowing the number of spaces that were skipped by parent iterators makes it easy to calculate the needed indentation, because the start.col_grapheme value of the first quote token can then be used.
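The calculation could look roughly like the sketch below. It is only an illustration under assumptions: the function name `required_prefix_spaces` and its parameters are hypothetical, columns are assumed to be 1-based, and `start.col_grapheme` of the quote token is assumed to refer to the original line, not to the content left after parents stripped their prefix.

```rust
/// Hypothetical helper: number of spaces the iterator for the enclosed
/// content still has to skip at the start of every line.
///
/// `quote_col_grapheme` stands in for `start.col_grapheme` of the first
/// quote token, `skipped_by_parents` for the sum of prefix spaces that
/// all parent iterators already consumed.
fn required_prefix_spaces(quote_col_grapheme: usize, skipped_by_parents: usize) -> Option<usize> {
    // Assumption: columns are 1-based, so column 13 means that
    // 12 graphemes precede the leftmost quote.
    let absolute_indent = quote_col_grapheme.checked_sub(1)?;
    // Parents already removed part of that indentation.
    // `None` signals that the quote sits left of the parents' prefix.
    absolute_indent.checked_sub(skipped_by_parents)
}
```

For the `let block = """` example at top level, the quote would be at column 13 with no spaces skipped by parents, which gives a prefix of 12 spaces under these assumptions.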
Is your feature request related to other issues/PRs?
unimarkup/specification#55 unimarkup/specification#56
Remove matching fns
Prefix matching will be done on the number of indent spaces.
Therefore, only a number needs to be stored instead of a complex matching fn.
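As a rough illustration of why a number is enough, the sketch below checks a prefix by counting spaces. It works on a plain string line instead of tokens to stay short, and the name `strip_indent` is hypothetical.

```rust
/// Hypothetical prefix check: returns the rest of the line if it starts
/// with at least `indent` spaces, otherwise `None`.
fn strip_indent(line: &str, indent: usize) -> Option<&str> {
    // Count the leading spaces of the line.
    let leading_spaces = line.bytes().take_while(|b| *b == b' ').count();
    if leading_spaces >= indent {
        // A space is a single byte, so slicing at `indent` is safe.
        Some(&line[indent..])
    } else {
        None
    }
}
```

Additional spaces beyond `indent` stay part of the returned content, so nested iterators can apply their own prefix count on top.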
For end matching, it might be possible to provide an enum to cover all needed scenarios.
Storing the enum directly in the iterator should make parsing faster, and significantly reduce complexity. Scoping is still needed for enclosed elements.
Blankline ... Iterator ends on blankline, or if iterator end is reached
Needed for Paragraph, Quote Block, Line Block.
NewlineMatch(Vec) ... Ends if the given tokens match after a newline
Needed for enclosed blocks, assuming unimarkup/specification#56 ([REQ] Relax enclosing block parsing) gets accepted.
BlankOrNewlineMatch(Vec) ... Ends on blankline, or if the given tokens match after a newline
Needed for Heading and lists, because they do not require a blank line in case the next line starts with the element keyword.
Match(Token) ... Ends if the token is matched
Needed mostly for inline elements, but also for tables.
MatchEither(Token1, Token2) ... Ends if either Token1 or Token2 matches
Needed to handle ambiguous inline elements.
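The variants listed above could be sketched as a single enum stored in the iterator. This is a minimal sketch, not the actual implementation: the `Token` type is a placeholder, the element type of the `Vec` is assumed to be `Token`, and the names `EndMatch` and `is_end` are hypothetical.

```rust
/// Placeholder token type; the real base token carries more information.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Newline,
    Blankline,
    Eoi,
    Plain(String),
    Keyword(String),
}

/// End conditions taken from the list above.
#[derive(Debug, Clone, PartialEq)]
enum EndMatch {
    Blankline,
    NewlineMatch(Vec<Token>),
    BlankOrNewlineMatch(Vec<Token>),
    Match(Token),
    MatchEither(Token, Token),
}

/// True if `upcoming` starts with a newline directly followed by `tokens`.
fn matches_after_newline(upcoming: &[Token], tokens: &[Token]) -> bool {
    matches!(upcoming.first(), Some(Token::Newline)) && upcoming[1..].starts_with(tokens)
}

impl EndMatch {
    /// Returns true if the iterator should end before the first token
    /// of `upcoming` (the not yet consumed tokens).
    fn is_end(&self, upcoming: &[Token]) -> bool {
        // Assumption: running out of tokens or reaching EOI always ends
        // the iterator, independent of the variant.
        let first = match upcoming.first() {
            None | Some(Token::Eoi) => return true,
            Some(token) => token,
        };
        match self {
            EndMatch::Blankline => *first == Token::Blankline,
            EndMatch::NewlineMatch(tokens) => matches_after_newline(upcoming, tokens),
            EndMatch::BlankOrNewlineMatch(tokens) => {
                *first == Token::Blankline || matches_after_newline(upcoming, tokens)
            }
            EndMatch::Match(token) => first == token,
            EndMatch::MatchEither(a, b) => first == a || first == b,
        }
    }
}
```

Because the end condition is plain data, checking it is a single `match` instead of a call through a stored matching fn.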
Try to combine block and inline iterator
The inline iterator is currently needed, because base tokens are converted to inline tokens,
and open formats are stored in a slice.
Generic token iterator
It might be possible to make the base token iterator generic.
The generic token type must have functions to determine if a token is a newline, blankline, or EOI.
With tokens being generic, a conversion layer may be added to convert base tokens to inline tokens. This has the benefit of reducing API duplication, because base and inline iterators get merged.
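One way to picture the generic iterator is the sketch below. All names (`IterToken`, `TokenIter`, `BaseToken`, `InlineToken`) and the token variants are hypothetical placeholders, and the conversion from stars to formats is only an example mapping.

```rust
/// Hypothetical trait with the functions the generic iterator needs.
trait IterToken {
    fn is_newline(&self) -> bool;
    fn is_blankline(&self) -> bool;
    fn is_eoi(&self) -> bool;
}

#[derive(Debug, Clone, PartialEq)]
enum BaseToken { Newline, Blankline, Eoi, Star(usize), Text(String) }

#[derive(Debug, Clone, PartialEq)]
enum InlineToken { Newline, Blankline, Eoi, Italic, Bold, Text(String) }

impl IterToken for BaseToken {
    fn is_newline(&self) -> bool { matches!(self, BaseToken::Newline) }
    fn is_blankline(&self) -> bool { matches!(self, BaseToken::Blankline) }
    fn is_eoi(&self) -> bool { matches!(self, BaseToken::Eoi) }
}

impl IterToken for InlineToken {
    fn is_newline(&self) -> bool { matches!(self, InlineToken::Newline) }
    fn is_blankline(&self) -> bool { matches!(self, InlineToken::Blankline) }
    fn is_eoi(&self) -> bool { matches!(self, InlineToken::Eoi) }
}

/// Conversion layer between the two token types (placeholder mapping).
impl From<BaseToken> for InlineToken {
    fn from(token: BaseToken) -> Self {
        match token {
            BaseToken::Newline => InlineToken::Newline,
            BaseToken::Blankline => InlineToken::Blankline,
            BaseToken::Eoi => InlineToken::Eoi,
            BaseToken::Star(1) => InlineToken::Italic,
            BaseToken::Star(2) => InlineToken::Bold,
            BaseToken::Star(n) => InlineToken::Text("*".repeat(n)),
            BaseToken::Text(text) => InlineToken::Text(text),
        }
    }
}

/// One iterator type for base and inline tokens.
struct TokenIter<T> {
    tokens: std::vec::IntoIter<T>,
}

impl<T> TokenIter<T> {
    fn new(tokens: Vec<T>) -> Self {
        TokenIter { tokens: tokens.into_iter() }
    }
}

impl<T: IterToken> Iterator for TokenIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Simplification: always end on blankline or EOI. The iterator is
        // not fused, so callers should stop at the first `None`.
        self.tokens.next().filter(|t| !t.is_blankline() && !t.is_eoi())
    }
}
```

With this shape, inline tokens can be produced from a base iterator with `TokenIter::new(base_tokens).map(InlineToken::from)`, so no separate inline iterator API is required in the sketch.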
Use end matching for inline formats
Inline formats use an open format map to determine if a format is open or not.
This open map is needed to decide if a keyword should open or close a format.
Because iterators are nested, it might be possible to add a function that checks whether a format is already open, that is, whether a parent parser already handles the respective format.
If this works, no open map would be needed for inline parsing, which makes inline parsing much easier.
To achieve this, iterators must know which element they are parsing.
This could be done by adding a field of type ElementKind. To resolve ambiguous tokens, it must be possible to cache exactly one token.
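The parent lookup could work roughly as sketched below. This is an illustration under assumptions: `Scope` stands in for the nested iterator and only contains the parts needed for the check, and the `ElementKind` variants, `Action`, and the method names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElementKind { Paragraph, Bold, Italic }

/// What a format keyword should do at the current position.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action { Open, Close }

/// Stand-in for a nested iterator: knows its element and its parent.
struct Scope<'p> {
    kind: ElementKind,
    parent: Option<&'p Scope<'p>>,
}

impl<'p> Scope<'p> {
    fn root(kind: ElementKind) -> Self {
        Scope { kind, parent: None }
    }

    /// Creates a nested scope for parsing `kind` inside `self`.
    fn nest<'a>(&'a self, kind: ElementKind) -> Scope<'a> {
        Scope { kind, parent: Some(self) }
    }

    /// Walks up the parent chain instead of consulting an open format map.
    fn is_open(&self, kind: ElementKind) -> bool {
        let mut current = Some(self);
        while let Some(scope) = current {
            if scope.kind == kind {
                return true;
            }
            current = scope.parent;
        }
        false
    }

    /// A keyword closes a format that is already open, otherwise opens it.
    fn action_for(&self, kind: ElementKind) -> Action {
        if self.is_open(kind) { Action::Close } else { Action::Open }
    }
}
```

The lookup is linear in the nesting depth, which is expected to stay small for inline formats. Caching a single token for ambiguous keywords is not covered by this sketch.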