
Commit

Merge pull request #105 from planetarypy/104-end-statement
Remove request of next token after end-statement
rbeyer authored Aug 11, 2022
2 parents 7821f85 + 8f34f0d commit 6b872ff
Showing 3 changed files with 36 additions and 11 deletions.
.github/workflows/python-test.yml (2 changes: 1 addition & 1 deletion)
@@ -12,7 +12,7 @@ jobs:
     strategy:
       matrix:
         os: [ubuntu-latest, macos-latest]
-        python-version: [3.6, 3.7, 3.8, 3.9]
+        python-version: ['3.6', '3.7', '3.8', '3.9', '3.10']
         # install-target: ['.', '.[allopts]']
 
     steps:
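
A likely reason the versions are now quoted: YAML resolves an unquoted 3.10 as the float 3.1, so the matrix would silently ask for Python 3.1. A quick illustration in Python (assuming PyYAML is installed; Actions workflow files are YAML, so the same scalar resolution applies):

import yaml

# Unquoted 3.10 is a YAML float and truncates to 3.1:
print(yaml.safe_load("python-version: [3.6, 3.10]"))
# -> {'python-version': [3.6, 3.1]}

# Quoted versions survive as strings:
print(yaml.safe_load("python-version: ['3.6', '3.10']"))
# -> {'python-version': ['3.6', '3.10']}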
HISTORY.rst (12 changes: 12 additions & 0 deletions)
@@ -30,6 +30,18 @@ and the release date, in year-month-day format (see examples below).
 Not Yet Released
 ----------------
 
+Fixed
++++++
+* The parser was requesting the next token after an end-statement, even
+  though nothing was done with this token (in the future it could be a
+  comment that should be processed). In the very rare case where all of
+  the "data" bytes in a file with an attached PVL label (like a .IMG or
+  .cub file) happen to decode as UTF text with no whitespace characters,
+  that next token takes an unacceptable amount of time to return, if it
+  returns at all. The parser no longer requests additional tokens once
+  an end-statement is identified (Issue 104).
+
+
 1.3.1 (2022-02-05)
 ------------------
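
The changelog entry above is easiest to see from the user's side. Below is a minimal sketch of the edge case, assuming a build that includes this fix; pvl.loads is pvl's string-parsing entry point, trailing bytes after the end-statement are ignored (as they are for attached labels), and the exact repr varies by version:

import pvl

# A label followed by "data" that happens to decode as one long run of
# non-whitespace characters. Before this fix, the lexer tried to grow a
# single lexeme across all of it; now parsing stops at the end-statement
# and returns promptly.
label = "a = 1\nEND\n" + "x" * 1_000_000

print(pvl.loads(label))
# -> PVLModule([('a', 1)])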
pvl/parser.py (33 changes: 23 additions & 10 deletions)
@@ -496,16 +496,29 @@ def parse_end_statement(self, tokens: abc.Generator) -> None:
f'"{end}"'
)

try:
t = next(tokens)
if t.is_WSC():
# maybe process comment
return
else:
tokens.send(t)
return
except LexerError:
pass
# The following commented code was originally put in place to deal
# with the possible future situation of being able to process
# the possible comment after an end-statement.
# In practice, an edge case was discovered (Issue 104) where "data"
# after an END statement *all* properly converted to UTF with no
# whitespace characters. So this request for the next token
# resulted in lexing more than 100 million "valid characters"
# and did not return in a prompt manner. If we ever enable
# processing of comments, we'll have to figure out how to handle
# this case. An alternate to removing this code is to leave it
# but put in a limit on the size that a lexeme can grow to,
# but that implies an additional if-statement for each character.
# This is the better solution for now.
# try:
# t = next(tokens)
# if t.is_WSC():
# # maybe process comment
# return
# else:
# tokens.send(t)
# return
# except LexerError:
# pass
except StopIteration:
pass

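
The comment's rejected alternative (keeping the lookahead but capping how large a lexeme may grow) would look roughly like the sketch below. It is hypothetical: the loop shape, is_delimiter(), MAX_LEXEME_LEN, and the exception class are illustrative assumptions, not pvl's actual lexer internals.

MAX_LEXEME_LEN = 100_000


class LexemeTooLongError(ValueError):
    """Raised when a lexeme exceeds the configured cap."""


def is_delimiter(c: str) -> bool:
    # Assumption: whitespace ends a lexeme (pvl's real grammar also
    # treats certain reserved characters as delimiters).
    return c.isspace()


def lex_one(chars) -> str:
    """Accumulate characters into one lexeme, stopping at a delimiter."""
    lexeme = []
    for c in chars:
        if is_delimiter(c):
            break
        lexeme.append(c)
        # The extra per-character if-statement the comment above says
        # this approach would cost:
        if len(lexeme) > MAX_LEXEME_LEN:
            raise LexemeTooLongError(
                f"lexeme exceeded {MAX_LEXEME_LEN} characters"
            )
    return "".join(lexeme)

A cap like this would make the Issue 104 input fail fast with a clear error rather than lexing 100 million characters, at the cost of a branch per character; the commit removes the lookahead instead.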
