Update pyparsing to 3.3.1#2412

Closed
pyup-bot wants to merge 1 commit into master from pyup-update-pyparsing-3.1.2-to-3.3.1

Conversation

@pyup-bot
Collaborator

This PR updates pyparsing from 3.1.2 to 3.3.1.

Changelog

3.3.1

------------------------------
- Added license info to metadata, following PEP-639. Thanks to Gedalia Pasternak and
Marc Mueller for submitted issue and PR. Fixes 626.

3.3.0

------------------------------
===========================================================================================
The version 3.3.0 release will begin emitting `DeprecationWarnings` for pyparsing methods
that have been renamed to PEP8-compliant names (introduced in pyparsing 3.0.0, in August,
2021, with legacy names retained as aliases). In preparation, I added in pyparsing
3.2.2 a utility for finding and replacing the legacy method names with the new names.
This utility is located at `pyparsing/tools/cvt_pyparsing_pep8_names.py`. This script will scan all
Python files specified on the command line, and if the `-u` option is selected, will
replace all occurrences of the old method names with the new PEP8-compliant names,
updating the files in place.

Here is an example that converts all the files in the pyparsing `/examples` directory:

   python -m pyparsing.tools.cvt_pyparsing_pep8_names -u examples/*.py

The new names are compatible with pyparsing versions 3.0.0 and later.
===========================================================================================
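
To see how such warnings surface in calling code, here is a minimal standard-library sketch. The `deprecated_alias` helper and the `Parser` class are hypothetical stand-ins for illustration, not pyparsing's actual implementation.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that warns, then delegates to new_func (hypothetical helper)."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name!r} is deprecated, use {new_func.__name__!r}",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this wrapper
        )
        return new_func(*args, **kwargs)

    return wrapper


class Parser:
    """Hypothetical class with a PEP8 name and a legacy camelCase alias."""

    def parse_string(self, text):
        return text.split()

    parseString = deprecated_alias(parse_string, "parseString")


# Record warnings so they can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_result = Parser().parseString("a b c")    # warns
    modern_result = Parser().parse_string("a b c")   # does not warn

deprecation_count = sum(issubclass(w.category, DeprecationWarning) for w in caught)
```

Running a test suite with `python -W error::DeprecationWarning` turns these warnings into exceptions, which is one way to find remaining legacy names before or after running the converter.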

- Deprecated `indentedBlock`, when converted using the `cvt_pyparsing_pep8_names`
utility, will emit `UserWarnings` that additional code changes will be required.
This is because the new `IndentedBlock` class no longer requires the calling code
to supply an indent stack, while adding support for nested indentation levels
and grouping.

- Deprecated `locatedExpr`, when converted using the `cvt_pyparsing_pep8_names`
utility, will emit `UserWarnings` that additional code changes may be required.
The new `Located` class removes the extra grouping level of the parsed values.
(If the original `locatedExpr` parser was defined with a results name, then
the extra grouping is retained, so that the results name nesting works properly;
in this case, no code changes would be required.)
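
As a rough illustration of the result shape that `Located` produces, here is a short sketch. It assumes pyparsing 3.0.0 or later is installed; the input string is made up for the example.

```python
import pyparsing as pp

# Located wraps an expression and reports where the match started and ended.
word = pp.Word(pp.alphas)
located_word = pp.Located(word)

result = located_word.parse_string("hello world")

start = result["locn_start"]    # offset where the match begins
end = result["locn_end"]        # offset just past the match
value = list(result["value"])   # tokens matched by the wrapped expression
```

Code that indexed into the extra group returned by the legacy `locatedExpr` is the kind of code that may need adjusting after conversion.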

- Updated all examples and test cases to use PEP8 names (unless the test case is specifically
designed to test behavior of a legacy method). Added railroad diagrams for some examples.

- Added exception handling when calling `formatted_message()`, so that `str(exception)`
always returns at least _something_.
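
The idea of guarding a customizable message formatter can be sketched with the standard library alone. `SketchParseError` and `broken_formatter` are hypothetical names, and the fallback text is an assumption, not the wording pyparsing uses.

```python
class SketchParseError(Exception):
    """Hypothetical exception whose message is built by an overridable formatter."""

    def __init__(self, msg, lineno, column):
        super().__init__(msg)
        self.msg = msg
        self.lineno = lineno
        self.column = column

    def formatted_message(self):
        return f"{self.msg} (at line {self.lineno}, column {self.column})"

    def __str__(self):
        # Guard the formatter: a broken override must not make str() itself raise.
        try:
            return self.formatted_message()
        except Exception as exc:
            return (
                f"{type(self).__name__}: {self.msg} "
                f"({type(exc).__name__} while formatting message)"
            )


def broken_formatter(exc):
    # Deliberately faulty override: refers to an attribute that does not exist.
    return f"{exc.line_number}: {exc.msg}"


normal_text = str(SketchParseError("Expected integer", 3, 7))

SketchParseError.formatted_message = broken_formatter
fallback_text = str(SketchParseError("Expected integer", 3, 7))
```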

- All unit tests pass with Python 3.14, including 3.14t. This does _not_ necessarily
mean that pyparsing is now thread-safe, just that when run in the free-threaded
interpreter, there were no errors. None of the unit tests try to do any parsing
with multiple threads - they test the basic functionality of the library, under various
versions of packrat and left-recursive parsing.

- Added AI instructions so that AI agents can be prompted with best practices
for generating parsers using pyparsing code. These instructions are in the
`ai/best_practices.md` file, and can be accessed programmatically by calling
`pyparsing.show_best_practices()` or running `python -m pyparsing.ai.show_best_practices`
from the command line, after installing the `pyparsing` package.

- Implemented a TINY language parser/interpreter using pyparsing, in the `examples/tiny`
directory. This is a little tutorial language that I used to demonstrate how to use pyparsing to
build a simple interpreter, following a recommended parser+AST+engine+run structure.
The `docs` sub-directory also includes transcripts of the AI session used to create the
parser and the interpreter. The `samples` sub-directory includes a few sample TINY programs.

- Fixed minor formatting bugs in `pyparsing.testing.with_line_numbers`, found during development
of the TINY language example.

- Added test in `DelimitedList` and `nested_expr` (which auto-suppress delimiting commas) to
avoid wrapping the delimiter in a `Suppress` if it is already a `Suppress`.

- Added performance benchmarking tools and documentation:
  - `tests/perf_pyparsing.py` runs a series of benchmark parsing tests, exercising different
    aspects of the pyparsing package. For cross-version analysis, this script can export
    results as CSV and append to a consolidated data file.
  - Runner scripts `run_perf_all_tags.bat` (Windows) and `run_perf_all_tags.sh` (Ubuntu/bash)
    execute the benchmark across multiple Python versions (3.9–3.14) and pyparsing versions
    (3.1.1 through 3.3.0), aggregating results into `perf_pyparsing.csv` at the repo root.
  - See `tests/README.md` for usage instructions.

- Used performance benchmarking to identify and revert an inefficient utility method used in
`transform_string` (introduced in pyparsing 3.2.0b2).

3.2.5

-------------------------------
- JINX! Well, 3.2.4 had a bug for `Word` expressions that include a space
character, if that expression was then copied, either directly with .copy() or
by adding a results name, or included in another construct (like `DelimitedList`)
that makes a copy internally. Issue 618, reported by mstinberg, among others -
thanks, and sorry for the inconvenience.

3.2.4

-------------------------------
- Barring any catastrophic bugs in this release, this will be the last release in
the 3.2.x line. The next release, 3.3.0, will begin emitting `DeprecationWarnings`
when the pre-PEP8 methods are used (see header notes above for more information,
including available automation for converting any existing code using
pyparsing with the old names).

- Fixed bug in which, when using a copy of a `Word` expression (either by using the explicit
`copy()` method, or attaching a results name) and setting a new expression name,
a raised `ParseException` still used the original expression name. Also affected
`Regex` expressions with `as_match` or `as_group_list` = True. Reported by
Waqas Ilyas, in Issue 612 - good catch!

- Fixed type annotation for `replace_with`, to accept `Any` type. Fixes Issue 602,
reported by esquonk.

- Added locking around potential race condition in `ParserElement.reset_cache`, as
well as other cache-related methods. Fixes Issue 604, reported by CarlosDescalziIM.
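
A generic sketch of the locking pattern, using only the standard library. `SketchCache` is a hypothetical class and does not mirror pyparsing's packrat cache internals.

```python
import threading


class SketchCache:
    """Hypothetical memo cache whose reads, writes and resets share one lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._entries:
                self.hits += 1
                return self._entries[key]
            self.misses += 1
            value = self._entries[key] = compute(key)
            return value

    def reset(self):
        # Clearing entries and counters under the same lock keeps them consistent.
        with self._lock:
            self._entries.clear()
            self.hits = self.misses = 0


cache = SketchCache()


def worker():
    for n in range(200):
        cache.get_or_compute(n % 10, lambda k: k * k)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_lookups = cache.hits + cache.misses   # 4 threads x 200 lookups
misses_before_reset = cache.misses          # one miss per distinct key
cache.reset()
```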

- Substantial update to docstrings and doc generation in preparation for 3.3.0,
great effort by FeRD, thanks!

- Notable addition by FeRD to convert docstring examples to work with doctest! This
was long overdue, thanks so much!

3.2.3

---------------------------
- Fixed bug released in 3.2.2 in which `nested_expr` could overwrite parse actions
for defined content, and could truncate list of items within a nested list.
Fixes Issue 600, reported by hoxbro and luisglft, with helpful diag logs and
repro code.

3.2.2

---------------------------
- Released `cvt_pyparsing_pep8_names.py` conversion utility to upgrade pyparsing-based
programs and libraries that use legacy camelCase names to use the new PEP8-compliant
snake_case method names. The converter can also be imported into other scripts as

     from pyparsing.tools.cvt_pyparsing_pep8_names import pep8_converter

- Fixed bug in `nested_expr` where nested contents were stripped of whitespace when
the default whitespace characters were cleared (raised in this StackOverflow
question https://stackoverflow.com/questions/79327649 by Ben Alan). Also addressed
bug in resolving PEP8 compliant argument name and legacy argument name.

- Fixed bug in `rest_of_line` and the underlying `Regex` class, in which matching a
pattern that could match an empty string (such as `".*"` or `"[A-Z]*"`) would not raise
a `ParseException` at or beyond the end of the input string. This could cause an
infinite parsing loop when parsing `rest_of_line` at the end of the input string.
Reported by user Kylotan, thanks! (Issue 593)
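
The underlying hazard can be reproduced with the standard `re` module alone. The `collect_matches` helper below is a hypothetical illustration of the guard, not pyparsing's code.

```python
import re

pattern = re.compile(r".*")
text = "abc"

# At the end of the input the pattern still "matches", but only the empty string.
end_match = pattern.match(text, len(text))
matched_empty_at_end = end_match is not None and end_match.group() == ""


def collect_matches(pattern, text, max_steps=10):
    """Match repeatedly from the current position, rejecting empty matches at the end."""
    pos, found, steps = 0, [], 0
    while steps < max_steps:
        steps += 1
        m = pattern.match(text, pos)
        if m is None or (m.end() == pos and pos >= len(text)):
            break  # treat a zero-width match at end of input as "no match"
        found.append(m.group())
        pos = m.end()
    return found, steps


found, steps = collect_matches(pattern, text)
```

Without the end-of-input check, the loop would keep accepting the empty match at position 3 and only stop at `max_steps`.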

- Enhancements and extra input validation for `pyparsing.util.make_compressed_re` - see
usage in `examples/complex_chemical_formulas.py` and result in the generated railroad
diagram `examples/complex_chemical_formulas_diagram.html`. Properly escapes characters
like "." and "*" that have special meaning in regular expressions.

- Fixed bug in `one_of()` to properly escape characters that are regular expression markers
(such as '*', '+', '?', etc.) before building the internal regex.
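
Why escaping matters when literal strings are joined into a regex can be shown with `re.escape` from the standard library. This sketch only illustrates the class of bug; it is not the code `one_of()` or `make_compressed_re` uses internally.

```python
import re

operators = ["*", "+", "?", "**", "->"]

# Longest alternatives first so "**" is tried before "*".
ordered = sorted(operators, key=len, reverse=True)

# Unescaped, "*" and "+" are regex quantifiers and the pattern does not compile.
try:
    re.compile("|".join(ordered))
    unescaped_compiles = True
except re.error:
    unescaped_compiles = False

# Escaped, every operator is matched as literal text.
escaped_pattern = re.compile("|".join(re.escape(op) for op in ordered))
tokens = escaped_pattern.findall("a ** b * c -> d?")
```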

- Better exception message for `MatchFirst` and `Or` expressions, showing all alternatives
rather than just the first one. Fixes Issue 592, reported by Focke, thanks!

- Added return type annotation of "-> None" for all `__init__()` methods, to satisfy
`mypy --strict` type checking. PR submitted by FeRD, thank you!

- Added optional argument `show_hidden` to `create_diagram` to show
elements that are used internally by pyparsing, but are not part of the actual
parser grammar. For instance, the `Tag` class can insert values into the parsed
results but it does not actually parse any input, so by default it is not included
in a railroad diagram. By calling `create_diagram` with `show_hidden = True`,
these internal elements will be included. (You can see this in the tag_metadata.py
script in the examples directory.)

- Fixed bug in `number_words.py` example. Also added `ebnf_number_words.py` to demonstrate
using the `ebnf.py` EBNF parser generator to build a similar parser directly from
EBNF.

- Fixed syntax warning raised in `bigquery_view_parser.py`, invalid escape sequence "\s".
Reported by sameer-google, nice catch! (Issue 598)

- Added support for Python 3.14.

3.2.1

------------------------------
- Updated generated railroad diagrams to make non-terminal elements links to their related
sub-diagrams. This _greatly_ improves navigation of the diagram, especially for
large, complex parsers.

- Simplified railroad diagrams emitted for parsers using `infix_notation`, by hiding
lookahead terms. Renamed internally generated expressions for clarity, and improved
diagramming.

- Improved performance of `cpp_style_comment`, `c_style_comment`, `common.fnumber`
and `common.ieee_float` `Regex` expressions. PRs submitted by Gabriel Gerlero,
nice work, thanks!

- Add missing type annotations to `match_only_at_col`, `replace_with`, `remove_quotes`,
`with_attribute`, and `with_class`. Issue 585 reported by rafrafrek.

- Added generated diagrams for many of the examples.

- Replaced old `examples/0README.html` file with `examples/README.md` file.

3.2.0

-------------------------------
- Discontinued support for Python 3.6, 3.7, and 3.8. Adopted new Python features from
Python versions 3.7-3.9:
  - Updated type annotations to use built-in container types instead of names
    imported from the `typing` module (e.g., `list[str]` vs `List[str]`).
  - Reworked portions of the packrat cache to leverage insertion-preserving ordering
    in dicts (including removal of uses of `OrderedDict`).
  - Changed `pdb.set_trace()` call in `ParserElement.set_break()` to `breakpoint()`.
  - Converted `typing.NamedTuple` to `dataclasses.dataclass` in railroad diagramming
    code.
  - Added `from __future__ import annotations` to clean up some type annotations.
  (with assistance from ISyncWithFoo, issue 535, thanks for the help!)
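
Two of the language features listed above, built-in generic annotations and insertion-ordered dicts, can be sketched together. `FifoCache` is a hypothetical class written for this example (Python 3.9 or later); it is not the packrat cache itself.

```python
class FifoCache:
    """Hypothetical bounded cache that evicts the oldest entry first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: dict[str, int] = {}  # built-in generic, no typing.Dict needed

    def set(self, key: str, value: int) -> None:
        self._data[key] = value
        while len(self._data) > self.capacity:
            # dicts preserve insertion order, so the first key is the oldest
            oldest = next(iter(self._data))
            del self._data[oldest]

    def keys(self) -> list[str]:
        return list(self._data)


cache = FifoCache(capacity=2)
for name, value in [("a", 1), ("b", 2), ("c", 3)]:
    cache.set(name, value)

remaining = cache.keys()  # "a" was inserted first, so it is evicted first
```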

- POSSIBLE BREAKING CHANGES

 The following bugfixes may result in subtle changes in the results returned or
 exceptions raised by pyparsing.

 - Fixed code in `ParseElementEnhance` subclasses that
   replaced detailed exception messages raised in contained expressions with a
   less-specific and less-informative generic exception message and location.

   If your code has conditional logic based on the message content in raised
   `ParseExceptions`, this bugfix may require changes in your code.

 - Fixed bug in `transform_string()` where whitespace
   in the input string was not properly preserved in the output string.

   If your code uses `transform_string`, this bugfix may require changes in
   your code.

 - Fixed bug where an `IndexError` raised in a parse action was
   incorrectly handled as an `IndexError` raised as part of the `ParserElement`
   parsing methods, and reraised as a `ParseException`. Now an `IndexError`
   raised inside a parse action will properly propagate out as an `IndexError`.
   (Issue 573, reported by August Karlstedt, thanks!)

   If your code raises `IndexError`s in parse actions, this bugfix may require
   changes in your code.

- FIXES AND NEW FEATURES

 - Added type annotations to remainder of `pyparsing` package, and added `mypy`
   run to `tox.ini`, so that type annotations are now run as part of pyparsing's CI.
   Addresses Issue 373, raised by Iwan Aucamp, thanks!

 - Exception message format can now be customized, by overriding
   `ParseBaseException.formatted_message`:

       from pyparsing import ParseBaseException

       def custom_exception_message(exc) -> str:
           found_phrase = f", found {exc.found}" if exc.found else ""
           return f"{exc.lineno}:{exc.column} {exc.msg}{found_phrase}"

       ParseBaseException.formatted_message = custom_exception_message

   (PR 571 submitted by Odysseyas Krystalakos, nice work!)

 - `run_tests` now detects if an exception is raised in a parse action, and will
   report it with an enhanced error message, with the exception type, string,
   and parse action name.

 - `QuotedString` now handles translation of escaped integer, hex, octal, and
   Unicode sequences to their corresponding characters.

 - Fixed the displayed output of `Regex` terms to deduplicate repeated backslashes,
   for easier reading in debugging, printing, and railroad diagrams.

 - Fixed (or at least reduced) elusive bug when generating railroad diagrams,
   where some diagram elements were just empty blocks. Fix submitted by RoDuth,
   thanks a ton!

 - Fixed railroad diagrams that get generated with a parser containing a `Regex` element
   defined using a verbose pattern - the pattern gets flattened and comments removed
   before creating the corresponding diagram element.

 - Defined a more performant regular expression used internally by `common_html_entity`.

 - `Regex` instances can now be created using a callable that takes no arguments
   and just returns a string or a compiled regular expression, so that creating complex
   regular expression patterns can be deferred until they are actually used for the first
   time in the parser.

 - Added optional `flatten` Boolean argument to `ParseResults.as_list()`, to
   return the parsed values in a flattened list.

 - Added `indent` and `base_1` arguments to `pyparsing.testing.with_line_numbers`. When
   using `with_line_numbers` inside a parse action, set `base_1`=False, since the
   reported `loc` value is 0-based. `indent` can be a leading string (typically of
   spaces or tabs) to indent the numbered string passed to `with_line_numbers`.
   Added while working on 557, reported by Bernd Wechner.
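
The deferred `Regex` construction mentioned in the list above (passing a zero-argument callable instead of a pattern) can be approximated with the standard library. `LazyPattern` and `build_keyword_pattern` are hypothetical names for this sketch and do not reflect how pyparsing's `Regex` class is written.

```python
import re


class LazyPattern:
    """Hypothetical wrapper: takes a pattern, or a zero-argument callable returning one."""

    def __init__(self, pattern_or_factory):
        self._source = pattern_or_factory
        self._compiled = None

    @property
    def compiled(self):
        # Build and compile only on first use, then reuse the compiled pattern.
        if self._compiled is None:
            source = self._source() if callable(self._source) else self._source
            self._compiled = source if isinstance(source, re.Pattern) else re.compile(source)
        return self._compiled

    def match(self, text):
        return self.compiled.match(text)


build_calls = []


def build_keyword_pattern():
    build_calls.append(1)  # record each time the (possibly expensive) build runs
    keywords = ["select", "insert", "update", "delete"]
    return r"(?:%s)\b" % "|".join(keywords)


keyword = LazyPattern(build_keyword_pattern)
calls_before_use = len(build_calls)

first = keyword.match("select * from t")
second = keyword.match("drop table t")
calls_after_use = len(build_calls)
```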

- NEW/ENHANCED EXAMPLES

 - Added query syntax to `mongodb_query_expression.py` with:
   - better support for array fields ("contains all",
     "contains any", and "contains none")
   - "like" and "not like" operators to support SQL "%" wildcard matching
     and "=~" operator to support regex matching
   - text search using "search for"
   - dates and datetimes as query values
   - `a[0]` style array referencing

 - Added `lox_parser.py` example, a parser for the Lox language used as a tutorial in
   Robert Nystrom's "Crafting Interpreters" (http://craftinginterpreters.com/).
   With helpful corrections from RoDuth.

 - Added `complex_chemical_formulas.py` example, to add parsing capability for
   formulas such as "3(C₆H₅OH)₂".

 - Updated `tag_emitter.py` to use new `Tag` class, introduced in pyparsing
   3.1.3.

3.1.4

----------------------------
- Fixed a regression introduced in pyparsing 3.1.3, addition of a type annotation that
referenced `re.Pattern`. Since this type was introduced in Python 3.7, using this type
definition broke Python 3.6 installs of pyparsing 3.1.3. PR submitted by Felix Fontein,
nice work!

3.1.3

----------------------------
- Added new `Tag` ParserElement, for inserting metadata into the parsed results.
This allows a parser to add metadata or annotations to the parsed tokens.
The `Tag` element also accepts an optional `value` parameter, defaulting to `True`.
See the new `tag_metadata.py` example in the `examples` directory.

Example:

     from pyparsing import Tag, Word, alphas

     # add tag indicating mood
     end_punc = "." | ("!" + Tag("enthusiastic"))
     greeting = "Hello" + Word(alphas) + end_punc

     result = greeting.parse_string("Hello World.")
     print(result.dump())

     result = greeting.parse_string("Hello World!")
     print(result.dump())

prints:

     ['Hello', 'World', '.']

     ['Hello', 'World', '!']
     - enthusiastic: True

- Added example `mongodb_query_expression.py`, to convert human-readable infix query
expressions (such as `a==100 and b>=200`) and transform them into the equivalent
query argument for the pymongo package (`{'$and': [{'a': 100}, {'b': {'$gte': 200}}]}`).
Supports many equality and inequality operators - see the docstring for the
`transform_query` function for more examples.

- Fixed issue where PEP8 compatibility names for `ParserElement` static methods were
not themselves defined as `staticmethods`. When called using a `ParserElement` instance,
this resulted in a `TypeError` exception. Reported by eylenburg (548).
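
The failure mode is easy to reproduce in plain Python. The `Element` class and its method names below are hypothetical and only mimic the shape of the problem.

```python
class Element:
    """Hypothetical class showing why an alias must itself be a staticmethod."""

    @staticmethod
    def set_default_chars(chars):
        return f"default chars set to {chars!r}"

    # Broken alias: a plain function stored on the class is bound like a method,
    # so an instance call passes the instance as an extra first argument.
    def setDefaultCharsBroken(chars):
        return Element.set_default_chars(chars)

    # Working alias: reuse the staticmethod object itself.
    setDefaultChars = set_default_chars


class_call = Element.setDefaultCharsBroken(" \t")   # fine when called on the class

try:
    Element().setDefaultCharsBroken(" \t")          # instance + argument = 2 arguments
    instance_call_error = None
except TypeError as exc:
    instance_call_error = type(exc).__name__

fixed_call = Element().setDefaultChars(" \t")
```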

- To address a compatibility issue in RDFLib, added a property setter for the
`ParserElement.name` property, to call `ParserElement.set_name`.

- Modified `ParserElement.set_name()` to accept a None value, to clear the defined
name and corresponding error message for a `ParserElement`.
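
The two `name` changes above follow a common pattern: a property whose setter delegates to the existing method, and a method that treats `None` as "clear". A minimal sketch with a hypothetical `NamedElement` class (not pyparsing's `ParserElement`):

```python
class NamedElement:
    """Hypothetical element whose `name` assignment is routed through set_name()."""

    def __init__(self):
        self._custom_name = None

    def _default_name(self):
        return type(self).__name__

    def set_name(self, name):
        # Passing None clears the custom name, falling back to the default.
        self._custom_name = name
        return self  # allow chaining

    @property
    def name(self):
        if self._custom_name is not None:
            return self._custom_name
        return self._default_name()

    @name.setter
    def name(self, new_name):
        self.set_name(new_name)


element = NamedElement()
element.name = "integer"                     # attribute assignment calls set_name()
assigned_name = element.name
cleared_name = element.set_name(None).name   # back to the default name
```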

- Updated railroad diagram generation for `ZeroOrMore` and `OneOrMore` expressions with
`stop_on` expressions, while investigating 558, reported by user Gu_f.

- Added `<META>` tag to HTML generated for railroad diagrams to force UTF-8 encoding
with older browsers, to better display Unicode parser characters.

- Fixed some cosmetics/bugs in railroad diagrams:
  - fixed groups being shown even when `show_groups`=False
  - show results names as quoted strings when `show_results_names`=True
  - only use integer loop counter if repetition > 2

- Some type annotations added for parse action related methods, thanks August
Karlstedt (551).

- Added exception type to `trace_parse_action` exception output, while investigating
SO question posted by medihack.

- Added `set_name` calls to internal expressions generated in `infix_notation`, for
improved railroad diagramming.

- `delta_time`, `lua_parser`, `decaf_parser`, and `roman_numerals` examples cleaned up
to use latest PEP8 names and add minor enhancements.

- Fixed bug (and corresponding test code) in `delta_time` example that did not handle
weekday references in time expressions (like "Monday at 4pm") when the weekday was
the same as the current weekday.

- Minor performance speedup in `trim_arity`, to benefit any parsers using parse actions.

- Added early testing support for Python 3.13 with JIT enabled.

@pyup-bot
Collaborator Author

Closing this in favor of #2429

@pyup-bot pyup-bot closed this Jan 21, 2026
@MBARIMike MBARIMike deleted the pyup-update-pyparsing-3.1.2-to-3.3.1 branch January 21, 2026 10:00