-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
refactor(es/parser): Detach swc_ecma_parser from swc_ecma_lexer
#11148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: 3d13362 The changes in this PR will be included in the next version bump. Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
8aef46c to
d856923
Compare
CodSpeed Performance ReportMerging #11148 will not alter performanceComparing Summary
Benchmarks breakdown
Footnotes
|
This comment was marked as spam.
This comment was marked as spam.
d3b79fc to
236f45e
Compare
20f5d1d to
fc2e4e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the swc_ecma_parser to be self-contained by removing its dependency on swc_ecma_lexer. The code previously shared between the lexer and parser through trait-based abstractions has been copied directly into swc_ecma_parser and simplified by removing generic trait implementations. This improves developer experience by eliminating cross-crate navigation and provides opportunities for future performance optimizations.
Key changes:
- Moved common parsing functions from swc_ecma_lexer into swc_ecma_parser
- Eliminated trait-based generic abstractions in favor of direct implementations
- Maintained backward compatibility for most users (no breaking changes unless using Token API directly)
Reviewed Changes
Copilot reviewed 45 out of 58 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| crates/swc_ecma_parser/src/parser/tests.rs | Moved expression parsing tests from expr/tests.rs and added new benchmarks |
| crates/swc_ecma_parser/src/parser/stmt/module_item.rs | Removed legacy decorator test (moved to module_item.rs) |
| crates/swc_ecma_parser/src/parser/stmt.rs | Copied statement parsing logic from swc_ecma_lexer, removing trait dependencies |
| crates/swc_ecma_parser/src/parser/state.rs | Added new State type and WithState helper for parser state management |
| crates/swc_ecma_parser/src/parser/pat.rs | Copied pattern parsing functions from swc_ecma_lexer |
| crates/swc_ecma_parser/src/parser/object.rs | Added object literal and pattern parsing functions |
| crates/swc_ecma_parser/src/parser/module_item.rs | Copied module item parsing with legacy test moved from stmt/module_item.rs |
| crates/swc_ecma_parser/src/parser/mod.rs | Removed trait dependencies, added utility methods directly to Parser |
| crates/swc_ecma_parser/src/parser/macros.rs | Added missing macros (expect, debug_tracing, peek, return_if_arrow) |
| crates/swc_ecma_parser/src/parser/jsx.rs | Copied JSX parsing functions from swc_ecma_lexer |
| crates/swc_ecma_parser/src/parser/input.rs | Simplified Tokens trait and Buffer implementation |
| crates/swc_ecma_parser/src/parser/ident.rs | Added identifier parsing functions |
| crates/swc_ecma_parser/src/parser/expr/tests.rs | Removed tests (moved to parser/tests.rs) |
| crates/swc_ecma_parser/src/parser/expr/ops.rs | Removed binary operator tests (moved to parser/tests.rs) |
| crates/swc_ecma_parser/src/parser/class_and_fn.rs | Copied class and function parsing from swc_ecma_lexer |
| crates/swc_ecma_parser/src/lib.rs | Updated exports to remove swc_ecma_lexer dependencies |
| crates/swc_ecma_parser/src/lexer/whitespace.rs | Added whitespace scanning implementation |
| crates/swc_ecma_parser/src/lexer/table.rs | Updated to use const generic for mul_mod method |
| crates/swc_ecma_parser/src/lexer/state.rs | Simplified State implementation removing trait dependencies |
| crates/swc_ecma_parser/src/lexer/number.rs | Added number parsing utilities |
Comments suppressed due to low confidence (1)
crates/swc_ecma_parser/src/parser/tests.rs:1
- Corrected spelling of 'THis' to 'This'.
use std::hint::black_box;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
swc_ecma_parser from swc_emca_lexerswc_ecma_parser from swc_ecma_lexer
swc_ecma_parser from swc_ecma_lexerswc_ecma_parser from swc_ecma_lexer
… 10-11-refactor/parser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 46 out of 59 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Background:
In #10377 and #10399, we create another crate
swc_ecma_lexerfromswc_ecma_parser.TokenKindandTokenValueto make lexer and parser run faster. This has been done inswc_ecma_parser, whoseTokenis only 1 byte. This also means we should refactor the lexer and the parser.LexerandTokenpublic, so the change ofTokenwill introduce a large breaking for rust users. So for compatibility, we have to keep the legacy set of lexer and parser to produce compatibleTokens.swc_ecma_lexer/commonby introducing a set of complexyParseTrait,LexerTrait, etc, which makes the project chaotic and less comprehensive. You can see thatswc_ecma_parserdepends onswc_ecma_lexerand calls the common functions everywhere.Motivation:
Description:
Now it's time to correct the decision. This pr makes
swc_ecma_parserself-contained and doesn't dependsswc_ecma_lexerany more. On the contrary, this pr makesswc_ecma_lexerdepends on someswc_ecma_parserinstead such as some common simple data structure likeSyntax. For compatibility, I also move and import legacyTokeninswc_ecma_parser.After this pr the
swc_ecma_lexeris nearly marked as no longer maintained. All the bug fixes and performance optimization should only be applied inswc_ecma_parser.Specifically, what I do in this pr is only copy all common functions from
swc_ecma_lexer/commontoswc_ecma_parserand eliminate the trait-based generics. For example:Note that I nearly doesn't change anything in
swc_ecma_lexerso the lexer and parser in that crate are still based on trait and common function.Breaking Changes:
If you don't use
TokenAPI, then there's no breaking changes, which means for most rust api users, there's no breaking change. Otherwise you may need to remove the dependencies ofswc_ecma_lexerand related imports of traits.Test in community crates:
swc_ecma_lexerand related imports of traits.Future Works:
Actually when I finished copying all the code, the performance got regression. It takes me lots of time to figure it out but I finally keep the regression around -1%. So I have to do some complex optimization ahead of time such as refactor of
parse_subscripts.