A static analysis tool for detecting test smells in End-to-End (E2E) test code written in TypeScript and JavaScript, targeting Selenium, Playwright, Puppeteer, and Cypress.
E2E tests are essential for verifying web applications from the user's perspective, but they often contain test smells — bad practices that hurt readability, maintainability, and reliability.
This tool ships multiple detectors that catch issues like undocumented assertions, hardcoded waits, empty tests, conditional logic in test bodies, magic numbers, absolute URLs/XPaths, and more. Parsing relies on tree-sitter for accurate AST-level analysis.
There are two ways to use it:
- Batch scanner — analyzes the entire dataset and produces TXT/CSV reports
- Web app — Flask-based UI to browse repos, view source with syntax highlighting, and run detectors interactively on individual files
The underlying dataset is E2EGit, a curated collection of GitHub repositories containing E2E tests, stored as a SQLite database.
Requirements:

- Python 3.8+ (tested on Linux)
- pip
```bash
python -m venv venv
source venv/bin/activate    # Linux / macOS
# venv\Scripts\activate     # Windows
pip install -r requirements.txt
python setup.py
```

This automatically downloads the E2EGit.db database into data/, creates the repos/ directory, and checks that all required packages are installed.
```bash
python utils/download_repos.py
```

This downloads all TypeScript and JavaScript test files in parallel (10 threads by default). Files are fetched at the exact commit SHA recorded in the database and saved under repos/, organized by framework (e.g. repos/playwright_ts/, repos/cypress_js/).
You can also download only one language or tune parallelism:
```bash
python utils/download_repos.py --lang ts     # TypeScript only
python utils/download_repos.py --lang js     # JavaScript only
python utils/download_repos.py --workers 16  # more threads
```

Batch scanner — processes the full dataset:
```bash
python utils/typescript_analysis.py   # outputs typescript_analysis.txt/.csv
python utils/javascript_analysis.py   # outputs javascript_analysis.txt/.csv
```

Web app — interactive file-by-file inspection:
```bash
python web_app/run.py
```

Then open http://localhost:5000. Select a language, browse frameworks and repos, open a file, pick the detectors you want, and run — smells are highlighted directly in the code.
Run the tool's own test suite:

```bash
python -m pytest tests/               # all tests
python -m pytest tests/typescript/   # TypeScript detectors only
python -m pytest tests/javascript/   # JavaScript detectors only
```

The detectors cover the following smells.

Tests that use absolute URLs (e.g. http://staging.example.com/login) instead of relative ones (e.g. /login). This couples tests to a specific environment and makes them brittle when hosts, ports, or domains change.
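To make this concrete, a hypothetical sketch (host, path, and selectors invented); the snippets in this section use Playwright Test syntax purely for illustration, and the detectors also cover the equivalent patterns in Selenium, Puppeteer, and Cypress. The relative form assumes a baseURL configured for the project:

```ts
import { test, expect } from '@playwright/test';

test('login page loads', async ({ page }) => {
  // Smell: absolute URL couples the test to one environment
  await page.goto('http://staging.example.com/login');

  // Better: relative path resolved against the configured baseURL
  await page.goto('/login');
  await expect(page).toHaveURL(/\/login$/);
});
```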
Tests that use absolute XPaths (e.g. /html/body/div[2]/form/input[1]) instead of relative ones (e.g. //input[@id='username']). Absolute XPaths depend heavily on DOM structure and break when the layout changes.
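A hypothetical sketch of both forms (field and page invented):

```ts
import { test } from '@playwright/test';

test('fill the username field', async ({ page }) => {
  await page.goto('/login');
  // Smell: absolute XPath mirrors the entire DOM path from the root
  await page.locator('xpath=/html/body/div[2]/form/input[1]').fill('alice');

  // Better: a relative XPath anchored on a stable attribute
  await page.locator("xpath=//input[@id='username']").fill('alice');
});
```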
Test methods with multiple undocumented assertions. When one fails it is unclear which assertion broke and why, making debugging harder.
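An invented example; one way to document an assertion in Playwright Test is the optional message argument to expect:

```ts
import { test, expect } from '@playwright/test';

test('profile page shows account details', async ({ page }) => {
  await page.goto('/profile');

  // Smell: several assertions with no message; a failure is hard to attribute
  await expect(page.locator('.email')).toHaveText('alice@example.com');
  await expect(page.locator('.plan')).toHaveText('Pro');

  // Better: attach a message so the failing check identifies itself
  await expect(page.locator('.plan'), 'plan badge should show the paid tier')
    .toHaveText('Pro');
});
```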
Test methods that exceed a statement count threshold. Large, multi-purpose tests are harder to understand, debug, and maintain, and they violate the single responsibility principle.
Usage of if, for, switch, while, and similar control flow inside tests. Branches and loops make tests harder to reason about and can hide bugs — tests should be linear and deterministic.
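For instance (page and selector invented):

```ts
import { test, expect } from '@playwright/test';

test('dashboard banner', async ({ page }) => {
  await page.goto('/dashboard');

  // Smell: branching hides which path actually ran
  if (await page.locator('.banner').isVisible()) {
    await expect(page.locator('.banner')).toHaveText('Welcome');
  }

  // Better: make the expectation unconditional and deterministic
  await expect(page.locator('.banner')).toHaveText('Welcome');
});
```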
Test classes that define constructors. Constructors add setup logic at class level and can create hidden dependencies or shared state between tests, reducing isolation and clarity.
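A minimal sketch, assuming a class-based test suite:

```ts
// Smell: a test class whose constructor performs setup
class CheckoutTests {
  private cart: string[];

  constructor() {
    // Hidden setup logic, created outside any framework hook and
    // implicitly shared by every test method on the class
    this.cart = ['item-1'];
  }
}

// Better: keep tests function-based and do setup in beforeEach-style hooks
```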
Assertions that are repeated within the same test method (same expected/actual values). Duplicate assertions add no value, increase maintenance burden, and may mask a missing verification.
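An invented example with a stand-in helper:

```ts
import { test, expect } from '@playwright/test';

const computeTotal = (items: string[]) => items.length; // stand-in for real logic

test('cart total', () => {
  const total = computeTotal(['a', 'b', 'c']);
  // Smell: identical assertion repeated; the duplicate verifies nothing new
  expect(total).toBe(3);
  expect(total).toBe(3);
});
```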
Test methods whose body contains no executable code (only comments or nothing at all). Empty tests give false confidence that something is being tested while verifying nothing.
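For example:

```ts
import { test } from '@playwright/test';

// Smell: the body contains no executable code, yet the test reports as passing
test('user can reset password', async () => {
  // TODO: implement
});
```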
This smell occurs when a test method explicitly passes or fails based on the code under test throwing an exception, using custom try/catch blocks or manual throw statements. Instead of encoding this logic by hand, tests should use the testing framework's built-in support for expecting exceptions (e.g. expect(promise).rejects.toThrow(...) or await expect(fn).rejects.toThrow(...)), so that failures are reported consistently and intent is clear.
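A sketch with a hypothetical submitForm under test; note that the hand-rolled variant even swallows its own failure marker:

```ts
import { test, expect } from '@playwright/test';

// Hypothetical function under test
async function submitForm(data: { email: string }): Promise<void> {
  if (!data.email.includes('@')) throw new Error('invalid email');
}

test('rejects an invalid email', async () => {
  // Smell: hand-rolled pass/fail logic around the expected exception
  try {
    await submitForm({ email: 'not-an-email' });
    throw new Error('expected submitForm to throw');
  } catch {
    // swallows any error, including the failure marker thrown above
  }

  // Better: let the framework assert on the rejection
  await expect(submitForm({ email: 'not-an-email' })).rejects.toThrow('invalid email');
});
```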
Tests that use global or module-level variables to share state. This creates hidden dependencies between tests and can cause order-dependent, flaky, and hard-to-understand behavior.
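An invented two-test example:

```ts
import { test, expect } from '@playwright/test';

// Smell: module-level state shared across tests
let savedUserId: string;

test('creates a user', () => {
  savedUserId = 'u-123'; // imagine this came from the app under test
  expect(savedUserId).toBeTruthy();
});

test('deletes the user', () => {
  // Depends on the previous test having run first: order-dependent and flaky
  expect(savedUserId).toBe('u-123');
});
```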
Numeric literals used directly in assertions without named constants or variables (e.g. expect(items).toHaveLength(7)). Magic numbers hide intent, reduce readability, and make tests harder to maintain.
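A hypothetical sketch (count and selector invented):

```ts
import { test, expect } from '@playwright/test';

test('order list', async ({ page }) => {
  await page.goto('/orders');
  // Smell: what does 7 mean here?
  await expect(page.locator('.order-row')).toHaveCount(7);

  // Better: name the value so the intent is explicit
  const SEEDED_ORDER_COUNT = 7;
  await expect(page.locator('.order-row')).toHaveCount(SEEDED_ORDER_COUNT);
});
```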
Usage of broad tag-based locators (e.g. div, span, button) instead of more explicit, intention-revealing locators (role, test id, stable attributes). Even when such a locator currently matches a single element, it is inherently fragile: any small DOM change (adding another div, wrapping content, tweaking layout) can make it match multiple or different elements, breaking the test without any semantic change in behavior.
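For example (page invented):

```ts
import { test } from '@playwright/test';

test('submit the form', async ({ page }) => {
  await page.goto('/signup');
  // Smell: bare tag locator; any extra <button> on the page breaks it
  await page.locator('button').click();

  // Better: an intention-revealing, role-based locator
  await page.getByRole('button', { name: 'Sign up' }).click();
});
```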
Tests that depend on external resources such as the file system, databases, or network services without explicit setup. Hidden dependencies make tests slow, fragile, and reliant on external state.
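A sketch with an invented fixture path:

```ts
import { test } from '@playwright/test';
import { readFileSync } from 'node:fs';

test('imports users from a file', async ({ page }) => {
  // Smell: silently depends on a file existing on this machine
  const users = JSON.parse(readFileSync('/tmp/users.json', 'utf8'));
  await page.goto('/import');
  // Better: create the fixture explicitly in setup, or inline the data
});
```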
Locators that do not use stable, intention-revealing patterns such as data-* attributes (data-testid, data-cy, data-qa) or role-based queries like getByRole(). When tests rely on bare CSS strings without these signals, the selectors tend to couple to incidental structure (specific DOM shape, class names, nesting) and become brittle as the UI markup evolves.
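An invented example; in Playwright, getByTestId targets the data-testid attribute by default:

```ts
import { test } from '@playwright/test';

test('open account settings', async ({ page }) => {
  await page.goto('/');
  // Smell: couples to incidental structure and class names
  await page.locator('div.header > ul li:nth-child(3) a').click();

  // Better: stable, intention-revealing hooks
  await page.getByTestId('account-menu').click();
  await page.getByRole('link', { name: 'Settings' }).click();
});
```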
Assertions where the expected and actual values are the same (e.g. expect(x).toBe(x)). Such assertions are always true by definition and do not verify any behavior.
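For example:

```ts
import { test, expect } from '@playwright/test';

test('price calculation', () => {
  const price = 19.99 * 2;
  // Smell: compares a value to itself, so it always passes
  expect(price).toBe(price);

  // Better: compare against an independently known expectation
  expect(price).toBeCloseTo(39.98);
});
```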
Logging or print statements left in tests (e.g. console.log, console.warn, cy.log). These add noise to test output and serve no purpose in automated test suites.
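For example:

```ts
import { test, expect } from '@playwright/test';

test('loads the home page', async ({ page }) => {
  await page.goto('/');
  console.log('navigated to home'); // Smell: leftover debug output
  await expect(page).toHaveTitle(/Home/);
});
```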
Occurs when objects are verified by calling toString() / String() and comparing the resulting string to a specific literal. In this case the assertion is tied to the current implementation of toString(), so even harmless formatting changes can break the test while the behavior remains correct. A better approach is to expose explicit, domain-level properties or helper methods on the object and assert on those instead of its string representation.
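A self-contained sketch:

```ts
import { test, expect } from '@playwright/test';

class Order {
  constructor(public id: number, public status: string) {}
  toString() { return `Order(id=${this.id}, status=${this.status})`; }
}

test('order is open', () => {
  const order = new Order(7, 'OPEN');
  // Smell: tied to the exact formatting of toString()
  expect(order.toString()).toBe('Order(id=7, status=OPEN)');

  // Better: assert on explicit, domain-level properties
  expect(order.id).toBe(7);
  expect(order.status).toBe('OPEN');
});
```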
Arbitrary time-based waits (e.g. setTimeout, page.waitForTimeout, cy.wait(ms)) instead of condition-based waits. Fixed delays slow down tests and cause flakiness when timing varies across environments.
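For example (timings and headings invented):

```ts
import { test, expect } from '@playwright/test';

test('dashboard loads after login', async ({ page }) => {
  await page.goto('/login');
  // Smell: fixed delay, wasteful when the app is fast and flaky when slow
  await page.waitForTimeout(5000);

  // Better: wait on an explicit condition with a timeout
  await expect(page.getByRole('heading', { name: 'Dashboard' }))
    .toBeVisible({ timeout: 10_000 });
});
```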
Test methods that perform actions but contain no assertions. Without assertions the test does not verify anything, so passing gives no confidence that the code works correctly.
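An invented example:

```ts
import { test, expect } from '@playwright/test';

test('user can log in', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('alice@example.com');
  await page.getByRole('button', { name: 'Log in' }).click();
  // Smell: actions only; nothing checks the outcome

  // Better: finish with an explicit verification
  await expect(page.getByRole('heading', { name: 'Welcome' })).toBeVisible();
});
```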
Locators that rely on visible text content (e.g. getByText, contains, linkText). Text-based locators are fragile when UI copy is updated, translated, or localized.
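For example (link names invented):

```ts
import { test } from '@playwright/test';

test('open the settings page', async ({ page }) => {
  await page.goto('/');
  // Smell: breaks as soon as the copy changes or is translated
  await page.getByText('Settings').click();

  // Better: a locator that survives copy and localization changes
  await page.getByTestId('settings-link').click();
});
```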