squidslab/e2e-test-smell-analyzer
E2E Test Smell Detector

A static analysis tool for detecting test smells in End-to-End (E2E) test code written in TypeScript and JavaScript, targeting Selenium, Playwright, Puppeteer, and Cypress.

Overview

E2E tests are essential for verifying web applications from the user's perspective, but they often contain test smells — bad practices that hurt readability, maintainability, and reliability.

This tool ships multiple detectors that catch issues like undocumented assertions, hardcoded waits, empty tests, conditional logic in test bodies, magic numbers, absolute URLs/XPaths, and more. Parsing relies on tree-sitter for accurate AST-level analysis.

There are two ways to use it:

  • Batch scanner — analyzes the entire dataset and produces TXT/CSV reports
  • Web app — Flask-based UI to browse repos, view source with syntax highlighting, and run detectors interactively on individual files

The underlying dataset is E2EGit, a curated collection of GitHub repositories containing E2E tests, stored as a SQLite database.


Requirements

  • Python 3.8+ (tested on Linux)
  • pip

Getting Started

1. Install dependencies

python -m venv venv
source venv/bin/activate      # Linux / macOS
# venv\Scripts\activate       # Windows

pip install -r requirements.txt

2. Download the dataset

python setup.py

This automatically downloads the E2EGit.db database into data/, creates the repos/ directory, and checks that all packages are installed.

3. Download test repositories from GitHub

python utils/download_repos.py

This downloads all TypeScript and JavaScript test files in parallel (10 threads by default). Files are fetched at the exact commit SHA recorded in the database and saved under repos/ organized by framework (e.g. repos/playwright_ts/, repos/cypress_js/).

You can also download only one language or tune parallelism:

python utils/download_repos.py --lang ts          # TypeScript only
python utils/download_repos.py --lang js          # JavaScript only
python utils/download_repos.py --workers 16       # more threads

4. Run the analysis

Batch scanner — processes the full dataset:

python utils/typescript_analysis.py    # outputs typescript_analysis.txt/.csv
python utils/javascript_analysis.py    # outputs javascript_analysis.txt/.csv

Web app — interactive file-by-file inspection:

python web_app/run.py

Then open http://localhost:5000. Select a language, browse frameworks and repos, open a file, pick the detectors you want, and run — smells are highlighted directly in the code.


Running Tests

python -m pytest tests/                # all tests
python -m pytest tests/typescript/     # TypeScript detectors only
python -m pytest tests/javascript/     # JavaScript detectors only

Supported Smells (TS/JS)

Absolute URL

Tests that use absolute URLs (e.g. http://staging.example.com/login) instead of relative ones (e.g. /login). This couples tests to a specific environment and makes them brittle when hosts, ports, or domains change.

Absolute XPath

Tests that use absolute XPaths (e.g. /html/body/div[2]/form/input[1]) instead of relative ones (e.g. //input[@id='username']). Absolute XPaths depend heavily on DOM structure and break when the layout changes.

Assertion Roulette

Test methods with multiple undocumented assertions. When one fails it is unclear which assertion broke and why, making debugging harder.

Complex Test

Test methods that exceed a statement count threshold. Large, multi-purpose tests are harder to understand, debug, and maintain, and they violate the single responsibility principle.

Conditional Logic

Usage of if, for, switch, while, and similar control flow inside tests. Branches and loops make tests harder to reason about and can hide bugs — tests should be linear and deterministic.

Constructor Initialization

Test classes that define constructors. Constructors add setup logic at class level and can create hidden dependencies or shared state between tests, reducing isolation and clarity.

Duplicate Assert

Assertions that are repeated within the same test method (same expected/actual values). Duplicate assertions add no value, increase maintenance burden, and may mask a missing verification.

Empty Test

Test methods whose body contains no executable code (only comments or nothing at all). Empty tests give false confidence that something is being tested while verifying nothing.

Exception Handling

This smell occurs when a test method explicitly passes or fails based on the code under test throwing an exception, using custom try/catch blocks or manual throw statements. Instead of encoding this logic by hand, tests should use the testing framework's built-in support for expecting exceptions (e.g. expect(promise).rejects.toThrow(...) or await expect(fn).rejects.toThrow(...)), so that failures are reported consistently and intent is clear.

Global Variable

Tests that use global or module-level variables to share state. This creates hidden dependencies between tests and can cause order-dependent, flaky, and hard-to-understand behavior.

Magic Number

Numeric literals used directly in assertions without named constants or variables (e.g. expect(items).toHaveLength(7)). Magic numbers hide intent, reduce readability, and make tests harder to maintain.

Misused Tag Locator

Usage of broad tag-based locators (e.g. div, span, button) instead of more explicit, intention-revealing locators (role, test id, stable attributes). Even when such a locator currently matches a single element, it is inherently fragile: any small DOM change (adding another div, wrapping content, tweaking layout) can make it match multiple or different elements, breaking the test without any semantic change in behavior.

Mystery Guest

Tests that depend on external resources such as the file system, databases, or network services without explicit setup. Hidden dependencies make tests slow, fragile, and reliant on external state.

Non-Preferred Locator

Locators that do not use stable, intention-revealing patterns such as data-* attributes (data-testid, data-cy, data-qa) or role-based queries like getByRole(). When tests rely on bare CSS strings without these signals, the selectors tend to couple to incidental structure (specific DOM shape, class names, nesting) and become brittle as the UI markup evolves.

Redundant Assertion

Assertions where the expected and actual values are the same (e.g. expect(x).toBe(x)). Such assertions are always true by definition and do not verify any behavior.

Redundant Print

Logging or print statements left in tests (e.g. console.log, console.warn, cy.log). These add noise to test output and serve no purpose in automated test suites.

Sensitive Equality

Occurs when objects are verified by calling toString() / String() and comparing the resulting string to a specific literal. In this case the assertion is tied to the current implementation of toString(), so even harmless formatting changes can break the test while the behavior remains correct. A better approach is to expose explicit, domain-level properties or helper methods on the object and assert on those instead of its string representation.

Sleepy Test

Arbitrary time-based waits (e.g. setTimeout, page.waitForTimeout, cy.wait(ms)) instead of condition-based waits. Fixed delays slow down tests and cause flakiness when timing varies across environments.

Unknown Test

Test methods that perform actions but contain no assertions. Without assertions the test does not verify anything, so passing gives no confidence that the code works correctly.

Unstable Link Text

Locators that rely on visible text content (e.g. getByText, contains, linkText). Text-based locators are fragile when UI copy is updated, translated, or localized.
