feat(preset-basic): consider TypeScript AST parsing for complex pattern rules #36

@aridyckovsky

Summary

Evaluate using TypeScript AST parsing (via @typescript-eslint/parser or ts-morph) instead of regex for complex pattern rules to improve accuracy and reduce false positives.

Problem

The current implementation uses regex patterns for rule detection:

const noAsyncAwait: Rule = {
  id: "no-async-await",
  kind: "pattern",
  run: (ctx) => Effect.gen(function* () {
    const files = yield* ctx.listFiles(["**/*.ts", "**/*.tsx"])
    for (const file of files) {
      const content = yield* ctx.readFile(file)
      const pattern = /async\s+(function|\w+\s*=>)/g
      // ... find matches
    }
  })
}

Limitations:

  • False positives: Matches async in strings, comments, and type definitions
  • False negatives: Misses parenthesized arrow functions (async () => ...) and async methods, which the pattern above does not cover
  • Context unaware: Can't distinguish function declarations from types
  • Brittle: Complex regex patterns are hard to maintain
  • Limited scope: Can't analyze AST structure (e.g., "find all Effect.gen without yield*")
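To make these limitations concrete, the sketch below runs the pattern from the snippet above against a few hand-written lines. The sample strings and the `matches` helper are illustrative only and are not part of the codebase. It shows two non-code lines that the pattern flags, and two real async forms that it misses.

```typescript
// The pattern from the current rule, without the `g` flag so that
// `test` is stateless across calls.
const asyncPattern = /async\s+(function|\w+\s*=>)/

// Hypothetical helper: does the regex flag this line?
const matches = (line: string): boolean => asyncPattern.test(line)

// Real async code the rule is meant to catch.
const declaration = "async function load() {}"
const singleParamArrow = "const f = async x => x"

// Not code at all: a comment and a string literal.
const comment = "// avoid async function here"
const literal = 'const msg = "async function is banned"'

// Real async code that the pattern does not cover.
const parenArrow = "const g = async () => 1"
const method = "class A { async load() {} }"

console.log(matches(declaration), matches(singleParamArrow)) // true true
console.log(matches(comment), matches(literal))              // true true (false positives)
console.log(matches(parenArrow), matches(method))            // false false (missed)
```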

Proposed Solution

Use a TypeScript parser for accurate, syntax-aware analysis:

Option 1: @typescript-eslint/parser

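import { parse } from '@typescript-eslint/parser'
// Node types for visitor callbacks (the types package exports TSESTree, not AST)
import type { TSESTree } from '@typescript-eslint/types'
// NOTE: `walk` below is a visitor-style traversal helper that does not exist yet;
// it would live in the proposed ast-helpers.ts.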

const noAsyncAwait: Rule = {
  id: "no-async-await",
  kind: "pattern",
  run: (ctx) => Effect.gen(function* () {
    const files = yield* ctx.listFiles(["**/*.ts", "**/*.tsx"])
    const results: RuleResult[] = []
    
    for (const file of files) {
      const content = yield* ctx.readFile(file)
      
      // Parse to AST
      const ast = parse(content, {
        ecmaVersion: 2022,
        sourceType: 'module',
        filePath: file, // lets the parser treat .tsx files as TSX
        loc: true,
        range: true
      })
      
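      // Walk AST looking for async functions. Only declarations are handled here;
      // FunctionExpression and ArrowFunctionExpression nodes carry the same `async` flag.
      walk(ast, {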
        FunctionDeclaration(node) {
          if (node.async) {
            results.push({
              id: "no-async-await",
              ruleKind: "pattern",
              message: "Replace async function with Effect.gen",
              severity: "warning",
              file,
              range: {
                start: { line: node.loc.start.line, column: node.loc.start.column },
                end: { line: node.loc.end.line, column: node.loc.end.column }
              }
            })
          }
        }
      })
    }
    
    return results
  })
}
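The `walk` call in Option 1 is not provided by the parser package. A minimal sketch of what such a helper might look like follows, assuming ESTree-style nodes (plain objects with a string `type` field). The names `AstNode`, `Visitors`, `isNode`, and `findAsync` are hypothetical, and a real helper would likely use the typed node definitions instead of this loose shape.

```typescript
// Minimal node shape: ESTree-style nodes are plain objects with a string `type`.
interface AstNode {
  type: string
  [key: string]: unknown
}

type Visitors = Record<string, (node: AstNode) => void>

const isNode = (value: unknown): value is AstNode =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as { type?: unknown }).type === "string"

// Depth-first traversal that calls the visitor registered for each node type.
function walk(node: AstNode, visitors: Visitors): void {
  visitors[node.type]?.(node)
  for (const [key, value] of Object.entries(node)) {
    if (key === "parent") continue // avoid cycles if parent pointers are set
    if (Array.isArray(value)) {
      for (const child of value) {
        if (isNode(child)) walk(child, visitors)
      }
    } else if (isNode(value)) {
      walk(value, visitors)
    }
  }
}

// Collect the type of every function-like node that has `async: true`.
function findAsync(root: AstNode): string[] {
  const found: string[] = []
  const visit = (node: AstNode): void => {
    if (node.async === true) found.push(node.type)
  }
  walk(root, {
    FunctionDeclaration: visit,
    FunctionExpression: visit,
    ArrowFunctionExpression: visit,
  })
  return found
}
```

Registering the same visitor for all three function-like node types covers arrow functions and function expressions, which a `FunctionDeclaration`-only visitor would skip.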

Option 2: ts-morph

import { Project } from 'ts-morph'

const noAsyncAwait: Rule = {
  id: "no-async-await",
  kind: "pattern",
  run: (ctx) => Effect.gen(function* () {
    const project = new Project()
    const files = yield* ctx.listFiles(["**/*.ts", "**/*.tsx"])
    
    const results: RuleResult[] = []
    
    for (const file of files) {
      const sourceFile = project.addSourceFileAtPath(file)
      
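      // Find async function declarations. Note: getFunctions() returns only
      // declarations directly in the file body; arrow functions, methods, and
      // nested functions need a descendant traversal.
      const asyncFunctions = sourceFile.getFunctions()
        .filter(fn => fn.isAsync())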
      
      for (const fn of asyncFunctions) {
        results.push({
          id: "no-async-await",
          // ... location from fn.getStartLineNumber()
        })
      }
    }
    
    return results
  })
}

Trade-offs

Regex (Current)

Pros:

  • Fast and lightweight
  • No additional dependencies
  • Simple to implement

Cons:

  • False positives
  • Limited to surface-level patterns
  • Brittle for complex rules

AST Parsing

Pros:

  • Accurate, syntax-aware analysis
  • No false positives from strings/comments
  • Can analyze complex structures
  • Better error locations

Cons:

  • Slower (parsing overhead)
  • Heavier dependencies (~5-10 MB for @typescript-eslint)
  • More complex implementation

Recommendation

Hybrid approach:

  1. Keep regex for simple, fast rules (barrel imports, console.log)
  2. Use AST parsing for complex rules that need semantic accuracy:
    • no-async-await (distinguish function vs type)
    • no-unhandled-effect (detect Effect without yield*)
    • no-effect-gen-try-catch (analyze try/catch inside gen)
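One way the hybrid split might be expressed is a rule shape that carries either a regex or an AST-based detector, with a single runner that dispatches on the kind. This is a sketch under assumptions: `Finding`, `Detector`, `HybridRule`, and `runRule` are hypothetical names, and the actual `Rule` type in preset-basic (which runs inside `Effect.gen`) is not modeled here.

```typescript
// Hypothetical shapes; the real Rule type in preset-basic may differ.
interface Finding {
  ruleId: string
  line: number // 1-based
}

type Detector =
  | { kind: "regex"; pattern: RegExp }
  | { kind: "ast"; detect: (source: string) => number[] } // returns 1-based lines

interface HybridRule {
  id: string
  detector: Detector
}

function runRule(rule: HybridRule, source: string): Finding[] {
  const detector = rule.detector
  if (detector.kind === "ast") {
    // AST path: the detector parses the source and reports line numbers.
    return detector.detect(source).map((line) => ({ ruleId: rule.id, line }))
  }
  // Regex path: cheap line-by-line scan. `search` ignores the `g` flag
  // and `lastIndex`, so the pattern can be reused safely.
  const pattern = detector.pattern
  const findings: Finding[] = []
  source.split("\n").forEach((text, index) => {
    if (text.search(pattern) !== -1) {
      findings.push({ ruleId: rule.id, line: index + 1 })
    }
  })
  return findings
}

// A simple rule stays on the regex path.
const noConsoleLog: HybridRule = {
  id: "no-console-log",
  detector: { kind: "regex", pattern: /\bconsole\.log\(/ },
}
```

Keeping the dispatch in one place means a rule can move from regex to AST later by swapping its detector, without touching the callers.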

Implementation Plan

  1. Add @typescript-eslint/parser as an optional dependency of preset-basic
  2. Create packages/preset-basic/src/ast-helpers.ts with AST utilities
  3. Refactor 3-5 complex rules to use AST
  4. Benchmark performance impact (ensure <200ms overhead per file)
  5. Document when to use regex vs AST in packages/preset-basic/AGENTS.md
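For step 4, a small timing harness could compare the two approaches on the same inputs. The sketch below is one possible shape, not an existing utility: `meanMsPerFile` and `withinBudget` are hypothetical names, the detectors are passed in as plain functions, and the 200 ms default comes from the budget stated in the plan.

```typescript
// Time a detector over in-memory sources and return the mean cost per file in ms.
function meanMsPerFile(
  detect: (source: string) => unknown,
  sources: string[],
  rounds = 5,
): number {
  if (sources.length === 0 || rounds <= 0) return 0
  const start = performance.now()
  for (let round = 0; round < rounds; round++) {
    for (const source of sources) detect(source)
  }
  return (performance.now() - start) / (rounds * sources.length)
}

// Compare an AST detector against a regex baseline and check the added
// per-file cost against a budget.
function withinBudget(
  regexDetect: (source: string) => unknown,
  astDetect: (source: string) => unknown,
  sources: string[],
  budgetMs = 200,
): boolean {
  const overhead = meanMsPerFile(astDetect, sources) - meanMsPerFile(regexDetect, sources)
  return overhead < budgetMs
}
```

Running several rounds and averaging smooths out warm-up effects, though a real benchmark would probably also want to read a representative set of files from the repository.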

Acceptance Criteria

  • AST parser integrated (@typescript-eslint/parser or ts-morph)
  • 3-5 rules migrated to AST parsing
  • Performance benchmark shows acceptable overhead
  • False positive rate reduced
  • Documentation updated with guidance

Performance Target

  • Regex: ~5ms per file
  • AST: ~50ms per file (10x slower, but acceptable for accuracy)
  • Hybrid: Use AST only when needed

Labels: pkg:preset, priority:low, rules, type:feature