Alias-aware token threading for better parallelism #1

@maleadt

All memory operations (loads, stores, atomics) are currently threaded through a single global token chain. This is correct but conservative: operations on independent arrays are serialized unnecessarily.
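To make the cost concrete, here is a minimal Python sketch of a single global chain. It is only an illustration: `thread_global` and the operation labels are hypothetical and are not taken from the actual compiler.

```python
def thread_global(ops):
    """Chain every memory op to the one before it, regardless of address."""
    deps = {}
    prev = None
    for name in ops:
        # Each op waits on the previous op's token, even if the two
        # touch unrelated arrays.
        deps[name] = [prev] if prev is not None else []
        prev = name
    return deps


deps = thread_global(["load a", "store b", "load c"])
for name, waits_on in deps.items():
    print(name, "<-", waits_on)
```

In this sketch `store b` waits on `load a` and `load c` waits on `store b`, although the three operations name different arrays.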

Proposed improvement

Implement alias-aware token threading:

  1. Alias analysis: Compute which pointers may refer to the same memory region (alias sets)
  2. Per-set token chains: Thread tokens only between operations that may alias
  3. Loop-parallel stores: Identify stores in loops whose indices do not overlap across iterations; these can skip token dependencies entirely
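
The three steps above could be sketched roughly as follows. This is a simplified model under stated assumptions, not the cuTile implementation: `MemOp`, `AliasSets`, `thread_tokens`, and `store_is_loop_parallel` are hypothetical names, alias sets are modeled with a union-find over pointer ids, and the loop check only handles affine indices where iteration `i` writes the range `[stride*i, stride*i + width)`.

```python
from dataclasses import dataclass, field


@dataclass
class MemOp:
    name: str                                  # label, e.g. "load a"
    ptr: str                                   # hypothetical base-pointer id
    deps: list = field(default_factory=list)   # ops this one must follow


class AliasSets:
    """Union-find over pointer ids; ids in one set may alias."""

    def __init__(self):
        self.parent = {}

    def find(self, p):
        self.parent.setdefault(p, p)
        while self.parent[p] != p:
            self.parent[p] = self.parent[self.parent[p]]  # path halving
            p = self.parent[p]
        return p

    def merge(self, p, q):
        """Record that p and q may refer to the same memory region."""
        self.parent[self.find(p)] = self.find(q)


def thread_tokens(ops, aliases):
    """Chain each op to the previous op in its alias set only."""
    last = {}  # alias-set representative -> name of latest op in that set
    for op in ops:
        s = aliases.find(op.ptr)
        if s in last:
            op.deps.append(last[s])
        last[s] = op.name
    return ops


def store_is_loop_parallel(stride, width, trip_count):
    """True if no two iterations write overlapping ranges.

    Assumes width > 0 and that iteration i writes
    [stride*i, stride*i + width).
    """
    return trip_count <= 1 or abs(stride) >= width


aliases = AliasSets()
aliases.merge("b", "c")  # assume the analysis could not separate b and c
ops = thread_tokens(
    [MemOp("load a", "a"), MemOp("store b", "b"),
     MemOp("store a", "a"), MemOp("load c", "c")],
    aliases,
)
for op in ops:
    print(op.name, "<-", op.deps)
```

Here `store a` depends only on `load a`, and `load c` depends only on `store b`, so the two chains can proceed independently. A real implementation would likely also distinguish loads from stores within a set (two loads need no mutual ordering); the sketch keeps one chain per set for brevity.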

Why

The current sequential approach prevents parallelism between independent memory operations. For example, a load from array a and a store to array b need no ordering constraint if the two arrays are provably disjoint. Alias-aware threading preserves correctness while letting the hardware execute independent operations concurrently.

Reference implementation

cuTile Python implements this in:
