Skip to content

MapTiling Bug? #3

@aydogdub

Description

@aydogdub

(Potential) Bug Description

The MapTiling transformation does not appear to handle maps with strided ranges (stride > 1) correctly. Specifically, when applied to such maps, the resulting SDFG differs semantically from the original and produces incorrect results.

Steps to Reproduce

I recommend to run the following code snippets in a Jupyter notebook.

  1. Import the needed modules

    import numpy as np
    
    import dace
    from dace.sdfg import nodes
    from dace.transformation.dataflow.tiling import MapTiling
  2. The example SDFG

    N = dace.symbol('N')
    s = 33
    @dace.program
    def vector_copy_strides(A: dace.uint32[N], B: dace.uint32[N]):
        for i in dace.map[0:N:s] @ dace.dtypes.ScheduleType.CPU_Multicore:
            A[i] = B[i]
    
    sdfg = vector_copy_strides.to_sdfg()
    sdfg

    This produces the following SDFG:

    Image

  3. Apply the Maptiling transformation:

    state = sdfg.states()[0]
    sdfg_nodes = state.nodes()
    map_entry:nodes.MapEntry = [n for n in sdfg_nodes if isinstance(n, nodes.MapEntry)][0]
    
    tile_sizes = [32]
    MapTiling.apply_to(
        sdfg=sdfg,
        options={
            "prefix": "b",
            "tile_sizes": tile_sizes,
            "divides_evenly": False,
            "tile_trivial": True,
            "skew": True
        },
        map_entry=map_entry
    )
    sdfg

    This results into the following SDFG:

    Image

    Where the stride in the top level map is not correct.

Expected Behavior

The stride in the top-level map should be 32 * s (i.e., 32 * 33) to correctly account for both the original stride (s = 33) and the tiling configuration. However, the current MapTiling transformation only considers the tile size (32) and ignores the existing stride. This results in incorrect behavior.

For example, consider the following input:

A = [0, 0, ..., 0]
B = [0, 1, 2, ..., 63]

After applying the MapTiling transformation, only every 32nd element of B is copied into A, instead of every 33rd. This occurs because the generated SDFG includes a top-level map with stride 32 and an inner map with stride 33. Due to the limited inner range (min(N-1, b_i + 31) + 1 - b_i <= 32), only the first iteration (i = 0) of the inner map executes.

Since the copy operation is of the form A[b_i + i] = B[b_i + i], and i = 0 is the only iteration, the result is simply A[b_i] = B[b_i] for b_i in [0:N:32]. This causes the final output to reflect copies from indices 0, 32, 64, ..., instead of the expected 0, 33, 66, ....

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions