
A workaround for the case that get_slice_tile() doesn't work. #2371

Draft
wants to merge 1 commit into
base: develop

ruanjm (Contributor) commented Jun 19, 2025

This PR is for discussion only; the change in it is just a workaround. I think the root cause is a lack of flexibility in ck_tile::detail::slice_distribution_from_x().

The distribution encoding of the tile to be cut is:

ck_tile::tile_distribution_encoding
<
  /*RsLengths   */ ck_tile::sequence<>, 
  /*HsLengthss  */ ck_tile::tuple<ck_tile::sequence<1, 4, 16>, ck_tile::sequence<2, 2, 1, 4, 4>>, // Outer: <1, 4>, <2, 2>
  /*Ps2RHssMajor*/ ck_tile::tuple<ck_tile::sequence<1, 2>, ck_tile::sequence<2, 1>>, // warpPerBlk: (1,1)=4, (2,1)=2
  /*Ps2RHssMinor*/ ck_tile::tuple<ck_tile::sequence<1, 1>, ck_tile::sequence<3, 2>>, // thrPerWarp: (2,3)=4, (1,2)=16
  /*Ys2RHsMajor */ ck_tile::sequence<1, 2, 2, 2>, // repeat: (1,0)=1, (2,2)=1
  /*Ys2RHsMinor */ ck_tile::sequence<0, 0, 2, 4>  // vector: (2,0)=2, (2,4)=4
>

It is a 64x64 tile that is to be sliced into 64x16 tiles:

const auto pp = ck_tile::get_slice_tile(p, ck_tile::sequence<0, 0>{}, ck_tile::sequence<64, 16>{});

where pp is a sliced tile and p is the uncut tile.

However, we encounter the following static assert failure in ck_tile::detail::slice_distribution_from_x() in tile_distribution.hpp:

// found_y_index and src_y_dims.size() are both 4.
static_assert(found_y_index >= 0 && found_y_index < src_y_dims.size(),
              "not sliced at y dim, please check");

The assert fails while slicing the 2nd dimension (id=1). The sliced_h returned by reverse_slice_sequence(<2,2,1,4,4>, 16) is:

  • sliced_h[0]: lengths = <1,1,1,4,4>
  • sliced_h[1]: nums = <2,2,1,1,1>
  • sliced_h[2]: slice_idx = 1
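
The three values above can be reproduced with a small stand-alone model. This is only a sketch of right-to-left slicing under the assumption that the slice size divides evenly at every step; it is not ck_tile's reverse_slice_sequence, and SliceResult, reverse_slice, kH1, kH2 and kSlicedH2 are hypothetical names introduced for illustration.

```cpp
#include <cstddef>

// Simplified model of the right-to-left slicing described above.
// NOT ck_tile's implementation; every name here is hypothetical.
struct SliceResult
{
    int lengths[8]; // length kept per sub-dimension inside one slice
    int nums[8];    // number of slices per sub-dimension
    int slice_idx;  // right-most sub-dimension that is actually split
};

template <std::size_t N>
constexpr SliceResult reverse_slice(const int (&h)[N], int slice_size)
{
    static_assert(N <= 8, "model supports at most 8 sub-dimensions");
    SliceResult r{};            // zero-initialised, so slice_idx defaults to 0
    int remaining    = slice_size; // assumed to divide evenly at every step
    bool split_found = false;
    for(std::size_t k = N; k-- > 0;) // walk from the right-most sub-dimension
    {
        const bool fits = remaining >= h[k];
        r.lengths[k]    = fits ? h[k] : remaining;
        remaining       = fits ? remaining / h[k] : 1;
        r.nums[k]       = h[k] / r.lengths[k];
        if(!split_found && r.nums[k] > 1)
        {
            r.slice_idx = static_cast<int>(k);
            split_found = true;
        }
    }
    return r;
}

constexpr int kH1[] = {1, 4, 16};      // 1st dimension of the encoding (64)
constexpr int kH2[] = {2, 2, 1, 4, 4}; // 2nd dimension of the encoding (64)

constexpr SliceResult kSlicedH2 = reverse_slice(kH2, 16);
static_assert(kSlicedH2.lengths[0] == 1 && kSlicedH2.lengths[3] == 4 &&
                  kSlicedH2.lengths[4] == 4,
              "lengths = <1,1,1,4,4>");
static_assert(kSlicedH2.nums[0] == 2 && kSlicedH2.nums[1] == 2 && kSlicedH2.nums[2] == 1,
              "nums = <2,2,1,1,1>");
static_assert(kSlicedH2.slice_idx == 1, "slice_idx = 1");
```

In this model, an unsliced dimension such as reverse_slice(kH1, 64) never finds a split, so slice_idx stays at 0.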

Encoding::detail::get_h_dim_lengths_prefix_sum() returns src_h_prefix_sum = <0, 3, 8>. For the 2nd dimension (id=1), the code therefore looks for 4 = slice_idx (1) + src_h_prefix_sum[id] (3) in the sequence <0, 3, 5, 7>, which is the first result of Encoding::detail::get_sorted_y_info() and gives the positions of the Ys2RHs entries within the flattened Hs. Since 4 is not in that sequence, the assert fails. In other words, the failure occurs because (Major=2, Minor=1) is not referenced by Ys2RHs.
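
The arithmetic in the paragraph above can be checked with a short compile-time sketch. It hard-codes the sizes of the two H sequences and the Ys2RHs entries from the encoding; h_prefix_sum, y_position and find_y_index are hypothetical helpers, not ck_tile functions, and returning the number of Ys on a miss is an assumption chosen to match the observed found_y_index of 4.

```cpp
// Hypothetical model of the Y lookup; not ck_tile code.
constexpr int kHsSizes[] = {3, 5};       // sizes of <1,4,16> and <2,2,1,4,4>
constexpr int kYsMajor[] = {1, 2, 2, 2}; // Ys2RHsMajor from the encoding
constexpr int kYsMinor[] = {0, 0, 2, 4}; // Ys2RHsMinor from the encoding
constexpr int kNumY      = 4;

// Exclusive prefix sum over the H sizes: id 0 -> 0, id 1 -> 3, id 2 -> 8.
constexpr int h_prefix_sum(int id)
{
    int sum = 0;
    for(int i = 0; i < id; ++i)
        sum += kHsSizes[i];
    return sum;
}

// Position of Y number y inside the flattened Hs (major is 1-based for H).
constexpr int y_position(int y) { return h_prefix_sum(kYsMajor[y] - 1) + kYsMinor[y]; }

// Index of the Y located at `target`, or kNumY when no Y sits there (assumed).
constexpr int find_y_index(int target)
{
    for(int y = 0; y < kNumY; ++y)
        if(y_position(y) == target)
            return y;
    return kNumY;
}

// Target for the 2nd dimension (id=1) with slice_idx = 1.
constexpr int kTarget = h_prefix_sum(1) + 1;

static_assert(h_prefix_sum(0) == 0 && h_prefix_sum(1) == 3 && h_prefix_sum(2) == 8,
              "src_h_prefix_sum = <0, 3, 8>");
static_assert(y_position(0) == 0 && y_position(1) == 3 && y_position(2) == 5 &&
                  y_position(3) == 7,
              "Y positions = <0, 3, 5, 7>");
static_assert(kTarget == 4, "target position is 4");
static_assert(find_y_index(kTarget) == kNumY, "position 4 is not covered by any Y");
```

Position 4 corresponds to (Major=2, Minor=1), which the encoding assigns to a P dimension rather than a Y dimension, so the lookup runs off the end.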

So the workaround here is to make the minor coordinates of the elements referenced by Ys2RHs as large as possible, so that slice_idx is more likely to hit an element referenced by Ys2RHs. However, this only helps for dimensions where the sliced tile is shorter than the uncut tile. When the lengths are the same (the 1st dimension in the current case: 64x64 -> 64x16), slice_idx is always 0, so the minor coordinates of the elements referenced by Ys2RHs should instead be as small as possible.

In theory, a tile like the one above should be sliceable in this way. So how can we overcome this issue and arrive at a final solution?
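
As an illustration of the idea only, the sketch below compares the original Ys2RHsMinor with one hypothetical reassignment in which the Y entry at (2,0) moves to (2,1), with the P entry at (2,1) assumed to move to (2,0) in exchange. Both sub-dimensions have length 2. This is not the actual change in this PR, find_y is a hypothetical helper, and such a swap presumably changes which elements each warp owns.

```cpp
#include <cstddef>

// Hypothetical model; not ck_tile code and not the diff of this PR.
constexpr int kHPrefix[] = {0, 3, 8}; // src_h_prefix_sum from the description

// Index of the Y whose flattened position equals `target`, or N on a miss.
template <std::size_t N>
constexpr int find_y(const int (&major)[N], const int (&minor)[N], int target)
{
    for(std::size_t i = 0; i < N; ++i)
        if(kHPrefix[major[i] - 1] + minor[i] == target)
            return static_cast<int>(i);
    return static_cast<int>(N);
}

constexpr int kYMajor[]        = {1, 2, 2, 2};
constexpr int kYMinorOrig[]    = {0, 0, 2, 4}; // original encoding
constexpr int kYMinorSwapped[] = {0, 1, 2, 4}; // assumed: Y at (2,1) instead of (2,0)

// 2nd dimension (id=1), slice_idx = 1: target = 3 + 1 = 4.
static_assert(find_y(kYMajor, kYMinorOrig, 3 + 1) == 4, "original: no Y at position 4");
static_assert(find_y(kYMajor, kYMinorSwapped, 3 + 1) == 1, "swapped: Y 1 sits at position 4");

// 1st dimension (id=0) is not shortened, slice_idx = 0: target = 0 + 0 = 0.
static_assert(find_y(kYMajor, kYMinorSwapped, 0 + 0) == 0, "Y 0 at (1,0) is still found");
```

The last check shows why the unsliced dimension wants the opposite choice: with slice_idx fixed at 0, the lookup succeeds only if a Y sits at minor coordinate 0 of that dimension.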

@ruanjm ruanjm added the bug, help wanted, and question labels Jun 19, 2025
@ruanjm ruanjm marked this pull request as draft June 19, 2025 08:01
carlushuang (Contributor) commented Jun 21, 2025

@ruanjm thanks for this feedback. This is a fundamental design problem/choice in the slice tile API.

=> filed issue #2384
Originally we assumed that a slice along a dim must go from left to right.
