A workaround for the case that get_slice_tile() doesn't work. #2371
This PR is for discussion only; the modification in it is just a workaround. I think the root cause is the lack of flexibility in `ck_tile::detail::slice_distribution_from_x()`.

The distribution encoding of the tile to be cut is:

It is a `64x64` tile that is going to be sliced into `64x16` tiles:

where `pp` is a sliced tile and `p` is the uncut tile.

However, we encounter the following static assert failure in `ck_tile::detail::slice_distribution_from_x()` in `tile_distribution.hpp`:

This assertion fails when slicing is performed on the 2nd dimension (id=1).

`sliced_h` returned by `reverse_slice_sequence(<2,2,1,4,4>, 16)` is:

The sequence `src_h_prefix_sum` returned by `Encoding::detail::get_h_dim_lengths_prefix_sum()` is `<0, 3, 8>`. For the 2nd dimension (id=1), we therefore look for `4 = slice_idx(1) + src_h_prefix_sum[id](3)` in the sequence `<0, 3, 5, 7>`, which is the first result of `Encoding::detail::get_sorted_y_info()`. Since 4 is not in that sequence, the assertion fails. `<0, 3, 5, 7>` gives the locations in `Hs` of the elements referred to by `Ys2RHs`, so the failure is caused by `(Major=2, Minor=1)` not being referred to in `Ys2RHs`.

The workaround here is to make the minor coordinate of the elements referred to by `Ys2RHs` as large as possible, so that `slice_idx` is more likely to hit an element referred to in `Ys2RHs`. However, this only works for dimensions where the sliced tile is shorter than the uncut tile. When the lengths are the same (the 1st dimension in the current case: `64x64` -> `64x16`), `slice_idx` is always `0`, so the minor coordinate of the elements referred to by `Ys2RHs` should be as small as possible.

In theory, a tile like the one above should be sliceable in this way. So how can we overcome this issue and arrive at a permanent solution?
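
To make the discussion concrete, the failing lookup described above can be reproduced in isolation. The following is a minimal C++ sketch and not the actual ck_tile code: the helper names `flat_h_position` and `contains` are hypothetical, and only the numeric values (`<0, 3, 8>`, `<0, 3, 5, 7>`, id=1, `slice_idx` of 1) are taken from this description.

```cpp
#include <array>
#include <cstddef>

// Values quoted in this PR description.
// Prefix sum of the H dimension lengths (3 elements in H0, 5 in H1).
constexpr std::array<int, 3> src_h_prefix_sum{0, 3, 8};
// Flattened positions in Hs that Ys2RHs refers to (sorted).
constexpr std::array<int, 4> sorted_y_positions{0, 3, 5, 7};

// Hypothetical helper: flattened position in Hs for a given X dimension
// and minor index within that dimension.
constexpr int flat_h_position(std::size_t x_dim, int slice_idx)
{
    return src_h_prefix_sum[x_dim] + slice_idx;
}

// Hypothetical helper: linear membership test over a small sequence.
template <std::size_t N>
constexpr bool contains(const std::array<int, N>& seq, int value)
{
    for(std::size_t i = 0; i < N; ++i)
    {
        if(seq[i] == value)
            return true;
    }
    return false;
}

// 2nd dimension (id=1): 1 + 3 = 4, which no Y refers to, so the lookup fails.
static_assert(flat_h_position(1, 1) == 4, "expected flattened position 4");
static_assert(!contains(sorted_y_positions, flat_h_position(1, 1)),
              "position 4 is not referred to by Ys2RHs");

// 1st dimension (id=0): slice_idx is 0, and position 0 is referred to.
static_assert(contains(sorted_y_positions, flat_h_position(0, 0)),
              "position 0 is referred to by Ys2RHs");
```

The sketch only mirrors the arithmetic of the check; it says nothing about how `slice_distribution_from_x()` should handle a slice index that lands on an H element not referred to by any Y, which is the open question here.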