Skip to content

[SYCL][E2E] Fix XFAIL statements that use negated device features #18641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 22, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion sycl/test-e2e/Regression/build_log.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
// RUN: FileCheck %s --check-prefix=CHECK-EXPECTED-ERROR --input-file %t.out
// CHECK-EXPECTED-ERROR: error: backend compiler failed build

// XFAIL: !arch-intel_gpu_bmg_g21
// XFAIL: *
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually fails on BMG, but this statement was an example of one that was being incorrectly processed and marked as XFAIL on BMG.

https://github.com/intel/llvm/actions/runs/15194270458/job/42735337999

// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16416

#include <stdio.h>
Expand Down
6 changes: 6 additions & 0 deletions sycl/test-e2e/format.py
Original file line number Diff line number Diff line change
Expand Up @@ -393,4 +393,10 @@ def map_result(features, code):
if len(devices_for_test) == 1:
device = devices_for_test[0]
result.code = map_result(test.config.sycl_dev_features[device], result.code)

# Set this to empty so internal lit code won't change our result if it incorrectly
# thinks the test should XFAIL. This can happen when our XFAIL condition relies on
# device features, since the internal lit code doesn't have knowledge of these.
test.xfails = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry so which features are known by the general lit infra and which are known only by our custom code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The features in test.config.available_features are known by the general lit infra, the device specific features which are in test.config.sycl_dev_features are not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add the dev features to available_features before we build/run for each target and remove it after?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue would be in situations with multiple devices where a single device might be marked as XFAIL, while the rest pass. If we add all the device features to available_features and let the lit infra do their own XFAIL logic, then this test would be expected to XFAIL, while in reality we expect it to pass, while the XFAIL devices are skipped.

The reason I did it this way was mostly to avoid doing this redundant second XFAIL check.

Also in the case that we change the logic for XFAILs in our code to properly account for running on multiple devices, it is likely that we would want this internal XFAIL check turned off in favor of our own logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the xfail check only happen once in the lit infra with multiple devices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the internal lit infra XFAIL check only happens once after we return the result of our test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the result contains all device runs? sorry just making sure we can't use the lit infra

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah its a single result for the amalgamation of all the device runs.

i.e.,
PASS -> if all device pass (Some but not all devices may be xfail/unsupported, those are skipped)
FAIL -> if a single non xfail/unsupported device fails
XFAIL/XPASS -> can't happen afaik when running a test on multiple devices
UNSUPPORTED -> if all devices are unsupported, or xfailed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does our outer lit code handle XFAILs for both device specific features and normal features?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep the part just above this

if len(triples) == 1 and test.config.test_mode == "build-only":
result.code = map_result(test.config.available_features, result.code)
if len(devices_for_test) == 1:
device = devices_for_test[0]
result.code = map_result(test.config.sycl_dev_features[device], result.code)

For build only if the test is only ran for a single triple we only use the normal features, and on full/run-only if the test is only ran for one device we use test.config.sycl_dev_features[device] which contains both the normal features as well as the device features.


return result