Skip to content

[DevTSAN] Support detecting data race for local memory #18718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 5, 2025

Conversation

zhaomaosu
Copy link
Contributor

@zhaomaosu zhaomaosu commented May 29, 2025

  • Add builtins _tsan_cleanup[static|dynamic]_local to initialize local shadow
  • Add missing lock for kernel launch api

@zhaomaosu zhaomaosu requested review from a team as code owners May 29, 2025 02:44
@zhaomaosu zhaomaosu requested a review from Copilot May 29, 2025 02:45
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR extends ThreadSanitizer support to detect races in local (shared) memory by allocating and initializing a separate shadow region per workgroup and passing local-argument metadata through the interceptor and runtime.

  • Added AllocLocalShadow API to ShadowMemory and provided CPU/GPU implementations.
  • Enhanced the interceptor and DDI to record local kernel arguments and calculate per-kernel local shadow offsets.
  • Updated the device runtime (tsan_rtl.cpp), SPIR-V instrumentation, and end-to-end tests to initialize, clean up, and report races on local memory.

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

File Description
tsan_shadow.hpp & tsan_shadow.cpp Defined AllocLocalShadow() and implemented it for CPU/GPU shadows.
tsan_interceptor.hpp & tsan_interceptor.cpp Recorded local arg sizes, computed workgroup counts, and invoked AllocLocalShadow.
tsan_rtl.cpp Mapped local memory to shadow space, added cleanup hooks, and marked local races.
llvm/.../ThreadSanitizer.cpp Instrumented static/dynamic local memory in SPIR-V and factored out common utils.
Comments suppressed due to low confidence (2)

unified-runtime/source/loader/layers/sanitizer/tsan/tsan_interceptor.hpp:148

  • [nitpick] Constructor parameter LocalWorkSize shadows the member of the same name; consider renaming the parameter (e.g. pLocalWorkSize) to avoid confusion.
             const size_t *GlobalWorkSize, const size_t *LocalWorkSize,

unified-runtime/source/loader/layers/sanitizer/tsan/tsan_report.cpp:30

  • [nitpick] The updated warning message only indicates memory type and omits the actual race address, which may reduce debugging clarity; consider including the address in the log string.
           "====WARNING: DeviceSanitizer: data race on {}",

Copy link
Contributor

@yingcong-wu yingcong-wu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one nit question.

@zhaomaosu
Copy link
Contributor Author

Hi @intel/unified-runtime-reviewers, could you please help review this PR? Thanks.

@zhaomaosu zhaomaosu requested a review from a team June 5, 2025 06:55
@zhaomaosu
Copy link
Contributor Author

Hi @intel/llvm-gatekeepers, this PR is ready to be merged. Thanks.

@steffenlarsen steffenlarsen merged commit 40f7479 into intel:sycl Jun 5, 2025
33 checks passed
@zhaomaosu zhaomaosu deleted the tsan-support-local branch June 5, 2025 07:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants