Add Mt_Gemm for the nonlocal_pw #6253

Merged
merged 5 commits into from
Jun 7, 2025

Collaborator

@A-006 A-006 commented May 31, 2025

What's changed?

  • Changed the LCAO basis type to single precision for the test in the file.
  • Added DSP support to the nonlocal_pw computation. An error in mt_fft_device.dat currently appears when multiple matrices are computed, so as a temporary workaround this PR adds a Gemm (matrix multiplication) routine. The results will be tested once that bug is fixed.
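
For reviewers who want a baseline to compare against once the bug is fixed, here is a minimal sketch of a host-side reference Gemm in single precision. It is not the Mt_Gemm routine added in this PR: the names `reference_gemm` and `max_abs_diff`, the row-major layout, and the real-valued `float` element type are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference GEMM (not the PR's Mt_Gemm): C = alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n; all are row-major, single precision.
void reference_gemm(std::size_t m, std::size_t n, std::size_t k,
                    float alpha,
                    const std::vector<float>& a,
                    const std::vector<float>& b,
                    float beta,
                    std::vector<float>& c)
{
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            // Dot product of row i of A with column j of B.
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

// Hypothetical helper: largest element-wise absolute difference between two
// equally sized buffers, e.g. a device result and the host reference.
float max_abs_diff(const std::vector<float>& x, const std::vector<float>& y)
{
    float worst = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        worst = std::max(worst, std::fabs(x[i] - y[i]));
    }
    return worst;
}
```

A device result copied back to the host could then be checked with `max_abs_diff` against the output of `reference_gemm`, using a tolerance suited to single precision.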

@mohanchen mohanchen merged commit 2b1e662 into deepmodeling:develop Jun 7, 2025
14 checks passed
@mohanchen mohanchen added labels on Jun 7, 2025: GPU & DCU & HPC (GPU and DCU and HPC related any issues); Refactor (Refactor ABACUS codes); Features Needed (The features are indeed needed, and developers should have sophisticated knowledge)