Add AARCH64-NEON fastpath for data state #618
Signed-off-by: Pavel Irzhauski <[email protected]>
into simd-neon implement SIMD fast path using ARM NEON in the same way as using SSE2 Signed-off-by: Pavel Irzhauski <[email protected]>
For reference, on an M4 pro with
I'm seeing similar results on Apple M1 even without
Have you considered using the "safe architecture intrinsics" (aka target features v1.1) from Rust 1.87? https://blog.rust-lang.org/2025/05/15/Rust-1.87.0/#safe-architecture-intrinsics I think this would allow the body of the SIMD function(s) to be safe (even though it would still be |
The solution for #612 is almost the same as #601. It also gives a significant performance improvement on aarch64 for lipsum.html and lipsum.zh.html, and close to zero change for smaller cases.
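For readers unfamiliar with this kind of fast path, here is a minimal sketch of the general idea, not the code from this PR: scan the input 16 bytes at a time with NEON and stop at the first chunk that contains a byte the tokenizer has to handle specially, then let scalar code pinpoint it. The function names and the byte set (`<`, `&`, `\r`, `\0`) are assumptions for illustration, and the sketch assumes a target where NEON is enabled by default, as it is on Rust's standard aarch64 targets.

```rust
/// Bytes that end the fast path. This set is an assumption for illustration.
fn is_special(b: u8) -> bool {
    matches!(b, b'<' | b'&' | b'\r' | b'\0')
}

/// Portable reference: index of the first special byte, if any.
fn find_special_scalar(input: &[u8]) -> Option<usize> {
    input.iter().position(|&b| is_special(b))
}

/// NEON variant: skip whole 16-byte chunks that contain no special byte.
#[cfg(target_arch = "aarch64")]
fn find_special(input: &[u8]) -> Option<usize> {
    use std::arch::aarch64::{vceqq_u8, vdupq_n_u8, vld1q_u8, vmaxvq_u8, vorrq_u8};

    let mut offset = 0;
    // SAFETY: assumes NEON is enabled for this target (the default on Rust's
    // standard aarch64 targets); the loop condition keeps every 16-byte load
    // inside `input`.
    unsafe {
        let lt = vdupq_n_u8(b'<');
        let amp = vdupq_n_u8(b'&');
        let cr = vdupq_n_u8(b'\r');
        let nul = vdupq_n_u8(0);
        while offset + 16 <= input.len() {
            let chunk = vld1q_u8(input.as_ptr().add(offset));
            // Each comparison yields 0xFF in matching lanes; OR them together.
            let hits = vorrq_u8(
                vorrq_u8(vceqq_u8(chunk, lt), vceqq_u8(chunk, amp)),
                vorrq_u8(vceqq_u8(chunk, cr), vceqq_u8(chunk, nul)),
            );
            // A non-zero horizontal maximum means some lane matched.
            if vmaxvq_u8(hits) != 0 {
                break;
            }
            offset += 16;
        }
    }
    // Scalar scan of the matching chunk, or of the tail shorter than 16 bytes.
    find_special_scalar(&input[offset..]).map(|i| offset + i)
}

/// Fallback for other architectures so the sketch builds everywhere.
#[cfg(not(target_arch = "aarch64"))]
fn find_special(input: &[u8]) -> Option<usize> {
    find_special_scalar(input)
}
```

A loop like this only pays off when long runs contain no special bytes, which would be consistent with the gains showing up on the large lipsum inputs and not on the smaller cases.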