
Commit 9581255

Author: midnight
cmake : fix compile assumptions for power9/etc
* Add small comment re: VSX to readme
1 parent 33ea03f commit 9581255

2 files changed: 22 additions and 12 deletions

README.md

Lines changed: 15 additions & 1 deletion
@@ -17,7 +17,7 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
 - Plain C/C++ implementation without dependencies
 - Apple Silicon first-class citizen - optimized via ARM NEON, Accelerate framework, Metal and [Core ML](#core-ml-support)
 - AVX intrinsics support for x86 architectures
-- VSX intrinsics support for POWER architectures
+- [VSX intrinsics support for POWER architectures](#power-vsx-intrinsics)
 - Mixed F16 / F32 precision
 - [Integer quantization support](#quantization)
 - Zero memory allocations at runtime
@@ -139,6 +139,20 @@ make -j large-v3-turbo
 | medium | 1.5 GiB | ~2.1 GB |
 | large | 2.9 GiB | ~3.9 GB |
 
+## POWER VSX Intrinsics
+
+`whisper.cpp` supports POWER architectures and includes code which
+significantly speeds operation on Linux running on POWER9/10, making it
+capable of faster-than-realtime transcription on underclocked Raptor
+Talos II. Ensure you have a BLAS package installed, and replace the
+standard cmake setup with:
+
+```bash
+# build with GGML_BLAS defined
+cmake -B build -DGGML_BLAS=1
+cmake --build build --config Release
+./build/bin/whisper-cli [ .. etc .. ]
+
 ## Quantization
 
 `whisper.cpp` supports integer quantization of the Whisper `ggml` models.
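
The new README text claims "faster-than-realtime" transcription without saying how to check it. One common way to express this is the real-time factor: processing time divided by audio duration, where a value below 1 means the transcription finished before the audio would have finished playing. The sketch below assumes you have both durations in seconds; `real_time_factor` is a hypothetical helper, and the numbers are illustrative, not measurements from the commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of whisper.cpp):
# real-time factor = processing seconds / audio seconds.
# $1: wall-clock seconds spent transcribing
# $2: length of the audio in seconds
real_time_factor() {
    awk -v proc="$1" -v audio="$2" 'BEGIN { printf "%.2f\n", proc / audio }'
}

# Illustrative numbers only: 45 s of processing for 60 s of audio.
real_time_factor 45 60
```

A result below 1.00 corresponds to the "faster-than-realtime" case described in the README addition.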

ggml/src/ggml-cpu/CMakeLists.txt

Lines changed: 7 additions & 11 deletions
@@ -279,19 +279,15 @@ function(ggml_add_cpu_backend_variant_impl tag_name)
     endif()
 elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "ppc64")
     message(STATUS "PowerPC detected")
-    execute_process(COMMAND bash -c "grep POWER10 /proc/cpuinfo | head -n 1" OUTPUT_VARIABLE POWER10_M)
-    string(FIND "${POWER10_M}" "POWER10" substring_index)
-    if (NOT DEFINED substring_index OR "${substring_index}" STREQUAL "")
-        set(substring_index -1)
-    endif()
-
-    if (${substring_index} GREATER_EQUAL 0)
-        list(APPEND ARCH_FLAGS -mcpu=power10)
+    execute_process(COMMAND bash -c "grep POWER /proc/cpuinfo | head -n 1" OUTPUT_VARIABLE POWER_M)
+    if (${POWER_M} MATCHES "POWER10")
+        list(APPEND ARCH_FLAGS -mcpu=power10)
+    elseif (${POWER_M} MATCHES "POWER9")
+        list(APPEND ARCH_FLAGS -mcpu=power9)
     elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "ppc64le")
-        list(APPEND ARCH_FLAGS -mcpu=powerpc64le)
+        list(APPEND ARCH_FLAGS -mcpu=powerpc64le -mtune=native)
     else()
-        list(APPEND ARCH_FLAGS -mcpu=native -mtune=native)
-        # TODO: Add targets for Power8/Power9 (Altivec/VSX) and Power10(MMA) and query for big endian systems (ppc64/le/be)
+        list(APPEND ARCH_FLAGS -mcpu=powerpc64 -mtune=native)
     endif()
 elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "loongarch64")
     message(STATUS "loongarch64 detected")
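
The branch order introduced in this hunk can be mirrored in a few lines of shell, which makes it easy to see which flags a given machine would receive. This is a sketch under stated assumptions, not part of the commit: `pick_power_flags` is a hypothetical helper, it takes the cpuinfo line and the processor name as arguments instead of reading them itself, and the `-mcpu`/`-mtune` values are copied verbatim from the added lines of the diff.

```shell
#!/bin/sh
# Hypothetical helper mirroring the if/elseif chain added by the commit.
# $1: first line of /proc/cpuinfo containing "POWER" (may be empty)
# $2: processor name, standing in for CMAKE_SYSTEM_PROCESSOR
pick_power_flags() {
    cpu_line=$1
    processor=$2
    case "$cpu_line" in
        *POWER10*) echo "-mcpu=power10" ;;
        *POWER9*)  echo "-mcpu=power9" ;;
        *)
            # No POWER9/POWER10 match: fall back on the processor name.
            case "$processor" in
                *ppc64le*) echo "-mcpu=powerpc64le -mtune=native" ;;
                *)         echo "-mcpu=powerpc64 -mtune=native" ;;
            esac
            ;;
    esac
}

# Example: feed it the same line the CMake code greps for.
# On a non-POWER host the grep finds nothing, so a fallback branch is taken
# and the printed flags are not meaningful for that host.
cpu_line=$(grep POWER /proc/cpuinfo 2>/dev/null | head -n 1)
pick_power_flags "$cpu_line" "$(uname -m)"
```

The sketch treats an empty cpuinfo line as "no match". In the CMake version, `${POWER_M}` is expanded unquoted, so an empty result would likely leave `MATCHES` without a left-hand operand; writing `"${POWER_M}"` is the usual way to guard against that.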
