llama.cpp

History

shalinib-ibm 55c509daf5	2025-07-14 16:16:42 +03:00

ggml : refactor llamafile_sgemm PPC code (#14673)

Remove unnecessary templates from the class definition and packing functions
Reduce deeply nested conditionals and if-else switching in the mnpack function
Replace repetitive code with inline functions in the packing functions

2 ~ 7% improvement in Q8 model
15 ~ 50% improvement in Q4 model

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>

sgemm.cpp	ggml : refactor llamafile_sgemm PPC code (#14673)	2025-07-14 16:16:42 +03:00
sgemm.h	llamafile : support s390x SIMD instruction set (#14273)	2025-06-19 11:48:54 +02:00
