llama.cpp

Commit Graph

Author	SHA1	Message	Date
Aaron Teo	18d79e1a30	ggml-cpu: move s390x typedef to own header file Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `157f856c34`)	2025-06-21 19:31:34 +08:00
Aaron Teo	1cacdd9a36	ggml-cpu: fix macro declaration Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 19:08:48 +08:00
Aaron Teo	48df977079	Revert "ggml-cpu: move s390x typedef to own header file" This reverts commit `157f856c34`. Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 19:03:09 +08:00
Aaron Teo	157f856c34	ggml-cpu: move s390x typedef to own header file Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 19:00:20 +08:00
Aaron Teo	4ad6efa37b	ggml-cpu: diagnose why __NNPA__ macro is not being defined Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 18:33:08 +08:00
Aaron Teo	8ef51b9055	ggml-cpu: bring back fp32->fp16 store nnpa Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:49:36 +08:00
Aaron Teo	987d1690e4	ggml-cpu: clarified vector naming Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:39:35 +08:00
Aaron Teo	4621a23c14	ggml-cpu: add 4 element loops for fp32->fp16 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:32:20 +08:00
Aaron Teo	373fa28e4c	ggml-cpu: change to typedef vector types Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:26:20 +08:00
Aaron Teo	7413dabc8c	ggml-cpu: fix compiler types Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:23:18 +08:00
Aaron Teo	e12e9fe704	ggml-cpu: reattempt fp32->fp16 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:20:20 +08:00
Aaron Teo	54811fc128	ggml-cpu: fix typo Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:13:57 +08:00
Aaron Teo	433d587426	ggml-cpu: reattempt fp32->fp16 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:12:22 +08:00
Aaron Teo	946c78ebde	ggml-cpu: switch to elif macro Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 17:06:18 +08:00
Aaron Teo	27131e5f34	ggml-cpu: disable fp32->fp16 nnpa conversions for now there are some conversion failures in nnpa that requires the eyes of an ibm stsm. will create a separate pr to introduce the fp32->fp16 change. Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:58:43 +08:00
Aaron Teo	4f017d718a	ggml-cpu: test fix for conversion failure Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:55:16 +08:00
Aaron Teo	5424d9e757	ggml-cpu: add breakpoint for debugging Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:51:05 +08:00
Aaron Teo	bb9345ca8a	ggml-cpu: activate nnpa for ggml_cpu_fp32_to_fp16 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:50:05 +08:00
Aaron Teo	e0f8fb930b	ggml-cpu: clarify variable naming Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:43:41 +08:00
Aaron Teo	27b4c3f338	ggml-cpu: remove noop, general code cleanup Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:41:39 +08:00
Aaron Teo	8312adc980	ggml-cpu: rework noop Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:24:32 +08:00
Aaron Teo	6d507bbeb0	ggml-cpu: switch to vec_xst for 4 element loops also Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:23:23 +08:00
Aaron Teo	f9f6c7e897	ggml-cpu: nnpa switch to vec_xst test Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:16:35 +08:00
Aaron Teo	6a25fd8531	ggml-cpu: nnpa activate ggml_cpu_fp16_to_fp32 for 8 elements Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:10:44 +08:00
Aaron Teo	ebc1d19f62	ggml-cpu: activate nnpa for ggml_cpu_fp16_to_fp32 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 16:01:55 +08:00
Aaron Teo	9330454cb8	ggml-cpu: remove sigint from fp16 store for some reason, the function is not getting a hit when debugged with gdb. we will need to investigate further Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 15:06:31 +08:00
Aaron Teo	575ea9f6c6	ggml-cpu: fp16 load ensured to hit Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 15:00:46 +08:00
Aaron Teo	8f3a5af6c0	ggml-cpu: ensure fp16 and fp32 load and stores are called Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 14:57:25 +08:00
Aaron Teo	94f10ca189	ggml-cpu: fix float placeholder Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 14:53:15 +08:00
Aaron Teo	d9cc63a94a	ggml-cpu: fix print vs printf Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 14:51:38 +08:00
Aaron Teo	48b820d05f	ggml-cpu: add debugging prints to see if dlf16 is correct Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-21 14:50:33 +08:00
Aaron Teo	ffe296457e	ggml-cpu: better variable names Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `2f58bbcbb8`)	2025-06-21 14:47:46 +08:00
Aaron Teo	ebf9f34a38	ggml-cpu: add fp32->fp16 Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `0ff0d65162`)	2025-06-21 14:47:23 +08:00
Aaron Teo	45a4cf651c	ggml-cpu: add fp16->fp32 nnpa first Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `8d4a7987f9`)	2025-06-21 14:47:12 +08:00
Aaron Teo	5801806f70	ggml-cpu: add nnpa compile flag Signed-off-by: Aaron Teo <aaron.teo1@ibm.com> (cherry picked from commit `4a9f60c201`)	2025-06-21 14:46:41 +08:00
Acly	b7147673f2	Add `ggml_roll` (ggml/1274) * ggml : add ggml_roll * use set/get_op_params & std::min	2025-06-20 21:02:47 +03:00
Christian Kastner	6369be0735	Implement GGML_CPU_ALL_VARIANTS for PowerPC (#14286) * Add PowerPC feature detection and scoring * ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for PowerPC * ggml-cpu: Delay some initializations until function is called When using GGML_BACKEND_DL=ON, these initializations might use instructions that are not supported by the current CPU. --------- Co-authored-by: Diego Devesa <slarengh@gmail.com>	2025-06-20 14:17:32 +02:00
Georgi Gerganov	d27b3ca175	ggml : fix repack work size for mul_mat_id (#14292) ggml-ci	2025-06-20 11:19:15 +03:00
Charles Xu	9230dbe2c7	ggml: Update KleidiAI to v1.9.0 (#14277)	2025-06-20 10:51:01 +03:00
Diego Devesa	8f71d0f3e8	ggml-cpu : remove unnecesary arm feature detection (#14281) Support for Arm runtime feature detection has now been added to GGML_CPU_ALL_VARIANTS. This removes the old and not very functional code.	2025-06-19 21:24:14 +02:00
Aaron Teo	faed5a5f5d	llamafile : support s390x SIMD instruction set (#14273)	2025-06-19 11:48:54 +02:00
Aaron Teo	50d2227953	ggml-cpu: reduce asm calls for hsum (#14037) Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-18 18:10:08 +01:00
Aaron Teo	6231c5cd6d	ggml-cpu: fix uncaught underscore terminators (#14023) Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-06-18 18:06:49 +01:00
Charles Xu	ef035803eb	ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (#14258)	2025-06-18 12:40:07 +01:00
xctan	860a9e4eef	ggml-cpu : remove the weak alias trick (#14221)	2025-06-17 12:58:32 +03:00
Diego Devesa	6adc3c3ebc	llama : add thread safety test (#14035) * llama : add thread safety test * llamafile : remove global state * llama : better LLAMA_SPLIT_MODE_NONE logic when main_gpu < 0 GPU devices are not used --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-06-16 08:11:43 -07:00
Charles Xu	3ba0d843c6	ggml: Add Android support for GGML_CPU_ALL_VARIANTS (#14206)	2025-06-16 11:47:57 +02:00
xctan	3555b3004b	ggml-cpu : rework weak alias on apple targets (#14146) * ggml-cpu : rework weak alias on apple targets * fix powerpc detection * fix ppc detection * fix powerpc detection on darwin	2025-06-16 13:54:15 +08:00
Christian Kastner	532802f938	Implement GGML_CPU_ALL_VARIANTS for ARM (#14080) * ggml-cpu: Factor out feature detection build from x86 * ggml-cpu: Add ARM feature detection and scoring This is analogous to cpu-feats-x86.cpp. However, to detect compile-time activation of features, we rely on GGML_USE_<FEAT> which need to be set in cmake, instead of GGML_<FEAT> that users would set for x86. This is because on ARM, users specify features with GGML_CPU_ARM_ARCH, rather than with individual flags. * ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for ARM Like x86, however to pass around arch flags within cmake, we use GGML_INTERNAL_<FEAT> as we don't have GGML_<FEAT>. Some features are optional, so we may need to build multiple backends per arch version (armv8.2_1, armv8.2_2, ...), and let the scoring function sort out which one can be used. * ggml-cpu: Limit ARM GGML_CPU_ALL_VARIANTS to Linux for now The other platforms will need their own specific variants. This also fixes the bug that the the variant-building branch was always being executed as the else-branch of GGML_NATIVE=OFF. The branch is moved to an elseif-branch which restores the previous behavior.	2025-06-11 21:07:44 +02:00
Georgi Gerganov	b7ce1ad1e3	ggml : fix weak alias win32 (whisper/0) ggml-ci	2025-06-10 18:39:33 +03:00

176 Commits