llama.cpp/ggml/include
Daniel Bevenius 2f7d0ac015
ggml : add CPU backend reference implementation
This commit introduces a CPU reference implementation for GGML,
designed primarily for testing and validation purposes.

The motivation for this addition is to have a pure C CPU backend
implementation that does not use any hardware-specific optimizations
or intrinsics. This allows the CPU backend variants to be tested
against the reference implementation to ensure correctness.
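
To make the idea concrete, a reference kernel is just a plain scalar loop with no SIMD intrinsics, so the same code compiles and behaves identically on every architecture. A minimal sketch (the function name below is illustrative, not an actual GGML symbol):

```c
#include <stddef.h>

// Hypothetical reference kernel: plain scalar C with no intrinsics or
// architecture-specific code paths, used as the ground truth that the
// optimized CPU variants are checked against.
void ref_vec_add_f32(size_t n, float * dst, const float * a, const float * b) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```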

Building:
```console
$ cmake -B build \
    -DGGML_CPU_REF_BACKEND=ON \
    -DGGML_BACKEND_DL=ON \
    -DGGML_CPU_ALL_VARIANTS=ON
```
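
With `GGML_BACKEND_DL=ON`, each backend, including every CPU variant, is built as a shared library that is discovered and loaded at runtime. A minimal sketch of enumerating the loaded devices with the public ggml-backend API (error handling omitted; the variant names and CPU description in the listing below come from this same information):

```c
#include <stdio.h>
#include "ggml-backend.h"

int main(void) {
    // search for and load all dynamically built backend libraries
    ggml_backend_load_all();

    // each loaded backend registers one or more devices
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("%-16s %s\n", ggml_backend_dev_name(dev),
                             ggml_backend_dev_description(dev));
    }
    return 0;
}
```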

List available CPU architectures/variants:
```console
$ ./build/bin/test-backend-ops cpu-variants --list
CPU variants:
  CPU-haswell     - 12th Gen Intel(R) Core(TM) i7-1260P
  CPU-sse42       - 12th Gen Intel(R) Core(TM) i7-1260P
  CPU-x64         - 12th Gen Intel(R) Core(TM) i7-1260P
  CPU-alderlake   - 12th Gen Intel(R) Core(TM) i7-1260P
  CPU-sandybridge - 12th Gen Intel(R) Core(TM) i7-1260P
```
Run tests:
```console
$ ./build/bin/test-backend-ops cpu-variants --variant CPU-alderlake -o ADD
CPU-ref features:
  SSE2 = 1
CPU-alderlake features:
  SSE2 = 1
  SSE3 = 1
  SSSE3 = 1
  AVX = 1
  AVX_VNNI = 1
  AVX2 = 1
  F16C = 1
  FMA = 1
  BMI2 = 1
  LLAMAFILE = 1
  OPENMP = 1
  REPACK = 1
Testing CPU variant 'CPU-alderlake' against 'CPU-ref' backend...

 ADD(type=f16,ne=[1,1,8,1],nr=[1,1,1,1],nf=1): OK
 ADD(type=f16,ne=[1,1,1,1],nr=[32,1,1,1],nf=1): OK
 ...
```
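
For each test case the harness runs the op on both backends and compares the outputs with a normalized error measure against a small tolerance. A simplified sketch of that comparison, assuming f32 outputs (the real harness also handles other types, NaNs, etc.):

```c
#include <stdbool.h>
#include <stddef.h>

// Simplified sketch: normalized mean squared error between a CPU variant's
// output and the reference backend's output; the test passes when the
// error is below a small threshold.
bool outputs_match(const float * ref, const float * out, size_t n, double max_err) {
    double sum_err2 = 0.0; // sum of squared differences
    double sum_ref2 = 0.0; // sum of squared reference values
    for (size_t i = 0; i < n; i++) {
        const double d = (double) out[i] - (double) ref[i];
        sum_err2 += d * d;
        sum_ref2 += (double) ref[i] * (double) ref[i];
    }
    if (sum_ref2 == 0.0) {
        return sum_err2 == 0.0; // all-zero reference: require exact match
    }
    return sum_err2 / sum_ref2 <= max_err;
}
```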
2026-01-02 11:50:31 +01:00

| File | Last commit | Date |
| --- | --- | --- |
| ggml-alloc.h | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 2025-12-15 09:24:59 +01:00 |
| ggml-backend.h | ggml : add CPU backend reference implementation | 2026-01-02 11:50:31 +01:00 |
| ggml-blas.h | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00 |
| ggml-cann.h | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00 |
| ggml-cpp.h | ggml : fix ggml_gallocr_ptr type (ggml/1205) | 2025-05-01 09:58:44 +03:00 |
| ggml-cpu.h | ggml : add CPU backend reference implementation | 2026-01-02 11:50:31 +01:00 |
| ggml-cuda.h | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00 |
| ggml-hexagon.h | Add experimental ggml-hexagon backend for the Hexagon NPU (#16547) | 2025-10-22 13:47:09 -07:00 |
| ggml-metal.h | metal : refactor + optimize v2 (#15995) | 2025-09-17 20:38:12 +03:00 |
| ggml-opencl.h | Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) | 2024-12-13 12:23:52 -08:00 |
| ggml-opt.h | finetune: SGD optimizer, more CLI args (#13873) | 2025-08-14 12:03:57 +02:00 |
| ggml-rpc.h | rpc : fix alloc size logic (#17116) | 2025-12-05 19:39:04 +02:00 |
| ggml-sycl.h | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00 |
| ggml-vulkan.h | vulkan: Make Vulkan optional at runtime (#11493). (#11494) | 2025-02-10 07:17:21 +01:00 |
| ggml-webgpu.h | ggml: Add initial WebGPU backend (#14521) | 2025-07-16 18:18:51 +03:00 |
| ggml-zdnn.h | zdnn: refactor codebase + add docs (#16178) | 2025-09-23 14:53:05 +08:00 |
| ggml-zendnn.h | ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) | 2025-12-07 00:13:33 +08:00 |
| ggml.h | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 2025-12-15 09:24:59 +01:00 |
| gguf.h | GGUF: C++ refactor, backend support, misc fixes (#11030) | 2025-01-07 18:01:58 +01:00 |