llama.cpp

muggle-stack 342c728d03 ggml : fix SpaceMit IME array out-of-bounds in task assignment (#16629)	2025-10-17 13:01:23 +03:00

Fix incorrect task-to-batch index calculation in the quantization phase.

The bug caused out-of-bounds access to the qnbitgemm_args array when compute_idx exceeded per_gemm_block_count_m, leading to invalid pointer dereferences and SIGBUS errors.

Correctly map tasks to batches by dividing compute_idx by per_gemm_block_count_m instead of block_size_m.

Example:
  batch_feature=1, gemm_m=30, block_size_m=4
  per_gemm_block_count_m = 8, task_count = 8

  Old: gemm_idx = 4/4 = 1 (out of bounds)
  New: gemm_idx = 4/8 = 0 (correct)

Tested on SpaceMit K1 RISC-V64 with qwen2.5:0.5b model.

Co-authored-by: muggle <mingjun.rong@spacemit.com>
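
The index arithmetic in the commit message can be checked with a small standalone sketch. The variable names (compute_idx, block_size_m, per_gemm_block_count_m, batch_feature, gemm_m, task_count) come from the commit message. The helper functions, the ceiling-division formula for per_gemm_block_count_m, and the relation task_count = batch_feature * per_gemm_block_count_m are assumptions chosen to reproduce the numbers in the example; this is not the actual SpaceMit IME code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of blocks needed to cover `a` rows with blocks of `b` rows.
constexpr std::size_t ceil_div(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}

// Old mapping described in the commit: divides the task index by the block size in rows.
constexpr std::size_t gemm_idx_old(std::size_t compute_idx, std::size_t block_size_m) {
    return compute_idx / block_size_m;
}

// Fixed mapping described in the commit: divides by the number of row blocks per GEMM.
constexpr std::size_t gemm_idx_new(std::size_t compute_idx, std::size_t per_gemm_block_count_m) {
    return compute_idx / per_gemm_block_count_m;
}

// Returns true if every task index maps to a batch in [0, batch_feature) under the fixed mapping.
// Assumes task_count = batch_feature * per_gemm_block_count_m (not confirmed by the source).
inline bool all_tasks_in_bounds(std::size_t batch_feature, std::size_t gemm_m, std::size_t block_size_m) {
    const std::size_t per_gemm_block_count_m = ceil_div(gemm_m, block_size_m);
    const std::size_t task_count = batch_feature * per_gemm_block_count_m;
    for (std::size_t compute_idx = 0; compute_idx < task_count; ++compute_idx) {
        if (gemm_idx_new(compute_idx, per_gemm_block_count_m) >= batch_feature) {
            return false;
        }
    }
    return true;
}

// Reproduces the numbers from the commit message example.
inline void check_commit_example() {
    const std::size_t batch_feature = 1, gemm_m = 30, block_size_m = 4;
    const std::size_t per_gemm_block_count_m = ceil_div(gemm_m, block_size_m);
    assert(per_gemm_block_count_m == 8);
    assert(gemm_idx_old(4, block_size_m) >= batch_feature);           // old: index 1, out of bounds
    assert(gemm_idx_new(4, per_gemm_block_count_m) < batch_feature);  // new: index 0, in bounds
}
```

Calling check_commit_example() exercises the case from the message: with a single batch, the only valid batch index is 0, so the old division by block_size_m overshoots as soon as compute_idx reaches 4, while dividing by per_gemm_block_count_m keeps all eight tasks in batch 0.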
amx	ggml : fix unaligned access in AMX code (#16315)	2025-10-06 16:05:27 +03:00
arch	devops: add s390x & ppc64le CI (#15925)	2025-09-27 02:03:33 +08:00
cmake	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
kleidiai	kleidiai: kernel interface refactoring (#16460)	2025-10-09 10:29:17 +03:00
llamafile	llamafile: PowerPC Sgemm Optimization (#15558)	2025-08-26 23:35:25 +08:00
spacemit	ggml : fix SpaceMit IME array out-of-bounds in task assignment (#16629)	2025-10-17 13:01:23 +03:00
CMakeLists.txt	kleidiai : fix work size and threads sync for fp16 (#16246)	2025-09-30 10:07:20 +03:00
arch-fallback.h	ggml-cpu: implement MXFP4 SIMD for s390x (#16193)	2025-09-26 13:27:25 +03:00
binary-ops.cpp	cpu: de-duplicate some of the operators and refactor (ggml/1144)	2025-03-30 08:33:31 +03:00
binary-ops.h	cpu: de-duplicate some of the operators and refactor (ggml/1144)	2025-03-30 08:33:31 +03:00
common.h	ggml : refactor forward_dup for cpu backend (#16062)	2025-09-19 06:31:56 +02:00
ggml-cpu-impl.h	ggml : fix build broken with -march=armv9-a on MacOS (#16520)	2025-10-13 15:48:47 +03:00
ggml-cpu.c	ggml-cpu: replace putenv with setenv for const-correctness (#16573)	2025-10-16 08:10:32 +03:00
ggml-cpu.cpp	ggml: riscv: add riscv spacemit backend (#15288)	2025-09-29 17:50:44 +03:00
hbm.cpp	ggml-cpu : split arch-specific implementations (#13892)	2025-06-09 16:47:13 +02:00
hbm.h	ggml-cpu : split arch-specific implementations (#13892)	2025-06-09 16:47:13 +02:00
ops.cpp	cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)	2025-10-15 21:24:51 +02:00
ops.h	ggml: add ops for WAN video model (cuda && cpu) (#15669)	2025-09-04 10:38:49 +02:00
quants.c	llama : add gpt-oss (#15091)	2025-08-05 22:10:36 +03:00
quants.h	llama : add gpt-oss (#15091)	2025-08-05 22:10:36 +03:00
repack.cpp	ggml : repack block_iq4_nlx8 (#14904)	2025-08-13 11:09:39 +03:00
repack.h	ggml : repack block_iq4_nlx8 (#14904)	2025-08-13 11:09:39 +03:00
simd-mappings.h	ggml : fix loongarch lsx compilation error (#15864)	2025-09-25 12:22:55 +03:00
traits.cpp	ggml : fix fallback to CPU for ununsupported ops (#15118)	2025-08-06 14:37:35 +02:00
traits.h	ggml : fix fallback to CPU for ununsupported ops (#15118)	2025-08-06 14:37:35 +02:00
unary-ops.cpp	cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)	2025-10-15 21:24:51 +02:00
unary-ops.h	cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)	2025-10-15 21:24:51 +02:00
vec.cpp	ggml : fix scalar path for computing norm (#16558)	2025-10-13 11:22:27 +03:00
vec.h	ggml : Fix FP16 ELU positive branch (#16519)	2025-10-12 08:25:37 +03:00