llama.cpp/ggml/src/ggml-cpu
Daniel Bevenius 3913f8730e
ggml : fix padding in timestep embedding kernels (#15932)
* ggml : remove adding extra dim timestep embedding

This commit updates the ggml_timestep_embedding function to no longer
add an extra dimension when the specified dimension is odd.

The motivation for this change is that the extra dimension is
unnecessary when the requested dimension is odd, and it caused an issue
in the kernels, which did not expect it: the second-to-last dimension
was left with uninitialized memory.

* ggml-cuda : fix padding in timestep embedding kernel

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.

* ggml-metal : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-opencl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-sycl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-vulkan : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-cpu : fix padding in timestep embedding function

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.
2025-09-16 15:25:57 +02:00
| Name | Latest commit | Date |
|------|---------------|------|
| amx | ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317) | 2025-06-25 23:49:04 +02:00 |
| arch | ggml-cpu: clean up s390x SIMD (#15855) | 2025-09-08 02:18:28 +08:00 |
| cmake | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00 |
| kleidiai | kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed (#15614) | 2025-09-11 12:45:40 +02:00 |
| llamafile | llamafile: PowerPC Sgemm Optimization (#15558) | 2025-08-26 23:35:25 +08:00 |
| CMakeLists.txt | ggml-cpu : add check for ARM MATMUL_INT8/i8mm support (#15922) | 2025-09-11 14:39:12 +01:00 |
| arch-fallback.h | ggml-cpu: Support Q5_0 and Q5_1 on s390x (#15486) | 2025-08-22 16:11:04 +08:00 |
| binary-ops.cpp | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 2025-03-30 08:33:31 +03:00 |
| binary-ops.h | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 2025-03-30 08:33:31 +03:00 |
| common.h | ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317) | 2025-06-25 23:49:04 +02:00 |
| ggml-cpu-impl.h | ggml-cpu: clean up s390x SIMD (#15855) | 2025-09-08 02:18:28 +08:00 |
| ggml-cpu.c | ggml: allow casting between f32 and i32 (#15783) | 2025-09-08 12:33:01 +02:00 |
| ggml-cpu.cpp | vulkan: sort graph to allow more parallel execution (#15850) | 2025-09-09 02:10:07 +08:00 |
| hbm.cpp | ggml-cpu : split arch-specific implementations (#13892) | 2025-06-09 16:47:13 +02:00 |
| hbm.h | ggml-cpu : split arch-specific implementations (#13892) | 2025-06-09 16:47:13 +02:00 |
| ops.cpp | ggml : fix padding in timestep embedding kernels (#15932) | 2025-09-16 15:25:57 +02:00 |
| ops.h | ggml: add ops for WAN video model (cuda && cpu) (#15669) | 2025-09-04 10:38:49 +02:00 |
| quants.c | llama : add gpt-oss (#15091) | 2025-08-05 22:10:36 +03:00 |
| quants.h | llama : add gpt-oss (#15091) | 2025-08-05 22:10:36 +03:00 |
| repack.cpp | ggml : repack block_iq4_nlx8 (#14904) | 2025-08-13 11:09:39 +03:00 |
| repack.h | ggml : repack block_iq4_nlx8 (#14904) | 2025-08-13 11:09:39 +03:00 |
| simd-mappings.h | ggml-cpu: drop support for nnpa intrinsics (#15821) | 2025-09-06 11:27:28 +08:00 |
| traits.cpp | ggml : fix fallback to CPU for ununsupported ops (#15118) | 2025-08-06 14:37:35 +02:00 |
| traits.h | ggml : fix fallback to CPU for ununsupported ops (#15118) | 2025-08-06 14:37:35 +02:00 |
| unary-ops.cpp | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 2025-03-30 08:33:31 +03:00 |
| unary-ops.h | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 2025-03-30 08:33:31 +03:00 |
| vec.cpp | ggml-cpu : optimize RVV kernels (#15720) | 2025-09-03 16:16:21 +08:00 |
| vec.h | ggml-cpu : optimize RVV kernels (#15720) | 2025-09-03 16:16:21 +08:00 |