llama.cpp/ggml/src
Daniel Bevenius 3913f8730e
ggml : fix padding in timestep embedding kernels (#15932)
* ggml : remove adding extra dim timestep embedding

This commit updates the ggml_timestep_embedding function to no longer
add an extra dimension when the specified dimension is odd.

The motivation for this change is that the extra dimension is unnecessary
when the specified dimension is odd, and it caused an issue in kernels
that were not expecting it, resulting in uninitialized memory in the
second-to-last dimension.

* ggml-cuda : fix padding in timestep embedding kernel

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.

* ggml-metal : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-opencl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-sycl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-vulkan : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-cpu : fix padding in timestep embedding function

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.
2025-09-16 15:25:57 +02:00
ggml-blas vulkan: sort graph to allow more parallel execution (#15850) 2025-09-09 02:10:07 +08:00
ggml-cann CANN: Disable acl_graph for prefill stage (#15933) 2025-09-11 15:59:37 +08:00
ggml-cpu ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-cuda ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-hip HIP: bump requirement to rocm 6.1 (#15296) 2025-08-13 20:44:30 +02:00
ggml-metal ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-musa CUDA: replace GGML_CUDA_F16 with CUDA arch checks (#15433) 2025-08-20 16:58:49 +02:00
ggml-opencl ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-rpc vulkan: sort graph to allow more parallel execution (#15850) 2025-09-09 02:10:07 +08:00
ggml-sycl ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-vulkan ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml-webgpu vulkan: sort graph to allow more parallel execution (#15850) 2025-09-09 02:10:07 +08:00
ggml-zdnn ggml-zdnn: rm user mapped buffers (#15965) 2025-09-14 13:37:03 +08:00
CMakeLists.txt ggml: initial IBM zDNN backend (#14975) 2025-08-15 21:11:22 +08:00
ggml-alloc.c llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-backend-impl.h ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (#15797) 2025-09-11 22:47:38 +02:00
ggml-backend-reg.cpp ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (#15797) 2025-09-11 22:47:38 +02:00
ggml-backend.cpp vulkan: sort graph to allow more parallel execution (#15850) 2025-09-09 02:10:07 +08:00
ggml-common.h llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-impl.h llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-opt.cpp finetune: SGD optimizer, more CLI args (#13873) 2025-08-14 12:03:57 +02:00
ggml-quants.c ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (#15379) 2025-08-18 09:23:56 +02:00
ggml-quants.h llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c ggml : fix padding in timestep embedding kernels (#15932) 2025-09-16 15:25:57 +02:00
ggml.cpp ggml : Print backtrace on uncaught C++ exceptions (ggml/1232) 2025-06-01 13:43:57 +03:00
gguf.cpp gguf: gguf_writer refactor (#15691) 2025-09-05 11:34:28 +02:00