llama.cpp

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	1b74b9d73b	ggml : extend support for n_seq for soft_max and fattn ggml-ci	2025-06-24 20:23:56 +03:00
Radoslav Gerganov	eba97574da	ggml : simplify forward_dup_f32	2025-06-23 13:21:36 +03:00
Georgi Gerganov	e73690a69d	ggml : ggml_set_rows update comment + better index name	2025-06-23 13:21:35 +03:00
Georgi Gerganov	630c84a2bd	ggml : ggml_set_rows support quantized dst ggml-ci	2025-06-23 13:21:35 +03:00
Georgi Gerganov	df71c803b4	ggml : ggml_set_rows support broadcast	2025-06-23 13:21:35 +03:00
Georgi Gerganov	695b6b7025	ggml : add repeat impl for i64	2025-06-23 13:21:34 +03:00
Radoslav Gerganov	f2cd962fe2	use I64 for indices	2025-06-23 13:21:34 +03:00
Radoslav Gerganov	c1a581a10b	ggml : add ggml_set_rows Add ggml_set_rows(a, b, c) which copies rows from 'b' into 'a' using indices from 'c'. ref: #8366	2025-06-23 13:21:32 +03:00
Acly	b7147673f2	Add `ggml_roll` (ggml/1274) * ggml : add ggml_roll * use set/get_op_params & std::min	2025-06-20 21:02:47 +03:00
Diego Devesa	482548716f	releases : use dl backend for linux release, remove arm64 linux release (#13996)	2025-06-04 13:15:54 +02:00
Vineel Abhinav	dd8ba93416	ggml: aarch64: Implement SVE F32 kernels for Mamba Sequential Scan Algorithm (#13882) * F32-Mamba-Seq_Scan-SVE * Fix formatting * ggml : missing space --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-29 12:18:43 +03:00
Vineel Abhinav	1b8fb8152d	ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843) * F32-Mamba-SVE * F32-Mamba-SVE * Resolve test errors-1 * Resolve test errors-2 * F32-vec-SVE * F32-vec-SVE * F32-vec-SVE	2025-05-29 09:01:33 +03:00
Xuan-Son Nguyen	cf4cb59e64	ggml : add ggml_gelu_erf() (#13667) * ggml : add ggml_gelu_na (not approximated) * fix naming order * rename na --> erf * apply review suggesions * revert naming order	2025-05-21 16:26:33 +02:00
Daniel Bevenius	13b0a04597	whisper: remove MSVC warnings pragmas (whisper/3090) * ggml : remove MSVC warnings pragmas This commit removes the MSVC-specific pragmas as these are now handled in ggml/CMakeLists.txt. * whisper : remove MSVC warning pragmas This commit removes the MSVC-specific pragmas. These are now handled in the ggml/CMakeLists.txt file.	2025-05-07 17:28:36 +03:00
SXX	77d5e9a76a	ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (#13107) * ggml: dynamic x86_64 feature detection for FP32 <-> FP16/BF16 conversion * move fp converter to ggml-cpu * Switch ggml_compute_forward_get_rows_f16/bf16 to new ggml_cpu_fp16/bf16_to_fp32	2025-04-26 16:05:31 +02:00
Acly	c6e8cc28c1	ggml : Depthwise 2D convolution (ggml/1152) * ggml-cpu : kernels for faster depthwise 2D convolution * fix compile: remove static after moving to ops.cpp * add dilation for depthwise_conv_2d * review: rename to ggml_conv_2d_dw_direct, remove redundant struct keywords, pass by ref, whitespace * review: rename depthwise_conv_2d -> conv_2d_dw everywhere	2025-04-24 17:32:47 +03:00
Diego Devesa	fe92821ea9	ggml : add bilinear upscale support (ggml/1185)	2025-04-11 00:17:47 +03:00
Diego Devesa	459895c326	ggml : add more generic custom op, remove deprecated custom ops (ggml/1183) * ggml : add more generic ggml_custom op * ggml : remove deprecated custom ops	2025-04-11 00:17:47 +03:00
Georgi Gerganov	a19b5cef16	llama : fix FA when KV cache is not used (i.e. embeddings) (#12825) * ggml : FA supports F32 V * graph : cast KV to F16 when the KV cache is not used ggml-ci * server : add test that exercises embeddings with FA enabled ggml-ci	2025-04-08 19:54:51 +03:00
cmdr2	995083e4ed	cpu: move all the operators into a separate c++ file (except mul_mat) (ggml/1167) * cpu: refactor SIMD mappings and vectorized op functions into separate files * Fix warning for ggml_float to float * Fix warnings * cpu: move all the operations (except mul_mat) to a separate c++ file * fix whitespace * Update ggml/src/ggml-cpu/vec.h Co-authored-by: Diego Devesa <slarengh@gmail.com> * Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp * Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously --------- Co-authored-by: Diego Devesa <slarengh@gmail.com>	2025-04-07 18:44:17 +03:00

20 Commits
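
Commit c1a581a10b describes ggml_set_rows(a, b, c) as copying rows from 'b' into 'a' using indices from 'c', and f2cd962fe2 switches those indices to I64. The plain C sketch below illustrates that row-scatter idea on flat float arrays. It is not the ggml implementation and does not use the ggml API: the function name set_rows_f32, its parameters, and the bounds check are all assumptions made for illustration, and broadcast and quantized destinations (df71c803b4, 630c84a2bd) are not modeled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT part of ggml.
 * a: destination matrix, a_rows x n_cols, row-major
 * b: source matrix,      b_rows x n_cols, row-major
 * c: b_rows int64 indices; row i of b is written to row c[i] of a
 * Returns 0 on success, -1 if an index falls outside a. */
int set_rows_f32(float *a, int64_t a_rows,
                 const float *b, int64_t b_rows,
                 const int64_t *c, int64_t n_cols) {
    for (int64_t i = 0; i < b_rows; ++i) {
        const int64_t dst = c[i];
        if (dst < 0 || dst >= a_rows) {
            return -1; /* reject out-of-range destination rows */
        }
        /* copy one full row of b into the selected row of a */
        memcpy(a + dst * n_cols, b + i * n_cols,
               (size_t) n_cols * sizeof(float));
    }
    return 0;
}
```

With a 3x2 destination filled with zeros, a 2x2 source {1, 2, 3, 4}, and indices {2, 0}, this sketch writes {1, 2} into row 2 and {3, 4} into row 0, leaving row 1 untouched. Rows of 'a' that no index selects keep their previous contents.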