Georgi Gerganov
1b74b9d73b
ggml : extend support for n_seq for soft_max and fattn
ggml-ci
2025-06-24 20:23:56 +03:00
Radoslav Gerganov
eba97574da
ggml : simplify forward_dup_f32
2025-06-23 13:21:36 +03:00
Georgi Gerganov
e73690a69d
ggml : ggml_set_rows update comment + better index name
2025-06-23 13:21:35 +03:00
Georgi Gerganov
630c84a2bd
ggml : ggml_set_rows support quantized dst
ggml-ci
2025-06-23 13:21:35 +03:00
Georgi Gerganov
df71c803b4
ggml : ggml_set_rows support broadcast
2025-06-23 13:21:35 +03:00
Georgi Gerganov
695b6b7025
ggml : add repeat impl for i64
2025-06-23 13:21:34 +03:00
Radoslav Gerganov
f2cd962fe2
use I64 for indices
2025-06-23 13:21:34 +03:00
Radoslav Gerganov
c1a581a10b
ggml : add ggml_set_rows
Add ggml_set_rows(a, b, c) which copies rows from 'b' into 'a' using
indices from 'c'.
ref: #8366
2025-06-23 13:21:32 +03:00
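The row-copy semantics described in the commit above ("copies rows from 'b' into 'a' using indices from 'c'") can be illustrated on plain arrays. This is a minimal sketch, not the ggml implementation: `set_rows_f32` is a hypothetical helper, the data is assumed to be contiguous row-major F32, and the indices are assumed to select destination rows in `a` (64-bit, matching the later "use I64 for indices" commit).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the ggml API): for each source row r,
 * copy row r of src into row idx[r] of dst. Both buffers are assumed to be
 * contiguous row-major float arrays with n_cols elements per row. */
static void set_rows_f32(float *dst, int64_t dst_rows,
                         const float *src, const int64_t *idx,
                         int64_t n_rows, int64_t n_cols) {
    for (int64_t r = 0; r < n_rows; ++r) {
        /* Assumption: indices address rows of the destination. */
        assert(idx[r] >= 0 && idx[r] < dst_rows);
        memcpy(dst + idx[r] * n_cols,
               src + r * n_cols,
               (size_t) n_cols * sizeof(float));
    }
}
```

With a 4x2 destination `a` of zeros, a 2x2 source `b = {1,2,3,4}`, and indices `c = {3,1}`, calling `set_rows_f32(a, 4, b, c, 2, 2)` writes the first source row into row 3 of `a` and the second into row 1, leaving rows 0 and 2 untouched.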
Acly
b7147673f2
Add `ggml_roll` (ggml/1274)
* ggml : add ggml_roll
* use set/get_op_params & std::min
2025-06-20 21:02:47 +03:00
Diego Devesa
482548716f
releases : use dl backend for linux release, remove arm64 linux release (#13996)
2025-06-04 13:15:54 +02:00
Vineel Abhinav
dd8ba93416
ggml: aarch64: Implement SVE F32 kernels for Mamba Sequential Scan Algorithm (#13882)
* F32-Mamba-Seq_Scan-SVE
* Fix formatting
* ggml : missing space
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-05-29 12:18:43 +03:00
Vineel Abhinav
1b8fb8152d
ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843)
* F32-Mamba-SVE
* F32-Mamba-SVE
* Resolve test errors-1
* Resolve test errors-2
* F32-vec-SVE
* F32-vec-SVE
* F32-vec-SVE
2025-05-29 09:01:33 +03:00
Xuan-Son Nguyen
cf4cb59e64
ggml : add ggml_gelu_erf() (#13667)
* ggml : add ggml_gelu_na (not approximated)
* fix naming order
* rename na --> erf
* apply review suggestions
* revert naming order
2025-05-21 16:26:33 +02:00
Daniel Bevenius
13b0a04597
whisper: remove MSVC warnings pragmas (whisper/3090)
* ggml : remove MSVC warnings pragmas
This commit removes the MSVC-specific pragmas as these are now handled
in ggml/CMakeLists.txt.
* whisper : remove MSVC warning pragmas
This commit removes the MSVC-specific pragmas. These are now handled in
the ggml/CMakeLists.txt file.
2025-05-07 17:28:36 +03:00
SXX
77d5e9a76a
ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (#13107)
* ggml: dynamic x86_64 feature detection for FP32 <-> FP16/BF16 conversion
* move fp converter to ggml-cpu
* Switch ggml_compute_forward_get_rows_f16/bf16 to new ggml_cpu_fp16/bf16_to_fp32
2025-04-26 16:05:31 +02:00
Acly
c6e8cc28c1
ggml : Depthwise 2D convolution (ggml/1152)
* ggml-cpu : kernels for faster depthwise 2D convolution
* fix compile: remove static after moving to ops.cpp
* add dilation for depthwise_conv_2d
* review: rename to ggml_conv_2d_dw_direct, remove redundant struct keywords, pass by ref, whitespace
* review: rename depthwise_conv_2d -> conv_2d_dw everywhere
2025-04-24 17:32:47 +03:00
Diego Devesa
fe92821ea9
ggml : add bilinear upscale support (ggml/1185)
2025-04-11 00:17:47 +03:00
Diego Devesa
459895c326
ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)
* ggml : add more generic ggml_custom op
* ggml : remove deprecated custom ops
2025-04-11 00:17:47 +03:00
Georgi Gerganov
a19b5cef16
llama : fix FA when KV cache is not used (i.e. embeddings) (#12825)
* ggml : FA supports F32 V
* graph : cast KV to F16 when the KV cache is not used
ggml-ci
* server : add test that exercises embeddings with FA enabled
ggml-ci
2025-04-08 19:54:51 +03:00
cmdr2
995083e4ed
cpu: move all the operators into a separate c++ file (except mul_mat) (ggml/1167)
* cpu: refactor SIMD mappings and vectorized op functions into separate files
* Fix warning for ggml_float to float
* Fix warnings
* cpu: move all the operations (except mul_mat) to a separate c++ file
* fix whitespace
* Update ggml/src/ggml-cpu/vec.h
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp
* Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-04-07 18:44:17 +03:00