llama.cpp/ggml
Reese Levine 957d717ce5
ggml-webgpu: parameterize submission size and add iOS specific limits (#21533)
* Work towards removing bitcast

* Move rest of existing types over

* Add timeout back to wait and remove synchronous set_tensor/memset_tensor

* Move to unpackf16 for wider compatibility

* cleanup

* Remove deadlock condition in free_bufs

* Start work on removing parameter buffer pools

* Simplify and optimize further

* Simplify profile futures

* Fix stride

* Try using a single command buffer per batch

* Formatting

* Add parameters for different browsers' in-flight submissions

* Update handling of batch size too

* Throttle iOS as much as possible

* Increase timeout for llvm-pipe testing
2026-04-07 20:30:01 +03:00
cmake
include ggml : deprecate GGML_OP_ADD1 (#21363) 2026-04-07 15:28:27 +03:00
src ggml-webgpu: parameterize submission size and add iOS specific limits (#21533) 2026-04-07 20:30:01 +03:00
.gitignore
CMakeLists.txt ggml : bump version to 0.9.11 (ggml/1456) 2026-04-02 10:39:00 +03:00