Branches - happyz/llama.cpp - HappyGit

master

4fd59e8427 · ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATIVE=ON (#18413) · Updated 2025-12-27 17:33:14 -08:00

ggml-impl 4b3cb98d46 · ggml-impl : move extern "C" to start of file · Updated 2023-10-30 10:05:58 -07:00 · happyz · 6111 behind, 7 ahead
lto bc28aaa8c2 · make : use -lfto=auto to avoid warnings and maintain perf · Updated 2023-10-30 07:00:53 -07:00 · happyz · 6111 behind, 5 ahead
scratch 15267192c0 · llama : refactor tensor offloading as callback · Updated 2023-10-29 04:04:36 -07:00 · happyz · 6115 behind, 15 ahead
ggml-quants 8a86b95e87 · quantize : --pure option for disabling k-quant mixtures · Updated 2023-10-28 13:37:03 -07:00 · happyz · 6116 behind, 3 ahead
apply-3585 de7e0912b6 · convert : ignore tokens if their IDs are within [0, vocab_size) · Updated 2023-10-28 05:01:36 -07:00 · happyz · 6119 behind, 1 ahead
sampling-greedy-with-probs bbfc62ac2f · sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs · Updated 2023-10-28 04:04:57 -07:00 · happyz · 6127 behind, 3 ahead
cuda-multi-gpu cd3e20fb50 · cuda : fix multi-gpu with tensor cores · Updated 2023-10-27 13:11:50 -07:00 · happyz · 6126 behind, 3 ahead
cuda-quantum-batch 49af767fad · build : add compile option to force use of MMQ kernels · Updated 2023-10-27 03:21:04 -07:00 · happyz · 6128 behind, 7 ahead
cuda-batched-gemm d798a17c34 · cuda : add TODO for calling cublas from kernel + using mem pool · Updated 2023-10-24 06:33:24 -07:00 · happyz · 6142 behind, 10 ahead
cuda-batched-gemm-deq 6966474928 · cuda : play with faster Q4_0 dequantization · Updated 2023-10-24 00:29:40 -07:00 · happyz · 6142 behind, 8 ahead
upd-issue-templates b9bb4cbe86 · Separate bug and enhancement template + no default title · Updated 2023-10-23 08:59:11 -07:00 · happyz · 6142 behind, 1 ahead
server-rev c0f4d54870 · server : add comment about changing slot_state to bool · Updated 2023-10-22 12:24:39 -07:00 · happyz · 6148 behind, 72 ahead
perf-study cb79f8a2d8 · llama : add SKIP_KQ_KQV option · Updated 2023-10-21 23:58:29 -07:00 · happyz · 6148 behind, 3 ahead
sampling-refactor 56ba00b923 · sampling : hide prev behind API and apply #3661 · Updated 2023-10-20 08:53:27 -07:00 · happyz · 6151 behind, 6 ahead
speculative-tree ad2727d091 · Merge branch 'master' into speculative-tree · Updated 2023-10-18 00:50:58 -07:00 · happyz · 6162 behind, 18 ahead
llava-fix-offloading 932589c0ef · Honor -ngl option for Cuda offloading in llava · Updated 2023-10-13 17:12:10 -07:00 · happyz · 6176 behind, 1 ahead
rev-sampling 5261aee8d8 · sampling : one sequence per sampling context · Updated 2023-10-12 10:36:44 -07:00 · happyz · 6179 behind, 1 ahead
batched-bench 2fcdf869cd · batched-bench : add mmq CLI arg · Updated 2023-10-11 09:42:33 -07:00 · happyz · 6191 behind, 7 ahead
alloc-assert-fix ee7456926e · ggml-alloc : fix assert in debug builds · Updated 2023-10-09 05:33:12 -07:00 · happyz · 6200 behind, 1 ahead
fix-kv-cache-access ee268b5446 · llama : no longer perform uninitialized access to the KV cache · Updated 2023-10-08 01:49:38 -07:00 · happyz · 6207 behind, 5 ahead
