llama.cpp/ggml-cuda
Latest commit c129369702 by Georgi Gerganov: "cuda : try to fix __hgt2_mask" (ggml-ci), 2024-04-23 09:18:55 +03:00
Name               Last commit message                                                Last commit date
acc.cu
acc.cuh
alibi.cu
alibi.cuh
arange.cu
arange.cuh
argsort.cu         ggml : mul_mat_id use the same tensor for all the experts (#6387)  2024-04-03 16:07:05 +03:00
argsort.cuh
binbcast.cu        ggml : group all experts in a single ggml_mul_mat_id (#6505)       2024-04-18 15:18:48 +02:00
binbcast.cuh
clamp.cu
clamp.cuh
common.cuh         cuda : try to fix __hgt2_mask                                      2024-04-23 09:18:55 +03:00
concat.cu
concat.cuh
convert.cu         ggml : group all experts in a single ggml_mul_mat_id (#6505)       2024-04-18 15:18:48 +02:00
convert.cuh        llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
cpy.cu
cpy.cuh
dequantize.cuh     llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
diagmask.cu
diagmask.cuh
dmmv.cu            llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
dmmv.cuh           sync : ggml (#6351)                                                2024-03-29 17:45:46 +02:00
fattn.cu           cuda : "constexpr dim3" -> "const dim3"                            2024-04-22 20:31:23 +03:00
fattn.cuh          cuda : fix build                                                   2024-03-27 10:31:52 +02:00
getrows.cu
getrows.cuh
im2col.cu
im2col.cuh
mmq.cu
mmq.cuh
mmvq.cu            IQ1_M: 1.75 bpw quantization (#6302)                               2024-03-26 15:21:27 +01:00
mmvq.cuh
norm.cu
norm.cuh
pad.cu
pad.cuh
pool2d.cu
pool2d.cuh
quantize.cu        llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
quantize.cuh       llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
rope.cu
rope.cuh
scale.cu
scale.cuh
softmax.cu         ggml : ggml_soft_max support F16/F32 mask/pos                      2024-04-22 14:53:11 +03:00
softmax.cuh
sumrows.cu
sumrows.cuh
tsembd.cu
tsembd.cuh
unary.cu
unary.cuh
upscale.cu
upscale.cuh
vecdotq.cuh        IQ1_M: 1.75 bpw quantization (#6302)                               2024-03-26 15:21:27 +01:00