llama.cpp/ggml-cuda
Latest commit c129369702 by Georgi Gerganov: "cuda : try to fix __hgt2_mask" (ggml-ci), 2024-04-23 09:18:55 +03:00
Name               Last commit message                                                Last commit date
acc.cu
acc.cuh
alibi.cu
alibi.cuh
arange.cu
arange.cuh
argsort.cu         ggml : mul_mat_id use the same tensor for all the experts (#6387)  2024-04-03 16:07:05 +03:00
argsort.cuh
binbcast.cu        ggml : group all experts in a single ggml_mul_mat_id (#6505)       2024-04-18 15:18:48 +02:00
binbcast.cuh
clamp.cu
clamp.cuh
common.cuh         cuda : try to fix __hgt2_mask                                      2024-04-23 09:18:55 +03:00
concat.cu
concat.cuh
convert.cu         ggml : group all experts in a single ggml_mul_mat_id (#6505)       2024-04-18 15:18:48 +02:00
convert.cuh        llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
cpy.cu
cpy.cuh
dequantize.cuh     llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
diagmask.cu
diagmask.cuh
dmmv.cu            llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
dmmv.cuh           sync : ggml (#6351)                                                2024-03-29 17:45:46 +02:00
fattn.cu           cuda : "constexpr dim3" -> "const dim3"                            2024-04-22 20:31:23 +03:00
fattn.cuh          cuda : fix build                                                   2024-03-27 10:31:52 +02:00
getrows.cu
getrows.cuh
im2col.cu
im2col.cuh
mmq.cu
mmq.cuh
mmvq.cu            IQ1_M: 1.75 bpw quantization (#6302)                               2024-03-26 15:21:27 +01:00
mmvq.cuh
norm.cu
norm.cuh
pad.cu
pad.cuh
pool2d.cu
pool2d.cuh
quantize.cu        llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
quantize.cuh       llama : add Command R Plus support (#6491)                         2024-04-09 11:16:13 +03:00
rope.cu
rope.cuh
scale.cu
scale.cuh
softmax.cu         ggml : ggml_soft_max support F16/F32 mask/pos                      2024-04-22 14:53:11 +03:00
softmax.cuh
sumrows.cu
sumrows.cuh
tsembd.cu
tsembd.cuh
unary.cu
unary.cuh
upscale.cu
upscale.cuh
vecdotq.cuh        IQ1_M: 1.75 bpw quantization (#6302)                               2024-03-26 15:21:27 +01:00