HappyZ happyz
happyz synced commits to refs/pull/6707/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:17 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6721/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:17 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6716/head at happyz/llama.cpp from mirror 2024-04-17 23:13:17 -07:00
0dd7505ad4 llamafile : tmp disable due to MoE bug
happyz synced commits to refs/tags/b2690 at happyz/llama.cpp from mirror 2024-04-17 23:13:17 -07:00
happyz synced commits to refs/pull/6658/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:16 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6661/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:16 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
8dd1ec8b3f readme : add UI (#6724)
happyz synced commits to refs/pull/6688/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:16 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6648/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:15 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6640/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:15 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6646/head at happyz/llama.cpp from mirror 2024-04-17 23:13:15 -07:00
44ca5764d6 fix KQ FP32 precision fpr parallel_blocks > 1
4e4d58ab6a Add __hgt2_mask implementation for CUDA 11
a9d6591652 Calculate KQ as FP32 if KQV has GGML_PREC_F32
aef96ff40a store temp KQ in registers
049533d99f flush softmax exp below threshold to 0
happyz synced commits to refs/pull/6646/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:15 -07:00
5668c79ea0 server: bench: enable flash_attn param
44ca5764d6 fix KQ FP32 precision fpr parallel_blocks > 1
4e4d58ab6a Add __hgt2_mask implementation for CUDA 11
a9d6591652 Calculate KQ as FP32 if KQV has GGML_PREC_F32
happyz synced commits to refs/pull/6602/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:14 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6563/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:14 -07:00
b5798bcc84 Merge 9acb43d7fa0b8da867570c975d33f0728951ca46 into 3b8f1ec4b1
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
8dd1ec8b3f readme : add UI (#6724)
facb8b56f8 convert : fix autoawq gemma (#6704)
532c1737a1 llama : make general.name optional (#6709)
happyz synced commits to refs/pull/6638/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:14 -07:00
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6522/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:14 -07:00
deed361cbf Merge a37d88568336ec949865e166eaf1454841f4cdb5 into 3b8f1ec4b1
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
8dd1ec8b3f readme : add UI (#6724)
happyz synced commits to refs/pull/6511/head at happyz/llama.cpp from mirror 2024-04-17 23:13:13 -07:00
34d8cfaf21 gguf-py/gguf/*.py: use __name__ as logger name
bc9a3ae6df *.py: refactor logging.basicConfig()
14af2d36d3 verify-checksum-models.py: use print() for printing table
9e4cf371ef convert-hf-to-gguf.py: print --> logger.debug or ValueError()
70d4f425a8 gguf-dump.py: dump_metadata() should print to stdout
happyz synced commits to refs/pull/6511/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:13 -07:00
90599c4754 Merge 34d8cfaf21188ea753cefe073c43814d507c772b into 3b8f1ec4b1
34d8cfaf21 gguf-py/gguf/*.py: use __name__ as logger name
bc9a3ae6df *.py: refactor logging.basicConfig()
14af2d36d3 verify-checksum-models.py: use print() for printing table
9e4cf371ef convert-hf-to-gguf.py: print --> logger.debug or ValueError()
happyz synced commits to refs/pull/6505/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:13 -07:00
4d8fe0764b metal : enable buffer log prints again
0e6963da8f cuda : fix warnings
d18b19c8fe Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)
happyz synced commits to refs/pull/6502/merge at happyz/llama.cpp from mirror 2024-04-17 23:13:12 -07:00
8dd1ec8b3f readme : add UI (#6724)
happyz synced commits to refs/pull/6505/head at happyz/llama.cpp from mirror 2024-04-17 23:13:12 -07:00
4d8fe0764b metal : enable buffer log prints again
0e6963da8f cuda : fix warnings
d18b19c8fe Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716)