HappyZ happyz
happyz synced commits to refs/pull/6915/merge at happyz/llama.cpp from mirror 2024-05-13 22:42:01 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
Compare 2 commits »
happyz synced commits to refs/pull/6829/merge at happyz/llama.cpp from mirror 2024-05-13 22:42:00 -07:00
8282200a90 Merge 6f3fd1d7d33ee9d8203e2386c01cba7c74bda8c2 into e0f556186b
e0f556186b Add left recursion check: quit early instead of going into an infinite loop (#7083)
27f65d6267 docs: Fix typo and update description for --embeddings flag (#7026)
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
Compare 4 commits »
happyz synced commits to refs/pull/6828/merge at happyz/llama.cpp from mirror 2024-05-13 22:42:00 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
Compare 2 commits »
happyz synced commits to refs/pull/6834/merge at happyz/llama.cpp from mirror 2024-05-13 22:42:00 -07:00
28ddd2c474 ChatON: ChatParts dump returns info str rather than direct logging
4dfd10a40d ChatON: Move core templating/tagging code into ChatTemplates class
600653dae2 ChatON:Optional control of MsgCntBasedTagging
6e13c0c87e ChatON:Control SystemMsgSuffix+End tags only wrt 1st system msg
Compare 19 commits »
happyz synced commits to refs/pull/6834/head at happyz/llama.cpp from mirror 2024-05-13 22:42:00 -07:00
28ddd2c474 ChatON: ChatParts dump returns info str rather than direct logging
4dfd10a40d ChatON: Move core templating/tagging code into ChatTemplates class
600653dae2 ChatON:Optional control of MsgCntBasedTagging
6e13c0c87e ChatON:Control SystemMsgSuffix+End tags only wrt 1st system msg
3fcaf19967 ChatON+:Multi4Single: applyGlobalIfAny flag wrt templating api
Compare 12 commits »
happyz synced commits to refs/pull/6811/merge at happyz/llama.cpp from mirror 2024-05-13 22:42:00 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
Compare 3 commits »
happyz synced commits to refs/pull/6467/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:59 -07:00
3175e16ba7 Merge e56761dc74f9605dfba6c5f53374fbafdea5b4ce into e0f556186b
e0f556186b Add left recursion check: quit early instead of going into an infinite loop (#7083)
27f65d6267 docs: Fix typo and update description for --embeddings flag (#7026)
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
Compare 8 commits »
happyz synced commits to refs/pull/6640/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:59 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
1c570d8bee perplexity: add BF16 vs. FP16 results (#7150)
Compare 7 commits »
happyz synced commits to refs/pull/6522/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:59 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
Compare 4 commits »
happyz synced commits to refs/pull/6445/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:59 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
1c570d8bee perplexity: add BF16 vs. FP16 results (#7150)
Compare 7 commits »
happyz synced commits to sl/disable-pp-nkvo at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
happyz synced commits to refs/pull/6188/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
1c570d8bee perplexity: add BF16 vs. FP16 results (#7150)
Compare 7 commits »
happyz synced commits to refs/pull/6035/head at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
17ece900df fix q4_0 get_rows
happyz synced commits to refs/pull/6440/head at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
0add3107f7 spacing changes.
a20edbf300 do 2 rounds of 4, instead of 4 rounds of 2. and properly offset unalligned reads across a 64 byte boundary.
b23ab86eda make offset available in a register.
1072686dcf load from identical addresses for low and high side.
3449b0f359 minor comment fixes.
Compare 41 commits »
happyz synced commits to refs/pull/5615/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
284870c868 Merge branch 'master' into fix-convert-modelname
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
Compare 31 commits »
happyz synced commits to refs/pull/5615/head at happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
284870c868 Merge branch 'master' into fix-convert-modelname
ee52225067 convert-hf : support direct Q8_0 conversion (#7234)
614d3b914e llama : less KV padding when FA is off (#7257)
30e70334f7 llava-cli: fix base64 prompt (#7248)
1c570d8bee perplexity: add BF16 vs. FP16 results (#7150)
Compare 669 commits »
happyz synced new reference sl/disable-pp-nkvo to happyz/llama.cpp from mirror 2024-05-13 22:41:58 -07:00
happyz synced and deleted reference refs/tags/refs/pull/7234/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:57 -07:00
happyz synced and deleted reference refs/tags/refs/pull/7255/merge at happyz/llama.cpp from mirror 2024-05-13 22:41:57 -07:00
happyz synced commits to compilade/lazier-moe-convert-hf at happyz/llama.cpp from mirror 2024-05-13 22:41:57 -07:00