HappyZ happyz
happyz synced commits to refs/pull/6829/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:03 -07:00
bef209fcae Merge 2ef868d9cdc8ba85a39719c318a57dfddec0142b into b4e4b8a935
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »
happyz synced commits to refs/pull/6826/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:03 -07:00
c3f4b1f2d2 feat: rename Jina Bert to Jina Bert V2
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
Compare 12 commits »
happyz synced commits to refs/pull/6811/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
05efa34d92 grammars: keep llama_grammar_copy non-quadratic optim for later
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
Compare 10 commits »
happyz synced commits to refs/pull/6811/head at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
05efa34d92 grammars: keep llama_grammar_copy non-quadratic optim for later
happyz synced commits to refs/pull/6810/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »
happyz synced commits to refs/pull/6784/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »
happyz synced commits to refs/pull/6822/head at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
476d319fde correct buffer size
7f89803536 add enum keyword
0d3363e4e6 llama_chat_get_typed_template
81b5903890 adapt phi3 template
ada54292c6 Merge branch 'master' into xsn/chat_template_prefix_postfix
Compare 22 commits »
happyz synced commits to refs/pull/6822/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
476d319fde correct buffer size
7f89803536 add enum keyword
0d3363e4e6 llama_chat_get_typed_template
81b5903890 adapt phi3 template
Compare 17 commits »
happyz synced commits to refs/pull/6826/head at happyz/llama.cpp from mirror 2024-04-24 11:14:02 -07:00
c3f4b1f2d2 feat: rename Jina Bert to Jina Bert V2
dfa067631c feat: example comments in embedding
dd060a2a4e feat: handle gpt2 tokenizer with Jina architecture
Compare 3 commits »
happyz synced commits to refs/pull/6773/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
4e26b45dbb Merge ac6ae5daca into 1409defcd09e75b6f33bbf0adfde94d2bcd8fdfc
1409defcd0 llama : disable FA for AMD
8937ec5307 Merge branch 'master' into gg/flash-attn
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
Compare 28 commits »
happyz synced commits to refs/pull/6778/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »
happyz synced commits to refs/pull/6766/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
408759687f further addressed comments
d403b180a6 Addressed comments
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
Compare 12 commits »
happyz synced commits to refs/pull/6766/head at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
408759687f further addressed comments
d403b180a6 Addressed comments
c3d4ead136 added missing CUDA_CHECKs
Compare 3 commits »
happyz synced commits to refs/pull/6757/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
c0d1b3e03e ggml : move 32-bit arm compat in ggml-impl.h (#6865)
Compare 8 commits »
happyz synced commits to refs/pull/6739/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:01 -07:00
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
c0d1b3e03e ggml : move 32-bit arm compat in ggml-impl.h (#6865)
Compare 8 commits »
happyz synced commits to refs/pull/6644/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:00 -07:00
240202d84a Merge 1b988855dca2ced3850dbe40812707e639b1dbd6 into b4e4b8a935
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »
happyz synced commits to refs/pull/6640/head at happyz/llama.cpp from mirror 2024-04-24 11:14:00 -07:00
0c74ad3cf1 grammar: nit numbering in comment
21bac1e453 grammar: nit typo switched error msgs
d03c98ed9a grammars: ensure unambiguous number alternatives
a61281fef5 grammars: comment on rule repetitions
724f879fa2 Update examples/server/public/json-schema-to-grammar.mjs
Compare 6 commits »
happyz synced commits to refs/pull/6638/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:00 -07:00
28103f4832 Server: fix seed for multiple slots (#6835)
c0d1b3e03e ggml : move 32-bit arm compat in ggml-impl.h (#6865)
abd3314064 llama : add phi 3 chat template (#6857)
3fec68be4e convert : add support of codeqwen due to tokenizer (#6707)
Compare 6 commits »
happyz synced commits to refs/pull/6707/head at happyz/llama.cpp from mirror 2024-04-24 11:14:00 -07:00
8aa536a367 convert : fix whitespace
happyz synced commits to refs/pull/6688/merge at happyz/llama.cpp from mirror 2024-04-24 11:14:00 -07:00
b4e4b8a935 llama : add llama_get_pooling_type function (#6862)
3fe847b574 server : do not apply Markdown formatting in code sections (#6850)
37246b1031 common : revert showing control tokens by default for server (#6860)
28103f4832 Server: fix seed for multiple slots (#6835)
Compare 9 commits »