Commit Graph

1243 Commits

Author SHA1 Message Date
klosax 6a69a693cb
gguf.py : fix rope scale kv 2023-08-21 13:23:10 +02:00
klosax c818c405e0
convert-llama-hf-to-gguf.py : fix attn_q permute 2023-08-21 04:42:09 +02:00
klosax 58bde5c5c1
Delete convert-permute-debug.py 2023-08-21 04:35:06 +02:00
klosax 287db51015
Delete convert-permute-debug-master.py 2023-08-21 04:34:39 +02:00
klosax d5c8fcfd8a
convert.py : 70b model working (change attn_q permute) 2023-08-21 04:33:33 +02:00
klosax 7de7cb4bd8
convert-permute-debug.py : change permute type of attn_q 2023-08-21 04:06:59 +02:00
klosax 4f92488dd6
convert-permute-debug-master.py : permute debug for master 2023-08-21 03:44:16 +02:00
klosax 5a02b9625a
convert-permute-debug.py : permute debug print 2023-08-21 03:24:29 +02:00
klosax f838faa874
convert-llama-7b-pth-to-gguf.py : special tokens 2023-08-20 16:56:48 +02:00
klosax 76b46627e2
convert-llama-hf-to-gguf.py : special tokens 2023-08-20 16:54:42 +02:00
klosax 28b8c265eb
cmpnct_gpt2bpe.hpp : cleanup 2023-08-19 18:26:51 +02:00
klosax c0a1269b7f
Update examples/server/README.md
Co-authored-by: slaren <slarengh@gmail.com>
2023-08-19 15:27:37 +02:00
klosax 6a2e520095
cmpnct_gpt2bpe.hpp : remove non-general stuff 2023-08-19 13:19:02 +02:00
klosax 8945d47f52
gptneox-main.cpp : fixes 2023-08-19 12:09:24 +02:00
klosax 781bf2481f
falcon-main.cpp : fixes 2023-08-19 12:08:17 +02:00
klosax dadf098b5a
cmpnct_gpt2bpe.hpp : fixes 2023-08-19 12:06:22 +02:00
klosax b3a7a2b486
convert-falcon-hf-to-gguf.py : add tensor data layout 2023-08-19 12:05:11 +02:00
klosax 2c8055b65b
convert-falcon-hf-to-gguf.py : update ref 2023-08-19 01:08:39 +02:00
klosax 1d80eea574
falcon-main.cpp : fix for falcon 40b 2023-08-19 01:03:37 +02:00
klosax bd5a57901b
gguf.py : fix for falcon 40b 2023-08-19 01:01:52 +02:00
klosax 281d6d1105
convert-llama-hf-to-gguf.py : remove extra kv 2023-08-19 00:32:56 +02:00
klosax 593b04fdcd
convert-llama-7b-pth-to-gguf.py : remove extra kv 2023-08-19 00:32:27 +02:00
klosax c0e4ca630b
convert-gptneox-hf-to-gguf.py : remove extra kv 2023-08-19 00:31:56 +02:00
klosax 16ab9ba3b3
convert-falcon-hf-to-gguf.py : remove extra kv 2023-08-19 00:31:28 +02:00
klosax d5e976c12b
falcon-main.cpp : falcon inference example 2023-08-19 00:02:18 +02:00
klosax fb7c883cd3
convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested 2023-08-18 20:14:01 +02:00
Georgi Gerganov 25b8a8922d
llama : introduce enum llama_vocab_type + remove hardcoded string constants 2023-08-18 18:46:38 +03:00
Georgi Gerganov a4ad2bf35c
llama : fix MPI build
ggml-ci
2023-08-18 17:34:27 +03:00
Georgi Gerganov 5d2656d670
llama : avoid hardcoded special tokens 2023-08-18 17:29:20 +03:00
Georgi Gerganov 035d511457
llama : minor API updates 2023-08-18 17:10:20 +03:00
Georgi Gerganov 2d6c2c757c
llama : remove C++ API + reorganize common source in /common dir 2023-08-18 16:22:48 +03:00
Georgi Gerganov 38016ed9ec
Merge branch 'master' into gguf 2023-08-18 15:21:48 +03:00
Georgi Gerganov 660ca9bbca
llama : re-order functions 2023-08-18 14:56:36 +03:00
slaren 097e121e2f
llama : add benchmark example (#2626)
* llama : add benchmark example

* add to examples CMakeLists.txt

* fix msvc build

* add missing include

* add Bessel's correction to stdev calculation

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* improve markdown formatting

* add missing include

* print warning if NDEBUG is not defined

* remove n_prompt and n_gen from the matrix, use each value separately instead

* better checks for non-optimized builds

* llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call

* fix json formatting

* add sql output

* add basic cpu and gpu info (linux/cuda only)

* markdown: also show values that differ from the default

* markdown: add build id

* cleanup

* improve formatting

* formatting

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2023-08-18 12:44:58 +02:00
mdrokz eaf98c2649
readme : add link to Rust bindings (#2656) 2023-08-18 13:17:58 +03:00
Georgi Gerganov e9b12c332e
perplexity : more meaningful ETA number - 2 decimal points 2023-08-18 12:48:55 +03:00
Georgi Gerganov dea5be61d7
editorconfig : fix whitespaces 2023-08-18 12:42:38 +03:00
Georgi Gerganov e35f8c744e
tests : update vocab file with new magic 2023-08-18 12:39:22 +03:00
Georgi Gerganov 856afff746
Merge branch 'master' into gguf 2023-08-18 12:38:05 +03:00
Georgi Gerganov aa3efe87c8
llama : print number of tensors per type + print arch + style 2023-08-18 10:36:45 +03:00
klosax b275de745d
llama.cpp : get special token kv and linefeed token id 2023-08-18 03:34:30 +02:00
Evan Jones 604b8bdfa6
Fix unicode in grammars (fixes #2501) (#2553)
* Fix unicode in grammars (fixes #2501)

* add more comments

* fix test-llama-grammar
2023-08-17 19:54:44 -04:00
staviq 10151bee2e
server : support for saving templates in browser LocalStorage (#2486)
* support for templates in browser LocalStorage

* sync accepted #2409 fix from upstream

* convert autosave invocation to useEffect

* Apply suggestions from code review

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>

* Regen index.html.cpp, suggested from code review

---------

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
2023-08-18 07:34:01 +08:00
klosax 306070c896
llama.cpp : print kv general.name 2023-08-18 01:06:27 +02:00
Johannes Gäßler 0992a7b8b1
README: fix LLAMA_CUDA_MMV_Y documentation (#2647) 2023-08-17 23:57:59 +02:00
klosax d9e6890a51
test-tokenizer-0.cpp : fix warning 2023-08-17 23:34:21 +02:00
klosax 147a99bd3a
gguf.py : reverse GGUF_MAGIC 2023-08-17 23:24:04 +02:00
klosax c20ae49b59
ggml.h : reverse GGUF_MAGIC 2023-08-17 23:23:17 +02:00
Henri Vasserman 6ddeefad9b
[Zig] Fixing Zig build and improvements (#2554)
* Fix zig after console.o was split

* Better include and flag management

* Change LTO to option
2023-08-17 23:11:18 +03:00
klosax 3c1b7217a9
convert-llama-7b-pth-to-gguf.py : fixes 2023-08-17 21:44:34 +02:00