llama.cpp

Commit Graph

Author	SHA1	Message	Date
klosax	6a69a693cb	gguf.py : fix rope scale kv	2023-08-21 13:23:10 +02:00
klosax	c818c405e0	convert-llama-hf-to-gguf.py : fix attn_q permute	2023-08-21 04:42:09 +02:00
klosax	58bde5c5c1	Delete convert-permute-debug.py	2023-08-21 04:35:06 +02:00
klosax	287db51015	Delete convert-permute-debug-master.py	2023-08-21 04:34:39 +02:00
klosax	d5c8fcfd8a	convert.py : 70b model working (change attn_q permute)	2023-08-21 04:33:33 +02:00
klosax	7de7cb4bd8	convert-permute-debug.py : change permute type of attn_q	2023-08-21 04:06:59 +02:00
klosax	4f92488dd6	convert-permute-debug-master.py : permute debug for master	2023-08-21 03:44:16 +02:00
klosax	5a02b9625a	convert-permute-debug.py : permute debug print	2023-08-21 03:24:29 +02:00
klosax	f838faa874	convert-llama-7b-pth-to-gguf.py : special tokens	2023-08-20 16:56:48 +02:00
klosax	76b46627e2	convert-llama-hf-to-gguf.py : special tokens	2023-08-20 16:54:42 +02:00
klosax	28b8c265eb	cmpnct_gpt2bpe.hpp : cleanup	2023-08-19 18:26:51 +02:00
klosax	c0a1269b7f	Update examples/server/README.md Co-authored-by: slaren <slarengh@gmail.com>	2023-08-19 15:27:37 +02:00
klosax	6a2e520095	cmpnct_gpt2bpe.hpp : remove non-general stuff	2023-08-19 13:19:02 +02:00
klosax	8945d47f52	gptneox-main.cpp : fixes	2023-08-19 12:09:24 +02:00
klosax	781bf2481f	falcon-main.cpp : fixes	2023-08-19 12:08:17 +02:00
klosax	dadf098b5a	cmpnct_gpt2bpe.hpp : fixes	2023-08-19 12:06:22 +02:00
klosax	b3a7a2b486	convert-falcon-hf-to-gguf.py : add tensor data layout	2023-08-19 12:05:11 +02:00
klosax	2c8055b65b	convert-falcon-hf-to-gguf.py : update ref	2023-08-19 01:08:39 +02:00
klosax	1d80eea574	falcon-main.cpp : fix for falcon 40b	2023-08-19 01:03:37 +02:00
klosax	bd5a57901b	gguf.py : fix for falcon 40b	2023-08-19 01:01:52 +02:00
klosax	281d6d1105	convert-llama-hf-to-gguf.py : remove extra kv	2023-08-19 00:32:56 +02:00
klosax	593b04fdcd	convert-llama-7b-pth-to-gguf.py : remove extra kv	2023-08-19 00:32:27 +02:00
klosax	c0e4ca630b	convert-gptneox-hf-to-gguf.py : remove extra kv	2023-08-19 00:31:56 +02:00
klosax	16ab9ba3b3	convert-falcon-hf-to-gguf.py : remove extra kv	2023-08-19 00:31:28 +02:00
klosax	d5e976c12b	falcon-main.cpp : falcon inference example	2023-08-19 00:02:18 +02:00
klosax	fb7c883cd3	convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested	2023-08-18 20:14:01 +02:00
Georgi Gerganov	25b8a8922d	llama : introduce enum llama_vocab_type + remove hardcoded string constants	2023-08-18 18:46:38 +03:00
Georgi Gerganov	a4ad2bf35c	llama : fix MPI build ggml-ci	2023-08-18 17:34:27 +03:00
Georgi Gerganov	5d2656d670	llama : avoid hardcoded special tokens	2023-08-18 17:29:20 +03:00
Georgi Gerganov	035d511457	llama : minor API updates	2023-08-18 17:10:20 +03:00
Georgi Gerganov	2d6c2c757c	llama : remove C++ API + reorganize common source in /common dir	2023-08-18 16:22:48 +03:00
Georgi Gerganov	38016ed9ec	Merge branch 'master' into gguf	2023-08-18 15:21:48 +03:00
Georgi Gerganov	660ca9bbca	llama : re-order functions	2023-08-18 14:56:36 +03:00
slaren	097e121e2f	llama : add benchmark example (#2626) * llama : add benchmark example * add to examples CMakeLists.txt * fix msvc build * add missing include * add Bessel's correction to stdev calculation Co-authored-by: Johannes Gäßler <johannesg@5d6.de> * improve markdown formatting * add missing include * print warning is NDEBUG is not defined * remove n_prompt and n_gen from the matrix, use each value separately instead * better checks for non-optimized builds * llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call * fix json formatting * add sql output * add basic cpu and gpu info (linx/cuda only) * markdown: also show values that differ from the default * markdown: add build id * cleanup * improve formatting * formatting --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2023-08-18 12:44:58 +02:00
mdrokz	eaf98c2649	readme : add link to Rust bindings (#2656)	2023-08-18 13:17:58 +03:00
Georgi Gerganov	e9b12c332e	perplexity : more meaningful ETA number - 2 decimal points	2023-08-18 12:48:55 +03:00
Georgi Gerganov	dea5be61d7	editorconfig : fix whitespaces	2023-08-18 12:42:38 +03:00
Georgi Gerganov	e35f8c744e	tests : update vocab file with new magic	2023-08-18 12:39:22 +03:00
Georgi Gerganov	856afff746	Merge branch 'master' into gguf	2023-08-18 12:38:05 +03:00
Georgi Gerganov	aa3efe87c8	llama : print number of tensors per type + print arch + style	2023-08-18 10:36:45 +03:00
klosax	b275de745d	llama.cpp : get special token kv and linefeed token id	2023-08-18 03:34:30 +02:00
Evan Jones	604b8bdfa6	Fix unicode in grammars (fixes #2501) (#2553) * Fix unicode in grammars (fixes #2501) * add more comments * fix test-llama-grammar	2023-08-17 19:54:44 -04:00
staviq	10151bee2e	server : support for saving templates in browser LocalStorage (#2486) * support for templates in browser LocalStorage * sync accepted #2409 fix from upstream * convert autosave invocation to useEffect * Apply suggestions from code review Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com> * Regen index.html.cpp, suggested from code review --------- Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>	2023-08-18 07:34:01 +08:00
klosax	306070c896	llama.cpp : print kv general.name	2023-08-18 01:06:27 +02:00
Johannes Gäßler	0992a7b8b1	README: fix LLAMA_CUDA_MMV_Y documentation (#2647)	2023-08-17 23:57:59 +02:00
klosax	d9e6890a51	test-tokenizer-0.cpp : fix warning	2023-08-17 23:34:21 +02:00
klosax	147a99bd3a	gguf.py : reverse GGUF_MAGIC	2023-08-17 23:24:04 +02:00
klosax	c20ae49b59	ggml.h : reverse GGUF_MAGIC	2023-08-17 23:23:17 +02:00
Henri Vasserman	6ddeefad9b	[Zig] Fixing Zig build and improvements (#2554) * Fix zig after console.o was split * Better include and flag management * Change LTO to option	2023-08-17 23:11:18 +03:00
klosax	3c1b7217a9	convert-llama-7b-pth-to-gguf.py : fixes	2023-08-17 21:44:34 +02:00

1243 Commits, All Branches