Commit Graph

148 Commits

Author SHA1 Message Date
teleprint-me f1d067e7a6
refactor: Simplify huggingface hub api and update to reflect changes in constants.py 2024-05-28 19:16:32 -04:00
teleprint-me 9dbc9571a3
refactor: Simplify tokenizers implementation 2024-05-28 18:42:39 -04:00
teleprint-me 0a478c048a
chore: Add pre tokenizers and include enum mappings 2024-05-27 03:11:40 -04:00
teleprint-me 0732bd9051
feat: Ignore pre-existing model files 2024-05-27 00:06:53 -04:00
teleprint-me 36bea177cb
Merge branch 'master' into auto-model-support 2024-05-26 18:07:18 -04:00
teleprint-me b3a54291cb
Merge branch 'huggingface-hub-api' into auto-model-support 2024-05-25 20:28:40 -04:00
teleprint-me fcd20ab9e9
chore: Add comments for each file extension type 2024-05-25 19:12:16 -04:00
teleprint-me da72554f58
feat: Add static methods for resolving model types and model extensions 2024-05-25 19:11:56 -04:00
teleprint-me 63c3410492
refactor: Add support for model file types 2024-05-25 04:15:39 -04:00
teleprint-me 2ffe6b89c8
Refactor HFHubModel and HFHubTokenizer to fix reference issues 2024-05-25 04:15:15 -04:00
teleprint-me fda2319d7b
refactor: Streamline method signatures and clarify method names related to downloading repo files 2024-05-25 03:32:27 -04:00
teleprint-me 4438d052aa
refactor: Abstract file and logger management to streamline api interface 2024-05-25 02:57:59 -04:00
teleprint-me 99275a1606
refactor: Simplify API and merge HFModel into HFHub 2024-05-25 02:10:52 -04:00
teleprint-me 168297f11c
refactor: Add remote repository listings to the base HFHub class 2024-05-24 23:57:45 -04:00
teleprint-me 6da2bd6fbc
patch: Apply fix for paths and logging 2024-05-24 21:47:47 -04:00
compilade b83bab15a5
gguf-py : fix and simplify quantized shape round-trip (#7483)
* gguf-py : fix and simplify quantized shape round-trip

* gguf-py : remove unused import
2024-05-25 11:11:48 +10:00
fairydreaming fbca2f27fc
Add support for ArcticForCausalLM (#7020)
* common : increase max number of experts to 128

* common : add tensor LLM_TENSOR_FFN_NORM_EXPS for normalization before MoE that runs in parallel to attention + ffn

* gguf-py : add architecture-specific block mappings that override selected general block mappings

* convert-hf : add model conversion support for ArcticForCausalLM

* convert-hf : use added_tokens_decoder from tokenizer_config.json to redefine tokens from SentencePiece model (only for ArcticForCausalLM)

* llama : add inference support for LLM_ARCH_ARCTIC

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2024-05-24 14:31:13 +02:00
teleprint-me 64096942ce
refactor: Simplify the huggingface hub api to enable flexible model requests 2024-05-24 02:40:34 -04:00
teleprint-me 6c9ac0fc52
refactor: Add a custom tokenizer component and fix vocab request class 2024-05-24 01:30:29 -04:00
teleprint-me e62e09bbb1
refactor: Apply fix for file path references 2024-05-23 22:59:16 -04:00
teleprint-me 77bc7394c8
refactor: Add tokenizer path, add methods for extracting vocab metadata, fix checksum method name 2024-05-23 21:40:05 -04:00
teleprint-me 1749209406
refactor: Simplify huggingface hub api implementation 2024-05-23 20:50:15 -04:00
teleprint-me 0ccf579242
refactor: Apply consistent naming conventions 2024-05-23 17:17:22 -04:00
teleprint-me 9ba6b92c2d
chore: Add required vocabulary constants 2024-05-23 16:57:14 -04:00
teleprint-me 9814b7f9ab
feat: Add custom huggingface hub api
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
2024-05-23 13:48:20 -04:00
Georgi Gerganov e84b71c2c6
ggml : drop support for QK_K=64 (#7473)
* ggml : drop support for QK_K=64

ggml-ci

* opencl : restore QK_K=256 define
2024-05-23 10:00:21 +03:00
teleprint-me cd00be886f
chore: Add model metadata 2024-05-22 19:59:13 -04:00
teleprint-me 1957ca41f2
refactor: Simplify BPE pre-tokenizer mapping 2024-05-22 16:57:29 -04:00
teleprint-me 12285b5325
chore: Map model file and vocab types 2024-05-22 02:58:12 -04:00
teleprint-me 0b43e14030
refactor: Add experimental mapping for BPE pre-tokenizers 2024-05-21 22:45:45 -04:00
teleprint-me 34e14ae96d
refactor: Add experimental model mappings 2024-05-21 19:11:51 -04:00
liuwei-git 201cc11afa
llama : add phi3 128K model support (#7225)
* add phi3 128k support in convert-hf-to-gguf

* add phi3 128k support in cuda

* address build warnings on llama.cpp

* adjust index value in cuda long rope freq factors

* add long rope support in ggml cpu backend

* make freq factors only depend on ctx size

* remove unused rope scaling type 'su' from gguf converter

* fix flint warnings on convert-hf-to-gguf.py

* set to the short freq factor when context size is smaller than trained context size

* add one line of comments

* metal : support rope freq_factors

* ggml : update ggml_rope_ext API to support freq. factors

* backends : add dev messages to support rope freq. factors

* minor : style

* tests : update to use new rope API

* backends : fix pragma semicolons

* minor : cleanup

* llama : move rope factors from KV header to tensors

* llama : remove tmp assert

* cuda : fix compile warning

* convert : read/write n_head_kv

* llama : fix uninitialized tensors

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-05-21 23:28:32 +03:00
teleprint-me b2aac685d5
docs: Fix comment 2024-05-21 16:07:12 -04:00
teleprint-me 83b9fcd3e4
refactor: Rename constants to reduce confusion between references 2024-05-21 16:06:39 -04:00
teleprint-me 2fe28ad4d3
chore: Rename from repo to model repo and reorder for improved readability 2024-05-21 01:41:35 -04:00
teleprint-me 4768650aff
chore: Add formatting, set common vocab files, apply pattern to model map 2024-05-21 01:38:29 -04:00
teleprint-me fb32f50834
feat: Add hf model mapping descriptors for each repo 2024-05-21 01:07:13 -04:00
teleprint-me a35b76755f
Merge branch 'master' into auto-model-support 2024-05-21 00:16:34 -04:00
teleprint-me aed0573f68
proto: Add experimental vocab pre-tokenizer regular expressions 2024-05-21 00:14:26 -04:00
teleprint-me 5978bb007d
chore: Fix and update comments 2024-05-20 14:59:40 -04:00
teleprint-me 2fa2c7a86c
chore: Move enums and model map to constants 2024-05-20 14:51:03 -04:00
teleprint-me d9ba963cd4
refactor: Restructure tokenizer model metadata 2024-05-20 14:42:59 -04:00
teleprint-me 18bb36e496
chore: Allow the user to config the logger 2024-05-20 14:06:21 -04:00
Georgi Gerganov fabf30b4c4
llama : remove Persimmon (#7408)
* llama : remove Persimmon

* requirements : remove
2024-05-21 02:35:28 +10:00
teleprint-me bdd0286bd0
refactor: Use proper names for referenced member variables 2024-05-20 01:39:09 -04:00
teleprint-me a1951e27dc
refactor: Add proper names for remote model references 2024-05-20 01:36:44 -04:00
teleprint-me 381dad5eb3
fix: Add missing model architectures 2024-05-20 00:50:42 -04:00
teleprint-me 9a2834e24e
fix: Use __name__ as logger name 2024-05-19 22:39:30 -04:00
teleprint-me 89a46fe818
feat: Attempt to mirror the llama.cpp API for compatibility 2024-05-19 22:31:05 -04:00
teleprint-me 0479e9695f
patch: Add exception handling for non-existent vocab related files 2024-05-18 22:14:19 -04:00