Georgi Gerganov
f26c874179
scripts : restore hf.sh ( #11288 )
...
ggml-ci
2025-01-18 13:18:32 +02:00
Georgi Gerganov
f11cfdfd7f
ci : use -no-cnv in gguf-split tests ( #11254 )
...
* ci : use -no-cnv in gguf-split tests
ggml-ci
* ci : use -no-cnv in requantize tests
ggml-ci
* scripts : fix [no ci]
2025-01-15 18:28:35 +02:00
Georgi Gerganov
44d1e796d0
sync : ggml
2025-01-14 10:39:42 +02:00
Georgi Gerganov
a4f3f5d8e6
scripts : sync gguf (cont)
2025-01-14 09:40:52 +02:00
Georgi Gerganov
48e1ae0e61
scripts : sync gguf
2025-01-14 09:36:58 +02:00
Georgi Gerganov
d00a80e89d
scripts : sync opencl
2025-01-14 09:19:58 +02:00
Georgi Gerganov
99a3755a3c
sync : ggml
2025-01-08 13:40:30 +02:00
Georgi Gerganov
78c6785175
sync : ggml
2025-01-04 16:09:53 +02:00
Djip007
2cd43f4900
ggml : more perf with llamafile tinyblas on x86_64 ( #10714 )
...
* more perf with llamafile tinyblas on x86_64.
- add bf16 support
- change dispatch strategy (thanks:
https://github.com/ikawrakow/ik_llama.cpp/pull/71 )
- reduce memory bandwidth
simple tinyblas dispatch and more cache friendly
* tinyblas dynamic dispatching
* sgemm: add M blocks.
* - git 2.47 use short id of len 9.
- show-progress is not part of GNU Wget2
* remove not stable test
2024-12-24 18:54:49 +01:00
Georgi Gerganov
5437d4aaf5
sync : ggml
2024-12-17 18:36:02 +02:00
Georgi Gerganov
87cf323cef
scripts : change build path to "build-bench" for compare-commits.sh ( #10836 )
2024-12-15 18:44:47 +02:00
Georgi Gerganov
0cd182ebcc
sync : ggml
2024-12-05 13:27:42 +02:00
Diego Devesa
59f4db1088
ggml : add predefined list of CPU backend variants to build ( #10626 )
...
* ggml : add predefined list of CPU backend variants to build
* update CPU dockerfiles
2024-12-04 14:45:40 +01:00
Georgi Gerganov
1cd3df46bd
scripts : remove amx sync
...
ggml-ci
2024-12-03 20:04:49 +02:00
Georgi Gerganov
c505471857
sync : ggml
2024-12-03 20:04:49 +02:00
Georgi Gerganov
8648c52101
make : deprecate ( #10514 )
...
* make : deprecate
ggml-ci
* ci : disable Makefile builds
ggml-ci
* docs : remove make references [no ci]
* ci : disable swift build
ggml-ci
* docs : remove obsolete make references, scripts, examples
ggml-ci
* basic fix for compare-commits.sh
* update build.md
* more build.md updates
* more build.md updates
* more build.md updates
* Update Makefile
Co-authored-by: Diego Devesa <slarengh@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-12-02 21:22:53 +02:00
Diego Devesa
3420909dff
ggml : automatic selection of best CPU backend ( #10606 )
...
* ggml : automatic selection of best CPU backend
* amx : minor opt
* add GGML_AVX_VNNI to enable avx-vnni, fix checks
2024-12-01 16:12:41 +01:00
Georgi Gerganov
fee824a1a1
sync : ggml
2024-11-27 11:10:42 +02:00
Georgi Gerganov
87a533be57
sync : ggml
2024-11-21 09:22:11 +02:00
Georgi Gerganov
9fe0fb0626
sync : ggml
2024-11-19 20:03:21 +02:00
Georgi Gerganov
5c9a8b22b1
scripts : update sync
2024-11-17 08:30:29 +02:00
Johannes Gäßler
4e54be0ec6
llama/ex: remove --logdir argument ( #10339 )
2024-11-16 23:00:41 +01:00
Georgi Gerganov
f245cc28d4
scripts : fix missing key in compare-llama-bench.py ( #10332 )
2024-11-16 10:32:50 +02:00
Johannes Gäßler
4047be74da
scripts: update compare-llama-bench.py ( #10319 )
2024-11-15 21:19:03 +01:00
Georgi Gerganov
cbf5541a82
sync : ggml
2024-11-15 15:44:06 +02:00
Georgi Gerganov
4802ad350b
scripts : fix regex in sync [no ci]
2024-11-15 08:38:43 +02:00
Georgi Gerganov
5ea926dad7
sync : ggml
2024-11-13 18:11:54 +02:00
Georgi Gerganov
eec4d71737
scripts : add amx to sync-ggml.sh [no ci]
2024-11-07 23:11:36 +02:00
Georgi Gerganov
3b08828674
sync : ggml
2024-11-07 23:08:24 +02:00
Georgi Gerganov
a2c6fd747c
scripts : sync update
2024-11-07 23:07:55 +02:00
Georgi Gerganov
ce027adfb3
sync : ggml
2024-11-04 10:33:37 +02:00
Georgi Gerganov
815fe72adc
sync : ggml
2024-11-01 10:28:24 +02:00
Diego Devesa
c5b0f4b5d9
llama : refactor model loader with backend registry ( #10026 )
2024-10-30 02:01:23 +01:00
Georgi Gerganov
8d8ff71536
llama : remove Tail-Free sampling ( #10071 )
...
ggml-ci
2024-10-29 10:42:05 +02:00
Georgi Gerganov
cc2983d375
sync : ggml
2024-10-26 10:34:08 +03:00
Georgi Gerganov
9e4a2563ea
scripts : fix amx sync [no ci]
2024-10-26 10:33:31 +03:00
Georgi Gerganov
190a37d797
sync : ggml
2024-10-23 17:23:55 +03:00
Georgi Gerganov
17bb928080
readme : remove --memory-f32 references ( #9925 )
2024-10-17 23:43:05 +03:00
Georgi Gerganov
0e41b300ed
sync : ggml
2024-10-16 11:28:14 +03:00
standby24x7
fa42aa6d89
scripts : fix spelling typo in messages and comments ( #9782 )
...
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-10-08 09:19:53 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp
2024-10-06 12:53:28 +03:00
Georgi Gerganov
58b16695e1
sync : ggml
2024-10-05 15:53:49 +03:00
Georgi Gerganov
17880771ad
sync : ggml
2024-10-04 18:50:25 +03:00
Georgi Gerganov
1bb8a64ebf
sync : ggml
2024-10-03 21:17:49 +03:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces ( #9707 )
...
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Georgi Gerganov
f1b8c42711
sync : ggml
2024-10-01 16:09:42 +03:00
Georgi Gerganov
d0b1d663e4
sync : ggml
2024-09-29 21:16:07 +03:00
Georgi Gerganov
bb5f819975
sync : ggml
2024-09-24 11:01:18 +03:00
Georgi Gerganov
4301535326
sync : ggml
...
ggml-ci
2024-09-20 21:15:05 +03:00
Georgi Gerganov
0d2f22e45c
scripts : verify py deps at the start of compare ( #9520 )
2024-09-18 18:34:32 +03:00
Georgi Gerganov
385decbd63
sync : ggml
2024-09-08 11:05:55 +03:00
Georgi Gerganov
60a3107ccd
scripts : option to increase git patch context
2024-09-08 11:05:55 +03:00
Georgi Gerganov
231cff5f6f
sync : ggml
2024-08-27 22:41:27 +03:00
Georgi Gerganov
4305b57c80
sync : ggml
2024-08-09 10:03:48 +03:00
Georgi Gerganov
afd27f01fe
scripts : sync cann files ( #0 )
2024-08-08 14:56:52 +03:00
Georgi Gerganov
366d486c16
scripts : fix sync filenames ( #0 )
2024-08-08 14:40:12 +03:00
Georgi Gerganov
e44a561ab0
sync : ggml
2024-08-08 13:19:47 +03:00
Georgi Gerganov
5587e57a76
sync : ggml
...
ggml-ci
2024-08-05 08:50:57 +03:00
Georgi Gerganov
5e2727fe03
scripts : sync vulkan-shaders ( #0 )
2024-07-27 18:08:47 +03:00
Georgi Gerganov
56f20aa25d
scripts : sync ggml-aarch64 sources
2024-07-27 18:07:33 +03:00
Georgi Gerganov
ae7985cd7b
sync : ggml
...
ggml-ci
2024-07-27 17:43:44 +03:00
Georgi Gerganov
3f2d538b81
scripts : fix sync for sycl
2024-07-08 13:51:31 +03:00
Georgi Gerganov
2ee44c9a18
sync : ggml
...
ggml-ci
2024-07-08 12:23:00 +03:00
compilade
3fd62a6b1c
py : type-check all Python scripts with Pyright ( #8341 )
...
* py : type-check all Python scripts with Pyright
* server-tests : use trailing slash in openai base_url
* server-tests : add more type annotations
* server-tests : strip "chat" from base_url in oai_chat_completions
* server-tests : model metadata is a dict
* ci : disable pip cache in type-check workflow
The cache is not shared between branches, and it's 250MB in size,
so it would become quite a big part of the 10GB cache limit of the repo.
* py : fix new type errors from master branch
* tests : fix test-tokenizer-random.py
Apparently, gcc applies optimisations even when pre-processing,
which confuses pycparser.
* ci : only show warnings and errors in python type-check
The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.
2024-07-07 15:04:39 -04:00
Georgi Gerganov
e235b267a2
py : switch to snake_case ( #8305 )
...
* py : switch to snake_case
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* cont : fix link
* gguf-py : use snake_case in scripts entrypoint export
* py : rename requirements for convert_legacy_llama.py
Needed for scripts/check-requirements.sh
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-05 07:53:33 +03:00
ditsuke
821922916f
fix: Update script paths in CI scripts
2024-07-04 15:39:13 +00:00
Clint Herron
07a3fc0608
Removes multiple newlines at the end of files that are breaking the editorconfig step of CI. ( #8258 )
2024-07-02 12:18:10 -04:00
Georgi Gerganov
c70d117c37
scripts : fix filename sync
2024-06-26 23:25:22 +03:00
Georgi Gerganov
f2d48fffde
sync : ggml
2024-06-26 19:39:19 +03:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake ( #8006 )
...
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122 )
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
jaime-m-p
37bef89433
tokenizer : BPE fixes ( #7530 )
...
* Random test: add_bos_token, add_eos_token
* Random test: add BPE models for testing
* Custom regex split fails with codepoint 0
* Fix falcon punctuation regex
* Refactor llm_tokenizer_bpe: move code to constructor
* Move 'add_special_bos/eos' logic to llm_tokenizer_bpe
* Move tokenizer flags to vocab structure.
* Default values for special_add_bos/eos
* Build vocab.special_tokens_cache using vocab token types
* Generalize 'jina-v2' per token attributes
* Fix unicode whitespaces (deepseek-coder, deepseek-llm)
* Skip missing byte tokens (falcon)
* Better unicode data generation
* Replace char32_t with uint32_t
2024-06-18 18:40:52 +02:00
Georgi Gerganov
5326bcceeb
ggml : sync
2024-06-18 09:50:45 +03:00
Olivier Chafik
1c641e6aac
`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... ( #7809 )
...
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df4 .
* add hot topic notice to README.md
* Update README.md
* Update README.md
* rename gguf-split & quantize bins refs in **/tests.sh
---------
Co-authored-by: HanClinto <hanclinto@gmail.com>
2024-06-13 00:41:52 +01:00
Georgi Gerganov
1442677f92
common : refactor cli arg parsing ( #7675 )
...
* common : gpt_params_parse do not print usage
* common : rework usage print (wip)
* common : valign
* common : rework print_usage
* infill : remove cfg support
* common : reorder args
* server : deduplicate parameters
ggml-ci
* common : add missing header
ggml-ci
* common : remove --random-prompt usages
ggml-ci
* examples : migrate to gpt_params
ggml-ci
* batched-bench : migrate to gpt_params
* retrieval : migrate to gpt_params
* common : change defaults for escape and n_ctx
* common : remove chatml and instruct params
ggml-ci
* common : passkey use gpt_params
2024-06-04 21:23:39 +03:00
Georgi Gerganov
554c247caf
ggml : remove OpenCL ( #7735 )
...
ggml-ci
2024-06-04 21:23:20 +03:00
slaren
adc9ff3841
llama-bench : allow using a different printer for stderr with -oe ( #7722 )
...
compare-commits.sh : hide stdout, use -oe to print markdown
2024-06-04 14:32:42 +02:00
Johannes Gäßler
c8047d538f
scripts: update compare_llama_bench.py [no ci] ( #7673 )
2024-05-31 16:26:21 +02:00
Galunid
9c4c9cc83f
Move convert.py to examples/convert-legacy-llama.py ( #7430 )
...
* Move convert.py to examples/convert-no-torch.py
* Fix CI, scripts, readme files
* convert-no-torch -> convert-legacy-llama
* Move vocab thing to vocab.py
* Fix convert-no-torch -> convert-legacy-llama
* Fix lost convert.py in ci/run.sh
* Fix imports
* Fix gguf not imported correctly
* Fix flake8 complaints
* Fix check-requirements.sh
* Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE
* Review fixes
2024-05-30 21:40:00 +10:00
Georgi Gerganov
00281b7be3
scripts : remove mpi remnants
2024-05-29 14:31:18 +03:00
Georgi Gerganov
2ab977282b
sync : ggml
2024-05-29 14:29:52 +03:00
slaren
d359f30921
llama : remove MPI backend ( #7395 )
2024-05-20 01:17:03 +02:00
jaime-m-p
b43272afa2
Unicode codepoint flags for custom regexs ( #7245 )
...
* Replace CODEPOINT_TYPE_* with codepoint_flags
* Update and bugfix brute force random test
* Deterministic brute force random test
* Unicode normalization NFD
* Get rid of BOM
2024-05-18 01:09:13 +02:00
Brian
51e9d02599
Added a single test function script and fix debug-test.sh to be more robust ( #7279 )
...
* run-single-test.sh: added a single test function script and fix debug-test.sh to be more robust
* debug-test.sh: combined execute and gdb test mode via -g flag
* debug-test.sh: refactor
* debug-test: refactor for clarity
* debug-test.sh: comment style changes
* debug-test.sh: fix gdb
2024-05-17 22:40:14 +10:00
Georgi Gerganov
29499bb593
sync : ggml
2024-05-15 13:23:41 +03:00
Georgi Gerganov
9f773486ab
script : sync ggml-rpc
2024-05-14 19:14:38 +03:00
Georgi Gerganov
a5e3fde857
sync : ggml
...
ggml-ci
2024-05-14 19:08:09 +03:00
Georgi Gerganov
7bd4ffb780
metal : fix warnings (skipme) ( #0 )
2024-05-11 21:38:13 +03:00
Georgi Gerganov
1622ac023f
sync : ggml
2024-05-11 21:35:05 +03:00
Josh Ramer
fed0108491
Scripting & documenting debugging one test without anything else in the loop. ( #7096 )
...
* A little documentation that shares my quick tips for working in the repository.
* Update startup-testing-debugging.md
* script that shows a menu of tests to pick from & run the debugger on
* debug-test.sh: Refactor CLI help message
* debug-test.sh: documentation update
* debug-test.sh: CLI Help output corrections
* debug-test.sh: minor doc fix
---------
authored-by: Josh Ramer <ubuntu@ip-172-31-32-53.ec2.internal>
Assisted-by: brian khuu <mofosyne@gmail.com>
2024-05-12 03:26:35 +10:00
Georgi Gerganov
fae9d234b6
sync : ggml
...
ggml-ci
2024-05-11 15:38:34 +03:00
slaren
e849648888
llama-bench : add pp+tg test type ( #7199 )
2024-05-10 18:03:54 +02:00
jaime-m-p
43248e5594
llama3 custom regex split ( #6965 )
...
* merged the changes from deepseeker models to main branch
* Moved regex patterns to unicode.cpp and updated unicode.h
* Moved header files
* Resolved issues
* added and refactored unicode_regex_split and related functions
* Updated/merged the deepseek coder pr
* Refactored code
* Adding unicode regex mappings
* Adding unicode regex function
* Added needed functionality, testing remains
* Fixed issues
* Fixed issue with gpt2 regex custom preprocessor
* unicode : fix? unicode_wstring_to_utf8
* lint : fix whitespaces
* tests : add tokenizer tests for numbers
* unicode : remove redundant headers
* tests : remove and rename tokenizer test scripts
* tests : add sample usage
* gguf-py : reader prints warnings on duplicate keys
* llama : towards llama3 tokenization support (wip)
* unicode : shot in the dark to fix tests on Windows
* unicode : first try custom implementations
* convert : add "tokenizer.ggml.pre" GGUF KV (wip)
* llama : use new pre-tokenizer type
* convert : fix pre-tokenizer type writing
* lint : fix
* make : add test-tokenizer-0-llama-v3
* wip
* models : add llama v3 vocab file
* llama : adapt punctuation regex + add llama 3 regex
* minor
* unicode : set bomb
* unicode : set bomb
* unicode : always use std::wregex
* unicode : support \p{N}, \p{L} and \p{P} natively
* unicode : try fix windows
* unicode : category support via std::regex
* unicode : clean-up
* unicode : simplify
* llama3 custom regex split
* convert : add convert-hf-to-gguf-update.py
ggml-ci
* lint : update
* convert : add falcon
ggml-ci
* unicode : normalize signatures
* lint : fix
* lint : fix
* convert : remove unused functions
* convert : add comments
* convert : exercise contractions
ggml-ci
* Using char32_t for codepoints
* lint : fix
* already exists unicode_tolower()
* Typing
* Restore BOM
* cmake : refactor test targets
* tests : refactor vocab tests
ggml-ci
* tests : add more vocabs and tests
ggml-ci
* unicode : cleanup
* scripts : ignore new update script in check-requirements.sh
* Fix merge
* models : add phi-3, mpt, gpt-2, starcoder
* tests : disable obsolete
ggml-ci
* tests : use faster bpe test
ggml-ci
* llama : more prominent warning for old BPE models
* tests : disable test-tokenizer-1-bpe due to slowness
ggml-ci
* Move unused variable value
* GPT2 custom regex split
* Add alternative regex for custom split llama3
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Style
* Add bruteforce random tests for token encoding
* wip: fixing unicode codepoint ranges
* Fix merge
* Unicode tables: separator, lowercase, uppercase and whitespace
* llama3 custom regex split: fix \s
* Restore BOM
* Style
* wip: generate NFD table
* Ignore special tokens for testing
* Clean gen-unicode-data.py
* Refactor random tokenizer test
* lint : fix
* tests : add fail test for llama-bpe
---------
Co-authored-by: Jaggzh <jaggz.h@gmail.com>
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: jaime-m-p <>
2024-05-09 23:30:44 +10:00
Brian
acdce3cdef
compare-llama-bench.py: add missing basicConfig ( #7138 )
...
* compare-llama-bench.py: add missing basicConfig
* compare-llama-bench.py: Add line break between error message and print_help()
* Add regular print() markdown table
2024-05-08 10:54:39 +02:00
Brian
6fbd432211
py : logging and flake8 suppression refactoring ( #7081 )
...
Set one as executable and add basicConfig()
to another. Also added noqa tag to test scripts.
2024-05-05 08:07:48 +03:00
Georgi Gerganov
92139b90af
tests : add test-tokenizer-0.sh + fix some tokenizers ( #7036 )
...
* tests : add test-tokenizer-0.sh
* unicode : add all unicode number ranges
* starcoder : fix pre-tokenizer
* tests : add test that fails with DeepSeek tokenizers
* falcon : fix regex
* unicode : regenerate unicode tables
* refact : add tokenizer model
* lint : fix
* tests : disable failing tests
ggml-ci
* refact : add tests files
ggml-ci
* convert : print -> logging
ggml-ci
* lint : fix
* unicode : digit -> number
* phi-3 : update
2024-05-04 08:32:32 +03:00
Brian
a2ac89d6ef
convert.py : add python logging instead of print() ( #6511 )
...
* convert.py: add python logging instead of print()
* convert.py: verbose flag takes priority over dump flag log suppression
* convert.py: named instance logging
* convert.py: use explicit logger id string
* convert.py: convert extra print() to named logger
* convert.py: sys.stderr.write --> logger.error
* *.py: Convert all python scripts to use logging module
* requirements.txt: remove extra line
* flake8: update flake8 ignore and exclude to match ci settings
* gh-actions: add flake8-no-print to flake8 lint step
* pre-commit: add flake8-no-print to flake8 and also update pre-commit version
* convert-hf-to-gguf.py: print() to logger conversion
* *.py: logging basiconfig refactor to use conditional expression
* *.py: removed commented out logging
* fixup! *.py: logging basiconfig refactor to use conditional expression
* constant.py: logger.error then exit should be a raise exception instead
* *.py: Convert logger error and sys.exit() into a raise exception (for atypical error)
* gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar
* verify-checksum-model.py: This is the result of the program, it should be printed to stdout.
* compare-llama-bench.py: add blank line for readability during missing repo response
* reader.py: read_gguf_file() use print() over logging
* convert.py: warning goes to stderr and won't hurt the dump output
* gguf-dump.py: dump_metadata() should print to stdout
* convert-hf-to-gguf.py: print --> logger.debug or ValueError()
* verify-checksum-models.py: use print() for printing table
* *.py: refactor logging.basicConfig()
* gguf-py/gguf/*.py: use __name__ as logger name
Since they will be imported and not run directly.
* python-lint.yml: use .flake8 file instead
* constants.py: logger no longer required
* convert-hf-to-gguf.py: add additional logging
* convert-hf-to-gguf.py: print() --> logger
* *.py: fix flake8 warnings
* revert changes to convert-hf-to-gguf.py for get_name()
* convert-hf-to-gguf-update.py: use triple quoted f-string instead
* *.py: accidentally corrected the wrong line
* *.py: add compilade warning suggestions and style fixes
2024-05-03 22:36:41 +03:00
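The logging convention described in the commit above (a named logger per module, `basicConfig()` selected with a conditional expression, and program results still printed to stdout) can be sketched roughly as follows. This is an illustrative sketch only, not the repository's code: `configure_logging`, `report_result`, and the `verbose` parameter are hypothetical names introduced here.

```python
import logging
import sys

# Named logger: modules that are imported (not run directly) use __name__.
logger = logging.getLogger(__name__)


def configure_logging(verbose: bool) -> int:
    """Configure root logging and return the chosen level (hypothetical helper)."""
    # Conditional expression instead of an if/else block.
    level = logging.DEBUG if verbose else logging.INFO
    # force=True replaces handlers installed by an earlier call (Python 3.8+).
    logging.basicConfig(level=level, force=True)
    return level


def report_result(text: str) -> None:
    """Program results go to stdout via print(); diagnostics go through the logger."""
    print(text)


if __name__ == "__main__":
    configure_logging(verbose="--verbose" in sys.argv)
    logger.debug("only shown with --verbose")
    logger.info("diagnostic message (stderr by default)")
    report_result("result line (stdout)")
```

Keeping results on stdout and diagnostics on the logger means the output of a dump or checksum script can be piped without log noise mixed in.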
Georgi Gerganov
f4ab2a4147
llama : fix BPE pre-tokenization ( #6920 )
...
* merged the changes from deepseeker models to main branch
* Moved regex patterns to unicode.cpp and updated unicode.h
* Moved header files
* Resolved issues
* added and refactored unicode_regex_split and related functions
* Updated/merged the deepseek coder pr
* Refactored code
* Adding unicode regex mappings
* Adding unicode regex function
* Added needed functionality, testing remains
* Fixed issues
* Fixed issue with gpt2 regex custom preprocessor
* unicode : fix? unicode_wstring_to_utf8
* lint : fix whitespaces
* tests : add tokenizer tests for numbers
* unicode : remove redundant headers
* tests : remove and rename tokenizer test scripts
* tests : add sample usage
* gguf-py : reader prints warnings on duplicate keys
* llama : towards llama3 tokenization support (wip)
* unicode : shot in the dark to fix tests on Windows
* unicode : first try custom implementations
* convert : add "tokenizer.ggml.pre" GGUF KV (wip)
* llama : use new pre-tokenizer type
* convert : fix pre-tokenizer type writing
* lint : fix
* make : add test-tokenizer-0-llama-v3
* wip
* models : add llama v3 vocab file
* llama : adapt punctuation regex + add llama 3 regex
* minor
* unicode : set bomb
* unicode : set bomb
* unicode : always use std::wregex
* unicode : support \p{N}, \p{L} and \p{P} natively
* unicode : try fix windows
* unicode : category support via std::regex
* unicode : clean-up
* unicode : simplify
* convert : add convert-hf-to-gguf-update.py
ggml-ci
* lint : update
* convert : add falcon
ggml-ci
* unicode : normalize signatures
* lint : fix
* lint : fix
* convert : remove unused functions
* convert : add comments
* convert : exercise contractions
ggml-ci
* lint : fix
* cmake : refactor test targets
* tests : refactor vocab tests
ggml-ci
* tests : add more vocabs and tests
ggml-ci
* unicode : cleanup
* scripts : ignore new update script in check-requirements.sh
* models : add phi-3, mpt, gpt-2, starcoder
* tests : disable obsolete
ggml-ci
* tests : use faster bpe test
ggml-ci
* llama : more prominent warning for old BPE models
* tests : disable test-tokenizer-1-bpe due to slowness
ggml-ci
---------
Co-authored-by: Jaggzh <jaggz.h@gmail.com>
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>
2024-04-29 16:58:41 +03:00
Olivier Chafik
5cf5e7d490
`build`: generate hex dump of server assets during build ( #6661 )
...
* `build`: generate hex dumps of server assets on the fly
* build: workaround lack of -n on gnu xxd
* build: don't use xxd in cmake
* build: don't call xxd from build.zig
* build: more idiomatic hexing
* build: don't use xxd in Makefile (od hackery instead)
* build: avoid exceeding max cmd line limit in makefile hex dump
* build: hex dump assets at cmake build time (not config time)
2024-04-21 18:48:53 +01:00
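For readers unfamiliar with the idea in the commit above, a hex dump of an asset is a C array that embeds the file's bytes in the binary. The sketch below produces output in the style of `xxd -i` without calling `xxd`. It is an assumption-laden illustration, not the build logic the commit added: the function name `hex_dump_c_array`, its parameters, and the 12-bytes-per-line layout are chosen here for the example.

```python
def hex_dump_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render bytes as a C array definition plus a length variable.

    Hypothetical helper: 'name' must already be a valid C identifier.
    """
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a two-digit hex literal such as 0x68.
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(lines)
    return (
        f"unsigned char {name}[] = {{\n{body}\n}};\n"
        f"unsigned int {name}_len = {len(data)};\n"
    )


if __name__ == "__main__":
    # Example: embed a tiny asset under the illustrative symbol name index_html.
    print(hex_dump_c_array(b"<html></html>", "index_html"), end="")
```

Generating the array at build time rather than configure time, as the last bullet of the commit notes, means an edited asset is picked up on the next build without re-running configuration.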
slaren
0d56246f4b
ggml : group all experts in a single ggml_mul_mat_id ( #6505 )
...
* ggml : group all experts in a single ggml_mul_mat_id
cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-18 15:18:48 +02:00
Pierrick Hymbert
4bd0f93e4a
model: support arch `DbrxForCausalLM` ( #6515 )
...
* model: dbrx convert to gguf
#6344
* llama: support dbrx
#6344
* doc: dbrx: add the model as supported
* scripts: get-wikitext-2 add unzip
* llama: increase maximum experts allowed
* llama: factorize moe graph implementation between grok, mixtral and dbrx
---------
Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>
2024-04-13 11:33:52 +02:00
Daniel Bevenius
f4183afe6a
scripts : add --outdir option to hf.sh ( #6600 )
...
* scripts : add --outdir option to hf.sh
This commit adds an option to the hf.sh script that allows the user to
specify an output directory for the downloaded file.
The motivation for this change is that examples that use the hf.sh
script to download models from huggingface can now specify the output
directory, perhaps to the `models` directory to keep them in one place
and not clutter the root directory.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* squash! scripts : add --outdir option to hf.sh
Fix format of the --outdir option in the usage message.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-04-11 16:22:47 +03:00
Georgi Gerganov
c4a3a4ff47
sync : ggml
2024-04-09 20:29:06 +03:00
Georgi Gerganov
e11a8999b5
license : update copyright notice + add AUTHORS ( #6405 )
...
* license : add AUTHORS
* authors : update
* scripts : add LICENSE and gen-authors.sh to sync
2024-04-09 09:23:19 +03:00
Georgi Gerganov
c37247796b
sync : ggml
2024-04-07 17:05:51 +03:00
Georgi Gerganov
43e8995e75
scripts : sync ggml-cuda folder
2024-04-07 16:08:12 +03:00
Georgi Gerganov
54ea0698fb
sync : ggml
2024-04-06 18:27:46 +03:00
Johannes Gäßler
33a5244806
compare-llama-bench.py: fix long hexsha args ( #6424 )
2024-04-01 13:30:43 +02:00
Georgi Gerganov
d48ccf3ad4
sync : ggml ( #6351 )
...
* sync : ggml
ggml-ci
* cuda : move GGML_CUDA_DMMV constants to dmmv.cuh
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-03-29 17:45:46 +02:00
slaren
280345968d
cuda : rename build flag to LLAMA_CUDA ( #6299 )
2024-03-26 01:16:01 +01:00
Johannes Gäßler
50ccaf5eac
lookup: complement data from context with general text statistics ( #5479 )
...
* lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
* fixup! lookup: evaluation tools, use corpus/previous gens
2024-03-23 01:24:36 +01:00
Georgi Gerganov
b838b53ad6
sync : ggml
2024-03-10 20:10:46 +02:00
Georgi Gerganov
8a3012a4ad
ggml : add ggml-common.h to deduplicate shared code ( #5940 )
...
* ggml : add ggml-common.h to shared code
ggml-ci
* scripts : update sync scripts
* sycl : reuse quantum tables
ggml-ci
* ggml : minor
* ggml : minor
* sycl : try to fix build
2024-03-09 12:47:57 +02:00
slaren
652ca2bded
compare-llama-bench.py : remove mul_mat_q ( #5892 )
2024-03-05 22:27:29 +01:00
Georgi Gerganov
efd8533ef8
sync : ggml
...
ggml-ci
2024-03-04 20:54:23 +02:00
Georgi Gerganov
a0fc62661f
sync : ggml
2024-03-04 10:40:04 +02:00
Georgi Gerganov
ef2cd694c4
scripts : add pod-llama.sh
2024-03-02 16:54:20 +02:00
Pierrick Hymbert
3ab8b3a92e
llama : cleanup unused mmq flags ( #5772 )
...
* cleanup unused --no-mul-mat-q,-nommq, -mmq, --mul-mat-q, mul_mat_q
* remove: mul_mat_q in compare llama bench and usage
* update llama-bench
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-03-01 13:39:06 +02:00
Georgi Gerganov
8c0e8f4e73
sync : ggml
2024-02-28 11:17:32 +02:00
Georgi Gerganov
334f76fa38
sync : ggml
2024-02-22 23:21:05 +02:00
Georgi Gerganov
5022cf242d
sync : ggml
2024-02-21 16:52:52 +02:00
Georgi Gerganov
eccd7a26dd
sync : ggml ( #5633 )
...
* ggml : fix conv_2d batch mode (ggml/737)
Co-authored-by: bssrdf <bssrdf@gmail.com>
* ggml : compute forward no longer pass src tensors (ggml/729)
* sync : ggml
ggml-ci
---------
Co-authored-by: bssrdf <merlintiger@hotmail.com>
Co-authored-by: bssrdf <bssrdf@gmail.com>
2024-02-21 16:17:10 +02:00
Georgi Gerganov
337c9cbd52
sync : ggml
...
ggml-ci
2024-02-19 15:09:43 +02:00
Jared Van Bortel
a0c2dad9d4
build : pass all warning flags to nvcc via -Xcompiler ( #5570 )
...
* build : pass all warning flags to nvcc via -Xcompiler
* make : fix apparent mis-merge from #3952
* make : fix incorrect GF_CC_VER for CUDA host compiler
2024-02-18 16:21:52 -05:00
Georgi Gerganov
b1de96824b
ci : fix wikitext url + compile warnings ( #5569 )
...
ggml-ci
2024-02-18 22:39:30 +02:00
Georgi Gerganov
d2819d5577
scripts : add helpers script for bench comparing commits ( #5521 )
...
* scripts : add helpers script for bench comparing commits
* scripts : detect CUDA
* set flags after checking the command line
* fix make flags
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-02-16 15:14:40 +02:00
Georgi Gerganov
9350a1cf21
scripts : add hf.sh helper script ( #5501 )
...
* scripts : add hf.sh helper scripts
* hf : add error logs
* hf : add support for --repo and --file
2024-02-15 15:41:15 +02:00
Georgi Gerganov
3b169441df
sync : ggml ( #5452 )
...
* ggml-alloc : v3 (ggml/727)
* ggml-alloc v3
ggml-ci
* fix ci
ggml-ci
* whisper : check for backend buffer allocation failures
* whisper : avoid leaks when initialization fails
* cleanup
ggml-ci
* style fixes
ggml-ci
* sync : ggml
* update llama.cpp, clip.cpp, export-lora.cpp
* update finetune.cpp, train-text-from-scratch.cpp
ggml-ci
* ggml-backend : reduce alignment to 32 to match gguf and fix mmap
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-02-12 09:16:06 +02:00
Georgi Gerganov
cd9aea63b5
scripts : update sync scripts with new backends
2024-02-10 09:53:05 +02:00
Georgi Gerganov
43b65f5eb8
sync : ggml
2024-02-10 09:30:36 +02:00
Georgi Gerganov
30679d438d
scripts : fix typos, cleanup ( #5303 )
2024-02-05 09:48:03 +02:00
Нияз Гарифзянов
4be04c8965
scripts : add non-interactive server-llm.sh ( #5303 )
...
* Update server-llm.sh
Add a --non-interactive flag that allows running the script without asking for permission
* Update scripts/server-llm.sh
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-05 09:43:57 +02:00
Georgi Gerganov
e437b37fd0
scripts : parse wtype in server-llm.sh ( #5167 )
...
* scripts : parse wtype in server-llm.sh
* scripts : fix check for wfile
2024-02-02 14:23:40 +02:00
Neo Zhang Jianyu
01684139c3
support SYCL backend windows build ( #5208 )
...
* support SYCL backend windows build
* add windows build in CI
* add for win build CI
* correct install oneMKL
* fix install issue
* fix ci
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix win build
* fix win build
* fix win build
* restore other CI part
* restore as base
* rm no new line
* fix no new line issue, add -j
* fix grammar issue
* allow to trigger manually, fix format issue
* fix format
* add newline
* fix format
* fix format
* fix format issue
---------
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
2024-01-31 08:08:07 +05:30
Georgi Gerganov
8f8ddfcfad
sync : ggml ( #0 )
2024-01-30 16:21:57 +02:00
Georgi Gerganov
35dec26cc2
sync : ggml
2024-01-28 19:48:05 +02:00
Georgi Gerganov
753eafed0e
sync : ggml
2024-01-27 17:00:24 +02:00
Georgi Gerganov
5f1925a8ce
scripts : move run-with-preset.py from root to scripts folder
2024-01-26 17:09:44 +02:00
crasm
413e7b0559
ci : add model tests + script wrapper ( #4586 )
...
* scripts : add lib.sh and lib_test.sh
* scripts : stub out new ci-run.sh script
* scripts : switch to PascalCase for functions
This looks a little odd at first, but I find it very useful as a
convention to know if a command is part of our code vs a builtin.
* scripts : add some fancy conversion from snake_case to PascalCase
* Add venv to ci/run.sh
* Revert scripts work
* scripts : add wrapper script for local use of ci/run.sh
* Simplify .gitignore for tests, clang-tidy fixes
* Label all ctest tests
* ci : ctest uses -L main
* Attempt at writing ctest_with_model
* Update test-model-load-cancel
* ci : add ctest_with_model for debug and release
ggml-ci
* Fix gg_get_model function
ggml-ci
* got stuck on CMake
* Add get_model.cpp to tests/CMakeLists.txt
ggml-ci
* Fix README.md output for ctest_with_model
ggml-ci
* workflows : use `-L main` for all ctest
ggml-ci
* Fixes
* GG_RUN_CTEST_MODELFILE => LLAMACPP_TESTMODELFILE
* Always show warning rather than failing if model file variable is not
set
* scripts : update usage text for ci-run.sh
2024-01-26 14:18:00 +02:00
Georgi Gerganov
e9240cdfa0
scripts : add get-winogrande.sh
2024-01-18 20:45:39 +02:00
Georgi Gerganov
dcad445d0c
scripts : add helper script to get hellaswag data in txt format
2024-01-18 11:44:49 +02:00
Georgi Gerganov
6b6916b215
sync : ggml
2024-01-17 20:54:50 +02:00
Georgi Gerganov
9408cfdad6
scripts : sync-ggml-am.sh option to skip commits
2024-01-14 11:08:41 +02:00
Georgi Gerganov
76484fbfd3
sync : ggml
2024-01-14 00:14:46 +02:00
Johannes Gäßler
7dc78764e2
compare-llama-bench: tweak output format ( #4910 )
2024-01-13 15:52:53 +01:00
Georgi Gerganov
de473f5f8e
sync : ggml
2024-01-12 22:02:43 +02:00
Georgi Gerganov
64802ec00d
sync : ggml
2024-01-11 09:39:08 +02:00
Johannes Gäßler
4f56458d34
Python script to compare commits with llama-bench ( #4844 )
2024-01-10 01:04:33 +01:00
Georgi Gerganov
9a818f7c42
scripts : improve get-pg.sh ( #4838 )
2024-01-09 19:21:13 +02:00
Georgi Gerganov
d9653894df
scripts : script to get Paul Graham essays in txt format ( #4838 )
2024-01-09 16:23:05 +02:00
Georgi Gerganov
91d38876df
metal : switch back to default.metallib (ggml/681)
...
ggml-ci
2024-01-05 18:02:06 +02:00