Georgi Gerganov
66a66a05a8
readme : add notice about new file format
...
ggml-ci
2023-08-21 22:42:14 +03:00
Georgi Gerganov
1e7a0092dd
Merge branch 'master' into gguf
...
ggml-ci
2023-08-21 16:28:30 +03:00
Adrian
2d8b76a110
Add link to clojure bindings to Readme. ( #2659 )
2023-08-18 21:39:22 +02:00
Georgi Gerganov
7af633aec3
readme : incoming BREAKING CHANGE
2023-08-18 17:48:31 +03:00
Georgi Gerganov
38016ed9ec
Merge branch 'master' into gguf
2023-08-18 15:21:48 +03:00
mdrokz
eaf98c2649
readme : add link to Rust bindings ( #2656 )
2023-08-18 13:17:58 +03:00
Georgi Gerganov
856afff746
Merge branch 'master' into gguf
2023-08-18 12:38:05 +03:00
Johannes Gäßler
0992a7b8b1
README: fix LLAMA_CUDA_MMV_Y documentation ( #2647 )
2023-08-17 23:57:59 +02:00
Henri Vasserman
6ddeefad9b
[Zig] Fixing Zig build and improvements ( #2554 )
...
* Fix zig after console.o was split
* Better include and flag management
* Change LTO to option
2023-08-17 23:11:18 +03:00
Georgi Gerganov
dd9e2fc988
ci : update ".bin" to ".gguf" extension
...
ggml-ci
2023-08-17 19:32:14 +03:00
Johannes Gäßler
25d43e0eb5
CUDA: tuned mul_mat_q kernels ( #2546 )
2023-08-09 09:42:34 +02:00
ldwang
220d931864
readme : add Aquila-7B model series to supported models ( #2487 )
...
* support bpe tokenizer in convert
Signed-off-by: ldwang <ftgreat@gmail.com>
* support bpe tokenizer in convert
Signed-off-by: ldwang <ftgreat@gmail.com>
* support bpe tokenizer in convert, fix
Signed-off-by: ldwang <ftgreat@gmail.com>
* Add Aquila-7B models in README.md
Signed-off-by: ldwang <ftgreat@gmail.com>
* Up Aquila-7B models in README.md
Signed-off-by: ldwang <ftgreat@gmail.com>
---------
Signed-off-by: ldwang <ftgreat@gmail.com>
Co-authored-by: ldwang <ftgreat@gmail.com>
2023-08-02 11:21:11 +03:00
Yiming Cui
a312193e18
readme : Add Chinese LLaMA-2 / Alpaca-2 to supported models ( #2475 )
...
* add support for chinese llama-2 / alpaca-2
* remove white spaces
2023-08-02 09:18:31 +03:00
Johannes Gäßler
0728c5a8b9
CUDA: mmq CLI option, fixed mmq build issues ( #2453 )
2023-07-31 15:44:35 +02:00
Johannes Gäßler
11f3ca06b8
CUDA: Quantized matrix matrix multiplication ( #2160 )
...
* mmq implementation for non k-quants
* q6_K
* q2_K
* q3_k
* q4_K
* vdr
* q5_K
* faster q8_1 loading
* loop unrolling
* add __restrict__
* q2_K sc_high
* GGML_CUDA_MMQ_Y
* Updated Makefile
* Update Makefile
* DMMV_F16 -> F16
* Updated README, CMakeLists
* Fix CMakeLists.txt
* Fix CMakeLists.txt
* Fix multi GPU out-of-bounds
2023-07-29 23:04:44 +02:00
niansa/tuxifan
edcc7ae7d2
Obtaining LLaMA 2 instructions ( #2308 )
...
* Obtaining LLaMA 2 instructions
* Removed sharing warning for LLaMA 2
* Linked TheBloke's GGML repos
* Add LLaMA 2 to list of supported models
* Added LLaMA 2 usage instructions
* Added links to LLaMA 2 70B models
2023-07-28 03:14:11 +02:00
Johannes Gäßler
70d26ac388
Fix __dp4a documentation ( #2348 )
2023-07-23 17:49:06 +02:00
Jose Maldonado
91171b8072
make : fix CLBLAST compile support in FreeBSD ( #2331 )
...
* Fix Makefile for CLBLAST compile support and instructions for compiling llama.cpp on FreeBSD
* More general use-case for CLBLAST support (Linux and FreeBSD)
2023-07-23 14:52:08 +03:00
wzy
78a3d13424
flake : remove intel mkl from flake.nix due to missing files ( #2277 )
...
NixOS's mkl package is missing some files, such as mkl-sdl.pc. See #2261
NixOS also does not currently provide the Intel C compiler (icx, icpx). See https://discourse.nixos.org/t/packaging-intel-math-kernel-libraries-mkl/975
So remove mkl from flake.nix.
Some minor changes:
- Change pkgs.python310 to pkgs.python3 to track the latest version
- Add pkgconfig to devShells.default
- Remove installPhase because we have `cmake --install` from #2256
2023-07-21 13:26:34 +03:00
wzy
45a1b07e9b
flake : update flake.nix ( #2270 )
...
When `isx86_32 || isx86_64`, use mkl; otherwise use openblas.
Add -DCMAKE_SKIP_BUILD_RPATH=ON, following
https://discourse.nixos.org/t/rpath-of-binary-contains-a-forbidden-reference-to-build/12200/3
Fix #2261: Nix doesn't provide mkl-sdl.pc.
When building with -DBUILD_SHARED_LIBS=ON and -DLLAMA_BLAS_VENDOR=Intel10_lp64,
replace mkl-sdl.pc with mkl-dynamic-lp64-iomp.pc.
2023-07-19 10:01:55 +03:00
Jiří Podivín
27ab66e437
py : turn verify-checksum-models.py into executable ( #2245 )
...
README.md was adjusted to reflect the change.
Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
2023-07-16 22:54:47 +03:00
Chad Brewbaker
917831c63a
readme : fix zig build instructions ( #2171 )
2023-07-11 19:03:06 +03:00
Evan Miller
5656d10599
mpi : add support for distributed inference via MPI ( #2099 )
...
* MPI support, first cut
* fix warnings, update README
* fixes
* wrap includes
* PR comments
* Update CMakeLists.txt
* Add GH workflow, fix test
* Add info to README
* mpi : trying to move more MPI stuff into ggml-mpi (WIP) (#2099 )
* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()
* mpi : move all MPI logic into ggml-mpi
Not tested yet
* mpi : various fixes - communication now works but results are wrong
* mpi : fix output tensor after MPI compute (still not working)
* mpi : fix inference
* mpi : minor
* Add OpenMPI to GH action
* [mpi] continue-on-error: true
* mpi : fix after master merge
* [mpi] Link MPI C++ libraries to fix OpenMPI
* tests : fix new llama_backend API
* [mpi] use MPI_INT32_T
* mpi : factor out recv / send in functions and reuse
* mpi : extend API to allow usage with outer backends (e.g. Metal)
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-10 18:49:56 +03:00
JackJollimore
18780e0a5e
readme : update Termux instructions ( #2147 )
...
The file path matters when running models inside Termux on Android devices: llama.cpp performs better when the .bin file is loaded from the $HOME directory.
2023-07-09 11:20:43 +03:00
rankaiyx
2492a53fd0
readme : add more docs indexes ( #2127 )
...
* Update README.md to add more docs indexes
* Update README.md to add more docs indexes
2023-07-09 10:38:42 +03:00
dylan
84525e7962
docker : add support for CUDA in docker ( #1461 )
...
Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-07 21:25:25 +03:00
Judd
36680f6e40
convert : update for baichuan ( #2081 )
...
1. guess n_layers;
2. relax warnings on context size;
3. add a note that its derivatives are also supported.
Co-authored-by: Judd <foldl@boxvest.com>
2023-07-06 19:23:49 +03:00
Johannes Gäßler
924dd22fd3
Quantized dot products for CUDA mul mat vec ( #2067 )
2023-07-05 14:19:42 +02:00
Georgi Gerganov
b472f3fca5
readme : add link to web chat PR
2023-07-04 22:25:22 +03:00
Judd
471aab6e4c
convert : add support of baichuan-7b ( #2055 )
...
Co-authored-by: Judd <foldl@boxvest.com>
2023-07-01 20:00:25 +03:00
Roman Parykin
d38e451578
readme : add Scala 3 bindings repo ( #2010 )
2023-06-26 22:47:59 +03:00
Gustavo Rocha Dias
aa777abbb7
readme : LD_LIBRARY_PATH complement for some Android devices when building with CLBlast inside Termux ( #2007 )
...
* docs - Alternative way to build on Android, with CLBlast.
* doc - LD_LIBRARY_PATH complement for some Android devices when building with CLBlast inside Termux.
* doc - fix typo
2023-06-26 22:34:45 +03:00
Georgi Gerganov
412c60e473
readme : add link to new k-quants for visibility
2023-06-26 19:45:09 +03:00
Georgi Gerganov
447ccbe8c3
readme : add new roadmap + manifesto
2023-06-25 16:08:12 +03:00
Georgi Gerganov
66a2555ba6
readme : add Azure CI discussion link
2023-06-25 09:07:03 +03:00
Georgi Gerganov
11da1a85cd
readme : fix whitespaces
2023-06-24 13:38:18 +03:00
Alberto
235b610d65
readme : fixed termux instructions ( #1973 )
2023-06-24 13:32:13 +03:00
eiery
d7b7484f74
Add OpenLLaMA instructions to the README ( #1954 )
...
* add openllama to readme
2023-06-23 10:38:01 +02:00
Rahul Vivek Nair
fb98254f99
Fix typo in README.md ( #1961 )
2023-06-21 23:48:43 +02:00
Georgi Gerganov
049aa16b8c
readme : add link to p1
2023-06-20 19:05:54 +03:00
Xiake Sun
2322ec223a
Fix typo ( #1949 )
2023-06-20 15:42:40 +03:00
Johannes Gäßler
16b9cd1939
Convert vector to f16 for dequantize mul mat vec ( #1913 )
...
* Convert vector to f16 for dmmv
* compile option
* Added compilation option description to README
* Changed cmake CUDA_ARCHITECTURES from "OFF" to "native"
2023-06-19 10:23:56 +02:00
Mike
e1886cf4fe
readme : update Android build instructions ( #1922 )
...
Add steps for using Termux on Android devices to prevent common errors.
2023-06-18 11:28:26 +03:00
Johannes Gäßler
2c9380dd2f
Only one CUDA stream per device for async compute ( #1898 )
2023-06-17 19:15:02 +02:00
Gustavo Rocha Dias
bac19927c3
readme : alternative way to build for Android with CLBlast. ( #1828 )
2023-06-17 12:01:06 +03:00
Aisuko
059e99066d
doc : fix wrong address of BLIS.md ( #1772 )
...
Signed-off-by: Aisuko <urakiny@gmail.com>
2023-06-10 17:08:11 +03:00
Georgi Gerganov
4dc62c545d
readme : add June roadmap
2023-06-07 07:15:08 +03:00
Yuval Peled
f4c55d3bd7
docs : add performance troubleshoot + example benchmark documentation ( #1674 )
...
* test anchor link
* test table
* add benchmarks
* Add performance troubleshoot & benchmark
* add benchmarks
* remove unneeded line
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-05 23:32:36 +03:00
Foul-Tarnished
f1465624c2
readme : fix typo ( #1700 )
...
Fix a typo in a command in README.md
2023-06-05 23:28:37 +03:00
Georgi Gerganov
827f5eda91
readme : update hot topics
2023-06-04 23:38:19 +03:00