llama.cpp

Commit Graph

Author	SHA1	Message	Date
Xuejun Zhai	8ff73e5d53	Removed API m_outputs	2026-01-15 11:39:08 -08:00
Xuejun Zhai	111c96c266	Removed API get_output_ggml_tensor(const std::string & name)	2026-01-15 11:39:08 -08:00
Xuejun Zhai	ba852f2a60	Removed API GgmlOvDecoder::get_output_op_params(const std::string & name)	2026-01-15 11:39:08 -08:00
Xuejun Zhai	6d7a0d6047	Modified API GgmlOvDecoder::get_output_type(const std::string & name)	2026-01-15 11:39:08 -08:00
Xuejun Zhai	f516db1db5	remove unused API get_output_shape(const std::string & name)	2026-01-15 11:39:08 -08:00
Xuejun Zhai	497964afbb	remove unused API GgmlOvDecoder::get_output_names()	2026-01-15 11:39:08 -08:00
Yu, Zijun	8f4ee4eee2	minor update due to ov 2025.4	2026-01-15 11:39:08 -08:00
Xuejun Zhai	0ea8238ad0	remove unused API GgmlOvDecoder::get_output_stride(const std::string & name)	2026-01-15 11:39:08 -08:00
Yu, Zijun	2a9d4ca836	Refactor: split ov_graph_compute for dynamic and static	2026-01-15 11:39:08 -08:00
Yu, Zijun	808619e274	NPU support llama-perplexity -b 512 --no-warmup	2026-01-15 11:39:08 -08:00
Yu, Zijun	65348b5d20	fallback naive run with accuracy issue	2026-01-15 11:39:08 -08:00
Yu, Zijun	59e7e7c47d	NPU fix llama-bench	2026-01-15 11:39:08 -08:00
Yu, Zijun	38254cf592	NPU prefill chunking	2026-01-15 11:39:08 -08:00
XuejunZhai	992dea73fd	Fix error for naive	2026-01-15 11:39:08 -08:00
XuejunZhai	ae936519d2	Remove the second decoder for node. Moving the function into the model decoder	2026-01-15 11:39:05 -08:00
Arshath	4400b5cb4b	Update ggml-decoder.cpp	2026-01-15 11:38:13 -08:00
Arshath	98396b275a	Update ggml-decoder.cpp	2026-01-15 11:38:13 -08:00
Arshath	4a57b37d4d	Update ggml-decoder.cpp	2026-01-15 11:38:13 -08:00
Arshath	bed495226d	Update ggml-decoder.cpp	2026-01-15 11:38:13 -08:00
Arshath	11b4cc5a67	Update ggml-decoder.cpp	2026-01-15 11:38:13 -08:00
Arshath	047bfb5c90	Update ggml-decoder.cpp. Hitting error while compiling on Windows: error C3861: 'unsetenv': identifier not found. Reason: unsetenv() is a POSIX function; it doesn’t exist on Windows, so Visual Studio (MSVC) won’t recognize it. Proposed fix: use _putenv_s() (the Windows equivalent). It is supported by MSVC and achieves the same effect: it removes the environment variable from the process environment. This keeps cross-platform compatibility.	2026-01-15 11:38:07 -08:00
Yu, Zijun	531941b348	Fix NPU	2026-01-15 11:28:48 -08:00
Yu, Zijun	ae404f7cbb	Fix llama-bench	2026-01-15 11:28:48 -08:00
Yu, Zijun	072dde0b2b	change graph to 4d, support multi sequences	2026-01-15 11:28:48 -08:00
Yu, Zijun	ea2c99be1c	NPU unify PD (handled internally)	2026-01-15 11:28:48 -08:00
Yu, Zijun	303923aba7	Clean placeholders in ggml-openvino.cpp	2026-01-15 11:27:30 -08:00
Zijun Yu	b8690bc055	NPU Unify PD (#14): Stateless, fix llama-cli llama-server; Simplify broadcast op in attention; Replace get_output_tensor+memcpy with set_output_tensor; NPU unify PD, unify dynamic and static dims	2026-01-15 11:27:30 -08:00
Yu, Zijun	eba8113dc4	Style: middle ptr and ref align, omit optional struct keyword	2026-01-15 11:27:30 -08:00
Yu, Zijun	bd3093f90c	Style: use switch in supports_ops	2026-01-15 11:27:30 -08:00
Ravi Panchumarthy	3a1129e073	Update OV dockerfile to use OV2025.3 and update build docs	2026-01-15 11:27:30 -08:00
Ravi Panchumarthy	45af912b48	Update CI to run OV dep install before build	2026-01-15 11:27:30 -08:00
Ravi Panchumarthy	38e8a19f50	Apply CISC review and update CI to OV2025.3	2026-01-15 11:27:28 -08:00
Yu, Zijun	4c8406eb70	Add OV CI cache	2026-01-15 11:26:00 -08:00
Ravi Panchumarthy	841d673bd0	Update to OV-2025.3 and CMakeLists.txt	2026-01-15 11:26:00 -08:00
Yu, Zijun	2d2f00a41f	Fix llama-3-8b and phi3-mini q4_0 NPU	2026-01-15 11:26:00 -08:00
Yu, Zijun	299f4923bb	fix after rebasing	2026-01-15 11:26:00 -08:00
Yu, Zijun	8b82d1153b	Fix add_sliced_mask; Revert mulmat, softmax; Remove input attention_size, iSWA model not working	2026-01-15 11:26:00 -08:00
Yu, Zijun	a9371ea646	Fix llama-cli (need to run with --no-warmup)	2026-01-15 11:26:00 -08:00
cavusmustafa	05d7abae8c	Fix for Phi3	2026-01-15 11:26:00 -08:00
cavusmustafa	e7252920e1	env variable GGML_OPENVINO_DISABLE_SDPA_OPTIMIZATION added	2026-01-15 11:26:00 -08:00
cavusmustafa	c112bc4e73	kvcachefusion support	2026-01-15 11:26:00 -08:00
Yu, Zijun	973a80fd02	Always apply Eliminate_ZP to fix GPU compile issue on some platforms	2026-01-15 11:26:00 -08:00
Yu, Zijun	fdadca1e89	Fix after rebasing	2026-01-15 11:26:00 -08:00
Yu, Zijun	f3afa7b914	Requantize Q6_K (gs16) to gs32 on GPU	2026-01-15 11:26:00 -08:00
Yu, Zijun	e4bfe5a20d	Add Q5_K to support phi-3-q4_k_m	2026-01-15 11:26:00 -08:00
Yu, Zijun	2f1d50fb07	Minor refactor	2026-01-15 11:26:00 -08:00
Yu, Zijun	67e178a2f6	Minor: not add attention_size_swa for non-swa model	2026-01-15 11:26:00 -08:00
Yu, Zijun	1a38339cea	Fix ROPE accuracy when freq_scale != 1	2026-01-15 11:26:00 -08:00
Yu, Zijun	602f9ca4af	Fix NPU accuracy	2026-01-15 11:26:00 -08:00
Yu, Zijun	9de874cb7b	Support iSWA	2026-01-15 11:25:58 -08:00
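
The message of commit 047bfb5c90 above describes replacing the POSIX-only unsetenv() with _putenv_s() when building with MSVC. The actual change in ggml-decoder.cpp is not shown on this page, so the following is only a minimal sketch of that pattern. The helper names set_env_var and unset_env_var and the variable name GGML_OV_EXAMPLE_VAR are hypothetical and do not come from the repository.

```cpp
#include <cstdlib>

// Hypothetical helper (not from llama.cpp): set an environment variable
// for the current process, overwriting any existing value.
static bool set_env_var(const char * name, const char * value) {
#ifdef _WIN32
    return _putenv_s(name, value) == 0;
#else
    return setenv(name, value, 1) == 0;
#endif
}

// Hypothetical helper (not from llama.cpp): remove an environment variable
// from the current process in a way that compiles on both MSVC and POSIX.
static bool unset_env_var(const char * name) {
#ifdef _WIN32
    // On Windows, passing an empty string as the value removes the variable.
    return _putenv_s(name, "") == 0;
#else
    // unsetenv() is POSIX-only; MSVC reports C3861 if it is used directly.
    return unsetenv(name) == 0;
#endif
}
```

A caller could, for example, invoke set_env_var("GGML_OV_EXAMPLE_VAR", "1") before a code path that reads the variable and unset_env_var("GGML_OV_EXAMPLE_VAR") afterwards. Keeping the #ifdef inside one small helper means the platform difference does not spread to every call site.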

7938 Commits (All Branches)