llama.cpp

Commit Graph

Author	SHA1	Message	Date
Yu, Zijun	6ed44a3dff	FEAT: do PERMUTE eagerly	2026-01-15 10:05:41 -08:00
Yu, Zijun	8b408869ae	Arbitrary token len (>32) work; Fix bug in mulmat	2026-01-15 10:05:41 -08:00
Yu, Zijun	8d263bd6a5	2nd+ token correct by fix CPY in OV, remove single op backend compute code	2026-01-15 10:05:41 -08:00
Yu, Zijun	91d2a195b5	change op mappings to list in openvino_supports_op	2026-01-15 10:05:41 -08:00
Yu, Zijun	651b2c06cb	* Use find_package in CMake to configure OpenVINO * Remove OPENVINO_OP_DEBUG * Simplify set_input_output in decoder * Fix CPY in set_input_output * Use params from converted ov model in setting input	2026-01-15 10:05:41 -08:00
zhanmyz	84be5c6f15	1. Delete some comments 2. Process Prompt and predict first token is OK	2026-01-15 10:05:41 -08:00
zhanmyz	eac9a99530	1. Solve the AC issue of Permute+VIEW and MULMAT issue in the phase of "1. Process Prompt and predict the first token". 2. There is still an AC issue in the "2. Predict the subsequent tokens" phase and it is being debugged. A deviation has been detected in the computation of OpenVINO's CPY Node at stage 2, and it is currently being fixed.	2026-01-15 10:05:41 -08:00
zhanmyz	8ae700ae11	Process Prompt and predict first token is OK	2026-01-15 10:05:41 -08:00
zhanmyz	8020138406	add debug info	2026-01-15 10:05:41 -08:00
zhanmyz	b02265a507	1. In the Prompt process and predict first token stage, the PERMUTE node needs to be integrated into the OV Frontend 2. In the predict latest token stage, the VIEW, CONT, Reshape need to be integrated into the OV Frontend.	2026-01-15 10:05:41 -08:00
zhanmyz	19ec9b6bf5	Try to add VIEW node to OV Frontend and have some issues that need to be dealt with	2026-01-15 10:05:41 -08:00
zhanmyz	b14b49d5f6	Minor Update	2026-01-15 10:05:41 -08:00
zhanmyz	467a5ddf04	1. Update the implementation of CPY node when it's non-contiguous 2. Remove duplicate get node operation function	2026-01-15 10:05:41 -08:00
zhanmyz	cff473a9e2	1. All operators implemented using OpenVINO can be successfully executed individually. 2. VIEW op output tensor shape is not same with CONT(non-contiguous) input tensor shape 3. CPY(non-contiguous) can't be implemented with original input/output tensor shape and data(need change the original shape when create input/output tensor) Currently. VIEW op executed in the ggml backend and others executed in the OpenVINO Frontend.	2026-01-15 10:05:41 -08:00
zhanmyz	e08a7fda33	All adjacent ops can be converted but the calculation result is wrong and needs debugging	2026-01-15 10:05:41 -08:00
zhanmyz	d05c458421	change CONT and MULMAT input node shape	2026-01-15 10:05:41 -08:00
zhanmyz	246a2d1021	Change the input and output node shape of MUL_MAT operator	2026-01-15 10:05:41 -08:00
zhanmyz	f37fa21a5c	Change the input and output node shape of MUL_MAT operator	2026-01-15 10:05:41 -08:00
zhanmyz	f98d215162	Change the input parameter shape of CONT operator	2026-01-15 10:05:41 -08:00
zhanmyz	9a7b7d8d6d	OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT/ROPE/SCALE/SOFTMAX/ADD adjacent op graph conversion	2026-01-15 10:05:41 -08:00
zhanmyz	95ae982d59	OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT graph conversion of consecutive OPs	2026-01-15 10:05:41 -08:00
zhanmyz	901f7347ff	Execute CONT & VIEW operators in OV Frontend is OK	2026-01-15 10:05:41 -08:00
zhanmyz	081b52667b	Execute single CONT operator is OK	2026-01-15 10:05:41 -08:00
zhanmyz	afb8594194	add tmp source code files	2026-01-15 10:05:41 -08:00
zhanmyz	57582fda39	add implementation of CPY when the output tensor is non-contiguous	2026-01-15 10:05:41 -08:00
zhanmyz	8484769981	add implementation of MUL_MAT, CPY, CONT of GGML ops using OV ops	2026-01-15 10:05:41 -08:00
zhanmyz	cb2729bc4a	Move CPY from GGML OV Backend to OV Frontend	2026-01-15 10:05:41 -08:00
zhanmyz	2b04bd43be	Add MUL_MAT,CPY,CONT as operators implemented in OpenVINO for GGML backend	2026-01-15 10:05:41 -08:00
zhanmyz	0f7d07de7d	Add support for RMS_NORM OP	2026-01-15 10:05:41 -08:00
yumengbo	2353c73f53	Support ROPE op.	2026-01-15 10:05:41 -08:00
yumengbo	8aba03bac6	Support Softmax op	2026-01-15 10:05:41 -08:00
yumengbo	d218c61e6d	Support Softmax op	2026-01-15 10:05:41 -08:00
yumengbo	590f587b27	Add support for UNARY SILU op . Fix pytorch impl bugs.	2026-01-15 10:05:41 -08:00
yumengbo	b100f89bad	Change to implementation following pytorch frontend	2026-01-15 10:05:41 -08:00
yumengbo	e95f29cbc0	Fix issue for output memory copy of infer request	2026-01-15 10:05:41 -08:00
zhanmyz	8c5a609f8d	add the rms_norm operator implemented using OpenVINO to the GGML backend of llama.cpp	2026-01-15 10:05:41 -08:00
zhanmyz	80c330a469	Update build.md and add operation mapping(GGML to OpenVINO)	2026-01-15 10:05:41 -08:00
zhanmyz	49804f43fc	add GET_ROWS operator of OpenVINO to GGML of llama.cpp	2026-01-15 10:05:41 -08:00
yumengbo	5b46dc23be	Change output for infer request to set output tensor. Support scale, view op.	2026-01-15 10:05:41 -08:00
yumengbo	31bd816426	Add GGML_OV_FRONTEND option. Add readme.	2026-01-15 10:05:41 -08:00
yumengbo	9b7b63d12c	Convert subgraph with add, sub, mul, div op to ov model and do infer on openvino device	2026-01-15 10:05:41 -08:00
yumengbo	34e826ac14	Implement GgmlOvDecoder. Add dump functions.	2026-01-15 10:05:41 -08:00
yumengbo	171c4681f4	Add PoC of integration of openvino frontend. Main changes: ggml-ov-frontend-utils, GraphIterator, Decoder	2026-01-15 10:05:41 -08:00
zhanmyz	ee31dc1c1b	add get openvino available ops function	2026-01-15 10:05:41 -08:00
zhanmyz	77d68146a8	add OpenVINO frontend convert process steps	2026-01-15 10:05:41 -08:00
zhanmyz	0a81aa19f7	Add compile options	2026-01-15 10:05:40 -08:00
zhanmyz	adc2c70f44	Add OpenVINO MUL operator to GGML of Llama.cpp.	2026-01-15 10:05:40 -08:00
zhanmyz	faa4a7de76	Solve the issue of abnormal model output caused by using OpenVINO ADD operator	2026-01-15 10:05:40 -08:00
zhanmyz	9b9d51dddf	* Configure the device(default CPU) that uses OpenVINO to compile the model * Add OpenVINO ADD operator to Llama.cpp. The output is somewhat abnormal and needs further debugging.	2026-01-15 10:05:40 -08:00
zhanmyz	5294402b50	add openvino as optional backend for Llama.cpp ggml	2026-01-15 10:05:40 -08:00


7800 Commits (All Branches)