Yu, Zijun
8ce5cc597a
Add cgraph tensor output name to OV op name
2026-01-15 10:20:18 -08:00
Yu, Zijun
fd324366d0
Update build doc
2026-01-15 10:20:18 -08:00
Yu, Zijun
d7cc802292
PERF: use Slice+Concat in writing cache_v
2026-01-15 10:20:18 -08:00
Yu, Zijun
8ac5c225aa
FIX: set_max_token_len
2026-01-15 10:20:18 -08:00
Yu, Zijun
a30dc6e726
PERF: add weight constant in parallel
2026-01-15 10:20:18 -08:00
Yu, Zijun
c57f61494a
FIX: input shape of KQ_mask
2026-01-15 10:20:18 -08:00
Yu, Zijun
041d220dfa
FIX: Re-add tensor names in cgraph, Add another case for RESHAPE
2026-01-15 10:20:13 -08:00
Yu, Zijun
0d505b4e56
STYLE and minor REFACTOR
2026-01-15 10:10:00 -08:00
Yu, Zijun
cdf5370cb5
PERF: favor low precision matmul
2026-01-15 10:10:00 -08:00
Yu, Zijun
0d009fe61a
FEAT: Add all conversion code from ov side
2026-01-15 10:10:00 -08:00
Yu, Zijun
f15a2cc057
STYLE: clang-format
2026-01-15 10:10:00 -08:00
Yu, Zijun
a0b30529bf
FIX: backend buffer type issue
2026-01-15 10:10:00 -08:00
Zijun Yu
4c905b2b25
fix build error
2026-01-15 10:10:00 -08:00
Viraj Wadhwa
ffabe95e2a
Rebase - Bring up to date and fix build process
2026-01-15 10:09:23 -08:00
Yu, Zijun
a8e5efa44e
PERF: compile once (dynamic graph + cache)
2026-01-15 10:05:41 -08:00
Yu, Zijun
7d5e234254
FEAT: improve debug capability
2026-01-15 10:05:41 -08:00
Yu, Zijun
0a8cc9ab03
BUILD: update build doc, add cmake preset, add CACHE_DIR env var
2026-01-15 10:05:41 -08:00
Yu, Zijun
d3bdca25bd
PERF: share const nodes for weights for diff infer
2026-01-15 10:05:41 -08:00
Yu, Zijun
96ba47dd43
STYLE: minor refactor
2026-01-15 10:05:41 -08:00
Yu, Zijun
c04966cda6
REFACTOR: support weights as constant
2026-01-15 10:05:41 -08:00
Yu, Zijun
0c7b026ecc
FEAT: Add interleaved mode for ROPE
2026-01-15 10:05:41 -08:00
Yu, Zijun
6ed44a3dff
FEAT: do PERMUTE eagerly
2026-01-15 10:05:41 -08:00
Yu, Zijun
8b408869ae
Arbitrary token len (>32) work; Fix bug in mulmat
2026-01-15 10:05:41 -08:00
Yu, Zijun
8d263bd6a5
2nd+ token correct by fixing CPY in OV; remove single-op backend compute code
2026-01-15 10:05:41 -08:00
Yu, Zijun
91d2a195b5
change op mappings to list in openvino_supports_op
2026-01-15 10:05:41 -08:00
Yu, Zijun
651b2c06cb
* Use find_package in CMake to configure OpenVINO
...
* Remove OPENVINO_OP_DEBUG
* Simplify set_input_output in decoder
* Fix CPY in set_input_output
* Use params from converted ov model in setting input
2026-01-15 10:05:41 -08:00
zhanmyz
84be5c6f15
1. Delete some comments
...
2. Process Prompt and predict first token is OK
2026-01-15 10:05:41 -08:00
zhanmyz
eac9a99530
1. Solve the AC issue of Permute+VIEW and the MUL_MAT issue in the phase of “1. Process Prompt and predict the first token”.
...
2. There is still an AC issue in the "2. Predict the subsequent tokens" phase, and it is being debugged.
A deviation has been detected in the computation of OpenVINO's CPY Node at stage 2, and it is currently being fixed.
2026-01-15 10:05:41 -08:00
zhanmyz
8ae700ae11
Process Prompt and predict first token is OK
2026-01-15 10:05:41 -08:00
zhanmyz
8020138406
add debug info
2026-01-15 10:05:41 -08:00
zhanmyz
b02265a507
1. In the Prompt process and predict first token stage, the PERMUTE node needs to be integrated into the OV Frontend
...
2. In the predict latest token stage, the VIEW, CONT, Reshape need to be integrated into the OV Frontend.
2026-01-15 10:05:41 -08:00
zhanmyz
19ec9b6bf5
Try to add VIEW node to OV Frontend and have some issues that need to be dealt with
2026-01-15 10:05:41 -08:00
zhanmyz
b14b49d5f6
Minor Update
2026-01-15 10:05:41 -08:00
zhanmyz
467a5ddf04
1. Update the implementation of CPY node when it's non-contiguous
...
2. Remove duplicate get node operation function
2026-01-15 10:05:41 -08:00
zhanmyz
cff473a9e2
1. All operators implemented using OpenVINO can be successfully executed individually.
...
2. VIEW op output tensor shape does not match the CONT (non-contiguous) input tensor shape
3. CPY (non-contiguous) can't be implemented with the original input/output tensor shape and data (the original shape needs to change when creating the input/output tensor)
Currently, the VIEW op is executed in the ggml backend and the others are executed in the OpenVINO Frontend.
2026-01-15 10:05:41 -08:00
zhanmyz
e08a7fda33
All adjacent ops can be converted, but the calculation result is wrong and needs debugging
2026-01-15 10:05:41 -08:00
zhanmyz
d05c458421
change CONT and MULMAT input node shape
2026-01-15 10:05:41 -08:00
zhanmyz
246a2d1021
Change the input and output node shape of MUL_MAT operator
2026-01-15 10:05:41 -08:00
zhanmyz
f37fa21a5c
Change the input and output node shape of MUL_MAT operator
2026-01-15 10:05:41 -08:00
zhanmyz
f98d215162
Change the input parameter shape of CONT operator
2026-01-15 10:05:41 -08:00
zhanmyz
9a7b7d8d6d
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT/ROPE/SCALE/SOFTMAX/ADD adjacent op graph conversion
2026-01-15 10:05:41 -08:00
zhanmyz
95ae982d59
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT graph conversion of consecutive OPs
2026-01-15 10:05:41 -08:00
zhanmyz
901f7347ff
Executing CONT & VIEW operators in OV Frontend is OK
2026-01-15 10:05:41 -08:00
zhanmyz
081b52667b
Executing a single CONT operator is OK
2026-01-15 10:05:41 -08:00
zhanmyz
afb8594194
add tmp source code files
2026-01-15 10:05:41 -08:00
zhanmyz
57582fda39
add implementation of CPY when the output tensor is non-contiguous
2026-01-15 10:05:41 -08:00
zhanmyz
8484769981
add implementation of MUL_MAT, CPY, CONT of GGML ops using OV ops
2026-01-15 10:05:41 -08:00
zhanmyz
cb2729bc4a
Move CPY from GGML OV Backend to OV Frontend
2026-01-15 10:05:41 -08:00
zhanmyz
2b04bd43be
Add MUL_MAT,CPY,CONT as operators implemented in OpenVINO for GGML backend
2026-01-15 10:05:41 -08:00
zhanmyz
0f7d07de7d
Add support for RMS_NORM OP
2026-01-15 10:05:41 -08:00