Yu, Zijun | 44f4cf34b1 | Fix Phi3 ROPE; Add test-backend-ops | 2026-01-15 10:26:28 -08:00
Yu, Zijun | d61f83c9b7 | Fix CPY due to cgraph change | 2026-01-15 10:23:35 -08:00
Yu, Zijun | a80da69448 | Pull out sin cos from rope | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 3533c14cf6 | Fix Phi3 SwiGLU and SoftMax | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 0fa7a5efef | Refactor: remove past_token_len from extra_inputs | 2026-01-15 10:20:18 -08:00
Yu, Zijun | acf358d1ce | Pull out indices creation for kv cache update | 2026-01-15 10:20:18 -08:00
Yu, Zijun | bf5414c95e | Replace Concat with Broadcast in MulMat for GQA | 2026-01-15 10:20:18 -08:00
Yu, Zijun | ebc4fc9f95 | Fuse to SDPA | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 73ee84fffe | Add SwiGLU | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 4c582ac7a3 | Stateful transformation for CPU GPU | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 8afee795ad | Update clang-format | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 593484ce5f | Refactor: clean, fix warning | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 592d7f8bbb | Change due to ggml cgraph changes, llama-3.2 CPU work | 2026-01-15 10:20:18 -08:00
Yu, Zijun | f7ad77930e | Change due to ggml cgraph changes, not correct yet | 2026-01-15 10:20:18 -08:00
Yu, Zijun | d9ca8f5dbe | NPU support version 2: prefill + kvcache | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 34531abce4 | draft NPU support version 2: prefill + kvcache | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 7fec223334 | Add initial NPU support | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 8ce5cc597a | Add cgraph tensor output name to OV op name | 2026-01-15 10:20:18 -08:00
Yu, Zijun | d7cc802292 | PERF: use Slice+Concat in writing cache_v | 2026-01-15 10:20:18 -08:00
Yu, Zijun | 041d220dfa | FIX: Re-add tensor names in cgraph, Add another case for RESHAPE | 2026-01-15 10:20:13 -08:00
Yu, Zijun | 0d505b4e56 | STYLE and minor REFACTOR | 2026-01-15 10:10:00 -08:00
Yu, Zijun | cdf5370cb5 | PERF: favor low precision matmul | 2026-01-15 10:10:00 -08:00
Yu, Zijun | 0d009fe61a | FEAT: Add all conversion code from ov side | 2026-01-15 10:10:00 -08:00