Commit Graph

23 Commits

Author SHA1 Message Date
Yu, Zijun 44f4cf34b1 Fix Phi3 ROPE; Add test-backend-ops 2026-01-15 10:26:28 -08:00
Yu, Zijun d61f83c9b7 Fix CPY due to cgraph change 2026-01-15 10:23:35 -08:00
Yu, Zijun a80da69448 Pull out sin cos from rope 2026-01-15 10:20:18 -08:00
Yu, Zijun 3533c14cf6 Fix Phi3 SwiGLU and SoftMax 2026-01-15 10:20:18 -08:00
Yu, Zijun 0fa7a5efef Refactor: remove past_token_len from extra_inputs 2026-01-15 10:20:18 -08:00
Yu, Zijun acf358d1ce Pull out indices creation for kv cache update 2026-01-15 10:20:18 -08:00
Yu, Zijun bf5414c95e Replace Concat with Broadcast in MulMat for GQA 2026-01-15 10:20:18 -08:00
Yu, Zijun ebc4fc9f95 Fuse to SDPA 2026-01-15 10:20:18 -08:00
Yu, Zijun 73ee84fffe Add SwiGLU 2026-01-15 10:20:18 -08:00
Yu, Zijun 4c582ac7a3 Stateful transformation for CPU GPU 2026-01-15 10:20:18 -08:00
Yu, Zijun 8afee795ad Update clang-format 2026-01-15 10:20:18 -08:00
Yu, Zijun 593484ce5f Refactor: clean, fix warning 2026-01-15 10:20:18 -08:00
Yu, Zijun 592d7f8bbb Change due to ggml cgraph changes, llama-3.2 CPU work 2026-01-15 10:20:18 -08:00
Yu, Zijun f7ad77930e Change due to ggml cgraph changes, not correct yet 2026-01-15 10:20:18 -08:00
Yu, Zijun d9ca8f5dbe NPU support version 2: prefill + kvcache 2026-01-15 10:20:18 -08:00
Yu, Zijun 34531abce4 draft NPU support version 2: prefill + kvcache 2026-01-15 10:20:18 -08:00
Yu, Zijun 7fec223334 Add initial NPU support 2026-01-15 10:20:18 -08:00
Yu, Zijun 8ce5cc597a Add cgraph tensor output name to OV op name 2026-01-15 10:20:18 -08:00
Yu, Zijun d7cc802292 PERF: use Slice+Concat in writing cache_v 2026-01-15 10:20:18 -08:00
Yu, Zijun 041d220dfa FIX: Re-add tensor names in cgraph, Add another case for RESHAPE 2026-01-15 10:20:13 -08:00
Yu, Zijun 0d505b4e56 STYLE and minor REFACTOR 2026-01-15 10:10:00 -08:00
Yu, Zijun cdf5370cb5 PERF: favor low precision matmul 2026-01-15 10:10:00 -08:00
Yu, Zijun 0d009fe61a FEAT: Add all conversion code from ov side 2026-01-15 10:10:00 -08:00