Commit Graph

8 Commits

Author SHA1 Message Date
Yu, Zijun 65e1b1af6d Fix after rebasing
- Layout of cache k and cache v is unified: [seq, n_head, head_size]
- Add CPY and FLASH_ATTN_EXT, flash attn is not used yet
- Skip test-backend-ops due to flash attn test crash
- Add mutex around graph conversion to avoid test-thread-safety failures in the future
- Update NPU config
- Update GPU config to disable SDPA opt to make phi-3 run
2026-01-15 11:19:15 -08:00
Yu, Zijun 839f8c66a0 Remove CPY 2026-01-15 11:19:15 -08:00
Yu, Zijun 63d000ba40 Support op SET_ROWS 2026-01-15 11:19:15 -08:00
Yu, Zijun 73ee84fffe Add SwiGLU 2026-01-15 10:20:18 -08:00
Yu, Zijun 8afee795ad Update clang-format 2026-01-15 10:20:18 -08:00
Yu, Zijun 593484ce5f Refactor: clean, fix warning 2026-01-15 10:20:18 -08:00
Yu, Zijun 0d505b4e56 STYLE and minor REFACTOR 2026-01-15 10:10:00 -08:00
Yu, Zijun 0d009fe61a FEAT: Add all conversion code from ov side 2026-01-15 10:10:00 -08:00