Ravi Panchumarthy
|
2f99135ccc
|
Update build.md
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
43489bbfaa
|
Revert changes in fuse_to_sdpa
|
2026-01-15 10:26:28 -08:00 |
Cavus Mustafa
|
1a19566b23
|
add mark decomp pass
|
2026-01-15 10:26:28 -08:00 |
Cavus Mustafa
|
93b2d09a2d
|
mulmat type conversion update
|
2026-01-15 10:26:28 -08:00 |
Cavus Mustafa
|
e2fdc1b988
|
mulmat input conversion fix
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
01cdf4a9cc
|
matmul in fp32
|
2026-01-15 10:26:28 -08:00 |
Cavus Mustafa
|
9cf56d6837
|
temp. changes for mark decomp
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
4e7f04a307
|
Fix llama-perplexity
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
75eec6265f
|
Fix llama-bench; Clang-format
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
6dc4b90635
|
Fix NPU
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
44f4cf34b1
|
Fix Phi3 ROPE; Add test-backend-ops
|
2026-01-15 10:26:28 -08:00 |
Yu, Zijun
|
1ed49bbfaf
|
Fix llama-cli
|
2026-01-15 10:26:28 -08:00 |
ravi9
|
ea75772e48
|
Added OpenVINO CI/CD. Updated docs
|
2026-01-15 10:26:25 -08:00 |
Yu, Zijun
|
d61f83c9b7
|
Fix CPY due to cgraph change
|
2026-01-15 10:23:35 -08:00 |
Yu, Zijun
|
f3c0519096
|
Reduce memory: free ov weights node after graph conversion
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
a80da69448
|
Pull out sin cos from rope
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
3533c14cf6
|
Fix Phi3 SwiGLU and SoftMax
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
0fa7a5efef
|
Refactor: remove past_token_len from extra_inputs
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
acf358d1ce
|
Pull out indices creation for kv cache update
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
bf5414c95e
|
Replace Concat with Broadcast in MulMat for GQA
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
ebc4fc9f95
|
Fuse to SDPA
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
73ee84fffe
|
Add SwiGLU
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
4c582ac7a3
|
Stateful transformation for CPU GPU
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
8afee795ad
|
Update clang-format
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
593484ce5f
|
Refactor: clean, fix warning
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
42d4240937
|
Change due to ggml cgraph changes, all device work
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
e27738a987
|
Add AMD64 to CMakeLists
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
592d7f8bbb
|
Change due to ggml cgraph changes, llama-3.2 CPU work
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
f7ad77930e
|
Change due to ggml cgraph changes, not correct yet
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
d9ca8f5dbe
|
NPU support version 2: prefill + kvcache
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
34531abce4
|
draft NPU support version 2: prefill + kvcache
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
7fec223334
|
Add initial NPU support
|
2026-01-15 10:20:18 -08:00 |
Ravi Panchumarthy
|
3051d5ae07
|
Update openvino build instructions
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
8ce5cc597a
|
Add cgraph tensor output name to OV op name
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
fd324366d0
|
Update build doc
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
d7cc802292
|
PERF: use Slice+Concat in writing cache_v
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
8ac5c225aa
|
FIX: set_max_token_len
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
a30dc6e726
|
PERF: add weight constant in parallel
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
c57f61494a
|
FIX: input shape of KQ_mask
|
2026-01-15 10:20:18 -08:00 |
Yu, Zijun
|
041d220dfa
|
FIX: Re-add tensor names in cgraph, Add another case for RESHAPE
|
2026-01-15 10:20:13 -08:00 |
Yu, Zijun
|
0d505b4e56
|
STYLE and minor REFACTOR
|
2026-01-15 10:10:00 -08:00 |
Yu, Zijun
|
cdf5370cb5
|
PERF: favor low precision matmul
|
2026-01-15 10:10:00 -08:00 |
Yu, Zijun
|
0d009fe61a
|
FEAT: Add all conversion code from ov side
|
2026-01-15 10:10:00 -08:00 |
Yu, Zijun
|
f15a2cc057
|
STYLE: clang-format
|
2026-01-15 10:10:00 -08:00 |
Yu, Zijun
|
a0b30529bf
|
FIX: backend buffer type issue
|
2026-01-15 10:10:00 -08:00 |
Zijun Yu
|
4c905b2b25
|
fix build error
|
2026-01-15 10:10:00 -08:00 |
Viraj Wadhwa
|
ffabe95e2a
|
Rebase - Bring up to date and fix build process
|
2026-01-15 10:09:23 -08:00 |
Yu, Zijun
|
a8e5efa44e
|
PERF: compile once (dynamic graph + cache)
|
2026-01-15 10:05:41 -08:00 |
Yu, Zijun
|
7d5e234254
|
FEAT: improve debug capability
|
2026-01-15 10:05:41 -08:00 |
Yu, Zijun
|
0a8cc9ab03
|
BUILD: update build doc, add cmake preset, add CACHE_DIR env var
|
2026-01-15 10:05:41 -08:00 |