From d3649c11cbf2d4967ed8b2871c05d299a14e3cd8 Mon Sep 17 00:00:00 2001
From: Yamini Nimmagadda
Date: Tue, 13 Jan 2026 14:53:27 -0800
Subject: [PATCH] Update OPENVINO.md

---
 docs/backend/OPENVINO.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/docs/backend/OPENVINO.md b/docs/backend/OPENVINO.md
index 87c537f20b..acb461f435 100644
--- a/docs/backend/OPENVINO.md
+++ b/docs/backend/OPENVINO.md
@@ -108,8 +108,6 @@ GGML_OPENVINO_DEVICE=GPU ./llama-bench -fa 1
 
 ### NPU Notes
 
-- Smaller context sizes are recommended (e.g. `-c 512`)
-- Static compilation mode is enabled automatically
 - Model caching is not yet supported
 - Does not support llama-server -np > 1 (multiple parallel sequences)
 - Only supports llama-perplexity -b 512 or smaller