From 6e0c8cbc40c4abf49e5c52f0f51267c2afdfc053 Mon Sep 17 00:00:00 2001
From: Bart Louwers
Date: Tue, 30 Dec 2025 22:13:49 +0100
Subject: [PATCH] docs : document that JSON Schema is not available to model when using response_format (#18492)

* Document unsupported JSON Schema annotations

Add note about unsupported JSON Schema annotations.

* Update README.md

* Update README.md

* Update README.md
---
 grammars/README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/grammars/README.md b/grammars/README.md
index daac7f4d8d..dcd28648b1 100644
--- a/grammars/README.md
+++ b/grammars/README.md
@@ -150,6 +150,9 @@ You can use GBNF grammars:
 - in CLI, with [examples/json_schema_to_grammar.py](../examples/json_schema_to_grammar.py)
 - in JavaScript with [json-schema-to-grammar.mjs](../tools/server/public_legacy/json-schema-to-grammar.mjs) (this is used by the [server](../tools/server)'s Web UI)
 
+> [!NOTE]
+> The JSON schema is only used to constrain the model output and is not injected into the prompt. The model has no visibility into the schema, so if you want it to understand the expected structure, describe it explicitly in your prompt. This does not apply to tool calling, where schemas are injected into the prompt.
+
 Take a look at [tests](../tests/test-json-schema-to-grammar.cpp) to see which features are likely supported (you'll also find usage examples in https://github.com/ggml-org/llama.cpp/pull/5978, https://github.com/ggml-org/llama.cpp/pull/6659 & https://github.com/ggml-org/llama.cpp/pull/6555).
 
 ```bash
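
For context (not part of the patch above): a minimal sketch of putting the new note into practice, i.e. constraining the output with a schema while also describing the expected structure in the prompt. It assumes a locally running `llama-server` on port 8080 whose OpenAI-compatible `/v1/chat/completions` endpoint accepts a `json_schema` entry in `response_format`; the prompt, schema, and field names are purely illustrative.

```bash
# Illustrative sketch, assuming llama-server is running on localhost:8080 and
# accepts an OpenAI-style response_format with a json_schema entry.
# The schema constrains the output, but the model never sees it, so the
# expected structure is also spelled out in the system prompt.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "Reply with a JSON object with fields \"name\" (string) and \"age\" (integer)."},
      {"role": "user", "content": "Alice is 30 years old."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"]
        }
      }
    }
  }'
```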