server: (doc) clarify in-scope and out-scope features (#20794)

* server: (doc) clarify in-scope and out-scope features

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-03-20 14:03:50 +01:00 · 2026-03-20 14:03:50 +01:00 · fb78ad29bb
parent e06c3ab2bc
commit fb78ad29bb
2 changed files with 32 additions and 0 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -178,6 +178,8 @@ Maintainers reserve the right to decline review or close pull requests for any r
 - New code should follow the guidelines (coding, naming, etc.) outlined in this document. Exceptions are allowed in isolated, backend-specific parts of the code that do not interface directly with the `ggml` interfaces.
  _(NOTE: for legacy reasons, existing code is not required to follow this guideline)_

+- For changes to the server, please refer to the [server development documentation](./tools/server/README-dev.md).
+
 # Documentation

 - Documentation is a community effort
--- a/tools/server/README-dev.md
+++ b/tools/server/README-dev.md
@@ -4,6 +4,36 @@ This document provides an in-depth technical overview of `llama-server`, intende

 If you are an end user consuming `llama-server` as a product, please refer to the main [README](./README.md) instead.

+## Scope of features
+
+In-scope types of feature:
+
+- Backend:
+    - Basic inference features: text completion, embeddings output
+    - Chat-oriented features: chat completion, tool calling
+    - Third-party API compatibility, e.g. OAI-compat, Anthropic-compat
+    - Multimodal input/output
+    - Memory management: save/load state, context checkpoints
+    - Model management
+    - Features that are required by the Web UI
+- Frontend:
+    - Chat-oriented features, e.g. basic chat, image upload, message editing
+    - Agentic features, e.g. MCP
+    - Model management
+
+Note: For security reasons, features that require reading or writing external files must be **disabled by default**. This covers features such as MCP and model save/load.
+
+Out-of-scope features:
+
+- Backend:
+    - Features that require a loop of external API calls, e.g. a server-side agentic loop. This is because external API calls in C++ are costly to maintain. Any complex third-party logic should be implemented outside the server code.
+    - Features that expose the internal state of the model through the API, e.g. retrieving intermediate activations. This is because llama.cpp does not provide a stable API for this, and relying on `eval_callback` would complicate maintenance, as that callback is not intended for use in a multi-sequence setup.
+    - Model-specific features. All API calls and features must remain model-agnostic.
+- Frontend:
+    - Third-party plugins. Maintaining a public plugin API for such features is costly. Instead, users can build their own MCP server for their needs.
+    - Customizable themes. These are also costly to maintain. While we do care about aesthetics, we aim to achieve this by perfecting a small set of themes.
+    - Browser-specific features, e.g. [Chrome's built-in AI API](https://developer.chrome.com/docs/ai/built-in-apis).
+
 ## Backend

 ### Overview