llama.cpp/tools
Turkka Mannila 52c283a951 server : add encoder-decoder model support (T5, BART, MADLAD)
Add support for encoder-decoder models in llama-server, matching the
behavior of llama-cli. This enables translation models like MADLAD
and other T5-based models to work with the server.

Changes:
- Add has_encoder flag to detect encoder-decoder models at load time
- Implement llama_encode() call for encoder-decoder prompt processing
- Use decoder_start_token to initialize decoder after encoding
- Clear decoder KV cache before each new request (no prefix caching)
- Disable incompatible features for encoder-decoder models:
  - Context shift (encoder outputs are fixed)
  - Speculative decoding (not supported)
  - Prompt caching (encoder outputs depend on entire input)
  - Slot selection by LCP similarity (meaningless for enc-dec)
- Add edge case handling for empty text tokens
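
The feature restrictions listed above could be expressed as a single gating step applied once the model type is known. This is a minimal sketch, not the server's actual code: `server_features` and `restrict_for_enc_dec` are hypothetical names introduced for illustration, and only the four restrictions named in the list are modeled.

```cpp
// Hypothetical feature switches; the names are illustrative, not llama-server's.
struct server_features {
    bool context_shift      = true;
    bool speculative        = true;
    bool prompt_cache       = true;
    bool lcp_slot_selection = true;
};

// Turn off everything the list above marks as incompatible with
// encoder-decoder models; decoder-only models keep their settings.
inline server_features restrict_for_enc_dec(server_features f, bool has_encoder) {
    if (!has_encoder) {
        return f;
    }
    f.context_shift      = false; // encoder outputs are fixed
    f.speculative        = false; // not supported
    f.prompt_cache       = false; // encoder outputs depend on the entire input
    f.lcp_slot_selection = false; // prefix similarity is meaningless for enc-dec
    return f;
}
```

Keeping the restrictions in one function means a decoder-only model passes through unchanged, so the gating can run unconditionally at load time.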

The encoder processes the full prompt, then the decoder generates
output using cross-attention to the encoder's hidden states.
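
The request flow described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the server's implementation: `enc_dec_backend`, `begin_request`, and `TOKEN_NONE` are hypothetical stand-ins so the example compiles on its own, whereas the real code would go through llama.cpp calls such as `llama_encode()`. The fallback to a BOS token when no decoder start token is defined is also an assumption.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using token_t = std::int32_t;
constexpr token_t TOKEN_NONE = -1; // hypothetical "no token" marker

// Hypothetical stand-in for a loaded model and its context.
struct enc_dec_backend {
    bool    has_encoder         = false;
    token_t decoder_start_token = TOKEN_NONE;
    token_t bos_token           = 0;
    std::function<bool(const std::vector<token_t> &)> encode;        // true on success
    std::function<void()>                             clear_decoder; // drops decoder KV state
};

// Fills `decoder_input` with the tokens the decoder should see first.
// Returns false if the encoder pass fails.
inline bool begin_request(enc_dec_backend & be,
                          const std::vector<token_t> & prompt,
                          std::vector<token_t> & decoder_input) {
    if (!be.has_encoder) {
        decoder_input = prompt;     // decoder-only: the prompt goes straight to the decoder
        return true;
    }
    be.clear_decoder();             // no prefix reuse between requests
    if (!be.encode(prompt)) {       // the encoder sees the full prompt
        return false;
    }
    token_t start = be.decoder_start_token;
    if (start == TOKEN_NONE) {
        start = be.bos_token;       // assumption: fall back to BOS
    }
    decoder_input.assign(1, start); // decoding starts from this single token
    return true;
}
```

After `begin_request` succeeds for an encoder-decoder model, generation would proceed token by token from `decoder_input`, with the decoder attending to the encoder output.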
2025-12-16 11:24:17 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| batched-bench | batched-bench : add "separate text gen" mode (#17103) | 2025-11-10 12:59:29 +02:00 |
| cli | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| completion | cli: enable jinja by default (#17911) | 2025-12-10 22:19:42 +01:00 |
| cvector-generator | cmake : Do not install tools on iOS targets (#15903) | 2025-09-16 09:54:44 +07:00 |
| export-lora | cmake : Do not install tools on iOS targets (#15903) | 2025-09-16 09:54:44 +07:00 |
| gguf-split | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| imatrix | Manually link -lbsd to resolve flock symbol on AIX (#16610) | 2025-10-23 19:37:31 +08:00 |
| llama-bench | ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) | 2025-12-07 00:13:33 +08:00 |
| mtmd | mtmd: some small clean up (#17909) | 2025-12-10 22:20:06 +01:00 |
| perplexity | perplexity : show more kl-divergence data (#16321) | 2025-09-29 09:30:45 +03:00 |
| quantize | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| rpc | Install rpc-server when GGML_RPC is ON. (#17149) | 2025-11-11 10:53:59 +00:00 |
| run | Manually link -lbsd to resolve flock symbol on AIX (#16610) | 2025-10-23 19:37:31 +08:00 |
| server | server : add encoder-decoder model support (T5, BART, MADLAD) | 2025-12-16 11:24:17 +02:00 |
| tokenize | cmake : Do not install tools on iOS targets (#15903) | 2025-09-16 09:54:44 +07:00 |
| tts | model : Apertus model implementation (#15852) | 2025-10-02 20:43:22 +03:00 |
| CMakeLists.txt | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |