
Diffusion Text Generation

This directory contains implementations for Diffusion LLMs (DLLMs).
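As a rough illustration of how a masked-diffusion LLM such as Dream or LLaDA generates text (this is a toy conceptual sketch, not the llama.cpp implementation): the sequence starts fully masked, and over a fixed number of diffusion steps the model fills in the positions it is most confident about, re-predicting the remaining masked slots each step. The `fake_model` below stands in for the real network and is purely hypothetical.

```python
import random

MASK = "<mask>"

def diffusion_generate(length, steps, seed=0):
    """Toy sketch of masked-diffusion decoding: start from an all-masked
    sequence and unmask the highest-confidence positions each step."""
    rng = random.Random(seed)
    seq = [MASK] * length
    per_step = max(1, length // steps)  # how many positions to reveal per step
    for _ in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        # fake_model: a real DLLM would predict a token distribution for
        # every masked position; here we fabricate (token, confidence) pairs.
        fake_model = {i: (f"tok{i}", rng.random()) for i in masked}
        # Reveal the most confident positions this step, keep the rest masked.
        for i in sorted(masked, key=lambda i: -fake_model[i][1])[:per_step]:
            seq[i] = fake_model[i][0]
    return seq
```

The `--diffusion-steps` flag in the examples below corresponds to the number of denoising iterations; more steps mean fewer positions are committed per iteration.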

More Info:

- Dream 7B support: #14644
- LLaDA 8B support: #14771

Example using the Dream architecture: llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual

Example using the LLaDA architecture: llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual