squash! sampling : support intermixed backend/cpu samplers
Fix llama-save-load-state, which currently fails, by handling the case where batch.logits is nullptr (as when loading state): in that case, allocate space for all outputs as CPU logits.
parent 9ad6522be6
commit 459b7ae7b9
@@ -1676,6 +1676,10 @@ uint32_t llama_context::output_reserve(int32_t n_outputs, const llama_batch & ba
 }
 }
 }
+} else {
+// When batch.logits is nullptr (when loading state with a dummy batch),
+// allocate CPU logits.
+batch_needs_cpu_logits = true;
 }

 size_t backend_float_count = 0;
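The decision this hunk adds can be sketched in isolation as follows. This is a minimal illustration and not the code in llama_context::output_reserve: batch_view, needs_cpu_logits and the has_backend_sampler flags are hypothetical names, and the per-token check in the non-null branch is an assumption, since the diff does not show that part of the function. Only the fallback (no batch.logits means CPU logits are reserved) comes from the change itself.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two llama_batch fields this sketch needs.
struct batch_view {
    int32_t        n_tokens;
    const int8_t * logits; // per-token output flags, or nullptr (e.g. a dummy batch)
};

// Hypothetical helper: returns true when CPU logits storage should be reserved.
// has_backend_sampler[i] is an assumed flag saying that token i's output is
// handled by a backend sampler; the real check is not visible in the diff.
static bool needs_cpu_logits(const batch_view & batch,
                             const std::vector<bool> & has_backend_sampler) {
    if (batch.logits != nullptr) {
        for (int32_t i = 0; i < batch.n_tokens; ++i) {
            // An output that no backend sampler covers needs CPU logits.
            if (batch.logits[i] != 0 && !has_backend_sampler[i]) {
                return true;
            }
        }
        return false;
    }
    // No per-token flags (as when loading state): reserve CPU logits
    // for all outputs, mirroring the else branch added by this commit.
    return true;
}
```

With a dummy batch such as `batch_view{4, nullptr}` the helper returns true without reading the sampler flags, which is the situation the commit message describes for llama-save-load-state.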