Because a long, multi-part response from the AI server can fail partway through, the logic uses latestResponse to cache the response as it is being received. Once the full response has arrived, however, it needs to be transferred to a new instance of the AssistantResponse class, so that latestResponse can be cleared while the new instance is used elsewhere in the flow as needed. This change does that.
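
The pattern described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: `ChatSession`, `receive`, `history`, and the `append`/`clear` methods are hypothetical names, and `AssistantResponse` here is a minimal stand-in for the class the commit message mentions.

```python
from dataclasses import dataclass


@dataclass
class AssistantResponse:
    """Hypothetical stand-in for the AssistantResponse class named above."""
    content: str = ""

    def append(self, part: str) -> None:
        self.content += part

    def clear(self) -> None:
        self.content = ""


class ChatSession:
    """Hypothetical holder of the latest_response cache (latestResponse above)."""

    def __init__(self) -> None:
        self.latest_response = AssistantResponse()
        self.history = []  # completed AssistantResponse instances

    def receive(self, parts) -> AssistantResponse:
        """Accumulate streamed parts, then hand off a fresh instance."""
        self.latest_response.clear()
        for part in parts:
            # If the stream raises here, the partial text stays cached
            # in latest_response and nothing is added to history.
            self.latest_response.append(part)
        # Full response received: copy it into a new instance so the
        # cache can be cleared without affecting later users of the result.
        final = AssistantResponse(content=self.latest_response.content)
        self.latest_response.clear()
        self.history.append(final)
        return final


def broken_stream():
    """Simulated multi-part response that fails partway through."""
    yield "par"
    yield "tial"
    raise ConnectionError("server went away")
```

Copying into a new instance, rather than passing `latest_response` itself around, means that clearing the cache for the next request cannot alter a response that other parts of the flow already hold.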
||
|---|---|---|
| batched-bench | ||
| cvector-generator | ||
| export-lora | ||
| gguf-split | ||
| imatrix | ||
| llama-bench | ||
| main | ||
| mtmd | ||
| perplexity | ||
| quantize | ||
| rpc | ||
| run | ||
| server | ||
| tokenize | ||
| tts | ||
| CMakeLists.txt | ||