llama.cpp

Author	SHA1	Message	Date
Matt Grosso	c0c95edc89	rebase, handle sometimes smaller embd & new type. 0. initialize dest with zeros. 1. embd is not a vector anymore. 2. embd size from embd_size may be smaller than batch_tokens because it honors logits array, so we use ctx->n_outputs to bound our embd outer loop. 3. Remove the batch_tokens foot-gun parameter since we have authoritative information on the size of the embedding outputs from the context. 4. improve comment docs. 5. incorporate new usage for gritlm example.	2024-04-18 17:00:46 -07:00
Matt Grosso	798c29d6b9	gritlm example using llama_get_embeddings_mean_pooled	2024-04-17 17:51:34 -07:00
Georgi Gerganov	0fd6c1f015	embedding : print cosine similarity (#899)	2024-03-14 10:12:29 +02:00
DAN™	bcebd7dbf6	llama : add support for GritLM (#5959) * add gritlm example * gritlm results match * tabs to spaces * comment out debug printing * rebase to new embed * gritlm embeddings are back babeee * add to gitignore * allow to toggle embedding mode * Clean-up GritLM sample code. * Fix types. * Flush stdout and output ending newline if streaming. * mostly style fixes; correct KQ_mask comment * add causal_attn flag to llama_cparams * gritml : minor * llama : minor --------- Co-authored-by: Douglas Hanley <thesecretaryofwar@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-03-10 17:56:30 +02:00