mirror of https://github.com/google/gemma.cpp.git
This computes softmax on the top-K logits instead of computing softmax first and then taking the top-K probabilities, which also avoids renormalizing. Additionally, softmax now applies temperature scaling if temp != 1.0.

PiperOrigin-RevId: 727702149

| Name | | |
|---|---|---|
| bench_matmul.cc | ||
| dot-inl.h | ||
| dot_test.cc | ||
| fp_arith-inl.h | ||
| gemma_matvec_test.cc | ||
| matmul-inl.h | ||
| matmul.h | ||
| matmul_test.cc | ||
| matmul_unit_test.cc | ||
| matvec-inl.h | ||
| ops-inl.h | ||
| ops.h | ||
| ops_test.cc | ||
| sum-inl.h | ||
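
The commit message above describes selecting the top-K logits first and applying softmax only to those, with optional temperature scaling. A minimal sketch of that idea follows. It is not the gemma.cpp implementation: the function name `TopKSoftmax`, its signature, and the use of `std::vector` are assumptions made for illustration, and the sketch assumes `temperature > 0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only (not part of the gemma.cpp API).
// Returns the k largest logits as (index, probability) pairs, highest first.
// Softmax is computed over the k selected logits only, so the resulting
// probabilities already sum to 1 and need no renormalization.
std::vector<std::pair<size_t, float>> TopKSoftmax(
    const std::vector<float>& logits, size_t k, float temperature) {
  assert(temperature > 0.0f);  // Assumption: temperature is strictly positive.
  k = std::min(k, logits.size());
  std::vector<std::pair<size_t, float>> top;
  if (k == 0) return top;

  // Pair each logit with its index so the token id survives the sort.
  top.reserve(logits.size());
  for (size_t i = 0; i < logits.size(); ++i) {
    top.emplace_back(i, logits[i]);
  }

  // Move the k largest logits to the front, in descending order.
  std::partial_sort(
      top.begin(), top.begin() + k, top.end(),
      [](const std::pair<size_t, float>& a,
         const std::pair<size_t, float>& b) { return a.second > b.second; });
  top.resize(k);

  // Softmax over the k logits; subtracting the max keeps exp() stable.
  const float max_logit = top[0].second;
  float sum = 0.0f;
  for (auto& entry : top) {
    float z = entry.second - max_logit;
    if (temperature != 1.0f) z /= temperature;  // Temperature scaling.
    entry.second = std::exp(z);
    sum += entry.second;
  }
  for (auto& entry : top) {
    entry.second /= sum;
  }
  return top;
}
```

Computing softmax over the full vocabulary and then keeping K entries would require dividing the K survivors by their sum a second time. Selecting first means the single normalization over K values is the only one needed.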