* fix test failure
* fix: correct scaling calculations in rope_cache_init
* fix: optimize element copying in rope_hex_f32 using memcpy
* fix: optimize loop boundaries in rope_hex_f32 for better performance
* feat: add profiling macros for performance measurement in operations

| Name |
|---|
| htp |
| CMakeLists.txt |
| ggml-hexagon.cpp |
| htp-utils.c |
| htp-utils.h |