Commit Graph

8 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Georgi Gerganov | 2affd0b221 | unicode : set bomb | 2024-04-27 11:56:02 +03:00 |
| Georgi Gerganov | ad929833cb | llama : adapt punctuation regex + add llama 3 regex | 2024-04-27 11:06:08 +03:00 |
| Georgi Gerganov | 06d3e693db | unicode : fix? unicode_wstring_to_utf8 | 2024-04-26 12:55:11 +03:00 |
| Kazim Abrar Mahi | 753580360b | Fixed issues | 2024-04-26 11:43:29 +03:00 |
| Kazim Abrar Mahi | feeaf4f39c | Added needed functionality, testing remains | 2024-04-26 11:43:29 +03:00 |
| Kazim Abrar Mahi | 7e308ed212 | Adding unicode regex function | 2024-04-26 11:43:29 +03:00 |
| Kazim Abrar Mahi | a5710a4101 | Adding unicode regex mappings | 2024-04-26 11:43:29 +03:00 |
| Jared Van Bortel | 32c8486e1f | wpm : portable unicode tolower (#6305). Also use C locale for ispunct/isspace, and split unicode-data.cpp from unicode.cpp. | 2024-03-26 17:46:21 -04:00 |