Han Yin
6cde2fe1bd
core: support GGML_CPU_ALL_VARIANTS on Android!
2025-10-28 11:39:18 -07:00
Han Yin
c5058366dc
lib: hide the internal implementations, only expose a facade and interfaces
2025-10-28 11:39:17 -07:00
Han Yin
4b3f6ef8d7
misc: rename LlamaAndroid related class to InferenceEngine prefixes
2025-10-28 11:39:17 -07:00
Han Yin
a4c66c4baf
nit: print current pp & tg in llama-bench
2025-10-28 11:39:17 -07:00
Han Yin
37f3e1c415
Feature: use local llama_context for benchmarking; support context init with custom context size
2025-10-28 11:39:16 -07:00
Han Yin
6d2279e9cd
REWRITE JNI bridge; Update viewmodel
2025-10-28 11:39:16 -07:00
Han Yin
e1bc87610e
Perf: allocate `llama_batch` on stack with `llama_batch_init`
2025-10-28 11:39:16 -07:00
Han Yin
2b52563737
Polish: better logging & documentation
2025-10-28 11:39:16 -07:00
Han Yin
ec502cfde9
Feature: implement infinite conversation via context shifting
2025-10-28 11:39:16 -07:00
Han Yin
4e515727b4
Abort on system prompt too long; Truncate user prompt if too long.
2025-10-28 11:39:16 -07:00
Han Yin
4809112ec5
Polish: adopt common naming; init modularization;
2025-10-28 11:39:16 -07:00
Han Yin
8bf2f4d412
Feature: chat template auto formatting
2025-10-28 11:39:16 -07:00
Han Yin
bb5b824208
Polish: populate backend names in `benchModel`
2025-10-28 11:39:16 -07:00
Han Yin
c14c11dcbd
Feature: decode system and user prompt in batches
2025-10-28 11:39:16 -07:00
Han Yin
7bbb53aaf8
Clang-tidy linting: make functions & global variables static
2025-10-28 11:39:16 -07:00
Han Yin
0ade7fb4d7
Polish binding: Remove verbose setup JNI APIs; Update state machine states.
2025-10-28 11:39:16 -07:00
Han Yin
44720859d6
Rewrite llama-android JNI implementation
2025-10-28 11:39:15 -07:00
Han Yin
d4ab3832cf
Use common sampler
2025-10-28 11:39:15 -07:00
Han Yin
1f255d4bca
Tidy & clean LLamaAndroid binding
2025-10-28 11:39:15 -07:00
Georgi Gerganov
745aa5319b
llama : deprecate llama_kv_self_ API ( #14030 )
...
* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci
2025-06-06 14:11:15 +03:00
Georgi Gerganov
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context ( #12181 )
...
* llama : refactor llama_context, llama_kv_cache, llm_build_context
ggml-ci
* graph : don't mutate the KV cache during defrag
ggml-ci
* context : reduce virtuals + remove test function
ggml-ci
* context : move interface implementation to source file + factory
ggml-ci
* graph : move KV cache build functions to llama_context impl
ggml-ci
* graph : remove model reference from build_pooling
ggml-ci
* graph : remove llama_model reference
ggml-ci
* kv_cache : provide rope factors
ggml-ci
* graph : rework inputs to use only unique_ptr, remove attn input abstraction
ggml-ci
* context : remove llama_context_i abstraction
ggml-ci
* context : clean-up
ggml-ci
* graph : clean-up
ggml-ci
* llama : remove redundant keywords (struct, enum)
ggml-ci
* model : adapt gemma3
ggml-ci
* graph : restore same attention ops as on master
ggml-ci
* llama : remove TODO + fix indent
ggml-ci
2025-03-13 12:35:44 +02:00
Han Yin
57b6abf85a
android : fix KV cache log message condition ( #12212 )
2025-03-06 08:22:49 +02:00
codezjx
3edfa7d375
llama.android: add field formatChat to control whether to parse special tokens when send message ( #11270 )
2025-01-17 14:57:56 +02:00
Georgi Gerganov
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming ( #11110 )
...
* llama : functions -> methods (#11110 )
* llama : add struct llama_vocab to the API (#11156 )
ggml-ci
* hparams : move vocab params to llama_vocab (#11159 )
ggml-ci
* vocab : more pimpl (#11165 )
ggml-ci
* vocab : minor tokenization optimizations (#11160 )
ggml-ci
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* lora : update API names (#11167 )
ggml-ci
* llama : update API names to use correct prefix (#11174 )
* llama : update API names to use correct prefix
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* minor [no ci]
* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174 )
ggml-ci
* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174 )
ggml-ci
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-12 11:32:42 +02:00
ag2s20150909
c250ecb315
android : fix llama_batch free ( #11014 )
2024-12-30 14:35:13 +02:00
Xuan Son Nguyen
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch ( #9745 )
...
* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
2024-10-18 23:18:01 +02:00
Diego Devesa
7eee341bee
common : use common_ prefix for common library functions ( #9805 )
...
* common : use common_ prefix for common library functions
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-10 22:57:42 +02:00
slaren
5fb5e24811
llama : minor sampling refactor (2) ( #9386 )
2024-09-09 17:10:46 +02:00
Georgi Gerganov
a5b5d9a101
llama.android : fix build ( #9350 )
2024-09-08 00:33:50 +03:00
Georgi Gerganov
df270ef745
llama : refactor sampling v2 ( #9294 )
...
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
2024-09-07 15:16:19 +03:00
devojony
b7c11d36e6
examples: fix android example cannot be generated continuously ( #8621 )
...
When generation ends `completion_loop()` should return a NULL, not the empty string
2024-07-22 09:54:42 +03:00
Raj Hammeer Singh Hada
ac146628e4
Fix llama-android.cpp for error - "common/common.h not found" ( #8145 )
...
- Path seems to be wrong for the common.h header file in llama-android.cpp file. Fixing the path so the Android Build doesn't fail with the error "There is no file common/common.h"
2024-06-27 03:57:57 +02:00
Elton Kola
9791f40258
android : module ( #7502 )
...
* move ndk code to a new library
* add gradle file
2024-05-25 11:11:33 +03:00
Brian
1265c670fd
Revert "move ndk code to a new library ( #6951 )" ( #7282 )
...
This reverts commit efc8f767c8 .
2024-05-14 16:10:39 +03:00
Elton Kola
efc8f767c8
move ndk code to a new library ( #6951 )
2024-05-14 17:30:30 +10:00