Commit Graph

35 Commits

Author SHA1 Message Date
Han Yin 6cde2fe1bd core: support GGML_CPU_ALL_VARIANTS on Android! 2025-10-28 11:39:18 -07:00
Han Yin c5058366dc lib: hide the internal implementations, only expose a facade and interfaces 2025-10-28 11:39:17 -07:00
Han Yin 4b3f6ef8d7 misc: rename LlamaAndroid related class to InferenceEngine prefixes 2025-10-28 11:39:17 -07:00
Han Yin a4c66c4baf nit: print current pp & tg in llama-bench 2025-10-28 11:39:17 -07:00
Han Yin 37f3e1c415 Feature: use local llama_context for benchmarking; support context init with custom context size 2025-10-28 11:39:16 -07:00
Han Yin 6d2279e9cd REWRITE JNI bridge; Update viewmodel 2025-10-28 11:39:16 -07:00
Han Yin e1bc87610e Perf: allocate `llama_batch` on stack with `llama_batch_init` 2025-10-28 11:39:16 -07:00
Han Yin 2b52563737 Polish: better logging & documentation 2025-10-28 11:39:16 -07:00
Han Yin ec502cfde9 Feature: implement infinite conversation via context shifting 2025-10-28 11:39:16 -07:00
Han Yin 4e515727b4 Abort on system prompt too long; Truncate user prompt if too long. 2025-10-28 11:39:16 -07:00
Han Yin 4809112ec5 Polish: adopt common naming; init modularization; 2025-10-28 11:39:16 -07:00
Han Yin 8bf2f4d412 Feature: chat template auto formatting 2025-10-28 11:39:16 -07:00
Han Yin bb5b824208 Polish: populate backend names in `benchModel` 2025-10-28 11:39:16 -07:00
Han Yin c14c11dcbd Feature: decode system and user prompt in batches 2025-10-28 11:39:16 -07:00
Han Yin 7bbb53aaf8 Clang-tidy linting: make functions & global variables static 2025-10-28 11:39:16 -07:00
Han Yin 0ade7fb4d7 Polish binding: Remove verbose setup JNI APIs; Update state machine states. 2025-10-28 11:39:16 -07:00
Han Yin 44720859d6 Rewrite llama-android JNI implementation 2025-10-28 11:39:15 -07:00
Han Yin d4ab3832cf Use common sampler 2025-10-28 11:39:15 -07:00
Han Yin 1f255d4bca Tidy & clean LLamaAndroid binding 2025-10-28 11:39:15 -07:00
Georgi Gerganov 745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
* llama : deprecate llama_kv_self_ API

ggml-ci

* llama : allow llama_memory_(nullptr)

ggml-ci

* memory : add flag for optional data clear in llama_memory_clear

ggml-ci
2025-06-06 14:11:15 +03:00
Georgi Gerganov e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
* llama : refactor llama_context, llama_kv_cache, llm_build_context

ggml-ci

* graph : don't mutate the KV cache during defrag

ggml-ci

* context : reduce virtuals + remove test function

ggml-ci

* context : move interface implementation to source file + factory

ggml-ci

* graph : move KV cache build functions to llama_context impl

ggml-ci

* graph : remove model reference from build_pooling

ggml-ci

* graph : remove llama_model reference

ggml-ci

* kv_cache : provide rope factors

ggml-ci

* graph : rework inputs to use only unique_ptr, remove attn input abstraction

ggml-ci

* context : remove llama_context_i abstraction

ggml-ci

* context : clean-up

ggml-ci

* graph : clean-up

ggml-ci

* llama : remove redundant keywords (struct, enum)

ggml-ci

* model : adapt gemma3

ggml-ci

* graph : restore same attention ops as on master

ggml-ci

* llama : remove TODO + fix indent

ggml-ci
2025-03-13 12:35:44 +02:00
Han Yin 57b6abf85a
android : fix KV cache log message condition (#12212) 2025-03-06 08:22:49 +02:00
codezjx 3edfa7d375
llama.android: add field formatChat to control whether to parse special tokens when send message (#11270) 2025-01-17 14:57:56 +02:00
Georgi Gerganov afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
* llama : functions -> methods (#11110)

* llama : add struct llama_vocab to the API (#11156)

ggml-ci

* hparams : move vocab params to llama_vocab (#11159)

ggml-ci

* vocab : more pimpl (#11165)

ggml-ci

* vocab : minor tokenization optimizations (#11160)

ggml-ci

Co-authored-by: Diego Devesa <slarengh@gmail.com>

* lora : update API names (#11167)

ggml-ci

* llama : update API names to use correct prefix (#11174)

* llama : update API names to use correct prefix

ggml-ci

* cont

ggml-ci

* cont

ggml-ci

* minor [no ci]

* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174)

ggml-ci

* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174)

ggml-ci

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-12 11:32:42 +02:00
ag2s20150909 c250ecb315
android : fix llama_batch free (#11014) 2024-12-30 14:35:13 +02:00
Xuan Son Nguyen cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr
2024-10-18 23:18:01 +02:00
Diego Devesa 7eee341bee
common : use common_ prefix for common library functions (#9805)
* common : use common_ prefix for common library functions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-10 22:57:42 +02:00
slaren 5fb5e24811
llama : minor sampling refactor (2) (#9386) 2024-09-09 17:10:46 +02:00
Georgi Gerganov a5b5d9a101
llama.android : fix build (#9350) 2024-09-08 00:33:50 +03:00
Georgi Gerganov df270ef745
llama : refactor sampling v2 (#9294)
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
2024-09-07 15:16:19 +03:00
devojony b7c11d36e6
examples: fix android example cannot be generated continuously (#8621)
When generation ends `completion_loop()` should return a NULL, not the empty string
2024-07-22 09:54:42 +03:00
Raj Hammeer Singh Hada ac146628e4
Fix llama-android.cpp for error - "common/common.h not found" (#8145)
- Path seems to be wrong for the common.h header file in llama-android.cpp file. Fixing the path so the Android Build doesn't fail with the error "There is no file common/common.h"
2024-06-27 03:57:57 +02:00
Elton Kola 9791f40258
android : module (#7502)
* move ndk code to a new library

* add gradle file
2024-05-25 11:11:33 +03:00
Brian 1265c670fd
Revert "move ndk code to a new library (#6951)" (#7282)
This reverts commit efc8f767c8.
2024-05-14 16:10:39 +03:00
Elton Kola efc8f767c8
move ndk code to a new library (#6951) 2024-05-14 17:30:30 +10:00
Renamed from examples/llama.android/app/src/main/cpp/llama-android.cpp (Browse further)