Han Yin
|
659f59e22a
|
UI: update Arm features indicator; fix the broken hyperlinks
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
518d042e24
|
lib: add UnsupportedArchitectureException for triaged error message
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
173c4c61a4
|
core: verify model file path is readable
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
ca1cda37fd
|
lib: fix the `SIMD` typo in Tier description
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
46e82c09f6
|
lib: refactor the GgufMetadataReader to take InputStream instead of absolute path as argument
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
381994234c
|
lib: read & validate the magic number from the picked source file before executing the import
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
dd5b20d74d
|
llm: properly propagate error to UI upon failing to load selected model
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
a5a54375a2
|
lib: tested on JFrog Artifactory for Maven publishing
|
2025-10-28 11:39:18 -07:00 |
Han Yin
|
4ff924b273
|
lib: optimize engine loader; always perform a fresh detection when cache is null
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
e6413dd05d
|
UI: support `NONE` Llama Tier in general settings
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
1f41ae2315
|
lib: refactored InferenceEngineLoader; added a `NONE` Llama Tier
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
21e61281fa
|
lib: expose Arm features
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
c5058366dc
|
lib: hide the internal implementations, only expose a facade and interfaces
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
57c3a9dda7
|
lib: replace the naive & plain SharedPreferences with DataStore implementation
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
130cba9aa6
|
lib: expose GgufMetadataReader as interface only
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
6a5bc94ff1
|
[WIP] lib: move GgufMetadata into the lib submodule
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
4b3f6ef8d7
|
misc: rename LlamaAndroid related class to InferenceEngine prefixes
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
b59c59e5c3
|
core: add back OpenMP due to huge perf loss on TG128
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
53ac8af67a
|
core: swap out hardcoded LlamaAndroid library loading
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
1b79db877d
|
core: implement cpu_detector native lib
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
98c8f5e59e
|
[WIP] llama: enable KleidiAI and disable tier 4 due to `+sve+sve2` bug caused by `ggml_add_cpu_backend_variant_impl` as explained below
```CMake
if (NOT SME_ENABLED MATCHES -1)
...
set(PRIVATE_ARCH_FLAGS "-fno-tree-vectorize;${PRIVATE_ARCH_FLAGS}+sve+sve2")
...
```
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
ead41ff655
|
[WIP] llama: disable OpenMP in ABI split since most SoCs are big.LITTLE
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
3884bbcb86
|
[WIP] llama: ABI split where five tiers are built sequentially.
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
75d1abe24a
|
[WIP] llama: ABI split builds five .so artifacts.
However, all .so files currently perform at the SVE level
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
eab502a735
|
llama: migrate C/CXX flags into CMakeLists
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
a4c66c4baf
|
nit: print current pp & tg in llama-bench
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
d1b018e375
|
UI: show a Snackbar to warn user that system prompt is not always supported
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
e1c77c6bbd
|
LLama: add a new Initializing state; add two extension properties; rename LibraryLoaded state to Initialized
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
c08d02d233
|
LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages
|
2025-10-28 11:39:17 -07:00 |
Han Yin
|
65d4a57a8b
|
LLama: refactor loadModel by splitting the system prompt setting into a separate method
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
46859c10f0
|
LLama: update engine state after handling the cancellation of sendUserPrompt
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
d70b8fe323
|
core: swap in LLamaAndroid and mark stub engine for testing only
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
cbe7133742
|
UI: introduce new dependencies, update versions & references
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
37f3e1c415
|
Feature: use local llama_context for benchmarking; support context init with custom context size
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
6d2279e9cd
|
REWRITE JNI bridge; Update viewmodel
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
e1bc87610e
|
Perf: allocate `llama_batch` on stack with `llama_batch_init`
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
2b52563737
|
Polish: better logging & documentation
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
ec502cfde9
|
Feature: implement infinite conversation via context shifting
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
4e515727b4
|
Abort on system prompt too long; Truncate user prompt if too long.
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
4809112ec5
|
Polish: adopt common naming; init modularization;
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
8bf2f4d412
|
Feature: chat template auto formatting
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
1b0754c0f5
|
Perf: optimize performance with ARM features
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
bb5b824208
|
Polish: populate backend names in `benchModel`
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
c14c11dcbd
|
Feature: decode system and user prompt in batches
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
02465137ca
|
Bug fix: null system prompt state update; Safeguard empty user prompt
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
7bbb53aaf8
|
Clang-tidy linting: make functions & global variables static
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
f44882aeeb
|
Enforce centralized dependency management; bump Gradle & deps versions
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
0ade7fb4d7
|
Polish binding: Remove verbose setup JNI APIs; Update state machine states.
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
7dc9968f82
|
Restructure `LLamaAndroid.kt`
|
2025-10-28 11:39:16 -07:00 |
Han Yin
|
44720859d6
|
Rewrite llama-android JNI implementation
|
2025-10-28 11:39:15 -07:00 |