Commit Graph

74 Commits

Author SHA1 Message Date
Han Yin b92c6dc2f6 build: [BREAKING] bump the versions of libraries and plugins 2025-10-28 11:39:18 -07:00
Han Yin 659f59e22a UI: update Arm features indicator; fix the broken hyperlinks 2025-10-28 11:39:18 -07:00
Han Yin 518d042e24 lib: add UnsupportedArchitectureException for a triaged error message 2025-10-28 11:39:18 -07:00
Han Yin 173c4c61a4 core: verify model file path is readable 2025-10-28 11:39:18 -07:00
Han Yin ca1cda37fd lib: fix the `SIMD` typo in Tier description 2025-10-28 11:39:18 -07:00
Han Yin 46e82c09f6 lib: refactor the GgufMetadataReader to take InputStream instead of absolute path as argument 2025-10-28 11:39:18 -07:00
Han Yin 381994234c lib: read & validate the magic number from the picked source file before executing the import 2025-10-28 11:39:18 -07:00
Han Yin dd5b20d74d llm: properly propagate error to UI upon failing to load selected model 2025-10-28 11:39:18 -07:00
Han Yin a5a54375a2 lib: tested on JFrog Artifactory for Maven publishing 2025-10-28 11:39:18 -07:00
Han Yin 4ff924b273 lib: optimize engine loader; always perform a fresh detection when cache is null 2025-10-28 11:39:17 -07:00
Han Yin e6413dd05d UI: support `NONE` Llama Tier in general settings 2025-10-28 11:39:17 -07:00
Han Yin 1f41ae2315 lib: refactored InferenceEngineLoader; added a `NONE` Llama Tier 2025-10-28 11:39:17 -07:00
Han Yin 21e61281fa lib: expose Arm features 2025-10-28 11:39:17 -07:00
Han Yin c5058366dc lib: hide the internal implementations, only expose a facade and interfaces 2025-10-28 11:39:17 -07:00
Han Yin 57c3a9dda7 lib: replace the naive & plain SharedPreferences with DataStore implementation 2025-10-28 11:39:17 -07:00
Han Yin 130cba9aa6 lib: expose GgufMetadataReader as interface only 2025-10-28 11:39:17 -07:00
Han Yin 6a5bc94ff1 [WIP] lib: move GgufMetadata into the lib submodule 2025-10-28 11:39:17 -07:00
Han Yin 4b3f6ef8d7 misc: rename LlamaAndroid-related classes to InferenceEngine prefixes 2025-10-28 11:39:17 -07:00
Han Yin b59c59e5c3 core: add back OpenMP due to huge perf loss on TG128 2025-10-28 11:39:17 -07:00
Han Yin 53ac8af67a core: swap out hardcoded LlamaAndroid library loading 2025-10-28 11:39:17 -07:00
Han Yin 1b79db877d core: implement cpu_detector native lib 2025-10-28 11:39:17 -07:00
Han Yin 98c8f5e59e [WIP] llama: enable KleidiAI and disable tier 4 due to `+sve+sve2` bug caused by `ggml_add_cpu_backend_variant_impl` as explained below
```CMake
if (NOT SME_ENABLED MATCHES -1)
...
    set(PRIVATE_ARCH_FLAGS "-fno-tree-vectorize;${PRIVATE_ARCH_FLAGS}+sve+sve2")
...
```
2025-10-28 11:39:17 -07:00
Han Yin ead41ff655 [WIP] llama: disable OpenMP in ABI split since most SoCs are big.LITTLE 2025-10-28 11:39:17 -07:00
Han Yin 3884bbcb86 [WIP] llama: ABI split where five tiers are built sequentially. 2025-10-28 11:39:17 -07:00
Han Yin 75d1abe24a [WIP] llama: ABI split builds five .so artifacts.
However, all .so artifacts perform at the SVE level
2025-10-28 11:39:17 -07:00
Han Yin eab502a735 llama: migrate C/CXX flags into CMakeLists.txt 2025-10-28 11:39:17 -07:00
Han Yin a4c66c4baf nit: print current pp & tg in llama-bench 2025-10-28 11:39:17 -07:00
Han Yin d1b018e375 UI: show a Snackbar to warn the user that the system prompt is not always supported 2025-10-28 11:39:17 -07:00
Han Yin e1c77c6bbd LLama: add a new Initializing state; add two extension properties; rename LibraryLoaded state to Initialized 2025-10-28 11:39:17 -07:00
Han Yin c08d02d233 LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages 2025-10-28 11:39:17 -07:00
Han Yin 65d4a57a8b LLama: refactor loadModel by splitting the system prompt setting into a separate method 2025-10-28 11:39:16 -07:00
Han Yin 46859c10f0 LLama: update engine state after handling the cancellation of sendUserPrompt 2025-10-28 11:39:16 -07:00
Han Yin d70b8fe323 core: swap in LLamaAndroid and mark stub engine for testing only 2025-10-28 11:39:16 -07:00
Han Yin cbe7133742 UI: introduce new dependencies, update versions & references 2025-10-28 11:39:16 -07:00
Han Yin 37f3e1c415 Feature: use local llama_context for benchmarking; support context init with custom context size 2025-10-28 11:39:16 -07:00
Han Yin 6d2279e9cd REWRITE JNI bridge; Update viewmodel 2025-10-28 11:39:16 -07:00
Han Yin e1bc87610e Perf: allocate `llama_batch` on stack with `llama_batch_init` 2025-10-28 11:39:16 -07:00
Han Yin 2b52563737 Polish: better logging & documentation 2025-10-28 11:39:16 -07:00
Han Yin ec502cfde9 Feature: implement infinite conversation via context shifting 2025-10-28 11:39:16 -07:00
Han Yin 4e515727b4 Abort if the system prompt is too long; truncate the user prompt if too long. 2025-10-28 11:39:16 -07:00
Han Yin 4809112ec5 Polish: adopt common naming; init modularization 2025-10-28 11:39:16 -07:00
Han Yin 8bf2f4d412 Feature: chat template auto formatting 2025-10-28 11:39:16 -07:00
Han Yin 1b0754c0f5 Perf: optimize performance with ARM features 2025-10-28 11:39:16 -07:00
Han Yin bb5b824208 Polish: populate backend names in `benchModel` 2025-10-28 11:39:16 -07:00
Han Yin c14c11dcbd Feature: decode system and user prompt in batches 2025-10-28 11:39:16 -07:00
Han Yin 02465137ca Bug fix: null system prompt state update; Safeguard empty user prompt 2025-10-28 11:39:16 -07:00
Han Yin 7bbb53aaf8 Clang-tidy linting: make functions & global variables static 2025-10-28 11:39:16 -07:00
Han Yin f44882aeeb Enforce centralized dependency management; bump Gradle & deps versions 2025-10-28 11:39:16 -07:00
Han Yin 0ade7fb4d7 Polish binding: Remove verbose setup JNI APIs; Update state machine states. 2025-10-28 11:39:16 -07:00
Han Yin 7dc9968f82 Restructure `LLamaAndroid.kt` 2025-10-28 11:39:16 -07:00
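
For context on commits 381994234c and 46e82c09f6 above (reading GGUF metadata from an InputStream and validating the magic number before executing the import), a minimal Kotlin sketch of such a check is shown below. The GGUF format begins with the four ASCII bytes "GGUF"; the helper name and how it would sit inside GgufMetadataReader are assumptions, not this repo's actual API.

```kotlin
import java.io.IOException
import java.io.InputStream

// The first four bytes of a GGUF file are the ASCII magic "GGUF".
private val GGUF_MAGIC: ByteArray = "GGUF".toByteArray(Charsets.US_ASCII)

// Hypothetical helper; the real GgufMetadataReader API may differ.
@Throws(IOException::class)
fun validateGgufMagic(input: InputStream) {
    val header = ByteArray(GGUF_MAGIC.size)
    var read = 0
    while (read < header.size) {
        val n = input.read(header, read, header.size - read)
        if (n == -1) break // stream ended before the full magic was read
        read += n
    }
    require(read == header.size && header.contentEquals(GGUF_MAGIC)) {
        "Selected file is not a GGUF model: bad magic number"
    }
}
```

A caller importing a user-picked model would run this check on the opened stream before copying the file into app storage, matching the "validate ... before executing the import" wording in the log.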
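Similarly, commits 4ff924b273, 1f41ae2315, and e6413dd05d describe an engine loader that caches the detected Llama tier, performs a fresh detection only when the cache is null, and supports a NONE tier. The sketch below illustrates that caching behavior only; the tier names, constructor, and detector signature are assumptions rather than the library's actual interface.

```kotlin
// Hypothetical sketch of the InferenceEngineLoader caching behavior; tier
// names and the detector signature are assumptions, not the repo's actual API.
enum class LlamaTier { NONE, TIER_1, TIER_2, TIER_3, TIER_4, TIER_5 }

class InferenceEngineLoader(private val detectTier: () -> LlamaTier?) {
    @Volatile
    private var cachedTier: LlamaTier? = null

    fun resolveTier(): LlamaTier {
        // Perform a fresh detection only when nothing is cached yet;
        // an unsupported CPU resolves to the NONE tier.
        return cachedTier ?: (detectTier() ?: LlamaTier.NONE).also { cachedTier = it }
    }
}
```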