llama.cpp

Commit Graph

Author	SHA1	Message	Date
Han Yin	af0d68d611	nit: combine temperatureMetrics and useFahrenheit	2025-10-28 11:39:16 -07:00
Han Yin	5e4972e93e	UI: refactor top app bars	2025-10-28 11:39:16 -07:00
Han Yin	2a41c0e354	vm: replace token metrics stubs with actual implementation	2025-10-28 11:39:16 -07:00
Han Yin	e47e3b77ee	UI: locks user in alert dialog when model is unloading	2025-10-28 11:39:16 -07:00
Han Yin	6b341b0fbe	bugfix: handle user quitting on model loading	2025-10-28 11:39:16 -07:00
Han Yin	e8b84c6ebf	UI: code polish	2025-10-28 11:39:16 -07:00
Han Yin	fddf060d92	data: code polish	2025-10-28 11:39:16 -07:00
Han Yin	3b499ac7e4	UI: polish conversation screen	2025-10-28 11:39:16 -07:00
Han Yin	64ebdc67a6	UI: update app name to be more Arm	2025-10-28 11:39:16 -07:00
Han Yin	55681847e9	UI: rename `ModeSelection` to `ModelLoading` for better clarity	2025-10-28 11:39:16 -07:00
Han Yin	75c986afc5	bugfix: properly handle user's quitting conversation screen while tokens in generation	2025-10-28 11:39:16 -07:00
Han Yin	4848bf93d0	data: introduce repo for System Prompt; flow data from Room to VM	2025-10-28 11:39:16 -07:00
Han Yin	5596d5203b	DB: setup Room database	2025-10-28 11:39:16 -07:00
Han Yin	4046cd16fd	Deps: bump Kotlin plugin; introduce KSP; apply in :app subproject	2025-10-28 11:39:16 -07:00
Han Yin	5868eaa66b	UI: polish system prompt setup UI	2025-10-28 11:39:16 -07:00
Han Yin	a7ee3d305f	UI: split a nested parent settings screen into separate child settings screens	2025-10-28 11:39:16 -07:00
Han Yin	65c09b2b32	UI: allow drawer's gesture control only on Home and Settings screens; enable alert dialog on back navigation inside conversation and benchmark	2025-10-28 11:39:16 -07:00
Han Yin	648b97818e	UI: disable triggering drawer via gesture; enable alert dialog on back navigation inside conversation and benchmark	2025-10-28 11:39:16 -07:00
Han Yin	a7ae8b7ce0	[WIP] DI: implement simple local vm factory provider	2025-10-28 11:39:16 -07:00
Han Yin	ca2b7772ce	UI: add a new MainActivity; update manifest	2025-10-28 11:39:16 -07:00
Han Yin	7e5c80cee9	UI: implement core flow's screens	2025-10-28 11:39:16 -07:00
Han Yin	5ad65919e9	util: implement user preferences utility	2025-10-28 11:39:16 -07:00
Han Yin	46bd638c5f	util: implement performance monitor; wrap it with a viewmodel	2025-10-28 11:39:16 -07:00
Han Yin	4dd755e25b	UI: implement basic UI components	2025-10-28 11:39:16 -07:00
Han Yin	32608fb225	UI: app navigation	2025-10-28 11:39:16 -07:00
Han Yin	3f913ce440	LLM: stub a local inference engine for faster iteration	2025-10-28 11:39:16 -07:00
Han Yin	3787fbddb0	data: define data models for LLM and system prompts	2025-10-28 11:39:16 -07:00
Han Yin	697d778db7	UI: define theme, color palette, typography and shape	2025-10-28 11:39:16 -07:00
Han Yin	cbe7133742	UI: introduce new dependencies, update versions & references	2025-10-28 11:39:16 -07:00
Han Yin	44a522dbc8	UI: move existing UI src files into `legacy` package	2025-10-28 11:39:16 -07:00
Han Yin	37f3e1c415	Feature: use local llama_context for benchmarking; support context init with custom context size	2025-10-28 11:39:16 -07:00
Han Yin	6d2279e9cd	REWRITE JNI bridge; Update viewmodel	2025-10-28 11:39:16 -07:00
Han Yin	e1bc87610e	Perf: allocate `llama_batch` on stack with `llama_batch_init`	2025-10-28 11:39:16 -07:00
Han Yin	2b52563737	Polish: better logging & documentation	2025-10-28 11:39:16 -07:00
Han Yin	ec502cfde9	Feature: implement infinite conversation via context shifting	2025-10-28 11:39:16 -07:00
Han Yin	4e515727b4	Abort on system prompt too long; Truncate user prompt if too long.	2025-10-28 11:39:16 -07:00
Han Yin	4809112ec5	Polish: adopt common naming; init modularization;	2025-10-28 11:39:16 -07:00
Han Yin	8bf2f4d412	Feature: chat template auto formatting	2025-10-28 11:39:16 -07:00
Han Yin	1b0754c0f5	Perf: optimize performance with ARM features	2025-10-28 11:39:16 -07:00
Han Yin	bb5b824208	Polish: populate backend names in `benchModel`	2025-10-28 11:39:16 -07:00
Han Yin	c14c11dcbd	Feature: decode system and user prompt in batches	2025-10-28 11:39:16 -07:00
Han Yin	02465137ca	Bug fix: null system prompt state update; Safeguard empty user prompt	2025-10-28 11:39:16 -07:00
Han Yin	7bbb53aaf8	Clang-tidy linting: make functions & global variables static	2025-10-28 11:39:16 -07:00
Han Yin	f44882aeeb	Enforce centralized dependency management; bump Gradle & deps versions	2025-10-28 11:39:16 -07:00
Han Yin	0ade7fb4d7	Polish binding: Remove verbose setup JNI APIs; Update state machine states.	2025-10-28 11:39:16 -07:00
Han Yin	7dc9968f82	Restructure `LLamaAndroid.kt`	2025-10-28 11:39:16 -07:00
Han Yin	44720859d6	Rewrite llama-android JNI implementation	2025-10-28 11:39:15 -07:00
Han Yin	d4ab3832cf	Use common sampler	2025-10-28 11:39:15 -07:00
Han Yin	1f255d4bca	Tidy & clean LLamaAndroid binding	2025-10-28 11:39:15 -07:00
Daniel Bevenius	56b4795842	model-conversion : add support for SentenceTransformers (#16387) * model-conversion : add support for SentenceTransformers This commit adds support for models that use SentenceTransformer layers. The motivation for this is that if a converted model includes any of the numbered layers specified in the original model's repository then these changes enable these models to be used and verified. Currently the model-conversion only supports the base model output without any of the additional transformation layers. Usage: Convert the model that also includes the SentenceTransformer layers: ```console (venv) $ export EMBEDDING_MODEL_PATH="~/google/embeddinggemma-300M" (venv) make embedding-convert-model ``` Verify the produced embeddings from the converted model against the original model embeddings: ```console (venv) make embedding-verify-logits-st ``` The original model can be run using SentenceTransformer: ```console (venv) make embedding-run-original-model-st ``` Run the converted model using "SentenceTransformer" layers, which enables pooling and normalization: ```console (venv) make embedding-run-converted-model-st ``` * add model-conversion example requirements * add support for -st flag in embedding model conversion This commit adds support for the -st flag in the embedding model conversion script. This will enable models to be converted using sentence transformers dense layers.	2025-10-09 14:35:22 +02:00

1589 Commits