Han Yin
7540c2a8b9
nit: refactor data.local package structure
2025-10-28 11:39:17 -07:00
Han Yin
7ed79319e5
GGUF: make GgufMetadata serializable in order to be compatible with Room
2025-10-28 11:39:17 -07:00
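A minimal Kotlin sketch of how a serializable metadata holder can be bridged into Room, assuming a kotlinx.serialization-backed TypeConverter; the fields and converter class are illustrative, not the project's actual code.

```kotlin
import androidx.room.TypeConverter
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json

// Hypothetical shape of the holder; the real fields come from the GGUF extractor.
@Serializable
data class GgufMetadata(
    val architecture: String,
    val parameterCount: Long,
    val quantization: String,
)

// Room cannot persist arbitrary objects, so the serializable holder is stored
// as a JSON string column and rebuilt on read via a TypeConverter.
class GgufMetadataConverter {
    private val json = Json { ignoreUnknownKeys = true }

    @TypeConverter
    fun fromMetadata(metadata: GgufMetadata?): String? =
        metadata?.let { json.encodeToString(it) }

    @TypeConverter
    fun toMetadata(raw: String?): GgufMetadata? =
        raw?.let { json.decodeFromString<GgufMetadata>(it) }
}
```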
Han Yin
8ae0c3d2fa
DB: introduce the Kotlin serialization extension library and plugin; add the Room runtime library
2025-10-28 11:39:17 -07:00
Han Yin
67499727ef
gguf: add GGUF metadata data holder and its corresponding extractor implementation
2025-10-28 11:39:17 -07:00
Han Yin
a9466c0370
navigation: sink model loading state management from AppContent down into ModelLoadingScreen; pass ModelLoadingMetrics to Benchmark and Conversation screens
2025-10-28 11:39:17 -07:00
Han Yin
8a682ff85d
core: throw Exception instead of returning null if model fails to load
2025-10-28 11:39:17 -07:00
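A hedged illustration of the contract change described above; `NativeBridge` and the exception type are placeholders, not the app's real API.

```kotlin
// Illustrative only: a zero handle stands in for whatever "load failed" looks like natively.
class ModelLoadException(path: String, cause: Throwable? = null) :
    Exception("Failed to load model at $path", cause)

object NativeBridge {
    fun loadModel(path: String): Long = 0L // placeholder for the real JNI call
}

fun loadModelOrThrow(path: String): Long {
    val handle = NativeBridge.loadModel(path)
    // Returning null (or 0) silently hid failures; throwing forces callers to
    // surface them, e.g. as an error state on the model loading screen.
    if (handle == 0L) throw ModelLoadException(path)
    return handle
}
```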
Han Yin
f313362ced
nit: polish ModelLoadingScreen UI
2025-10-28 11:39:17 -07:00
Han Yin
1d508f367e
UI: update AppContent to pass in correct navigation callbacks
2025-10-28 11:39:17 -07:00
Han Yin
0d65c4b06b
nit: extract app name into a constant value; remove unused onBackPressed callbacks
2025-10-28 11:39:17 -07:00
Han Yin
9f1d26ac95
UI: migrate ConversationViewModel onto ModelLoadingViewModel; update & refine ConversationScreen
2025-10-28 11:39:17 -07:00
Han Yin
cb508be782
UI: migrate ModelLoadingScreen onto ModelLoadingViewModel; update & refine ModelLoadingScreen
2025-10-28 11:39:17 -07:00
Han Yin
f61c512223
UI: expose a single facade ModelUnloadDialogHandler; move UnloadModelState into ModelUnloadingViewModel.kt
2025-10-28 11:39:17 -07:00
Han Yin
c5a3ac7eb1
UI: Introduce an abstract ViewModel to handle additional model unloading logic
2025-10-28 11:39:17 -07:00
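One plausible shape for such a base class, combining this commit with the UnloadModelState and ModelUnloadingViewModel names from the commit above it; the exact states and methods are assumptions.

```kotlin
import androidx.lifecycle.ViewModel
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow

// Assumed states; the real UnloadModelState may differ.
sealed interface UnloadModelState {
    data object Idle : UnloadModelState
    data object ConfirmationRequested : UnloadModelState
    data object Unloading : UnloadModelState
}

abstract class ModelUnloadingViewModel : ViewModel() {
    private val _unloadState = MutableStateFlow<UnloadModelState>(UnloadModelState.Idle)
    val unloadState: StateFlow<UnloadModelState> = _unloadState.asStateFlow()

    // Screens that hold a loaded model call this from their back handlers.
    fun requestUnloadConfirmation() {
        _unloadState.value = UnloadModelState.ConfirmationRequested
    }

    fun dismissUnloadConfirmation() {
        _unloadState.value = UnloadModelState.Idle
    }

    // Subclasses plug in their own teardown once the engine has unloaded.
    protected open suspend fun onModelUnloaded() {}
}
```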
Han Yin
e1c77c6bbd
LLama: add a new Initializing state; add two extension properties; rename LibraryLoaded state to Initialized
2025-10-28 11:39:17 -07:00
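A hedged sketch of what the state change and the two extension properties could look like; everything beyond the Initializing/Initialized/LibraryLoaded naming is guessed.

```kotlin
sealed interface EngineState {
    data object Initializing : EngineState            // newly added state
    data object Initialized : EngineState             // renamed from LibraryLoaded
    data class ModelLoaded(val modelPath: String) : EngineState
    data class Error(val message: String) : EngineState
}

// Extension properties let callers ask simple questions without exhaustive `when`s.
val EngineState.isReady: Boolean
    get() = this is EngineState.Initialized || this is EngineState.ModelLoaded

val EngineState.isBusy: Boolean
    get() = this is EngineState.Initializing
```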
Han Yin
ba40d689a1
UI: implement BenchmarkScreen's individual back handling
2025-10-28 11:39:17 -07:00
Han Yin
8203ddb97a
UI: refactor back handling by removing centralized BackHandlerSetup and UnloadModelConfirmationDialog from AppContent
2025-10-28 11:39:17 -07:00
Han Yin
c08d02d233
LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages
2025-10-28 11:39:17 -07:00
Han Yin
481ba6e9d3
UI: remove code duplication in sort menu
2025-10-28 11:39:17 -07:00
Han Yin
41615be5ae
UI: fix the typo `totalGb` in `StorageMetrics`
2025-10-28 11:39:17 -07:00
Han Yin
69f2bd62f9
UI: replace ugly optional `as` casts in AppScaffold with extension functions
2025-10-28 11:39:17 -07:00
Han Yin
e269da655f
UI: combine TopBarConfig and BottomBarConfig into each route's ScaffoldConfig
2025-10-28 11:39:17 -07:00
Han Yin
225c5435c5
UI: refactor BottomBarConfig.ModelsManagement APIs
2025-10-28 11:39:17 -07:00
Han Yin
63fc56d603
UI: centralize the AppScaffold and modularize its configs
2025-10-28 11:39:17 -07:00
Han Yin
72e97b93c5
feature: check for available space before copying local model
2025-10-28 11:39:16 -07:00
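A minimal sketch of such a guard, assuming StatFs on the destination directory; the headroom margin and exception type are invented for illustration.

```kotlin
import android.os.StatFs
import java.io.File

class InsufficientStorageException(requiredBytes: Long, availableBytes: Long) :
    Exception("Need $requiredBytes bytes but only $availableBytes are available")

// Call before starting the copy; StatFs reports free space on the target volume.
fun ensureEnoughSpace(
    destinationDir: File,
    modelSizeBytes: Long,
    headroomBytes: Long = 64L * 1024 * 1024, // arbitrary safety margin
) {
    val availableBytes = StatFs(destinationDir.absolutePath).availableBytes
    if (availableBytes < modelSizeBytes + headroomBytes) {
        throw InsufficientStorageException(modelSizeBytes + headroomBytes, availableBytes)
    }
}
```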
Han Yin
65d4a57a8b
LLama: refactor loadModel by splitting the system prompt setting into a separate method
2025-10-28 11:39:16 -07:00
Han Yin
9f77155535
VM: handle the cancellation of ongoing token generation
2025-10-28 11:39:16 -07:00
Han Yin
46859c10f0
LLama: update engine state after handling the cancellation of sendUserPrompt
2025-10-28 11:39:16 -07:00
Han Yin
06448a60a5
UI: update UI for canceling the ongoing model import
2025-10-28 11:39:16 -07:00
Han Yin
9ba74a9d3d
data: allow canceling the ongoing model import
2025-10-28 11:39:16 -07:00
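A sketch of cancellable import plumbing, assuming the copy runs in its own coroutine Job; the class and callback names are illustrative.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch

class ModelImporter(
    private val scope: CoroutineScope = CoroutineScope(Dispatchers.IO),
) {
    private var importJob: Job? = null

    // copyChunk() copies one chunk and returns false once the source is exhausted.
    fun startImport(copyChunk: suspend () -> Boolean) {
        importJob = scope.launch {
            // Checking isActive between chunks is what makes cancellation take effect promptly.
            while (isActive && copyChunk()) { /* keep copying */ }
        }
    }

    fun cancelImport() {
        importJob?.cancel()
        importJob = null
    }
}
```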
Han Yin
d70b8fe323
core: swap in LLamaAndroid and mark stub engine for testing only
2025-10-28 11:39:16 -07:00
Han Yin
c2426a42e5
UI: unify Model Card components
2025-10-28 11:39:16 -07:00
Han Yin
434933f5b3
UI: show model card in Conversation screen
2025-10-28 11:39:16 -07:00
Han Yin
9769467723
UI: show model card in Model Loading screen
2025-10-28 11:39:16 -07:00
Han Yin
9cfa74f754
core: break down InferenceManager per the Interface Segregation Principle
2025-10-28 11:39:16 -07:00
Han Yin
286ed05f13
vm: merge SystemPromptViewModel into ModelLoadingViewModel
2025-10-28 11:39:16 -07:00
Han Yin
23d411d86e
vm: split mono MainViewModel into separate individual ViewModels
2025-10-28 11:39:16 -07:00
Han Yin
32d778bb8e
core: extract conversation and benchmark logic into InferenceManager; add logs and missing state updates in stub InferenceEngine
2025-10-28 11:39:16 -07:00
Han Yin
51b120f464
data: pass through getModelById from ModelDao into ModelRepository
2025-10-28 11:39:16 -07:00
Han Yin
59f5caa699
Util: split FileUtils from ModelUtils; extract copy methods into FileUtils
2025-10-28 11:39:16 -07:00
Han Yin
4913ad0dae
nit: tidy SystemPromptViewModel
2025-10-28 11:39:16 -07:00
Han Yin
2614f91226
UI: replace model selection screen's data stubbing; add empty view
2025-10-28 11:39:16 -07:00
Han Yin
6b48f7473f
UI: extract a shared ModelCard component
2025-10-28 11:39:16 -07:00
Han Yin
0d41e75ca5
UI: add a confirmation step when user picks a file; refactor model import overlay into AlertDialog
2025-10-28 11:39:16 -07:00
Han Yin
1bebd1bb07
util: extract file size formatting into ModelUtils
2025-10-28 11:39:16 -07:00
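A hypothetical formatter of the kind such a util might contain (rounding and unit choices are guesses):

```kotlin
import java.util.Locale

fun formatFileSize(sizeBytes: Long): String {
    val units = listOf("B", "KB", "MB", "GB", "TB")
    var value = sizeBytes.toDouble()
    var unitIndex = 0
    while (value >= 1024 && unitIndex < units.lastIndex) {
        value /= 1024
        unitIndex++
    }
    return String.format(Locale.US, "%.1f %s", value, units[unitIndex])
}
```

Under these assumptions, formatFileSize(4_800_000_000) would render as "4.5 GB".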
Han Yin
561fe0222f
UI: handle back navigation when user is in multi-selection mode
2025-10-28 11:39:16 -07:00
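A small sketch of the likely mechanism, using Compose's BackHandler; the parameter names are illustrative.

```kotlin
import androidx.activity.compose.BackHandler
import androidx.compose.runtime.Composable

// While multi-selection is active, back clears the selection instead of
// popping the screen; once disabled, the default back behavior applies again.
@Composable
fun MultiSelectionBackHandler(
    isInSelectionMode: Boolean,
    onClearSelection: () -> Unit,
) {
    BackHandler(enabled = isInSelectionMode) {
        onClearSelection()
    }
}
```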
Han Yin
2d6b8856f6
UI: implement multiple models deletion; update Models Management screen
2025-10-28 11:39:16 -07:00
Han Yin
025e3d2417
UI: enrich ModelManagementState; extract filename to show correct importing UI
2025-10-28 11:39:16 -07:00
Han Yin
adfbfe3ffb
data: add a util file for extracting file name & size and model metadata
2025-10-28 11:39:16 -07:00
Han Yin
290a6bfebe
bugfix: use List instead of Collection for ModelDao's deletion
2025-10-28 11:39:16 -07:00
Han Yin
5de0b5d6d0
data: import local model with file picker
2025-10-28 11:39:16 -07:00
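An illustrative way to wire the Storage Access Framework picker from Compose; the MIME filter and composable name are assumptions.

```kotlin
import android.net.Uri
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material3.Button
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable

@Composable
fun ImportModelButton(onModelPicked: (Uri) -> Unit) {
    val picker = rememberLauncherForActivityResult(
        contract = ActivityResultContracts.OpenDocument()
    ) { uri: Uri? ->
        uri?.let(onModelPicked) // null means the user dismissed the picker
    }
    // GGUF files usually surface as a generic binary MIME type.
    Button(onClick = { picker.launch(arrayOf("application/octet-stream")) }) {
        Text("Import model")
    }
}
```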
Han Yin
a3ebdac58f
UI: polish sort order menu
2025-10-28 11:39:16 -07:00
Han Yin
760d66c97d
UI: replace Models Management screen's stubbing with instrumentation
2025-10-28 11:39:16 -07:00
Han Yin
bc93c384a7
data: introduce Model entity and DAO; update DI module
2025-10-28 11:39:16 -07:00
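A hedged sketch of what the entity and DAO might look like; the columns and queries are guesses informed by nearby commits (getModelById, list-based deletion, sorting), not the real schema.

```kotlin
import androidx.room.Dao
import androidx.room.Delete
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.OnConflictStrategy
import androidx.room.PrimaryKey
import androidx.room.Query
import kotlinx.coroutines.flow.Flow

@Entity(tableName = "models")
data class ModelEntity(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val name: String,
    val path: String,
    val sizeBytes: Long,
)

@Dao
interface ModelDao {
    @Query("SELECT * FROM models ORDER BY name")
    fun observeAll(): Flow<List<ModelEntity>>

    @Query("SELECT * FROM models WHERE id = :id")
    suspend fun getModelById(id: Long): ModelEntity?

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsert(model: ModelEntity): Long

    @Delete
    suspend fun delete(models: List<ModelEntity>)
}
```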
Han Yin
f5e2edda87
data: [WIP] prepare for ModelRepository refactor & impl
2025-10-28 11:39:16 -07:00
Han Yin
b6cc8f0c01
DI: abstract the protocol of SystemPromptRepository; update AppModule
2025-10-28 11:39:16 -07:00
Han Yin
eebc05b559
UI: polish UI for ModelsManagementScreen; inject ModelsManagementViewModel
2025-10-28 11:39:16 -07:00
Han Yin
6e82bb37d3
Feature: Introduce ModelRepository and ModelsManagementViewModel; update AppModule
2025-10-28 11:39:16 -07:00
Han Yin
aedf442632
DI: Optimize AppModule
2025-10-28 11:39:16 -07:00
Han Yin
d60bba9b8f
UI: navigation with more natural animated transitions
2025-10-28 11:39:16 -07:00
Han Yin
511df35704
bugfix: wait for model to load before navigating to benchmark screen; use NavigationActions instead of raw navController
2025-10-28 11:39:16 -07:00
Han Yin
ea11ee3c94
UI: optimize AppContent's composition
2025-10-28 11:39:16 -07:00
Han Yin
0afd087f35
DI: replace manual DI with Hilt DI
2025-10-28 11:39:16 -07:00
Han Yin
a1f6e7e476
DI: make viewmodels Hilt injectable
2025-10-28 11:39:16 -07:00
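The Hilt side of this typically reduces to two annotations; the repository dependency here is a stand-in, not the app's actual constructor.

```kotlin
import androidx.lifecycle.ViewModel
import dagger.hilt.android.lifecycle.HiltViewModel
import javax.inject.Inject

interface ModelRepository // placeholder dependency

@HiltViewModel
class ModelsManagementViewModel @Inject constructor(
    private val modelRepository: ModelRepository,
) : ViewModel()
```

Composables can then obtain the ViewModel via hiltViewModel() instead of a hand-written factory.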
Han Yin
564b095427
DI: make app Hilt injectable
2025-10-28 11:39:16 -07:00
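Making the app Hilt-injectable usually means annotating a custom Application class and registering it in the manifest; the class name here is invented.

```kotlin
import android.app.Application
import dagger.hilt.android.HiltAndroidApp

@HiltAndroidApp
class LlmApplication : Application()
```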
Han Yin
65741a7e64
DI: introduce Hilt plugin + processor + lib dependencies
2025-10-28 11:39:16 -07:00
Han Yin
af0d68d611
nit: combine temperatureMetrics and useFahrenheit
2025-10-28 11:39:16 -07:00
Han Yin
5e4972e93e
UI: refactor top app bars
2025-10-28 11:39:16 -07:00
Han Yin
2a41c0e354
vm: replace token metrics stubs with actual implementation
2025-10-28 11:39:16 -07:00
Han Yin
e47e3b77ee
UI: lock user in alert dialog when model is unloading
2025-10-28 11:39:16 -07:00
Han Yin
6b341b0fbe
bugfix: handle user quitting during model loading
2025-10-28 11:39:16 -07:00
Han Yin
e8b84c6ebf
UI: code polish
2025-10-28 11:39:16 -07:00
Han Yin
fddf060d92
data: code polish
2025-10-28 11:39:16 -07:00
Han Yin
3b499ac7e4
UI: polish conversation screen
2025-10-28 11:39:16 -07:00
Han Yin
64ebdc67a6
UI: update app name to be more Arm
2025-10-28 11:39:16 -07:00
Han Yin
55681847e9
UI: rename `ModeSelection` to `ModelLoading` for better clarity
2025-10-28 11:39:16 -07:00
Han Yin
75c986afc5
bugfix: properly handle user quitting the conversation screen while tokens are being generated
2025-10-28 11:39:16 -07:00
Han Yin
4848bf93d0
data: introduce repo for System Prompt; flow data from Room to VM
2025-10-28 11:39:16 -07:00
Han Yin
5596d5203b
DB: setup Room database
2025-10-28 11:39:16 -07:00
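A minimal setup sketch, assuming (per the nearby commits) a system-prompt table as the first entity; the schema, DAO, and database name are assumptions.

```kotlin
import android.content.Context
import androidx.room.Dao
import androidx.room.Database
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.OnConflictStrategy
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.Room
import androidx.room.RoomDatabase
import kotlinx.coroutines.flow.Flow

@Entity(tableName = "system_prompts")
data class SystemPromptEntity(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val prompt: String,
)

@Dao
interface SystemPromptDao {
    // Returning a Flow is what lets the repository stream changes to the ViewModel.
    @Query("SELECT * FROM system_prompts LIMIT 1")
    fun observePrompt(): Flow<SystemPromptEntity?>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun save(prompt: SystemPromptEntity)
}

@Database(entities = [SystemPromptEntity::class], version = 1)
abstract class AppDatabase : RoomDatabase() {
    abstract fun systemPromptDao(): SystemPromptDao

    companion object {
        fun build(context: Context): AppDatabase =
            Room.databaseBuilder(context, AppDatabase::class.java, "app.db").build()
    }
}
```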
Han Yin
4046cd16fd
Deps: bump Kotlin plugin; introduce KSP; apply in :app subproject
2025-10-28 11:39:16 -07:00
Han Yin
5868eaa66b
UI: polish system prompt setup UI
2025-10-28 11:39:16 -07:00
Han Yin
a7ee3d305f
UI: split a nested parent settings screen into separate child settings screens
2025-10-28 11:39:16 -07:00
Han Yin
65c09b2b32
UI: allow drawer's gesture control only on Home and Settings screens; enable alert dialog on back navigation inside conversation and benchmark
2025-10-28 11:39:16 -07:00
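A sketch of route-dependent gesture control with Material 3's ModalNavigationDrawer; the route names and drawer content are illustrative.

```kotlin
import androidx.compose.material3.DrawerState
import androidx.compose.material3.ModalDrawerSheet
import androidx.compose.material3.ModalNavigationDrawer
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable

@Composable
fun AppDrawer(
    currentRoute: String?,
    drawerState: DrawerState,
    content: @Composable () -> Unit,
) {
    ModalNavigationDrawer(
        drawerState = drawerState,
        // Swipe-to-open only makes sense on the top-level destinations.
        gesturesEnabled = currentRoute == "home" || currentRoute == "settings",
        drawerContent = { ModalDrawerSheet { Text("Drawer items go here") } },
        content = content,
    )
}
```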
Han Yin
648b97818e
UI: disable triggering drawer via gesture; enable alert dialog on back navigation inside conversation and benchmark
2025-10-28 11:39:16 -07:00
Han Yin
a7ae8b7ce0
[WIP] DI: implement simple local vm factory provider
2025-10-28 11:39:16 -07:00
Han Yin
ca2b7772ce
UI: add a new MainActivity; update manifest
2025-10-28 11:39:16 -07:00
Han Yin
7e5c80cee9
UI: implement core flow's screens
2025-10-28 11:39:16 -07:00
Han Yin
5ad65919e9
util: implement user preferences utility
2025-10-28 11:39:16 -07:00
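A guess at the shape of such a utility, assuming plain SharedPreferences (DataStore would be equally plausible); the key name is an assumption and useFahrenheit is borrowed from a nearby commit.

```kotlin
import android.content.Context

class UserPreferences(context: Context) {
    private val prefs = context.getSharedPreferences("user_prefs", Context.MODE_PRIVATE)

    // Example of a persisted toggle.
    var useFahrenheit: Boolean
        get() = prefs.getBoolean("use_fahrenheit", false)
        set(value) = prefs.edit().putBoolean("use_fahrenheit", value).apply()
}
```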
Han Yin
46bd638c5f
util: implement performance monitor; wrap it with a viewmodel
2025-10-28 11:39:16 -07:00
Han Yin
4dd755e25b
UI: implement basic UI components
2025-10-28 11:39:16 -07:00
Han Yin
32608fb225
UI: app navigation
2025-10-28 11:39:16 -07:00
Han Yin
3f913ce440
LLM: stub a local inference engine for faster iteration
2025-10-28 11:39:16 -07:00
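An illustrative stub of the kind this commit describes: canned tokens streamed through a Flow so UI and ViewModel work does not require a real model. The interface and method names are assumptions (sendUserPrompt is borrowed from a later commit).

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

interface InferenceEngine {
    suspend fun loadModel(path: String)
    fun sendUserPrompt(prompt: String): Flow<String>
}

class StubInferenceEngine : InferenceEngine {
    override suspend fun loadModel(path: String) {
        delay(500) // pretend loading takes a moment
    }

    override fun sendUserPrompt(prompt: String): Flow<String> = flow {
        "This is a canned response from the stub engine."
            .split(" ")
            .forEach { token ->
                delay(50) // simulate per-token latency
                emit("$token ")
            }
    }
}
```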
Han Yin
3787fbddb0
data: define data models for LLM and system prompts
2025-10-28 11:39:16 -07:00
Han Yin
697d778db7
UI: define theme, color palette, typography and shape
2025-10-28 11:39:16 -07:00
Han Yin
cbe7133742
UI: introduce new dependencies, update versions & references
2025-10-28 11:39:16 -07:00
Han Yin
44a522dbc8
UI: move existing UI src files into `legacy` package
2025-10-28 11:39:16 -07:00
Han Yin
37f3e1c415
Feature: use local llama_context for benchmarking; support context init with custom context size
2025-10-28 11:39:16 -07:00
Han Yin
6d2279e9cd
REWRITE JNI bridge; Update viewmodel
2025-10-28 11:39:16 -07:00
Han Yin
e1bc87610e
Perf: allocate `llama_batch` on stack with `llama_batch_init`
2025-10-28 11:39:16 -07:00
Han Yin
2b52563737
Polish: better logging & documentation
2025-10-28 11:39:16 -07:00
Han Yin
ec502cfde9
Feature: implement infinite conversation via context shifting
2025-10-28 11:39:16 -07:00
Han Yin
4e515727b4
Abort on system prompt too long; Truncate user prompt if too long.
2025-10-28 11:39:16 -07:00
Han Yin
4809112ec5
Polish: adopt common naming; init modularization
2025-10-28 11:39:16 -07:00
Han Yin
8bf2f4d412
Feature: chat template auto formatting
2025-10-28 11:39:16 -07:00
Han Yin
1b0754c0f5
Perf: optimize performance with ARM features
2025-10-28 11:39:16 -07:00
Han Yin
bb5b824208
Polish: populate backend names in `benchModel`
2025-10-28 11:39:16 -07:00
Han Yin
c14c11dcbd
Feature: decode system and user prompt in batches
2025-10-28 11:39:16 -07:00
Han Yin
02465137ca
Bug fix: null system prompt state update; Safeguard empty user prompt
2025-10-28 11:39:16 -07:00
Han Yin
7bbb53aaf8
Clang-tidy linting: make functions & global variables static
2025-10-28 11:39:16 -07:00
Han Yin
f44882aeeb
Enforce centralized dependency management; bump Gradle & deps versions
2025-10-28 11:39:16 -07:00
Han Yin
0ade7fb4d7
Polish binding: Remove verbose setup JNI APIs; Update state machine states.
2025-10-28 11:39:16 -07:00
Han Yin
7dc9968f82
Restructure `LLamaAndroid.kt`
2025-10-28 11:39:16 -07:00
Han Yin
44720859d6
Rewrite llama-android JNI implementation
2025-10-28 11:39:15 -07:00
Han Yin
d4ab3832cf
Use common sampler
2025-10-28 11:39:15 -07:00
Han Yin
1f255d4bca
Tidy & clean LLamaAndroid binding
2025-10-28 11:39:15 -07:00
Georgi Gerganov
745aa5319b
llama : deprecate llama_kv_self_ API ( #14030 )
* llama : deprecate llama_kv_self_ API
ggml-ci
* llama : allow llama_memory_(nullptr)
ggml-ci
* memory : add flag for optional data clear in llama_memory_clear
ggml-ci
2025-06-06 14:11:15 +03:00
Xuan-Son Nguyen
bd3f59f812
cmake : enable curl by default ( #12761 )
* cmake : enable curl by default
* no curl if no examples
* fix build
* fix build-linux-cross
* add windows-setup-curl
* fix
* shell
* fix path
* fix windows-latest-cmake*
* run: include_directories
* LLAMA_RUN_EXTRA_LIBS
* sycl: no llama_curl
* no test-arg-parser on windows
* clarification
* try riscv64 / arm64
* windows: include libcurl inside release binary
* add msg
* fix mac / ios / android build
* will this fix xcode?
* try clearing the cache
* add bunch of licenses
* revert clear cache
* fix xcode
* fix xcode (2)
* fix typo
2025-04-07 13:35:19 +02:00
Georgi Gerganov
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context ( #12181 )
* llama : refactor llama_context, llama_kv_cache, llm_build_context
ggml-ci
* graph : don't mutate the KV cache during defrag
ggml-ci
* context : reduce virtuals + remove test function
ggml-ci
* context : move interface implementation to source file + factory
ggml-ci
* graph : move KV cache build functions to llama_context impl
ggml-ci
* graph : remove model reference from build_pooling
ggml-ci
* graph : remove llama_model reference
ggml-ci
* kv_cache : provide rope factors
ggml-ci
* graph : rework inputs to use only unique_ptr, remove attn input abstraction
ggml-ci
* context : remove llama_context_i abstraction
ggml-ci
* context : clean-up
ggml-ci
* graph : clean-up
ggml-ci
* llama : remove redundant keywords (struct, enum)
ggml-ci
* model : adapt gemma3
ggml-ci
* graph : restore same attention ops as on master
ggml-ci
* llama : remove TODO + fix indent
ggml-ci
2025-03-13 12:35:44 +02:00
Han Yin
57b6abf85a
android : fix KV cache log message condition ( #12212 )
2025-03-06 08:22:49 +02:00
Georgi Gerganov
68ff663a04
repo : update links to new url ( #11886 )
* repo : update links to new url
ggml-ci
* cont : more urls
ggml-ci
2025-02-15 16:40:57 +02:00
codezjx
3edfa7d375
llama.android: add field formatChat to control whether to parse special tokens when send message ( #11270 )
2025-01-17 14:57:56 +02:00
Georgi Gerganov
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming ( #11110 )
* llama : functions -> methods (#11110 )
* llama : add struct llama_vocab to the API (#11156 )
ggml-ci
* hparams : move vocab params to llama_vocab (#11159 )
ggml-ci
* vocab : more pimpl (#11165 )
ggml-ci
* vocab : minor tokenization optimizations (#11160 )
ggml-ci
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* lora : update API names (#11167 )
ggml-ci
* llama : update API names to use correct prefix (#11174 )
* llama : update API names to use correct prefix
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* minor [no ci]
* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174 )
ggml-ci
* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174 )
ggml-ci
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-12 11:32:42 +02:00
ag2s20150909
c250ecb315
android : fix llama_batch free ( #11014 )
2024-12-30 14:35:13 +02:00
Diego Devesa
9177484f58
ggml : fix arm build ( #10890 )
* ggml: GGML_NATIVE uses -mcpu=native on ARM
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* ggml: Show detected features with GGML_NATIVE
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* remove msvc support, add GGML_CPU_ARM_ARCH option
* disable llamafile in android example
* march -> mcpu, skip adding feature macros
ggml-ci
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Co-authored-by: Adrien Gallouët <angt@huggingface.co>
2024-12-18 23:21:42 +01:00
Xuan Son Nguyen
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch ( #9745 )
* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
2024-10-18 23:18:01 +02:00
Diego Devesa
7eee341bee
common : use common_ prefix for common library functions ( #9805 )
* common : use common_ prefix for common library functions
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-10 22:57:42 +02:00
Diego Devesa
c81f3bbb05
cmake : do not build common library by default when standalone ( #9804 )
2024-10-09 18:49:52 +02:00
slaren
5fb5e24811
llama : minor sampling refactor (2) ( #9386 )
2024-09-09 17:10:46 +02:00
Georgi Gerganov
a5b5d9a101
llama.android : fix build ( #9350 )
2024-09-08 00:33:50 +03:00
Georgi Gerganov
df270ef745
llama : refactor sampling v2 ( #9294 )
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
2024-09-07 15:16:19 +03:00
devojony
b7c11d36e6
examples: fix android example cannot be generated continuously ( #8621 )
When generation ends `completion_loop()` should return a NULL, not the empty string
2024-07-22 09:54:42 +03:00
Raj Hammeer Singh Hada
387952651a
Delete examples/llama.android/llama/CMakeLists.txt ( #8165 )
* Delete examples/llama.android/llama/CMakeLists.txt
https://github.com/ggerganov/llama.cpp/pull/8145#issuecomment-2194534244
This file is not being used for building on Android. `llama.cpp/examples/llama.android/llama/src/main/cpp/CMakeLists.txt` is being used instead.
* Update CMakeLists.txt
Pick local llama.cpp files instead of fetching content from git
2024-06-27 16:39:29 +02:00
Raj Hammeer Singh Hada
ac146628e4
Fix llama-android.cpp for error - "common/common.h not found" ( #8145 )
- Path seems to be wrong for the common.h header file in llama-android.cpp file. Fixing the path so the Android Build doesn't fail with the error "There is no file common/common.h"
2024-06-27 03:57:57 +02:00
Elton Kola
9791f40258
android : module ( #7502 )
* move ndk code to a new library
* add gradle file
2024-05-25 11:11:33 +03:00
Georgi Gerganov
854d365aba
cmake : update android comments ( #7341 )
2024-05-19 11:01:01 +03:00
Georgi Gerganov
511182eabb
android : use "ci-android" branch for CI ( #7341 )
* android : use "ci-android" branch for CI
* ggml : disable SIMD exp and silu for 32-bit ARM
ggml-ci
* android : do not fetch, use add_subdirectory instead
* cmake : provide binary dir
2024-05-18 20:40:39 +10:00
Brian
1265c670fd
Revert "move ndk code to a new library ( #6951 )" ( #7282 )
This reverts commit efc8f767c8 .
2024-05-14 16:10:39 +03:00
Elton Kola
efc8f767c8
move ndk code to a new library ( #6951 )
2024-05-14 17:30:30 +10:00
Pedro Cuenca
b97bc3966e
llama : support Llama 3 HF conversion ( #6745 )
* Support Llama 3 conversion
The tokenizer is BPE.
* style
* Accept suggestion
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* llama : add llama_token_is_eog()
ggml-ci
* llama : auto-detect more EOT tokens when missing in KV data
* convert : replacing EOS token is a hack
* llama : fix codegemma EOT token + add TODOs
* llama : fix model type string for 8B model
---------
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-21 14:50:41 +03:00
Dean
7ab7b733bb
android : fix utf8 decoding error ( #5935 )
* examples: fix utf8 decoding error
some models have a tokenizer that decodes an id into an incomplete utf8 sequence, need to validate and wait for next token
one example would be: https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat-GGUF/resolve/main/qwen1_5-1_8b-chat-q4_0.gguf and an example of the token is 18137
* android : minor
---------
Co-authored-by: zhangfuwen <zhangfuwen@foxmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-10 22:03:17 +02:00
Radosław Gryta
abbabc5e51
ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility ( #5711 )
* [ggml-quants] Provide ggml_vqtbl1q_u8 for 64bit compatibility
vqtbl1q_u8 is not part of arm v7 neon library
* [android-example] Remove abi filter after arm v7a fix
* [github-workflows] Do not skip Android armeabi-v7a build
2024-02-25 20:43:00 +02:00
bmwl
f486f6e1e5
ggml : add numa options ( #5377 )
* Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h
* Reverted Makefile
* Fixed include
* Removed sched.h from ggml.h, moved ggml_get_numa_affinity into ggml.c, removed trailing whitespace and fixed up a few inconsistent variables
* removed trailing whitespace
* Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h
* Reverting Makefile
* Fixed a number of issues with the move from BOOL to ggml_numa_strategies. Added a note about mirror mode not being implemented yet
* Removing MIRROR_MODE code for this PR
* Removing last bit of MIRROR_MODE code for this PR
* Removing unneeded branch in server.cpp example and moving get_numa_affinity and making it static
* Fixed lingering init_llama_backend() bool calls in tests and examples
* Remove enum llama_numa_strategies
* Revert bad merge with dynatemp flags
* add missing enum ggml_numa_strategies declaration and revert sync problem with master
* add missing enum ggml_numa_strategies declaration
* fixed ggml_init_numa variable
* Update ggml.h
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* Update READMEs with info about numa flags, change INTERLEAVE strategy name to DISTRIBUTE everywhere, implement the improved distribution strategy from @rankaiyx, fix a spelling mistake and un-merge some bad merges
* split numa init out from llama_backend_init and created llama_numa_init. Updated all code paths and samples
* Fix up some boolean vs enum comparisons
* Added #ifdefs for non-Linux OS that don't have cpu_set_t datatype
* Update ggml.h
Align enum values
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update ggml.c
Remove whitespace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update ggml.c
align parameters
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/server/server.cpp
remove whitespace and align brace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update common/common.cpp
Remove whitespace and align brace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* unified ggml_numa_strategy enum and fixed text alignment in server.cpp example
* Update ggml.c
simplified return for platforms without NUMA support
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* removed redundant else from cli argument processing of --numa
* whitespace
---------
Co-authored-by: root <root@nenya.lothlorien.ca>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-02-16 11:31:07 +02:00
Valentin Konovalov
256d1bb0dd
android : use release cmake build type by default ( #5123 )
2024-01-25 19:05:51 +02:00
Neuman Vong
862f5e41ab
android : introduce starter project example ( #4926 )
* Introduce starter project for Android
Based on examples/llama.swiftui.
* Add github workflow
* Set NDK version
* Only build arm64-v8a in CI
* Sync bench code
* Rename CI prop to skip-armeabi-v7a
* Remove unused tests
2024-01-16 15:47:34 +02:00