* Remove make dependency
* Added option to specify Ninja generator
* use ninja-build as default for several CI
* Revert "use ninja-build as default for several CI"
This reverts commit f552c4559b.
* changed use plain string rather than arrays
* Enabled ninja build by default for experimentation
* ci: add run.sh to test conditions to trigger GitHub CI and self-hosted runners
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Enabled ninja build by default on self-hosted envs for experimentation
* ci: revert generator to ninja instead of ninja multi-config
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ci: install ninja-build for self-hosted workflows
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ci: revert ninja from self-hosted runners
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ci: missed one self-hosted step
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* ci: fix windows ci errors from an errenous revert
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* Added explicit build types for Ninja
Also reverted some needless change
* ci: use ninja multi-config for vulkan-x64 build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
* added time command to measure build time
* Keeping some configs to use Ninja which show improvement
* minor fix based on review
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
* ci: rm `time` from custom containers
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
* common : add standard Hugging Face cache support
- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Check with the quant tag
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Cleanup
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Improve error handling and report API errors
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Restore common_cached_model_info and align mmproj filtering
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Prefer main when getting cached ref
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Use cached files when HF API fails
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Use final_path..
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Check all inputs
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* hex-dma: make chained dma the default to handle newer models
This also includes some new instrumentation that we can remove later.
* hexagon: add uint32 dump helper
* hexagon: use single-page VTCM allocation to avoid issues with large gather ops in ssm-conv
ssm-conv uses HVX gather instruction and that instruction cannot handle cases where the base+offset
spans page boundaries.
* hexagon: update ssm-conv to make base-addr compute a bit easier to read
* hex-dma: use 1d mode for reshaping, it supports sizes up to 24-bits (>16MB)
* hex-bin: fix incorrect stride logic
* hexagon: make sure repack buffs are dumped for verbose > 2
* hex-bin: consistently use dma_queue_push even for dummy dst transactions
* hex-dma: start using 2d-wide mode on v75 and up
The removes the need to deal with the 16-bit limitaion for the strides.
* hex-bin: cleanup kernel selection logic
* hex-bin: cleanup binary op core and fix transposed tensor handling
* snapdragon: update run-bench to use larger ubatch and fa-on