- Use bfloat16 dtype for the UNet on Blackwell GPUs (compute capability major >= 12), which have native bf16 tensor core support
- Skip `manual_cast` for bfloat16 weights to avoid unnecessary casting
- Fix a numpy `TypeError` with bfloat16 tensors in `patch.py` and `ip_adapter.py` by converting to float32 before `.numpy()` calls

Tested on an RTX 5070 (sm_120, CUDA 12.8) with PyTorch nightly (cu128). Generates images at ~3.2 it/s, including Image Prompt (IP-Adapter) mode.

Fixes #3862, #4123, #4141
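The two core changes can be sketched roughly as below. `unet_dtype_for_device` and `tensor_to_numpy` are hypothetical helper names for illustration, not the actual functions in `patch.py` / `ip_adapter.py`; the capability threshold follows the PR's "compute major >= 12" rule.

```python
import numpy as np
import torch

def unet_dtype_for_device(device: torch.device) -> torch.dtype:
    """Hypothetical helper: prefer bfloat16 on GPUs with native bf16
    tensor cores (Blackwell, compute capability major >= 12)."""
    if device.type == "cuda":
        major, _minor = torch.cuda.get_device_capability(device)
        if major >= 12:  # sm_120 and newer
            return torch.bfloat16
    return torch.float16  # fallback for older GPUs; CPU paths differ

def tensor_to_numpy(t: torch.Tensor) -> np.ndarray:
    """NumPy has no bfloat16 dtype, so calling .numpy() on a bf16 tensor
    raises TypeError; upcast to float32 first."""
    if t.dtype == torch.bfloat16:
        t = t.to(torch.float32)
    return t.detach().cpu().numpy()
```

With this guard, `tensor_to_numpy(torch.ones(4, dtype=torch.bfloat16))` returns a float32 array instead of raising.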