6 Commits

Author SHA1 Message Date
Nikhil Dev Goyal 259b757aef Use Lookup8 and detail::IsFull(d) in FastSigmoid
Fix targeted for scalable architectures

PiperOrigin-RevId: 888633434
2026-03-24 06:36:55 -07:00
Nikhil Dev Goyal 90f3de7f15 Use parallel blend chain path in FastSigmoid on architectures having >=32 registers
PiperOrigin-RevId: 886178215
2026-03-19 07:54:05 -07:00
Nikhil Dev Goyal 50144738f1 Change calculation from (ax+b)/(cx+d) to (x+b')/(c'x+d'). This replaces a MulAdd with an Add, reducing port contention on modern CPUs and thus increasing throughput.
It also removes the need for one register to hold b as 1.0 here.

PiperOrigin-RevId: 886170146
2026-03-19 07:36:52 -07:00
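
The rewrite described in 50144738f1 can be illustrated with a scalar sketch. This is not the code from the commit: the struct names, function names, and coefficient layout below are hypothetical. The sketch assumes a != 0 and that the primed coefficients are the originals divided by a (b' = b/a, c' = c/a, d' = d/a), which leaves the quotient unchanged while turning the numerator into a plain addition.

```cpp
#include <cmath>

// Hypothetical coefficient holders, for illustration only.
struct RationalCoeffs {
  double a, b, c, d;  // (a*x + b) / (c*x + d)
};
struct NormalizedCoeffs {
  double b1, c1, d1;  // (x + b1) / (c1*x + d1)
};

// Divides numerator and denominator by a. Assumes a != 0.
inline NormalizedCoeffs Normalize(const RationalCoeffs& r) {
  return NormalizedCoeffs{r.b / r.a, r.c / r.a, r.d / r.a};
}

// Original form: one multiply-add in the numerator, one in the denominator.
inline double EvalOriginal(const RationalCoeffs& r, double x) {
  return std::fma(r.a, x, r.b) / std::fma(r.c, x, r.d);
}

// Normalized form: the numerator needs only an add, and one fewer
// constant has to be kept around.
inline double EvalNormalized(const NormalizedCoeffs& n, double x) {
  return (x + n.b1) / std::fma(n.c1, x, n.d1);
}

// Convenience overload that normalizes on the fly.
inline double EvalNormalized(const RationalCoeffs& r, double x) {
  return EvalNormalized(Normalize(r), x);
}
```

In practice the normalized coefficients would be computed once, ahead of time, so the per-element work is one add, one multiply-add, and one divide.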
Nikhil Dev Goyal 5081341200 Use CappedTag to prevent potential out of bound reads.
PiperOrigin-RevId: 879141747
2026-03-05 10:40:52 -08:00
Nikhil Dev Goyal 6721dddf38 Implement FastSigmoid.
PiperOrigin-RevId: 878453196
2026-03-04 06:12:33 -08:00
Nikhil Dev Goyal dd268ddbe8 Add FastGelu activation function in a newly created fast_ops-inl.h file.
This replaces the Tanh call with a FastTanh call in the Gelu function written in math-inl.h.

PiperOrigin-RevId: 876339830
2026-02-27 11:14:47 -08:00
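
As a rough scalar illustration of the idea in dd268ddbe8, the sketch below evaluates the widely used tanh-based GELU formula, 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))), once with std::tanh and once with a cheaper approximation. All names here are hypothetical, and it is an assumption that the library's Gelu uses these exact constants. ApproxTanh is a clamped Padé approximant chosen as a stand-in; it is not the library's FastTanh.

```cpp
#include <algorithm>
#include <cmath>

// Wrapper so the exact tanh can be passed as a plain function pointer.
inline double ExactTanh(double x) { return std::tanh(x); }

// Stand-in for a fast tanh (NOT the library's FastTanh): the Pade
// approximant x*(27 + x^2)/(27 + 9*x^2), with the input clamped to
// [-3, 3], where the approximant reaches exactly +/-1.
inline double ApproxTanh(double x) {
  const double t = std::max(-3.0, std::min(3.0, x));
  const double t2 = t * t;
  return t * (27.0 + t2) / (27.0 + 9.0 * t2);
}

// Tanh-based GELU, parameterized on the tanh implementation.
inline double GeluWith(double x, double (*tanh_fn)(double)) {
  const double kSqrt2OverPi = 0.7978845608028654;  // sqrt(2/pi)
  const double inner = kSqrt2OverPi * (x + 0.044715 * x * x * x);
  return 0.5 * x * (1.0 + tanh_fn(inner));
}

inline double Gelu(double x) { return GeluWith(x, ExactTanh); }
inline double FastGeluSketch(double x) { return GeluWith(x, ApproxTanh); }
```

Only the tanh call differs between the two entry points, which is the same kind of substitution the commit message describes; the accuracy of the fast variant depends entirely on the approximation chosen.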