Commit Graph

7 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| bssrdf | be25be8ed3 | WIP: debugging tensor core kernel | 2025-10-24 14:24:26 -04:00 |
| bssrdf | 3f99818925 | unroll some loops | 2025-10-15 12:46:46 -04:00 |
| bssrdf | b70cca2ea3 | add support for both NCHW and NHWC layouts | 2025-10-14 14:24:35 -04:00 |
| bssrdf | 2237722056 | added block variants; to be debugged | 2025-10-14 11:02:10 -04:00 |
| bssrdf | c6255442bb | minor updates | 2025-10-08 13:38:16 -04:00 |
| bssrdf | 53a2ccbe12 | minor update and add direct conv in benchmarking | 2025-09-24 21:48:20 -04:00 |
| bssrdf | 83a3b7d6a9 | Refactor conv2d_implicit_kernel for improved bitwise operations; add test for implicit convolution | 2025-09-06 17:26:19 -04:00 |