| 58af95eb25 | Remove cudaDeviceReset calls from tests | 2024-04-21 22:47:12 +02:00 |
| bdbb3f978e | Fix matmul and max reduce memcheck errors | 2024-04-21 22:11:02 +02:00 |
| 18522c2dea | Cleanup and refactor | 2024-04-11 22:52:41 +02:00 |
| 4b9d123e94 | Implement device vector utils | 2024-04-11 22:22:33 +02:00 |
| 710a33bdde | Move softmax partial kernels to matmul | 2024-04-11 22:01:47 +02:00 |
| bf7c961b9e | Add cudaDeviceReset at the end of each test | 2024-04-11 19:55:02 +02:00 |
| b49dddf34a | Improve softmax numerical stability | 2024-04-08 23:25:46 +02:00 |
| 0c22fac64e | Add toplevel CUDANet namespace | 2024-03-17 16:08:53 +01:00 |
| 77004c16be | Use shared memory for mat vec mul kernel | 2024-03-13 22:13:11 +01:00 |