This folder contains code for the following blog posts.
All code was tested on a PC with an NVIDIA RTX 3090 GPU and an AMD Ryzen 5800X CPU.
Kernel version:

    sf@trantor:~/Downloads$ uname -r
    5.4.0-121-generic
Compile and run the CUDA matrix multiplication example:

    nvcc cuda_matmul.cu -lm -o cu_mm.out
    ./cu_mm.out 2048 256 512
On the tested system, the GPU was about 650 times faster than the CPU.
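
A speedup figure like this is the CPU wall-clock time divided by the GPU wall-clock time for the same computation. The snippet below is a minimal sketch of that arithmetic, not the code in `cuda_matmul.cu`: the helpers `matmul`, `time_call`, and `speedup` are hypothetical names introduced here, and the timings passed to `speedup` are made-up placeholders rather than measurements from this repository.

```python
import time


def matmul(a, b):
    """Naive triple-loop matrix product of two lists of lists (CPU reference)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            a_ik = a[i][k]
            for j in range(cols):
                out[i][j] += a_ik * b[k][j]
    return out


def time_call(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run was than the CPU run."""
    return cpu_seconds / gpu_seconds


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    product, cpu_seconds = time_call(matmul, a, b)
    print(product, cpu_seconds)
    # Hypothetical timings for illustration only, not measured values.
    print(speedup(cpu_seconds=1.3, gpu_seconds=0.002))
```

For a fair comparison, both timings should cover the same matrix sizes, and the GPU timing should state whether host-to-device copies are included.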