This folder contains code for the following blog posts.
All code was tested on a PC with an NVIDIA RTX 3090 GPU and an AMD Ryzen 5800X CPU.
Kernel version:
sf@trantor:~/Downloads$ uname -r
5.4.0-121-generic
Compile and run the CUDA matrix multiplication example:
nvcc cuda_matmul.cu -lm -o cu_mm.out
./cu_mm.out 2048 256 512
On the tested system, the GPU was about 650 times faster than the CPU.