Discussion about this post

keesh lauria

CUDA’s explicit management of shared memory has always seemed better to me than the situation with CPUs. In CPU programming we still have to think hard about managing what’s in cache, but we don’t have explicit control, so it’s harder to know if what we’re trying to do is actually happening. And a small change in the code or, even worse, in the compiler, can change performance a lot.
