Nicholas Wilt
@CUDAHandbook
Nicholas Wilt was on the inception team for CUDA, wrote The CUDA Handbook, and writes at https://parallelprogrammer.substack.com
Violating copyright by screenshotting paywalled content is not “the lord’s work.” I hope all 77k people who enjoyed this thread have subscribed to the Substack.
Interval training is your friend. Go to a 400m track. Warm up, then run a complete circuit around the track As Fast As You Can(tm), then walk/jog around the track to your starting point. Repeat 4-6x. Intervals are hands-down the most time-efficient way to build cardio.
engineers & founders, please share your advice for getting fit and staying fit while spending 10-12hr/day working on the computer.
lol The HIP ecosystem, such as it is, would like a word. @SpectralCom does it better though—no need for intermediate source files.
This isn't why. Trying to "compile" CUDA for AMD is nonsense; NVIDIA loves when people try. CUDA will never be fast on AMD (how do you compile if the shared memory / tensor cores are a different size?). It's the wrong layer to do this at.
Great thread. CPU overhead has always been a point of emphasis for CUDA. GPUs are too big and expensive to let them just sit there, starving. Asynchronous operation is important.
Never block the GPU! In a new @modal blogpost, we walk through a major class of inefficiency in AI inference: host overhead. We include three cases where we worked with @sgl_project to cut host overhead and prevent GPU stalls. Every microsecond counts. modal.com/blog/host-over…
Nancy Kress won the Hugo with a novella on this topic, Beggars In Spain. There’s a twist in the premise that I won’t spoil here, but yes, this work anticipated eugenics for the ultra wealthy. @DavidBrin debuted TASAT (There’s A Story About That) in 2017. david-brin.medium.com/can-science-fi…
A visit to this booth is like time travel to the future.
Most people come to Booth #6552 for a free Scaley plushie. Some come to see the same, untouched CUDA code running on both AMD and NVIDIA GPUs. We don't judge your priorities. Just come say hi. #SC25 #HPC #HardwareFreedom #CUDA #AMD #NVIDIA @Supercomputing
tbh I am surprised it took this long
✨ The great flip: Today, over 85% of the TOP100 HPC systems use GPUs, not just CPUs, turning the 2019 landscape upside down when CPU-only systems made up 70%. ⚡ NVIDIA now powers 78% of the @top500supercomp list. With 388 systems -- 218 GPU-accelerated systems and 362 systems…
All true, and the server chassis and the infrastructure they plug into are all tightly codesigned, so it’s not as if you can field upgrade A100 machines to hold H100s. You wring every useful clock cycle you can out of these machines.
This is the lifecycle of GPUs. Older GPUs don't suddenly become obsolete and worthless when new models are released. Bleeding-edge chips are used for training, while prior generations are repurposed for inference as newer GPUs take over training tasks. There is also typically…
Not to mention SIMD, which has been on x86 since 1998.
And this is what many benchmarks fail to understand. You cannot compare an O(n³) nested loop that anyone can write to highly optimized BLAS libraries. A terrible algorithm will lead to terrible results regardless of the language.
return 0==a;
This thread is timely—the first refactor I did in this limit order book implementation I posted about today was replace the POSIX calls with mmap().
If your code to process a 10GB file looks like read(fd, buf, ...) in a loop, you're wasting memory and killing performance. There's a better way. mmap() lets you treat a 100GB file as if it's just a giant array in memory, even with only a few MB of RAM. Here's why it's a…
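A minimal sketch of the mmap() approach the post describes, assuming a POSIX system. The helper name `count_newlines_mmap` and the newline-counting task are hypothetical, chosen only to show the file being scanned like an array. The kernel pages data in on demand, so the scan never needs a read() loop or a user-space buffer sized to the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n' bytes in a file by mapping it read-only
   and scanning the mapping like an ordinary array. Returns -1 on error. */
long count_newlines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (data == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] == '\n')
            count++;
    }

    munmap((void *)data, len);
    return count;
}
```

Calling `count_newlines_mmap("some_file.txt")` returns the number of newline bytes in the file. Whether this beats a tuned read() loop depends on the access pattern and the platform, so it is worth measuring rather than assuming.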