zkash

@asyncakash

learning to make gpus go brrr | 🦀 | prev: @availproject, @puffer_finance, @class_lambda, topology | alum @iitroorkee

Reposted by zkash

Some unstructured thoughts on what creates abundance mindset..

what’s with the meteoric pump of $zec while the entire market is bleeding red?!


Reposted by zkash

New post in the GPU 𝕻𝖊𝖗𝖋𝖔𝖗𝖒𝖆𝖓𝖈𝖊 Glossary on memory coalescing -- a hardware feature that CUDA programmers need to mind to get anywhere near full memory bandwidth utilization. The article includes a quick µ-benchmark, reproducible with Godbolt. What a tool!

Reposted by zkash

it's insane to me how little attention the llm.q repo has

it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp), quantized LLM training with support for selective AC

it's genuinely the coolest OSS thing I've seen this year (what's crazier is 1 person wrote it!)

Reposted by zkash

We reverse-engineered Flash Attention 4.

Really enjoyed @samsja19’s talk on the challenges of decentralized training (e.g. DiLoCo) under low-bandwidth conditions. Was surprised to learn how much weather can destabilize training 🤯 @PrimeIntellect is doing some wild stuff with decentralized RL! 🚀 Thanks for the…


Reposted by zkash

too much new learning material! we're releasing a few chapters of hard study on post training AI models. it covers all major aspects plus more to come.

- Evaluating Large Language models on benchmarks and custom use cases
- Preference Alignment with DPO
- Fine tuning Vision…

Reposted by zkash

hi! if you’re interested in using or writing mega kernels for AI (one big GPU kernel for an entire model) you should tune in to today’s @GPU_MODE livestream

today in ~3 hours we have the authors of MPK talking about their awesome new compiler for mega kernels!

see you there :)

Reposted by zkash

I was lucky to work in both China and the US LLM labs, and I've been thinking this for a while. The current values of pretraining are indeed different:

US labs be like:
- lots of GPUs and much larger flops run
- Treating stabilities more seriously, and could not tolerate spikes…

I bet OpenAI/xAI is laughing so hard, this result is obvious tbh, they took a permanent architectural debuff in order to save on compute costs.



Qwen is basically the Samsung (smartphone) of llms. They ship nice new models every month.

China saved opensource LLMs, some notable releases from July only

> Kimi K2
> Qwen3 235B-A22B-2507
> Qwen3 Coder 480B-A35B
> Qwen3 235B-A22B-Thinking-2507
> GLM-4.5
> GLM-4.5 Air
> Qwen3 30B-A3B-2507
> Qwen3 30B-A3B-Thinking-2507
> Qwen3 Coder 30B-A3B

US & EU need to do better



imagine trying to “learn to code” in cursor when the tab key is basically god mode 💀

We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.



ai bros really out here teaching each other how to draw assholes 😭

chinese ai labs slaying it 🔥

🚀 Introducing Qwen3-Next-80B-A3B: the FUTURE of efficient LLMs is here!

🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B (esp. @ 32K+ context!)
🔹 Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed &…


Just had the most amazing Transformers (with flash attention) lecture from @danielhanchen — he broke down the guts of Transformers and walked us through the full backprop step-by-step, all by hand. Huge thanks to @TheZachMueller for organizing!


Reposted by zkash

DO NOT buy a gpu to write kernels. use @modal notebooks. take 2 mins out of your day to learn this simple trick and kick off your work without paying a shit ton for electricity or cloud gpu run 24/7


Reposted by zkash

🚨 career update i’ve joined @bulletxyz_ to build the growth engine driving the next million on-chain traders. excited to build a @solana native trading layer that brings CEX performance fully on-chain. more ↓


Reposted by zkash

I get asked the same about terminals all the time. “How will you turn this into a business? What’s the monetization strategy?” The monetization strategy is that my bank account has 3 commas mate.


masterclass thread on gaslighting lmao

Do you seriously want to claim that if the majority of H1B beneficiaries turned out to be white Swedes, that this grievance would have gained traction to any comparable degree on the right?


