
Ramsha Khan

@geekyramsha

Curious mind into ML/DL | Mathematics | Exploring RL

Ramsha Khan reposted

sheesh mehel na mujhko suhaye tujh sang sukhi royi bhaaye (a glass palace doesn't please me; with you, even dry bread tastes sweet)


Fr, Optimus Prime core!

everyone sees Batman in themselves, but i see Optimus Prime in me.



The higher the self-information of your tweet, the more engagement it’s likely to pull. Getting curious? Wrote a blog on Probability and Information Theory while revising - and no, you won't get bored. Give it a quick read, tell me what surprised you :) shorturl.at/uaisU
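If you want the one-liner version of "self-information": rarer outcomes carry more bits. A tiny sketch of the idea (the probabilities here are made up purely for illustration):

```python
import math

def self_information(p: float) -> float:
    """Self-information in bits: I(x) = -log2 p(x).
    Rarer events (smaller p) are more surprising and carry more bits."""
    return -math.log2(p)

# Toy numbers, purely illustrative:
print(self_information(0.5))    # 1.00 bit   -- a tweet half the timeline could have written
print(self_information(0.001))  # ~9.97 bits -- a genuinely unexpected take
```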


Ramsha Khan reposted

🚀 Mixed Precision Training is one of those superhero upgrades that's completely redefined deep learning. It's how billion-parameter models fit on your GPU and train in a flash—all while keeping that high-fidelity accuracy 👀 For a 1 Billion parameter model in pure fp32:…

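For context, some back-of-the-envelope numbers of my own (not from the quoted thread): with plain fp32 Adam you pay roughly 16 bytes per parameter for weights, gradients, and optimizer state before counting activations. A quick sketch of that arithmetic for a 1B-parameter model:

```python
# Illustrative memory math for a 1B-parameter model (rule-of-thumb numbers only).
# Pure fp32 Adam: fp32 weights + fp32 grads + fp32 Adam m,v  -> ~16 bytes/param.
# Typical mixed-precision recipes keep fp32 master weights and Adam states but
# use fp16/bf16 weights, grads, and activations -- the speedup comes from
# tensor cores, and the big memory win is mostly in activations.
GB = 1024 ** 3
params = 1_000_000_000

def mem_gb(bytes_per_param: float) -> float:
    return params * bytes_per_param / GB

print(f"fp32 weights only:        {mem_gb(4):5.1f} GB")
print(f"fp32 weights+grads+Adam:  {mem_gb(16):5.1f} GB")
print(f"fp16 copy of weights:     {mem_gb(2):5.1f} GB")
print(f"fp16 gradients:           {mem_gb(2):5.1f} GB")
```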

the only question i liked solving in yesterday's exam was: explain in detail why recall is 100% for a bloom filter? (super easy, and i hadn't seen or solved questions like this in any prev year papers while studying)
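The short answer, with a minimal toy implementation of my own (not from the exam): a Bloom filter only ever sets bits when you add a key and never clears them, so anything you inserted can never test negative. Zero false negatives means recall is always 100%; the price is false positives, i.e. precision can drop.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hashes set bits on add(); a lookup is positive only
    if all k bits are set. Bits are never cleared, so anything added can never
    be reported absent -> zero false negatives -> recall = 100%."""

    def __init__(self, size: int = 1024, k: int = 3):
        self.size, self.k = size, k
        self.bits = [False] * size

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for word in ["bloom", "filter", "recall"]:
    bf.add(word)

assert all(w in bf for w in ["bloom", "filter", "recall"])  # always True: recall = 100%
# Precision is what can suffer: an unrelated key may collide on all k bits and
# test positive even though it was never added (a false positive).
```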


Ramsha Khan reposted

I think everyone in life wants to achieve high recall and high precision. (few will understand this)


Ramsha Khan reposted

living somewhere in japan or switzerland can literally fix me


Currently have a lot on my plate and soo many "what-ifs" in my head. Just trying to stay calm!

How do you deal with burnout?



ML models are basically trying to be "less surprised" by the world. Getting curious? Wrote a blog on Probability and Information Theory while revising - and no, you won't get bored. Give it a quick read, tell me what surprised you :) shorturl.at/uaisU
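In loss-function terms (my own sketch, with made-up probabilities): "less surprised" just means a lower average negative log-likelihood, which is exactly the cross-entropy most models are trained to minimize.

```python
import math

def surprise(p: float) -> float:
    """Self-information in nats of the outcome that actually happened."""
    return -math.log(p)

# Probabilities the model assigned to the true labels (illustrative numbers).
p_true = [0.9, 0.7, 0.2, 0.95]

# Average surprise = negative log-likelihood = cross-entropy loss.
nll = sum(surprise(p) for p in p_true) / len(p_true)
print(f"average surprise: {nll:.3f} nats")  # training pushes this number down
```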


Ramsha Khan reposted

writing is still my favourite thing


This is basically a poetic description of "bias-variance tradeoff".

We chase the sweet spot, not "nicely consistent but consistently wrong", not "widely uncertain, endlessly flung". A reed in the gale just unyielding yet bowed, not rigid paths that crack, nor wild winds that confound.
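And here's the non-poetic version, a small numeric sketch of my own: fit polynomials of increasing degree to small noisy samples of sin(x). A too-simple model is "consistently wrong" (bias), a too-flexible one is "endlessly flung" (variance), and the sweet spot sits in between.

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_sq_error(degree: int, trials: int = 200, n: int = 20) -> float:
    """Average squared error at x = 0.5 of a degree-`degree` polynomial fit
    to noisy samples of sin(x). Low degree -> bias; high degree -> variance."""
    errs = []
    for _ in range(trials):
        x = rng.uniform(0.0, np.pi, n)
        y = np.sin(x) + rng.normal(0.0, 0.3, n)
        coeffs = np.polyfit(x, y, degree)
        pred = np.polyval(coeffs, 0.5)
        errs.append((pred - np.sin(0.5)) ** 2)
    return float(np.mean(errs))

for d in (1, 4, 9):
    print(f"degree {d}: mean squared error {avg_sq_error(d):.4f}")
```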



Ramsha Khan reposted

basically the hottest thing u can do is be kind


And in the end, it's तुम आणि तुमचा नशीब (you and your fate)!


Whoops! I almost convinced myself with Sea AI lab's results: the experiments were right in front of me, and it made perfect sense why switching to FP16 (which gives 10 mantissa bits of precision) eliminates the training-inference mismatch, BUT on A100 (Ampere arch).

I was puzzled by why their paper claims "bfloat16" training crashes -- since we trained for 100,000 GPU hours and 7K+ training steps for both dense and MoEs in the ScaleRL paper stably without any crashes. I think it matters what kind of GPUs they used -- they mention in the…

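For anyone wondering what those mantissa bits actually buy you, a self-contained sketch (mine, not from either thread): fp16 carries 10 mantissa bits vs bf16's 7, so bf16 rounds away much larger relative differences, while bf16 keeps fp32's 8-bit exponent range and so rarely overflows where fp16's 5-bit exponent can.

```python
import numpy as np

# fp16: 1 sign + 5 exponent + 10 mantissa bits -> machine eps = 2**-10
# bf16: 1 sign + 8 exponent + 7 mantissa bits  -> machine eps = 2**-7
print("fp16 eps:", 2.0 ** -10)   # ~9.77e-4
print("bf16 eps:", 2.0 ** -7)    # ~7.81e-3  (coarser rounding, wider range)

# fp16 rounding in action (numpy has no native bfloat16, so bf16 is shown only
# via its epsilon above): an update of 2**-11 is rounded away in fp16 ...
print(np.float16(1.0) + np.float16(2.0 ** -11) == np.float16(1.0))  # True
# ... but survives in fp32, which is one reason mixed-precision recipes keep
# fp32 master copies of the weights.
print(np.float32(1.0) + np.float32(2.0 ** -11) == np.float32(1.0))  # False
```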


There were times my mind would just go blank looking at huge, scary codebases, but with better prompts, AI tools help me untangle everything so much faster.

I totally agree. AI tools are amazing for code understanding!


