Ramsha Khan

@geekyramsha

Curious mind into ML/DL | Mathematics | Exploring RL

And in the end, it's you and your luck!


Whoops! I almost convinced myself with Sea AI Lab's results: the experiments were right in front of me, and it made perfect sense why switching to FP16 (which provides 10 mantissa bits of precision) eliminates the training-inference mismatch, BUT on A100 (Ampere arch).

I was puzzled by why their paper claims "bfloat16" training crashes -- since we trained for 100,000 GPU hours and 7K+ training steps for both dense and MoEs in the ScaleRL paper stably without any crashes. I think it matters what kind of GPUs they used -- they mention in the…
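
A rough sketch of the precision gap being discussed: FP16 keeps 10 mantissa bits while bfloat16 keeps 7 (in exchange for 8 exponent bits instead of 5). The helpers `round_to_bfloat16` and `round_to_float16` below are hypothetical names made up for this illustration. They only show how coarsely each format rounds a value near 1.0, and they do not reproduce either paper's experiments or any GPU-specific behaviour.

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even).

    Hypothetical helper: bfloat16 is the top 16 bits of a float32,
    so we round away the low 16 bits. NaN/inf are not handled specially.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias of 0x7FFF plus the lowest kept bit gives round-to-nearest-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def round_to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value.

    Hypothetical helper: relies on the stdlib `struct` format code "e".
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


x = 1.001
print(round_to_bfloat16(x))  # 1.0           (spacing near 1.0 is 2**-7)
print(round_to_float16(x))   # 1.0009765625  (spacing near 1.0 is 2**-10)
```

Near 1.0, the bfloat16 grid is 8 times coarser than the FP16 grid, so small differences between two computations of the same quantity are more likely to round differently in bfloat16.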



Whattttt talwiinder came to mumbaiiiiiiiiiiiiiiii!!!!!!!!


There were times my mind would just go blank looking at huge, scary codebases, but with better prompts, AI tools help me untangle everything so much faster.

I totally agree. AI tools are amazing for code understanding!




Jaun Elia and his collection of poetry, nazms, shayaris ✨



They never listened attentively to the tale of my heart.


currently in "analysis paralysis" state as per claude :/


A quick quiz! Evaluate this one-liner (a complex list comprehension) for a simple problem in seconds.


I've worked on graph and DP problems in C++, and now I'm back to solving them in Python, starting from basic data structures as a warm-up. (Can't promise to be consistent; let's see how far it goes...) You can find the DSA questions I solved in C++ here: github.com/Khan-Ramsha/DS…

we're back to DSA after so long.



Stay cautious, stay alert... Taking risks is another obsession of mine; it throws me into unpredictable spaces!

how it feels to yolo your frontier model training run on pytorch nightlies



Can you spot the subtle mistake in the training loop?

