
Leello Tadesse Dadi

@structuredrepr

Leello Tadesse Dadi reposted

I've noticed there is some confusion about Dion since it mathematically looks so different from Muon and Spectral descent, so I wrote a small note expressing Dion in terms of the SVD and how it differs from PowerSGD 👇


Leello Tadesse Dadi reposted

Hyperplane projections are a really powerful approach and seem to pop up everywhere in optimization. A quick overview that might be interesting to some 👇


Leello Tadesse Dadi reposted

When comparing optimization methods, we often change *multiple things at once*—geometry, normalization, etc.—possibly without realizing it. Let's disentangle these changes. 👇

