Yan Sun

@Mathilda426

Ph.D. @NUSSingapore | Alumnus @cuhksz | Google PhD Fellowship

Pinned

🤔 Do we really need massive datasets for RL post-training? And if not, how can we cut the data down 𝐰𝐢𝐭𝐡𝐨𝐮𝐭 𝐡𝐞𝐚𝐯𝐲 𝐨𝐯𝐞𝐫𝐡𝐞𝐚𝐝 𝐛𝐞𝐟𝐨𝐫𝐞 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐫𝐨𝐥𝐥𝐨𝐮𝐭𝐬? 🎉 We present 𝑷𝑹𝑬𝑷𝑶 for your data efficiency! 🔗 Explore our ongoing work: yansun-x.notion.site/data-efficienc…


Yan Sun reposted

🚨Training–inference mismatch in MoE RL? It gets even worse than we thought… But no worries—just grab an "IcePoP"🧊 and chill😉! Our new solution keeps MoE RL cool😎 & boosted🚀. Check it out! 📜Blog: ringtech.notion.site/icepop
