
Yan Sun
@Mathilda426
Ph.D. @NUSSingapore | Alumnus @cuhksz | Google PhD Fellowship
Pinned
🤔 Do we really need massive datasets for RL post-training? And if not, how can we get by with less without heavy overhead before generating rollouts? 🎉 We present PREPO for your data efficiency! 🔗 Explore our ongoing work: yansun-x.notion.site/data-efficienc…
Yan Sun reposted
🚨Training–inference mismatch in MoE RL? It gets even worse than we thought… But no worries—just grab an "IcePoP"🧊 and chill😉! Our new solution keeps MoE RL cool😎 & boosted🚀. Check it out! 📜Blog: ringtech.notion.site/icepop
