i don't understand the asynchronous RL claim about higher throughput. you can colocate training and generation on the same set of GPUs, and the overhead of switching between the two phases is minimal. that setup still achieves high throughput while keeping training on-policy.
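to make this concrete, here's a toy throughput model (every number below is an illustrative assumption, not a measurement from the paper): with linear GPU scaling, an ideal async split just pipelines the two phases, so its only real win over the colocated setup is avoiding the switch overhead.

```python
# back-of-the-envelope model of the colocated argument. all numbers are
# illustrative assumptions, not measurements.

gen_time = 60.0     # s per step to generate rollouts on the full GPU pool
train_time = 20.0   # s per step for the policy update on the full GPU pool
switch_time = 2.0   # s to flip the pool between inference and training engines

# colocated synchronous RL: the phases alternate on the same GPUs, so every
# rollout comes from the current policy (fully on-policy)
sync_step = gen_time + train_time + 2 * switch_time

# idealized async RL: disjoint GPU pools run the phases concurrently; each
# pool is smaller, so each phase slows down proportionally (linear scaling),
# and the step time is set by the slower phase
gen_frac = gen_time / (gen_time + train_time)            # share of GPUs generating
async_step = max(gen_time / gen_frac, train_time / (1 - gen_frac))

print(f"colocated sync step: {sync_step:.1f}s")   # 84.0s, on-policy
print(f"ideal async step:    {async_step:.1f}s")  # 80.0s, off-policy rollouts
```

under these assumptions the colocated loop is only ~5% slower per step, and that gap is exactly the switch overhead; async only pulls meaningfully ahead when scaling is sublinear or the two pools use different hardware.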

iScienceLuvr: practical, modern GRPO tweaks as described in Meta's Code World Models paper
[tweet image]

