rasdani
@rasdani_
hill climbing @PrimeIntellect
Why should humans have all the fun :) Thanks @ZhiSu22 for HITTER on g1!
A taste of the Los Angeles event a few days ago. @cookedbykimchi starting his professional robot fighting career with a 1-0. This trip has been a wild ride to put it lightly.
i cannot overstate how absurdly impressive primeintellect's rl infra is. the people working on it clearly view it as art and probably forget they get paid. if you like rl, there’s really no better place on earth to work on it
day 8 of RL: building open-source swe-grep by @cognition with just a bash tool got 37.6% increase in F1 from Qwen3-4B base by GRPOing for 120 steps. more data 👇 once i built my env with @PrimeIntellect verifiers, starting the rl run with prime-rl was super simple. thanks…
RL LEARNING WITH LORA: A DIVERSE DEEP DIVE
💯
> Train models end-to-end with RL in your own environment/application
> RL facilitates building specialized models
> RL is an infra challenge
While all the big labs are doing everything they can to convince companies to build on their closed APIs/models, despite telling everyone…
if you want the tweet version and not the 10min video version: this is now all it takes to train with prime-rl after installing verifiers
verifiers v0.1.7 is released 🚀 this one's all about making RL training and experimentation waaaay easier:
- single-command installation for prime-rl
- single-command training w/ unified configs
- overhauled vf.RLTrainer for hacking on new algorithms
quick demo + links below :)
Quick clip of the final full-speed crawl (no costume)
The PipelineRL paper getting rejected at NeurIPS reminds me of when the Megatron-LM paper got rejected from every conference back in 2020. scientific reviewers still don’t recognize a good systems paper when they see one. openreview.net/forum?id=Eqlmp…
Don't sleep on PipelineRL -- this is one of the biggest jumps in compute efficiency of RL setups that we found in the ScaleRL paper (also validated by Magistral & others before)! What's the problem PipelineRL solves? In RL for LLMs, we need to send weight updates from trainer to…
Day 5 of RL: open source is beautiful i had a timeout issue in @PrimeIntellect sandboxes sdk so i forked, fixed it, symlinked in my swe-bench env with uv and got positive rewards. u can just do things with open source software. also made a PR with a test. also running a…
It finally happened: someone asked me if I work at @PrimeIntellect while wearing the @PrimeIntellect hat
Introducing Kimi CLI Technical Preview & Kimi For Coding! Kimi CLI powers your terminal:
- Shell-like UI + shell command execution
- Seamless Zsh integration
- MCP support
- Agent Client Protocol (now compatible with @zeddotdev)
More features incoming!
From my tests, the new Cursor Composer 1 model is likely some variant of a DeepSeek model since it uses the same tokenizer. You can verify this by looking at the input tokens per request in your usage dashboard.