rasdani

@rasdani_

hill climbing @PrimeIntellect

rasdani reposted

Why should humans have all the fun :) Thanks @ZhiSu22 for HITTER on g1!


memetic convergence @extropic 🤝 @PrimeIntellect

rasdani reposted

A taste of the Los Angeles event a few days ago. @cookedbykimchi starting his professional robot fighting career with a 1-0. This trip has been a wild ride to put it lightly.


rasdani reposted

this job is way too fun lol


rasdani reposted

i cannot overstate how absurdly impressive primeintellect's rl infra is

the people working on it clearly view it as art and probably forget they get paid

if you like rl, there’s really no better place on earth to work on it


rasdani reposted

day 8 of RL: building open-source swe-grep by @cognition with just a bash tool

got 37.6% increase in F1 from Qwen3-4B base by GRPOing for 120 steps. more data 👇

once i built my env with @PrimeIntellect verifiers, starting the rl run with prime-rl was super simple. thanks…

rasdani reposted

RL LEARNING WITH LORA: A DIVERSE DEEP DIVE

💯

> Train models end-to-end with RL in your own environment/application
> RL facilitates building specialized models
> RL is an infra challenge

While all the big labs are doing everything they can to convince companies to build on their closed APIs/models, despite telling everyone…


rasdani reposted

if you want the tweet version and not the 10min video version:

this is now all it takes to train with prime-rl after installing verifiers

verifiers v0.1.7 is released 🚀

this one's all about making RL training and experimentation waaaay easier:
- single-command installation for prime-rl
- single-command training w/ unified configs
- overhauled vf.RLTrainer for hacking on new algorithms

quick demo + links below :)



rasdani reposted

verifiers v0.1.7 is released 🚀

this one's all about making RL training and experimentation waaaay easier:
- single-command installation for prime-rl
- single-command training w/ unified configs
- overhauled vf.RLTrainer for hacking on new algorithms

quick demo + links below :)


rasdani reposted

Quick clip of the final full-speed crawl (no costume)


rasdani reposted

The PipelineRL paper getting rejected at NeurIPS reminds me of when the Megatron-LM paper got rejected from every conference back in 2020

scientific reviewers still don’t recognize a good systems paper when they see one

openreview.net/forum?id=Eqlmp…

Don't sleep on PipelineRL -- this is one of the biggest jumps in compute efficiency of RL setups that we found in the ScaleRL paper (also validated by Magistral & others before)!

What's the problem PipelineRL solves? In RL for LLMs, we need to send weight updates from trainer to…


rasdani reposted

Frida Kalo

rasdani reposted

Day 5 of RL: open source is beautiful

i had a timeout issue in @PrimeIntellect sandboxes sdk so i forked, fixed it, symlinked in my swe-bench env with uv and got positive rewards. u can just do things with open source software. also made a PR with a test.

also running a…

rasdani reposted

gonna be a big month


rasdani reposted

It finally happened: someone asked me if I work at @PrimeIntellect while wearing the @PrimeIntellect hat


rasdani reposted

Introducing Kimi CLI Technical Preview & Kimi For Coding!

Kimi CLI powers your terminal:
- Shell-like UI + shell command execution
- Seamless Zsh integration
- MCP support
- Agent Client Protocol (now compatible with @zeddotdev)

More features incoming!

rasdani reposted

From my tests, the new Cursor Composer 1 model is likely some variant of a DeepSeek model since it uses the same tokenizer.

You can verify this by looking at the input tokens per request in your usage dashboard.
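A rough sketch of the check described above, not the tweet author's actual procedure: send a few probe prompts, note the input token count reported for each, and see which candidate tokenizer reproduces every count. The three tokenizers below are toy stand-ins invented for illustration; in practice you would load real tokenizers from a library, and you would need to account for any fixed overhead (system prompt, chat template) included in the reported counts.

```python
import re
from typing import Callable, Dict, List

Tokenizer = Callable[[str], List[str]]

# Toy stand-ins for real tokenizers (hypothetical, for illustration only).
def whitespace_tokenizer(text: str) -> List[str]:
    return text.split()

def char_pair_tokenizer(text: str) -> List[str]:
    return [text[i:i + 2] for i in range(0, len(text), 2)]

def word_punct_tokenizer(text: str) -> List[str]:
    return re.findall(r"\w+|[^\w\s]", text)

def matching_tokenizers(
    probes: List[str],
    observed_counts: List[int],
    candidates: Dict[str, Tokenizer],
) -> List[str]:
    """Return names of candidates whose token counts match every observation."""
    if len(probes) != len(observed_counts):
        raise ValueError("need exactly one observed count per probe")
    return [
        name
        for name, tokenize in candidates.items()
        if all(len(tokenize(p)) == n for p, n in zip(probes, observed_counts))
    ]

candidates = {
    "whitespace": whitespace_tokenizer,
    "char_pair": char_pair_tokenizer,
    "word_punct": word_punct_tokenizer,
}
probes = ["hello, world!", "def f(x): return x"]
# Pretend these counts were read off a usage dashboard.
observed = [4, 8]
print(matching_tokenizers(probes, observed, candidates))
```

A match on a handful of probes is only suggestive. Probes with unusual whitespace, code, or non-English text tend to separate tokenizers more sharply than plain prose.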

did you know that vLLM silently falls back to outlines when xgrammar fails?

did you know that vLLM doesn’t support LoRA if DP > 1


