Syeda Nahida Akter

@__SyedaAkter

PhD student at @LTIatCMU @SCSatCMU and research intern @NVIDIA. Working on improving Reasoning of Generative Models! (@reasyaay.bsky.social)

Pinned

Most LLMs learn to think only after pretraining—via SFT or RL. But what if they could learn to think during it? 🤔
Introducing RLP: Reinforcement Learning Pre-training—a verifier-free objective that teaches models to “think before predicting.”
🔥 Result: Massive reasoning…


Syeda Nahida Akter reposted

If you're a PhD student interested in doing an internship with me and @shrimai_ on RL-based pre-training/LLM reasoning, send an email ([email protected]) with:
1⃣ Short intro about you
2⃣ Link to your relevant paper
I will read all emails but can't respond to all.


Syeda Nahida Akter reposted

Lots of insights in @YejinChoinka's talk on RL training. RIP to next-token prediction (NTP) training, and welcome to Reinforcement Learning Pretraining (RLP). #COLM2025
No place to even stand in the room.


Syeda Nahida Akter reposted

Thank you @rohanpaul_ai for highlighting our work! 💫
Front-Loading Reasoning shows that including reasoning data in pretraining is beneficial, does not lead to overfitting after SFT, and has a latent effect unlocked by SFT!
Paper: arxiv.org/abs/2510.03264
Blog:…

New @nvidia paper shows that teaching reasoning early during pretraining builds abilities that later fine-tuning cannot recover. Doing this early gives a 19% average boost on tough tasks after all post-training. Pretraining is the long first stage where the model learns to…



Thank you @rohanpaul_ai for sharing our work! In "Front-Loading Reasoning", we show that injecting reasoning data into pretraining builds models that reach the frontier. On average, +22% (pretraining) → +91% (SFT) → +49% (RL) relative gains. 🚀 🔗Paper:…

New @nvidia paper shows that teaching reasoning early during pretraining builds abilities that later fine-tuning cannot recover. Doing this early gives a 19% average boost on tough tasks after all post-training. Pretraining is the long first stage where the model learns to…

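For intuition only, here is a minimal sketch of what "front-loading" reasoning data could look like as a data-mixing step: blend reasoning traces into the pretraining stream from the start instead of reserving them for SFT. The 10% mix rate, the toy corpora, and the `pretraining_stream` helper are illustrative assumptions, not the paper's actual pipeline or values.

```python
import random
from typing import Iterator


def pretraining_stream(web_docs: list[str],
                       reasoning_docs: list[str],
                       reasoning_fraction: float = 0.10,
                       seed: int = 0) -> Iterator[str]:
    """Yield pretraining documents, mixing reasoning data in at a fixed fraction."""
    rng = random.Random(seed)
    while True:
        if rng.random() < reasoning_fraction:
            yield rng.choice(reasoning_docs)  # reasoning traces, present from the start
        else:
            yield rng.choice(web_docs)        # ordinary web text


if __name__ == "__main__":
    web = ["the cat sat on the mat.", "stock prices rose on friday."]
    reasoning = ["q: 2 + 3 = ? think: add the numbers, 2 + 3 = 5. a: 5"]
    stream = pretraining_stream(web, reasoning)
    for _ in range(5):
        print(next(stream))
```

The only point this illustrates is that the reasoning corpus participates in the earliest training stage; curriculum, weighting, and data curation in the real setup are out of scope here.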


Syeda Nahida Akter reposted

Nvidia presents RLP: Reinforcement as a Pretraining Objective


Syeda Nahida Akter reposted

When should LLMs learn to reason—early in pretraining or late in fine-tuning? 🤔
Front-Loading Reasoning shows that injecting reasoning data early creates durable, compounding gains that post-training alone cannot recover.
Paper: tinyurl.com/3tzkemtp
Blog: research.nvidia.com/labs/adlr/Syne…


Syeda Nahida Akter reposted

New Nvidia paper introduces Reinforcement Learning Pretraining (RLP), a pretraining objective that rewards useful thinking before each next token prediction. On a 12B hybrid model, RLP lifted overall accuracy by 35% using 0.125% of the data. The big deal here is that it moves…


Syeda Nahida Akter reposted

💫 Introducing RLP: Reinforcement Learning Pretraining—an information-driven, verifier-free objective that teaches models to think before they predict
🔥 +19% vs BASE on Qwen3-1.7B
🚀 +35% vs BASE on Nemotron-Nano-12B
📄 Paper: github.com/NVlabs/RLP/blo…
📝 Blog: research.nvidia.com/labs/adlr/RLP/

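A rough, unofficial sketch of the kind of verifier-free, information-driven signal described above: sample a short "thought" before predicting the next token, then reward the thought by how much it raises the log-likelihood of the observed next token relative to predicting without it. The `logprob_next_token` scorer and `sample_thought` sampler below are hypothetical toy stand-ins, not the NVIDIA implementation.

```python
import random


def logprob_next_token(context: str, target: str, thought: str | None = None) -> float:
    """Hypothetical scorer: log p(target | context [+ thought]).
    Toy stand-in that pretends a thought mentioning the target makes it likelier."""
    base = -2.0
    bonus = 0.5 if thought is not None and target in thought else 0.0
    return base + bonus


def sample_thought(context: str) -> str:
    """Hypothetical policy sample: a short chain-of-thought emitted before predicting."""
    return random.choice([
        "france's capital is paris, so the next word is 'paris'",
        "unsure; could be any city name",
    ])


def rlp_style_reward(context: str, target: str) -> tuple[str, float]:
    """Verifier-free reward: log-likelihood gain on the observed next token
    when conditioning on the sampled thought vs. predicting without it."""
    thought = sample_thought(context)
    gain = (logprob_next_token(context, target, thought)
            - logprob_next_token(context, target))
    return thought, gain  # positive only if the thought actually helped


if __name__ == "__main__":
    ctx = "the capital of france is"
    thought, reward = rlp_style_reward(ctx, "paris")
    print(f"thought={thought!r}  reward={reward:+.2f}")
```

In a real run this scalar would feed a policy-gradient update on the thought tokens; the sketch only shows why no external verifier is needed—the corpus's own next token supplies the supervision.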

Syeda Nahida Akter reposted

Are you ready for web-scale pre-training with RL? 🚀
🔥 New paper: RLP: Reinforcement Learning Pre‑training
We flip the usual recipe for reasoning LLMs: instead of saving RL for post‑training, we bring exploration into pretraining.
Core idea: treat chain‑of‑thought as an…

