Zhepeng Cen

@ZhepengCen

PhD student @ CMU

🚀 Scaling RL to Pretraining Levels with Webscale-RL

RL for LLMs has been bottlenecked by tiny datasets (<10B tokens) vs pretraining (>1T).
Our Webscale-RL pipeline converts pretraining text into diverse RL-ready QA data — scaling RL to pretraining levels!

All code and…
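
The thread is truncated here, but the announced idea (turn a pretraining passage into a question whose answer can be checked, so an RL loop has a verifiable reward) can be sketched roughly as below. The `llm_generate` helper, the prompt wording, and the exact-match reward are illustrative assumptions, not the actual Webscale-RL pipeline.

```python
# Sketch of a pretraining-text -> RL-ready QA conversion step.
# `llm_generate` is a hypothetical stand-in for any LLM API call; the
# prompt and exact-match reward are assumptions, not the paper's pipeline.

def passage_to_qa(passage: str, llm_generate) -> dict:
    """Ask a generator LLM to extract one verifiable QA pair from a passage."""
    prompt = (
        "Read the passage and write one question whose answer is stated "
        "verbatim in the passage, then give that answer.\n"
        f"Passage: {passage}\nFormat: Q: ... A: ..."
    )
    out = llm_generate(prompt)
    q, _, a = out.partition("A:")
    return {"question": q.removeprefix("Q:").strip(), "answer": a.strip()}

def reward(model_answer: str, reference: str) -> float:
    """Verifiable reward for RL: exact match against the extracted answer."""
    return 1.0 if model_answer.strip().lower() == reference.strip().lower() else 0.0
```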

Zhepeng Cen reposted

Today my team at @SFResearch drops CoDA-1.7B: a text diffusion coding model that outputs tokens bidirectionally in parallel.

⚡️ Faster inference, 1.7B rivaling 7B.
📊 54.3% HumanEval | 47.6% HumanEval+ | 55.4% EvalPlus

🤗HF: huggingface.co/Salesforce/CoD…

Any questions, lmk!
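
As a rough illustration of "outputs tokens bidirectionally in parallel": masked-diffusion decoders typically start from a fully masked sequence and commit the most confident positions in parallel at each step. The sketch below shows that generic pattern under an assumed `model` interface; it is not CoDA's actual inference code.

```python
# Generic masked-diffusion parallel decoding sketch (NOT CoDA's code).
# `model` is assumed to map a token batch to per-position vocab logits.
import torch

def diffusion_decode(model, seq_len: int, mask_id: int, steps: int = 8):
    tokens = torch.full((1, seq_len), mask_id)           # start fully masked
    for step in range(steps):
        still_masked = tokens.eq(mask_id)
        if not still_masked.any():
            break
        logits = model(tokens)                           # (1, seq_len, vocab)
        probs, preds = logits.softmax(-1).max(-1)        # confidence, argmax
        # Unmask a growing fraction of the most confident masked positions.
        k = max(1, int(still_masked.sum() * (step + 1) / steps))
        conf = probs.masked_fill(~still_masked, -1.0)    # skip decided slots
        idx = conf.topk(k, dim=-1).indices
        tokens.scatter_(1, idx, preds.gather(1, idx))    # commit in parallel
    return tokens
```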

Zhepeng Cen reposted

🚨 Introducing LoCoBench: a comprehensive benchmark for evaluating long-context LLMs in complex software development

📄 Paper: bit.ly/4ponX3P
🔗 GitHub: bit.ly/4pvIfbZ

✨ Key Features:
📊 8,000 evaluation scenarios across 10 programming languages
🔍 Context…
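
For readers skimming: each scenario pairs a long multi-file code context with a task and a scoring reference. The dataclass below is only an assumed shape for illustration; the real schema lives in the GitHub repo.

```python
# Hypothetical shape of one long-context evaluation scenario (assumed
# for illustration; the actual LoCoBench schema lives in its repo).
from dataclasses import dataclass

@dataclass
class Scenario:
    language: str                  # one of the 10 programming languages
    context_files: dict[str, str]  # path -> source, forming the long context
    task: str                      # e.g. a feature request or bug report
    reference: str                 # ground truth used for scoring
```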

Zhepeng Cen reposted

LLMs trained to memorize new facts can’t use those facts well.🤔

We apply a hypernetwork to ✏️edit✏️ the gradients for fact propagation, improving accuracy by 2x on a challenging subset of RippleEdit!💡

Our approach, PropMEND, extends MEND with a new objective for propagation.
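
The core mechanism, a hypernetwork that transforms the raw fact-injection gradient into a better parameter update, can be sketched as below. This is an illustration of the MEND-style idea under assumed shapes, not PropMEND's implementation.

```python
# MEND-style gradient editing, minimal sketch (not PropMEND's code).
import torch
import torch.nn as nn

class GradEditor(nn.Module):
    """Hypernetwork mapping a raw gradient to an edited update."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, grad: torch.Tensor) -> torch.Tensor:
        # Trained so the edited update also answers downstream "ripple"
        # questions; PropMEND's propagation objective would supervise that.
        return self.net(grad)

def apply_edit(weight, raw_grad, editor: GradEditor, lr: float = 1e-2):
    """Apply the edited gradient as a one-shot weight update."""
    return weight - lr * editor(raw_grad)
```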

🚀 Introducing BRIDGE — a task-agnostic data augmentation strategy to prepare LLMs for RL!

🤖 Why do LLMs often fail to benefit from RL fine-tuning? We pinpoint two key factors: 1) 🔍 Rollout Accuracy 2) 🔗 Data Co-Influence. 💡 BRIDGE injects both exploration & exploitation…
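
The tweet is cut off, but "Rollout Accuracy" has a concrete reading: the fraction of sampled rollouts on a prompt that earn reward, which is the signal outcome-reward RL depends on. A hedged sketch follows; the `sample` and `is_correct` callables are assumptions, not BRIDGE's interface.

```python
# Estimate per-prompt rollout accuracy by sampling k completions.
# `sample` and `is_correct` are assumed callables for this sketch.

def rollout_accuracy(prompt: str, sample, is_correct, k: int = 8) -> float:
    """Fraction of k sampled rollouts judged correct for this prompt."""
    hits = sum(is_correct(prompt, sample(prompt)) for _ in range(k))
    return hits / k

# Prompts near 0.0 give RL almost no learning signal; augmentation that
# raises or diversifies them is the kind of lever the tweet alludes to.
```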

Zhepeng Cen reposted

🙌 Happy New Year everyone!
🤖 New preprint: TinyHelen’s First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment
🤖 We train and evaluate tiny language models (LMs) using a novel text dataset with systematically simplified vocabularies and…
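
One simple way to read "systematically simplified vocabularies" is corpus filtering against a small allowed word list, as in the toy sketch below (an illustration of the idea, not the paper's actual curation pipeline).

```python
# Toy vocabulary-simplified corpus filter (not the paper's pipeline).
import re

def simplify_corpus(sentences: list[str], allowed: set[str]) -> list[str]:
    """Keep only sentences fully covered by the allowed vocabulary."""
    kept = []
    for s in sentences:
        words = re.findall(r"[a-z']+", s.lower())
        if words and all(w in allowed for w in words):
            kept.append(s)
    return kept

vocab = {"the", "cat", "sat", "on", "a", "mat"}
print(simplify_corpus(["The cat sat on a mat.", "The ontology reifies."], vocab))
# -> ['The cat sat on a mat.']
```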
