
Wenting Zhao

@wzhao_nlp

reasoning & llms @Alibaba_Qwen. Opinions are my own.

People ask how to get hired by frontier labs. Understand, and be able to produce, every detail 👇

Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…



Wenting Zhao reposted

Talk from Wenting Zhao of Qwen on their plans during COLM. Seems like 1 word is the plan still: scaling training up! Let’s go.


I was really looking forward to being at #COLM2025 with Junyang, but visas take forever 😞 Come ask me about Qwen: what it's like to work here, what features you'd like to see, what bugs you'd like us to fix, or anything!

Sorry about missing COLM; my visa application fell through. @wzhao_nlp will be there to represent Qwen, give a talk, and join the panel discussion on reasoning and agents!



Want to hear some hot takes about the future of language modeling, and share your takes too? Stop by the Visions of Language Modeling workshop at COLM on Friday, October 10 in room 519A! There will be over a dozen speakers working on all kinds of problems in modeling language and…


Wenting Zhao reposted

When @ethansdyer and I joined Anthropic last Dec and spearheaded the discovery team, we decided to focus on unlocking computer-use as a bottleneck for scientific discovery. It has been incredible to work on improving computer-use and witness the fast progress. In OSWorld for…


Wenting Zhao reposted

🚨 Modeling Abstention via Selective Help-seeking

LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs. what they don't?

@momergul_ introduces MASH, which trains LLMs for search and gets abstentions for free!…


Really excited and proud to see Qwen models in the first batch of models supported by the Tinker service! 🤩 We will continue to release great models to grow research in the community 😎


Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…



Wenting Zhao reposted

The most surprising thing working on this was that RL with LoRA completely matches full training and develops the same extended reasoning patterns. I think this is a great sign for custom agent training.

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…

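The LoRA setup being discussed here is compact enough to sketch directly. Below is a minimal, illustrative PyTorch version of the adapter (not the code from the post): the pretrained weight stays frozen and only a rank-r update B·A is trained, scaled by alpha/r, so the optimizer only ever touches the small A and B matrices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                           # full weights stay frozen
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)    # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))          # up-projection, zero init
        self.scale = alpha / r

    def forward(self, x):
        # y = base(x) + scale * x A^T B^T, i.e. the effective weight is W + scale * B @ A
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

# Only A and B receive gradients; the base weight does not.
layer = LoRALinear(nn.Linear(512, 512))
loss = layer(torch.randn(4, 512)).pow(2).mean()
loss.backward()
print(layer.A.grad.shape, layer.base.weight.grad)  # torch.Size([8, 512]) None
```

With B zero-initialized, the wrapped layer starts out computing exactly what the frozen base layer computes, and training only moves it within the low-rank subspace.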


Wenting Zhao reposted

I quite enjoyed this, and it covers a bunch of topics that lack good introductory resources!
1. A bunch of GPU hardware details in one place (warp schedulers, shared memory, etc.)
2. A breakdown/walkthrough of reading PTX and SASS.
3. Some details/walkthroughs of a number of other…

New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high performance matmul kernels". If you want to deeply understand how one writes state of the art matmul kernels in CUDA read along. (Remember matmul is the single most important operation that transformers execute…

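The core trick those kernels are built around can be sketched outside CUDA: carve the output into tiles, stage the matching blocks of A and B in fast memory (shared memory on the GPU), and accumulate partial products block by block. A toy NumPy version of that blocking, shown only to illustrate the loop structure, not as anything performant:

```python
import numpy as np

def blocked_matmul(A, B, tile=64):
    """Tiled matmul: each (i, j) output tile accumulates products of A- and B-blocks
    along the shared k dimension, the same structure a CUDA kernel maps onto
    thread blocks and shared memory."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            acc = np.zeros((min(tile, M - i), min(tile, N - j)), dtype=A.dtype)
            for k in range(0, K, tile):
                a_blk = A[i:i + tile, k:k + tile]   # on a GPU this block sits in shared memory
                b_blk = B[k:k + tile, j:j + tile]
                acc += a_blk @ b_blk
            C[i:i + tile, j:j + tile] = acc
    return C

A = np.random.randn(256, 192).astype(np.float32)
B = np.random.randn(192, 320).astype(np.float32)
assert np.allclose(blocked_matmul(A, B), A @ B, atol=1e-3)
```

On a real GPU each (i, j) tile maps to a thread block and the a_blk/b_blk copies live in shared memory, which is roughly where the hardware details the post covers come in.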


Wenting Zhao reposted

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…


Wenting Zhao reposted

Check out our new work on making reasoning models think broadly! 🤔 We find a minimalist, surprisingly effective recipe to THINK for CHAT: RLVR + a strong reward model, trained on real-world prompts. This project was fun and surprised me in a few ways 👇 📌 We can run RL…

Language models that think, chat better. We used longCoT (w/ reward model) for RLHF instead of math, and it just works. Llama-3.1-8B-Instruct + 14K ex beats GPT-4o (!) on chat & creative writing, & even Claude-3.7-Sonnet (thinking) on AlpacaEval2 and WildBench! Read on. 🧵 1/8

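For what it's worth, the recipe described above has a simple shape: the policy reasons in a hidden chain of thought, and only the final reply is scored by a reward model, whose scalar output serves as the RL reward. A rough, purely illustrative sketch under those assumptions (the <think> delimiter and the reward_model interface here are hypothetical, not the authors' code):

```python
def chat_rl_reward(prompt: str, completion: str, reward_model) -> float:
    """Illustrative only: reward the visible reply, not the hidden reasoning."""
    # Assume the policy was prompted to reason inside <think>...</think> before answering.
    reply = completion.split("</think>")[-1].strip()
    # The reward model's preference score becomes the scalar RL reward.
    return reward_model.score(prompt, reply)
```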


Wenting Zhao reposted

I pulled some updated data for the ATOM Project // Interconnects. Qwen has taken the crown and is accelerating away in market share. The U.S. has signs of promise in GPT-OSS & Nvidia.


Wenting Zhao reposted

Indeed, if you believe in this, follow @Jianlin_S. Research everything with a pure-math mindset.


math matters in scaling!



Wenting Zhao reposted

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training…


Wenting Zhao reposted

While we are on this, remember we also had:
- Neural Architecture Search with Reinforcement Learning arxiv.org/abs/1611.01578
- Symbolic Discovery of Optimization Algorithms arxiv.org/abs/2302.06675
- Using Large Language Models for Hyperparameter Optimization arxiv.org/abs/2312.04528
-…

stop designing your RL algorithms



Congrats to the codellama team on the release! Some really good stuff.

(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…



Wenting Zhao reposted

QWEN-3 MAX is so good. This level of detail was previously only generated by Gemini Deep Think. It one-shotted a 3D simulation of a procedurally generated mini planet.

This tweet is no longer available.

I’m seriously so eager to learn kernel programming, but one thing I couldn’t decide is whether I should be an expert on it myself or teach AI to be really good at it…😶‍🌫️


Wenting Zhao reposted

I was a bit surprised it is less the case than I expected. Code is KING. It's the primary means of processing digital information; long term, I can't imagine a more important domain for the AGI-pilled. And it is highly valuable in the interim too: big TAM @ high salaries.…


We’ve been cooking 👩‍🍳🥘

🥸 Many new Qwen models are coming soon, all empowered with enhanced code capabilities.


