
Wenting Zhao

@wzhao_nlp

reasoning & llms @Alibaba_Qwen. Opinions are my own

People ask how to get hired by frontier labs? Understand, and be able to produce, every detail 👇

Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…



Wenting Zhao reposted

Talk from Wenting Zhao of Qwen on their plans during COLM. Seems like 1 word is the plan still: scaling training up! Let’s go.


I was really looking forward to being at #COLM2025 with Junyang, but the visa takes forever 😞 Come ask me about Qwen: what it's like to work here, what features you'd like to see, what bugs you'd like us to fix, or anything!

Sorry about missing COLM because my visa application fell through. @wzhao_nlp will be there to represent Qwen, give a talk, and join the panel discussion on reasoning and agents!



Want to hear some hot takes about the future of language modeling, and share your takes too? Stop by the Visions of Language Modeling workshop at COLM on Friday, October 10 in room 519A! There will be over a dozen speakers working on all kinds of problems in modeling language and…


Wenting Zhao reposted

When @ethansdyer and I joined Anthropic last Dec and spearheaded the discovery team, we decided to focus on unlocking computer-use as a bottleneck for scientific discovery. It has been incredible to work on improving computer-use and witness the fast progress. In OSWorld for…


Wenting Zhao reposted

🚨Modeling Abstention via Selective Help-seeking

LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs not?

@momergul_ introduces MASH that trains LLMs for search and gets abstentions for free!…


Really excited and proud to see Qwen models in the first batch of supported models for the Tinker service! 🤩 We will continue to release great models to grow research in the community 😎


Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…

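For readers curious what "write training loops in Python on your laptop" looks like in practice, here is a toy supervised loop in plain PyTorch. This is not the Tinker API (none of its calls are shown here); it only sketches the kind of loop such a service would run for you on distributed GPUs.

```python
# Toy fine-tuning loop in plain PyTorch. Illustrative only -- NOT the Tinker API.
import torch
import torch.nn as nn

# A tiny stand-in "language model": embed 8 tokens, flatten, predict one class.
model = nn.Sequential(nn.Embedding(100, 32), nn.Flatten(), nn.Linear(32 * 8, 100))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    tokens = torch.randint(0, 100, (16, 8))   # toy "token" batch
    targets = torch.randint(0, 100, (16,))    # toy next-token targets
    logits = model(tokens)
    loss = loss_fn(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 20 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```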


Wenting Zhao reposted

The most surprising thing about working on this was that RL with LoRA completely matches full training and develops the same extended reasoning patterns. I think this is a great sign for custom agent training.

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…

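As a quick reminder of what LoRA actually changes, here is a minimal PyTorch sketch of a LoRA adapter around a frozen linear layer, following the standard formulation (frozen W plus a trainable low-rank update scaled by alpha/r). This is illustrative toy code, not the code behind the post above.

```python
# Minimal LoRA adapter around a frozen nn.Linear (illustrative sketch).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze the pretrained weight
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scaling = alpha / r               # standard LoRA scaling

    def forward(self, x):
        # frozen full-rank path + trainable low-rank update
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(nn.Linear(1024, 1024))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only A and B train: ~16K params instead of ~1M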


Wenting Zhao reposted

I quite enjoyed this and it covers a bunch of topics without good introductory resources!
1. A bunch of GPU hardware details in one place (warp schedulers, shared memory, etc.)
2. A breakdown/walkthrough of reading PTX and SASS.
3. Some details/walkthroughs of a number of other…

New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high performance matmul kernels". If you want to deeply understand how one writes state of the art matmul kernels in CUDA read along. (Remember matmul is the single most important operation that transformers execute…

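The core idea behind such kernels, tiling the computation so small sub-blocks of the operands fit in fast memory, can be sketched in plain NumPy. This is an illustration of the concept only, not code from the blog post, and it ignores everything that makes the CUDA versions fast (shared-memory staging, tensor cores, pipelining).

```python
# Blocked (tiled) matmul in NumPy -- the same tiling idea GPU kernels use
# with shared memory, shown in plain Python for clarity.
import numpy as np

def blocked_matmul(A, B, tile=64):
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            for k in range(0, K, tile):
                # each (i, j, k) step works on small tiles, analogous to
                # staging sub-blocks of A and B in shared memory on a GPU
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

A = np.random.randn(256, 256).astype(np.float32)
B = np.random.randn(256, 256).astype(np.float32)
assert np.allclose(blocked_matmul(A, B), A @ B, atol=1e-2)
```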


Wenting Zhao reposted

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…


Wenting Zhao reposted

Check out our new work on making reasoning models think broadly! 🤔

We find a minimalist, surprisingly effective recipe to THINK for CHAT: RLVR + a strong reward model, trained on real-world prompts.

This project was fun and surprised me in a few ways 👇

📌 We can run RL…

Language models that think, chat better. We used longCoT (w/ reward model) for RLHF instead of math, and it just works. Llama-3.1-8B-Instruct + 14K ex beats GPT-4o (!) on chat & creative writing, & even Claude-3.7-Sonnet (thinking) on AlpacaEval2 and WildBench! Read on. 🧵 1/8

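For intuition, here is a toy sketch of the general recipe described above: sample several responses per prompt, score them with a reward model, and use group-normalized scores as advantages for a policy-gradient update. All names here (ToyPolicy, ToyRewardModel, train_step) are hypothetical stand-ins, not the paper's code.

```python
# Toy sketch of "RL with a reward model on chat prompts" (GRPO-style group normalization).
import random
import statistics

class ToyPolicy:
    def sample(self, prompt):                 # stand-in for longCoT sampling
        return prompt + " <think>...</think> answer " + str(random.random())
    def update(self, prompt, response, advantage):
        pass                                  # stand-in for a policy-gradient step

class ToyRewardModel:
    def score(self, prompt, response):        # stand-in for a learned reward model
        return random.random()

def train_step(policy, reward_model, prompt, group_size=8):
    responses = [policy.sample(prompt) for _ in range(group_size)]
    rewards = [reward_model.score(prompt, r) for r in responses]
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) + 1e-6   # avoid divide-by-zero when all rewards tie
    advantages = [(r - mean) / std for r in rewards]
    for response, adv in zip(responses, advantages):
        policy.update(prompt, response, adv)
    return mean

print(train_step(ToyPolicy(), ToyRewardModel(), "Write a short poem about COLM."))
```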


Wenting Zhao reposted

I pulled some updated data for ATOM Project // Interconnects. Qwen has taken the crown, is accelerating away in market share. U.S. has signs of promise in GPT-OSS & Nvidia.


Wenting Zhao reposted

Indeed, if you believe in this, follow @Jianlin_S. He researches everything with a pure-math mindset.


math matters in scaling!



Wenting Zhao reposted

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training…


Wenting Zhao reposted

while we are on this, remember we also had:
- Neural Architecture Search with Reinforcement Learning arxiv.org/abs/1611.01578
- Symbolic Discovery of Optimization Algorithms arxiv.org/abs/2302.06675
- Using Large Language Models for Hyperparameter Optimization arxiv.org/abs/2312.04528
- …

stop designing your RL algorithms



Wenting Zhao reposted

Qwen3-Max is so good. This level of detail was previously only generated by Gemini Deep Think. One-shots a 3D simulation of a procedurally generated mini planet.

This tweet is no longer available.

I'm seriously so eager to learn kernel programming, but one thing I can't decide is whether I should become an expert at it myself or teach AI to be really good at it… 😶‍🌫️


Wenting Zhao reposted

I was a bit surprised it is less the case than I expected. Code is KING. It's the primary means of processing digital information - long term, I can't imagine a more important domain for the AGI-pilled. And it is highly valuable in the interim too - big TAM at high salaries.…


We’ve been cooking 👩‍🍳🥘

🥸 Many new Qwen models are coming soon, all with enhanced coding capabilities.


