Yong Zheng-Xin (Yong)

@yong_zhengxin

reasoning models @BrownCSDept || ex-intern/collab @AIatMeta @Cohere_Labs || sometimes write on http://yongzx.substack.com

Congrats on the job and thanks for sharing your experience! It's a great read and we need more articles like this. link for those interested: rona.substack.com/p/becoming-a-c…

i finally got a job as a compiler engineer!! it took months of grinding, so i wrote a biiig post about how i recruited, what the interviews are like, etc. link in bio 🥰


Yong Zheng-Xin (Yong) reposted

From multilingual models to diverse benchmarks and multimodal learning — Day 1 of Connect brings together researchers expanding what’s possible in global AI. 🖇️ Our lightning talks spotlight collaborative work that makes AI more representative of the world’s languages. ⚡

Yong Zheng-Xin (Yong) reposted

How is memorized data stored in a model? We disentangle MLP weights in LMs and ViTs into rank-1 components based on their curvature in the loss, and find representational signatures of both generalizing structure and memorized training data

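A loose sketch of the decomposition idea in the tweet above: the actual work ranks rank-1 components of MLP weights by their curvature in the loss, but as a stand-in (since the paper's exact procedure isn't reproduced here) plain SVD also expresses a weight matrix as a sum of rank-1 components. All shapes and numbers below are illustrative.

```python
import numpy as np

# Toy "MLP weight matrix" -- dimensions are arbitrary, not from the paper.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

# SVD writes W as a sum of rank-1 terms: W = sum_i s_i * u_i v_i^T.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
components = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]

# Summing all components reconstructs W exactly.
assert np.allclose(sum(components), W)

# Keeping only the top-k components gives a low-rank approximation; the
# paper instead inspects individual components for signatures of
# generalizing structure vs. memorized training data.
k = 8
W_topk = sum(components[:k])
print(np.linalg.norm(W - W_topk) / np.linalg.norm(W))
```

The difference from the paper is the ranking criterion: SVD orders components by singular value (reconstruction error), whereas the tweet describes ordering by loss curvature.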
Yong Zheng-Xin (Yong) reposted

Can LLMs use tens of thousands of tools to navigate complex enterprise environments? In my @Microsoft internship work, we - introduce TheMCPCompany, a benchmark with 18,000+ tools - show that using a massive tool set is cheaper, faster, and more effective than web browsing…

Yong Zheng-Xin (Yong) reposted

📢Thrilled to introduce ATLAS 🗺️: scaling laws beyond English, for pretraining, finetuning, and the curse of multilinguality. The largest public, multilingual scaling study to-date—we ran 774 exps (10M-8B params, 400+ languages) to answer: 🌍Are scaling laws different by…

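For readers unfamiliar with scaling laws like the ones ATLAS studies: a loss-vs-model-size curve is typically fit with a power law, which becomes a straight line in log-log space. The sketch below fits one to synthetic data (the sizes, losses, and exponent are made up, not ATLAS results).

```python
import numpy as np

# Synthetic (param count, loss) pairs following loss = A / N**alpha.
N = np.array([1e7, 1e8, 1e9, 8e9])  # model sizes in parameters
loss = 120.0 / N**0.25              # made-up losses with alpha = 0.25

# Taking logs turns the power law into a line:
#   log(loss) = log(A) - alpha * log(N)
slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
alpha = -slope
A = np.exp(intercept)
print(f"alpha={alpha:.2f}, A={A:.1f}")  # recovers alpha≈0.25, A≈120.0
```

Real scaling studies fit richer forms (e.g. an irreducible-loss term, joint data/parameter terms) across many runs; this only shows the basic log-log mechanics.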
Yong Zheng-Xin (Yong) reposted

New research with @AdtRaghunathan, Nicholas Carlini and Anthropic! We built ImpossibleBench to measure reward hacking in LLM coding agents 🤖, by making benchmark tasks impossible and seeing whether models game tests or follow specs. (1/9)

Yong Zheng-Xin (Yong) reposted

Published today in @ScienceMagazine: a landmark study led by Microsoft scientists with partners, showing how AI-powered protein design could be misused—and presenting first-of-its-kind red teaming & mitigations to strengthen biosecurity in the age of AI.


As I was working on the presentation for our multilingual safety survey at EMNLP, I came across this interesting recent report by OpenAI: "Disrupting malicious uses of AI: October 2025." At least 4 out of 7 case studies involve multilingual safety. openai.com/global-affairs…
