
Jianwei Yang

@jw2yang4ai

RS @Meta SuperIntelligence Lab; ex-MSR; Core contributor of Project Florence, Phi-3V, Omniparser; (Co-)Inventor of FocalNet, SEEM, SoM, DeepStack and Magma.

Pinned

Life Update: Now that I have finished the presentation of last @MSFTResearch project Magma at @CVPR, I am excited to share that I have joined @AIatMeta as a research scientist to further push forward the boundary of multimodal foundation models! I have always been passionate…


Jianwei Yang reposted

To thrive in the industry, you must let go of your past work and quickly adapt to the latest trends that companies are investing in. This mindset stands in stark contrast to academia, where originality and perseverance are deeply valued...


🚀Excited to see Qwen3-VL released as the new SOTA open-source vision-language model! What makes it extra special is that it’s powered by DeepStack, a technique I co-developed with Lingchen, who is now a core contributor of Qwen3-VL. When Lingchen and I developed this technique…

🚀 We're thrilled to unveil Qwen3-VL — the most powerful vision-language model in the Qwen series yet! 🔥 The flagship model Qwen3-VL-235B-A22B is now open-sourced and available in both Instruct and Thinking versions: ✅ Instruct outperforms Gemini 2.5 Pro on key vision…

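(Aside on how DeepStack works, as I understand it: instead of concatenating all visual tokens at the LLM input, groups of high-resolution visual tokens are injected residually into the hidden states of several early decoder layers, at the positions of the base visual tokens. The sketch below is only an illustration of that idea; the class, shapes, and injection rule are hypothetical, not the DeepStack or Qwen3-VL implementation.)

import torch
import torch.nn as nn

class DeepStackSketch(nn.Module):
    """Toy decoder stack that injects extra high-resolution visual tokens
    into its first few layers, DeepStack-style."""

    def __init__(self, hidden_dim=1024, num_layers=8, num_stack_layers=3):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(hidden_dim, nhead=8, batch_first=True)
            for _ in range(num_layers)
        ])
        self.num_stack_layers = num_stack_layers  # early layers that receive extra visual tokens

    def forward(self, tokens, stacked_visual_tokens, visual_slice):
        # tokens: (B, L, D) text and base visual tokens already interleaved
        # stacked_visual_tokens: list of (B, N, D) tensors, one group per stacked layer
        # visual_slice: slice marking where the N base visual tokens sit inside `tokens`
        h = tokens
        for i, layer in enumerate(self.layers):
            if i < self.num_stack_layers:
                # Residually add the i-th group of high-res visual tokens onto the
                # hidden states at the visual-token positions, leaving text untouched.
                extra = torch.zeros_like(h)
                extra[:, visual_slice, :] = stacked_visual_tokens[i]
                h = h + extra
            h = layer(h)
        return h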


Great work! Building a coherent representation of the complex world - visual and semantic, 2D and 3D, spatial and temporal - is challenging but critical. Having a single tokenizer for all of them is definitely a great stepping stone to the next generation of multimodal models!

Vision tokenizers are stuck in 2020 🤔 while language models revolutionized AI 🚀
Language: One tokenizer for everything
Vision: Fragmented across modalities & tasks
Introducing AToken: The first unified visual tokenizer for images, videos & 3D that does BOTH reconstruction AND…



Jianwei Yang reposted

🎉 Excited to share RecA: Reconstruction Alignment Improves Unified Multimodal Models
🔥 Post-train w/ RecA: 8k images & 4 hours (8 GPUs) → SOTA UMMs:
GenEval 0.73→0.90 | DPGBench 80.93→88.15 | ImgEdit 3.38→3.75
Code: github.com/HorizonWind200…
1/n


VLMs struggle badly to interpret 3D from 2D observations, but what if they have a good mental model of the world? Check out our MindJourney - a test-time scaling method for spatial reasoning in the 3D world. Without any task-specific training, MindJourney imagines (acts mentally) step-by-step…

Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/



Jianwei Yang reposted

VLMs often struggle with physical reasoning tasks such as spatial reasoning. Excited to share how we can use world models + test-time search to zero-shot improve spatial reasoning in VLMs!

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
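(Aside: a rough picture of the test-time search described above, with no training involved. A frozen VLM scores imagined views produced by a world model and answers from the most informative trajectory. world_model.rollout, vlm.score, and vlm.answer are placeholder interfaces, and the beam-search scoring rule is my assumption, not the exact MindJourney procedure.)

from dataclasses import dataclass

@dataclass
class Hypothesis:
    frames: list   # imagined observations so far (starting from the real input image)
    actions: list  # mental actions taken, e.g. "turn left", "move forward"
    score: float   # VLM's estimate of how useful these views are for the question

def mindjourney_style_search(vlm, world_model, image, question,
                             actions=("turn left", "turn right", "move forward"),
                             depth=3, beam=4):
    """Beam search over imagined trajectories; no model weights are updated."""
    beam_set = [Hypothesis(frames=[image], actions=[], score=0.0)]
    for _ in range(depth):
        candidates = []
        for hyp in beam_set:
            for act in actions:
                # The world model "imagines" the view this mental action would produce.
                next_frame = world_model.rollout(hyp.frames[-1], act)
                frames = hyp.frames + [next_frame]
                # The VLM scores how helpful the imagined views are for answering.
                score = vlm.score(frames, question)
                candidates.append(Hypothesis(frames, hyp.actions + [act], score))
        beam_set = sorted(candidates, key=lambda h: h.score, reverse=True)[:beam]
    # Answer the spatial question using the most informative imagined trajectory.
    best = beam_set[0]
    return vlm.answer(best.frames, question)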



Wow, this is so cool! I have been dreaming of building agents that can interact with humans via language communication, and with the world via physical interaction (locomotion, manipulation, etc.). Definitely a great stepping stone and playground!

World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or…



Jianwei Yang reposted

Check out our poster at #240 in Exhibition Hall D at 10:30 today!

(1/10) 🔥Thrilled to introduce OneDiffusion—our latest work in unified diffusion modeling! 🚀 This model bridges the gap between image synthesis and understanding, excelling in a wide range of tasks: T2I, conditional generation, image understanding, identity preservation,…



Our afternoon session with Prof. @RanjayKrishna is about to start in Room 101B!

🔥@CVPR2025 CVinW 2025 is about to take place very soon!! We have plenty of great talks and spotlight talks coming up (@BoqingGo, @RanjayKrishna @furongh @YunzhuLiYZ @sainingxie @CordeliaSchmid, Shizhe Chen). Looking forward to seeing you all at 101B from 9am-5pm, June 11th!…



🔥@CVPR2025 CVinW 2025 is about to take place very soon!! We have plenty of great talks and spotlight talks coming up (@BoqingGo, @RanjayKrishna @furongh @YunzhuLiYZ @sainingxie @CordeliaSchmid, Shizhe Chen). Looking forward to seeing you all at 101B from 9am-5pm, June 11th!…


🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 computer-vision-in-the-wild.github.io/cvpr-2025/ ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.…



Jianwei Yang reposted

Excited to speak at the Workshop on Computer Vision in the Wild @CVPR 2025! 🎥🌍 🗓️ June 11 | 📍 Room 101 B, Music City Center, Nashville, TN 🎸 🧠 Talk: From Perception to Action: Building World Models for Generalist Agents Let’s connect if you're around! #CVPR2025 #robotics


Jianwei Yang reposted

Our community-led Computer Vision group is thrilled to host @jw2yang4ai, Principal Researcher at Microsoft Research, for a session on "Magma: A Foundation Model for Multimodal AI Agents". Thanks to @cataluna84 and @Arkhymadhe for organizing this speaker session 👏


Hope you all had great #NeurIPS2025 submissions and are getting some good rest! We are still open to submissions for our CVinW workshop at @CVPR! You're welcome to share your work at our workshop with just a few clicks! 👉Submission Portal: openreview.net/group?id=thecv…

🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 computer-vision-in-the-wild.github.io/cvpr-2025/ ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.…



Jianwei Yang reposted

The latest episode of the Derby Mill Podcast is just out and focuses on the "Era of Experience" paper by David Silver and me. Substack: insights.intrepidgp.com/p/welcome-to-t… Spotify: open.spotify.com/episode/254sxl… Apple: podcasts.apple.com/us/podcast/wel… YouTube: youtube.com/watch?v=dhfJfQ…

Linked episode: #10 - Welcome to the Era of Experience (youtube.com)


Jianwei Yang reposted

Introducing Phi-4-reasoning, adding reasoning models to the Phi family of SLMs. The model is trained with both supervised finetuning (using a carefully curated dataset of reasoning demonstrations) and Reinforcement Learning. 📌Competitive results on reasoning benchmarks with…


Jianwei Yang reposted

We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks!
📍RLVR with one training example can boost:
- Qwen2.5-Math-1.5B: 36.0% → 73.6%
- Qwen2.5-Math-7B: 51.0% → 79.2%
on MATH500.
📄 Paper: arxiv.org/abs/2504.20571
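(Aside: RLVR here means RL with a verifiable reward, i.e. the policy is rewarded only when its final answer can be checked against the reference. A minimal sketch of such a reward is below; the \boxed{...} extraction and exact-match rule are simplifying assumptions, not the paper's exact recipe.)

import re
from typing import Optional

def extract_final_answer(completion: str) -> Optional[str]:
    """Pull the last \\boxed{...} answer out of a model completion."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None

def verifiable_reward(completion: str, reference: str) -> float:
    """Reward 1.0 iff the extracted final answer exactly matches the reference."""
    pred = extract_final_answer(completion)
    return 1.0 if pred is not None and pred == reference.strip() else 0.0

# With a single training example, RL repeatedly samples completions for that one
# problem and reinforces the ones whose final answer verifies.
print(verifiable_reward("... so the result is \\boxed{42}", "42"))  # 1.0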

