Yichen Sheng
@Coding_Black
Research scientist at NVIDIA. Working in graphics and vision. Opinions are my own.
So happy to see #DL3DV-10K broadly used. People have asked me how #DL3DV was born. DL3DV was a student-led dataset without any funding from professors or any company. Every data collection team member worked really hard. Proud of the team and can't wait to see the future!
Thanks @_akhaliq. It took a lot of effort to collect this large-scale video dataset (more than 10K videos) for the 3D vision community. We hope you enjoy the gift!
Heading to #ICCV2025? Do not miss tomorrow's DLSS4 keynote talk by @edliu1105. Time: 1:30~2:15 pm, Oct 20. Room: B317
🚀🚀🚀
The 3 Coolest Things I saw at #Siggraph this year: 1. NVIDIA's DLSS Denoiser in Blender. This makes Cycles basically real-time, meaning 3D artists can create with almost zero latency. Not yet available to the public, but it's expected to make it into a future Blender release.
One of our roles in LLM/VLM research at NVIDIA is to explore effective data recipes for training large-scale models and share them with the public, an area where transparency has been limited, as seen with models like Gemini, GPT-4o, and the Qwen-VL models. The Eagle2 project aligns…
The Eagle 2 paper from Nvidia is such a goldmine.
@CVPR2025 Wei Xiong and I will host a coffee chat session (11:30~12:30) in Room 211 for NVIDIA university hiring. Feel free to come and say hi if you are interested in DLSS and next-generation generative rendering. linkedin.com/posts/nvidia-u…
Congrats to "VGGT" wins the best paper award at #CVPR2025 ! We are happy that #DL3DV benefits "VGGT" and the community. We will host the #DL3DV Demo session this afternoon from 4:00-6:00 pm. Come by and see what is the new in DL3DV!
Many Congratulations to @jianyuan_wang, @MinghaoChen23, @n_karaev, Andrea Vedaldi, Christian Rupprecht and @davnov134 for winning the Best Paper Award @CVPR for "VGGT: Visual Geometry Grounded Transformer" 🥇🎉 🙌🙌 #CVPR2025!!!!!!
🚀 We just open-sourced Cosmos DiffusionRenderer! This major upgrade brings significantly improved video de-lighting and re-lighting, powered by NVIDIA Cosmos and enhanced data curation. Released under Apache 2.0 and Open Model License. Try it out! 🔗 github.com/nv-tlabs/cosmo…
🚀 Introducing DiffusionRenderer, a neural rendering engine powered by video diffusion models. 🎥 Estimates high-quality geometry and materials from videos, synthesizes photorealistic light transport, enables relighting and material editing with realistic shadows and reflections
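For intuition on the pipeline this implies (inverse rendering into per-pixel G-buffers, then re-rendering under new lighting), here is a minimal numpy sketch of the classical relighting step. All names and shapes are illustrative, and the toy Lambertian shader only stands in for the actual video diffusion renderer, which also synthesizes shadows and reflections:

```python
import numpy as np

# Hypothetical inputs: per-pixel G-buffers that an inverse renderer
# estimates from video (shapes and names are illustrative).
H, W = 256, 256
albedo = np.random.rand(H, W, 3)             # base color in [0, 1]
normals = np.random.randn(H, W, 3)
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

def relight_lambertian(albedo, normals, light_dir, light_rgb):
    """Toy Lambertian relighting: shade estimated G-buffers under a new
    directional light. A neural renderer replaces this step in practice."""
    l = np.asarray(light_dir, dtype=np.float64)
    l /= np.linalg.norm(l)
    ndotl = np.clip(normals @ l, 0.0, None)   # cosine term, shape (H, W)
    return albedo * ndotl[..., None] * np.asarray(light_rgb)

relit = relight_lambertian(albedo, normals,
                           light_dir=(0.3, 0.8, 0.5),
                           light_rgb=(1.0, 0.95, 0.9))
```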
Vision Language Models can be amazing at document understanding. Please check out our Nano-sized model. More to come!
🥇Our NVIDIA Llama Nemotron Nano VL model is #1 on the OCRBench V2 leaderboard. Designed for advanced intelligent document processing and understanding, this model extracts diverse info from complex documents with precision, all on a single GPU. 📗 Get the technical details…
Generative Photography is our new physics-based, camera-aware image generation tool. To appear at @CVPR 2025 (Highlight paper, 13.5%). arxiv.org/abs/2412.02168 - Respects camera controls - Photorealistic generation. Credits to Yu Yuan @PurdueECE
Introducing AceMath-RL-Nemotron-7B, an open math model trained with reinforcement learning from the SFT-only checkpoint: Deepseek-R1-Distilled-Qwen-7B. It achieves: - AIME24: 69.0 (+13.5 gain by RL) - AIME25: 53.6 (+14.4) - LiveCodeBench: 44.4 (surprisingly, +6.8 gain after…
Eagle2.5 natively supports long context without using any compression module. Eagle2.5-8B: • SOTA on 6 of 10 long-video benchmarks • beats GPT-4o (0806) on 3/5 video tasks • beats Gemini 1.5 Pro on 4/6 video tasks • SOTA on the hour-long video benchmark.
A long time ago, back before DLSS was in many games (and when my hair was shorter and less gray), I went to Nintendo HQ to show them an early prototype of DLSS 2, in the hopes that a future Switch console would use DLSS. I'm so proud that the Switch 2 will be DLSS powered!…
Even though we haven't been publishing papers around DLSS, it takes a tremendous amount of hardcore research to bring AI models like DLSS4 into production. Super happy that we're sharing technical insights on DLSS4 in this report. So proud of the teams!
Technical insights into what makes DLSS4 so great, including MFG, transformer models, and Reflex Frame Warp: research.nvidia.com/labs/adlr/DLSS…
Tired of slow diffusion models? Our new paper introduces f-distill, enabling arbitrary f-divergence for one-step diffusion distillation. JS divergence gives SOTA results on text-to-image! Choose the divergence that suits your needs. Joint work with @wn8_nie @ArashVahdat 1/N
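For readers unfamiliar with the family: an f-divergence is D_f(P||Q) = E_Q[f(p/q)] for a convex generator f with f(1) = 0, and forward KL, reverse KL, and JS are all particular choices of f. Below is a minimal numpy sketch of the divergences themselves on discrete distributions, not the f-distill training algorithm (which works with density-ratio estimates rather than explicit densities):

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for discrete P, Q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(q * f(p / q)))

# Generator functions f (convex, f(1) = 0) for common choices.
f_kl  = lambda t: t * np.log(t)    # gives KL(P || Q)
f_rkl = lambda t: -np.log(t)       # gives KL(Q || P), "reverse KL"
f_js  = lambda t: 0.5 * (t * np.log(2 * t / (1 + t))
                         + np.log(2 / (1 + t)))  # Jensen-Shannon

p = np.array([0.1, 0.6, 0.3])
q = np.array([0.3, 0.3, 0.4])
for name, f in [("KL", f_kl), ("reverse KL", f_rkl), ("JS", f_js)]:
    print(f"{name:10s} D_f(P||Q) = {f_divergence(p, q, f):.4f}")
```

Different generators penalize mode-covering versus mode-seeking behavior differently, which is the sense in which you can "choose the divergence that suits your needs."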
github.com/NVIDIA/Cosmos Cosmos is a developer-first platform designed to help physical AI builders accelerate their development. It has pre-trained world foundation models (diffusion & autoregressive) in different sizes and video tokenizers. They are open models with permissive…
Big congratulations to our team! We firmly believe AI will shape the future of graphics. If you are also passionate and want to make history, please join us!
DLSS 4 is here!🚀Multi-Frame Generation for smoother frame rates and Transformer-based models driving enhanced image quality. This milestone reflects the passion, creativity, and dedication of many people at NVIDIA. Excited to shape the future of graphics with AI? Join us!
Our team at NVIDIA is continuously looking for highly motivated interns to work on audio understanding and synthesis. Please reach out if you would like to collaborate with us!
We equip the diffusion model with camera-lens features. You can safely edit *just* the focal length, shutter speed, bokeh, or color temperature; the content is preserved. Arxiv: arxiv.org/abs/2412.02168
NVIDIA has found a way to add camera physics to diffusion models. Literally makes it possible to generate consistent images but with a different aperture, focal length, shutter speed or color temperature.
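As an illustration of one common way such conditioning can be wired in (a hypothetical sketch, not the architecture from arxiv.org/abs/2412.02168): embed the normalized camera settings with a small MLP and add the result to the timestep embedding that already modulates the denoiser.

```python
import torch
import torch.nn as nn

class CameraEmbed(nn.Module):
    """Illustrative conditioning module (not the paper's architecture):
    map normalized camera settings to an embedding added to the diffusion
    model's timestep embedding, so denoising is steered by aperture,
    focal length, shutter speed, and color temperature."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, dim), nn.SiLU(), nn.Linear(dim, dim)
        )

    def forward(self, cam: torch.Tensor) -> torch.Tensor:
        # cam: (B, 4) = [aperture, focal_length, shutter_speed, color_temp],
        # each pre-normalized to roughly [0, 1].
        return self.mlp(cam)

# Usage: inject wherever the timestep embedding enters the denoiser.
embed = CameraEmbed(dim=256)
cam = torch.tensor([[0.5, 0.3, 0.7, 0.4]])  # one hypothetical setting
t_emb = torch.randn(1, 256)                 # stand-in timestep embedding
cond = t_emb + embed(cam)
```

Keeping the content fixed while sweeping one of these scalars is what makes the "same scene, different aperture" behavior possible in this kind of setup.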
📢 Please check out our newest work on feed-forward reconstruction of dynamic monocular videos! With our bullet-time formulation, we achieve great flexibility and state-of-the-art performance!
Excited to introduce BTimer: a real-time solution for reconstructing dynamic scenes from monocular videos! 🚀 Generate per-frame (bullet-time) 3DGS scenes in 150 ms on a single GPU, achieving quality on par with optimization-based methods! Project: research.nvidia.com/labs/toronto-a…