Junsong_Chen

@lawrence_cjs

HKU Ph.D., NVIDIA Research Internship

Junsong_Chen reposted

We (@lawrence_cjs, @yuyangzhao_ , @shanasaimoe) from the SANA team just posted a blog on the core of Linear Attention: how it achieves infinite context lengths with global awareness but constant memory usage! We explore state accumulation mechanics, the evolution from Softmax to…


How Linear Attention and Softmax Attention differ in compute and KV-Cache for LLMs and long-video generation. Let's start with this blog. hanlab.mit.edu/blog/infinite-…

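The blog's core idea, a constant-size accumulated state instead of a growing KV cache, can be sketched in a few lines of NumPy. This is a toy with a hypothetical feature map and tiny dimensions, not the SANA implementation:

```python
import numpy as np

d = 4  # toy head dimension
rng = np.random.default_rng(0)

def phi(x):
    # positive feature map (an assumption; elu+1-style maps are common)
    return np.maximum(x, 0) + 1e-6

# Constant-memory recurrent state: S accumulates phi(k) v^T, z accumulates phi(k)
S = np.zeros((d, d))
z = np.zeros(d)

outputs = []
for t in range(1000):              # arbitrarily long stream of tokens
    q, k, v = rng.normal(size=(3, d))
    S += np.outer(phi(k), v)       # state update: O(d^2), independent of t
    z += phi(k)
    o = phi(q) @ S / (phi(q) @ z)  # causal linear-attention output at step t
    outputs.append(o)

print(S.shape, z.shape)  # (4, 4) (4,)
```

Softmax attention would have to cache all 1000 keys and values to produce the same causal outputs; here the state stays at d*d + d floats no matter how long the sequence grows, which is the "infinite context, constant memory" trade the blog explains.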


Junsong_Chen reposted

Sora 2 is amazing, but AI video generation inference is still too slow. Try our Deep Compression Autoencoder + Linear Attention! 🚀🔥 nvlabs.github.io/Sana/Video github.com/dc-ai-projects…

github.com

GitHub - dc-ai-projects/DC-VideoGen: DC-VideoGen: Efficient Video Generation with Deep Compression...

DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder - dc-ai-projects/DC-VideoGen

🚀 SANA-Video: Linear Attention + Constant-Memory KV Cache = Fast Long Videos 💥

Key Features 🌟
🧠 Linear DiT everywhere → O(N) complexity on video-scale tokens
🧰 Constant-memory Block KV cache → store cumulative states only (no growing KV) 🔄
🎯 Temporal Mix-FFN + 3D RoPE…
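The "store cumulative states only" point can be illustrated with a toy block-wise update. Shapes, block length, and the feature map are illustrative assumptions; the real SANA-Video block design (e.g. how attention works within a block) is more involved:

```python
import numpy as np

d = 4          # toy head dimension
block_len = 8  # tokens per video block
rng = np.random.default_rng(1)

def phi(x):
    return np.maximum(x, 0) + 1e-6

# Block KV cache as cumulative state: one (d, d) matrix plus a (d,)
# normalizer, updated per block, instead of a KV list that grows with
# video length.
S = np.zeros((d, d))
z = np.zeros(d)

for block in range(100):          # 100 blocks of 8 tokens ~ a long video
    K = rng.normal(size=(block_len, d))
    V = rng.normal(size=(block_len, d))
    S += phi(K).T @ V             # fold the whole block into the state
    z += phi(K).sum(axis=0)

q = rng.normal(size=d)
o = phi(q) @ S / (phi(q) @ z)     # query attends over all 800 tokens seen
print(o.shape)  # (4,)
```

After every block the cache is the same size, so generating a longer video costs more compute but no more attention memory.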



Thanks so much @_akhaliq for sharing our recent work. Our homepage is here: nvlabs.github.io/Sana/Video/

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer



Junsong_Chen reposted

Changing the autoencoder in latent diffusion models is easier than you think. 🚀 Introducing DC-Gen – a post-training acceleration framework that works with any pre-trained diffusion model, boosting efficiency by transferring it into a deeply compressed latent space with…


Junsong_Chen reposted

We release DC-VideoGen, a new post-training framework for accelerating video diffusion models. Key features:
🎬 Supports video generation up to 2160×3840 (4K) resolution on a single H100 GPU
⚡ Delivers 14.8× faster inference than the base model while achieving comparable or…




Explore recent work from our team. LongLive generates minute-long videos and responds to your prompts interactively at real-time speed! Very cool project. 🎉

🚀 We open-sourced LongLive — interactive, real-time long-video generation.
👥 Generates video in real time as users enter text prompts.
⚡️ 20.7 FPS on a single H100, ⏱️ up to 240s per clip.
🎬 Fine-tunes SOTA short-video models (e.g., Wan) into long-video generators.
🌍 One step…



Junsong_Chen reposted

Explore Deep Compression Autoencoder (DC-AE) 1.5 with higher token compression ratio (64x) for faster visual generation:

🚀 Excited to announce DC-AE 1.5! With a spatial compression ratio boosted to f64, it accelerates high-res diffusion models while preserving text-to-image quality. Key innovation: channel-wise latent structure for faster convergence with many latent channels. 📍 Catch us at…
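A back-of-envelope sketch of why f64 matters: compare latent token counts for a 1024×1024 image at different spatial compression ratios (patchification details omitted, so the absolute numbers are illustrative):

```python
# Token count for a square image under an autoencoder with spatial
# compression ratio f: each side shrinks by f, tokens by f^2.
def latent_tokens(res, f):
    side = res // f
    return side * side

for f in (8, 32, 64):
    print(f"f{f}: {latent_tokens(1024, f)} tokens")
# f8: 16384 tokens
# f32: 1024 tokens
# f64: 256 tokens
```

Since softmax attention cost grows with the square of the token count, moving from f8 to f64 shrinks the sequence 64× and the attention cost by roughly 64² = 4096×, which is where the speedup for high-resolution diffusion comes from.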



Junsong_Chen reposted

The best few-step sampling model across the speed-memory frontier? 😱 Introducing SANA-Sprint in collaboration with the great SANA team! Beyond the results, perhaps more importantly, the work is about the recipe of SANA-Sprint. Code & model will be open ❤️ Let's go ⬇️


Junsong_Chen reposted

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation


Junsong_Chen reposted

Explore our one-step diffusion model, SANA-Sprint. Very fast:


Junsong_Chen reposted

Still think consistency models are bad at scale? In fact, sCM can be stably scaled to modern text-to-image diffusion models and greatly improve the generation speed and 1-step generation quality!


Excited for 🏃SANA-Sprint. 🚀 Code and weights will be released very soon along with diffusers. Stay tuned! ❤️


Introducing SANA-1.5: model scaling up, then scaling down. Inference-time scaling also works as an automatic end-to-end pipeline.

🔥 SANA 1.5: A linear Diffusion Transformer pushes SOTA in text-to-image generation!

Key innovations:
• Depth-growth training: 1.6B → 4.8B params
• Memory-efficient 8-bit optimizer
• Flexible model pruning
• Inference scaling for better quality

Achieves 0.80 on GenEval! 🚀
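Depth-growth training initializes a deeper model from a shallower pretrained one. Here is a minimal sketch of one common function-preserving recipe, replicating blocks and zeroing the new copies' output scale; the exact SANA-1.5 initialization may differ, and `Block` is a stand-in class, not real model code:

```python
import copy

class Block:
    """Stand-in for a transformer block; out_scale mimics an
    output-projection gain we can zero out."""
    def __init__(self, idx):
        self.idx = idx
        self.out_scale = 1.0

def grow_depth(blocks, factor=3):
    """Replicate each pretrained block `factor` times, zeroing the new
    copies' output scale so the grown model initially computes the same
    function as the small one."""
    grown = []
    for b in blocks:
        grown.append(b)                # keep the pretrained block
        for _ in range(factor - 1):
            nb = copy.deepcopy(b)
            nb.out_scale = 0.0         # new block contributes nothing at init
            grown.append(nb)
    return grown

small = [Block(i) for i in range(20)]  # e.g. a 20-layer 1.6B model
large = grow_depth(small)              # 60 layers, function-preserving at init
print(len(large))  # 60
```

Because the inserted blocks start as no-ops, training can resume from the small model's loss rather than from scratch, which is the point of growing 1.6B → 4.8B instead of retraining.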


