1jaskiratsingh's profile picture. Ph.D. Candidate at Australian National University | Intern @AIatMeta
 GenAI |  @AdobeResearch |  Multimodal Fusion Models and Agents | R2E-Gym | REPA-E

Jaskirat Singh @ ICCV2025🌴

@1jaskiratsingh

Ph.D. Candidate at Australian National University | Intern @AIatMeta GenAI | @AdobeResearch | Multimodal Fusion Models and Agents | R2E-Gym | REPA-E

고정된 트윗

Can we optimize both the VAE tokenizer and diffusion model together in an end-to-end manner? Short Answer: Yes. 🚨 Introducing REPA-E: the first end-to-end tuning approach for jointly optimizing both the VAE and the latent diffusion model using REPA loss 🚨 Key Idea: 🧠…

1jaskiratsingh's tweet image. Can we optimize both the VAE tokenizer and diffusion model together in an end-to-end manner? Short Answer: Yes.

🚨 Introducing REPA-E: the first end-to-end tuning approach for jointly optimizing both the VAE and the latent diffusion model using REPA loss 🚨

Key Idea:
🧠…

Jaskirat Singh @ ICCV2025🌴 님이 재게시함

Check out our work ThinkMorph, which thinks in multi-modalities, not just with them.

🚨Sensational title alert: we may have cracked the code to true multimodal reasoning. Meet ThinkMorph — thinking in modalities, not just with them. And what we found was... unexpected. 👀 Emergent intelligence, strong gains, and …🫣 🧵 arxiv.org/abs/2510.27492 (1/16)

Kuvvius's tweet image. 🚨Sensational title alert: we may have cracked the code to true multimodal reasoning.
Meet ThinkMorph — thinking in modalities, not just with them.
And what we found was... unexpected. 👀
Emergent intelligence, strong gains, and …🫣
🧵 arxiv.org/abs/2510.27492
(1/16)


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

Tests certify functional behavior; they don’t judge intent. GSO, our code optimization benchmark, now combines tests with a rubric-driven HackDetector to identify models that game the benchmark. We found that up to 30% of a model’s attempts are non-idiomatic reward hacks, which…

slimshetty_'s tweet image. Tests certify functional behavior; they don’t judge intent. GSO, our code optimization benchmark, now combines tests with a rubric-driven HackDetector to identify models that game the benchmark.

We found that up to 30% of a model’s attempts are non-idiomatic reward hacks, which…

Jaskirat Singh @ ICCV2025🌴 님이 재게시함

We added LLM judge based hack detector to our code optimization evals and found models perform non-idiomatic code changes in upto 30% of the problems 🤯

Tests certify functional behavior; they don’t judge intent. GSO, our code optimization benchmark, now combines tests with a rubric-driven HackDetector to identify models that game the benchmark. We found that up to 30% of a model’s attempts are non-idiomatic reward hacks, which…

slimshetty_'s tweet image. Tests certify functional behavior; they don’t judge intent. GSO, our code optimization benchmark, now combines tests with a rubric-driven HackDetector to identify models that game the benchmark.

We found that up to 30% of a model’s attempts are non-idiomatic reward hacks, which…


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

end-to-end training just makes latent diffusion transformers better! with repa-e, we showed the power of end-to-end training on imagenet. today we are extending it to text-to-image (T2I) generation. #ICCV2025 🌴 🚨 Introducing "REPA-E for T2I: family of end-to-end tuned VAEs for…

1jaskiratsingh's tweet image. end-to-end training just makes latent diffusion transformers better! with repa-e, we showed the power of end-to-end training on imagenet. today we are extending it to text-to-image (T2I) generation. #ICCV2025 🌴

🚨 Introducing "REPA-E for T2I: family of end-to-end tuned VAEs for…

Jaskirat Singh @ ICCV2025🌴 님이 재게시함

With simple changes, I was able to cut down @krea_ai's new real-time video gen's timing from 25.54s to 18.14s 🔥🚀 1. FA3 through `kernels` 2. Regional compilation 3. Selective (FP8) quantization Notes are in 🧵 below


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

Tired to go back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core…

JCJesseLai's tweet image. Tired to go back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on!

📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon.

It traces the core…

Jaskirat Singh @ ICCV2025🌴 님이 재게시함

Back in 2024, LMMs-Eval built a complete evaluation ecosystem for the MLLM/LMM community, with countless researchers contributing their models and benchmarks to raise the whole edifice. I was fortunate to be one of them: our series of video-LMM works (MovieChat, AuroraCap, VDC)…

Throughout my journey in developing multimodal models, I’ve always wanted a framework that lets me plug & play modality encoders/decoders on top of an auto-regressive LLM. I want to prototype fast, try new architectures, and have my demo files scale effortlessly — with full…



Jaskirat Singh @ ICCV2025🌴 님이 재게시함

I have one PhD intern opening to do research as a part of a model training effort at the FAIR CodeGen team (latest: Code World Model). If interested, email me directly and apply at metacareers.com/jobs/214557081…


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

Arash and his team are fantastic! I highly recommend applying if you’re interested

📢 The Fundamental Generative AI Research (GenAIR) team at NVIDIA is looking for outstanding candidates to join us as summer 2026 interns. Apply via: nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAEx… Email: [email protected] Group website: research.nvidia.com/labs/genair/ 👇



Jaskirat Singh @ ICCV2025🌴 님이 재게시함

🚀 New preprint! We present NP-Edit, a framework for training an image editing diffusion model without paired supervision. We use differentiable feedback from Vision-Language Models (VLMs) combined with distribution-matching loss (DMD) to learn editing directly. webpage:…


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

I am incredibly excited to introduce rLLM v0.2. Zooming back to a year ago: @OpenAI's o1-preview just dropped, and RL + test-time scaling suddenly became the hype. But no one knew how they did it. @kylepmont and I had this idea - what if we built a solver-critique loop for…

🚀 Introducing rLLM v0.2 - train arbitrary agentic programs with RL, with minimal code changes. Most RL training systems adopt the agent-environment abstraction. But what about complex workflows? Think solver-critique pairs collaborating, or planner agents orchestrating multiple…

rllm_project's tweet image. 🚀 Introducing rLLM v0.2 - train arbitrary agentic programs with RL, with minimal code changes.

Most RL training systems adopt the agent-environment abstraction. But what about complex workflows? Think solver-critique pairs collaborating, or planner agents orchestrating multiple…


Jaskirat Singh @ ICCV2025🌴 님이 재게시함

LiveCodeBench Pro remains one of the most challenging code benchmarks, but its evaluation and verification process is still a black box. We introduce AutoCode, which democratizes evaluation allowing anyone to locally run verification and perform RL training! For the first time,…

wenhaocha1's tweet image. LiveCodeBench Pro remains one of the most challenging code benchmarks, but its evaluation and verification process is still a black box.
We introduce AutoCode, which democratizes evaluation allowing anyone to locally run verification and perform RL training!
For the first time,…

Loading...

Something went wrong.


Something went wrong.