Aditya Arun

@adityaarun1

Minimalist, Research Scientist @ Adobe, PhD

Noida, India

adityaarun1.github.io

Joined March 2010

8KPosts 351Followers 1KFollowing

You might like

@PascalMettes

@tkasarla_

@PJNarayanan

@johnny_deep93

@anandmishra2012

@vikataravi

@puneetdokania

Pinned

Aditya Arun

@adityaarun1

Nov 26, 2018

At times you come across an essay or a discussion which articulates succinctly the fleeting thoughts that have been troubling you for years.

Aditya Arun reposted

Blue Origin

@blueorigin

Nov 13

BOOSTER TOUCHDOWN! New Glenn returns to its blue origin.

Aditya Arun reposted

BLIP3o-NEXT advances RL for image generation. By operating on discrete tokens, we can seamlessly integrate with the entire RL infrastructure developed for language models—enabling us to apply proven techniques to visual generation at scale. Simple, scalable, and it really works.…

Salesforce AI Research

@SFResearch

Nov 11

A closer look at BLIP3o-NEXT's RL innovation: 🎯 Launched this week, our model applies GRPO to the autoregressive backbone using discrete image tokens—seamlessly integrating with existing language model RL infrastructure. The result? SOTA image generation and editing. Paper:…

SFResearch's tweet image. A closer look at BLIP3o-NEXT's RL innovation: 🎯

Launched this week, our model applies GRPO to the autoregressive backbone using discrete image tokens—seamlessly integrating with existing language model RL infrastructure.

The result? SOTA image generation and editing.

Paper:…

Aditya Arun reposted

Pedro Domingos

@pmddomingos

Nov 11

History of AI at Meta: 2004-07: Clueless. Thinks AI is SQL queries. 2008-12: Has applied AI group only. 2013–18: Hires LeCun and vaults to AI leader. 2019-25: LeCun steps away from day-to-day management and things quickly go bad. 2025: Desperate to catch up, alienates LeCun and…

Aditya Arun reposted

Fleetwood

@fleetwood___

Nov 9

Amazing pairing to learn information theory Blog from Olah which gives great visual intuition: colah.github.io/posts/2015-09-… Video from 3b1b where you see the power by solving a real world example, Wordle: youtube.com/watch?v=v68zYy…

fleetwood___'s tweet image. Amazing pairing to learn information theory

Blog from Olah which gives great visual intuition:
colah.github.io/posts/2015-09-…

Video from 3b1b where you see the power by solving a real world example, Wordle: youtube.com/watch?v=v68zYy…

Aditya Arun reposted

Grant Snider

@grantdraws

Nov 7

Poetry Comics Month, Day 7

Aditya Arun reposted

ICC

@ICC

Nov 2

𝐂𝐇𝐀𝐌𝐏𝐈𝐎𝐍𝐒 𝐎𝐅 𝐓𝐇𝐄 𝐖𝐎𝐑𝐋𝐃 🇮🇳🏆 India clinch their maiden Women’s @cricketworldcup title at #CWC25 🤩

Aditya Arun reposted

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

Oct 31

Defeating the Training-Inference Mismatch via FP16 Quick summary: A big problem in RL LLM training is that typical policy gradient methods expect the model generating the rollouts and the model being trained are exactly the same... but when you have a separate inference server…

iScienceLuvr's tweet image. Defeating the Training-Inference Mismatch via FP16

Quick summary: A big problem in RL LLM training is that typical policy gradient methods expect the model generating the rollouts and the model being trained are exactly the same... but when you have a separate inference server…

Aditya Arun reposted

Chieh-Hsin (Jesse) Lai

@JCJesseLai

Oct 29

Tired to go back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core…

JCJesseLai's tweet image. Tired to go back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on!

📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon.

It traces the core…

Aditya Arun reposted

Huijie Zhang

@huijiezh

Oct 27

Few-step diffusion model field is wild, and there are many methods trying to train a high-quality few-step generator from scratch: Consistency Models, Shortcut Models, and MeanFlow. Turns out, they could be unified in a quite elegant way, which we did in our recent work.

huijiezh's tweet image. Few-step diffusion model field is wild, and there are many methods trying to train a high-quality few-step generator from scratch: Consistency Models, Shortcut Models, and MeanFlow.

Turns out, they could be unified in a quite elegant way, which we did in our recent work.

Aditya Arun reposted

sway

@SwayStar123

Oct 22

Another one 4 different papers coincidentally discovered the same thing at the same time

sway

@SwayStar123

Oct 20

Third paper to do this now lol "LATENT DIFFUSION MODEL WITHOUT VARIATIONAL AUTOENCODER" Using dino features and a residual connection to make a stronger decoder, and diffuse in dino feature space

SwayStar123's tweet image. Third paper to do this now lol

"LATENT DIFFUSION MODEL WITHOUT VARIATIONAL AUTOENCODER"
Using dino features and a residual connection to make a stronger decoder, and diffuse in dino feature space

Aditya Arun reposted

Robert Youssef

@rryssf_

Oct 20

Holy shit… Harvard just proved your base model might secretly be a genius. 🤯 Their new paper “Reasoning with Sampling” shows that you don’t need reinforcement learning to make LLMs reason better. They used a 'Markov chain sampling trick' that simply re-samples from the…

rryssf_'s tweet image. Holy shit… Harvard just proved your base model might secretly be a genius. 🤯

Their new paper “Reasoning with Sampling” shows that you don’t need reinforcement learning to make LLMs reason better.

They used a 'Markov chain sampling trick' that simply re-samples from the…

Aditya Arun reposted

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

Oct 21

Maybe computer vision is all you need lol To be fair that's how humans work, it isn't like language is a separate sense lol

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

Oct 20

DeepSeek released an OCR model today. Their motivation is really interesting: they want to use visual modality as an efficient compression medium for textual information, and use this to solve long-context challenges in LLMs. Of course, they are using it to get more training…

iScienceLuvr's tweet image. DeepSeek released an OCR model today.

Their motivation is really interesting: they want to use visual modality as an efficient compression medium for textual information, and use this to solve long-context challenges in LLMs.

Of course, they are using it to get more training…

Aditya Arun reposted

机器之心 JIQIZHIXIN

@jiqizhixin

Oct 21

Flow-based RL policies are unstable, and how can we fix them? Tsinghua, CMU, and others just uncovers the root cause: flow rollouts = residual RNNs, inheriting the same vanishing/exploding gradient problem. To stabilize training, the authors propose: - Flow-G – gated velocity…

jiqizhixin's tweet image. Flow-based RL policies are unstable, and how can we fix them?

Tsinghua, CMU, and others just uncovers the root cause: flow rollouts = residual RNNs, inheriting the same vanishing/exploding gradient problem.

To stabilize training, the authors propose:

- Flow-G – gated velocity…

Aditya Arun reposted

Andrej Karpathy

@karpathy

Oct 20

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language…

vLLM

@vllm_project

Oct 20

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping…

vllm_project's tweet image. 🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.

🧠 Compresses visual contexts up to 20× while keeping…

Aditya Arun reposted

AK

@_akhaliq

Oct 20

BLIP3o-NEXT Next Frontier of Native Image Generation

Aditya Arun reposted

Jessy Lin

@realJessyLin

Oct 21

🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full…

realJessyLin's tweet image. 🧠 How can we equip LLMs with memory that allows them to continually learn new things?

In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge.

While full…

Aditya Arun reposted

Xintao Wang

@xinntao

Oct 21

🥳🥳DiT w/o VAE, but with Semantic Encoder, such as DINO! We introduce SVG (Self-supervised representation for Visual Generation) . Paper: huggingface.co/papers/2510.15… Code: github.com/shiml20/SVG

xinntao's tweet image. 🥳🥳DiT w/o VAE, but with Semantic Encoder, such as DINO!
We introduce SVG (Self-supervised representation for Visual Generation) .
Paper: huggingface.co/papers/2510.15…
Code: github.com/shiml20/SVG

Aditya Arun reposted

Jiachen Lei

@JiachenLei

Oct 15

No more reliance on VAE or DINO! Similar to the motivation of RAE by @sainingxie's team We propose EPG: SSL pre-training + end-to-end FT = SOTA FID on IN256/512! Works nicely for both DM and CM amap-ml.github.io/EPG/ (1/n)🧵 Next: Why training DMs on raw pixels is difficult?

JiachenLei's tweet image. No more reliance on VAE or DINO! Similar to the motivation of RAE by @sainingxie's team
We propose EPG: SSL pre-training + end-to-end FT = SOTA FID on IN256/512!

Works nicely for both DM and CM

amap-ml.github.io/EPG/
(1/n)🧵
Next: Why training DMs on raw pixels is difficult?

Aditya Arun reposted

PyTorch

@PyTorch

Oct 15

PyTorch 2.9 is now available, introducing key updates to performance, portability, and the developer experience. This release includes a stable libtorch ABI for C++/CUDA extensions, symmetric memory for multi-GPU kernels, expanded wheel support to include ROCm, XPU, and CUDA 13,…