Katrina Drozdov (Evtimova)

@stochasticdoggo

AI researcher | PhD from @NYUDataScience | Bulgarian yogurt, prime numbers, and dogs bring me joy | she/her

Katrina Drozdov (Evtimova) reposted

After spending billions of dollars of compute, GPT-5 learned that the most effective use of its token budget is to give itself a little pep talk every time it figures something out. Maybe you should do the same.

What?



Katrina Drozdov (Evtimova) reposted

Tinker provides the right abstraction layer for post-training R&D -- it's the infrastructure I've always wanted. I'm excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without…

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…



Katrina Drozdov (Evtimova) reposted

The application for a research fellowship at the Flatiron Institute in the Center for Computational Math is now live! This includes positions for ML and stats. The deadline is Dec 1. Links below with more details.


Finally dipped my toes into RL post-training. I trained a code generation LLM with GRPO using open-r1. Here are my 9 takeaways: kevtimova.github.io/posts/grpo/
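
For context, a minimal sketch of the group-relative advantage at the heart of GRPO, assuming one scalar reward per sampled completion (the function name and shapes here are illustrative, not open-r1's API):

```python
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    # rewards: [num_prompts, samples_per_prompt], one score per completion.
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    # No learned value baseline: each completion is scored against the
    # statistics of its own group of samples for the same prompt.
    return (rewards - mean) / (std + 1e-8)

print(grpo_advantages(torch.tensor([[1.0, 0.0, 0.5, 1.0]])))
```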


Katrina Drozdov (Evtimova) reposted

This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding the losses and then calling backward once, compute the backward on each loss as you go (which frees its computational graph). The results will be exactly identical.

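A minimal sketch of the trick, with a hypothetical stand-in model and loss; gradients accumulate in `.grad` across backward() calls, which is why the two versions match:

```python
import torch

# Hypothetical stand-ins: any model and any per-chunk losses behave the same.
model = torch.nn.Linear(10, 1)
chunks = [torch.randn(8, 10) for _ in range(4)]

# Memory-hungry version: summing keeps all four graphs alive until backward().
#   total = sum(model(x).pow(2).mean() for x in chunks)
#   total.backward()

# The trick: call backward() per loss, freeing each graph immediately.
# Gradients accumulate in .grad, so the final grads are exactly identical.
for x in chunks:
    loss = model(x).pow(2).mean()
    loss.backward()
```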

Katrina Drozdov (Evtimova) reposted

excited to finally share on arXiv what we've known for a while now: All Embedding Models Learn The Same Thing. embeddings from different models are SO similar that we can map between them based on structure alone, without *any* paired data. feels like magic, but it's real: 🧵

this is sick. all i'll say is that these GIFs are proof that the biggest bet of my research career is gonna pay off. excited to say more soon
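
Not the paper's method, but a toy illustration of what "based on structure alone" means: if two models embed the same inputs (here, model B is just a noisy random rotation of model A), their pairwise-similarity matrices nearly coincide even though the raw vectors live in different spaces:

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_sims(E):
    # Row-normalize, then take all pairwise cosine similarities.
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    return E @ E.T

# Stand-ins for two models' embeddings of the same 100 inputs: model B is a
# random orthogonal rotation of model A plus noise, so only geometry is shared.
A = rng.normal(size=(100, 64))
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))
B = A @ Q + 0.05 * rng.normal(size=(100, 64))

corr = np.corrcoef(cosine_sims(A).ravel(), cosine_sims(B).ravel())[0, 1]
print(f"similarity-structure correlation: {corr:.3f}")  # close to 1.0
```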



Katrina Drozdov (Evtimova) reposted

it's been more than a decade since KD was proposed, and i've been using it all along .. but why does it work? too many speculations but no simple explanation. @_sungmin_cha and i decided to see if we can come up with the simplest working description of KD in this work. we ended…
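
For readers who haven't used it, a minimal sketch of the standard Hinton-style KD objective the thread is asking about (the temperature T and T² scaling follow the usual convention, not necessarily this paper's formulation):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T: float = 2.0):
    # Soften both distributions with temperature T, then match the student
    # to the teacher under KL; the T**2 factor keeps gradient scale stable.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T**2
```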


Katrina Drozdov (Evtimova) reposted

CDS PhD Vlad Sobal (@vlad_is_ai) and Courant PhD Wancong (Kevin) Zhang show that when good data is scarce, planning beats traditional reinforcement learning. With @kchonyc, @timrudner, and @ylecun. nyudatascience.medium.com/when-good-data…


We’re researching and designing world models. In the meantime, you definitely need RAG, and FreshStack will help.

Existing IR/RAG benchmarks are unrealistic: they’re often derived from easily retrievable topics, rather than grounded in solving real user problems. 🧵Introducing 𝐅𝐫𝐞𝐬𝐡𝐒𝐭𝐚𝐜𝐤, a challenging RAG benchmark on niche, recent topics. Work done during intern @databricks 🧱



Katrina Drozdov (Evtimova) reposted

No labels. No problems. 😎 Check out TAO, an impactful new approach from the Mosaic Research Team!

The hardest part about finetuning LLMs is that people generally don't have high-quality labeled data. Today, @databricks introduced TAO, a new finetuning method that only needs inputs, no labels necessary. Best of all, it actually beats supervised finetuning on labeled data.



Huge congratulations on the launch! @reflection_ai has an incredible team and an ambitious mission—excited to follow your progress!

Today I’m launching @reflection_ai with my friend and co-founder @real_ioannis. Our team pioneered major advances in RL and LLMs, including AlphaGo and Gemini. At Reflection, we're building superintelligent autonomous systems. Starting with autonomous coding.



Katrina Drozdov (Evtimova) reposted

Today I’m launching @reflection_ai with my friend and co-founder @real_ioannis. Our team pioneered major advances in RL and LLMs, including AlphaGo and Gemini. At Reflection, we're building superintelligent autonomous systems. Starting with autonomous coding.


Katrina Drozdov (Evtimova) reposted

I shared a controversial take the other day at an event and I decided to write it down in a longer format: I’m afraid AI won't give us a "compressed 21st century". The "compressed 21st century" comes from Dario's "Machine of Loving Grace" and if you haven’t read it, you probably…


I asked ChatGPT, Gemini, and Claude for a clever joke. They all gave me the same one. Either AI is merging into a hive mind… or humor has officially been solved mathematically!


Katrina Drozdov (Evtimova) reposted

VideoJAM is our new framework for improved motion generation from @AIatMeta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵


Katrina Drozdov (Evtimova) reposted

The buzz over DeepSeek this week crystallized, for many people, a few important trends that have been happening in plain sight: (i) China is catching up to the U.S. in generative AI, with implications for the AI supply chain. (ii) Open weight models are commoditizing the…


The principle of least effort, from psychology, describes how we favor efficiency over effort. It aligns with System 1 (fast, intuitive) vs. System 2 (slow, deliberate) reasoning. AI faces a similar challenge: knowing when to rely on heuristics vs. deeper reasoning.


Katrina Drozdov (Evtimova) reposted

The recording of the GAN test of time talk by @dwf is now publicly available: neurips.cc/virtual/2024/t…


Katrina Drozdov (Evtimova) reposted

Got a diffusion model? What if there were a way to: - Get SOTA text-to-image prompt fidelity, with no extra training! - Steer continuous and discrete (e.g. text) diffusions - Beat larger models using less compute - Outperform fine-tuning - And keep your stats friends happy !?


Katrina Drozdov (Evtimova) reposted

So…world model = video model?

