Katrina Drozdov (Evtimova)

@stochasticdoggo

AI researcher | PhD from @NYUDataScience | Bulgarian yogurt, prime numbers, and dogs bring me joy | she/her

New York, USA

kevtimova.github.io

Joined September 2017

340 posts, 391 followers, 350 following

Katrina Drozdov (Evtimova) reposted

Bob McGrew

@bobmcgrewai

October 9

After spending billions of dollars of compute, GPT-5 learned that the most effective use of its token budget is to give itself a little pep talk every time it figures something out. Maybe you should do the same.

Alex Tabarrok

@ATabarrok

October 9

What?

Katrina Drozdov (Evtimova) reposted

John Schulman

@johnschulman2

October 1

Tinker provides an abstraction layer that is the right one for post-training R&D -- it's the infrastructure I've always wanted. I'm excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without…

Thinking Machines

@thinkymachines

October 1

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…

Katrina Drozdov (Evtimova) reposted

Diana Cai

@dianarycai

September 12

The application for a research fellowship at the Flatiron Institute in the Center for Computational Math is now live! This includes positions for ML and stats. The deadline is Dec 1. Links below with more details.

Katrina Drozdov (Evtimova)

@stochasticdoggo

July 8

Finally dipped my toes into RL post-training. I trained a code generation LLM with GRPO using open-r1. Here are my 9 takeaways: kevtimova.github.io/posts/grpo/

Katrina Drozdov (Evtimova) reposted

Gabriele Berton

@gabriberton

May 31, 2024

This simple pytorch trick will cut in half your GPU memory use / double your batch size (for real). Instead of adding losses and then computing backward, it's better to compute the backward on each loss (which frees the computational graph). Results will be exactly identical

Katrina Drozdov (Evtimova) reposted

Jack Morris

@jxmnop

May 21

excited to finally share on arxiv what we've known for a while now: All Embedding Models Learn The Same Thing. embeddings from different models are SO similar that we can map between them based on structure alone, without *any* paired data. feels like magic, but it's real: 🧵

Jack Morris

@jxmnop

February 23

this is sick. all i'll say is that these GIFs are proof that the biggest bet of my research career is gonna pay off. excited to say more soon

Katrina Drozdov (Evtimova) reposted

Kyunghyun Cho

@kchonyc

May 20

it's been more than a decade since KD was proposed, and i've been using it all along .. but why does it work? too many speculations but no simple explanation. @_sungmin_cha and i decided to see if we can come up with the simplest working description of KD in this work. we ended…

Katrina Drozdov (Evtimova) reposted

NYU Center for Data Science

@NYUDataScience

May 7

CDS PhD Vlad Sobal (@vlad_is_ai) and Courant PhD Wancong (Kevin) Zhang show that when good data is scarce, planning beats traditional reinforcement learning. With @kchonyc, @timrudner, and @ylecun. nyudatascience.medium.com/when-good-data…

NYUDataScience's tweet card. When AI can’t rely on good data, planning ahead beats traditional reinforcement learning.

When Good Data Is Scarce, Planning Beats Reinforcement Learning in AI Decision-Making

Source: nyudatascience.medium.com

Katrina Drozdov (Evtimova)

@stochasticdoggo

April 18

We’re working on researching and designing world models. In the meantime, you definitely need RAG and FreshStack will help.

Nandan Thakur

@beirmug

April 18

Existing IR/RAG benchmarks are unrealistic: they’re often derived from easily retrievable topics, rather than grounded in solving real user problems. 🧵Introducing 𝐅𝐫𝐞𝐬𝐡𝐒𝐭𝐚𝐜𝐤, a challenging RAG benchmark on niche, recent topics. Work done during intern @databricks 🧱

Katrina Drozdov (Evtimova) reposted

Andrew Drozdov

@mrdrozdov

March 25

No labels. No problems. 😎 Check out the new impactful approach (TAO) from the Mosaic Research Team!

Jonathan Frankle

@jefrankle

March 25

The hardest part about finetuning LLMs is that people generally don't have high-quality labeled data. Today, @databricks introduced TAO, a new finetuning method that only needs inputs, no labels necessary. Best of all, it actually beats supervised finetuning on labeled data.

Katrina Drozdov (Evtimova)

@stochasticdoggo

March 7

Huge congratulations on the launch! @reflection_ai has an incredible team and an ambitious mission—excited to follow your progress!

Misha Laskin

@MishaLaskin

March 7

Today I’m launching @reflection_ai with my friend and co-founder @real_ioannis. Our team pioneered major advances in RL and LLMs, including AlphaGo and Gemini. At Reflection, we're building superintelligent autonomous systems. Starting with autonomous coding.

Katrina Drozdov (Evtimova) reposted

Misha Laskin

@MishaLaskin

March 7

Katrina Drozdov (Evtimova) reposted

Thomas Wolf

@Thom_Wolf

March 6

I shared a controversial take the other day at an event and I decided to write it down in a longer format: I’m afraid AI won't give us a "compressed 21st century". The "compressed 21st century" comes from Dario's "Machines of Loving Grace" and if you haven’t read it, you probably…

Katrina Drozdov (Evtimova)

@stochasticdoggo

March 6

I asked ChatGPT, Gemini, and Claude for a clever joke. They all gave me the same one. Either AI is merging into a hive mind… or humor has officially been solved mathematically!

Katrina Drozdov (Evtimova) reposted

Hila Chefer

@hila_chefer

February 4

VideoJAM is our new framework for improved motion generation from @AIatMeta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵

Katrina Drozdov (Evtimova) reposted

Andrew Ng

@AndrewYNg

January 30

The buzz over DeepSeek this week crystallized, for many people, a few important trends that have been happening in plain sight: (i) China is catching up to the U.S. in generative AI, with implications for the AI supply chain. (ii) Open weight models are commoditizing the…

Katrina Drozdov (Evtimova)

@stochasticdoggo

January 29

The principle of least effort, from psychology, describes how we favor efficiency over effort. It aligns with System 1 (fast, intuitive) vs. System 2 (slow, deliberate) reasoning. AI faces a similar challenge: knowing when to rely on heuristics vs. deeper reasoning.

Katrina Drozdov (Evtimova) reposted

Ian Goodfellow

@goodfellow_ian

January 28

The recording of the GAN test of time talk by @dwf is now publicly available: neurips.cc/virtual/2024/t…

Katrina Drozdov (Evtimova) reposted

Raghav Singhal

@_rk_singhal

January 21

Got a diffusion model?

What if there were a way to:
- Get SOTA text-to-image prompt fidelity, with no extra training!
- Steer continuous and discrete (e.g. text) diffusions
- Beat larger models using less compute
- Outperform fine-tuning
- And keep your stats friends happy !?

Katrina Drozdov (Evtimova) reposted

Jia-Bin Huang

@jbhuang0604

January 7

So…world model = video model?

Kyunghyun Cho

@kchonyc

Sam Bowman

@sleepinyourhat

Eugene Vinitsky 🦋

@EugeneVinitsky

Nan Wu

@NanWu__

Aishwarya Kamath

@ashkamath20

David Brandfonbrener

@brandfonbrener

Jaan Altosaar Li, PhD

@thejaan

Andrew Drozdov

@mrdrozdov

Roberta Raileanu

@robertarail

Jason Phang

@zhansheng

Pavel Izmailov

@Pavel_Izmailov

Behnam Neyshabur

@bneyshabur

aahlad puli

@aahladpuli

NYU Center for Data Science

@NYUDataScience

Arthur Spirling

@arthur_spirling

Naomi Saphra

@nsaphra

Ofir Press

@OfirPress

Rajarshi Das

@RajDasNYC

Torres Torres

@TorresHereWeGo

Hossein Sharifi N. 🍁

@Hossein_SHN

Yunqianliu

@Yunqianliu75601

xianning wang

@Frederickwangxn

Iris

@Iris_dot_exe

mr_usuff

@mr_usuff

Muhang Tian

@muhang_tian

Girish

@googrish

Hafez Ghaemi

@hafezghm

Kumar

@kumar__nn

Arya Grayeli

@AryaGrayeli

Edwin McKenzie

@EdwinMcKen91903

Lucas Maes

@lucasmaes_

Soasoughst

@SoasoughstSXp8

Younes basher

@bashiryounis98

Aakanksha Chowdhery

@achowdhery

Vishakh Padmakumar

@vishakh_pk

Shannon Yang

@shannonyangsky

Maksymilian Wolski

@maxiewolski

Li_F2_H2

@Li_F2_H2

Nadav Timor

@NadavTimor

Brandon Amos

@brandondamos

qwyy

@hyjhgszjjxhr

Purushottam Lal

@Purusho31622734

ahmed

@ahmedshoukr_

Jayoo Hwang

@JayooHwang

Andres Castillo

@a_castillo_v

Alexander Naumenko

@AlexanderNaume2

Shehzaad Dhuliawala

@shehzaadzd

Epsilon Guanlin Lee

@Epsilon_Lee

Connor Jennings

@cojennin

bagels.ai

@bagelsAI

substrat.ai

@substratai

Kwek Ming Hong

@kwekmh

Mariya I. Vasileva

@mariyaivasileva

Heada

@Heada254604

Jacob Portes

@JacobianNeuro

alireza siavashi

@siavashiar

Lucian Li

@lucianli123

Nova

@Tasmia_Nova

Tem

@TheMaximu5

Varun Gangal

@VarunGangal

Shamanth R Nayak

@nayak_shamanth

AG32

@burnerkit

Sean

@realseanyewest1

Dasun Athukorala

@Dasun_ath

Aditya Varma

@ddavarma_

Avrajit Ghosh

@GhoshAvrajit

Kyunghyun Cho

@kchonyc

Yann LeCun

@ylecun

hardmaru

@hardmaru

Alfredo Canziani

@alfcnz

Sam Bowman

@sleepinyourhat

Kyle Cranmer

@KyleCranmer

Andrej Karpathy

@karpathy

Andrew Gordon Wilson

@andrewgwils

Nan Wu

@NanWu__

Sasha Rush

@srush_nlp

Soumith Chintala

@soumithchintala

Tim Rocktäschel

@_rockt

Aishwarya Kamath

@ashkamath20

David Brandfonbrener

@brandfonbrener

Jaan Altosaar Li, PhD

@thejaan

Wojciech Zaremba

@woj_zaremba
