Nimit Kalra @ ICML 2025

@qw3rtman

research @haizelabs @columbia, prev @citadel @utaustin currently feynman technique-ing my way through life

nimit.io

Joined October 2011

190 posts · 1K followers · 959 following

Nimit Kalra @ ICML 2025 reposted

Liyan Tang

@LiyanTang4

Sep 19

Our paper "ChartMuseum 🖼️" is now accepted to #NeurIPS2025 Datasets and Benchmarks Track! Even the latest models, such as GPT-5 and Gemini-2.5-Pro, still cannot do well on challenging 📉chart understanding questions, especially on those that involve visual reasoning 👀!

Liyan Tang

@LiyanTang4

May 20

Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts! ✍🏻Entirely human-written questions by 13 CS researchers 👀Emphasis on visual reasoning – hard to be verbalized via text CoTs 📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B

Nimit Kalra @ ICML 2025 reposted

Jack Youstra

@JackYoustra

Sep 3

New Anthropic Research: Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks. Fine-tuning LLMs through APIs can be harmful even if the data used for fine-tuning does not appear to be, often because the data encodes a hidden message.

Nimit Kalra @ ICML 2025 reposted

Leonard Tang

@leonardtang_

Sep 3

born to do research forced to build b2b saas.... luckily at haize, you can do both (we are hiring)

Nimit Kalra @ ICML 2025 reposted

Y Combinator

@ycombinator

Aug 26

JetBrains is no longer behind. @firebender_com just launched the first-ever background coding agents for all JetBrains IDEs. These coding agents are incredibly intelligent, have isolated workspaces, and don’t require any cloud setup.

Nimit Kalra @ ICML 2025 reposted

Y Combinator

@ycombinator

Aug 20

.@RoundtableHQ_'s Proof of Human uses behavioral biometrics to stop bots and AI spam. With 87% accuracy (vs. Google’s 69% and Cloudflare’s 33%), it gives you frictionless, real-time authentication with a one-line API. ycombinator.com/launches/OEh-r… Congrats on the launch @_magrawal &…

Nimit Kalra @ ICML 2025 reposted

Joe Melkonian

@joemelko

Aug 19

Are we really running out of data??? No. We're just not using it correctly. The solution: let the model learn which data it needs to learn!!! 1/n

Nimit Kalra @ ICML 2025 reposted

Aman Gottumukkala

@AmanGotchu

Aug 7

GPT-5 is now live in Firebender for Android Studio, free for a limited time 🚀 It’s definitely the best coding model I’ve ever used. Try it today and tell us what you think.

Nimit Kalra @ ICML 2025 reposted

Kevin 👊🔥

@kevo1ution

Aug 7

Android engineers have access to GPT 5 through Firebender. Enjoy

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 16

Flying out to #ICML2025 tonight! Always down to chat about unverifiable domains, evals, red-teaming, safeguards, or just meet cool people. I’ll be a panelist at the Methods and Opportunities at Small Scale workshop, sharing our work on tiny generalist reward models…

Nimit Kalra @ ICML 2025 reposted

Wing Lian (caseus)

@winglian

Jul 15

The current state of the ecosystem for post-training using GRPO w/ vllm + flash attention is frustratingly brittle. - The most recent vllm only supports PyTorch==2.7.0 - vllm requires xformers, but specifically only v0.0.30 is supported for torch 2.7.0. Any prior version of…

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 10

can’t even escape the arxiv speak in the group chat

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 7

Vogent has a fantastic battle-tested inference stack, glad to see they opened it up + already have a finetuning product. From what I've seen, open-source voice models solve the 0 → 1 quite well but require a lot of post-hoc tuning to get right

Vogent

@vogentai

Jul 7

Today we're launching Vogent Voicelab: an optimized API to run top open-source voice models, like Sesame's CSM-1B, Dia, Orpheus, and more.

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 1

chart crime so bad you gotta transcribe the values by hand and plot it yourself

Nimit Kalra @ ICML 2025

@qw3rtman

Jun 30

evals evals evals

Brendan (can/do)

@BrendanFoody

Jun 30

Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals. RL is becoming so effective that models will be able to…

Nimit Kalra @ ICML 2025 reposted

Leonard Tang

@leonardtang_

Jun 27

New open-source alert! spoken: a unified abstraction over realtime speech-to-speech foundation models. Run any S2S model from OpenAI, Google, Amazon — one interface with one line of code.