Nimit Kalra @ ICML 2025

@qw3rtman

research @haizelabs @columbia, prev @citadel @utaustin currently feynman technique-ing my way through life

nimit.io

Joined October 2011

195 posts · 1K followers · 983 following

Nimit Kalra @ ICML 2025 reposted

Leonard Tang

@leonardtang_

Oct 28

Customers building AI agents often lament at the difficulty of using off-the-shelf LLM Eval tools for their specific app. While there's no doubt that human supervision is required, not all supervision is the same. Why not transform the supervision problem to make it easier?

Nimit Kalra @ ICML 2025 reposted

Liyan Tang

@LiyanTang4

Sep 19

Our paper "ChartMuseum 🖼️" is now accepted to #NeurIPS2025 Datasets and Benchmarks Track! Even the latest models, such as GPT-5 and Gemini-2.5-Pro, still cannot do well on challenging 📉chart understanding questions, especially on those that involve visual reasoning 👀!

Liyan Tang

@LiyanTang4

May 20

Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts!

✍🏻Entirely human-written questions by 13 CS researchers
👀Emphasis on visual reasoning – hard to be verbalized via text CoTs
📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B

Nimit Kalra @ ICML 2025 reposted

Jack Youstra

@JackYoustra

Sep 3

New Anthropic Research: Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks. Fine-tuning LLMs through APIs can be harmful even if the data used for fine-tuning does not appear to be, often because the data encodes a hidden message.

Nimit Kalra @ ICML 2025 reposted

Leonard Tang

@leonardtang_

Sep 3

born to do research forced to build b2b saas.... luckily at haize, you can do both (we are hiring)

Nimit Kalra @ ICML 2025 reposted

Y Combinator

@ycombinator

Aug 26

JetBrains is no longer behind. @firebender_com just launched the first-ever background coding agents for all JetBrains IDEs. These coding agents are incredibly intelligent, have isolated workspaces, and don’t require any cloud setup.

Nimit Kalra @ ICML 2025 reposted

Y Combinator

@ycombinator

Aug 20

.@RoundtableHQ_'s Proof of Human uses behavioral biometrics to stop bots and AI spam. With 87% accuracy (vs. Google’s 69% and Cloudflare’s 33%), it gives you frictionless, real-time authentication with a one-line API. ycombinator.com/launches/OEh-r… Congrats on the launch @_magrawal &…

Nimit Kalra @ ICML 2025 reposted

Joe Melkonian

@joemelko

Aug 19

Are we really running out of data??? No. We're just not using it correctly. The solution: let the model learn which data it needs to learn!!! 1/n

Nimit Kalra @ ICML 2025 reposted

Aman Gottumukkala

@AmanGotchu

Aug 7

GPT-5 is now live in Firebender for Android Studio, free for a limited time 🚀 It’s definitely the best coding model I’ve ever used. Try it today and tell us what you think.

Nimit Kalra @ ICML 2025 reposted

Kevin 👊🔥

@kevo1ution

Aug 7

Android engineers have access to GPT 5 through Firebender. Enjoy

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 16

Flying out to #ICML2025 tonight! Always down to chat about unverifiable domains, evals, red-teaming, safeguards, or just meet cool people. I’ll be a panelist at the Methods and Opportunities at Small Scale workshop, sharing our work on tiny generalist reward models…

Nimit Kalra @ ICML 2025 reposted

Wing Lian (caseus)

@winglian

Jul 15

The current state of the ecosystem for post-training using GRPO w/ vllm + flash attention is frustratingly brittle.
- The most recent vllm only supports PyTorch==2.7.0
- vllm requires xformers, but specifically only v0.0.30 is supported for torch 2.7.0. Any prior version of…

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 10

can’t even escape the arxiv speak in the group chat

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 7

Vogent has a fantastic battle-tested inference stack, glad to see they opened it up + already have a finetuning product. From what I've seen, open-source voice models solve the 0 → 1 quite well but require a lot of post-hoc tuning to get right

Vogent

@vogentai

Jul 7

Today we're launching Vogent Voicelab: an optimized API to run top open-source voice models, like Sesame's CSM-1B, Dia, Orpheus, and more.

Nimit Kalra @ ICML 2025

@qw3rtman

Jul 1

chart crime so bad you gotta transcribe the values by hand and plot it yourself

Nimit Kalra @ ICML 2025

@qw3rtman

Jun 30

evals evals evals

Brendan (can/do)

@BrendanFoody

Jun 30

Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals. RL is becoming so effective that models will be able to…