Rohit Malhotra

@rohit_malh5

Openhands Maintainer | Ex-CTO @sitewizai | NLP @ CMU | Primarily interested in Agents | Secondary interests in creative design

malhotra5.github.io

Joined July 2018

159 Posts · 94 Followers · 67 Following

Rohit Malhotra reposted

OpenHands

@OpenHandsDev

Dec 2

Have you heard of the inner and outer loops of development? It's a term coined at Microsoft in the late 2010s, as development was moving from single workstations to cloud-based collaboration.

OpenHands was founded on the belief that highly autonomous SWE agents should be open, accessible, and free for everyone. Our $18.8M Series A helps push this vision. Highly secure, highly autonomous, model‑agnostic, async cloud‑based SWE agents platform. ALL shipped openly!

OpenHands

@OpenHandsDev

Nov 18

Big news: OpenHands has raised an $18.8M Series A led by @MadronaVentures to build the open standard for autonomous software development. Open source, model agnostic, and already used across thousands of repos. Read more: openhands.dev/blog/weve-just…

Rohit Malhotra reposted

Xuhui Zhou@NeurIPS

@nlpxuhui

Nov 17

New blog post out! 📜 We share our latest research efforts to build more effective, human-centered AI collaboration. Months ago, I was genuinely surprised by how quickly AI agents were improving, and with that came a deep fear of being replaced, of humans slowly losing agency as…

nlpxuhui's tweet card. Exploring what makes AI agents truly effective for users, beyond benchmark performance.

The Quest of User-Effective AI Agents

Source: xuhuiz.com

Rohit Malhotra reposted

Graham Neubig

@gneubig

Nov 17

This is honestly my new favorite use case for agents, it feels pretty magical. See an error popped up in your SaaS -> copy-paste the error into a github workflow -> 8-10 minutes later you have a diagnosis and a patch (and for us, it's probably about 80% correct).

OpenHands

@OpenHandsDev

Nov 17

This is how we use agents to debug errors in our production web service:
1. Get @datadoghq logs and find out when the error started
2. Look through the commit history, finding any suspicious code changes
3. If something is found, report back to human engineer w/ a patch

Rohit Malhotra reposted

Graham Neubig

@gneubig

Nov 7

The video for my talk "Lessons from the trenches in building usable coding agents" has been uploaded! youtube.com/watch?v=p7zebv… It's an overview of some of the problems we faced and research work we've done to fix them over the past 1.5 years, hope it's interesting!

gneubig's tweet card. Lessons from the Trenches on Building Usable Coding Agents - Graham...

Source: youtube.com

Rohit Malhotra reposted

OpenHands

@OpenHandsDev

Nov 6

What do these tasks have in common?
- Fixing security vulnerabilities
- Data entry from messy unstructured forms
- Version upgrades
They're repetitive but important tasks that can be solved in a few lines of code by our new OpenHands Software Agent SDK.

Rohit Malhotra reposted

Nilesh Trivedi

@nileshtrivedi

Nov 5

Strange that people are not giving credit to the CodeAct paper:

Rohit Malhotra reposted

Xuhui Zhou@NeurIPS

@nlpxuhui

Oct 31

Hoping your coding agents could understand you and adapt to your preferences? Meet TOM-SWE, our new framework for coding agents that don’t just write code, but model the user's mind persistently (ranging from general preferences to small details) arxiv: arxiv.org/abs/2510.21903…

Rohit Malhotra reposted

Yueqi Song

@yueqi_song

Oct 29

We just built and released the largest dataset for supervised fine-tuning of agentic LMs, 1.27M trajectories (~36B tokens)! Up until now, large-scale SFT for agents is rare - not for lack of data, but because of fragmentation across heterogeneous formats, tools, and interfaces.…

Rohit Malhotra

@rohit_malh5

Sep 24

SWE-Agents are crushing benchmarks like SWE-Bench but are still fragile in the wild. I argue A/B testing is the missing piece for evaluating and improving SWE-Agents. Proof in Production: Evaluating Effectiveness of SWE Agents with A/B Tests open.substack.com/pub/rohitmalh/…

Rohit Malhotra reposted

Jiseung Hong

@jiseungh99

Sep 22

We are excited to launch the ⚔️PR Arena⚔️ leaderboard! Full results will be revealed after a certain milestone of community votes. Fix your GitHub issues for free and vote for better fix! 👉Leaderboard & Setup Guide: prarena.web.app

Rohit Malhotra reposted

Valerie Chen

@valeriechen_

Sep 16

A recent study by Becker et al. finds AI copilots like Cursor slowed expert OSS devs by 19%. But what happens when we compare copilots to more autonomous coding agents? Our study finds the opposite story: agents can boost productivity. 🧵

Rohit Malhotra reposted

Robert Brennan

@rbren_dev

Sep 9

I'll be speaking about automating large-scale refactors with OpenHands at AI Engineer Paris! It's amazing how much software agents can get done if you orchestrate them thoughtfully.

Rohit Malhotra reposted

Graham Neubig

@gneubig

Sep 3

Which LM is better at agentic coding? We have a bunch of useful academic benchmarks like SWE-Bench, but we don't have a good comparison of agentic coding LMs *in the wild*. To solve this, we released PR Arena: github.com/neulab/pr-arena

gneubig's tweet card. ⚔️ OpenHands PR Arena ⚔️ is a platform for evaluating and benchmarking agentic coding assistants through paired pull request (PR) generations. - neulab/pr-arena

Source: github.com

Jiseung Hong

@jiseungh99

Sep 3

Introducing ⚔️PR Arena⚔️ - free AI coding agents to fix real GitHub issues. Claude Sonnet 4 vs Gemini 2.5 Pro… Who writes better pull requests? 👉 Install here: github.com/apps/openhands… Powered by @allhands_ai

Rohit Malhotra reposted

OpenHands

@OpenHandsDev

Aug 25

Having appropriate tests makes a world of difference for agent-driven development. If your agent can write a test to localize a bug or exercise a new feature, the following implementation is much more solid. OpenHands+GPT-5 is now 🥇 on the SWT-Bench testing leaderboard!

Rohit Malhotra reposted

OpenHands

@OpenHandsDev

Aug 22

We built OpenHands in the open (~60K ⭐️ on GitHub). Now we’re giving back to the OSS ecosystem. Announcing the OpenHands Cloud OSS Credit Program → $100–$500 credits for maintainers. 👉 Learn how to apply!

Rohit Malhotra reposted

Robert Brennan

@rbren_dev

Jul 22

Nothing more frustrating than seeing "private scaffold" on public benchmark results
I love that model providers like Qwen and Mistral are now reporting their results specifically using OpenHands as the scaffold--feels like we're becoming a standard here x.com/Alibaba_Qwen/s…

Qwen

@Alibaba_Qwen

Jul 22

>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…