Rohit Malhotra
@rohit_malh5
OpenHands Maintainer | Ex-CTO @sitewizai | NLP @ CMU | Primarily interested in Agents | Secondary interests in creative design
Have you heard of the inner and outer loops of development? The terms were coined at Microsoft in the late 2010s, as development was moving from single workstations to cloud-based collaboration.
OpenHands was founded on the belief that highly autonomous SWE agents should be open, accessible, and free for everyone. Our $18.8M Series A helps push this vision. Highly secure, highly autonomous, model‑agnostic, async cloud‑based SWE agents platform. ALL shipped openly!
Big news: OpenHands has raised an $18.8M Series A led by @MadronaVentures to build the open standard for autonomous software development. Open source, model agnostic, and already used across thousands of repos. Read more: openhands.dev/blog/weve-just…
New blog post out! 📜 We share our latest research efforts to build more effective, human-centered AI collaboration. Months ago, I was genuinely surprised by how quickly AI agents were improving, and with that came a deep fear of being replaced, of humans slowly losing agency as…
This is honestly my new favorite use case for agents; it feels pretty magical. See an error pop up in your SaaS -> copy-paste the error into a GitHub workflow -> 8-10 minutes later you have a diagnosis and a patch (and for us, it's probably about 80% correct).
This is how we use agents to debug errors in our production web service: 1. Get @datadoghq logs and find out when the error started 2. Look through the commit history for any suspicious code changes 3. If something is found, report back to a human engineer w/ a patch
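The three steps above can be sketched roughly as follows. This is a minimal illustration and not the actual OpenHands workflow: it assumes the error timestamps have already been pulled out of the logs and the commit history is already loaded, and the `Commit` class, the function names, and the 24-hour lookback window are all hypothetical. It stops at the report and does not generate a patch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Commit:
    # Hypothetical record; a real one would come from the git history.
    sha: str
    committed_at: datetime
    message: str


def first_error_time(error_times):
    """Step 1: the earliest timestamp at which the error shows up in the logs."""
    return min(error_times)


def suspicious_commits(commits, error_start, window=timedelta(hours=24)):
    """Step 2: commits that landed within `window` before the error started, newest first."""
    candidates = [
        c for c in commits
        if error_start - window <= c.committed_at <= error_start
    ]
    return sorted(candidates, key=lambda c: c.committed_at, reverse=True)


def build_report(error_start, candidates):
    """Step 3: a plain-text summary to hand back to a human engineer."""
    if not candidates:
        return f"Error first seen at {error_start.isoformat()}; no commits in the lookback window."
    lines = [f"Error first seen at {error_start.isoformat()}; candidate commits:"]
    lines += [
        f"- {c.sha} ({c.committed_at.isoformat()}): {c.message}"
        for c in candidates
    ]
    return "\n".join(lines)
```

A time window is only a crude filter for "suspicious"; in practice the agent would also read the diffs and compare them against the stack trace before proposing anything.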
The video for my talk "Lessons from the trenches in building usable coding agents" has been uploaded! youtube.com/watch?v=p7zebv… It's an overview of some of the problems we faced and research work we've done to fix them over the past 1.5 years, hope it's interesting!
Lessons from the Trenches on Building Usable Coding Agents - Graham...
What do these tasks have in common? - Fixing security vulnerabilities - Data entry from messy unstructured forms - Version upgrades They're repetitive but important tasks that can be solved in a few lines of code by our new OpenHands Software Agent SDK.
Strange that people are not giving credit to the CodeAct paper:
Hoping your coding agents could understand you and adapt to your preferences? Meet TOM-SWE, our new framework for coding agents that don’t just write code, but model the user's mind persistently (ranging from general preferences to small details) arxiv: arxiv.org/abs/2510.21903…
We just built and released the largest dataset for supervised fine-tuning of agentic LMs, 1.27M trajectories (~36B tokens)! Up until now, large-scale SFT for agents has been rare - not for lack of data, but because of fragmentation across heterogeneous formats, tools, and interfaces.…
SWE-Agents are crushing benchmarks like SWE-Bench but are still fragile in the wild. I argue A/B testing is the missing piece for evaluating and improving SWE-Agents. Proof in Production: Evaluating Effectiveness of SWE Agents with A/B Tests open.substack.com/pub/rohitmalh/…
We are excited to launch the ⚔️PR Arena⚔️ leaderboard! Full results will be revealed after a certain milestone of community votes. Fix your GitHub issues for free and vote for the better fix! 👉Leaderboard & Setup Guide: prarena.web.app
A recent study by Becker et al. finds AI copilots like Cursor slowed expert OSS devs by 19%. But what happens when we compare copilots to more autonomous coding agents? Our study finds the opposite story: agents can boost productivity. 🧵
I'll be speaking about automating large-scale refactors with OpenHands at AI Engineer Paris! It's amazing how much software agents can get done if you orchestrate them thoughtfully.
Which LM is better at agentic coding? We have a bunch of useful academic benchmarks like SWE-Bench, but we don't have a good comparison of agentic coding LMs *in the wild*. To solve this, we released PR Arena: github.com/neulab/pr-arena
Introducing ⚔️PR Arena⚔️ - free AI coding agents to fix real GitHub issues. Claude Sonnet 4 vs Gemini 2.5 Pro… Who writes better pull requests? 👉 Install here: github.com/apps/openhands… Powered by @allhands_ai
Having appropriate tests makes a world of difference for agent-driven development. If your agent can write a test to localize a bug or exercise a new feature, the following implementation is much more solid. OpenHands+GPT-5 is now 🥇 on the SWT-Bench testing leaderboard!
We built OpenHands in the open (~60K ⭐️ on GitHub). Now we’re giving back to the OSS ecosystem. Announcing the OpenHands Cloud OSS Credit Program → $100–$500 credits for maintainers. 👉 Learn how to apply!
Nothing more frustrating than seeing "private scaffold" on public benchmark results. I love that model providers like Qwen and Mistral are now reporting their results specifically using OpenHands as the scaffold; it feels like we're becoming a standard here x.com/Alibaba_Qwen/s…
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…