OpenAdaptAI
@OpenAdaptAI
Open source AI that automates tasks in desktop apps by observing human demonstrations. Mac/Win compatible. https://github.com/OpenAdaptAI/OpenAdapt
Here's the latest from @OpenAdaptAI, faster and more robust. No command line required! #AI #agent #OpenAI #GPT4 Free download at OpenAdapt.AI 🚀
We are super excited to release OpenCUA — the first from 0 to 1 computer-use agent foundation model framework and open-source SOTA model OpenCUA-32B, matching top proprietary models on OSWorld-Verified, with full infrastructure and data. 🔗 [Paper] arxiv.org/abs/2508.09123 📌…
🙌 Acknowledgement: We thank @ysu_nlp, @CaimingXiong , and the anonymous reviewers for their insightful discussions and valuable feedback. We are grateful to Moonshot AI for providing training infrastructure and annotated data. We also sincerely appreciate Jin Zhang, Hao Yang,…
Anybody looking for a GUI+ICL-->MCP library should definitely check out OmniMCP which puts Microsoft's Omniparser to use in generating GUI tool use APIs. Early days but pretty neat omnimcp.openadapt.ai
OpenAdapt.AI - Revolutionizing Task Automation
Discover OpenAdapt.AI, the leading AI-driven automation tool designed to streamline your workflow efficiently.
OpenAdapt AI-First Process Automation with Large Multimodal Models (LMMs). github.com/OpenAdaptAI/Op…
I prompted @openai's ChatGPT o3-mini-high and @DeepSeek's R1 to implement code for deploying @alibaba_qwen's Qwen2.5-VL. Both agree that R1's implementation is "more comprehensive" and better "for production systems".
Qwen2.5-VL is the first open source multimodal model that appears to be able to accurately generate bounding box coordinates 🚀 Thank you @Alibaba_Qwen ! Excited to integrate this in @OpenAdaptAI x.com/Alibaba_Qwen/s…
2 Spatial Understanding This notebook showcases Qwen2.5-VL's advanced spatial localization abilities, including accurate object detection and specific target grounding within images. See how it integrates visual and linguistic understanding to interpret complex scenes…
Check out our latest GUI Agent -> UI-TARS 🥳 A vision-language model surpasses GPT-4o & Claude Computer-Use Paper, code, model ckpt, desktop APP are now open-sourced~ github.com/bytedance/UI-T… github.com/bytedance/UI-T…
github.com/MoonshotAI/Kim… > 🚀 Introducing Kimi k1.5 --- an o1-level multi-modal model 🤯
github.com/deepseek-ai/De… > DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. huggingface.co/deepseek-ai/De… We can run frontier models at home now.
Another day, another breakthrough: Apply DCT to convert actions into frequency components, quantize them prioritizing low frequencies, then use autoregressive prediction in frequency order (low to high) to generate actions. From @physical_int. May generalize to @OpenAdaptAI.
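A rough sketch of the idea in the post above, for readers who want to see the steps concretely. This is not @physical_int's implementation: the function names, the step sizes, and the rule that the quantization step grows with frequency are all assumptions chosen for illustration, and the autoregressive model that would predict these tokens is left out. It only shows the transform, a quantizer that keeps more precision at low frequencies, and the low-to-high token order.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a 1-D sequence (index 0 is the lowest frequency)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_ii(c):
    """Inverse of dct_ii (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def encode(actions, base_step=0.05, growth=0.5):
    """Turn an action trajectory into integer tokens, ordered low to high frequency.

    Assumed quantizer: the step size grows linearly with the frequency index,
    so low-frequency coefficients are stored with finer resolution.
    """
    coeffs = dct_ii(actions)
    return [round(c / (base_step * (1 + growth * k))) for k, c in enumerate(coeffs)]


def decode(tokens, base_step=0.05, growth=0.5):
    """Undo the quantization and the DCT to recover an approximate trajectory."""
    coeffs = [q * base_step * (1 + growth * k) for k, q in enumerate(tokens)]
    return idct_ii(coeffs)


# Hypothetical 1-D action trajectory: one smooth cycle over 16 timesteps.
actions = [math.sin(2 * math.pi * t / 16) for t in range(16)]
tokens = encode(actions)   # a sequence model would predict these left to right
recon = decode(tokens)
print(tokens)
print(max(abs(a - b) for a, b in zip(actions, recon)))
```

For a smooth trajectory most of the high-frequency tokens come out as zero, which is what makes the frequency-ordered sequence short and easy to predict.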
With @OpenAdaptAI you start and stop recording demonstrations of repetitive tasks via the tray icon. Show, don't tell. Perform, don't prompt.
Sure does! Mac and Win compatible.
(venv) % python client.py http://34.206.53.77:7861 ~/Desktop/screenshot.png
Loaded as API: http://34.206.53.77:7861/ ✔
Parsed content: ...
2024-10-29 11:13:07.414 | INFO | __main__:predict:84 - Output image saved to: output_image.png
Deploy @Microsoft OmniParser to @AWS #EC2 automatically via @Docker and #GitHub actions: github.com/microsoft/Omni…
within the next year, AI will be able to ingest everything that ever happened on your computer check out this cool video about tools enabling this: @OpenAdaptAI @tooluseai @MikeBirdTech @OpenInterpreter @FieroTy @abrichr and @screen_pipe :) youtube.com/watch?v=VgJ0Cg…
Automate Your Desktop to Save Time With These FREE Tools (ft Richard...
If I was starting a company today, I would look into productizing this model 👀
Microsoft released a groundbreaking model that can be used for web automation, with MIT license 🔥👏 OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT4V in parsing. 👏
Automate your desktop for free with @OpenAdaptAI @abrichr joined us this week to share his open source tool that saved one user $75,000
Screenshot from youtube.com/watch?v=01g_Ef…. One way to think about @OpenAdaptAI is as a way to learn "Flows" automatically by observing human demonstrations.
It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They…