OpenAdaptAI

@OpenAdaptAI

Open source AI that automates tasks in desktop apps by observing human demonstrations. Mac/Win compatible. https://github.com/OpenAdaptAI/OpenAdapt

Pinned

Here's the latest from @OpenAdaptAI, faster and more robust. No command line required! #AI #agent #OpenAI #GPT4 Free download at OpenAdapt.AI 🚀


OpenAdaptAI reposted

We are super excited to release OpenCUA — the first from 0 to 1 computer-use agent foundation model framework and open-source SOTA model OpenCUA-32B, matching top proprietary models on OSWorld-Verified, with full infrastructure and data. 🔗 [Paper] arxiv.org/abs/2508.09123 📌…

OpenAdaptAI reposted

🙌 Acknowledgement: We thank @ysu_nlp, @CaimingXiong, and the anonymous reviewers for their insightful discussions and valuable feedback. We are grateful to Moonshot AI for providing training infrastructure and annotated data. We also sincerely appreciate Jin Zhang, Hao Yang,…


OpenAdaptAI reposted

Anybody looking for a GUI+ICL-->MCP library should definitely check out OmniMCP which puts Microsoft's Omniparser to use in generating GUI tool use APIs. Early days but pretty neat omnimcp.openadapt.ai

OpenAdapt.AI - Revolutionizing Task Automation

Discover OpenAdapt.AI, the leading AI-driven automation tool designed to streamline your workflow efficiently.


OpenAdaptAI reposted

I prompted @openai's ChatGPT o3-mini-high and @DeepSeek's R1 to implement code for deploying @alibaba_qwen's Qwen2.5-VL. Both agree that R1's implementation is "more comprehensive" and better "for production systems".

OpenAdaptAI reposted

Qwen2.5-VL is the first open source multimodal model that appears to be able to accurately generate bounding box coordinates 🚀 Thank you @Alibaba_Qwen ! Excited to integrate this in @OpenAdaptAI x.com/Alibaba_Qwen/s…

2 Spatial Understanding: This notebook showcases Qwen2.5-VL's advanced spatial localization abilities, including accurate object detection and specific target grounding within images. See how it integrates visual and linguistic understanding to interpret complex scenes…


OpenAdaptAI reposted

Check out our latest GUI Agent -> UI-TARS 🥳 A vision-language model surpasses GPT-4o & Claude Computer-Use. Paper, code, model ckpt, desktop APP are now open-sourced~ github.com/bytedance/UI-T… github.com/bytedance/UI-T…

OpenAdaptAI reposted

github.com/MoonshotAI/Kim… > 🚀 Introducing Kimi k1.5 --- an o1-level multi-modal model 🤯


OpenAdaptAI reposted

github.com/deepseek-ai/De… > DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. huggingface.co/deepseek-ai/De… We can run frontier models at home now.


OpenAdaptAI reposted

Another day, another breakthrough: Apply DCT to convert actions into frequency components, quantize them prioritizing low frequencies, then use autoregressive prediction in frequency order (low to high) to generate actions. From @physical_int. May generalize to @OpenAdaptAI.
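
The three steps named in that post (DCT, quantization that favors low frequencies, low-to-high token order) can be sketched roughly as follows. This is only an illustrative toy, not the @physical_int implementation: the function names, the `base_step` value, and the linear step-size schedule are all assumptions, and the autoregressive model itself is omitted. The token list is simply kept in low-to-high frequency order, which is the order such a model would emit it in.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def step_size(k, base_step):
    # Assumed schedule: coarser quantization as frequency index k grows,
    # so low frequencies keep the most precision.
    return base_step * (1 + k)


def encode(actions, base_step=0.01):
    """Map an action chunk to integer tokens, ordered low to high frequency."""
    coeffs = dct(actions)
    return [round(c / step_size(k, base_step)) for k, c in enumerate(coeffs)]


def decode(tokens, length, base_step=0.01):
    """Rebuild actions from a (possibly truncated) token prefix.

    Missing high-frequency tokens are treated as zero, so a short prefix
    yields a coarse version of the trajectory.
    """
    coeffs = [0.0] * length
    for k, t in enumerate(tokens[:length]):
        coeffs[k] = t * step_size(k, base_step)
    return idct(coeffs)


if __name__ == "__main__":
    # A smooth 16-step, one-dimensional action trajectory (made-up data).
    actions = [math.sin(2 * math.pi * t / 16) for t in range(16)]
    tokens = encode(actions)
    full = decode(tokens, len(actions))
    coarse = decode(tokens[:4], len(actions))
    print("tokens:", tokens)
    print("max error, all tokens:", max(abs(a - b) for a, b in zip(actions, full)))
    print("max error, first 4 tokens:", max(abs(a - b) for a, b in zip(actions, coarse)))
```

Because the tokens run from low to high frequency, decoding a prefix still gives a usable coarse trajectory, which is one reason that ordering suits step-by-step generation.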

OpenAdaptAI reposted

With @OpenAdaptAI you start and stop recording demonstrations of repetitive tasks via the tray icon. Show, don't tell. Perform, don't prompt.


OpenAdaptAI reposted

Nice taxonomy of Agents from @huggingface smolagents

OpenAdaptAI reposted

Sure does! Mac and Win compatible.

OpenAdaptAI reposted

(venv) % python client.py http://34.206.53.77:7861 ~/Desktop/screenshot.png
Loaded as API: http://34.206.53.77:7861/ ✔
Parsed content:
...
2024-10-29 11:13:07.414 | INFO     | __main__:predict:84 - Output image saved to: output_image.png

OpenAdaptAI reposted

Within the next year, AI will be able to ingest everything that ever happened on your computer. Check out this cool video about tools enabling this: @OpenAdaptAI @tooluseai @MikeBirdTech @OpenInterpreter @FieroTy @abrichr and @screen_pipe :) youtube.com/watch?v=VgJ0Cg…

Automate Your Desktop to Save Time With These FREE Tools (ft Richard...


OpenAdaptAI reposted

If I was starting a company today, I would look into productizing this model 👀

Microsoft released a groundbreaking model that can be used for web automation, with MIT license 🔥👏 OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT4V in parsing. 👏



OpenAdaptAI reposted

Automate your desktop for free with @OpenAdaptAI. @abrichr joined us this week to share his open source tool that saved one user $75,000.


OpenAdaptAI reposted

Screenshot from youtube.com/watch?v=01g_Ef…. One way to think about @OpenAdaptAI is as a way to learn "Flows" automatically by observing human demonstrations.

OpenAdaptAI reposted

It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They…

