OpenAdaptAI

@OpenAdaptAI

Open source AI that automates tasks in desktop apps by observing human demonstrations. Mac/Win compatible. https://github.com/OpenAdaptAI/OpenAdapt

openadapt.ai

Joined May 2023

68 Posts · 419 Followers · 0 Following

Pinned

OpenAdaptAI

@OpenAdaptAI

May 11, 2024

Here's the latest from @OpenAdaptAI, faster and more robust. No command line required! #AI #agent #OpenAI #GPT4 Free download at OpenAdapt.AI 🚀

OpenAdaptAI reposted

We are super excited to release OpenCUA — the first from 0 to 1 computer-use agent foundation model framework and open-source SOTA model OpenCUA-32B, matching top proprietary models on OSWorld-Verified, with full infrastructure and data. 🔗 [Paper] arxiv.org/abs/2508.09123 📌…

OpenAdaptAI reposted

Xinyuan Wang

@xywang626

Aug 15

🙌 Acknowledgement: We thank @ysu_nlp, @CaimingXiong , and the anonymous reviewers for their insightful discussions and valuable feedback. We are grateful to Moonshot AI for providing training infrastructure and annotated data. We also sincerely appreciate Jin Zhang, Hao Yang,…

OpenAdaptAI reposted

Rico Pagliuca

@pagilgukey

Mar 31

Anybody looking for a GUI+ICL-->MCP library should definitely check out OmniMCP which puts Microsoft's Omniparser to use in generating GUI tool use APIs. Early days but pretty neat omnimcp.openadapt.ai

omnimcp.openadapt.ai

OpenAdapt.AI - Revolutionizing Task Automation

Discover OpenAdapt.AI, the leading AI-driven automation tool designed to streamline your workflow efficiently.

Source: omnimcp.openadapt.ai

OpenAdaptAI reposted

Python Hub

@PythonHub

Aug 16, 2024

OpenAdapt AI-First Process Automation with Large Multimodal Models (LMMs). github.com/OpenAdaptAI/Op…

PythonHub's tweet card. Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models - Ope...

GitHub - OpenAdaptAI/OpenAdapt: Open Source Generative Process Automation (i.e. Generative RPA)....

Source: github.com

OpenAdaptAI reposted

Richard Abrich

@abrichr

Feb 1

I prompted @openai's ChatGPT o3-mini-high and @DeepSeek's R1 to implement code for deploying @alibaba_qwen's Qwen2.5-VL. Both agree that R1's implementation is "more comprehensive" and better "for production systems".

OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 30

Qwen2.5-VL is the first open source multimodal model that appears to be able to accurately generate bounding box coordinates 🚀 Thank you @Alibaba_Qwen ! Excited to integrate this in @OpenAdaptAI x.com/Alibaba_Qwen/s…

Qwen

@Alibaba_Qwen

Jan 30

2 Spatial Understanding: This notebook showcases Qwen2.5-VL's advanced spatial localization abilities, including accurate object detection and specific target grounding within images. See how it integrates visual and linguistic understanding to interpret complex scenes…

OpenAdaptAI reposted

Yujia Qin

@TsingYoga

Jan 21

Check out our latest GUI Agent -> UI-TARS 🥳 A vision-language model surpasses GPT-4o & Claude Computer-Use Paper, code, model ckpt, desktop APP are now open-sourced~ github.com/bytedance/UI-T… github.com/bytedance/UI-T…

OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 20

github.com/MoonshotAI/Kim… > 🚀 Introducing Kimi k1.5 --- an o1-level multi-modal model 🤯

abrichr's tweet card. Contribute to MoonshotAI/Kimi-k1.5 development by creating an account on GitHub.

GitHub - MoonshotAI/Kimi-k1.5

Source: github.com

OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 20

github.com/deepseek-ai/De… > DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. huggingface.co/deepseek-ai/De… We can run frontier models at home now.

deepseek-ai/DeepSeek-R1-Distill-Llama-70B · Hugging Face

Source: huggingface.co

OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 17

Another day, another breakthrough: Apply DCT to convert actions into frequency components, quantize them prioritizing low frequencies, then use autoregressive prediction in frequency order (low to high) to generate actions. From @physical_int. May generalize to @OpenAdaptAI.
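
The tokenization idea in the post above can be sketched roughly as follows. This is a minimal illustration, not the @physical_int implementation: the function names (`dct_matrix`, `encode`, `decode`), the `scale` parameter, and the toy trajectory are all assumptions made for this example. Low frequencies are "prioritized" here only in the sense that uniform rounding tends to zero out the small high-frequency coefficients of a smooth trajectory, and the autoregressive model itself is omitted; the sketch shows just the low-to-high token ordering such a model would consume.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis; row k holds the k-th frequency component."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] *= 1 / np.sqrt(2)
    return basis * np.sqrt(2 / n)


def encode(actions: np.ndarray, scale: float = 10.0) -> np.ndarray:
    """Turn a (timesteps, dims) action chunk into integer tokens.

    `scale` is a hypothetical knob: larger values keep more detail.
    """
    coeffs = dct_matrix(actions.shape[0]) @ actions  # rows are frequencies
    quantized = np.round(coeffs * scale).astype(int)
    # Row-major flattening emits frequency 0 for every dimension first,
    # then frequency 1, and so on (low to high).
    return quantized.reshape(-1)


def decode(tokens: np.ndarray, timesteps: int, dims: int,
           scale: float = 10.0) -> np.ndarray:
    """Invert `encode` up to quantization error."""
    coeffs = tokens.reshape(timesteps, dims) / scale
    # The basis is orthonormal, so its transpose is its inverse.
    return dct_matrix(timesteps).T @ coeffs


# Toy example: a smooth two-dimensional trajectory over 16 timesteps.
t = np.linspace(0.0, 1.0, 16)
actions = np.stack([np.sin(2 * np.pi * t), t ** 2], axis=1)
tokens = encode(actions)
reconstructed = decode(tokens, timesteps=16, dims=2)
print("nonzero tokens:", int(np.count_nonzero(tokens)), "of", tokens.size)
print("max reconstruction error:", float(np.abs(actions - reconstructed).max()))
```

A sequence model would then be trained to predict `tokens` left to right, so the coarse shape of the motion is generated before the fine detail.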


OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 16

With @OpenAdaptAI you start and stop recording demonstrations of repetitive tasks via the tray icon. Show, don't tell. Perform, don't prompt.

OpenAdaptAI reposted

Richard Abrich

@abrichr

Jan 9

Nice taxonomy of Agents from @huggingface smolagents

OpenAdaptAI reposted

Richard Abrich

@abrichr

Nov 7, 2024

Sure does! Mac and Win compatible.

OpenAdaptAI reposted

Richard Abrich

@abrichr

Oct 29, 2024

(venv) % python client.py http://34.206.53.77:7861 ~/Desktop/screenshot.png
Loaded as API: http://34.206.53.77:7861/ ✔
Parsed content:
...
2024-10-29 11:13:07.414 | INFO | __main__:predict:84 - Output image saved to: output_image.png

OpenAdaptAI reposted

Richard Abrich

@abrichr

Oct 29, 2024

Deploy @Microsoft OmniParser to @AWS #EC2 automatically via @Docker and #GitHub actions: github.com/microsoft/Omni…

abrichr's tweet card. Summary This PR implements functionality to automatically deploy OmniParser to EC2 on AWS via Docker, Github Actions, and boto3. It adds a Dockerfile which calls a download.py script for downloadin...

Add Dockerfile and client.py; deploy to EC2 on AWS via Github Actions by abrichr · Pull Request #52...

Source: github.com

OpenAdaptAI reposted

louis030195

@louis030195

Oct 22, 2024

within the next year, AI will be able to ingest everything that ever happened on your computer check out this cool video about tools enabling this: @OpenAdaptAI @tooluseai @MikeBirdTech @OpenInterpreter @FieroTy @abrichr and @screen_pipe :) youtube.com/watch?v=VgJ0Cg…

Automate Your Desktop to Save Time With These FREE Tools (ft Richard...

Source: youtube.com

OpenAdaptAI reposted

Julien Chaumond

@julien_c

Oct 25, 2024

If I was starting a company today, I would look into productizing this model 👀

merve

@mervenoyann

Oct 25, 2024

Microsoft released a groundbreaking model that can be used for web automation, with MIT license 🔥👏 OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT4V in parsing. 👏

OpenAdaptAI reposted

Tool Use

@ToolUsePodcast

Oct 22, 2024

Automate your desktop for free with @OpenAdaptAI @abrichr joined us this week to share his open source tool that saved one user $75,000

OpenAdaptAI reposted

Richard Abrich

@abrichr

Oct 14, 2024

Screenshot from youtube.com/watch?v=01g_Ef…. One way to think about @OpenAdaptAI is as a way to learn "Flows" automatically by observing human demonstrations.

OpenAdaptAI reposted

Andrej Karpathy

@karpathy

Sep 14, 2024

It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They…