LogicStar AI

@logic_star_ai

Building agentic Application Maintenance

LogicStar AI reposted

Great dinner with @lovable_dev, @localstack, and @logic_star_ai AI (three “Lo*”s) in Zurich - discussing the future of agentic coding 🤖, cloud DevX 💻, and building delightful apps ✨. Can't wait for more collaborations and partnerships among the three “Lo*”s and beyond! 🚀

LogicStar AI reposted

🚨 AI agents wrote 7% of all GitHub PRs in June. But can we trust their code? We built Agents in the Wild – a live dashboard tracking autonomous AI agents across GitHub to answer that question: insights.logicstar.ai Here’s what we learned from analyzing 10M+ PRs 👇 1/n 🧵


We are excited to see the community use our SWT-Bench and work on the crucial topic of test generation!

🚨 New SWT-Bench Submission! 🤖 Amazon Q Developer Agent leads the SWT-Bench leaderboard 🥇 with an impressive 49% of successfully tested issues and a coverage improvement of 57% on SWT-Bench Verified.


LogicStar AI reposted

SOTA code agent OpenHands (top-1 for SWE-full) achieves only 22% accuracy in unit test generation on SWT-lite (half its SWE performance), only slightly outperforming SWE-agent. What is going on? We dug through the data to find a simple trick and achieve almost 30%! 👇🧵 1/9


LogicStar AI reposted

SRI Lab at #NeurIPS2024 - 1/8 SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents Niels Mündler (@nielstron), Mark Niklas Mueller, Jingxuan He (@jingxuan_he), Martin Vechev (@mvechev) ⏰ /📍 Wed 11th, 11AM - 2PM, West Ballroom A-D #5406 📝 We explore software…


LogicStar AI reposted

SRI Lab is proud to present 8 of our works on Privacy and AI Safety at #NeurIPS2024 this year (7 main conference, 1 workshop). Check out the overview below as well as individual posts for each. Looking forward to seeing you at the conference and come by to chat! Open for more ⬇️…


Exciting to see our work on benchmarking the test-generation capabilities of LLMs being picked up by the community!

Super cool work by @nielstron et al: SWT-Bench is SWE-bench for test generation! They give the model a repo and an issue and it has to write a test for the issue. They show that SWE-agent is able to write good tests for 19% of the issues in the benchmark! 🧵(1/3)

