LogicStar AI

@logic_star_ai

Building agentic Application Maintenance

Zurich, Switzerland

Joined July 2024

16 Posts 20 Followers 3 Following

LogicStar AI reposted

Waldemar Hummer

@w_hummer

Nov 4

Great dinner with @lovable_dev, @localstack, and @logic_star_ai AI (three “Lo*”s) in Zurich - discussing the future of agentic coding 🤖, cloud DevX 💻, and building delightful apps ✨. Can't wait for more collaborations and partnerships among the three “Lo*”s and beyond! 🚀

LogicStar AI reposted

Mark Müller

@mnmueller

Jul 8

🚨 AI agents wrote 7% of all GitHub PRs in June. But can we trust their code? We built Agents in the Wild – a live dashboard tracking autonomous AI agents across GitHub to answer that question: insights.logicstar.ai Here’s what we learned from analyzing 10M+ PRs 👇 1/n 🧵

Agents in the Wild

Source: insights.logicstar.ai

LogicStar AI

@logic_star_ai

Apr 11

We are excited to see the community use our SWT-Bench and work on the crucial topic of test generation!

Niels Mündler

@nielstron

Apr 11

🚨 New SWT-Bench Submission! 🤖 Amazon Q Developer Agent leads the SWT-Bench leaderboard 🥇 with an impressive 49% of successfully tested issues and a coverage improvement of 57% on SWT-Bench Verified.

LogicStar AI reposted

Niels Mündler

@nielstron

Feb 18

SOTA code agent OpenHands (top-1 for SWE-full) achieves only 22% accuracy in unit test generation on SWT-lite (half its SWE performance), only slightly outperforming SWE-agent. What is going on? We dug through the data to find a simple trick and achieve almost 30%! 👇🧵 1/9

LogicStar AI reposted

SRI Lab

@the_sri_lab

Dec 11

SRI Lab at #NeurIPS2024 - 1/8 SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents Niels Mündler (@nielstron), Mark Niklas Mueller, Jingxuan He (@jingxuan_he), Martin Vechev (@mvechev) ⏰ /📍 Wed 11th, 11AM - 2PM, West Ballroom A-D #5406 📝 We explore software…

LogicStar AI reposted

SRI Lab

@the_sri_lab

Dec 11

SRI Lab is proud to present 8 of our works on Privacy and AI Safety at #NeurIPS2024 this year (7 main conference, 1 workshop). Check out the overview below as well as individual posts for each. Looking forward to seeing you at the conference and come by to chat! Open for more ⬇️…

LogicStar AI

@logic_star_ai

Nov 15

Excited to see our work on benchmarking the test-generation capabilities of LLMs being picked up by the community!

Ofir Press

@OfirPress

Nov 14

Super cool work by @nielstron et al: SWT-Bench is SWE-bench for test generation! They give the model a repo and an issue and it has to write a test for the issue. They show that SWE-agent is able to write good tests for 19% of the issues in the benchmark! 🧵(1/3)