AfterQuery

@AfterQuery

Investigating the boundaries of AI capabilities

AfterQuery reposted

The frontier begets the frontier. I highly recommend reading @jaminball's latest Clouded Judgement article which spells out the AfterQuery thesis (thread)

Excited for UI-Bench to be the leading benchmark for UI/web design! Congrats to the @figma make team for claiming the #2 ranking

while leaderboards are fun and motivating, this is just the start for figma make. can't wait to share all the improvements we are making over coming days / weeks / months!

AfterQuery reposted

Introducing UI-Bench by @afterquery. The first and only rigorous eval of vibe coding tools.
> 4,000+ blinded pairwise judgments
> @orchidsapp, @figma make, and @lovable_dev take the lead
> @v0 and @replit ranked dead last
> performance gaps = differences in LLM orchestration,…


AfterQuery reposted

finally got the @afterquery team to touch grass

AfterQuery reposted

🇺🇸 249 years ago, America declared that innovation belongs to the bold. Today, we're writing the next chapter—one dataset at a time. At @AfterQuery, we believe AI's future isn't just about algorithms. It's about the human ingenuity that teaches machines to think, reason, and…

AfterQuery reposted

Today, we’re pulling back the curtains. After collecting thousands of original, human-written coding problems, @AfterQuery created internal, contamination-free evals to test LLM code generation. No leaderboard tricks. No test-set leakage. Just raw task execution. Thread 🧵


AfterQuery reposted

Excited to share VADER, AfterQuery's new, human-evaluated benchmark for evaluating LLMs on real-world vulnerability handling! Paper: lnkd.in/g7EfAi2c All data, evaluation tools & results are open-sourced at: lnkd.in/gYPUKwub [1/4]

AfterQuery reposted

Hey devs! Trae’s back in SF! We’re proud to be a lead partner of AGENTHACKS hackathon
📍 Join us in San Francisco and meet our amazing hosts and candidate.
🗓️ May 23–24 @ AGI House SF
💰 $10K+ in prizes, free AI credits, bounties & more!
🌐 Join now: agenthacks.org
