LangWatch

@LangWatchAI

Open Source platform for LLM observability, evaluation and agent https://github.com/langwatch/langwatch ➡️ DSPy Optimizations ➡️ Scenario Agent Simulations

Software Company

langwatch.ai

Joined April 2024

121 Posts, 607 Followers, 212 Following

LangWatch

@LangWatchAI

Sep 29

In agent demos, everything’s smooth. In prod? You get messy inputs, long chains, weird edge cases — that’s when things snap. We treat agents like code → write scenario tests first, simulate full workflows, then iterate until green. Think TDD, but for LLMs. More on how we do it…

Discover Scenario: a domain-driven framework for AI agent testing & LLM evaluations with real-world simulations.

From Scenario to Finished: How to Test AI Agents with Domain-Driven TDD

Source: langwatch.ai

LangWatch

@LangWatchAI

Sep 25

“Do I really need evals?” The real question: how do you know your AI agents will behave in prod? Prototypes don’t need them. Scaling products do. That’s why we built Agent Simulations: unit tests for AI. The only way to know if you can ship reliably. OSS: github.com/langwatch/scen…

GitHub - langwatch/scenario: Agentic testing for agentic codebases

Source: github.com

LangWatch

@LangWatchAI

Sep 3

We’re hosting a Meetup in our office in Amsterdam on Sept 18 all about agentic AI. 👀 👀 👀

Talks from:
• @_rchaves_ (CTO, LangWatch) → Beyond Unit Tests: why agent simulations are redefining AI agent testing.
• Deepak Grewal (Kong) → Agentic AI -> powering the next wave…

LangWatch reposted

Kong Inc.

@kong

Aug 26

In Amsterdam and want to spend an evening networking and learning all about agentic AI? Come to our @Meetup with @LangWatchAI on September 18th! RSVP to save your spot > bit.ly/3JyBwgY

LangWatch

@LangWatchAI

Aug 8

The gap between model release hype and production reality is always bigger than it looks. OpenAI’s new GPT-5 headlines focus on the measurable: fewer hallucinations, better reasoning, faster responses. All great gains. But the real story? How it works in your workflows, with…

OpenAI has released its newest flagship model, GPT-5. Start evaluating its performance in LangWatch, available now.

GPT-5 Release: From Benchmarks to production reality

Source: langwatch.ai

LangWatch

@LangWatchAI

Jul 31

Start tracing AI SDK 5 with LangWatch today: docs.langwatch.ai/integration/ty…

LangWatch Vercel AI SDK integration guide

Vercel AI SDK - LangWatch

Source: docs.langwatch.ai

AI SDK

@aisdk

Jul 31

AI SDK 5

Introducing type-safe chat, agentic loop controls, data parts, speech generation and transcription, Zod 4 support, global provider, and raw request access.

LangWatch reposted

Rogerio Chaves

@_rchaves_

Jul 29

We've won second place in the Power of Europe Hackathon in Amsterdam with this one ;) more on it soon!

Kilo Code

@kilocode

Jul 26

First up - built with Kilo Code is an MCP tool that visually tests real applications with computer use and scenarios powered by @LangWatchAI


LangWatch reposted

Kilo Code

@kilocode

Jul 26

First up - built with Kilo Code is an MCP tool that visually tests real applications with computer use and scenarios powered by @LangWatchAI

LangWatch reposted

Rogerio Chaves

@_rchaves_

Jul 25

Here is how to test Voice Agents, using Scenario simulations 👇

LangWatch reposted

Rogerio Chaves

@_rchaves_

Jul 17

notes on agent testing discussion with the team

LangWatch reposted

Rogerio Chaves

@_rchaves_

Jul 10

always so satisfying to watch a DSPy optimization happening

LangWatch reposted

Rogerio Chaves

@_rchaves_

Jul 10

First impressions of Grok 4

✅ it passes all the Scenario agent simulation tests on the 13 different agent frameworks in create-agent-app
❌ probably because of the reasoning, but facing quite high latency using it as an agent
🤔 on our vibe coding test, the website it designs…

LangWatch

@LangWatchAI

Jun 26

Now you can ship AI agents faster with developer-first testing. LangWatch Scenario allows you to test your agents like you test your code. That’s because: ❌ Manual testing doesn't scale. ❌ "Vibe checking" isn't systematic. ❌ Hope isn't a strategy. That's why we’re building…

Open-source testing platform for AI agents. Run simulations, catch regressions, and ship autonomous agents with confidence. Built for developers who treat AI like software. Agent simulations are the...

LangWatch Agent Simulations: Agentic testing for agentic codebases | Product Hunt

Source: producthunt.com
