Manouk Draisma

@ManoukDrai47041

Agent Simulations are the new unit testing.

Manouk Draisma reposted

In agent demos, everything’s smooth. In prod? You get messy inputs, long chains, weird edge cases — that’s when things snap. We treat agents like code → write scenario tests first, simulate full workflows, then iterate until green. Think TDD, but for LLMs. More on how we do it…
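The "scenario tests first, simulate full workflows" idea can be sketched in plain pytest style. Note this is an illustration only: `triage_agent` is a hypothetical toy agent, and the test shape below is not the langwatch/scenario library's actual API — it just shows the TDD loop of scripting a messy, multi-turn workflow and asserting on the outcome.

```python
# Hypothetical toy agent standing in for a real LLM-backed agent.
# A real setup would simulate the user side too; the point is that
# the test drives a full conversation, not a single clean prompt.

def triage_agent(messages: list[str]) -> str:
    """Toy deterministic agent: routes a support conversation to a team."""
    text = " ".join(messages).lower()
    if "refund" in text:
        return "billing"
    if "crash" in text or "error" in text:
        return "engineering"
    return "general"

def test_messy_multi_turn_input_still_routes_to_billing():
    # Scenario: realistic, noisy multi-turn input — the kind that
    # snaps in prod even when the demo looked smooth.
    conversation = [
        "hi!!",
        "ok so my card got charged twice??",
        "i want a REFUND asap pls",
    ]
    assert triage_agent(conversation) == "billing"

def test_edge_case_empty_conversation_falls_back_to_general():
    # Scenario: degenerate input the demo never hits.
    assert triage_agent([]) == "general"
```

Once a scenario like this is red, you iterate on the agent (prompt, tools, routing) until it goes green — the same loop as TDD for ordinary code.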


Manouk Draisma reposted

“Do I really need evals?” The real question: how do you know your AI agents will behave in prod? Prototypes don’t need them. Scaling products do. That’s why we built Agent Simulations: unit tests for AI. The only way to know if you can ship reliably. OSS: github.com/langwatch/scen…


Manouk Draisma reposted

AI SDK Observability Integration: @LangWatchAI

Observability and evals are crucial when developing AI applications. You can use LangWatch with the AI SDK to monitor and evaluate your LLM calls.
