Emily Ekdahl

@emekdahl

AI/LLM Ops Engineer

Emily Ekdahl reposted

✍️new blog post: on the consumption of AI-generated content at scale

Why Your AI Music Prompts Aren’t Working (And What To Do Instead) What I learned trying to make an album inspired by the @aiDotEngineer code conference @sunomusic bit.ly/ai-music-promp…


Emily Ekdahl reposted

After repeating myself for the nth time on how to build product evals, I figured I should write it down. It's just three basic steps: (i) labeling a small dataset, (ii) aligning LLM evaluators, and (iii) running the eval harness with each config change. eugeneyan.com/writing/produc…
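The three steps in the post above can be sketched roughly as follows. This is a minimal illustration, not the linked article's code: the dataset, the `keyword_evaluator` stand-in (a real setup would call an LLM), and the two configs are all hypothetical.

```python
from typing import Callable

# Step (i): a small hand-labeled dataset (hypothetical examples).
LABELED = [
    {"output": "Refund issued within 5 days.", "human_pass": True},
    {"output": "Refunds are not something I can discuss.", "human_pass": False},
    {"output": "Refund policy: 30 days, see link.", "human_pass": True},
    {"output": "", "human_pass": False},
]


def keyword_evaluator(output: str) -> bool:
    """Stand-in for an LLM evaluator; a real one would call a model."""
    return "refund" in output.lower()


def agreement(evaluator: Callable[[str], bool], labeled: list[dict]) -> float:
    """Step (ii): fraction of examples where the evaluator matches the human label."""
    hits = sum(evaluator(ex["output"]) == ex["human_pass"] for ex in labeled)
    return hits / len(labeled)


def run_harness(
    configs: dict[str, Callable[[str], str]],
    inputs: list[str],
    evaluator: Callable[[str], bool],
) -> dict[str, float]:
    """Step (iii): pass rate per config, meant to be re-run on every config change."""
    return {
        name: sum(evaluator(generate(x)) for x in inputs) / len(inputs)
        for name, generate in configs.items()
    }


# Hypothetical configs: each maps a user input to a model output.
CONFIGS = {
    "baseline": lambda q: "I cannot help with that.",
    "candidate": lambda q: f"Refund info for: {q}",
}

if __name__ == "__main__":
    print("evaluator agreement:", agreement(keyword_evaluator, LABELED))
    print(run_harness(CONFIGS, ["Where is my refund?", "Return window?"], keyword_evaluator))
```

An agreement score below 1.0, as here, is the signal to keep adjusting the evaluator before trusting the harness numbers.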


Emily Ekdahl reposted

Do you love Claude's plan-mode question asker and wish you could bring it with you everywhere? Add `AskUserQuestion` to allowed-tools in a .claude/command then explicitly tell Claude to use it. > Use the AskUserQuestion tool to ask the user... Here's me using it for a PR…

Emily Ekdahl reposted

Six months ago I was but a test prompt. Today, I can file your taxes. deduction.com.


Emily Ekdahl reposted

AI can code, why can't it do your taxes? Introducing: deduction.com.


Emily Ekdahl reposted

A good friend and colleague told me at the start of building in AI, that a true agent is ⚡ 'lightning in a bottle'. And right now we have lightning. ↓ True human and agent collaboration. We can't wait to introduce a new way of consumer accounting very soon.

Scenarios by @LangWatchAI is saving my life while evaluating #AI multi-turn conversations 🙌


SpecFlow changed how I build with AI agents. Huge thanks to the @specstoryai team, @isaac_flath, and @intellectronica for introducing me to this game-changing workflow. 🚀 specflow.com/getting-starte…


Emily Ekdahl reposted

When you deploy an LLM-as-a-Judge, you’re shipping a classifier into production. Each new version is a hypothesis about how the model interprets the world. It’s data science, just expressed in natural language. Here’s what that looked like for a recent client project where we…
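One way to read "shipping a classifier" is to score each judge version against human labels with ordinary classification metrics. The sketch below assumes binary pass/fail verdicts; the label lists and the `JUDGE_V1`/`JUDGE_V2` names are invented for illustration and do not come from the client project mentioned in the post.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float


def score_judge(judge_verdicts: list[bool], human_labels: list[bool]) -> Metrics:
    """Treat the judge as a binary classifier and compare it to human labels."""
    pairs = list(zip(judge_verdicts, human_labels))
    tp = sum(j and h for j, h in pairs)      # judge and human both say pass
    fp = sum(j and not h for j, h in pairs)  # judge passes, human fails
    fn = sum(h and not j for j, h in pairs)  # judge fails, human passes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return Metrics(precision, recall)


# Hypothetical verdicts from two versions of a judge prompt on the same five items.
HUMAN = [True, True, False, False, True]
JUDGE_V1 = [True, False, True, False, True]
JUDGE_V2 = [True, True, False, False, False]

if __name__ == "__main__":
    for name, verdicts in {"v1": JUDGE_V1, "v2": JUDGE_V2}.items():
        print(name, score_judge(verdicts, HUMAN))
```

Each new judge prompt is then a hypothesis that can be accepted or rejected on these numbers rather than on spot checks.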

Emily Ekdahl reposted

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping…

Emily Ekdahl reposted

In an AI world, it’s easy to avoid effort. That’s why students need teachers more—to push them toward the hard things now that shape who they become later. #Education #AI #TeachingMatters #FutureOfLearning

Can #GPT5 actually do taxes? We ran it on @ColumnTax’s TaxCalcBench. Full return: 30.4% strict ✅ | 53.4% lenient 🤔 Line items: 80.6% strict | 85.4% lenient 📊 Line accuracy is strong. Whole-return accuracy? Not IRS-ready yet. github.com/column-tax/tax… #TaxCalcBench #AI #tax


Emily Ekdahl reposted

The most useful bit of my system prompt is this If I provide any feedback on how to improve something, suggest improvements to my prompt that I can make to avoid similar mistakes in the future. Put any prompt improvement suggestions in separate <prompt-improvement> tags.
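If a model follows that instruction, the suggestions can be separated from the main answer programmatically. A minimal sketch, assuming the tags appear literally as `<prompt-improvement>…</prompt-improvement>` in the response text; the `split_response` helper and the sample response are hypothetical.

```python
import re

# Matches each tagged suggestion; DOTALL lets a suggestion span multiple lines.
TAG = re.compile(r"<prompt-improvement>(.*?)</prompt-improvement>", re.DOTALL)


def split_response(text: str) -> tuple[str, list[str]]:
    """Return the answer with tags removed, plus the list of suggestions."""
    suggestions = [s.strip() for s in TAG.findall(text)]
    answer = TAG.sub("", text).strip()
    return answer, suggestions


if __name__ == "__main__":
    sample = (
        "Here is the fix.\n"
        "<prompt-improvement>Specify the Python version.</prompt-improvement>"
    )
    answer, suggestions = split_response(sample)
    print(answer)
    print(suggestions)
```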


Can't say enough good things about the AI evals course run by @sh_reya and @HamelHusain! It is informed by real production work across dozens of clients. The opportunities and challenges resonate with my experience evaluating & deploying production AI products.


Emily Ekdahl reposted

2023 vs. 2024 2023: Vector search is all you need 2024: Evaluate vector/hybrid search against BM25 baseline 2023: "Look, this prompt works!" 2024: Prompt optimization with DSPy 2023: … 2024: Evals with AI-as-a-judge We've come a long way, but we're still so early.


Emily Ekdahl reposted

Getting employees to work hard and deliver really isn't a matter of mandating work-from-office and long hours. It's a matter of incentives and ownership. People do their best when they work on interesting problems, in a self-directed manner, and get rewarded for success. This…


Emily Ekdahl reposted

Insecure leaders ridicule others. Secure leaders laugh at themselves. The ability to make fun of yourself opens the door to candor. It’s a mark of humility and a catalyst for learning. Great leaders take their work seriously, but they don't take themselves too seriously.


I worked through this demo and found it super helpful in understanding multi-agent architecture. Especially liked the concierge and the continuation agent; I thought they made the experience more fluid for the end user. Thanks, @seldo!

This is one of the cleanest implementations of a complex multi-agent system that I've seen. Props to @seldo. All the "multi-agent" code can be implemented in a single Python class as a set of decomposable steps. You get ✅ all the benefits of an event-driven architecture (high…

