Emily Ekdahl

@emekdahl

AI/LLM Ops Engineer

Chicago, IL

linkedin.com/in/emilyekdahl/

Joined October 2012

551 posts · 221 followers · 355 following

Emily Ekdahl reposted

Shreya Shankar

@sh_reya

December 1

✍️new blog post: on the consumption of AI-generated content at scale

Emily Ekdahl

@emekdahl

December 2

Why Your AI Music Prompts Aren’t Working (And What To Do Instead) What I learned trying to make an album inspired by the @aiDotEngineer code conference @sunomusic bit.ly/ai-music-promp…

Why Your AI Music Prompts Aren’t Working (And What To Do Instead)

Source: emekdahl.medium.com

Emily Ekdahl reposted

After repeating myself for the nth time on how to build product evals, I figured I should write it down. It's just three basic steps: (i) labeling a small dataset, (ii) aligning LLM evaluators, and (iii) running the eval harness with each config change. eugeneyan.com/writing/produc…

eugeneyan's tweet card. Label some data, align LLM-evaluators, and run the eval harness with each change.

Product Evals in Three Simple Steps

Source: eugeneyan.com
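
Step (ii), aligning an LLM evaluator, comes down to checking how often the evaluator agrees with the human labels from step (i). The following is a minimal sketch of that check, assuming binary pass/fail labels; the function name, the sample labels, and the choice of Cohen's kappa as the agreement metric are illustrative assumptions, not details taken from the linked post.

```python
from collections import Counter


def agreement_report(human, judge):
    """Compare binary human labels with LLM-evaluator labels (True = pass)."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = Counter(zip(human, judge))
    tp = pairs[(True, True)]    # both say pass
    tn = pairs[(False, False)]  # both say fail
    fp = pairs[(False, True)]   # evaluator passes what the human failed
    fn = pairs[(True, False)]   # evaluator fails what the human passed
    n = len(human)

    observed = (tp + tn) / n
    # Agreement expected by chance, given each rater's pass rate.
    p_human = (tp + fn) / n
    p_judge = (tp + fp) / n
    expected = p_human * p_judge + (1 - p_human) * (1 - p_judge)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0

    return {"agreement": observed, "kappa": kappa,
            "false_pass": fp, "false_fail": fn}


# Hypothetical labels for eight examples.
human_labels = [True, True, False, False, True, False, True, False]
judge_labels = [True, True, False, True, True, False, False, False]
report = agreement_report(human_labels, judge_labels)
print(report)
```

Rerunning a report like this after each prompt or model change is one way to approximate step (iii): the evaluator prompt is revised until agreement is acceptable, then the same labeled set guards against regressions.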

Emily Ekdahl reposted

pedram.md

@pdrmnvd

November 25

Do you love Claude's plan-mode question asker and wish you could bring it with you everywhere? Add `AskUserQuestion` to allowed-tools in a .claude/command then explicitly tell Claude to use it. > Use the AskUserQuestion tool to ask the user... Here's me using it for a PR…

Emily Ekdahl reposted

Taylor, CPAI

@taylorcpai

November 19

Six months ago I was but a test prompt. Today, I can file your taxes. deduction.com.

Emily Ekdahl reposted

Sai Dhanak

@SaiDhanak

November 19

AI can code, why can't it do your taxes? Introducing: deduction.com.

Emily Ekdahl reposted

Sai Dhanak

@SaiDhanak

November 12

A good friend and colleague told me at the start of building in AI, that a true agent is ⚡ 'lightning in a bottle'. And right now we have lightning. ↓ True human and agent collaboration. We can't wait to introduce a new way of consumer accounting very soon.

Emily Ekdahl

@emekdahl

November 9

Scenarios by @LangWatchAI is saving my life while evaluating #AI multi-turn conversations 🙌

Emily Ekdahl

@emekdahl

November 2

SpecFlow changed how I build with AI agents. Huge thanks to the @specstoryai team, @isaac_flath, and @intellectronica for introducing me to this game-changing workflow. 🚀 specflow.com/getting-starte…

emekdahl's tweet card. Use the open Specflow method to turn intent into software through structured planning and iterative execution with software agents.

Specflow - Structure for Building with AI Agents

Source: specflow.com

Emily Ekdahl reposted

Ryan

@_PaperMoose_

October 20

When you deploy an LLM-as-a-Judge, you’re shipping a classifier into production. Each new version is a hypothesis about how the model interprets the world. It’s data science, just expressed in natural language. Here’s what that looked like for a recent client project where we…

Emily Ekdahl reposted

vLLM

@vllm_project

October 20

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping…

Emily Ekdahl reposted

Angela Duckworth

@angeladuckw

July 5

In an AI world, it’s easy to avoid effort. That’s why students need teachers more—to push them toward the hard things now that shape who they become later. #Education #AI #TeachingMatters #FutureOfLearning

Emily Ekdahl

@emekdahl

August 19

Can #GPT5 actually do taxes? We ran it on @ColumnTax’s TaxCalcBench. Full return: 30.4% strict ✅ | 53.4% lenient 🤔 Line items: 80.6% strict | 85.4% lenient 📊 Line accuracy is strong. Whole-return accuracy? Not IRS-ready yet. github.com/column-tax/tax… #TaxCalcBench #AI #tax

emekdahl's tweet card. GPT-5 support with results! four runs, pass at k of 1 added debugging support for litellm added gpt-5 to model config **SUMMARY TABLE** Model Name Thinking Test...

[FEATURE] GPT-5 by emekdahl · Pull Request #7 · column-tax/tax-calc-bench

Source: github.com
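
The gap between line-item accuracy and whole-return accuracy in the tweet above follows from how a return is scored: it counts as correct only when every line matches, so a single wrong line fails the whole return. A minimal sketch of that scoring, assuming "lenient" means each line may differ by a small dollar tolerance; the $5 tolerance, the line names, and the sample figures are hypothetical and are not taken from TaxCalcBench.

```python
def score_return(expected, predicted, tolerance=0):
    """Return (lines_correct, lines_total, whole_return_correct)."""
    correct = sum(
        1
        for line, value in expected.items()
        if line in predicted and abs(predicted[line] - value) <= tolerance
    )
    return correct, len(expected), correct == len(expected)


def summarize(cases, tolerance):
    """Aggregate line-level and return-level accuracy over all test cases."""
    lines_ok = lines_total = returns_ok = 0
    for expected, predicted in cases:
        ok, total, whole = score_return(expected, predicted, tolerance)
        lines_ok += ok
        lines_total += total
        returns_ok += whole  # bool counts as 0 or 1
    return {
        "line_accuracy": lines_ok / lines_total,
        "return_accuracy": returns_ok / len(cases),
    }


# Two hypothetical returns: the first is exact, the second is off by $3 on one line.
truth = {"wages": 50000, "agi": 50000, "taxable_income": 36150,
         "total_tax": 4100, "refund": 900}
exact = dict(truth)
off_by_three = dict(truth, total_tax=4103)
cases = [(truth, exact), (truth, off_by_three)]

strict = summarize(cases, tolerance=0)   # exact match required
lenient = summarize(cases, tolerance=5)  # assumed $5 tolerance per line
print(strict, lenient)
```

In this toy data, nine of ten lines are right under strict scoring, yet only one of two returns passes.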

Emily Ekdahl reposted

Hamel Husain

@HamelHusain

June 15

The most useful bit of my system prompt is this: If I provide any feedback on how to improve something, suggest improvements to my prompt that I can make to avoid similar mistakes in the future. Put any prompt improvement suggestions in separate <prompt-improvement> tags.
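
Keeping the suggestions in their own tags makes them easy to separate from the rest of a reply. A minimal sketch of that separation, assuming the model's reply is available as a plain string; the function name and the sample reply are hypothetical.

```python
import re

# Non-greedy match so several tagged blocks in one reply are captured separately.
TAG_PATTERN = re.compile(
    r"<prompt-improvement>(.*?)</prompt-improvement>", re.DOTALL
)


def split_response(text):
    """Return (answer_without_tags, list_of_prompt_suggestions)."""
    suggestions = [match.strip() for match in TAG_PATTERN.findall(text)]
    answer = TAG_PATTERN.sub("", text).strip()
    return answer, suggestions


# Hypothetical model reply.
reply = (
    "Here is the revised summary.\n"
    "<prompt-improvement>\n"
    "State the target length up front.\n"
    "</prompt-improvement>"
)
answer, suggestions = split_response(reply)
print(answer)
print(suggestions)
```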

Emily Ekdahl

@emekdahl

June 10

Can't say enough good things about the AI evals course run by @sh_reya and @HamelHusain! It is informed by real production work across dozens of clients. The opportunities and challenges resonate with my experience evaluating & deploying production AI products.

Emily Ekdahl reposted

Leonie

@helloiamleonie

August 16, 2024

2023 vs. 2024
2023: Vector search is all you need
2024: Evaluate vector/hybrid search against BM25 baseline
2023: „Look, this prompt works!“
2024: Prompt optimization with DSPy
2023: …
2024: Evals with AI-as-a-judge
We‘ve come a long way, but we’re still so early.

Emily Ekdahl reposted

François Chollet

@fchollet

August 15, 2024

Getting employees to work hard and deliver really isn't a matter of mandating work-from-office and long hours. It's a matter of incentives and ownership. People do their best when they work on interesting problems, in a self-directed manner, and get rewarded for success. This…

Emily Ekdahl reposted

Adam Grant

@AdamMGrant

August 15, 2024

Insecure leaders ridicule others. Secure leaders laugh at themselves. The ability to make fun of yourself opens the door to candor. It’s a mark of humility and a catalyst for learning. Great leaders take their work seriously, but they don't take themselves too seriously.

Emily Ekdahl

@emekdahl

August 15, 2024

I worked this demo and found it super helpful in understanding multi-agent architecture. Especially liked the concierge and the continuation agent; I thought they made the experience more fluid for the end user. Thanks, @seldo !

Jerry Liu

@jerryjliu0

August 13, 2024

This is one of the cleanest implementations of a complex multi-agent system that I've seen. Props to @seldo. All the "multi-agent" code can be implemented in a single Python class as a set of decomposable steps. You get ✅ all the benefits of an event-driven architecture (high…
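
The idea of a multi-agent system written as one class of event-driven steps can be illustrated in plain Python. This is a rough sketch under stated assumptions, not the demo's code and not any framework's API: the `Event` type, the handler table, the keyword routing, and the agent names are all hypothetical stand-ins, and a real system would call an LLM where this one checks for a keyword.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str      # decides which step runs next
    payload: str   # text passed between steps


class ConciergeWorkflow:
    """Each 'agent' is a method; the event it returns picks the next method."""

    def __init__(self):
        self.log = []
        self.handlers = {
            "start": self.concierge,
            "balance": self.balance_agent,
            "transfer": self.transfer_agent,
        }

    def concierge(self, event):
        # Route by keyword here; a real concierge would ask an LLM to decide.
        topic = "transfer" if "transfer" in event.payload.lower() else "balance"
        self.log.append(f"concierge -> {topic}")
        return Event(topic, event.payload)

    def balance_agent(self, event):
        self.log.append("balance agent handled request")
        return Event("stop", "Your balance is available in the app.")

    def transfer_agent(self, event):
        self.log.append("transfer agent handled request")
        return Event("stop", "Transfer scheduled.")

    def run(self, request):
        # Dispatch events until a step emits the terminal "stop" event.
        event = Event("start", request)
        while event.name != "stop":
            event = self.handlers[event.name](event)
        return event.payload


workflow = ConciergeWorkflow()
result = workflow.run("Please transfer $20 to savings")
print(result, workflow.log)
```

Because each step only consumes and emits events, steps can be added or tested one at a time without touching the dispatch loop.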