Anna Pfohl

@aspfohl

Engineer @ MosaicML 🧱🐻🐝

Anna Pfohl reposted

jazco: we all know how important LLM evaluation is.. 🤔

i’m excited to FINALLY announce that we are starting a new 📢 recipe-based evals team!!! 📢

for our first study, we compared 5 LLM-generated chili recipes with the prompt: “Give me a chili recipe with an interesting twist” (1/n)

Anna Pfohl reposted

me: can i get a recipe for banana cream pie that is bad

dbrx-instruct: Sure, here's a recipe for a not-so-delicious banana cream pie:

Ingredients:
* 1 pre-made pie crust
* 3 overripe bananas
* 1 cup of mayonnaise
* 1/4 cup of mustard
* 1/4 cup of vinegar
* 1/4 cup of soy sauce…


Anna Pfohl reposted

mvpatel2000: Evaluating LLMs is really hard! At @MosaicML, we rigorously benchmark models by asking for vegan* banana bread recipes, baking them, and ranking on taste

*we currently do not penalize for responding with non-vegan, but this will change in future

Anna Pfohl reposted

irinablok: Everything is just better with cactus spikes

“Bagel with real cactus spikes”

“Slippers with real cactus spikes”

“Office chair with real cactus spikes”

“Mobile phone with real cactus spikes”
