Anes Valentic

@Matrix_Memories

| AI Developer | ex: MunichRe, TU Munich | Cigarettes, Coffee, and Math at all hours |

Dubai

Joined December 2024

4K posts 2K followers 556 following

Pinned

Anes Valentic

@Matrix_Memories

Aug 12

OK @elonmusk, here you go. I quickly collected a couple of the benchmarks I know. Criteria: run by a university or non-profit, have an active leaderboard, don't list Grok 4, and might benefit from compute tokens for their evaluation. Note: the last part is from groups doing active research…

Elon Musk

@elonmusk

Aug 12

Ok, who?

Anes Valentic

@Matrix_Memories

Oct 11

New paradox unlocked: having to wake up before your bedtime.

Anes Valentic

@Matrix_Memories

Oct 10

Seriously, when will Anthropic stop with these clickbait studies?

Eric Topol

@EricTopol

Oct 8

Nothing to see here ;-) @Nature feature, by @SilverJacket nature.com/articles/d4158…

Anes Valentic

@Matrix_Memories

Oct 9

Who agrees that, among developers, Qwen Image Edit is by far the number 1 image editing model? Why? The number of possibilities that the open-source nature of the model provides is worth 100x more than a couple of benchmark points. For any given use case you can just fine-tune it…

Qwen

@Alibaba_Qwen

Oct 9

Thank you @ArtificialAnlys ! 🙏 Qwen Image Edit 2509 ranks #3 overall and leads all open-weight models — enabling multi-image editing with precise control. Try it now: chat.qwen.ai/?inputFeature=…

Anes Valentic

@Matrix_Memories

Oct 8

For all intents and purposes, and despite what half of the timeline claims, you are NOT arguing with either of these two.

Anes Valentic

@Matrix_Memories

Oct 7

General approach to AI Agents: "An Agent that is able to solve every imaginable problem and is running only on one SOTA LLM." Ground truth: "Deploying swarms of specialized Agents running on specialized SLMs is more reliable, achieves better results and is easier to maintain."

Anes Valentic

@Matrix_Memories

Oct 7

OpenAI doing a “bait and switch” again. People have already started complaining about Sora quality dropping. I didn’t notice it myself, as I don’t watch the videos. That’s why I’m sticking to xAI. Grok only ever gets better.

Teknium (e/λ)

@Teknium1

Oct 7

I feel like sora quality has gotten worse every day…

Anes Valentic

@Matrix_Memories

Oct 7

Now we know why OpenAI hasn’t been able to fix the model router for 2 months: they’re trying to vibe code it 🤣

Peter Gostev

@petergostev

Oct 6

92% of OpenAI engineers are using Codex - up from 50%. Nearly all PRs are reviewed now with Codex

Anes Valentic

@Matrix_Memories

Oct 6

The author states that he was charged for around 9k output tokens and that GPT-5 pro took ~6 min to generate the output. This is through the new API. This means the model is generating ~25 tokens/s. Can anyone confirm this? If this is true, the API is basically useless.

Simon Willison

@simonw

Oct 6

I got the new GPT-5 pro API model to "Generate me an SVG of a pelican riding a bicycle". This pelican took 6m8s to generate and cost me $1.10! simonwillison.net/2025/Oct/6/gpt…
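
The ~25 tokens/s estimate in the reply can be checked with quick arithmetic. This is a minimal sketch, assuming the roughly 9k output tokens mentioned in the reply and the 6m8s wall-clock time from the quoted post. The function name is hypothetical, and the result is an average over the whole request, not a measured streaming rate.

```python
def tokens_per_second(output_tokens: int, minutes: int, seconds: int) -> float:
    """Average generation rate over the whole request, in tokens per second."""
    elapsed = minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return output_tokens / elapsed


# Assumed inputs: ~9k output tokens (approximate figure from the reply)
# and 6m8s of generation time (from the quoted post).
rate = tokens_per_second(9_000, 6, 8)
print(f"{rate:.1f} tokens/s")  # prints 24.5 tokens/s
```

With the rounded 6 minutes used in the reply, the same calculation gives exactly 25 tokens/s.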

Anes Valentic

@Matrix_Memories

Oct 6

Why does Ani keep yapping and obsessing about this, as of now nonexistent, open source model from @xai called Grok Nano? She claims 9B params and 128k context window. Is she hallucinating again or does she know something more than us?

Anes Valentic

@Matrix_Memories

Oct 5

Definitely worth checking out. A fully open source diffusion coding model from @SFResearch

Weiran Yao

@iscreamnearby

Oct 4

Today my team at @SFResearch drops CoDA-1.7B: a text diffusion coding model that outputs tokens bidirectionally in parallel. ⚡️ Faster inference, 1.7B rivaling 7B. 📊 54.3% HumanEval | 47.6% HumanEval+ | 55.4% EvalPlus 🤗HF: huggingface.co/Salesforce/CoD… Any questions, lmk!