Kevin Schawinski

@kevinschawinski

Astrophysicist Turned AI Entrepreneur | CEO at Modulos AG | Bridging Science and Responsible AI | Yelling at Claudes

Pinned

Me too, o1-pro, me too.

Opus 4.5 doesn't feel like a "4.5" version of 4.1. It feels like... something distilled from an even more powerful model? Anthropic is cooking.


A happy Thanksgiving to all my friends in the US! 🇺🇸


👀

Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.


👀👀👀👀👀

The EU Parliament's JURI committee will vote next week on whether to challenge the Commission’s decision to withdraw the AI Liability Directive before the EU Court of Justice - in a move that could replicate the path followed by the SEP proposal. mlex.com/mlex/articles/…



It's a DeepSeek fine-tune

Today, we are releasing the best open-weight LLM by a US company: Cogito v2.1 671B. On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models, while being ahead of any US open model (such as the best versions of…


OpenAI model names are out of control.

Kevin Schawinski reposted

Can’t wait to show you guys what I’m building with Interintellect AI 😍 Really nothing like this in the market & we get to test it 1-2x/wk in Interintellect (in the semi-wild) which is huge. Next demos: Sunday, Wednesday HIRING: have open opportunities for character engineers!


Yes ChatGPT, that’s exactly what I said. 🏴󠁧󠁢󠁷󠁬󠁳󠁿

I notice that Gemini 3 in Antigravity is worse than Sonnet 4.5 at following some instructions...


I guess the principle of legal certainty means nothing to the EU Commission. "We're moving the deadline, but maybe we'll move it back, who knows? YOLO" Absolutely bananas, and it makes the Commission look ridiculous.

BREAKING: The Commission’s digital omnibus on AI proposes delaying the AI Act’s high-risk rules under Annex III to Dec 2027 and Annex I to Aug 2028. But the Commission could anticipate the deadline, giving a six-month or 12-month notice, respectively. digital-strategy.ec.europa.eu/en/library/dig…



Now this looks interesting. Getting errors because there's not enough capacity right now....

Has been driving me crazy since the ChatGPT moment. Most people “hallucinate” most of their answers all day long.

It's interesting how people hold AI models to a higher standard than people. Eg try asking people 2-4 questions in one go -- they tend to answer ~60% of them (and that's for educated people!). A 60% score on that eval would be considered a crippling deficiency for a model!



We built our whole society around the idea that you need expertise to write things (in this case, planning objections) and this is no longer true for $20/month.

Future generations will rightfully condemn us for the immeasurable cruelty we visit upon animals in animal agriculture today.

The pig area at the Royal Ag Fair shows a mother pig in a cage so small she can barely move. Approximately nobody cares, and she is not hidden (she is a prominent part of their display).


I am having a serious case of Gell-Mann amnesia right now.


Did... did the UK government miss a zero there with that salary?

Government advertises Chief Technical Officer role to "make the UK the world's leading digital government" Offers £162.5k to internal civil servants, £100k to external candidates They believe people from the private sector should take a 40% pay cut to come into government



The style difference between Claude Code and OpenAI Codex is striking. Can you imagine Claude answering like this?

👀

We will introduce “made in Europe” criteria for public procurement in certain strategic sectors. Ensure that new foreign investments in our industry are truly in Europe’s interest. And step up support for strategic industries, like cars and batteries


