Ege Onur Güleç

@EgeOnurGulec

Boğaziçi University Economics Ideas worth sharing

Joined March 2020

394 posts, 98 followers, 1K following


Ege Onur Güleç

@EgeOnurGulec

Nov 25

Now you can share your MRR directly from Stripe. Let's see the real MRRs of companies :D

Stripe

@stripe

Nov 25

You can now share MRR snapshots directly from your dashboard.

Ege Onur Güleç

@EgeOnurGulec

Nov 25

Ilya says the age of scaling is over and research is back. According to him, we need to get ready for new ideas to improve AI instead of just adding compute.

The @ilyasut episode 0:00:00 – Explaining model jaggedness 0:09:39 - Emotions and value functions 0:18:49 – What are we scaling? 0:25:13 – Why humans generalize better than models 0:35:45 – Straight-shotting superintelligence 0:46:47 – SSI’s model will learn from deployment…

Ege Onur Güleç

@EgeOnurGulec

Nov 24

Seems like a pretty good model for coding: still expensive, but much cheaper than its predecessor, Opus 4.1. Almost all of Claude's focus is currently on coding, so let's see how it performs on real-life tasks.

Claude

@claudeai

Nov 24

Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.


Ege Onur Güleç

@EgeOnurGulec

Nov 23

After the success of training models specifically for a given task, like GPT Codex for coding and Sonnet with its strong focus on coding, there should be more domain specific models like GPT Finance or GPT Math where they excel at a specific task.

Ege Onur Güleç

@EgeOnurGulec

Nov 20

Instead of clearly defined problem datasets, these kinds of real-world engineering benchmarks should be the future of evaluating LLM quality. The real world is messy, and LLMs should be able to operate in that mess.

pash

@pashmerepat

Nov 20

We are announcing cline-bench, a real world open source benchmark for agentic coding. cline-bench is built from real world engineering tasks from participating developers where frontier models failed and humans had to step in. Each accepted task becomes a fully reproducible…


Ege Onur Güleç

@EgeOnurGulec

Nov 19

This might be the biggest bottleneck for training models. Having an efficient data pipeline for training is the biggest moat a company can have.

Nando de Freitas

@NandoDF

Nov 19

AI Models are valuable, but datasets and evals to train AI models are more valuable. Datasets are valuable, but automated data pipelines that generate the datasets are more valuable. *** Model < data < pipeline *** At least until the models start building pipelines. Still far…

Ege Onur Güleç

@EgeOnurGulec

Nov 18

Big hire for Thinking Machines: they brought on the creator of PyTorch. I hope they also contribute more to open source.

Soumith Chintala

@soumithchintala

Nov 18

thinking machines....the people are incredible

Ege Onur Güleç

@EgeOnurGulec

Nov 16

Karpathy explained the new paradigms of the current software era. Verification: if you can verify an output, like passing a unit test or making sure the mathematical output of a model is correct, that is all you need. The real challenge is to find the verifiable domains and specify…

Andrej Karpathy

@karpathy

Nov 16

Sharing an interesting recent conversation on AI's impact on the economy. AI has been compared to various historical precedents: electricity, industrial revolution, etc., I think the strongest analogy is that of AI as a new computing paradigm (Software 2.0) because both are…