
∇f

@gradient_step

More and more, it seems artificial intelligence differs from human intelligence. Hinton noted how you can perfectly clone and distribute learning, not to mention the ability to merge models and blocks. A new and very interesting paradigm is very long context lengths.
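One concrete reading of "merge models": average the parameters of two checkpoints that share an architecture. A minimal PyTorch sketch, assuming both were trained from the same base; merge_state_dicts and the toy models are illustrative names, not any particular method's API.

```python
# Minimal sketch: merge two checkpoints by parameter averaging
# (model-soup style). Assumes identical architectures; names are placeholders.
import torch
import torch.nn as nn

def merge_state_dicts(state_a, state_b, alpha=0.5):
    """Interpolate two state dicts: alpha * A + (1 - alpha) * B."""
    return {k: alpha * state_a[k] + (1 - alpha) * state_b[k] for k in state_a}

# Toy example with a shared architecture.
model_a = nn.Linear(16, 4)
model_b = nn.Linear(16, 4)
merged = nn.Linear(16, 4)
merged.load_state_dict(merge_state_dicts(model_a.state_dict(), model_b.state_dict()))
```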


∇f reposted

Sora from OpenAI and Gemini 1.5 from Google are both super impressive. Both rely on long-context transformers. I think the lesson is that semi-parametric models are great: condition on all past data, whether real or self-generated, just like a good Bayesian :)
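A minimal sketch of what "condition on all past data" could look like in a semi-parametric setup: keep a growing datastore of past examples (real or self-generated), retrieve the nearest ones, and condition the model on them. Datastore, embed, and build_prompt are hypothetical names, not any library's API; embed stands in for whatever encoder you use.

```python
# Sketch of semi-parametric conditioning: a datastore of past text plus
# nearest-neighbor retrieval to build the context for each new query.
import numpy as np

class Datastore:
    def __init__(self, embed):
        self.embed = embed          # callable: str -> np.ndarray (placeholder encoder)
        self.keys, self.texts = [], []

    def add(self, text):
        self.keys.append(self.embed(text))
        self.texts.append(text)

    def retrieve(self, query, k=3):
        q = self.embed(query)
        sims = np.array([key @ q for key in self.keys])
        top = np.argsort(-sims)[:k]
        return [self.texts[i] for i in top]

def build_prompt(datastore, query, k=3):
    """Condition on retrieved past data plus the new query."""
    context = "\n".join(datastore.retrieve(query, k))
    return f"{context}\n\n{query}"
```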


LMs are trained under so much ambiguity. It's remarkable what they learn to work with


It's amazing how much more intelligent models become when trained over longer contexts


That the loss goes down is merely a unit test
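Taken literally, that sanity check might look like the following PyTorch sketch: a test asserting the loss decreases after a few optimizer steps on a tiny problem. Passing it says the plumbing works, not that the model is any good; the setup and names are illustrative only.

```python
# "Unit test" that the loss goes down after a few SGD steps on a toy problem.
import torch
import torch.nn as nn

def test_loss_decreases():
    torch.manual_seed(0)
    model = nn.Linear(8, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    x, y = torch.randn(32, 8), torch.randn(32, 1)

    initial = loss_fn(model(x), y).item()
    for _ in range(20):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    assert loss.item() < initial
```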


Trapped behind the bitter lesson is the original lesson

