siddhadev

@siddhadev

github.com/kpe

Joined May 2013

OpenBlock

@openblocklabs

Sep 4

Introducing OB-1: the new #1 coding agent on Terminal Bench. After a year of R&D, our agent now outperforms Codex and Claude Code. Early access is rolling out to waitlist users now.

siddhadev

@siddhadev

Aug 12

A new post about Hierarchical Reasoning, Energy-Based Transformers, and Streaming Deep RL: medium.com/p/hierarchical…

siddhadev reposted

@_albertgu

I converted one of my favorite talks I've given over the past year into a blog post.

"On the Tradeoffs of SSMs and Transformers"
(or: tokens are bullshit)

In a few days, we'll release what I believe is the next major advance for architectures.

siddhadev

@siddhadev

May 23

I was just about to get into fitness, but then got sidetracked into vibe coding: kpe.github.io/workout-track/…

siddhadev

@siddhadev

Jan 11

The Robust And Secure Machine Learning Podcast is here: youtube.com/playlist?list=… #RobustML #SecureML #MachineLearning #DeepLearning #AdversarialML #ML #AI #LLMs

siddhadev reposted

The Nobel Prize

@NobelPrize

Oct 8, 2024

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”

siddhadev

@siddhadev

Aug 13, 2024

Wow...

Vasu Shyam

@vasud3vshyam

Aug 12, 2024

Ever looked at the attention operation and said "hang on, that's a one-point function!"?

siddhadev

@siddhadev

Aug 10, 2024

Blind Vaysha is a 2016 animated short by Theodore Ushev based on a story by Georgi Gospodinov. The film tells the story of a girl who sees the past out of her left eye and the future from her right—and so is unable to live in the present. youtu.be/WxZfg-r11vU?si…

siddhadev reposted

Chris Olah

@ch402

May 21, 2024

I'm really excited about these results for many reasons, but the most important is that we're starting to connect mechanistic interpretability to questions about the safety of large language models.

Anthropic

@AnthropicAI

May 21, 2024

New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model. Read the blog post here: anthropic.com/research/mappi…

siddhadev reposted

elvis

@omarsar0

May 7, 2024

AlphaMath Almost Zero

Enhances LLMs with Monte Carlo Tree Search (MCTS) to improve mathematical reasoning capabilities.

The MCTS framework extends the LLM to achieve a more effective balance between exploration and exploitation.

For this work, the idea is to generate…

siddhadev reposted

Valeriy M., PhD, MBA, CQF

@predict_addict

May 1, 2024

All you need is Kolmogorov–Arnold Network! 🔥🔥🔥 complete with GitHub repo 🚀🚀🚀🚀🚀 'KAN: Kolmogorov–Arnold Networks' from @MIT and @Caltech h/t @illumattnati

siddhadev reposted

Yi Tay

@YiTayML

Apr 15, 2024

It's been a wild ride. Just 20 of us, burning through thousands of H100s over the past months, we're glad to finally share this with the world! 💪 One of the goals we’ve had when starting Reka was to build cool innovative models at the frontier. Reaching GPT-4/Opus level was a…

Reka

@RekaAILabs

Apr 15, 2024

Meet Reka Core, our best and most capable multimodal language model yet. 🔮 It’s been a busy few months training this model and we are glad to finally ship it! 💪 Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body…

siddhadev reposted

Jürgen Schmidhuber

@SchmidhuberAI

Nov 4, 2023

Silly AI regulation hype

One cannot regulate AI research, just like one cannot regulate math.

One can regulate applications of AI in finance, cars, healthcare. Such fields already have continually adapting regulatory frameworks in place.

Don’t stifle the open-source movement!…

siddhadev reposted

Russ Salakhutdinov

@rsalakhu

Sep 14, 2023

I hear a lot of folks in our AI community complain about openAI -- they don't publish, don't release models, maximum-for-profit, etc., so they are more like closedAI rather than openAI. This is true, but you have to give it to those guys -- they showed the true potential of LLMs…

siddhadev reposted

Stephen Wolfram

@stephen_wolfram

Aug 1, 2023

Inspired by a book cover in 1972 ... half a century later I think I finally understand the Second Law ... and today published my own book about it. amazon.com/Second-Law-Res…

siddhadev

@siddhadev

Jul 26, 2023

Every single new @HazyResearch blog post and arXiv paper is a new delight. hazyresearch.stanford.edu/blog/2023-07-2…

siddhadev reposted

Paul Halpern

@phalpern

Jul 22, 2023

'It is our responsibility as scientists ... to teach how doubt is not to be feared but welcomed and discussed' -Richard Feynman

siddhadev reposted

Tri Dao

@tri_dao

Jul 17, 2023

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster, and it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/