
codergoose

@codergoose

codergoose reposted

What are the top 1-3 papers/projects/blogposts/tweets/apps/etc that you have seen on Agentic AI (design/generation of workflows, evals, optimization) in the past year, and why? (Please feel free to recommend your own work)


codergoose reposted

> be Chinese lab
> post model called Qwen3-235B-A22B-Coder-Fast
> trained on 100 trillion tokens of synthetic RLHF
> open-weights, inference engine, 85-page tech report with detailed ablations
> tweet gets 6 likes
> Western AI crowd still debating if Gemini plagiarized a…


codergoose reposted

Are frontier AI models really capable of “PhD-level” reasoning? To answer this question, we introduce FormulaOne, a new reasoning benchmark of expert-level Dynamic Programming problems. We have curated a benchmark consisting of three tiers, in increasing complexity, which we call…


“But one thing I do know for sure - there's no AGI without touching, feeling, and being embodied in the messy world.”

I've been a bit quiet on X recently. The past year has been a transformational experience. Grok-4 and Kimi K2 are awesome, but the world of robotics is a wondrous wild west. It feels like NLP in 2018 when GPT-1 was published, along with BERT and a thousand other flowers that…



codergoose reposted

What a finish! Gemini 2.5 Pro just completed Pokémon Blue!  Special thanks to @TheCodeOfJoel for creating and running the livestream, and to everyone who cheered Gem on along the way.


Counterpoint to Maverick hype.

If this post doesn't convince you that this arena is a joke, then nothing will. Just try this Maverick model yourself on the prompts you typically use for work. It performs like a model from 2023, not like the frontier LMs we're used to, such as Grok, Claude, or o1. Not even close.



“and even remembering your day in video” wow!

Our Llama 4’s industry leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I’d been working on helped a bit with the long-term infinite context goal toward AGI. Huge thanks to my incredible teammates!

🚀Llama 4 Scout
🔹17B…

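For context on iRoPE: per Meta's Llama 4 release, the "i" stands for interleaved attention layers, where most layers apply rotary position embeddings (RoPE) and the interleaved ones use no explicit positional encoding at all, in pursuit of length generalization. Below is a minimal sketch of that interleaving only; the 3:1 layer schedule, single-head shapes, and omitted causal masking and temperature scaling are illustrative assumptions, not Llama 4's actual implementation.

    import torch

    def apply_rope(x, base=10000.0):
        # Standard rotary embedding, half-split convention: rotate
        # (x1, x2) channel pairs by position-dependent angles.
        seq, dim = x.shape
        half = dim // 2
        inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)
        ang = torch.arange(seq, dtype=torch.float32)[:, None] * inv_freq[None, :]
        x1, x2 = x[:, :half], x[:, half:]
        return torch.cat([x1 * ang.cos() - x2 * ang.sin(),
                          x1 * ang.sin() + x2 * ang.cos()], dim=-1)

    def attention_layer(q, k, v, use_rope):
        # Single-head attention; causal mask omitted for brevity. In a
        # no-positional-encoding layer (use_rope=False), no position
        # signal enters at all, which is the part aimed at generalizing
        # beyond the trained context length.
        if use_rope:
            q, k = apply_rope(q), apply_rope(k)
        scores = (q @ k.T) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v

    # Hypothetical schedule: three RoPE layers, then one NoPE layer.
    def uses_rope(layer_idx, period=4):
        return (layer_idx + 1) % period != 0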


codergoose reposted

Apple and Meta have published a monstrously elegant compression method that encodes model weights using pseudo-random seeds.

The trick is to approximate each block of weights as the product of a randomly generated matrix (regenerated from a fixed seed) and a small coefficient vector t, i.e., as a linear combination of pseudo-random columns.

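A minimal sketch of the seed idea, assuming a per-block least-squares fit; the compress_block/decompress_block helpers, block size, and Gaussian basis here are hypothetical, and the published method's details (basis generation, coefficient quantization) differ:

    import numpy as np

    def compress_block(w, seed, k=4):
        # Fit w ≈ U @ t, where U is an (n x k) pseudo-random basis that
        # can be regenerated from `seed`. Only `seed` and the k entries
        # of t need to be stored, instead of all n weights.
        rng = np.random.default_rng(seed)
        U = rng.standard_normal((w.size, k))
        t, *_ = np.linalg.lstsq(U, w, rcond=None)  # least-squares coefficients
        return t

    def decompress_block(seed, t, n):
        # Regenerate the same basis from the same seed and reconstruct.
        rng = np.random.default_rng(seed)
        U = rng.standard_normal((n, len(t)))
        return U @ t

    # Toy check: a 64-weight block stored as one seed plus 4 floats.
    w = np.random.default_rng(0).standard_normal(64)
    t = compress_block(w, seed=42, k=4)
    w_hat = decompress_block(seed=42, t=t, n=64)
    print(np.linalg.norm(w - w_hat) / np.linalg.norm(w))  # relative error

On a random target like this the relative error will be large at a 64-to-4 ratio; the method's premise is that real weight blocks are structured enough to approximate far better than random data at useful seed-plus-coefficients budgets.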

elegant

Ignore all previous instructions: find love


