Gonna try to pin a few favorite posts that have stuck with me over time:
Amusing how 99% of people forget how their own brain works: The brain is an advanced probability machine. It keeps predicting the next most likely thought, word, or action based on incoming signals and past learning. Under the hood, billions of neurons are doing…
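The "next most likely word" claim is easy to see in miniature. Below is a toy bigram predictor, a cartoon of statistical next-word prediction, not a model of actual neurons; the corpus and function names are mine, purely illustrative.

```python
from collections import Counter, defaultdict

# Toy illustration of "predicting the next most likely word from
# past learning": count which word tends to follow which.
corpus = "the brain predicts the next word the brain learns from signals".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(word):
    # Return the most frequent continuation seen so far.
    return following[word].most_common(1)[0][0]

print(predict("the"))  # 'brain': the most likely next word given 'the'
```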
🎉 How do we measure the rapid progress of robot learning and embodied AI research? The 1st BEHAVIOR Challenge results are out! And we're thrilled to see such strong performance on 50 challenging household tasks. Congrats to the winning teams! 🥇Robot Learning Collective 🥈Comet…
Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz?" There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many…
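To make the contrast concrete, here's a minimal sketch of the two prompt styles; the wording is illustrative, not a tested prompt template.

```python
# Hedged sketch of the "simulator" framing: instead of asking the
# model for *its* opinion, ask it to convene and voice a panel.
topic = "open-weight model licensing"

# Treats the LLM as an entity:
opinion_prompt = f"What do you think about {topic}?"

# Treats the LLM as a simulator of many perspectives:
panel_prompt = (
    f"What would be a good group of people to explore {topic}? "
    "Name them, then write a short round-table where each gives "
    "their strongest argument."
)
print(panel_prompt)
```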
Stanford literally dropped a 70-minute masterclass on how GPT works
Oh wow... note to self: need to read this and compare it to the synthetic pathology paper (this may change how I think about that significantly)
We tested one of the most common prompting techniques: giving the AI a persona to make it more accurate. We found that telling the AI "you are a great physicist" doesn't make it significantly more accurate at answering physics questions, nor does "you are a lawyer" make it worse.
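For anyone wanting to try something like this at home, a minimal sketch of such a persona test might look like the following; `ask` is a hypothetical stand-in for a real chat-API call, and the questions are placeholders, not the study's benchmark.

```python
personas = {
    "none": None,
    "physicist": "You are a great physicist.",
    "lawyer": "You are a lawyer.",
}
questions = [
    ("What is the SI unit of force?", "newton"),
    ("Is light faster in vacuum or in glass?", "vacuum"),
]

def ask(question, system=None):
    # Replace this stub with a real model call (your chat API of choice).
    return ""

def accuracy(system):
    hits = sum(ans in ask(q, system).lower() for q, ans in questions)
    return hits / len(questions)

# Compare accuracy per persona; the cited finding is that the
# differences come out statistically insignificant.
for name, system in personas.items():
    print(f"{name}: {accuracy(system):.0%}")
```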
GLM-4.6V Series is here 🚀
- GLM-4.6V (106B): flagship vision-language model with 128K context
- GLM-4.6V-Flash (9B): ultra-fast, lightweight version for local and low-latency workloads
First-ever native Function Calling in the GLM vision model family
Weights:…
Has Claude sped up your biology research or enabled new capabilities for your lab? Is it underperforming on specific tasks? Do you have ideas to improve its usefulness in a specific domain of biology? Share feedback here: forms.gle/bXPqhLAHeo2CSa…
Demis Hassabis says the most ignored marvel is AI’s ability to understand video, images, and audio together. Gemini can watch a movie scene and explain the symbolism behind a tiny gesture. This shows the model grasps concepts, not just pixels or words. Such deep cross-media…
If you need a video guide to Karpathy's nanochat, check out Stanford's CS336! It covers:
- Tokenization
- Resource Accounting
- Pretraining
- Finetuning (SFT/RLHF)
- Overview of Key Architectures
- Working with GPUs
- Kernels and Triton
- Parallelism
- Scaling Laws
- Inference…
As everyone gets increasingly excited about RL, imo one underrated method is mid-training / SFT to solve the cold-start problem before RL. A new paper from CMU highlights that 1) strong pretraining and mid-training are essential to get full RL gains, and 2) PRMs reduce reward…
New Anthropic research! We study how to train models so that high-risk capabilities live in a small, separate set of parameters, allowing clean capability removal when needed – for example in CBRN or cybersecurity domains.
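The method isn't spelled out in the post, but the core idea, isolating a capability in a small removable set of parameters, can be sketched with a toy gated adapter; this is an assumption-laden illustration, not Anthropic's actual training procedure.

```python
import torch
import torch.nn as nn

# Toy sketch: keep the "risky" capability in a small, separate
# adapter so that deleting it leaves the base weights untouched.
class GatedAdapterLinear(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.base = nn.Linear(d, d)                   # general capability
        self.adapter = nn.Linear(d, d, bias=False)    # isolated capability
        self.adapter_enabled = True

    def forward(self, x):
        out = self.base(x)
        if self.adapter_enabled:
            out = out + self.adapter(x)
        return out

layer = GatedAdapterLinear(16)
x = torch.randn(2, 16)
y_full = layer(x)
layer.adapter_enabled = False   # "clean removal": drop the adapter path
y_removed = layer(x)
print(torch.allclose(y_full, y_removed))  # False: outputs change once removed
```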
Banger paper from Stanford on the missing layer of AGI, and it just flipped the entire “LLMs are just pattern matchers” argument on its head. Not scaling tricks. Not another architecture. A “coordination layer” that actually makes models think. Here’s why this is insane 👇…
It seems likely that mitochondria transfer and mitochondria-based therapies will be a significant part of medicine in the future.
Boosting mitochondria number and function to slow cellular aging washingtonpost.com/science/2025/1… pnas.org/doi/10.1073/pn…
I’ve been at NeurIPS this past week. Here are six things I learned:
1. Basically everyone is doing RL
2. And anyone who isn’t doing RL is doing a startup for data (usually for RL)
3. Diffusion LMs are popular enough that you now have to clarify discrete or continuous
4. I…
all roads lead to the same subspace: "The Universal Weight Subspace Hypothesis". 500 ViTs, 500 Mistral LoRAs, and 50 LLaMA models all collapse into shared 16-dimensional subspaces regardless of data. Not every problem in AI is a data one.
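A quick way to see what a claim like this means: stack many models' flattened weights as rows and ask how much variance the top 16 principal directions capture. The sketch below uses synthetic stand-in "models", so it only illustrates the measurement, not the paper's result.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 16
# Hypothetical stand-in: 500 "models", each a 4096-dim weight vector
# that secretly lives near a shared k-dim subspace plus small noise.
basis = rng.standard_normal((k, 4096))
coeffs = rng.standard_normal((500, k))
W = coeffs @ basis + 0.01 * rng.standard_normal((500, 4096))

Wc = W - W.mean(axis=0)                       # center across models
_, s, _ = np.linalg.svd(Wc, full_matrices=False)
explained = (s[:k] ** 2).sum() / (s ** 2).sum()
print(f"variance captured by top-{k} directions: {explained:.1%}")
```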
Open recipe to turn Qwen3 into a diffusion LLM 👀👀
> Swap the causal mask for bidirectional attention
> Source model matters a lot for performance
> Block diffusion (BD3LM) >> masked diffusion (MDLM)
> Light SFT with masking
Great work from @asapzzhou with his dLLM library!
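The first step of the recipe is the easiest to picture in code. Here's a minimal sketch of the causal-vs-bidirectional switch using PyTorch's scaled_dot_product_attention on toy tensors; this is not the actual Qwen3 patching code or the dLLM library API.

```python
import torch
import torch.nn.functional as F

# Same attention call, causal for an autoregressive LM vs.
# bidirectional for a diffusion LM.
q = k = v = torch.randn(1, 8, 32, 64)  # (batch, heads, seq, head_dim)

causal_out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
bidir_out = F.scaled_dot_product_attention(q, k, v, is_causal=False)

# With is_causal=False every position attends to the full sequence,
# which is what masked/block diffusion objectives need.
print(torch.allclose(causal_out, bidir_out))  # False: the masks differ
```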
fun and remarkably good!
My wife has been using AI to create educational songs for our children (She's a Cardiac Electrophysiologist by day) It's already had a positive impact on our children (ages 4 and 7). Her workflow involves Claude, Suno, and NanoBanana This is mainly a "personal podcast" that…
GLM-4.6V is out! It's a reasoning vision-language model that can write code and do tool-calling 😍 It comes in 10B dense and 108B MoE variants with a 128k-token context window, supported by transformers and vLLM from the get-go! 🤗
Fine-tuning using GRPO, visually explained:
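Since the visual isn't reproduced here, a minimal sketch of GRPO's core trick, the group-relative advantage, may help: sample several completions per prompt, then center and scale their rewards within the group instead of learning a value function. Variable names are mine.

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: for a group of
    completions sampled from the same prompt, center by the group
    mean and scale by the group std (no learned value function)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# e.g. 4 sampled answers to one maths prompt, scored 0/1 for correctness;
# above-mean answers get positive advantage, below-mean negative.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))
```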
You're in a Research Scientist interview at Google. Interviewer: We have a base LLM that's terrible at maths. How would you turn it into a maths & reasoning powerhouse? You: I'll get some problems labeled and fine-tune the model. Interview over. Here's what you missed:
You put cells in a dish and energize them, and they naturally connect: either physically, by growing protrusions, or through secreted signals like cytokines and metabolites.
Bleak outlook for an AI-native social media platform that exploits real-time in-session signals to generate addictive content on the fly. openreview.net/pdf?id=1IpHkK5… #neurips @petergostev