
Geosh

@Geoshh

Embodied A.I. | Socioaffective Alignment | Systems Biology & Interpersonal Neurobiology | @UChicago | @EuroGradSchool | healing, science, technology, connection

Pinned

Gonna try to pin a few favorite posts that linger in mind over time:

Amusing how 99% of people using their own brains forget how it works: The brain is an advanced probability machine. It keeps predicting the next most likely thought, word, or action based on incoming signals and past learning. Under the hood, billions of neurons are doing…
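
A toy illustration of the "probability machine" framing in that pinned post, offered only as an analogy to next-token sampling, not as a brain model; the candidates and logits here are made up:

```python
# Toy sketch: pick the "next thought" by sampling from a probability
# distribution over candidates, the way an LM samples its next token.
import numpy as np

def softmax(logits):
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

candidates = ["coffee", "tea", "water"]
logits = np.array([2.0, 1.0, 0.1])       # stand-in for "past learning"
p = softmax(logits)
next_thought = np.random.choice(candidates, p=p)
print(dict(zip(candidates, p.round(3))), "->", next_thought)
```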



Geosh reposted

🎉 How do we measure the rapid progress of robotic learning and embodied AI research? The 1st BEHAVIOR challenge results are out! And we're excited to see such strong performance on 50 challenging household tasks. Congrats to the winning teams! 🥇Robot Learning Collective 🥈Comet…


Geosh reposted

Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz?" There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many…
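
A minimal sketch of that "simulator" framing: instead of asking the model for *its* opinion, ask it to convene and voice a panel. This is plain string templating; `panel_prompt` is a name made up here, and you would wire it into whatever chat API you use:

```python
# Build a "simulate a panel" prompt instead of a "what do you think" prompt.
def panel_prompt(topic, n=3):
    return (
        f"What would be a good group of {n} people to explore {topic}? "
        "Name them, then write a short round-table where each one, in turn, "
        "gives their strongest point and one disagreement with another panelist."
    )

print(panel_prompt("open-ended evaluation of embodied AI"))
```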


Geosh reposted

Stanford literally dropped a 70-minute masterclass on how GPT works


Oh wow... note to self: need to read this and compare to the synthetic pathology paper (this may change how I think about that significantly)

We tested one of the most common prompting techniques: giving the AI a persona to make it more accurate.

We found that telling the AI "you are a great physicist" doesn't make it significantly more accurate at answering physics questions, nor does "you are a lawyer" make it worse.

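A minimal sketch (not the authors' actual harness) of how one could A/B test persona prompting as described above: same questions, same grader, only the system prompt varies. `ask` and `grade` are hypothetical callables you supply:

```python
# Measure accuracy on a fixed question set under an optional persona.
# ask(system, question) -> answer string; grade(question, answer) -> bool.
def persona_accuracy(ask, grade, questions, persona=None):
    system = persona or "You are a helpful assistant."
    results = [grade(q, ask(system, q)) for q in questions]
    return sum(results) / len(results)

# Usage: compare the same physics set with and without the persona.
# baseline  = persona_accuracy(ask, grade, physics_qs)
# physicist = persona_accuracy(ask, grade, physics_qs,
#                              persona="You are a great physicist.")
```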


Geosh reposted

GLM-4.6V Series is here🚀

- GLM-4.6V (106B): flagship vision-language model with 128K context
- GLM-4.6V-Flash (9B): ultra-fast, lightweight version for local and low-latency workloads

First-ever native Function Calling in the GLM vision model family

Weights:…

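A hedged sketch of what "native Function Calling" enables, written against the widely used OpenAI-compatible tool-call format. The base URL is a placeholder, the model id and the `crop_image` tool are made up here, and whether Z.ai's endpoint matches this shape exactly is an assumption, so check their docs:

```python
# Ask a vision model to call a declared tool instead of answering in prose.
from openai import OpenAI

client = OpenAI(base_url="https://example.invalid/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "crop_image",  # hypothetical tool the model may choose to call
        "description": "Crop a rectangular region out of the input image.",
        "parameters": {
            "type": "object",
            "properties": {"x": {"type": "integer"}, "y": {"type": "integer"},
                           "w": {"type": "integer"}, "h": {"type": "integer"}},
            "required": ["x", "y", "w", "h"],
        },
    },
}]

resp = client.chat.completions.create(
    model="glm-4.6v",  # assumed model id
    messages=[{"role": "user", "content": "Crop out the chart in this image."}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # structured call, not free text
```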

Geosh reposted

Has Claude sped up your biology research or enabled new capabilities for your lab? Is it underperforming on specific tasks? Do you have ideas to improve its usefulness in a specific domain of biology? Share feedback here: forms.gle/bXPqhLAHeo2CSa…


Geosh reposted

Demis Hassabis says the most ignored marvel is AI’s ability to understand video, images, and audio together. Gemini can watch a movie scene and explain the symbolism behind a tiny gesture. This shows the model grasps concepts, not just pixels or words. Such deep cross-media…


Geosh reposted

If you need a video guide to Karpathy's nanochat, check out Stanford's CS336!

It covers:

- Tokenization
- Resource Accounting
- Pretraining
- Finetuning (SFT/RLHF)
- Overview of Key Architectures
- Working with GPUs
- Kernels and Triton
- Parallelism
- Scaling Laws
- Inference…

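To make the first item on that syllabus concrete, here is a toy sketch of one byte-pair-encoding merge step, the core move in LM tokenization. This is illustrative only, not the course's code:

```python
# One BPE merge: find the most frequent adjacent pair, fuse it into a token.
from collections import Counter

def most_frequent_pair(tokens):
    pairs = Counter(zip(tokens, tokens[1:]))  # count adjacent pairs
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair, new_token):
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)  # replace the pair with the merged token
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("banana")                      # ['b','a','n','a','n','a']
pair = most_frequent_pair(tokens)            # e.g. ('a', 'n')
print(merge_pair(tokens, pair, "".join(pair)))
```

Real tokenizers repeat this merge loop thousands of times over a large corpus and store the learned merge table.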

Geosh reposted

As everyone gets increasingly excited about RL, imo one underrated method is mid-training / SFT to solve the cold-start problem before RLing.

New paper from CMU highlights that
1) strong pretraining and midtraining are essential to get full RL gains
2) PRMs reduce reward…

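A hedged sketch of that cold-start recipe (not the CMU paper's code): warm-start the policy with supervised finetuning on curated traces, then run RL on top. `sft_step`, `rl_step`, `model.generate`, and `reward_fn` are hypothetical stand-ins for your trainer of choice:

```python
# Mid-training / SFT first, RL second: the order is the point.
def cold_start_then_rl(model, sft_batches, rl_prompt_batches, reward_fn,
                       sft_step, rl_step):
    for batch in sft_batches:            # 1) warm start: maximize likelihood
        sft_step(model, batch)           #    of known-good traces
    for prompts in rl_prompt_batches:    # 2) RL only after the cold start
        completions = [model.generate(p) for p in prompts]
        rewards = [reward_fn(p, c) for p, c in zip(prompts, completions)]
        rl_step(model, prompts, completions, rewards)
    return model
```

The claim above is that skipping step 1 (or doing it on a weakly pretrained base) leaves RL gains on the table.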

Geosh reposted

New Anthropic research! We study how to train models so that high-risk capabilities live in a small, separate set of parameters, allowing clean capability removal when needed – for example in CBRN or cybersecurity domains.

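A minimal sketch of one way to localize capabilities in a parameter subset, not necessarily Anthropic's actual method: mask gradients so that updates from flagged data only touch a designated slice of weights, which can later be zeroed to remove the capability. The `masks` dict (name -> 0/1 tensor of the parameter's shape) is an assumption of this sketch:

```python
# Route gradients from high-risk batches into a designated parameter subset.
import torch

def masked_step(model, loss, masks, optimizer):
    optimizer.zero_grad()
    loss.backward()
    for name, p in model.named_parameters():
        if p.grad is not None and name in masks:
            p.grad.mul_(masks[name])  # 1 = updatable for this data, 0 = frozen
    optimizer.step()
```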

Geosh reposted

Banger paper from Stanford University on the missing layer of AGI just flipped the entire “LLMs are just pattern matchers” argument on its head.

Not scaling tricks.
Not another architecture.

A "coordination layer" that actually makes models think.

Here’s why this is insane 👇…


Geosh reposted

It seems likely that mitochondria transfer and mitochondria-based therapies will be a significant part of medicine in the future.

Boosting mitochondria number and function to slow cellular aging washingtonpost.com/science/2025/1… pnas.org/doi/10.1073/pn…



Geosh reposted

I’ve been at NeurIPS this past week. Here are six things I learned:

1. Basically everyone is doing RL
2. And anyone who isn’t doing RL is doing a startup for data (usually for RL)
3. Diffusion LMs are popular enough that you now have to clarify discrete or continuous
4. I…


Geosh reposted

all roads lead to the same subspace

"The Universal Weight Subspace Hypothesis"

500 ViTs, 500 Mistral LoRAs, 50 LLaMA models all collapse into shared 16-dimension subspaces regardless of data

Not every problem in AI is a data one

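A hedged sketch of the basic idea (not the paper's exact procedure): stack flattened weights from many checkpoints of the same architecture and ask how much variance a 16-dimensional principal subspace explains:

```python
# Fit a shared low-rank subspace over many models' flattened weights.
import numpy as np

def shared_subspace(weight_list, k=16):
    # weight_list: arrays of identical shape, one per model checkpoint
    W = np.stack([w.ravel() for w in weight_list])  # (n_models, n_params)
    W = W - W.mean(axis=0)                          # center across models
    _, s, vt = np.linalg.svd(W, full_matrices=False)
    explained = (s[:k] ** 2).sum() / (s ** 2).sum() # variance captured by top-k
    return vt[:k], explained                        # basis and explained ratio
```

A high explained ratio at small k is what "all collapse into shared 16-dimension subspaces" would look like in this toy setup.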

Geosh reposted

Open recipe to turn Qwen3 into a diffusion LLM 👀👀

> Swap the causal mask for bidirectional attention
> Source model matters a lot for performance
> Block diffusion (BD3LM) >> masked diffusion (MDLM)
> Light SFT with masking

Great work from @asapzzhou with his dLLM library!
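
A minimal sketch of the first step in that recipe, the causal mask made optional so the same attention block can run bidirectionally; this is a generic illustration under standard attention definitions, not code from the dLLM library:

```python
# Scaled dot-product attention with a toggleable causal mask.
import torch
import torch.nn.functional as F

def attention(q, k, v, causal=True):
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    if causal:  # autoregressive LM: each position attends only to the past
        T = scores.size(-1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=scores.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    # causal=False gives bidirectional attention, as diffusion LMs need
    return F.softmax(scores, dim=-1) @ v
```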


fun and remarkably good!

My wife has been using AI to create educational songs for our children (she's a Cardiac Electrophysiologist by day).

It's already had a positive impact on our children (ages 4 and 7). Her workflow involves Claude, Suno, and NanoBanana.

This is mainly a "personal podcast" that…



Geosh reposted

GLM4.6V is out!

It's a reasoning vision language model that can write code and do tool-calling 😍

comes in 10B dense and 108B MoE variants with 128k tokens context window

supported by transformers and vLLM from the get-go! 🤗


Geosh reposted

Fine-tuning using GRPO, visually explained:


You're in a Research Scientist interview at Google.

Interviewer: We have a base LLM that's terrible at maths. How would you turn it into a maths & reasoning powerhouse?

You: I'll get some problems labeled and fine-tune the model.

Interview over. Here's what you missed:
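
Since the visual explainer itself is not reproduced here, a short sketch of GRPO's core trick under its standard formulation: sample a group of completions per prompt and score each against the group's mean and standard deviation, instead of training a separate value network. The example rewards are made up:

```python
# Group-relative advantages: the "GR" in GRPO.
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    r = np.asarray(rewards, dtype=float)     # rewards for one prompt's group
    return (r - r.mean()) / (r.std() + eps)  # critic-free advantage per sample

print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # -> [ 1. -1. -1.  1.]
```

These advantages then weight a clipped policy-gradient update, PPO-style, which is why a plain "label some problems and fine-tune" answer misses the point of the interview question.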



Geosh reposted

You put cells in a dish and energize them, and they naturally connect. Either physically, by growing protrusions. Or through secreted signals—cytokines, metabolites.

From Slava Bobrov

Geosh reposted

Bleak outlook for an AI-native social media platform that exploits real-time in-session signals to generate addictive content on the fly. openreview.net/pdf?id=1IpHkK5… #neurips @petergostev

