Geosh
@Geoshh
Embodied A.I. | Socioaffective Alignment | Systems Biology & Interpersonal Neurobiology | @UChicago | @EuroGradSchool | healing, science, technology, connection
Gonna try to pin a few favorite posts that linger in mind over time:
Amusing how 99% of people use their own brains yet forget how they work: The brain is an advanced probability machine. It keeps predicting the next most likely thought, word, or action based on incoming signals and past learning. Under the hood, billions of neurons are doing…
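The "predict the next most likely word from past learning" idea can be sketched in a few lines. This is a toy bigram counter, not anything from the tweet itself; the corpus and function names are invented for illustration:

```python
from collections import Counter, defaultdict

# Toy "probability machine": learn transition counts from experience,
# then predict the most likely next word given the current one.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram transitions: word -> Counter of words that followed it
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def predict_next(word):
    """Return the most probable next word and its empirical probability."""
    counts = transitions[word]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

print(predict_next("the"))  # → ('cat', 0.5): "cat" follows "the" 2 times out of 4
```

Real brains (and LLMs) condition on far richer context than one previous word, but the predict-compare-update loop has the same shape.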
This is a fantastic article explaining why you should be paying attention to the emergence of hybrid models and why they are likely to replace self-attention-based models (hint: much faster, lower-memory-footprint inference). pytorch.org/blog/hybrid-mo… This is from the vLLM folks.
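The memory-footprint argument comes down to the KV cache. A back-of-envelope sketch (the model dimensions below are illustrative assumptions, not numbers from the blog post): self-attention must cache keys and values for every past token, while a linear/recurrent layer (e.g. an SSM block in a hybrid) keeps a fixed-size state regardless of sequence length.

```python
# Illustrative dimensions only: a mid-size model in fp16.
layers, kv_heads, head_dim, dtype_bytes = 32, 8, 128, 2

def kv_cache_bytes(seq_len):
    # Attention caches K and V (factor of 2) for every past token.
    return 2 * layers * kv_heads * head_dim * dtype_bytes * seq_len

def recurrent_state_bytes(state_dim=128):
    # A recurrent/SSM layer carries a fixed-size state per head,
    # independent of how many tokens have been seen.
    return layers * kv_heads * head_dim * state_dim * dtype_bytes

for n in (1_000, 100_000):
    print(f"{n:>7} tokens: KV cache {kv_cache_bytes(n)/2**20:8.1f} MiB, "
          f"recurrent state {recurrent_state_bytes()/2**20:8.1f} MiB")
```

The KV cache grows linearly with context (and dominates batch-size limits at long context), while the recurrent state stays constant, which is the core of the inference-speed and memory claim.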
Lovely work by @gersonkroiz and Greg Kocher: Previously, when we studied models being aware they're being evaluated, they said it in the CoT, but that might be easier to fix. They got models to do it silently and showed our methods still stop it! Done as a 4-day practice project!
Excited to publish some cool findings from our 4-day mini-project in Neel Nanda's MATS Exploration stream. We studied whether LLMs can be evaluation-aware without explicit verbalization. Read more here lesswrong.com/posts/W6ZFnhee…
TIL: Claude Code's local sandbox environment is open-source. > native OS sandboxing primitives (sandbox-exec on macOS, bubblewrap on Linux) and proxy-based network filtering. It can be used to sandbox the behaviour of agents, local MCP servers, bash commands and arbitrary…
Your heart starts with a single cell that learns to beat.

What you're seeing: This video shows a stem cell transforming into a heart muscle cell, or cardiomyocyte. As it develops, the cell organizes its internal scaffolding, forms contractile fibers called sarcomeres, and begins…
woah....note to self, cf
🚨🇺🇸 SEVEN MORE FAMILIES SUE OPENAI OVER CHATGPT SUICIDE CASES Seven families have filed new lawsuits against OpenAI, claiming the company rushed its GPT-4o model to market without proper safety testing. Four cases involve suicides allegedly linked to the chatbot’s responses,…
This is one of the most exciting agentic AI results I have seen! An AI agent (through several rounds of reasoning and experimentation) discovers distributed systems algorithms (e.g., GPU load balancing) that perform on par with those designed by world-renowned human experts in…
We built a Systems Researcher AI agent! Glia discovers novel distributed systems algorithms matching PhD-level experts in creativity & performance. We ran it on various networked systems problems and obtained publication-worthy results on each! Let me tell you how we did it 🧵
I've been using Kimi K2 for my mother's osteoporosis treatment to double-check her test results, bone density reports, and the feedback from her doctor. Honestly, it’s been surprisingly impressive. Compared to ChatGPT or Gemini, Kimi K2 gives far more detailed and accurate…
if you want the tweet version and not the 10min video version: this is now all it takes to train with prime-rl after installing verifiers
verifiers v0.1.7 is released 🚀 this one's all about making RL training and experimentation waaaay easier: - single-command installation for prime-rl - single-command training w/ unified configs - overhauled vf.RLTrainer for hacking on new algorithms quick demo + links below :)
Kimi-k2-thinking is incredible. So I built an agent to test it out, Kimi-writer. It can generate a full novel from one prompt, running up to 300 tool requests per session. Here it is creating an entire book, a collection of 15 short sci-fi stories.
An exciting new approach to continual learning, using nested optimization to enhance long-context processing.
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI…
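The generic shape of "models as nested optimization problems" can be shown in a toy example. To be clear, this is not the Hope architecture from the post, just the two-level pattern it builds on: a slow outer loop updates shared parameters while a fast inner loop adapts a copy of them to each incoming context (all names and numbers here are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(w, x, y):
    # Gradient of mean squared error for a linear model x @ w ≈ y
    return 2 * x.T @ (x @ w - y) / len(y)

w_slow = np.zeros(3)                       # outer (slow) parameters
for step in range(200):                    # outer optimization loop
    x = rng.normal(size=(16, 3))           # a fresh "context" of data
    y = x @ np.array([1.0, -2.0, 0.5])     # ground-truth mapping
    w_fast = w_slow.copy()
    for _ in range(3):                     # inner, fast optimization on this context
        w_fast -= 0.1 * grad(w_fast, x, y)
    # outer update: pull the slow weights toward the context-adapted fast weights
    w_slow += 0.5 * (w_fast - w_slow)

print(np.round(w_slow, 2))                 # converges to the true weights [1, -2, 0.5]
```

The inner loop plays the role of fast weights / in-context adaptation, the outer loop the role of slow consolidation; nesting more such levels is the direction the post describes.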
hey yo folks, if you liked the evolutionary-strategies fine-tuning chat we had with Yulu, check out the new version of the code they released, which is 10X faster. def a fine-tuning method to tinker with
Our recent ES fine-tuning paper (arxiv.org/pdf/2509.24372) received lots of attention from the community (Thanks to all!). To speed up the research in this new direction, we developed an accelerated implementation with 10X speed-up in total running time, by refactoring the…
Your every thought, memory, and movement begins here: an electric spark turned chemical message, as one neuron whispers to the next.

What you're seeing: This animation captures neurotransmission, the process that allows nerve cells to communicate across microscopic gaps called…
every single person i know who made a cool video demo of a project that shows agency & great technical ability has been reached out to by top labs & top companies. you never need to compete with millions of other people for your top-pick role
When a robot is forced to prove “I’m really a robot.” XPeng’s humanoid robot IRON seems to have crossed the uncanny valley — its body shape and movements look almost identical to a human’s. So XPeng had to cut open one of its legs to reveal the hardware inside, just to prove…
At yesterday’s Tech Day, XPeng’s female humanoid robot, Iron, made a catwalk-style entrance that stunned the audience. But soon after, people online started asking — “Is there a real person inside?” Today, XPeng CEO He Xiaopeng personally stepped in to respond to those doubts.
Exciting to see these results aligning with our recent work showing that memorization happens during the entropy-seeking phase of LLM pretraining, where information is added to the bottom directions of the representation space! 🧵: x.com/kumarkagrawal/…
We project activations of the two sets onto the eigenvectors of A. There’s a very large and clear disentanglement across the eigenspectrum: clean data interacts with top directions and memorized with the bottom directions in both LMs and ViTs
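The projection being described can be sketched with synthetic data (dimensions and construction invented here, not taken from the thread): eigendecompose a symmetric matrix A, then measure how much of each activation set's energy lands on the top versus bottom eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
A = rng.normal(size=(d, d))
A = A @ A.T                                # symmetric PSD stand-in for "A"
eigvals, eigvecs = np.linalg.eigh(A)       # eigenvalues in ascending order
top, bottom = eigvecs[:, -2:], eigvecs[:, :2]

# Synthetic activations: "clean" lives in the top eigenspace,
# "memorized" in the bottom eigenspace, mirroring the reported split.
clean = rng.normal(size=(100, 2)) @ top.T
memorized = rng.normal(size=(100, 2)) @ bottom.T

def energy(acts, basis):
    """Fraction of total activation energy captured by the given eigenvectors."""
    proj = acts @ basis
    return (proj ** 2).sum() / (acts ** 2).sum()

print(f"clean on top dirs:     {energy(clean, top):.2f}")      # 1.00
print(f"memorized on top dirs: {energy(memorized, top):.2f}")  # 0.00
```

In the real experiments the separation is measured on learned representations rather than constructed by hand, but the projection arithmetic is the same.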
Haha! To prove she’s 100% robot, XPeng had to literally cut open one of IRON’s legs so everyone could see the actuators inside 😂
I'm a bit giddy over the fact that this is by all visible measures a frontier-level model, if not THE frontier model, for agentic tasks. And you can run it. In its native precision. On 2 M3 Ultras. Pretty fast. In MLX.
The new 1-trillion-parameter Kimi K2 Thinking model runs well on 2 M3 Ultras in its native format - no loss in quality! The model was quantization-aware trained (QAT) at int4. Here it generated ~3500 tokens at 15 toks/sec using pipeline parallelism in mlx-lm:
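Why a 1T-parameter int4 model fits on two M3 Ultras is simple arithmetic. A back-of-envelope sketch that counts weights only (it ignores KV cache and activation overhead, and assumes the commonly configured 512 GB of unified memory per machine):

```python
# int4 quantization stores each parameter in 4 bits = 0.5 bytes.
params = 1e12
bytes_per_param = 0.5
weights_gb = params * bytes_per_param / 1e9

print(f"total weights: {weights_gb:.0f} GB")          # total weights: 500 GB
print(f"per M3 Ultra:  {weights_gb / 2:.0f} GB")      # per M3 Ultra:  250 GB
```

500 GB of weights split across two machines with pipeline parallelism leaves comfortable headroom on each for the KV cache, which is what makes native-precision local inference practical.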
Love this explanation
By decomposing weights using loss curvature, you can identify components used for memorization vs generalization. High-curvature = shared mechanisms used across data. Low-curvature = idiosyncratic directions for memorized examples. You can then ablate the memorization weights!
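A toy version of the curvature decomposition makes the mechanics concrete. This uses a hand-built synthetic Hessian, not the paper's actual method or models: express the weights in the Hessian's eigenbasis, zero the low-curvature components (the directions the post associates with memorization), and check that the high-curvature components survive.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
# Synthetic loss curvature: three sharp (shared) and three flat directions.
D = np.diag([10.0, 8.0, 5.0, 0.1, 0.05, 0.01])
Q = np.linalg.qr(rng.normal(size=(d, d)))[0]   # random orthonormal basis
H = Q @ D @ Q.T                                # Hessian in weight space

w = rng.normal(size=d)                         # the "trained" weights
eigvals, eigvecs = np.linalg.eigh(H)           # curvature per direction, ascending
coords = eigvecs.T @ w                         # weights in the curvature eigenbasis
coords[eigvals < 1.0] = 0.0                    # ablate low-curvature components
w_ablated = eigvecs @ coords

# High-curvature (shared/generalizing) components are untouched by the ablation.
high = eigvecs[:, eigvals >= 1.0]
print(np.allclose(high.T @ w, high.T @ w_ablated))  # True
```

The threshold of 1.0 is arbitrary here; in practice one would pick the cut from the eigenvalue spectrum, and the interesting empirical claim is that ablating the flat directions removes memorized examples while sparing general capability.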