
Basavasagar Patil

@basavasagar18

@UMRobotics Master's Student.

Basavasagar Patil reposted

Making everything one big CUDA graph helped wall clock consistency a lot on a laptop, but we still fought with power management. Looking forward to using an Nvidia Spark in the future. It isn’t in the repo code, but the single biggest win I have seen is an “action input” model…
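For reference, here is a minimal sketch of the kind of CUDA-graph capture the tweet alludes to, using PyTorch's torch.cuda.CUDAGraph. The toy policy, shapes, and buffers are placeholders for illustration, not the repo's actual code.

```python
# Minimal sketch (not the repo's code): capture a policy forward pass into one
# CUDA graph so per-step kernel-launch overhead stops jittering the wall clock.
import torch

policy = torch.nn.Sequential(
    torch.nn.Linear(64, 256), torch.nn.ReLU(), torch.nn.Linear(256, 8)
).cuda().eval()

static_obs = torch.zeros(1, 64, device="cuda")  # fixed input buffer for capture

# Warm up on a side stream so allocations settle before capture.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s), torch.no_grad():
    for _ in range(3):
        policy(static_obs)
torch.cuda.current_stream().wait_stream(s)

# Capture one replayable graph.
graph = torch.cuda.CUDAGraph()
with torch.cuda.graph(graph), torch.no_grad():
    static_action = policy(static_obs)

# At run time: copy fresh data into the static buffer and replay the graph.
static_obs.copy_(torch.randn(1, 64, device="cuda"))
graph.replay()
action = static_action.clone()
```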


With some caveats*


Do VLAs really need large multi-billion parameter VLM backbones? Recent impressive work from TRI and BD seems to suggest maybe not.



The solution to Yann LeCun's drifting problem is a system prompt!!

New blog post by @AmanGokrani: Everyone says Claude Code "just works" like magic. He proxied its API calls to see what's happening. The secret? It's riddled with <system-reminder> tags that never let it forget what it's doing. (1/6) [🔗 link in final post with system prompt]

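A rough guess at the pattern the blog describes, sketched below: re-injecting a <system-reminder> into each request so the model keeps its current task in view. The helper name and reminder wording are made up for illustration; the actual tags Claude Code uses are in the linked post.

```python
# Hypothetical sketch of the drift-prevention trick the blog describes:
# append a <system-reminder> to every request so the model never "forgets".
def with_reminder(messages, task_summary):
    """Return a copy of the chat history with a reminder appended to the
    latest user turn. Tag name matches the blog; the text is illustrative."""
    reminder = (
        "<system-reminder>\n"
        f"Your current task: {task_summary}\n"
        "Stay focused on it; do not start unrelated work.\n"
        "</system-reminder>"
    )
    patched = [dict(m) for m in messages]
    patched[-1]["content"] = patched[-1]["content"] + "\n\n" + reminder
    return patched


history = [{"role": "user", "content": "Refactor utils.py into a package."}]
print(with_reminder(history, "refactor utils.py into a package")[-1]["content"])
```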


Basavasagar Patil reposted

PSA: if you work on plasticity loss you should read "Transient Non-stationarity and Generalisation in Deep Reinforcement Learning" by Igl et al. It's super relevant but suffers from an unfortunate lack of SEO due to predating the "plasticity loss" nomenclature.


Of course, they took the spotlight from Genie


The progress is crazy on this!!

Harder, Better, Faster, Stronger, Real-time! We are excited to reveal Genie 3, our most capable real-time foundational world model. Fantastic cross-team effort led by @jparkerholder and @shlomifruchter. Below some interactive worlds and capabilities that were highlights for me…



Lol, I just used Opus to rewrite my torch PPO code to JAX. Let's just say I was nowhere near confident, and I was right: it was full of compilation errors and bugs

"We recently merged a 22,000-line change to our production reinforcement learning codebase that was written heavily by Claude."



Basavasagar Patil reposted

Sell my computer

You’re given a MacBook, no job, no money. You have 30 days to make $1,000 online. What’s your plan?



Basavasagar Patil reposted

I am excited to announce that our AI institute (Institute for Foundations of Machine Learning, IFML) has been renewed. IFML was part of the first cohort of AI Institutes announced in 2020. Led by UT Austin, the new award will build on the trajectory of the past five years and…


Basavasagar Patil reposted

This very cool paper proposes an intriguing idea. If you use a small batch size, you can fine-tune LLMs with SGD or Adafactor (algorithms with very small memory overhead). But there is a small trap: Storage precision. Let's explore that. 🧵

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n

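A minimal sketch of the setup both tweets describe: plain small-batch SGD with no momentum, keeping the weights in fp32 so the tiny per-step updates aren't rounded away (the "storage precision" trap). The model and data below are toy placeholders.

```python
# Toy sketch: small-batch vanilla SGD (no momentum, negligible optimizer state),
# computing in bf16 but storing/updating the master weights in fp32.
import torch

model = torch.nn.Linear(512, 512)                 # fp32 master weights
opt = torch.optim.SGD(model.parameters(), lr=3e-4, momentum=0.0)

for step in range(10):
    x = torch.randn(4, 512)                       # small batch
    with torch.autocast("cpu", dtype=torch.bfloat16):
        loss = model(x).pow(2).mean()             # bf16 compute
    loss.backward()
    opt.step()                                    # update lands on fp32 storage
    opt.zero_grad()
```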


Hierarchical RL!!!!

Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…
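The one-line description of RL in the quoted tweet ("this happened to go well, nudge its probability up") is roughly the REINFORCE policy-gradient update. A toy sketch, with placeholder policy outputs and returns:

```python
# Toy REINFORCE step: weight log-probabilities of sampled actions by how well
# they turned out, so good actions become slightly more likely.
import torch

logits = torch.randn(8, 4, requires_grad=True)    # 8 sampled steps, 4 actions
actions = torch.randint(0, 4, (8,))               # actions that were taken
returns = torch.randn(8)                          # how well each step went

log_probs = torch.log_softmax(logits, dim=-1)[torch.arange(8), actions]
advantage = returns - returns.mean()              # crude baseline
loss = -(log_probs * advantage).mean()            # push up what went well
loss.backward()                                   # grads nudge the policy
```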



Basavasagar Patil reposted

clickbait title


I'll discuss distributed learning on Saturday, July 12. First I'll cover current methods that need high bandwidth, then next-generation methods for decentralized learning.



Basavasagar Patil reposted

Before we got fancy optimizers like Muon and Shampoo that precondition with matmuls, optimizer steps were essentially entirely bandwidth-bound pointwise ops. In this regime, more optimizer steps by definition decrease your MFU, and vice versa if the speed increases more than your…

I never understood gradient accumulation; it seemed to me that doing more optimizer steps was clearly optimal, as you can mimic the accumulation behaviour with a properly chosen learning rate
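For concreteness, a minimal sketch of the gradient accumulation being debated here: N micro-batches feed one averaged optimizer step instead of N separate steps. The model and data are toy placeholders.

```python
# Toy gradient accumulation: 8 micro-batches per optimizer step, so the
# bandwidth-bound pointwise optimizer update runs 8x less often.
import torch

model = torch.nn.Linear(256, 1)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
accum_steps = 8

for micro_step in range(64):
    x = torch.randn(16, 256)
    loss = model(x).pow(2).mean() / accum_steps   # scale so grads average
    loss.backward()                               # grads accumulate in .grad
    if (micro_step + 1) % accum_steps == 0:
        opt.step()                                # one update per 8 micro-batches
        opt.zero_grad()
```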



Basavasagar Patil reposted

(only tangentially related): Many interesting claims in empirical DL, including my own, are “not even wrong” — they are not stated precisely enough to even be hypothesis-tested. This is often by necessity: we don’t know formal definitions of the relevant objects. (Cont)

Grad students often feel pressure that doing science means only changing a-variable-at-a-time. No. You do that if you're testing a causal hypothesis. If you're exploring complex systems, you may maintain a mental model and update several variables at once in, eg, a Bayesian way.



Basavasagar Patil reposted

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…


Basavasagar Patil reposted

Why you should stop working on RL research and instead work on product // The technology that unlocked the big scaling shift in AI is the internet, not transformers I think it's well known that data is the most important thing in AI, and also that researchers choose not to work…


Basavasagar Patil reposted

we are from New York

Share a piece of lore about yourself


