
Shaojie Bai

@shaojieb

Doing AI at @thinkymachines. Previously GenAI+RLR @ Meta. CMU MLD. Twitter account for more than AI.

Post-training made (extremely) easy. Try it out!

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…
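The announcement is light on specifics, so here is a hypothetical sketch of the paradigm it describes: the training loop is ordinary Python on your laptop, and each heavy step is a call to a remote GPU service. Every name below (RemoteTrainer, forward_backward, optim_step, the model string) is an illustrative stand-in, not Tinker's actual API.

```python
import random

class RemoteTrainer:
    """Stand-in client that would proxy each training step to a GPU service."""
    def __init__(self, base_model: str):
        self.base_model = base_model

    def forward_backward(self, batch) -> float:
        # In a real service, this call would run the forward/backward pass on
        # distributed GPUs, accumulate gradients server-side, and return the loss.
        return random.random()  # dummy loss so the sketch runs locally

    def optim_step(self, lr: float) -> None:
        # In a real service, this would apply the accumulated gradients remotely.
        pass

trainer = RemoteTrainer(base_model="an-open-weights-llm")  # placeholder name
dataset = [["example batch 1"], ["example batch 2"]]
for epoch in range(2):
    for batch in dataset:
        loss = trainer.forward_backward(batch)
        trainer.optim_step(lr=1e-4)
    print(f"epoch {epoch}: last loss {loss:.3f}")
```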




Shaojie Bai reposted

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…
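For readers who haven't used it, a minimal LoRA layer in PyTorch makes the comparison concrete: the pretrained weight is frozen and only a low-rank update B @ A is trained, so the effective weight is W + (alpha/r) * B A. This follows the original LoRA paper's recipe, not necessarily the exact setup in the post.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))  # only A and B receive gradients
```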


Shaojie Bai reposted

Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.…
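As a rough illustration of what a manifold constraint on a weight matrix can mean, the sketch below keeps a matrix on the Stiefel manifold (orthonormal columns) by retracting after every gradient step. This is generic projected-gradient intuition, not the optimizer construction from the post.

```python
import torch

def retract_to_stiefel(W: torch.Tensor) -> torch.Tensor:
    # Polar retraction: the nearest matrix with orthonormal columns is
    # U @ Vh from the (thin) SVD of W.
    U, _, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ Vh

W = retract_to_stiefel(torch.randn(256, 128))  # start on the manifold
for _ in range(10):
    grad = torch.randn_like(W)                 # stand-in for a real gradient
    W = retract_to_stiefel(W - 0.01 * grad)    # step, then retract

print(torch.dist(W.T @ W, torch.eye(128)))     # ~0: columns stay orthonormal
```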


Shaojie Bai reposted

Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're…


We’ll be mixing ideas, cocktails and discussions for a night of *neural-networking* in Singapore! Come learn more about Thinking Machines at our happy hour at #iclr2025 😎

Thinking Machines is hosting a happy hour in Singapore during #ICLR2025 on Friday, April 25: lu.ma/ecgmuhmx Come eat, drink, and learn more about us!



Shaojie Bai reposted

Introducing our first set of Llama 4 models! We’ve been hard at work doing a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4…


Shaojie Bai reposted

Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡


Shaojie Bai reposted

I'm excited to announce that I am joining the OpenAI Board of Directors. I'm looking forward to sharing my perspectives and expertise on AI safety and robustness to help guide the amazing work being done at OpenAI.

Zico Kolter from Carnegie Mellon joins OpenAI’s Board, bringing technical and AI safety expertise; he also joins the Safety & Security Committee. openai.com/index/zico-kol…



Shaojie Bai reposted

I am very excited to start working with the GenAI team at @Meta, focusing on multimodal LLM agents, joining alongside my amazing CMU colleagues Jing Yu Koh @kohjingyu and Daniel Fried @dan_fried!


Shaojie Bai reposted

I'm thrilled to share that I will become the next Director of the Machine Learning Department at Carnegie Mellon. MLD is a true gem, a department dedicated entirely to ML. Faculty and past directors have been personal role models in my career. cs.cmu.edu/news/2024/kolt…


Shaojie Bai reposted

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: openai.com/index/hello-gp… Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks.
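For reference, the announced model is reachable through the standard OpenAI Python SDK (pip install openai); a minimal text-only call is sketched below, assuming OPENAI_API_KEY is set in the environment. Audio and video modes rolled out separately, per the tweet.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Describe GPT-4o in one sentence."}],
)
print(resp.choices[0].message.content)
```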


Shaojie Bai reposted

🚀Our latest blog post unveils the power of Consistency Models and introduces Easy Consistency Tuning (ECT), a new way to fine-tune pretrained diffusion models into consistency models. SoTA fast generative models at 1/32 of the training cost! 🔽 Get ready to speed up your generative…

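To give a flavor of the consistency objective behind this line of work: the model f(x_t, t) is trained so that two points on the same noising trajectory map to (nearly) the same output, with the target produced by a stop-gradient copy. The toy code below is a simplified caricature of that idea, not ECT's actual schedule or parameterization.

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))  # toy f(x, t)

def consistency_loss(x0: torch.Tensor) -> torch.Tensor:
    t = torch.rand(x0.shape[0], 1)   # random timestep in (0, 1)
    r = t * 0.9                      # an earlier, less-noisy step
    noise = torch.randn_like(x0)
    x_t = x0 + t * noise             # toy noising process
    x_r = x0 + r * noise             # same trajectory, earlier step
    pred = f(torch.cat([x_t, t], dim=1))
    with torch.no_grad():            # stop-gradient teacher target
        target = f(torch.cat([x_r, r], dim=1))
    return ((pred - target) ** 2).mean()

loss = consistency_loss(torch.randn(16, 2))
loss.backward()
```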

Shaojie Bai reposted

I feel like a lot of people leverage LLMs suboptimally, especially for long-form interactions that span a whole project. So I wrote a VSCode extension that supports what I think is a better use paradigm. 🧵 1/N Extension: marketplace.visualstudio.com/items?itemName… Code: github.com/locuslab/chatl…


Exciting new work with Evonne, @AlexRichardCS et al! 🤯 We (generatively) animate photorealistic full-body avatars using conversational audio (of anyone!). Next step, seeing GPT4 + photorealistic avatar [argue with]/[point a finger at]/[mock] you in VR?😏

Motion generation for photorealistic avatars? Say no more! Check out how we animate full body avatars exclusively from audio input! Paper: arxiv.org/abs/2401.01885 Project page: people.eecs.berkeley.edu/~evonne_ng/pro… Dataset + Code: github.com/facebookresear…



Shaojie Bai reposted

My core ML team (@AIatMeta) is hiring research interns! Our projects span optimization, optimal transport, optimal control, generative modeling, complex systems, and geometry. Please apply here and reach out ([email protected]) if you're interested: metacareers.com/jobs/627997209…


Shaojie Bai reposted

@CadeMetz at the New York Times just published a piece on a new paper we are releasing today, on adversarial attacks against LLMs. You can read the piece here: nytimes.com/2023/07/27/tec… And find more info and the paper at: llm-attacks.org [1/n]


Shaojie Bai reposted

NeurIPS!!! My first in-person conference in the 3 years since I started my research. 🥳 Glad to chat about anything: neural dynamics, deep equilibrium models (DEQ), symmetries, protein folding/AF2, etc. I'll be working on the intersection of DEQ and AF2 and am looking forward to all the collaboration opportunities!


Shaojie Bai reposted

🆕📜When can **Equilibrium Models** learn from simple examples to handle complex ones? We identify a property — Path Independence — that enables this by letting EMs think for longer on hard examples. (NeurIPS) 📝: arxiv.org/abs/2211.09961

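A minimal deep equilibrium model makes the "think for longer" framing concrete: the layer's output is the fixed point z* of z = f(z, x), found by iteration, and path independence means z* does not depend on the starting z, so harder inputs can simply be given more iterations. The code below is a toy illustration with a naive fixed-point solver, not the paper's setup.

```python
import torch
import torch.nn as nn

class DEQ(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.lin_z = nn.Linear(dim, dim)
        self.lin_x = nn.Linear(dim, dim)

    def f(self, z, x):
        return torch.tanh(self.lin_z(z) + self.lin_x(x))

    def forward(self, x, iters: int = 50):
        z = torch.zeros_like(x)   # any init works if the model is path-independent
        for _ in range(iters):    # more iterations = more "thinking"
            z = self.f(z, x)
        return z

model = DEQ(16)
x = torch.randn(4, 16)
z_easy = model(x, iters=20)
z_hard = model(x, iters=200)      # spend extra test-time compute on hard inputs
```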

Shaojie Bai reposted

I just posted our Deep Learning Systems Lecture 6 on Fully Connected Networks, Optimization, and Initialization: youtu.be/CukpVt-1PA4 However, the real topic of interest here is that I used @OpenAI's whisper to caption it entirely. A thread 🧵on my experience. 1/N
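The captioning workflow the thread describes can be reproduced with the open-source openai-whisper package (pip install openai-whisper; it also needs ffmpeg on the PATH). The model size and file names below are placeholders.

```python
import whisper

def ts(seconds: float) -> str:
    # SRT timestamp format: HH:MM:SS,mmm
    ms = int(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

model = whisper.load_model("base")        # larger checkpoints caption better
result = model.transcribe("lecture6.mp4")

# Each segment carries start/end times and text: enough to emit SRT captions.
with open("lecture6.srt", "w") as srt:
    for i, seg in enumerate(result["segments"], start=1):
        srt.write(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n"
                  f"{seg['text'].strip()}\n\n")
```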
