
Mike Carroll

@mikecarroll_eng

Engineer. Previously @Facebook

Mike Carroll reposted

Training Andrej Karpathy’s Nanochat on 4x RTX 3090s at 225W each:

Step 2,694/21,400 (12.59% done)
Loss: 3.14
Runtime: 6.78 hours
Throughput: 3,600 tok/sec
Temps: 52-57°C
VRAM: 19GB/24GB per card
Total cost: $15 at 55h

Zero errors, perfectly stable
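The $15 figure is consistent with a back-of-the-envelope electricity estimate; a quick sanity check, assuming roughly $0.30/kWh (the rate is not stated in the tweet):

```python
gpus, watts_each, hours = 4, 225, 55            # numbers from the tweet
usd_per_kwh = 0.30                              # assumed electricity rate
energy_kwh = gpus * watts_each / 1000 * hours   # 0.9 kW * 55 h = 49.5 kWh
print(f"{energy_kwh} kWh -> ${energy_kwh * usd_per_kwh:.2f}")  # ~ $14.85
```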

Mike Carroll reposted

I don't like courses. Most were a waste of time. Yes, even at Stanford. If you're new to ML, take CS231N.

your honor i object, i dont know about harvard but stanford literally releases SOTA courses



Mike Carroll reposted

My meeting budget:
5 min - meet someone new
10 min - solve a problem
15 min - identify + solve a problem

Parkinson’s law: work expands so as to fill the time available for its completion.


Mike Carroll reposted

MIT's 6.851: Advanced Data Structures (Spring '21)

courses.csail.mit.edu/6.851/spring21/

This has been on my recommendation list for a while, and the memory hierarchy discussions are great in the context of cache-oblivious algorithms.

"Cache‑Oblivious Algorithms and Data Structures" by Erik D. Demaine erikdemaine.org/papers/BRICS20… This is a foundational survey on designing cache‑oblivious algorithms and data structures that perform as well as cache‑aware approaches that require hardcoding cache size (M) and block…

vivekgalatage's tweet image. "Cache‑Oblivious Algorithms and Data Structures" by Erik D. Demaine

erikdemaine.org/papers/BRICS20…

This is a foundational survey on designing cache‑oblivious algorithms and data structures that perform as well as cache‑aware approaches that require hardcoding cache size (M) and block…
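The survey's core trick is divide and conquer: recurse until the subproblem fits in cache at every level of the hierarchy, without ever naming a cache size M or block size B. A minimal sketch of the classic cache-oblivious matrix transpose (my own illustration, not code from the survey):

```python
import numpy as np

def transpose(a, out, r0, r1, c0, c1):
    """Cache-obliviously transpose a[r0:r1, c0:c1] into out[c0:c1, r0:r1].

    Recursively split the larger dimension; small blocks eventually fit
    in every cache level with no tuning to M or B.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows <= 16 and cols <= 16:       # base case: block is small enough
        out[c0:c1, r0:r1] = a[r0:r1, c0:c1].T
    elif rows >= cols:                  # split the taller dimension
        mid = r0 + rows // 2
        transpose(a, out, r0, mid, c0, c1)
        transpose(a, out, mid, r1, c0, c1)
    else:                               # split the wider dimension
        mid = c0 + cols // 2
        transpose(a, out, r0, r1, c0, mid)
        transpose(a, out, r0, r1, mid, c1)

a = np.arange(12).reshape(3, 4)
out = np.empty((4, 3), dtype=a.dtype)
transpose(a, out, 0, 3, 0, 4)
assert (out == a.T).all()
```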


Mike Carroll reposted

50 LLM Projects with Source Code to Become a Pro

1. Beginner-Level LLM Projects

→ Text Summarizer using OpenAI API
→ Chatbot for Customer Support
→ Sentiment Analysis with GPT Models
→ Resume Optimizer using LLMs
→ Product Description Generator
→ AI-Powered Grammar…
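A minimal sketch of the first beginner item on the list, a text summarizer using the OpenAI API (the model choice and prompt are my own; assumes the openai package is installed and OPENAI_API_KEY is set in the environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize(text: str) -> str:
    """Ask a chat model for a two-sentence summary of the given text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system", "content": "Summarize the user's text in two sentences."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(summarize("Cache-oblivious algorithms use recursion to exploit every "
                "level of the memory hierarchy without knowing cache parameters."))
```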

Mike Carroll reposted

I left my plans for the weekend to read this recent blog from HuggingFace 🤗 on how they maintain the most critical AI library: transformers.

→ 1M lines of Python,
→ 1.3M installations,
→ thousands of contributors,
→ a true engineering masterpiece,

Here's what I learned:…

Mike Carroll reposted

Excited to release new repo: nanochat!
(it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…

Mike Carroll reposted

You're not depressed, you just lost your quest.


Mike Carroll reposted

🔥 Free Google Colab notebooks to implement every Machine Learning algorithm from scratch

Link in comment
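For a taste of what "from scratch" means here (my own minimal example, not taken from the notebooks): linear regression fit by batch gradient descent in plain NumPy.

```python
import numpy as np

# Fit y ≈ w*x + b by gradient descent on the mean squared error.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, 200)   # true w = 3.0, b = 0.5

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = w * x + b - y                 # residuals
    w -= lr * 2 * np.mean(err * x)      # d(MSE)/dw
    b -= lr * 2 * np.mean(err)          # d(MSE)/db

print(f"w={w:.2f}, b={b:.2f}")          # ≈ 3.00, 0.50
```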

Mike Carroll reposted

how i got here:
> i used to be and still tend towards having an obsessive/addictive personality
> put many years of my life into video games
> it was only 2 years ago i started to turn that around because i got other interests and started really looking forward to the future
>…


Mike Carroll reposted

found a repo that has a massive collection of Machine Learning system design case studies used in the real world, from Stripe, Spotify, Netflix, Meta, GitHub, Twitter/X, and much more

link in replies

Mike Carroll reposted

Copy-pasting PyTorch code is fast; using an AI coding model is even faster; but both skip the learning. That's why I asked my students to write by hand ✍️.

🔽 Download: byhand.ai/pytorch

After the exercise, my students can understand what every line really does and…
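For a sense of what working by hand buys you (my own illustration, not material from byhand.ai): nn.Linear is just a matrix multiply plus a bias, and spelling that out makes the shapes and the weight layout explicit.

```python
import torch

x = torch.tensor([[1.0, 2.0]])            # (1, 2) input
W = torch.tensor([[0.5, -1.0],
                  [0.3,  0.2]])            # (2, 2) weights
b = torch.tensor([0.1, 0.0])               # (2,) bias

manual = x @ W + b                         # by hand: matmul plus bias

layer = torch.nn.Linear(2, 2)
with torch.no_grad():
    layer.weight.copy_(W.T)                # nn.Linear stores weight as (out_features, in_features)
    layer.bias.copy_(b)

assert torch.allclose(manual, layer(x))    # same computation, spelled out
```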


Mike Carroll reposted

70 Python Projects with Source Code for Developers

Step 1: Beginner Foundations

→ Hello World Web App
→ Calculator (CLI)
→ To-Do List CLI
→ Number Guessing Game
→ Countdown Timer
→ Dice Roll Simulator
→ Coin Flip Simulator
→ Password Generator
→ Palindrome Checker
→…
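As a taste of the beginner tier, a minimal version of one listed item, the palindrome checker (my own sketch):

```python
def is_palindrome(s: str) -> bool:
    """True if s reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]

assert is_palindrome("A man, a plan, a canal: Panama")
assert not is_palindrome("hello")
```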


Mike Carroll reposted

everything you need to get started in one repo


Mike Carroll reposted

System prompts are getting outdated!

Here's a counterintuitive lesson from building real-world Agents:

Writing giant system prompts doesn't improve an Agent's performance; it often makes it worse.

For example, you add a rule about refund policies. Then one about tone. Then…
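One common alternative the tweet is gesturing at: keep rules in a store and inject only the ones relevant to the current request, instead of growing one monolithic system prompt. A minimal sketch of the idea (the rule topics, texts, and keyword routing below are hypothetical):

```python
# rule store; topics and texts are made-up examples
RULES = {
    "refund": "Refunds are allowed within 30 days with a receipt.",
    "tone": "Be concise and friendly; never promise timelines.",
    "escalation": "Escalate legal threats to a human agent.",
}

def build_system_prompt(user_message: str) -> str:
    """Inject only the rules relevant to this request.

    Naive keyword matching stands in for real retrieval or classification.
    """
    relevant = [text for topic, text in RULES.items()
                if topic in user_message.lower()]
    return "\n".join(["You are a customer-support assistant.", *relevant])

print(build_system_prompt("I want a refund for my order"))
```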

Mike Carroll reposted

NetHack is the best benchmark you've never heard of. I'd say that it makes ALE look like a toy, but well... it is

Introducing Scalable Option Learning (SOL☀️), a blazingly fast hierarchical RL algorithm that makes progress on long-horizon tasks and demonstrates positive scaling trends on the largely unsolved NetHack benchmark, when trained for 30 billion samples. Details, paper and code in >



Mike Carroll reposted

it's insane to me how little attention the llm.q repo has

it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp), quantized LLM training with support for selective AC

it's genuinely the coolest OSS thing I've seen this year (what's crazier is 1 person wrote it!)
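For readers unfamiliar with "selective AC": activation checkpointing recomputes chosen layers' activations during the backward pass instead of storing them, trading compute for memory; "selective" means applying it only to some blocks rather than all of them. llm.q implements this in C/C++/CUDA; below is a minimal PyTorch sketch of the idea, my own illustration:

```python
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    def __init__(self, dim: int, checkpointed: bool):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )
        self.checkpointed = checkpointed

    def forward(self, x):
        if self.checkpointed and self.training:
            # activations inside self.ff are recomputed during backward
            return x + checkpoint(self.ff, x, use_reentrant=False)
        return x + self.ff(x)

# "selective": checkpoint only every other block, not all of them
blocks = torch.nn.Sequential(*[Block(64, checkpointed=(i % 2 == 0)) for i in range(8)])
out = blocks(torch.randn(4, 64))
out.sum().backward()  # less activation memory on checkpointed blocks, extra forward compute
```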

Mike Carroll reposted

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of a biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea…

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training…



As the divide between the super-rich and the rest widens, this strategy becomes increasingly relevant.

