Felix Nguyen

@felixcantcode

generating values for stakeholders, internet plumber, http://mu2mi.com, blog: https://felixng.me. Opinions are my own.

Felix Nguyen reposted

New blog post: The bug that taught me more about PyTorch than years of using it

started with a simple training loss plateau... ended up digging through optimizer states, memory layouts, kernel dispatch, and finally understanding how PyTorch works!

Felix Nguyen reposted

Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other…

Felix Nguyen reposted

one of the often overlooked factors behind deepseek’s success lies in how the team built their distributed infrastructure from the ground up (despite severe gpu constraints!). their custom communication library hfreduce replaced nccl and delivered substantially higher bandwidth…

Felix Nguyen reposted

You know how if you spend the whole day sitting on the couch watching TV, you get kind of restless yet somehow also too tired to get off your butt? Like you're tired *of* doing nothing, yet you're also tired *from* doing nothing? You know what I'm talking about, the state of…


Felix Nguyen reposted

Some patterns on how iterators in Rust play well with the borrow checker - Iterator-friendly APIs, Borrow splitting and reborrowing ..

Iterators prefer borrows
Iterators often yield references to avoid allocation:

Keep using an iterator after borrowing it
Use `by_ref()` to…
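The post is cut off before it shows its own code, so here is a minimal sketch of the two ideas it names: iterating by reference, and using `by_ref()` to keep an iterator alive after an adapter has consumed part of it. `Iterator::by_ref` is a standard library method; the function name `split_head_tail` and the sample data are hypothetical and only for illustration.

```rust
// Hypothetical helper: split a slice of string slices into the first `n`
// items and the rest, using a single iterator for both halves.
fn split_head_tail<'a>(words: &[&'a str], n: usize) -> (Vec<&'a str>, Vec<&'a str>) {
    // `iter()` yields references into the slice, so no items are cloned.
    let mut it = words.iter();

    // `by_ref()` lends the iterator out as `&mut`, so `take(n)` consumes only
    // the temporary adapter and not `it` itself.
    let head: Vec<&str> = it.by_ref().take(n).copied().collect();

    // `it` is still usable here and resumes where `take` stopped.
    let tail: Vec<&str> = it.copied().collect();

    (head, tail)
}

fn main() {
    let words = ["alpha", "beta", "gamma", "delta"];
    let (head, tail) = split_head_tail(&words, 2);
    println!("head = {:?}, tail = {:?}", head, tail);
}
```

Without `by_ref()`, `it.take(n)` would move `it` into the adapter and the second `collect` would not compile.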

Felix Nguyen reposted

I have fine-tuned over 100 different LLMs/VLMs for various use cases over the last 1–2 years, and here is my framework whenever I pick a new project or problem statement: 1. Benchmark/Evals Any problem you are solving for should have an evaluation set that you can easily…


Felix Nguyen reposted

What should I name this plugin?

can this be an extension in vscode, it just freezes your vscode randomly and plays phonk



Felix Nguyen reposted

Last night I taught nanochat d32 how to count 'r' in strawberry (or similar variations). I thought this would be a good/fun example of how to add capabilities to nanochat and I wrote up a full guide here: github.com/karpathy/nanoc… This is done via a new synthetic task…

Felix Nguyen reposted

Of ~200 books I've read, the few that stayed with me over time and I find myself often thinking back to or referring to, in ~random order: All short stories by Ted Chiang, especially Exhalation, Division By Zero, Understand, The Story of Your Life, Liking What You See, The…


me after watching karpathy x dwarkesh

Felix Nguyen reposted

There are two somewhat related myths about neural networks in many intro ML courses that, I think, mislead more than they help. 1) The statement, "neural networks are powerful," is often followed by the citation to universal approximation theorem. 2) Neural networks are often…

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea…



Felix Nguyen reposted

nanochat now has a primordial identity and can talk a bit about itself and its capabilities (e.g. it knows it's nanochat d32 that cost $800, that it was built by me, that it can't speak languages other than English too well and why, etc.). This kind of customization is all done…

I fixed it :) deployed live now. This was done by doing a round of synthetic data generation to collect 1000 multi-turn conversations (given a bunch of information including the readme of the nanochat project), and then mixing that into midtraining and SFT. fun!

Felix Nguyen reposted

In Rust, how the borrow-checker shapes your API inputs and outputs ..

The general design principle is to choose signatures that minimize ownership churn while keeping call sites clean and safe.

Accepting input

• Borrow when you only read: `fn parse(src: &str)`
• Borrow…
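Only the first bullet survives the truncation, so this sketch covers just that one: a function that only reads its input takes `&str`, in the same shape as the `fn parse(src: &str)` signature quoted above. The function `count_words` is a hypothetical stand-in, not something from the post.

```rust
// Hypothetical read-only function: it only inspects `src`, so it borrows
// a `&str` instead of taking ownership of a `String`.
fn count_words(src: &str) -> usize {
    src.split_whitespace().count()
}

fn main() {
    let owned = String::from("borrow when you only read");

    // `&owned` coerces from `&String` to `&str`; `owned` is not moved.
    let n = count_words(&owned);

    // A string literal is already a `&str`, so the same signature accepts it.
    let m = count_words("two words");

    // The caller still owns `owned` after the call.
    println!("{owned:?} has {n} words; the literal has {m}");
}
```

Had the signature been `fn count_words(src: String)`, the first call would need `owned.clone()` or would give up `owned` entirely, which is the ownership churn the post warns about.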

Felix Nguyen reposted

NVIDIA over USB4 on MacBook is ready to try!

* ADT-UT3G dock + any 30/40/50 series GPU
* Disable SIP
* Install driver `extra/usbgpu/tbgpu`
* Install NVK compiler `brew install tinymesa`
* Test with:
`DEBUG=2 NV_NAK=1 NV=1 python3 test/test_tiny.py TestTiny.test_plus`

Felix Nguyen reposted

Sharding. Database sharding is one of the common techniques to scale a database horizontally. You split the db into small parts called shards and distribute them across machines. Shards are typically in the few hundreds or even thousands (for extremely large databases). Usually…
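The post is truncated before it says how keys are assigned to shards, so the following is only a sketch under an assumed scheme: hash the key, take it modulo the shard count, then place shards on machines round-robin. The names `shard_for` and `machine_for` and the counts 256 and 8 are hypothetical. The hasher is the standard library's `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Assumed scheme: map a key to one of `num_shards` logical shards by hashing.
// `DefaultHasher`'s algorithm is not guaranteed stable across Rust releases,
// so a real system would pin an explicit hash function instead.
fn shard_for(key: &str, num_shards: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish() % num_shards
}

// Assumed placement: spread the logical shards across machines round-robin.
fn machine_for(shard: u64, num_machines: u64) -> u64 {
    shard % num_machines
}

fn main() {
    // Hypothetical sizes: a few hundred shards on a handful of machines.
    let (num_shards, num_machines) = (256, 8);
    for key in ["user:1", "user:2", "order:77"] {
        let shard = shard_for(key, num_shards);
        let machine = machine_for(shard, num_machines);
        println!("{key} -> shard {shard} -> machine {machine}");
    }
}
```

Keeping many more shards than machines, as the post describes, means whole shards can be moved between machines without changing the key-to-shard mapping.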


Felix Nguyen reposted

(1/2) i felt like no one actually teaches you a good framework for how to read (ML) papers well + fast, so i wrote this 5-minute read

tldr: because so many papers suck, here's how to go through them quickly and revisit the good ones

Felix Nguyen reposted

I just published the full guide to building forms with the Field component:

- TanStack Form & React Hook Form
- Zod validation and displaying errors
- Practical examples we’ll actually use
- Inputs, Radios, Fieldset, Arrays & more

Check it out. Link below.

