Gary’s Notebook 🧙♂️
@gary_doesnt_lai
Tip for debugging with an LLM: instead of asking it to just stare at the code and guess what's wrong, ask it to write a script that prints out everything you need to diagnose the issue, then give it the output. Suddenly your LLM becomes 10x better at debugging.
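A hypothetical sketch of the kind of diagnostic script you'd ask the LLM to generate (the `describe` helper and the stand-in pipeline output are made up for illustration):

```python
# Hypothetical diagnostic script the LLM might write: dump shapes, dtypes,
# value ranges and NaN counts of an intermediate so the model can reason
# over concrete output instead of guessing from the source code.
import numpy as np

def describe(name, arr):
    arr = np.asarray(arr)
    print(f"--- {name} ---")
    print(f"shape={arr.shape} dtype={arr.dtype}")
    print(f"min={np.nanmin(arr):.4g} max={np.nanmax(arr):.4g} mean={np.nanmean(arr):.4g}")
    print(f"NaNs={np.isnan(arr).sum()} infs={np.isinf(arr).sum()}")
    print(f"first values: {arr.ravel()[:5]}")

if __name__ == "__main__":
    logits = np.random.randn(4, 10).astype(np.float32)  # stand-in for a real intermediate
    logits[2, 3] = np.nan                                # the kind of silent corruption to surface
    describe("logits", logits)
```

Paste the printed output back into the chat and the model is debugging from evidence rather than from reading.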
One consequence of much faster internet may be that LLMs will move more and more to the frontend (browser) — better privacy, UX and no serving cost. WebGPU is already working pretty well and one major bottleneck (other than storage) is speed to download these giant models.
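A rough back-of-the-envelope on that download bottleneck; the model sizes and link speeds below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope download times for in-browser models.
# Sizes and link speeds are illustrative assumptions.
model_sizes_gb = {"3B int4": 1.6, "7B int4": 3.6, "13B int4": 6.8}
link_speeds_mbps = [50, 300, 1000]  # typical home, fast cable, gigabit fiber

for name, size_gb in model_sizes_gb.items():
    for mbps in link_speeds_mbps:
        seconds = size_gb * 8 * 1000 / mbps  # GB -> megabits -> seconds
        print(f"{name}: {size_gb} GB over {mbps} Mbps ≈ {seconds / 60:.1f} min")
```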
Been contemplating the fact that in LLMs, the outlier values are often the most important values. If you remove these outliers, your model performance degrades drastically. It's quite poetic and affirms my belief that "the average is overrated".
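A toy numpy sketch of the underlying observation (not a real LLM): zeroing the few largest-magnitude weights distorts a layer's output far more than zeroing the same number of randomly chosen ones:

```python
# Illustrative only: plant a handful of outlier weights, then compare the
# damage from dropping the top-|w| entries vs. the same number of random entries.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
W[rng.integers(0, 256, 64), rng.integers(0, 256, 64)] *= 20.0  # plant outliers
x = rng.standard_normal((32, 256))
y = x @ W.T

def error_after_dropping(flat_indices):
    W2 = W.copy()
    W2.flat[flat_indices] = 0.0
    return np.abs(x @ W2.T - y).mean()

k = 64
outlier_idx = np.argsort(np.abs(W), axis=None)[-k:]      # largest-magnitude weights
random_idx = rng.choice(W.size, size=k, replace=False)    # same count, chosen at random
print("drop top-|w| entries:", error_after_dropping(outlier_idx))
print("drop random entries :", error_after_dropping(random_idx))
```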
One cool thing from Gemini’s technical report no one is talking about is that it can take audio signals natively (as opposed to converting audio to text). This means it can potentially capture speech tones?
LLMs are to small deep learning models what small deep learning models are to hard-coded logic
Is it possible to solve NLP tasks by simply following instructions that define the tasks? How can we measure the progress? Excited to announce Natural Instructions v2, a collection of 1600+ diverse language tasks and their expert-written instructions! 📜arxiv.org/abs/2204.07705
Everybody wants their models to run faster. However, researchers often cargo-cult performance tricks without a solid understanding of the underlying principles. To address that, I wrote a post called "Making Deep Learning Go Brrrr From First Principles". (1/3) horace.io/brrr_intro.html
Was wondering why these model checkpoint files are so big (~GB). Aren’t they just a bunch of floats (~4 bytes each)? Then realized roberta-large is 355M parameters 🤯🤯🤯
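The arithmetic: 355M fp32 parameters at 4 bytes each is already about 1.4 GB, before optimizer state or any extra tensors in the checkpoint. A minimal back-of-the-envelope:

```python
# Checkpoint size sanity check: parameters x bytes per parameter.
params = 355_000_000        # roberta-large
bytes_per_param = 4         # fp32
print(f"{params * bytes_per_param / 1e9:.2f} GB")  # ~1.42 GB
```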
github.com/tonsky/FiraCode @FiraCode thank you for this font. Immediately noticing a reduction in cognitive load after switching my text editor to it!
We need a dedicated collection of Toy Datasets for Machine Learning: 1. They can be more interesting than real datasets, especially if designed to be hard for certain algorithms. 2. They are more useful for teaching / learning. Maybe @huggingface / @kaggle can help with this?
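For a concrete flavor of "hard for certain algorithms", here is a minimal sketch (illustrative, not a proposed dataset): XOR-style labels that any linear model fails on while trees or small MLPs solve them easily:

```python
# Toy dataset sketch: XOR-like labels by quadrant. Trivially small, yet
# a linear model can't do much better than chance, which is the teaching value.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # label 1 iff both coordinates share a sign

# Fit a linear probe by least squares and check its accuracy.
A = np.c_[X, np.ones(len(X))]
w, _, _, _ = np.linalg.lstsq(A, y * 2 - 1, rcond=None)
pred = (A @ w > 0).astype(int)
print("linear accuracy:", (pred == y).mean())   # ≈ 0.5, i.e. chance level
```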
Got hit-and-run by a white work van on Atlantic and Hellman on 9/24, 7:21pm. Neck is a bit sore but otherwise I'm ok. I'm offering $1k cash or $2k to charity of your choice for first person to send me dash cam footage clearly showing the van's license plate (ends in 5G)
Stanford’s ~entire AI Department has just released a 200-page, 100-author Neural Scaling Laws Manifesto. They're pivoting to positioning themselves as #1 at academic ML scaling (e.g. GPT-3) research. "On the Opportunities and Risks of Foundation Models" arxiv.org/abs/2108.07258