
srk

@fastdaima

problem solver

Pinned

little little goals:
& spending 2 hours practising new things
& spending 2 hours on kaggle every day
& publishing blogs twice a week
& small contributions to open source
& losing 2 kilos per month -> (24 kilos this year)
& finding my optimum sleep cycle (through long experiments)


srk reposted

@Hesamation: this is the most based and simple strategy for engineering excellence from first principles. you only need 3 steps:
> identify your dream job or what you’d love to do 1 year from now.
> search for job postings or look for experts of that field to find which skills you need to…

srk reposted

Building tiny CPUs in the terminal! 🤯
🧬 NanoCore — An 8-bit CPU emulator + assembler + TUI debugger
🔥 Fully minimal 256-byte memory with variable-length opcodes
🦀 Written in Rust & built with @ratatui_rs
⭐ GitHub: github.com/AfaanBilal/Nan…
#rustlang #ratatui #tui #emulator
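For intuition about what an emulator like that has to do, here is a tiny sketch of a fetch-decode-execute loop over 256 bytes of memory with variable-length instructions. The opcodes below are invented for the example; NanoCore's actual ISA lives in the linked repo.

```python
# Minimal illustrative 8-bit CPU loop with 256 bytes of memory and
# variable-length instructions. The opcodes here are made up for the sketch;
# NanoCore's real instruction set is defined in the linked repository.
MEM_SIZE = 256

LOAD_IMM, ADD_IMM, STORE, HALT = 0x01, 0x02, 0x03, 0xFF  # hypothetical opcodes

def run(program):
    mem = bytearray(MEM_SIZE)
    mem[:len(program)] = program
    acc, pc = 0, 0
    while True:
        op = mem[pc]                      # fetch
        if op == LOAD_IMM:                # 2-byte instruction: opcode + immediate
            acc = mem[pc + 1]; pc += 2
        elif op == ADD_IMM:               # 2-byte instruction
            acc = (acc + mem[pc + 1]) & 0xFF; pc += 2
        elif op == STORE:                 # 2-byte instruction: opcode + address
            mem[mem[pc + 1]] = acc; pc += 2
        elif op == HALT:                  # 1-byte instruction
            return mem
        else:
            raise ValueError(f"unknown opcode {op:#04x} at {pc:#04x}")

# LOAD 40; ADD 2; STORE the result at 0x80; HALT
mem = run(bytes([LOAD_IMM, 40, ADD_IMM, 2, STORE, 0x80, HALT]))
print(mem[0x80])  # 42
```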


srk reposted

ML engineering that is essential for backend engineers
> **model serving & inference APIs.** you'll eventually serve a model. understand latency vs throughput tradeoffs. know when to use REST vs gRPC for predictions. batch inference vs real-time. cold start problems are real.
> …

Backend engineering that is essential for ML engineers
> API & service design (REST, gRPC)
> message queues & event systems (kafka / redis)
> databases & caching (postgres / qdrant / memgraph / redis // personal recommendations)
> async programming
> observability
These are…
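A minimal sketch of the first item in the ML-engineering list above (a real-time inference API), assuming FastAPI and a stand-in model; the endpoint path, request shape, and "model" are invented for illustration, and real services layer batching, timeouts, and observability on top of this.

```python
# Minimal real-time REST inference endpoint (illustrative sketch).
import time
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    features: list[float]

def model_predict(features: list[float]) -> float:
    # placeholder "model": sum of features; a real model call goes here
    return sum(features)

@app.post("/predict")
def predict(req: PredictRequest):
    start = time.perf_counter()
    score = model_predict(req.features)
    latency_ms = (time.perf_counter() - start) * 1000
    # per-request (real-time) path: lowest latency, lowest throughput;
    # batch/async serving trades latency for throughput instead
    return {"score": score, "latency_ms": round(latency_ms, 3)}

# run with: uvicorn serve:app --reload   (assuming this file is saved as serve.py)
```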



srk reposted

@jino_rohit: Written in 2019, but still the most detailed blog on learning about PyTorch internals.
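The core idea such internals write-ups keep coming back to is that a tensor is just sizes, strides, and an offset over a flat storage. A quick way to poke at that yourself (standard PyTorch API, nothing specific to the blog):

```python
# PyTorch tensors are views over a flat storage, described by sizes,
# strides, and a storage offset. A non-contiguous view shares the buffer.
import torch

t = torch.arange(12).reshape(3, 4)
view = t[:, 1::2]             # columns 1 and 3: same storage, different strides

print(t.stride())             # (4, 1): step 4 elements per row, 1 per column
print(view.stride())          # (4, 2): same buffer walked with a stride of 2 along columns
print(view.storage_offset())  # 1: the view starts one element into the storage
print(t.data_ptr() == view.data_ptr() - view.element_size())  # True: same buffer, offset by one element
```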

srk reposted

@natolambert: We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.

This is a big…

srk reposted

@deedydas: If you feel like giving up, you must read this never-before-shared story of the creator of PyTorch and ex-VP at Meta, Soumith Chintala.
> from hyderabad public school, but bad at math
> goes to a "tier 2" college in India, VIT in Vellore
> rejected from all 12 universities for…

srk reposted

@asmah2107: Model serving patterns I recommend mastering. Bookmark this.
> Online Serving
> Batch Serving
> Real-Time Inference
> Async Inference
> Model Ensembling
> Multi-Model Routing
> GPU/TPU Offloading
> Auto-Scaling Inference
> Latency Optimization
> Quantized Model Serving
> Model Caching…
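As a toy illustration of two of those patterns together (multi-model routing plus model caching), here is a lazy-loading registry with LRU eviction; the model names, loaders, and routing rule are all made up for the sketch.

```python
# Toy multi-model router with an LRU model cache (illustrative only).
from collections import OrderedDict

class ModelRegistry:
    def __init__(self, loaders, max_cached=2):
        self.loaders = loaders            # name -> zero-arg loader function
        self.cache = OrderedDict()        # name -> loaded model, in LRU order
        self.max_cached = max_cached

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)  # mark as most recently used
            return self.cache[name]
        model = self.loaders[name]()      # cold start happens here
        self.cache[name] = model
        if len(self.cache) > self.max_cached:
            self.cache.popitem(last=False)  # evict the least recently used model
        return model

# stand-in "models": callables that score a payload
registry = ModelRegistry({
    "fraud-v1": lambda: (lambda x: x * 0.1),
    "fraud-v2": lambda: (lambda x: x * 0.2),
    "churn-v1": lambda: (lambda x: x * 0.3),
})

def route(tenant, payload):
    # routing rule is hypothetical: premium tenants get the newer model
    name = "fraud-v2" if tenant == "premium" else "fraud-v1"
    return registry.get(name)(payload)

print(route("premium", 10), route("free", 10))  # 2.0 1.0
```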

srk reposted

Debugging infra at scale is rarely about one big “aha” moment. In our latest engineering blog post, Brian Stack (github.com/imbstack/) recounts his journey through the "Kubernetes hypercube of bad vibes" and how one small flag change led to a significant impact.…


srk reposted

@archiexzzz: built my own vector db from scratch with linear scan, kd_tree, hnsw, ivf indexes just to understand things from first principles.

all the way from:
> recursive BST-style insertion with the split dimension cycling (depth % d)
> hyperplane splits perpendicular to the axis at depth % d
> branch and bound pruning…
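The kd-tree part of that list is small enough to sketch: recursive insertion that cycles the split axis with depth % d, plus nearest-neighbour search with branch-and-bound pruning. (HNSW and IVF are considerably more involved.) This is my own toy version, not the thread author's code.

```python
# kd-tree insertion with the split axis cycling through dimensions (depth % d),
# plus nearest-neighbour search with branch-and-bound pruning. Illustrative only.
class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None

def insert(node, point, depth=0):
    if node is None:
        return Node(point)
    axis = depth % len(point)                 # cycle the splitting axis
    if point[axis] < node.point[axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node

def nearest(node, query, depth=0, best=None):
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if best is None or dist < best[1]:
        best = (node.point, dist)
    axis = depth % len(query)
    near, far = (node.left, node.right) if query[axis] < node.point[axis] else (node.right, node.left)
    best = nearest(near, query, depth + 1, best)
    # branch-and-bound pruning: only descend the far side if the splitting
    # plane is closer than the current best squared distance
    if (query[axis] - node.point[axis]) ** 2 < best[1]:
        best = nearest(far, query, depth + 1, best)
    return best

root = None
for p in [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]:
    root = insert(root, p)
print(nearest(root, (9, 2)))  # ((8, 1), 2)
```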

srk reposted

@Hi_Mrinal: This is my go-to tool for collecting resources across all domains I work in.

here's a simple prompt to fetch some GOATED articles about a certain domain I am currently working in, from Substack and Medium.

srk reposted

@femke_plantinga: BM25 powers billions of searches daily. But 90% of developers can't explain how it actually ranks results.

BM25F is the algorithm that powers keyword search in most modern search engines.

Here's a super simple breakdown of how BM25 works:
• Term Frequency…
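The scoring function behind that breakdown fits in a few lines. A plain-BM25 sketch (not BM25F, which additionally weights document fields), using the common Okapi/Lucene-style IDF:

```python
# Plain BM25 over pre-tokenized documents: term frequency, saturated by k1,
# normalized by document length relative to the average, weighted by IDF.
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                                 # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)  # Okapi/Lucene-style IDF
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [doc.split() for doc in [
    "the cat sat on the mat",
    "dogs and cats living together",
    "a short note about search ranking",
]]
print(bm25_scores("cat mat".split(), docs))  # doc 0 scores highest
```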

srk reposted

if this is you i would take the following very seriously: what worked for me was teaching. you get obsessed with a paper or architecture or whatever it is and have the pain point of not being able to find good resources on it. whether it be that for LLMs 3 years ago, or CUDA one…

@elliotarledge Any suggestions about projects or papers that I should study to give myself a good chance in the ML field? I have intermediate-level knowledge of most ML-related domains.



srk reposted

Don’t overthink AI agents.
> Learn Chain-of-Thought (CoT)
> Learn Tree of Thoughts (ToT)
> Learn ReAct Framework
> Learn Self-Correction / Reflection
> Learn Function Calling & Tool Use
> Learn Planning Algorithms (LLM+P)
> Learn Long-term Memory Architectures
> Learn…
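The ReAct and function-calling items boil down to a short loop: the model proposes an action, the runtime executes the tool and feeds the observation back. A toy version with a scripted stand-in for the LLM, so it runs without any API:

```python
# Toy ReAct-style agent loop. Real agents replace fake_llm with a model call;
# the loop stays the same: parse an Action, run the tool, append the Observation.
import re

def calculator(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))  # toy tool; never eval untrusted input in real code

TOOLS = {"calculator": calculator}

SCRIPTED_REPLIES = [  # what a model might emit, hard-coded for the sketch
    "Thought: I should compute this.\nAction: calculator[17 * 23]",
    "Thought: I have the result.\nFinal Answer: 391",
]

def fake_llm(transcript: str, step: int) -> str:
    return SCRIPTED_REPLIES[step]

def run_agent(question: str, max_steps: int = 4) -> str:
    transcript = f"Question: {question}\n"
    for step in range(max_steps):
        reply = fake_llm(transcript, step)
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        m = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if m:
            tool, arg = m.group(1), m.group(2)
            observation = TOOLS[tool](arg)          # execute the tool call
            transcript += f"Observation: {observation}\n"
    return "no answer within step budget"

print(run_agent("What is 17 * 23?"))  # 391
```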


srk reposted

@eatonphil: My favorite technical blogs

srk reposted

- build an autograd engine from scratch
- write a mini-GPT from scratch
- implement LoRA and fine-tune a model on real data
- hate CUDA at least once
- cry
- keep going

the roadmap - 5 phases
- if you already know something? skip
- if you're lost? rewatch
- if you’re stuck? use…
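The first bullet (an autograd engine from scratch) is smaller than it sounds. A minimal scalar reverse-mode sketch in the micrograd spirit, supporting only addition and multiplication:

```python
# Tiny scalar reverse-mode autodiff: build a graph of Values, then walk it
# backwards in topological order accumulating gradients.
class Value:
    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        order, seen = [], set()
        def visit(v):                     # topological sort of the graph
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x          # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```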


srk reposted

i built a simple tool that makes Claude Code work with any local LLM

full demo:
> vLLM serving GLM-4.5 Air on 4x RTX 3090s
> Claude Code generating code + docs via my proxy
> 1 Python file + .env handles all requests
> nvtop showing live GPU load
> how it all works

Buy a GPU
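Not the author's tool, but the general shape of such a proxy is roughly this: accept Anthropic-style /v1/messages requests from Claude Code and forward them to an OpenAI-compatible endpoint like the one vLLM exposes. The sketch below handles plain text only (no streaming, tool use, or token accounting), and the upstream URL and model name are placeholders read from the environment.

```python
# Rough sketch of an Anthropic-to-OpenAI-compatible proxy (not the author's code).
# Claude Code speaks the Anthropic Messages API; vLLM exposes an OpenAI-compatible
# /v1/chat/completions endpoint. Plain-text requests only, no streaming or tools.
import os
import httpx
from fastapi import FastAPI, Request

app = FastAPI()
UPSTREAM = os.getenv("UPSTREAM_URL", "http://localhost:8000/v1/chat/completions")
MODEL = os.getenv("UPSTREAM_MODEL", "local-model")   # placeholder model name

@app.post("/v1/messages")
async def messages(request: Request):
    body = await request.json()
    msgs = body.get("messages", [])
    if body.get("system"):                            # Anthropic keeps the system prompt at top level
        msgs = [{"role": "system", "content": body["system"]}] + msgs
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(UPSTREAM, json={
            "model": MODEL,
            "messages": msgs,
            "max_tokens": body.get("max_tokens", 1024),
        })
    text = resp.json()["choices"][0]["message"]["content"]
    # translate back into a minimal Anthropic-style response body
    return {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": text}],
        "stop_reason": "end_turn",
    }
```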


srk reposted

Retrieval techniques I’d learn if I wanted to build RAG systems. Bookmark this.
1. BM25
2. Dense Retrieval
3. ColBERT
4. DPR (Dense Passage Retrieval)
5. ANN Indexes (FAISS, HNSW)
6. Vector Quantization
7. Re-ranking (Cross-Encoder)
8. Late Interaction Models
9. Embedding Normalization…
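A sketch of items 2 and 9 together (dense retrieval over L2-normalized embeddings, scored by dot product); random vectors stand in for a real embedding model:

```python
# Dense retrieval sketch: embed, L2-normalize, score by cosine (dot product).
# The "embeddings" here are random stand-ins for a real embedding model.
import numpy as np

rng = np.random.default_rng(0)
corpus = ["bm25 keyword scoring", "dense passage retrieval", "colbert late interaction"]
doc_emb = rng.normal(size=(len(corpus), 64))
doc_emb /= np.linalg.norm(doc_emb, axis=1, keepdims=True)   # embedding normalization

def search(query_emb, k=2):
    q = query_emb / np.linalg.norm(query_emb)
    scores = doc_emb @ q                                     # cosine similarity after normalization
    top = np.argsort(-scores)[:k]
    return [(corpus[i], float(scores[i])) for i in top]

# pretend the query embeds close to document 1 by nudging its vector
query_emb = doc_emb[1] + 0.1 * rng.normal(size=64)
print(search(query_emb))
```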


srk reposted

@asmah2107: Inference optimizations I’d study if I wanted sub-second LLM responses. Bookmark this.
1. KV-Caching
2. Speculative Decoding
3. FlashAttention
4. PagedAttention
5. Batch Inference
6. Early Exit Decoding
7. Parallel Decoding
8. Mixed Precision Inference
9. Quantized Kernels
10. Tensor…
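For the first item, the essence of KV-caching is that each decode step computes q/k/v only for the newest token and attends over the cached keys/values from earlier steps, instead of recomputing them. A single-head NumPy toy:

```python
# Toy single-head KV cache: each decode step computes q/k/v for ONE new token,
# appends k/v to the cache, and attends over everything cached so far.
import numpy as np

d = 16
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

k_cache = np.zeros((0, d))
v_cache = np.zeros((0, d))

def decode_step(x):                     # x: (d,) hidden state of the newest token
    global k_cache, v_cache
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    k_cache = np.vstack([k_cache, k])   # grow the cache instead of recomputing the past
    v_cache = np.vstack([v_cache, v])
    scores = k_cache @ q / np.sqrt(d)   # attend over all cached positions
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v_cache            # (d,) attention output for the new token

for t in range(5):
    out = decode_step(rng.normal(size=d))
print(out.shape, k_cache.shape)         # (16,) (5, 16)
```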

srk reposted

You're in an ML Engineer interview at Databricks. The interviewer asks: "Your production chatbot's accuracy was 95% at launch. Six weeks later, user complaints are up and evals show 80%. What do you do?" You reply: "The model is wrong, we need to retrain it." Game over…


srk reposted

Google's former CEO Eric Schmidt shares the weekend habit that led to a billion-dollar decision.

