
srk

@fastdaima

problem solver

Pinned

little little goals:
& spending 2 hours practising new things
& spending 2 hours on kaggle every day
& publishing blogs twice a week
& small contributions to open source
& losing 2 kilos per month -> (24 kilos this year)
& finding my optimum sleep cycle (through long experiments)


srk reposted

@Hesamation: this is the most based and simple strategy for engineering excellence from first principles. you only need 3 steps:
> identify your dream job or what you’d love to do 1 year from now.
> search for job postings or look for experts of that field to find which skills you need to…

srk reposted

Building tiny CPUs in the terminal! 🤯
🧬 NanoCore — An 8-bit CPU emulator + assembler + TUI debugger
🔥 Fully minimal 256-byte memory with variable-length opcodes
🦀 Written in Rust & built with @ratatui_rs
⭐ GitHub: github.com/AfaanBilal/Nan…
#rustlang #ratatui #tui #emulator
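For intuition about what an emulator like that has to do, here is a tiny sketch of a fetch-decode-execute loop over 256 bytes of memory with variable-length instructions. The opcodes below are invented for the example; NanoCore's actual ISA lives in the linked repo.

```python
# Minimal illustrative 8-bit CPU loop with 256 bytes of memory and
# variable-length instructions. The opcodes here are made up for the sketch;
# NanoCore's real instruction set is defined in the linked repository.
MEM_SIZE = 256

LOAD_IMM, ADD_IMM, STORE, HALT = 0x01, 0x02, 0x03, 0xFF  # hypothetical opcodes

def run(program):
    mem = bytearray(MEM_SIZE)
    mem[:len(program)] = program
    acc, pc = 0, 0
    while True:
        op = mem[pc]                      # fetch
        if op == LOAD_IMM:                # 2-byte instruction: opcode + immediate
            acc = mem[pc + 1]; pc += 2
        elif op == ADD_IMM:               # 2-byte instruction
            acc = (acc + mem[pc + 1]) & 0xFF; pc += 2
        elif op == STORE:                 # 2-byte instruction: opcode + address
            mem[mem[pc + 1]] = acc; pc += 2
        elif op == HALT:                  # 1-byte instruction
            return mem
        else:
            raise ValueError(f"unknown opcode {op:#04x} at {pc:#04x}")

# LOAD 40; ADD 2; STORE the result at 0x80; HALT
mem = run(bytes([LOAD_IMM, 40, ADD_IMM, 2, STORE, 0x80, HALT]))
print(mem[0x80])  # 42
```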


srk reposted

ML engineering that is essential for backend engineers
> **model serving & inference APIs.** you'll eventually serve a model. understand latency vs throughput tradeoffs. know when to use REST vs gRPC for predictions. batch inference vs real-time. cold start problems are real.
> …

Backend engineering that is essential for ML engineers
> API & service design (REST, gRPC)
> message queues & event systems (kafka / redis)
> databases & caching (postgres / qdrant / memgraph / redis // personal recommendations)
> async programming
> observability
These are…
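A minimal sketch of the first item in the ML-engineering list above (a real-time inference API), assuming FastAPI and a stand-in model; the endpoint path, request shape, and "model" are invented for illustration, and real services layer batching, timeouts, and observability on top of this.

```python
# Minimal real-time REST inference endpoint (illustrative sketch).
import time
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    features: list[float]

def model_predict(features: list[float]) -> float:
    # placeholder "model": sum of features; a real model call goes here
    return sum(features)

@app.post("/predict")
def predict(req: PredictRequest):
    start = time.perf_counter()
    score = model_predict(req.features)
    latency_ms = (time.perf_counter() - start) * 1000
    # per-request (real-time) path: lowest latency, lowest throughput;
    # batch/async serving trades latency for throughput instead
    return {"score": score, "latency_ms": round(latency_ms, 3)}

# run with: uvicorn serve:app --reload   (assuming this file is saved as serve.py)
```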



srk reposted

@jino_rohit: Written in 2019, but still the most detailed blog on learning about PyTorch internals.
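The core idea such internals write-ups keep coming back to is that a tensor is just sizes, strides, and an offset over a flat storage. A quick way to poke at that yourself (standard PyTorch API, nothing specific to the blog):

```python
# PyTorch tensors are views over a flat storage, described by sizes,
# strides, and a storage offset. A non-contiguous view shares the buffer.
import torch

t = torch.arange(12).reshape(3, 4)
view = t[:, 1::2]             # columns 1 and 3: same storage, different strides

print(t.stride())             # (4, 1): step 4 elements per row, 1 per column
print(view.stride())          # (4, 2): same buffer walked with a stride of 2 along columns
print(view.storage_offset())  # 1: the view starts one element into the storage
print(t.data_ptr() == view.data_ptr() - view.element_size())  # True: same buffer, offset by one element
```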

srk reposted

@natolambert: We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.

This is a big…

srk reposted

@deedydas: If you feel like giving up, you must read this never-before-shared story of the creator of PyTorch and ex-VP at Meta, Soumith Chintala.
> from hyderabad public school, but bad at math
> goes to a "tier 2" college in India, VIT in Vellore
> rejected from all 12 universities for…

srk reposted

@asmah2107: Model serving patterns I recommend mastering. Bookmark this.
> Online Serving
> Batch Serving
> Real-Time Inference
> Async Inference
> Model Ensembling
> Multi-Model Routing
> GPU/TPU Offloading
> Auto-Scaling Inference
> Latency Optimization
> Quantized Model Serving
> Model Caching…
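As a toy illustration of two of those patterns together (multi-model routing plus model caching), here is a lazy-loading registry with LRU eviction; the model names, loaders, and routing rule are all made up for the sketch.

```python
# Toy multi-model router with an LRU model cache (illustrative only).
from collections import OrderedDict

class ModelRegistry:
    def __init__(self, loaders, max_cached=2):
        self.loaders = loaders            # name -> zero-arg loader function
        self.cache = OrderedDict()        # name -> loaded model, in LRU order
        self.max_cached = max_cached

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)  # mark as most recently used
            return self.cache[name]
        model = self.loaders[name]()      # cold start happens here
        self.cache[name] = model
        if len(self.cache) > self.max_cached:
            self.cache.popitem(last=False)  # evict the least recently used model
        return model

# stand-in "models": callables that score a payload
registry = ModelRegistry({
    "fraud-v1": lambda: (lambda x: x * 0.1),
    "fraud-v2": lambda: (lambda x: x * 0.2),
    "churn-v1": lambda: (lambda x: x * 0.3),
})

def route(tenant, payload):
    # routing rule is hypothetical: premium tenants get the newer model
    name = "fraud-v2" if tenant == "premium" else "fraud-v1"
    return registry.get(name)(payload)

print(route("premium", 10), route("free", 10))  # 2.0 1.0
```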

srk reposted

Debugging infra at scale is rarely about one big “aha” moment. In our latest engineering blog post, Brian Stack (github.com/imbstack/) recounts his journey through the "Kubernetes hypercube of bad vibes" and how one small flag change led to a significant impact.…


srk reposted

@archiexzzz: built my own vector db from scratch with linear scan, kd_tree, hnsw, ivf indexes just to understand things from first principles.

all the way from:
> recursive BST-style insertion with the split dimension cycling (depth % d)
> hyperplane splits perpendicular to the axis at depth % d
> branch and bound pruning…
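The kd-tree part of that list is small enough to sketch: recursive insertion that cycles the split axis with depth % d, plus nearest-neighbour search with branch-and-bound pruning. (HNSW and IVF are considerably more involved.) This is my own toy version, not the thread author's code.

```python
# kd-tree insertion with the split axis cycling through dimensions (depth % d),
# plus nearest-neighbour search with branch-and-bound pruning. Illustrative only.
class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None

def insert(node, point, depth=0):
    if node is None:
        return Node(point)
    axis = depth % len(point)                 # cycle the splitting axis
    if point[axis] < node.point[axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node

def nearest(node, query, depth=0, best=None):
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if best is None or dist < best[1]:
        best = (node.point, dist)
    axis = depth % len(query)
    near, far = (node.left, node.right) if query[axis] < node.point[axis] else (node.right, node.left)
    best = nearest(near, query, depth + 1, best)
    # branch-and-bound pruning: only descend the far side if the splitting
    # plane is closer than the current best squared distance
    if (query[axis] - node.point[axis]) ** 2 < best[1]:
        best = nearest(far, query, depth + 1, best)
    return best

root = None
for p in [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]:
    root = insert(root, p)
print(nearest(root, (9, 2)))  # ((8, 1), 2)
```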

srk reposted

@Hi_Mrinal: This is my go-to tool for collecting resources across all domains I work in.

here's a simple prompt to fetch some GOATED articles about a certain domain I am currently working in, from Substack and Medium.

srk reposted

@femke_plantinga: BM25 powers billions of searches daily. But 90% of developers can't explain how it actually ranks results.

BM25F is the algorithm that powers keyword search in most modern search engines.

Here's a super simple breakdown of how BM25 works:
• Term Frequency…
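The scoring function behind that breakdown fits in a few lines. A plain-BM25 sketch (not BM25F, which additionally weights document fields), using the common Okapi/Lucene-style IDF:

```python
# Plain BM25 over pre-tokenized documents: term frequency, saturated by k1,
# normalized by document length relative to the average, weighted by IDF.
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                                 # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)  # Okapi/Lucene-style IDF
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [doc.split() for doc in [
    "the cat sat on the mat",
    "dogs and cats living together",
    "a short note about search ranking",
]]
print(bm25_scores("cat mat".split(), docs))  # doc 0 scores highest
```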

srk reposted

if this is you i would take the following very seriously: what worked for me was teaching. you get obsessed with a paper or architecture or whatever it is and have the pain point of not being able to find good resources on it. whether it be that for LLMs 3 years ago, or CUDA one…

@elliotarledge Any suggestions about projects or papers that I should study to give myself a good chance in the ML field? I have intermediate-level knowledge of most ML-related domains.



srk reposted

Don’t overthink AI agents.
> Learn Chain-of-Thought (CoT)
> Learn Tree of Thoughts (ToT)
> Learn ReAct Framework
> Learn Self-Correction / Reflection
> Learn Function Calling & Tool Use
> Learn Planning Algorithms (LLM+P)
> Learn Long-term Memory Architectures
> Learn…
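The ReAct and function-calling items boil down to a short loop: the model proposes an action, the runtime executes the tool and feeds the observation back. A toy version with a scripted stand-in for the LLM, so it runs without any API:

```python
# Toy ReAct-style agent loop. Real agents replace fake_llm with a model call;
# the loop stays the same: parse an Action, run the tool, append the Observation.
import re

def calculator(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))  # toy tool; never eval untrusted input in real code

TOOLS = {"calculator": calculator}

SCRIPTED_REPLIES = [  # what a model might emit, hard-coded for the sketch
    "Thought: I should compute this.\nAction: calculator[17 * 23]",
    "Thought: I have the result.\nFinal Answer: 391",
]

def fake_llm(transcript: str, step: int) -> str:
    return SCRIPTED_REPLIES[step]

def run_agent(question: str, max_steps: int = 4) -> str:
    transcript = f"Question: {question}\n"
    for step in range(max_steps):
        reply = fake_llm(transcript, step)
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        m = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if m:
            tool, arg = m.group(1), m.group(2)
            observation = TOOLS[tool](arg)          # execute the tool call
            transcript += f"Observation: {observation}\n"
    return "no answer within step budget"

print(run_agent("What is 17 * 23?"))  # 391
```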


srk reposted

@eatonphil: My favorite technical blogs

srk reposted

- build an autograd engine from scratch
- write a mini-GPT from scratch
- implement LoRA and fine-tune a model on real data
- hate CUDA at least once
- cry
- keep going

the roadmap - 5 phases
- if you already know something? skip
- if you're lost? rewatch
- if you’re stuck? use…
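The first bullet (an autograd engine from scratch) is smaller than it sounds. A minimal scalar reverse-mode sketch in the micrograd spirit, supporting only addition and multiplication:

```python
# Tiny scalar reverse-mode autodiff: build a graph of Values, then walk it
# backwards in topological order accumulating gradients.
class Value:
    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        order, seen = [], set()
        def visit(v):                     # topological sort of the graph
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x          # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```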


srk reposted

i built a simple tool that makes Claude Code work with any local LLM

full demo:
> vLLM serving GLM-4.5 Air on 4x RTX 3090s
> Claude Code generating code + docs via my proxy
> 1 Python file + .env handles all requests
> nvtop showing live GPU load
> how it all works

Buy a GPU
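Not the author's tool, but the general shape of such a proxy is roughly this: accept Anthropic-style /v1/messages requests from Claude Code and forward them to an OpenAI-compatible endpoint like the one vLLM exposes. The sketch below handles plain text only (no streaming, tool use, or token accounting), and the upstream URL and model name are placeholders read from the environment.

```python
# Rough sketch of an Anthropic-to-OpenAI-compatible proxy (not the author's code).
# Claude Code speaks the Anthropic Messages API; vLLM exposes an OpenAI-compatible
# /v1/chat/completions endpoint. Plain-text requests only, no streaming or tools.
import os
import httpx
from fastapi import FastAPI, Request

app = FastAPI()
UPSTREAM = os.getenv("UPSTREAM_URL", "http://localhost:8000/v1/chat/completions")
MODEL = os.getenv("UPSTREAM_MODEL", "local-model")   # placeholder model name

@app.post("/v1/messages")
async def messages(request: Request):
    body = await request.json()
    msgs = body.get("messages", [])
    if body.get("system"):                            # Anthropic keeps the system prompt at top level
        msgs = [{"role": "system", "content": body["system"]}] + msgs
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(UPSTREAM, json={
            "model": MODEL,
            "messages": msgs,
            "max_tokens": body.get("max_tokens", 1024),
        })
    text = resp.json()["choices"][0]["message"]["content"]
    # translate back into a minimal Anthropic-style response body
    return {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": text}],
        "stop_reason": "end_turn",
    }
```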


srk reposted

Retrieval techniques I’d learn if I wanted to build RAG systems. Bookmark this.
1. BM25
2. Dense Retrieval
3. ColBERT
4. DPR (Dense Passage Retrieval)
5. ANN Indexes (FAISS, HNSW)
6. Vector Quantization
7. Re-ranking (Cross-Encoder)
8. Late Interaction Models
9. Embedding Normalization…
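A sketch of items 2 and 9 together (dense retrieval over L2-normalized embeddings, scored by dot product); random vectors stand in for a real embedding model:

```python
# Dense retrieval sketch: embed, L2-normalize, score by cosine (dot product).
# The "embeddings" here are random stand-ins for a real embedding model.
import numpy as np

rng = np.random.default_rng(0)
corpus = ["bm25 keyword scoring", "dense passage retrieval", "colbert late interaction"]
doc_emb = rng.normal(size=(len(corpus), 64))
doc_emb /= np.linalg.norm(doc_emb, axis=1, keepdims=True)   # embedding normalization

def search(query_emb, k=2):
    q = query_emb / np.linalg.norm(query_emb)
    scores = doc_emb @ q                                     # cosine similarity after normalization
    top = np.argsort(-scores)[:k]
    return [(corpus[i], float(scores[i])) for i in top]

# pretend the query embeds close to document 1 by nudging its vector
query_emb = doc_emb[1] + 0.1 * rng.normal(size=64)
print(search(query_emb))
```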


srk reposted

@asmah2107: Inference optimizations I’d study if I wanted sub-second LLM responses. Bookmark this.
1. KV-Caching
2. Speculative Decoding
3. FlashAttention
4. PagedAttention
5. Batch Inference
6. Early Exit Decoding
7. Parallel Decoding
8. Mixed Precision Inference
9. Quantized Kernels
10. Tensor…
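For the first item, the essence of KV-caching is that each decode step computes q/k/v only for the newest token and attends over the cached keys/values from earlier steps, instead of recomputing them. A single-head NumPy toy:

```python
# Toy single-head KV cache: each decode step computes q/k/v for ONE new token,
# appends k/v to the cache, and attends over everything cached so far.
import numpy as np

d = 16
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

k_cache = np.zeros((0, d))
v_cache = np.zeros((0, d))

def decode_step(x):                     # x: (d,) hidden state of the newest token
    global k_cache, v_cache
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    k_cache = np.vstack([k_cache, k])   # grow the cache instead of recomputing the past
    v_cache = np.vstack([v_cache, v])
    scores = k_cache @ q / np.sqrt(d)   # attend over all cached positions
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v_cache            # (d,) attention output for the new token

for t in range(5):
    out = decode_step(rng.normal(size=d))
print(out.shape, k_cache.shape)         # (16,) (5, 16)
```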

srk reposted

You're in an ML Engineer interview at Databricks. The interviewer asks: "Your production chatbot's accuracy was 95% at launch. Six weeks later, user complaints are up and evals show 80%. What do you do?" You reply: "The model is wrong, we need to retrain it." Game over…


srk reposted

Google's former CEO Eric Schmidt shares the weekend habit that led to a billion-dollar decision.

