sankalp

@dejavucoder

llm and shitposting into crafting ai products and evals dm open to talk on ai engg/post-training/llm stuff

sankalp.bearblog.dev/blog/

Joined October 2021

41K Posts 17K Followers 601 Following

Pinned

sankalp

@dejavucoder

Jul 17

you can read my latest blogpost: my experience with claude code after 2 weeks of adventure now

- some lore why i started using it
- its several features
- my current workflow
- must know commands

almost like a beginner guide or log if you will

sankalp.bearblog.dev/my-claude-code…

sankalp reposted

tokenbender

@tokenbender

2 h

one of the most interesting evals or attempts to test shape rotating powers of a language model i have seen in a while. so simple yet challenging the models in latent space.

krishna

@OccupyingM

3 h

can your llm rotate a shape inside its head?

i found out yes but it's a fucking idiot when it comes to the upper layer...

why? non uniform spatial reasoning....

here's an eval to test the internal latent reasoning of your models.

sankalp

@dejavucoder

3 h

love karpathy sensei's koding koding koding energy

Andrej Karpathy

@karpathy

5 h

Excited to release new repo: nanochat!
(it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…

sankalp

@dejavucoder

5 h

read this thread a year ago but i didn't fully understand it until today lol. cursor codebase indexing is clever in the sense they separate the semantic search (cloud) from actual code access (done locally). they store your codebase embeddings on the cloud with reference metadata. they…

Aman Sanger

@amanrsanger

Jan 24, 2024

An underrated part of Cursor is our codebase indexing system. It provides efficient indexing/updating without storing any code on our servers. (1/9)

sankalp reposted

Chinmay Kak

@ChinmayKak

7 h

Introducing nanosft, a clean single file implementation of finetuning for chat style model. Loads gpt2-124M weights on nanogpt and does supervised finetuning using just pytorch.
a side project that I made recently for some prep. link below :)
qts/rts appreciated

sankalp

@dejavucoder

10 h

timeline cleanse

sankalp

@dejavucoder

12 h

the 'you're absolutely right' problem is similar or worse with sonnet 4.5, and more than an annoyance, i find it hard to trust the output when the model says so, especially for more subjective tasks

sankalp

@dejavucoder

13 h

asked a difficult technical question to a friend who got claude merch recently and he said wait a second... let me put on my thinking cap first

sankalp

@dejavucoder

Oct 12

cursor for x - aimed for power users / ai augmentor
lovable for x - aimed for vibe coder / non technical people
this is just how i interpret it

sankalp

@dejavucoder

Oct 12

looks interesting

机器之心 JIQIZHIXIN

@jiqizhixin

Oct 12

Ever wondered how LLMs evolve from predicting the next token to following your instructions?

Post-training 101: A hitchhiker's guide into LLM post-training

This new guide breaks down the basics of LLM post-training, covering the full journey from pre-training to…

sankalp

@dejavucoder

Oct 12

brilliant thread on regrowing after falling off or like just gaining more momentum/traction after being inactive here for some time

jia

@jia_seed

Oct 8

so, you might have had a period of traction. suddenly, none of your posts are hitting for weeks

i’ve experienced this, but i figured out how to bounce back

here’s how to regrow on twitter after falling off (thread)