You might like
does anyone actually use LinkedIn, or do we all just log in once a month to accept random connection requests from strangers?
open source gives ideas. closed source takes them, scales them, hides them. fair game or just pure cheating?
highlight your notes in a quick and easy way. credit: @adivekar_ -> green - quick read -> yellow - read slowly, important -> red - read, think and understand

yeah, now this makes sense.
winter arc #2 (9hrs): -> read deepseek-math paper and grpo -> finished first 8 chapters of rlhfbook by @natolambert -> read @kipperrii transformer inference arithmetic -> watched @elliotarledge vid on cublas and cublasLt -> wrote sgemm & hgemm in cublas -> @karpathy nanochat

do the kl penalty and grad norm ultimately have the same effect on the grpo loss? if so, then why can't we just use grad norm instead of the kl penalty term?
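for reference, a sketch (from memory, so worth checking against the DeepSeek-Math paper) of the GRPO objective with the KL term, where $\varepsilon$ is the clipping range and $\beta$ weights the KL penalty:

$$
\mathcal{J}_{\mathrm{GRPO}}(\theta)=\mathbb{E}\left[\frac{1}{G}\sum_{i=1}^{G}\frac{1}{|o_i|}\sum_{t=1}^{|o_i|}\left(\min\left(r_{i,t}(\theta)\,\hat{A}_{i,t},\ \operatorname{clip}\!\left(r_{i,t}(\theta),\,1-\varepsilon,\,1+\varepsilon\right)\hat{A}_{i,t}\right)-\beta\,\mathbb{D}_{\mathrm{KL}}\!\left[\pi_\theta\,\|\,\pi_{\mathrm{ref}}\right]\right)\right]
$$

$$
r_{i,t}(\theta)=\frac{\pi_\theta(o_{i,t}\mid q,o_{i,<t})}{\pi_{\theta_{\mathrm{old}}}(o_{i,t}\mid q,o_{i,<t})},\qquad \hat{A}_{i,t}=\frac{r_i-\operatorname{mean}(\{r_1,\dots,r_G\})}{\operatorname{std}(\{r_1,\dots,r_G\})}
$$

my current understanding of the question above: the $\beta\,\mathbb{D}_{\mathrm{KL}}$ term is part of the loss itself and pulls $\pi_\theta$ back toward $\pi_{\mathrm{ref}}$, so it changes the direction of the update; grad-norm clipping happens after backprop and only rescales the magnitude of the gradient, so the two are not interchangeable.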
notes on deepseek-math paper - deepseek-math-base -> pretrained model on code and math data - deepseekmath-instruct 7B -> sft using CoT, PoT and tool-integrated reasoning - deepseekmath-rl -> grpo on gsm8k and math questions - rl is increasing the prob of the correct response
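to make the "rl is increasing the prob of the correct response" point concrete, here is a tiny numpy sketch (my own toy example, not code from the paper) of the group-relative advantage GRPO uses: rewards within a sampled group are normalized to zero mean and unit std, so correct answers get a positive weight and wrong ones a negative weight on their token log-probs.

import numpy as np

def group_relative_advantages(rewards, eps=1e-6):
    # normalize rewards within one group of G sampled responses;
    # correct answers (reward 1) get a positive advantage, wrong ones
    # (reward 0) a negative one, so the policy-gradient step raises
    # the probability of the former and lowers the latter.
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# toy group of 4 sampled answers to one question: 1 = correct, 0 = wrong
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
# -> roughly [ 1., -1., -1.,  1.]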

winter arc #1 (9.5hrs): -> read context & sequence parallel -> failed to impl ring attn -> binge watched @willccbb vids on yt. -> went deep into deepseek-r1 and watched some vids. -> posted a tweet on a gpt2 impl in triton and got a like from karpathy -> overall not a bad day!!
i love how @dwarkesh_sp is trying to convince richard sutton that next token prediction is kinda like rl
just saw @elliotarledge's latest yt vid. man, that was so deep and thoughtful, the way you spoke about your highs and lows. just wanted to say you're an absolute inspiration man. good things will definitely happen soon brother!! keep inspiring us with your work and time lapses!!
why chatgpt is better than google -> compression: quick answers + stores a lot of information. the compression ratio is very good. -> context: able to identify your problems/questions which are not on the internet and answer them specifically.
man these llms are so good without any context. i wonder what happens if we give the right context to these llms
this is not what i expected humans vs robots to be
sam altman has a way of answering the question without actually answering the question when he's on a podcast