
neural_morty

@neural_morty

trying to impress rick

I don't know shit dawg.


neural_morty reposted

the whole idea of NVIDIA DLSS can be applied to WebRTC, essentially using VFI to fill in the gaps of jitter you experience due to packet loss, but the model will need a lot of computation and it has to be quick as fuck, the gap between detecting the last ack bit and generating ....
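The idea above, roughly: when a frame is lost in transit, synthesize it from the frames around it instead of stuttering. A minimal sketch, assuming a simple per-sequence-number frame buffer; the linear blend is a stand-in for a learned VFI model (a real DLSS-style network would predict motion rather than average pixels):

```python
import numpy as np

def interpolate_frame(prev_frame: np.ndarray, next_frame: np.ndarray,
                      t: float = 0.5) -> np.ndarray:
    """Stand-in for a learned VFI model: linear blend between the last
    decoded frame and the next one that arrived after the loss burst."""
    blended = (1.0 - t) * prev_frame.astype(np.float32) \
              + t * next_frame.astype(np.float32)
    return blended.astype(prev_frame.dtype)

def conceal_gap(frames: dict[int, np.ndarray], seq: int) -> np.ndarray:
    """If frame `seq` was lost, synthesize it from its neighbours.
    Assumes both neighbours were received (single-frame loss)."""
    if seq in frames:
        return frames[seq]
    return interpolate_frame(frames[seq - 1], frames[seq + 1])
```

The hard part the tweet gestures at is latency: this has to finish inside one frame interval, which is exactly why a heavy model in that gap is the bottleneck.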


What tf is he onto.

I joined xAI to manifest AI’s acceleration of scientific discovery. With the release of Grok 4, I now feel we have the foundation upon which we can Understand the Universe. The techniques that brought you Grok 4 will compound the next scaling paradigms, and humanity’s…



neural_morty reposted

Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%. This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA.


It's just stitching shit up and then optimizing, then realising a week later that it could've been done so much better.


Yet again working with hybrid models.


So far I have been asked to clone designmumbai.com. I will be using React; the website was made on some CMS and I have been asked to completely shift that to Node. Will be using Redis and Kafka for caching and low-latency operations and distribution respectively.

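The Redis side of that stack usually means a cache-aside read path in front of the Node backend. A minimal sketch under assumptions: `fetch` stands in for the origin call (e.g. the migrated Node/CMS endpoint), and the cache client is injected so the example runs without a Redis server — any client exposing `get`/`setex` (such as redis-py's `redis.Redis`) fits the same shape:

```python
import json

def cache_aside(cache, fetch, key: str, ttl: int = 60):
    """Cache-aside read: try the cache first, fall back to the origin,
    then populate the cache with a TTL so page data expires.
    `cache` needs get(key) -> str|None and setex(key, ttl, value)."""
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)        # cache hit: skip the origin
    value = fetch(key)                # cache miss: hit the Node backend
    cache.setex(key, ttl, json.dumps(value))
    return value
```

Kafka then sits on the write/distribution side (publishing change events so consumers invalidate or rebuild cached entries), which keeps the read path above simple.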

i am high, i see gradients on the perfectly round butts


neural_morty reposted

ablations underscore the core idea: it's not about GRUs. in appendix D, we swap the GRU for a TCN - same [CLS]/sequence split, same top-1 gate. performance holds. routing adapts. hecto is a modular scaffold. the expert is a choice. a complete plug and play module.
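The routing mechanism described there — a gate scores the experts, top-1 picks one, and the expert itself is swappable — can be sketched independently of any particular expert type. This is an illustrative numpy version, not hecto's actual code; the expert callables stand in for GRU/TCN/FFNN modules:

```python
import numpy as np

def top1_route(x: np.ndarray, gate_w: np.ndarray, experts: list):
    """Hard top-1 mixture-of-experts routing.

    x:       (batch, d) input representations (e.g. a [CLS] pooling)
    gate_w:  (d, n_experts) linear gate weights
    experts: list of callables mapping (m, d) -> (m, d); the expert
             type is the plug-in, the scaffold stays the same.
    """
    scores = x @ gate_w                # gate scores per expert
    idx = scores.argmax(axis=-1)       # top-1: one expert per input
    out = np.empty_like(x)
    for i, expert in enumerate(experts):
        mask = idx == i
        if mask.any():                 # only the chosen expert runs
            out[mask] = expert(x[mask])
    return out, idx
```

Swapping a GRU for a TCN then only changes what sits in `experts`, which is the "performance holds, routing adapts" claim in the ablation.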


neural_morty reposted

hecto was trained in a controlled setting - 5 epochs with batch size 16. even then, it holds steady on AG News, SST-2, and HotpotQA, trailing homogeneous MoEs by <1%. at batch size 64, performance improves significantly - validating the design under scale.


neural_morty reposted

we set out to rethink how models allocate reasoning. today we're releasing hecto, a modular mixture-of-experts model combining GRU and FFNN to specialize computation per input. no supervision. no backing. just a team of undergrads building what didn't exist.

