NobodyExistsOnTheInternet

@nullvaluetensor

Human Large Language model. Skills: Distill data. Training LLMs. Test and Evaluate. Rinse and repeat as required. Based in SEA.

SEA

Joined November 2023

376 Posts 546 Followers 87 Following

NobodyExistsOnTheInternet

@nullvaluetensor

Oct 30

you know, back in my day we had to sort tokens by entropy manually and it worked just fine! None of this fancy transformer stuff

NobodyExistsOnTheInternet

@nullvaluetensor

Oct 2

Something something sonnet 4.5 is just the first model to be autistic enough to point this out

Quoted from JeffLadish's tweet: Models are now smart enough to understand that any scenario like this is unrealistic and obviously fictional. They know they aren't capable enough to manage autonomous mining equipment. No clever prompting can fix this

NobodyExistsOnTheInternet

@nullvaluetensor

Sep 18

> 3. [...] Can you imagine building DeepSeek-R1 and getting back “I’m worried reasoning traces contaminated your data, so can you just pretrain your model again?” ?????????

Justin Angel

@JustinAngel

Sep 18

The Nature DeepSeek-R1 peer review reads like Cards Against Humanity. My top 3: 1. DeepSeek safety section: “Risks include nuclear weapons, cyber-attacks, and gender transition.” Reviewer: “One of those isn’t like the other???” 2. Reviewer: “The reasoning traces from your…

NobodyExistsOnTheInternet reposted

Justin Angel

@JustinAngel

Sep 18

NobodyExistsOnTheInternet

@nullvaluetensor

Sep 13

"Thing work. Sometimes not work because math sad. Two smart humans look at code. Used old code others already like."

NobodyExistsOnTheInternet

@nullvaluetensor

Sep 12

huggingface.co/NobodyExistsOn…

NobodyExistsOnTheInternet/K3-Q4-GGUF · Hugging Face

NobodyExistsOnTheInternet

@nullvaluetensor

Sep 11

Are these good/relevant takes for the question: "When do you think we will achieve AGI"

NobodyExistsOnTheInternet

@nullvaluetensor

Sep 2

It took me 2 weeks to figure out that my issue in trying to create kimi k2 3T was trying to make a """memory efficient""" dequanter to bf16 for kimi/deepseek. I really need to practice the scientific method more.

NobodyExistsOnTheInternet

@nullvaluetensor

Aug 29

Is it just me, or is gpt-5-pro's only weakness that its search tool is very weak? I've been asking it for help monkeypatching some GitHub repos, and in its CoTs the main issue is that it's hitting rate limits, ironically.

NobodyExistsOnTheInternet reposted

Nous Research

@NousResearch

Aug 26

Nous Research presents Hermes 4, our latest line of hybrid reasoning models. hermes4.nousresearch.com Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities. Special attention was given to making the models creative and interesting to…