Deep Fried Net

@DeepFriedNet

Human in the loop

Pinned

Every time ASI has to construct a low dimensional UI to bridge communication... "This is a manifestation of the Continuum that we hope falls within your level of comprehension."

Deep Fried Net reposted

One more paper from the lab this week! 🥴 Multi-objective optimization of biological sequences isn’t limited to discrete diffusion. We present AReUReDi, our new framework that extends rectified discrete flows to provably converge to the Pareto front! Hope you're ready! 👇 📜:…

To the Moon with @NASA! Our second Blue Moon MK1 lander is already in production and well-suited to support the VIPER rover. Building on the learnings from our first MK1 lander, this mission is important for future lunar permanence and will teach us about the origin and…


Deep Fried Net reposted

Can we use video diffusion to generate 3D scenes? 𝐖𝐨𝐫𝐥𝐝𝐄𝐱𝐩𝐥𝐨𝐫𝐞𝐫 (#SIGGRAPHAsia25) creates fully-navigable scenes via autoregressive video generation. Text input -> 3DGS scene output & interactive rendering! 🌍mschneider456.github.io/world-explorer/ 📽️youtu.be/N6NJsNyiv6I


Deep Fried Net reposted

Tiny SOTA model release today: v3 of the Smart Turn semantic VAD model. Smart Turn is a native audio, open source, open data, open training code model for detecting whether a human has stopped speaking and expects a voice agent to respond. The model now runs in <60ms on most…

Deep Fried Net reposted

Introducing Alterego: the world’s first near-telepathic wearable that enables silent communication at the speed of thought. Alterego makes AI an extension of the human mind. We’ve made several breakthroughs since our work started at MIT. We’re announcing those today.


Deep Fried Net reposted

Visual Story-Writing. While you write, our word processor visualizes the timeline, world map, and character relationships. Editing these visuals updates the story (e.g. drag a character on the map to move them). This summarizes our #UIST2025 paper. #HCI #LLMs #AI Thread 🧵 (1/8)


Deep Fried Net reposted

Check out what you can do when you mix Gemini's world knowledge with the ability to show things visually. Multimodal communication abilities unlock new use cases!


Deep Fried Net reposted

Made a walkthrough vid for Magenta RealTime “Audio Injection”! The notebook takes ~10m to spin up, but totally worth it for the surreal experience 🎤💻🎧⁉️


Deep Fried Net reposted

LongSplat Robust Unposed 3D Gaussian Splatting for Casual Long Videos


National security reframe (in order to get funding to solve the problem) - could an adversary be performing a death by 1000 spam calls attack to agitate and distract an entire population (of engineers)? 😅

I get ~10 spam calls per day (various automated voicemails, "loan pre-approval" etc) and ~5 spam messages per day (usually phishing). - I have AT&T Active Armor, all of the above still slips through. - All of the above is always from new, unique numbers so blocking doesn't work.…



Deep Fried Net reposted

🚀 GLiNER x SmolLM: a new joint encoder-decoder architecture 🚀 We are excited to release a new kind of GLiNER model built with the mantra "you do the same things only once." Built on top of DeBERTa + @huggingface SmolLM2 — full details below 👇


Deep Fried Net reposted

Realtime interactive generative models FTW! Announcing a new 🌊 of details and features for Magenta RealTime, the open weights live music AI model from GDM! * Live Jamming with audio input 🎤🎸🎵 * Personalize your own models 🔧 * Tech report 📜 Links below in the 🧵...


Deep Fried Net reposted

LMStudio are using the upstream ggml implementation which is significantly better and well optimized. Looking at ollama's modifications in ggml, they have too much branching in their MXFP4 kernels and the attention sinks implementation is really inefficient. Along with other…

Why is the @ollama gpt-oss:20b version so slow compared to the LM Studio version? Any issue?



Deep Fried Net reposted

There's a new tiny TTS model in town: Kitten TTS! 🐱 With just 15M parameters (<25 MB), it delivers impressive quality for its size, and can even run in real time without a GPU. So, I created a web demo for it: featuring text normalization, chunking, and real-time playback. 🤗

Introducing Kitten TTS, a SOTA tiny text-to-speech model - Just 15M parameters - Runs without a GPU - Model size less than 25 MB - Multiple high-quality voices - Ultra-fast - even runs on low-end edge devices Github and HF links below



"Infinito Particular"

Introducing Genie 3, our state-of-the-art world model that generates interactive worlds from text, enabling real-time interaction at 24 fps with minutes-long consistency at 720p. 🧵👇



In a few years, on a cool summer night, Generation Betas will bring their family robots out for a neighborhood game of hide and seek. A father will be out on the porch, drinking lemonade, proudly reflecting on successfully convincing them to name the game, "Terminator".

K-Bot is a speedy boi



Deep Fried Net reposted

I don't have any special inside knowledge about how @Kimi_Moonshot trained Kimi K2. I just read the paper and this part is what I've been telling anyone who will listen about. Their data generation steps to get lots of high quality, multi-turn agent traces to train on is so much…

I'm going around telling anyone who will listen about how @Kimi_Moonshot Kimi K2 was trained



Deep Fried Net reposted

Smart Turn v2: open source, native audio turn detection in 14 languages. New checkpoint of the open source, open data, open training code, semantic VAD model on @huggingface, @FAL, and @pipecat_ai. - 3x faster inference (12ms on an L40) - 14 languages (13 more than v1, which…


Deep Fried Net reposted

Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N


Deep Fried Net reposted

Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵

