
Frank Lin

@developerlin

Create Limitless Value with AI | Exploring AI’s Future

Frank Lin reposted

New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: alexiajm.github.io/2025/09/29/tin… Code: github.com/SamsungSAILMon… Paper: arxiv.org/abs/2510.04871
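The core idea is a tiny network applied recursively rather than a deep stack applied once. Below is a minimal PyTorch sketch of that two-level refinement loop as described in the blog post; the layer shapes, step counts, and function names are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TinyRecursionSketch(nn.Module):
    """Illustrative sketch of TRM-style recursive refinement (not the official code)."""

    def __init__(self, dim=128):
        super().__init__()
        # One tiny network is reused at every step; recursion, not depth, adds compute.
        self.refine_latent = nn.Sequential(nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.refine_answer = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x, y, z, n_latent_steps=6, n_improve_steps=3):
        for _ in range(n_improve_steps):
            # Inner loop: repeatedly update the latent "scratchpad" z from (x, y, z).
            for _ in range(n_latent_steps):
                z = self.refine_latent(torch.cat([x, y, z], dim=-1))
            # Outer step: revise the current answer embedding y using the refined z.
            y = self.refine_answer(torch.cat([y, z], dim=-1))
        return y, z

# Usage: embeddings for the puzzle input x, an initial answer guess y, and latent z.
model = TinyRecursionSketch()
x, y, z = (torch.randn(1, 128) for _ in range(3))
y, z = model(x, y, z)
```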


A simple but beautiful little agent based on Kimi K2 Thinking

Kimi-k2-thinking is incredible. So I built an agent to test it out, Kimi-writer. It can generate a full novel from one prompt, running up to 300 tool requests per session. Here it is creating an entire book, a collection of 15 short sci-fi stories.
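The pattern behind an agent like Kimi-writer is a plain tool-calling loop against an OpenAI-compatible endpoint. Here is a minimal sketch; the base URL, model name, and the single write_file tool are illustrative assumptions, not the actual Kimi-writer code.

```python
import json
from openai import OpenAI  # Moonshot's API is OpenAI-compatible

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_KEY")  # assumed endpoint

# Hypothetical single tool: let the model write stories to disk.
tools = [{
    "type": "function",
    "function": {
        "name": "write_file",
        "description": "Write text content to a file at the given path.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
}]

def write_file(path, content):
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"wrote {len(content)} chars to {path}"

messages = [{"role": "user", "content": "Write a collection of 15 short sci-fi stories, one file per story."}]

for _ in range(300):  # cap the session at 300 tool requests, as in the demo
    reply = client.chat.completions.create(model="kimi-k2-thinking", messages=messages, tools=tools)
    msg = reply.choices[0].message
    messages.append(msg)
    if not msg.tool_calls:
        break  # the model decided it is done
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = write_file(**args)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```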



Frank Lin reposted

The GPT moment for Robot Control is here? @LeCARLab's BFM-Zero, a promptable behavioral foundation model for humanoid control trained via unsupervised reinforcement learning. One policy, countless behaviors: motion tracking, goal-reaching, and reward optimization, all zero-shot.


Impressive, 3Q mix

MLLMs are great at understanding videos, but they struggle with spatial reasoning, like estimating distances or tracking objects across time. The bottleneck? Getting precise 3D spatial annotations on real videos is expensive and error-prone. Introducing SIMS-V 🤖 [1/n]



Frank Lin reposted

100M Gaussians. Streamed on a single RTX 3090. Welcome to @NVIDIAAIDev's fVDB Reality Capture: bringing 3D Gaussian Splatting, volumetric data, and OpenUSD into one pipeline. 🎥 Full demo → youtu.be/ZnhBGmHyJqM #NVIDIA #GaussianSplatting #3D


Frank Lin reposted

🚨 Wan 2.2 Animate just got a massive upgrade! ⚡️ 4× faster inference 🎨 Even sharper, cleaner visuals 💸 Only $0.08/s for 720p


Frank Lin reposted

A 1-billion-parameter motion model trained on NVIDIA CUDA, requiring 8GB of VRAM for real-time operation. It converts webcam or video footage into real-time XYZ skeletal point data and rotation values, transmitting them to Blender, Unity, and UE for retargeting.
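As a sketch of how a downstream tool might consume such a stream, here is a minimal UDP receiver. The port and the JSON wire format (joint name, XYZ position, rotation) are entirely hypothetical, since the post does not specify the actual protocol.

```python
import json
import socket

# Hypothetical: the tracker sends one JSON datagram per frame on this port.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 9000))

while True:
    data, _ = sock.recvfrom(65535)
    # Assumed frame layout: {"joints": [{"name": "hips", "pos": [x, y, z], "rot": [rx, ry, rz]}, ...]}
    frame = json.loads(data)
    for joint in frame["joints"]:
        # In Blender/Unity/UE this is where you would drive the corresponding bone.
        print(joint["name"], joint["pos"], joint["rot"])
```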


Frank Lin reposted

EdgeTAM, a real-time segment tracker by Meta, is now in @huggingface transformers with an Apache-2.0 license 🔥
> 22× faster than SAM2, processes 16 FPS on iPhone 15 Pro Max with no quantization
> supports single/multiple/refined point prompting and bounding box prompts


Frank Lin reposted

Can’t believe it — our Princeton AI^2 postdoc Shilong Liu @atasteoff re-built DeepSeek-OCR from scratch in just two weeks 😳 — and open-sourced it. This is how research should be done 🙌 #AI #LLM #DeepSeek #MachineLearning #Princeton @omarsar0 @PrincetonAInews @akshay_pachaar


Discover DeepOCR: a fully open-source reproduction of DeepSeek-OCR, complete with training & evaluation code! #DeepLearning #OCR



Frank Lin reposted

MotionStream: Real-Time Video Generation with Interactive Motion Controls. The model runs in real time on a single NVIDIA H100 GPU (29 FPS, 0.4s latency).


Frank Lin reposted

Our paper "Vision Transformers Don't Need Trained Registers" will appear as a Spotlight at NeurIPS 2025! We uncover the mechanism behind high-norm tokens and attention sinks in ViTs, propose a training-free fix, and recently added an analytical model -- more on that below. ⬇️

Vision transformers have high-norm outliers that hurt performance and distort attention. While prior work removed them by retraining with “register” tokens, we find the mechanism behind outliers and make registers at ✨test-time✨—giving clean features and better performance! 🧵



Frank Lin reposted

We present MotionStream — real-time, long-duration video generation that you can interactively control just by dragging your mouse. All videos here are raw, real-time screen captures without any post-processing. Model runs on a single H100 at 29 FPS and 0.4s latency.


Frank Lin reposted

ThinkMorph: A New Leap in Multimodal Reasoning This unified model, fine-tuned on 24K high-quality interleaved reasoning traces, learns to generate progressive text-image thoughts that mutually advance reasoning. It achieves huge gains on vision-centric tasks & exhibits emergent…


This is cool

Introducing Molview, the IPython/Jupyter widget version of nano-protein-viewer 🔍:



Frank Lin reposted

The Illustrated NeurIPS 2025: A Visual Map of the AI Frontier New blog post! NeurIPS 2025 papers are out—and it’s a lot to take in. This visualization lets you explore the entire research landscape interactively, with clusters, summaries, and @cohere LLM-generated explanations…


Frank Lin reposted

"I don’t think there’s any reason why a machine shouldn’t have consciousness. If you swapped out one neuron with an artificial neuron that acts in all the same ways, would you lose consciousness?" ~ Geoffrey Hinton A fascinating discussion in this video. --- From the 'The…


Frank Lin reposted

Qwen-Edit 2509 Multiple-angles LoRA. Enables camera movement commands; control up, down, left, right, rotation, and look direction; also supports switching between wide-angle and close-up shots. huggingface.co/dx8152/Qwen-Ed…


Frank Lin reposted

Just trained Qwen3-VL-2B-Instruct for an epoch on this dataset, and it already seems to improve reasoning performance across several tasks! model: huggingface.co/hbXNov/Qwen3-V…


New paper 📢 Most powerful vision-language (VL) reasoning datasets remain proprietary 🔒, hindering efforts to study their principles and develop similarly effective datasets in the open 🔓. Thus, we introduce HoneyBee, a 2.5M-example dataset created through careful data…

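For anyone wanting to try a Qwen3-VL-2B-Instruct checkpoint like the one mentioned above, here is a minimal transformers inference sketch. The base-model repo id is an assumption (the fine-tuned checkpoint's full id is truncated in the post), and this assumes Qwen3-VL is served through the generic AutoModelForImageTextToText mapping.

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "Qwen/Qwen3-VL-2B-Instruct"  # assumed repo id; swap in the fine-tuned checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image URL
        {"type": "text", "text": "What trend does this chart show? Reason step by step."},
    ],
}]

# The processor's chat template loads the image and builds the multimodal prompt.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```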


Frank Lin reposted

One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models! If you're interested in how we can get agents to handle near-infinite contexts, this one is a must. Watch the recording here! youtu.be/_TaIZLKhfLc

YouTube: Recursive Language Models w/ Alex Zhang (youtube.com)


Frank Lin reposted

ollama run qwen3-vl

Ollama's engine now supports all the Qwen 3 VL models locally, from 2B to 235B parameter sizes. The smaller models work exceptionally well for their size. The latest version of Ollama, v0.12.7, is needed! Give it a try! 👇👇👇

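Beyond the CLI, the same model can be scripted through the official ollama Python package. A minimal sketch, assuming a local Ollama server at v0.12.7 or later and the qwen3-vl tag pulled above; the image path is a placeholder.

```python
import ollama  # pip install ollama; talks to the local Ollama server

# Ask the local Qwen 3 VL model to describe an image; 'photo.jpg' is a placeholder path.
response = ollama.chat(
    model="qwen3-vl",
    messages=[{
        "role": "user",
        "content": "Describe what's in this image.",
        "images": ["photo.jpg"],
    }],
)
print(response["message"]["content"])
```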
