
panda hacker

@panda_hacker_vc

panda hacker reposted

Introducing IndQA — a new benchmark that evaluates how well AI systems understand Indian languages and everyday cultural context. openai.com/index/introduc…


panda hacker reposted

Hybrid models like Qwen3-Next, Nemotron Nano 2 and Granite 4.0 are now fully supported in vLLM! Check out our latest blog from the vLLM team at IBM to learn how the vLLM community has elevated hybrid models from experimental hacks in V0 to first-class citizens in V1. 🔗…

PyTorch's tweet image.

panda hacker reposted

🚀 Introducing Emu3.5 — a large-scale multimodal world model that natively predicts the next vision-language state. 🔥 Trained on over 10T interleaved vision-language tokens and enhanced with reinforcement learning, Emu3.5 achieves powerful multimodal reasoning and generation.…


panda hacker reposted

Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably huggingface.co/spaces/Hugging…

eliebakouch's tweet image.
