
Nikhil G

@nikhil_r_ghosh

inferring all over the place @anyscalecompute @raydistributed

Nikhil G reposted

The quality of this year’s @raydistributed summit agenda and speaker lineup is awesome🔥. Personally looking forward to these: physical AI @DrJimFan Terminal-Bench @Mike_A_Merrill PrimeIntellect & EnvHub for RL on LLMs @willccbb @johannes_hage Apple on LLM inference w/ Ray…


Nikhil G reposted

SkyRL now supports Megatron!

Training massive MoE models demands more than just ZeRO-3/FSDP sharding. The Megatron backend for SkyRL unlocks high-throughput training with:

✅ 5D parallelism (tensor + pipeline + context + expert + data)
✅ Efficient training for 30B+ MoEs

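As a back-of-envelope check on how those five parallelism degrees compose: tensor, pipeline, and context parallelism multiply into the model-parallel group, and the remaining GPUs form the data-parallel dimension (expert parallelism then partitions MoE experts within that group). The function below is illustrative arithmetic only, not SkyRL's or Megatron's actual API.

```python
def data_parallel_size(world_size: int, tp: int, pp: int, cp: int) -> int:
    """Compute the data-parallel degree left over after tensor (tp),
    pipeline (pp), and context (cp) parallelism claim their GPUs.

    Illustrative sketch: expert parallelism is not a separate factor
    here because it typically subdivides the data-parallel group.
    """
    model_parallel = tp * pp * cp
    assert world_size % model_parallel == 0, "degrees must divide world size"
    return world_size // model_parallel

# e.g. 128 GPUs with tp=4, pp=2, cp=2 leaves 8 data-parallel replicas
print(data_parallel_size(128, tp=4, pp=2, cp=2))
```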

Nikhil G reposted

Prefix cache-aware routing is now available in Ray 2.49 🚀 Scaling input token-heavy workloads (like multi-turn convos & agent loops) requires maintaining prefix cache hit rate across 100s of vLLM engine replicas, and PrefixCacheAffinityRouter makes it easy. Here’s how it…

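The core idea behind prefix-affinity routing can be sketched in a few lines: hash a fixed-length prefix of the prompt so that requests sharing context (the same conversation history or agent scaffold) land on the same replica, where the KV cache for that prefix is already warm. This is a toy consistent-assignment sketch, not the actual `PrefixCacheAffinityRouter` implementation in Ray.

```python
import hashlib

def route_by_prefix(prompt: str, num_replicas: int, prefix_len: int = 256) -> int:
    """Pick a vLLM replica index by hashing the prompt's leading chars.

    Toy sketch: requests whose first `prefix_len` characters match are
    routed to the same replica, so its prefix cache keeps serving them.
    """
    prefix = prompt[:prefix_len]
    digest = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_replicas

# Two turns of the same conversation share a long common prefix,
# so they hit the same replica:
history = "system: you are a helpful agent... " * 20
print(route_by_prefix(history + "user: turn 1", 8))
print(route_by_prefix(history + "user: turn 2", 8))
```

A production router also has to balance load when one prefix gets hot; the real implementation layers load-awareness on top of affinity.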

Nikhil G reposted

Very excited to see the Tinker release! @pcmoritz and I had a chance to experiment with the API. It does a nice job of providing flexibility while abstracting away GPU handling. Here's a simple example showing how to generate synthetic data and fine-tune a text-to-SQL model.…

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…

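The "write training loops in Python on your laptop; we'll run them on distributed GPUs" shape can be sketched with a toy stand-in: the loop below looks like local code, while a backend object owns the weights and optimizer step. Every name here (`FakeBackend`, `forward_backward`, `optim_step`) is hypothetical and not Tinker's real API; the model is a one-parameter linear fit so the whole thing runs locally.

```python
class FakeBackend:
    """Stands in for the remote GPU service that holds model state."""

    def __init__(self, lr: float = 0.1):
        self.weight = 0.0  # toy one-parameter "model"
        self.lr = lr

    def forward_backward(self, x: float, y: float):
        # toy objective: loss = (w*x - y)^2, with its gradient in w
        pred = self.weight * x
        loss = (pred - y) ** 2
        grad = 2.0 * (pred - y) * x
        return loss, grad

    def optim_step(self, grad: float) -> None:
        self.weight -= self.lr * grad

# The "local" training loop: plain Python, no GPU handling in sight.
backend = FakeBackend()
for step in range(50):
    loss, grad = backend.forward_backward(x=1.0, y=3.0)
    backend.optim_step(grad)

print(backend.weight)  # converges toward 3.0
```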
