#distributedreinforcementlearning search results

The paper shows that a small model trained with reinforcement learning can outperform prompt-only agents on machine learning engineering. Most agents just prompt large models and search longer, but they do not learn from experience. This work instead trains a 3B Qwen model with…

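A minimal sketch of the general recipe this tweet points at (outcome-reward reinforcement learning with a REINFORCE-style update), assuming a toy policy and environment; the paper's actual agent loop, reward design, and 3B Qwen model are not reproduced here, and every name below is a hypothetical stand-in.

```python
# Toy REINFORCE loop illustrating outcome-reward RL for an agent policy.
# NOT the paper's method: the environment, reward, and policy are hypothetical
# stand-ins (a real setup would wrap an LLM and score whole ML-engineering
# trajectories, e.g. by the validation metric of the produced model).
import torch
import torch.nn as nn

N_ACTIONS = 8          # pretend: 8 candidate "edits" the agent can make
TARGET = 3             # hidden "correct" edit that yields reward 1

policy = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, N_ACTIONS))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
baseline = 0.0         # running mean reward, used to reduce variance

for step in range(500):
    obs = torch.randn(16, 4)                       # batch of toy observations
    dist = torch.distributions.Categorical(logits=policy(obs))
    actions = dist.sample()
    rewards = (actions == TARGET).float()          # outcome reward: success/failure
    baseline = 0.9 * baseline + 0.1 * rewards.mean().item()
    advantage = rewards - baseline
    loss = -(dist.log_prob(actions) * advantage).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("final-batch success rate:", rewards.mean().item())
```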

This week’s ECON & ML digest breaks down a new paper (Chi, Fan, Ghigliazza, Giannone & Wang, 2025) that brings distributional thinking to macro forecasting. 📩 Summary: open.substack.com/pub/erdeneremi… 📄 Paper: doi.org/10.48550/arXiv… #EconTwitter #MachineLearning #Forecasting #Macro


Since compute grows faster than the web, we think the future of pre-training lies in the algorithms that will best leverage ♾ compute. We find simple recipes that improve the asymptote of compute scaling laws to be 5x more data efficient, offering better performance with sufficient compute.

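For context on what "improving the asymptote" means, here is a small illustration using a Chinchilla-style loss parameterization L(N, D) = E + A/N^alpha + B/D^beta; the functional form and every constant are assumptions for illustration only, not the recipe or numbers from this thread.

```python
# Illustration of a compute-scaling asymptote with a Chinchilla-style loss
# L(N, D) = E + A / N**alpha + B / D**beta.  With data D fixed (the web is
# finite) and model size N growing, loss flattens near E + B / D**beta; an
# algorithm that is "5x more data efficient" behaves as if D were 5x larger,
# lowering that asymptote.  All constants below are made up for illustration.
E, A, B, alpha, beta = 1.7, 400.0, 410.0, 0.34, 0.28

def loss(N, D):
    return E + A / N**alpha + B / D**beta

D_web = 1e13                         # pretend: all usable web tokens
for N in (1e9, 1e10, 1e11, 1e12):
    base = loss(N, D_web)
    efficient = loss(N, 5 * D_web)   # same data, 5x effective data efficiency
    print(f"N={N:.0e}  baseline={base:.3f}  5x-efficient={efficient:.3f}")
```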

Research Log Day 0: DiLoCo Days. I decided to do a thesis on distributed low-communication training. Essentially, how can we train large models efficiently across distributed nodes without being utterly destroyed by network latency and bandwidth? (1/n)

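For readers new to DiLoCo: the published recipe alternates many local optimizer steps on each worker with a rare synchronization in which the averaged parameter delta is applied by an outer optimizer. The single-process simulation below sketches that structure on a toy regression problem; the model, data, and hyperparameters are placeholders, not a faithful reproduction of any DiLoCo run.

```python
# Single-process simulation of a DiLoCo-style inner/outer loop: each of K
# "workers" takes H local AdamW steps, then only the averaged parameter delta
# (the "outer gradient") is communicated and applied by an outer optimizer
# (Nesterov SGD).  Toy regression model and data; the real recipe trains
# transformers across datacenters.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
K, H, OUTER_STEPS = 4, 20, 10
global_model = nn.Linear(8, 1)
outer_opt = torch.optim.SGD(global_model.parameters(), lr=0.7,
                            momentum=0.9, nesterov=True)

def local_batch():
    x = torch.randn(32, 8)
    return x, x.sum(dim=1, keepdim=True)            # toy target

for outer in range(OUTER_STEPS):
    deltas = [torch.zeros_like(p) for p in global_model.parameters()]
    for _ in range(K):                               # each worker starts from the global model
        worker = copy.deepcopy(global_model)
        inner_opt = torch.optim.AdamW(worker.parameters(), lr=1e-2)
        for _ in range(H):                           # H local steps, no communication
            x, y = local_batch()
            loss = nn.functional.mse_loss(worker(x), y)
            inner_opt.zero_grad(); loss.backward(); inner_opt.step()
        for d, pg, pw in zip(deltas, global_model.parameters(), worker.parameters()):
            d += (pg.detach() - pw.detach()) / K     # averaged outer gradient
    outer_opt.zero_grad()
    for p, d in zip(global_model.parameters(), deltas):
        p.grad = d                                   # apply the delta via the outer optimizer
    outer_opt.step()
    print(f"outer step {outer}: delta norm {sum(d.norm().item() for d in deltas):.4f}")
```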

Ever wish a robot could just move to any goal in any environment—avoiding all collisions and reacting in real time? 🚀Excited to share our #CoRL2025 paper, Deep Reactive Policy (DRP), a learning-based motion planner that navigates complex scenes with moving obstacles—directly…


After another wonderful year of neural motion planning research, we are excited to report a major upgrade on our pipeline 🎉 Introducing Deep Reactive Policy (DRP) 🚀 — our #CoRL2025 paper that extends our prior work Neural MP with both generalizability and reactivity while…


Diffusion policies have demonstrated impressive performance in robot control, yet are difficult to improve online when 0-shot performance isn’t enough. To address this challenge, we introduce DSRL: Diffusion Steering via Reinforcement Learning. (1/n) diffusion-steering.github.io
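
A hedged reading of the idea, based only on the name in the tweet: keep the pretrained diffusion policy frozen and let an RL agent act in its latent-noise space, so standard RL can steer behavior without touching the diffusion weights. The sketch below shows only that interface with stub components; all classes, shapes, and the crude denoising loop are hypothetical.

```python
# Sketch of "steering" a frozen diffusion policy through its latent noise:
# an RL actor outputs the initial noise w; the frozen diffusion policy
# denoises w (conditioned on the observation) into an action.  The stubs
# below are hypothetical placeholders, not the DSRL implementation.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, NOISE_DIM = 16, 7, 7

class FrozenDiffusionPolicy(nn.Module):
    """Stand-in for a pretrained diffusion policy; weights stay frozen."""
    def __init__(self):
        super().__init__()
        self.denoiser = nn.Sequential(nn.Linear(OBS_DIM + NOISE_DIM, 64),
                                      nn.ReLU(), nn.Linear(64, ACT_DIM))
        for p in self.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def forward(self, obs, noise, steps=4):
        a = noise
        for _ in range(steps):                       # crude stand-in for denoising
            a = a - 0.25 * self.denoiser(torch.cat([obs, a], dim=-1))
        return torch.tanh(a)

class NoiseActor(nn.Module):
    """RL policy whose 'action' is the latent noise fed to the diffusion policy."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, NOISE_DIM))

    def forward(self, obs):
        return self.net(obs)

diffusion_policy = FrozenDiffusionPolicy()
actor = NoiseActor()                                 # train this with SAC/TD3/etc.
obs = torch.randn(1, OBS_DIM)
action = diffusion_policy(obs, actor(obs))           # env action to execute
print(action.shape)
```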


We introduce a new generative model and it hits #2 on Hacker News’ daily rankings! Discrete Distribution Networks (DDN), accepted at ICLR 2025, is a simple-yet-intriguing generative model with unique properties. Why it's intriguing 👇 [1/N]


Holy shit…Diffusion just leveled up 🔥 A new paper “Diffusion Transformers with Representation Autoencoders” basically kills the VAE era. Instead of the old VAE bottleneck, they use representation autoencoders (RAEs) built from pretrained encoders like DINO or SigLIP. The…

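A minimal sketch of the representation-autoencoder idea as the tweet describes it: freeze a pretrained vision encoder, train only a lightweight decoder on reconstruction, and then train the diffusion transformer in that frozen latent space. The encoder here is a random stub standing in for DINO/SigLIP, and nothing below mirrors the paper's actual architecture or losses.

```python
# Representation autoencoder (RAE) sketch: frozen pretrained encoder + trained
# decoder, replacing a VAE as the latent space for a diffusion transformer.
# The "pretrained" encoder below is a random stub standing in for DINO/SigLIP.
import torch
import torch.nn as nn

IMG_DIM, LATENT_DIM = 3 * 32 * 32, 256

encoder = nn.Sequential(nn.Flatten(), nn.Linear(IMG_DIM, LATENT_DIM))
for p in encoder.parameters():
    p.requires_grad_(False)                      # pretrained and frozen

decoder = nn.Sequential(nn.Linear(LATENT_DIM, IMG_DIM),
                        nn.Unflatten(1, (3, 32, 32)))
opt = torch.optim.AdamW(decoder.parameters(), lr=1e-3)

# Stage 1: fit only the decoder so (encoder, decoder) forms an autoencoder.
for _ in range(100):
    imgs = torch.rand(16, 3, 32, 32)             # toy images
    with torch.no_grad():
        z = encoder(imgs)
    recon = decoder(z)
    loss = nn.functional.mse_loss(recon, imgs)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (not shown): train a diffusion transformer on latents z = encoder(x)
# and decode its samples with the decoder above.
print("final reconstruction loss:", loss.item())
```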

🚨 Decentralized reinforcement learning (DRL) 🤖. I get excited when I learn new terms and that will usually send me down a rabbit hole of research. @FractionAI_xyz uses DRL in its training. As soon as I heard about decentralized reinforcement training, I had to deep dive.…


Introducing d1🚀 — the first framework that applies reinforcement learning to improve reasoning in masked diffusion LLMs (dLLMs). Combining masked SFT with a novel form of policy gradient algorithm, d1 significantly boosts the performance of pretrained dLLMs like LLaDA.

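The tweet names two ingredients, masked SFT plus a policy-gradient stage. Below is a sketch of a group-relative policy-gradient loss of the kind commonly paired with verifiable rewards; d1's actual algorithm adapts this to masked diffusion decoding and estimates sequence log-probabilities differently, so the tensors and shapes here are mock placeholders.

```python
# Group-relative policy-gradient sketch: sample G completions per prompt,
# score each with a verifiable reward, normalize rewards within the group,
# and weight sequence log-probs by that advantage.  The log-probs and rewards
# below are mock tensors; d1 adapts this idea to masked diffusion LLMs, where
# computing sequence log-probabilities is itself non-trivial.
import torch

G = 8                                              # completions per prompt
# mock: log-prob of each sampled completion under the current model
seq_logprob = torch.randn(G, requires_grad=True)
# mock: 1.0 if the completion's final answer is verified correct, else 0.0
rewards = torch.tensor([1., 0., 0., 1., 1., 0., 0., 0.])

advantage = (rewards - rewards.mean()) / (rewards.std() + 1e-6)
pg_loss = -(advantage.detach() * seq_logprob).mean()

# In d1 this policy-gradient term follows a masked-SFT stage; a combined
# objective could look like  total = pg_loss + lambda_sft * sft_loss.
pg_loss.backward()
print("gradient on sequence log-probs:", seq_logprob.grad)
```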

🚗 Sim-to-Real Application of Reinforcement Learning Agents for Autonomous, Real Vehicle Drifting Read: mdpi.com/2624-8921/6/2/… #AutonomousDrifting #ReinforcementLearning


Approaches to decentralised training challenges

Decentralized training runs into issues with communication, fault intolerance, and subpar scaling. These challenges need to be overcome before decentralized networks can provide the compute required for training large…

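The thread is truncated before its actual approaches, so as one concrete, widely used example of attacking the communication bottleneck it mentions, here is a generic top-k gradient sparsification sketch with error feedback; this is a standard technique from the literature, not necessarily what the quoted thread goes on to describe.

```python
# Generic top-k gradient sparsification with error feedback: each node sends
# only the k largest-magnitude gradient entries and keeps the rest locally,
# adding them back before the next selection.  One standard way to cut the
# communication cost mentioned in the tweet; not taken from the quoted thread.
import torch

def sparsify_topk(grad: torch.Tensor, residual: torch.Tensor, k: int):
    """Return (sparse_grad_to_send, new_residual)."""
    acc = grad + residual                          # error feedback
    flat = acc.flatten()
    idx = flat.abs().topk(k).indices
    sent = torch.zeros_like(flat)
    sent[idx] = flat[idx]                          # only k values leave the node
    new_residual = (flat - sent).view_as(grad)     # keep the rest for next round
    return sent.view_as(grad), new_residual

grad = torch.randn(1024)
residual = torch.zeros(1024)
sent, residual = sparsify_topk(grad, residual, k=32)
print("nonzeros sent:", (sent != 0).sum().item(), "of", grad.numel())
```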

We just put out a key step toward making distributed training work for larger and larger models: Scaling Laws for DiLoCo. TL;DR: we can do LLM training across datacenters in a way that scales incredibly well to larger and larger models!

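For readers who want to see what a scaling-law fit looks like operationally, here is a sketch that fits a power-law-plus-constant loss curve to synthetic (model size, loss) pairs with scipy; the paper's actual laws are richer (covering replica count and synchronization frequency), and every number below is made up.

```python
# Fitting a toy scaling law L(N) = E + A / N**alpha to synthetic measurements.
# The DiLoCo scaling-laws paper fits richer laws; this only illustrates the
# mechanics of such a fit and extrapolation.
import numpy as np
from scipy.optimize import curve_fit

def law(N, E, A, alpha):
    return E + A / N**alpha

N = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
true = law(N, 1.8, 120.0, 0.3)
loss = true + np.random.default_rng(0).normal(0, 0.005, size=N.shape)

(E, A, alpha), _ = curve_fit(law, N, loss, p0=(2.0, 100.0, 0.3), maxfev=10000)
print(f"fitted: E={E:.3f}, A={A:.1f}, alpha={alpha:.3f}")
print("predicted loss at N=1e11:", law(1e11, E, A, alpha))
```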

Introducing 3D Diffusion Policy (DP3), a simple visual imitation learning algorithm that achieves:
- 55.3% relative improvement on 72 simulated tasks, most with 10 demos
- 85% success rates on 4 real-world tasks, with 40 demos 🥟🌯
Open-sourced! Code/Data: 3d-diffusion-policy.github.io
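
A minimal sketch of the generic diffusion-policy imitation step underlying methods like DP3: encode the observation (DP3 uses a point cloud), corrupt the expert action with noise, and train a denoiser to predict that noise. The modules, dimensions, and simplified noising schedule below are placeholders, not DP3's architecture.

```python
# Generic diffusion-policy imitation step: condition on an observation
# embedding (here a pooled point-cloud encoding), noise the expert action,
# and train the network to predict the added noise.  The linear noising
# schedule is a simplification; all modules and sizes are placeholders.
import torch
import torch.nn as nn

ACT_DIM, OBS_EMB, T = 7, 64, 100
point_encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, OBS_EMB))
denoiser = nn.Sequential(nn.Linear(ACT_DIM + OBS_EMB + 1, 128), nn.ReLU(),
                         nn.Linear(128, ACT_DIM))
opt = torch.optim.AdamW(list(point_encoder.parameters()) +
                        list(denoiser.parameters()), lr=1e-4)

points = torch.randn(8, 512, 3)                  # batch of toy point clouds
expert_action = torch.randn(8, ACT_DIM)          # demo actions

obs = point_encoder(points).max(dim=1).values    # simple max-pool over points
t = torch.randint(0, T, (8, 1)).float() / T      # diffusion timestep in [0, 1)
noise = torch.randn_like(expert_action)
noisy_action = (1 - t) * expert_action + t * noise   # simplified noising schedule
pred_noise = denoiser(torch.cat([noisy_action, obs, t], dim=-1))
loss = nn.functional.mse_loss(pred_noise, noise)
opt.zero_grad(); loss.backward(); opt.step()
print("imitation denoising loss:", loss.item())
```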


No limits, no boundaries. The infinite datacenter. LLMs, image gen, text-to-speech + more. Distribute.ai, powered by $DIS.


Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to…
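
For the 30B-total / 3B-active claim, here is a generic top-k sparse-MoE layer showing how a router activates only a subset of expert parameters per token; this illustrates the mechanism in general, not RND1's actual router, expert count, or configuration.

```python
# Generic top-k sparse MoE layer: a router picks K of E experts per token,
# so only a fraction of the total parameters is "active" in each forward
# pass.  Illustrates the total-vs-active idea in general; not RND1's design.
import torch
import torch.nn as nn

D, E, K = 64, 8, 2                               # hidden dim, experts, top-k

class SparseMoE(nn.Module):
    def __init__(self):
        super().__init__()
        self.router = nn.Linear(D, E)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(D, 4 * D), nn.GELU(), nn.Linear(4 * D, D))
            for _ in range(E))

    def forward(self, x):                        # x: (tokens, D)
        gates = self.router(x).softmax(dim=-1)   # (tokens, E)
        topv, topi = gates.topk(K, dim=-1)       # keep only K experts per token
        topv = topv / topv.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(K):
            for e in range(E):
                mask = topi[:, slot] == e
                if mask.any():
                    out[mask] += topv[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

x = torch.randn(10, D)
print(SparseMoE()(x).shape)                      # (10, 64); only 2 of 8 experts ran per token
```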

