#reinforcementlearning search results
PROF🌀Right answer, flawed reason?🤔🌀 📄arxiv.org/pdf/2509.03403 Excited to share our work: PROF-PRocess cOnsistency Filter! 🚀 Challenge: ORM is blind to flawed logic, and PRM suffers from reward hacking. Our method harmonizes strengths of PRM & ORM. #LLM #ReinforcementLearning


🚗 Sim-to-Real Application of Reinforcement Learning Agents for Autonomous, Real Vehicle Drifting Read: mdpi.com/2624-8921/6/2/… #AutonomousDrifting #ReinforcementLearning

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning Qian et al.: arxiv.org/abs/2509.19736 #ArtificialIntelligence #DeepLearning #ReinforcementLearning

Deep #ReinforcementLearning Hands-On — Practical easy-to-follow guide to RL from Q-learning and DQNs to PPO and RLHF: amzn.to/3MV9o60 [3rd Edition] v/ @PacktDataML —— #AI #MachineLearning #DeepLearning #DataScience #DataScientist —— 𝓚𝓮𝔂 𝓕𝓮𝓪𝓽𝓾𝓻𝓮𝓼: 🟢Learn with…
Scientists have designed a #ReinforcementLearning-based framework that enables multiple robot arms to perform up to 40 tasks simultaneously without colliding in a crowded workspace. @GoogleDeepMind Learn more in Science #Robotics: scim.ag/3JIh7WF
#ReinforcementLearning foundational book (2nd edition of this classic): amzn.to/3UtbeAa ————— #DataScience #AI #MachineLearning #ML #DeepLearning #DataMining #Mathematics #Gamification

[1/4] 🚀 We’re excited to announce the v1 release of JaxAHT – a new library for Ad Hoc Teamwork (AHT) research, built with JAX for speed & scalability! Check it out 👉 larg.github.io/jax-aht #AI #MARL #ReinforcementLearning #JAX #AdHocTeamwork
Day 12 🦾 of becoming an ML Beast: Explored Reinforcement Learning – where an agent interacts with an environment, takes actions, and learns from rewards to improve decisions over time. #MachineLearning #ReinforcementLearning #AI #Learninginpublic #100daysofcoding
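
The agent, environment, action, and reward loop described in the post above can be sketched with tabular Q-learning. This is a minimal illustration, not anything taken from the post: the five-state corridor, the function names `train_q_learning` and `greedy_policy`, and all hyperparameter values are assumptions chosen for the example.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    The agent starts in state 0 and receives reward 1 on reaching the
    last state, which ends the episode. Action 0 = left, action 1 = right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: explore at random sometimes (and on ties),
            # otherwise take the action with the higher estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # Environment step: moving left from state 0 stays in place.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so its future value is 0.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # Q-learning update: move the estimate toward the
            # reward plus the discounted best next-state value.
            target = reward + gamma * best_next
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return [0 if left > right else 1 for left, right in q[:-1]]
```

Calling `greedy_policy(train_q_learning())` returns one action per non-terminal state; in this toy setup the learned policy should prefer moving right, toward the rewarded state.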


Enabling robots to improve autonomously via RL will be powerful, and dense shaping rewards can greatly facilitate RL. Our #IROS2025 paper presents a method leveraging VLMs to derive dense rewards for efficient autonomous RL. ⚡🦾 #Robotics #ReinforcementLearning 🧵1/5
Scientists have developed a method based on #ReinforcementLearning that enables a robot to use its upper body to lift and flip a water jug. @ToyotaResearch Learn more in Science #Robotics: scim.ag/4oK6qmt
In the Age of AI, start from First Principles. Unlock bottom-up design. Solve classes of problems, not isolated features. Think systems, not silos. Solve fundamentally. Scale exponentially. #AI #AIAgents #ReinforcementLearning #RAG #KnowledgeGraph #Orchestration

7/10 Reinforcement Learning trains agents through trial and error to maximize rewards. It’s used in gaming, robotics, and real-time decision systems like traffic control. #ReinforcementLearning #AI #SmartSystems #DeepLearning #GameAI #AutonomousTech

A quadruped robot fitted with a robot arm coordinates its arm and legs to manipulate objects while on the move rai-inst.com/resources/blog… #ReinforcementLearning #framework #WholeBody #flexible #LocoManipulation #ReLIC #RAIInstitute
RL playgrounds 🚀🔨🔨 I am playing with the Unity ML agents (which isn't even very recent). The possibilities are insane. From simple tasks to complex challenges, AI agents are leveling up. #ReinforcementLearning #AI #Unity

Introduction to various #ReinforcementLearning #Algorithms: bit.ly/2UPHbSj ————— #DataScience #AI #MachineLearning #ML #DeepLearning #DataMining #Mathematics #Gamification ————— + See this foundational book (2nd edition): amzn.to/3UtbeAa

Inverse reinforcement learning infers reward functions from observing expert behavior. Instead of programming rewards, AI learns what experts value by watching them. Learning goals from demonstrations. #ReinforcementLearning #InverseRL #LearningObjectives andrewroche.ai/ai-reinforceme…
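A robot bicycle that thinks, plans, and moves like an athlete rai-inst.com/resources/blog… It combines parkour-style mobility with the intelligence to perceive, plan for, and navigate even the most complex terrain #ReinforcementLearning #UltraMobileVehicle #UMV #JumpingBicycle #RAI_Institute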
#RA3: Mid-Training with Temporal Action Abstractions for Faster #ReinforcementLearning (RL) Post-Training in Code #LLMs buff.ly/LXLMNyi

Congrats to the team for building Webscale-RL, an automated data pipeline that turns pretraining corpora into verifiable RL data at trillion-token scale. Excited to see this bridge the gap between pretraining and RL! @SFResearch #AI #LLMs #ReinforcementLearning #MachineLearning
📣 Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels 📣 RL for LLMs faces a critical data bottleneck: existing RL datasets are <10B tokens while pretraining uses >1T tokens. Our Webscale-RL pipeline solves this by automatically converting pretraining…

Reinforcement learning is a transformative AI approach where machines learn through trial and error, akin to human learning. Unlike traditional programming, it allows AI systems to... #ReinforcementLearning #ArtificialIntelligence #MachineLearning youaccel.com/blog/reinforce…
Finished chapters 1–8 of Sutton & Barto’s Reinforcement Learning. Reading alongside Stanford’s CS234 lectures, great combo so far. Any recs for what to read next once I finish? #reinforcementlearning
(Open Access) Distributional Reinforcement Learning: freecomputerbooks.com/Distributional… Look for "Read and Download Links" section to download. Follow me if you like this post. #ReinforcementLearning #MachineLearning #DeepLearning #LLMs #GenAI #GenerativeAI #NeuralNetworks

Postdoctoral Researcher in Reinforcement Learning 📍ETH Zurich, Institute of Machine Learning, Switzerland Online, Multi-Agent & Human Feedback RL. Apply now: researchhires.com/position/ea709… #Postdoc #ReinforcementLearning #AIResearch #MachineLearning #AcademicJobs #ResearchHires
(Open Access) Reinforcement Learning: An Introduction - freecomputerbooks.com/Reinforcement-… Look for "Read and Download Links" section to download. Follow me if you like this post. #ReinforcementLearning #MachineLearning #DeepLearning #LLMs #GenAI #GenerativeAI #NeuralNetworks

📉 Errors aren’t failures; they’re learning vectors. Talus Labs trains agents to evolve through error feedback. @Talus_Labs #ReinforcementLearning #AI
“Usually, the first idea doesn't check out. That's just the nature of #AIresearch.” For Derek Li, Senior Researcher at Noah's Ark Lab, that first idea was training all multitask #reinforcementlearning objectives together. What went wrong: Runs underperformed baselines, with…
Every swap, limit order, cross-chain action, it’s input to the #ReinforcementLearning engine. In #Deluthium, your request becomes part of the learning feedback loop. No black boxes. Full transparency. Brought to you by the Onchain Flash Boys. Powered by RL.

Reinforcement learning AI skills are advancing quickly, creating a split in tech capabilities. Leaders ignoring this risk falling behind. Prioritize where AI momentum builds. reyemte.ch/jrh #AI #ReinforcementLearning
⚙️ A model becomes intelligent when it changes itself. Talus Labs programs that reflexivity into machine agents. @Talus_Labs #ReinforcementLearning #AI
💫 Intelligence scales when feedback deepens. Talus Labs builds recursive systems that learn through their own outputs. @Talus_Labs #ReinforcementLearning #AI
Meet PASTA: An RL agent that refines text-to-image results collaboratively, reducing trial-and-error. 🤖🎨 #PASTA #ReinforcementLearning

Today on the blog we propose Action-Based Contrastive Self-Training, a data-efficient #ReinforcementLearning tuning approach for improving multi-turn conversation modeling in mixed-initiative LLM interaction. Read all about it →goo.gle/3Sxas2T

A Tutorial on Meta-Reinforcement Learning Beck et al.: arxiv.org/abs/2301.08028 #ArtificialIntelligence #MetaLearning #ReinforcementLearning

Intro: Deep #ReinforcementLearning. #BigData #Analytics #DataScience #AI #MachineLearning #IoT #IIoT #PyTorch #Python #RStats #TensorFlow #Java #ReactJS #GoLang #CloudComputing #Serverless #DataScientist #Linux #Programming #Coding #100DaysofCode geni.us/DeepRL




Reinforcing General Reasoning without Verifiers Zhou et al.: arxiv.org/abs/2505.21493 #ArtificialIntelligence #DeepLearning #ReinforcementLearning

Solving a Rubik’s Cube with #ReinforcementLearning. #BigData #Analytics #DataScience #AI #MachineLearning #IoT #IIoT #Python #RStats #Java #JavaScript #ReactJS #CloudComputing #Serverless #DataScientist #Linux #Programming #Coding #100DaysofCode bit.ly/44UHZYf



🚀 Exciting News! Our paper has been accepted at @NeurIPSConf! 🎉 We introduce State Chrono Representation (SCR) -- a novel approach in #ReinforcementLearning. SCR integrates long-term temporal dynamics and cumulative rewards into state representations, addressing key challenges…

Automated Design of Agentic Systems Shengran Hu, Cong Lu, Jeff Clune: arxiv.org/abs/2408.08435 #ArtificialIntelligence #DeepLearning #ReinforcementLearning

#QuantumComputing: Deep #ReinforcementLearning. #BigData #Analytics #DataScience #AI #MachineLearning #IoT #IIoT #Python #RStats #TensorFlow #JavaScript #ReactJS #CloudComputing #Serverless #DataScientist #Linux #Programming #Coding #100DaysofCode geni.us/QC-Deep-Learni…

The Bitter Lesson "Search and learning are general purpose methods that continue to scale with increased computation, even as the available computation becomes very great." — Richard Sutton Rich Sutton: incompleteideas.net/IncIdeas/Bitte… #ReinforcementLearning

Had an incredible time yesterday at #TECHMEET Abeokuta 3.0! 🎉 Spoke about the future of AI and why #ReinforcementLearning deserves more attention. Great discussions on how AI can be leveraged and its positive impact on future careers! 🚀 #AI #TechEvent #Abeokuta




PowerScale Deep Learning Infrastructure with NVIDIA Systems for Autonomous Driving with #ReinforcementLearning! #BigData #Analytics #DataScience #AI #MachineLearning #IoT #IIoT #PyTorch #Python #RStats #TensorFlow #Java #JavaScript #ReactJS #GoLang #CloudComputing #Serverless…



🚀 New Survey: Reinforcement Learning in Vision We review 200+ works spanning MLLMs, visual generation, unified models & VLA — from RLHF to GRPO & RLVR. 🔗 Paper: arxiv.org/abs/2508.08189 🔗 Resources: github.com/weijiawu/Aweso… #AI #ReinforcementLearning #ComputerVision #Survey
