TraceOpt
@traceml_ai
Tracing and Optimizing ML workloads
Training models feels like flying blind: OOMs, idle GPUs, hidden bottlenecks. Would live observability actually help optimize training? Curious what you'd want live:
- Multi-GPU view
- Throughput
- Gradient stability
- Cost
@PyTorch @huggingface @wandb @modal @NVIDIAAIDev
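For the gradient-stability item, a minimal sketch of what a live signal could look like in a plain PyTorch loop (illustrative names only, not TraceML's API):

```python
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    """L2 norm over all parameter gradients; sudden spikes hint at instability."""
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().float().norm(2).item() ** 2
    return total ** 0.5

# After loss.backward(), before optimizer.step():
#   print(f"step {step}: grad_norm={global_grad_norm(model):.3f}")
```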
OpenAI shows how gpt-oss can autonomously beat 2048 using reinforcement learning (RL). Training was done locally with Unsloth on an NVIDIA DGX Spark. You can also run it for free on Colab. 🦥 OpenAI DevDay notebook: github.com/openai/gpt-oss…
Tired of “CUDA out of memory” while training? 😩 I built TraceML, a tiny open-source tool that shows GPU & memory usage live while fine-tuning PyTorch models. Now with ⏱️ step timing. github.com/traceopt-ai/tr… @PyTorch #MachineLearning #CUDA
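A hedged sketch of the kind of counters a live memory view can poll between steps; these are standard torch.cuda allocator stats, not TraceML internals:

```python
import torch

def gpu_memory_snapshot(device: int = 0) -> dict:
    """Current PyTorch CUDA allocator counters, in MiB."""
    return {
        "allocated_mib": torch.cuda.memory_allocated(device) / 2**20,
        "reserved_mib": torch.cuda.memory_reserved(device) / 2**20,
        "peak_mib": torch.cuda.max_memory_allocated(device) / 2**20,
    }

# Poll after each optimizer.step() to render a live readout:
#   print(gpu_memory_snapshot())
```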
TraceML: a lightweight tool for real-time memory visibility while training PyTorch models. View live in your terminal:
⚡ CPU, RAM, GPU usage
⚡ Layer-level allocations
⚡ Activation & gradient memory
⚡ Total forward/backward estimates
github.com/traceopt-ai/tr… #PyTorch #DeepLearning #MLOps
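One plausible way to get layer-level numbers like these is PyTorch forward hooks, which fire as each module produces its output. A minimal sketch (illustrative; TraceML's actual mechanism may differ):

```python
import torch

def attach_activation_meters(model: torch.nn.Module):
    """Record each module's output-tensor size as it runs, in MiB."""
    stats, handles = {}, []
    for name, module in model.named_modules():
        def hook(mod, inputs, output, name=name):
            if torch.is_tensor(output):
                stats[name] = output.element_size() * output.nelement() / 2**20
        handles.append(module.register_forward_hook(hook))
    return stats, handles

# usage:
#   stats, handles = attach_activation_meters(model)
#   model(batch)                      # stats: module name -> activation MiB
#   for h in handles: h.remove()      # detach meters when done
```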
🔥 My PyTorch training was slower than expected, so I built a tiny CLI profiler to spot bottlenecks. It shows live: CPU, GPU util + mem, RAM, activation mem, gradient mem. github.com/traceopt-ai/tr… Focus: answer “why is my training slow?” Would love feedback: what to improve or add next?
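For the step-timing part, a sketch of the main subtlety: CUDA kernel launches are asynchronous, so you have to synchronize before reading the wall clock or the measurement lies. Again illustrative, not the tool's code:

```python
import time
import torch

def timed_step(step_fn):
    """Run one training step and return (result, elapsed milliseconds)."""
    torch.cuda.synchronize()   # drain previously queued GPU work
    start = time.perf_counter()
    result = step_fn()         # forward + backward + optimizer.step()
    torch.cuda.synchronize()   # wait for this step's kernels to finish
    return result, (time.perf_counter() - start) * 1e3

# usage:
#   loss, ms = timed_step(lambda: train_step(batch))
```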