
TraceOpt

@traceml_ai

Tracing and Optimizing ML workloads

Training models feels like flying blind: OOMs, idle GPUs, hidden bottlenecks. Would live observability actually help optimize training? Curious what you'd want to see live: multi-GPU view, throughput, gradient stability, cost. @PyTorch @huggingface @wandb @modal @NVIDIAAIDev


TraceOpt reposted

OpenAI shows how gpt-oss can autonomously beat 2048 using reinforcement learning (RL). Training was done locally with Unsloth on NVIDIA DGX Spark. You can also do it free on Colab. 🦥 OpenAI DevDay notebook: github.com/openai/gpt-oss…


Tired of “CUDA out of memory” while training? 😩 I built TraceML, a tiny open-source tool that shows GPU & memory usage live while fine-tuning PyTorch models. Now with ⏱️ step timing. github.com/traceopt-ai/tr… @PyTorch #MachineLearning #CUDA


TraceML: a lightweight tool for real-time PyTorch training memory visibility. View live in your terminal: ⚡ CPU, RAM, GPU usage ⚡ Layer-level allocations ⚡ Activation & gradient memory ⚡ Total forward/backward estimates github.com/traceopt-ai/tr… #PyTorch #DeepLearning #MLOps
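The core idea behind live memory visibility is simple: sample memory on a background thread while the training loop runs, instead of inspecting it after an OOM. The sketch below illustrates that sampling pattern only; it is not TraceML's actual implementation. TraceML tracks CUDA and process memory, but this sketch uses stdlib `tracemalloc` (Python-heap only) and a stand-in loop so it runs anywhere without a GPU. The `LiveMemorySampler` name and its interface are made up for illustration.

```python
import threading
import time
import tracemalloc

# Hypothetical sketch: a background thread records Python-heap usage at a
# fixed interval while a (stand-in) training loop runs. The real tool would
# sample torch.cuda.memory_allocated() / process RSS instead.
class LiveMemorySampler:
    def __init__(self, interval_s=0.01):
        self.interval_s = interval_s
        self.samples = []          # (timestamp, current_bytes) pairs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            current, _peak = tracemalloc.get_traced_memory()
            self.samples.append((time.monotonic(), current))
            self._stop.wait(self.interval_s)   # sleep, but wake early on stop

    def __enter__(self):
        tracemalloc.start()
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        tracemalloc.stop()

# Stand-in "training steps": allocate a transient buffer each iteration,
# mimicking activation memory that appears and disappears per step.
with LiveMemorySampler() as sampler:
    for _ in range(20):
        buf = [0.0] * 50_000
        time.sleep(0.005)
        del buf

print(f"collected {len(sampler.samples)} samples, "
      f"peak ~{max(b for _, b in sampler.samples)} bytes")
```

The same pattern extends naturally to per-layer visibility: PyTorch forward/backward hooks can take a memory reading around each module call, which is how layer-level attribution is typically done.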


🔥 My PyTorch training was running slower than expected, so I built a tiny CLI profiler to spot bottlenecks. It shows live: CPU util, GPU util + memory, RAM, activation memory, gradient memory. github.com/traceopt-ai/tr… Focus: answering "why is my training slow?" Would love feedback: what should I improve or add next?
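Answering "why is my training slow?" usually starts with step timing: wrap each phase of the loop and see which one dominates. Here is a minimal stdlib sketch of that idea, in the spirit of the step-timing feature above but not its actual API; the `timed` helper and the phase names are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical sketch: accumulate wall-clock time per named phase of a
# training step, then report phases sorted by total time spent.
timings = defaultdict(list)

@contextmanager
def timed(phase):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase].append(time.perf_counter() - start)

# Stand-in training loop; sleeps mimic the relative cost of each phase.
for step in range(5):
    with timed("data_loading"):
        time.sleep(0.002)
    with timed("forward_backward"):
        time.sleep(0.005)
    with timed("optimizer_step"):
        time.sleep(0.001)

for phase, ts in sorted(timings.items(), key=lambda kv: -sum(kv[1])):
    print(f"{phase:>16}: {sum(ts) * 1e3:6.1f} ms total over {len(ts)} steps")
```

Reading the report is the diagnosis: if `data_loading` dominates, the GPU is starving (more DataLoader workers or prefetching may help); if `forward_backward` dominates, the model itself is the bottleneck. Note that timing real CUDA work needs `torch.cuda.synchronize()` before each reading, since kernel launches are asynchronous.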

