
Satya Mallick
@LearnOpenCV
CEO, http://OpenCV.org. Course Director, http://OpenCV.org/Courses. Entrepreneur. Ph.D. (Computer Vision & Machine Learning). Author: http://LearnOpenCV.com
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…

📢VideoRAG: Redefining Long-Context Video Comprehension In this week’s deep dive, we explore another interesting approach for performing RAG on videos. VideoRAG, a groundbreaking framework that brings RAG to the world of extremely long videos. Unlike traditional LVLMs that…
San Diego Investors👇🏽 AI Beyond the Buzz: Smarter Money, Leaner Work, Safer Decisions in 2025 by Satya Mallick @LearnOpenCV Come on over to San Diego meeting this Saturday 11th October at 9am hosted by AAII San Diego!! <Details in link below> Want to know how AI is disrupting…

The more you show your 💜, the more we keep it coming every week :) +600 people have signed up for today's @SensAIHackademy hands-on workshop, starting in 2 hours, with Christoph Spinger from Ballee and @LearnOpenCV from @opencvlive ! And here is another interesting workshop…

📢The Ultimate Guide To VLM Evaluation Metrics, Datasets, And Benchmarks Evaluating Vision-Language Models (VLMs) is more than just checking accuracy. How would you know if your model understands a scene or is just hallucinating? Our new, comprehensive guide on LearnOpenCV…

The 3rd edition of my book Deep Learning with Python is being printed right now, and will be in bookstores within 2 weeks. You can order it now from Amazon or from Manning. This time, we're also releasing the whole thing as a 100% free website. I don't care if it reduces book…

📢 Getting Started with VLM on Jetson Nano Tiny Vision Language Models (VLMs) like Moondream2, LiquidAI’s LFM2-VL, Apple’s FastVLM, and Huggingface’s SmolVLM2 are bringing vision-language capabilities to the edge. In this tutorial, LearnOpenCV demonstrates how to deploy and run…
📢New Post Alert: 📙VLM on Edge Devices: Worth the Hype or Just a Novelty? The rise of Vision Language Models (VLMs) has been meteoric but can they really run effectively on edge devices? Our latest post is first in the series of experiments that we will continue for VLMs on…
📢AnomalyCLIP: Harnessing CLIP for Weakly-Supervised Video Anomaly Recognition In this week’s deep dive, we explore AnomalyCLIP, the first method to adapt CLIP’s vision–language latent space for Video Anomaly Recognition (VAR) under weak supervision. We break down how it learns…
📢AI for Video Understanding: From Content Moderation to Summarization In this blog post, we explore how to build a practical pipeline for AI-powered video understanding. We look at two main applications: video content moderation using CLIP and Gemini, and video summarization…
📢DINOv3: Scaling Self-Supervised Learning for Vision Foundation Models (Meta AI) DINOv3 is a next-generation vision foundation model trained purely with self-supervised learning. It introduces innovations that allow robust dense feature learning at scale with models reaching 7B…
📢☑️Video-RAG: Training-Free Retrieval for Long-Video LVLMs In this week’s deep dive, we implement Video-RAG as a training-free, single-pass pipeline and integrate it with LLaVA-Video-7B (Qwen2, 32K context), without APE - to keep things reproducible on today’s stacks. We enable…
Created this video from a single image using Grok. Quite impressive.
Huge computer science result: A Tsinghua professor JUST discovered the fastest shortest path algorithm for graphs in 40yrs. This improves on Turing award winner Tarjan’s O(m + nlogn) with Dijkstra’s, something every Computer Science student learns in college.

One word: relentless. just in the past two weeks, we’ve shipped: 🌐 Genie 3 - the most advanced world simulator ever 🤔 Gemini 2.5 Pro Deep Think available to Ultra subs 🎓 Gemini Pro free for uni students & $1B for US ed 🌍 AlphaEarth - a geospatial model of the entire planet…
📢Object Detection and Spatial Understanding with VLMs ft. Qwen2.5-VL Object Detection used to mean bounding boxes and pre-trained classes. Now? You can upload an image and ask: “What brand are the sneakers the person on the left is wearing?” Welcome to the world of…
Huge thanks to all the open source projects that've made a lot of the tech we rely on in the world possible: Linux Git FFmpeg PyTorch & TensorFlow Apache & Nginx MySQL, PostgreSQL, SQLite Chromium & Firefox GCC & LLVM Docker & Kubernetes Also, all the open-weight LLMs... and…