Kaleem (@kaleemcs) · Data Science

Real-time weather transitions on Gaussian Splats, captured on the K1. Huge appreciation to RCS Studios and Volinga. The smooth shift between clear skies, rain, snow, and night shows how flexible environment control becomes when procedural systems meet high-quality 3DGS captures 🚀



🕸️ Introducing SPIDER — Scalable Physics-Informed Dexterous Retargeting! A dynamically feasible, cross-embodiment retargeting framework for BOTH humanoids 🤖 and dexterous hands ✋. From human motion → sim → real robots, at scale. 🔗 Website: jc-bao.github.io/spider-project/ 🧵 1/n



After a year of team work, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3…



🚧 Precision underground. Watch the Lixel L2 Pro put to the test, scanning 250 m of tunnels in just 18 minutes using Multi-SLAM tech.
⚙️ Real-time 3D point clouds
📏 1 cm relative / 3 cm absolute accuracy
🌍 300 m range
Thanks to @geocomchile for proving what XGRIDS can do 🙌



RF-DETR paper is finally on arXiv
- real-time detection with a DINOv2 backbone
- runs neural architecture search (NAS) over about 6,000 architecture variants
- uses weight sharing across all configs
- first real-time segmentation DETR to break past top YOLO results
↓ more
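For readers unfamiliar with weight-sharing NAS: every candidate architecture reads its parameters from slices of one shared tensor, so thousands of variants can be scored without training each from scratch. A minimal sketch of the slicing trick (toy layer and dimensions of my own choosing, not RF-DETR's actual search space):

```python
import torch
import torch.nn as nn

class SharedLinear(nn.Module):
    """One weight matrix sized for the largest config; smaller configs
    read the top-left slice, so every variant shares the same parameters."""
    def __init__(self, max_in: int, max_out: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x: torch.Tensor, out_dim: int) -> torch.Tensor:
        w = self.weight[:out_dim, : x.shape[-1]]   # slice the shared tensor
        return x @ w.t() + self.bias[:out_dim]

layer = SharedLinear(max_in=256, max_out=512)
x = torch.randn(4, 128)
# Two "architecture variants" evaluated with the same underlying weights:
print(layer(x, out_dim=64).shape, layer(x, out_dim=512).shape)
```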



SMF-VO: Direct Ego-Motion Estimation via Sparse Motion Fields
Sangheon Yang, Yeongin Yoon, Hong Mo Jung, Jongwoo Lim
tl;dr: sparse optical flow -> linear and angular velocity; generalized 3D ray-based motion field -> different camera models
arxiv.org/abs/2511.09072
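For intuition on the tl;dr: in the classical instantaneous motion-field model, the flow of a point at normalized coordinates (x, y) with inverse depth 1/Z is linear in the camera's linear velocity v and angular velocity ω, so a handful of sparse flow vectors gives an overdetermined system solvable by least squares. A toy pinhole version of that idea (the paper's generalized ray-based formulation extends this to other camera models):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matrix(x, y, inv_z):
    """Stack the per-point 2x6 motion-field Jacobian mapping the twist
    (vx, vy, vz, wx, wy, wz) to the observed flow (u, v)."""
    rows = []
    for xi, yi, zi in zip(x, y, inv_z):
        rows.append([-zi, 0, xi * zi, xi * yi, -(1 + xi**2), yi])
        rows.append([0, -zi, yi * zi, 1 + yi**2, -xi * yi, -xi])
    return np.asarray(rows)

# Synthetic scene: sparse points with known inverse depth.
n = 50
x, y = rng.uniform(-0.5, 0.5, n), rng.uniform(-0.5, 0.5, n)
inv_z = rng.uniform(0.2, 1.0, n)

twist_true = np.array([0.1, -0.05, 0.3, 0.01, 0.02, -0.01])  # (v, w)
A = flow_matrix(x, y, inv_z)
flow = A @ twist_true + rng.normal(0, 1e-4, 2 * n)  # noisy sparse flow

twist_est, *_ = np.linalg.lstsq(A, flow, rcond=None)
print(np.round(twist_est, 4))  # recovers (v, w) up to noise
```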


Most trackers lose sight of an object once it changes shape… an apple turns into slices, a caterpillar into a butterfly, and the model just gives up. Researchers at Cornell built a new system called Track Any State that does something different: it follows…
Paper: “Tracking and Understanding Object Transformations” (NeurIPS 2025)
Code & Dataset: tubelet-graph.github.io



What aspects of human knowledge do vision models like CLIP fail to capture, and how can we improve them? We suggest models miss key global organization; aligning them makes them more robust. Check out @lukas_mut's work, finally out (in @Nature!?) + our new blogpost! 1/4

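A rough picture of what "aligning" a model to human similarity structure can mean: learn a transform on the embeddings that pulls the model's similarity matrix toward human judgments. The sketch below is a deliberately simplified stand-in with random data, not the method in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 40, 16

# Toy stand-ins: unit-norm "model embeddings" for m concepts, plus a
# hypothetical human similarity matrix (e.g. from odd-one-out judgments).
E = rng.normal(size=(m, d))
E /= np.linalg.norm(E, axis=1, keepdims=True)
S_human = np.tanh(rng.normal(size=(m, m)))
S_human = (S_human + S_human.T) / 2

W = np.eye(d)                              # learnable linear alignment map
lr = 1.0
for _ in range(2000):
    Z = E @ W
    R = Z @ Z.T - S_human                  # similarity residual
    W -= lr * (E.T @ R @ Z) / m**2         # gradient step on ||R||^2

Z = E @ W
print(np.linalg.norm(E @ E.T - S_human),   # similarity error before alignment
      np.linalg.norm(Z @ Z.T - S_human))   # similarity error after alignment
```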


OUGS: Active View Selection via Object-aware Uncertainty Estimation in 3DGS
Haiyi Li, Qi Chen, Denis Kalkofen, Hsiang-Ting Chen
tl;dr: Gaussian parameters -> covariance -> diagonal Fisher Information Matrix -> uncertainty
arxiv.org/abs/2511.09397
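Unpacking the tl;dr chain: in 3DGS each Gaussian's covariance comes from its scale and rotation parameters as Σ = R S Sᵀ Rᵀ, and a diagonal Fisher information approximation is just accumulated squared gradients of the loss per parameter, whose inverse scores how poorly constrained each parameter is. A minimal sketch of both steps (the gradients here are random stand-ins, not renders):

```python
import numpy as np

def covariance_from_params(quat, scale):
    """3DGS covariance: Sigma = R S S^T R^T, from a unit quaternion
    (w, x, y, z) and per-axis scales."""
    w, x, y, z = quat / np.linalg.norm(quat)
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    M = R @ np.diag(scale)
    return M @ M.T

rng = np.random.default_rng(0)
sigma = covariance_from_params(rng.normal(size=4), np.array([0.1, 0.2, 0.05]))

# Diagonal Fisher approximation: accumulate squared per-parameter
# gradients of the loss over observations; the inverse ~ parameter variance.
n_params = 11                        # e.g. position, quat, scale, opacity
fisher_diag = np.zeros(n_params)
for _ in range(100):                 # stand-in for rendering-loss gradients
    g = rng.normal(size=n_params)
    fisher_diag += g * g
uncertainty = 1.0 / (fisher_diag + 1e-8)   # high = poorly constrained
print(sigma.shape, uncertainty.round(4))
```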


With KIRI Engine, turning real-life objects into 3D models is easy. But what comes after that? Lighting, animating, and rendering your models can take a lot of skill, but we want everyone, from beginners to pros, to start creating immediately. So, we added an automatic light…



D-LIO: 6DoF Direct LiDAR-Inertial Odometry based on Simultaneous Truncated Distance Field Mapping
github.com/robotics-upo/D…
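On the data structure in the title: a truncated (signed) distance field stores, per voxel, a clamped distance to the nearest surface plus a confidence weight, and each new scan is fused by weighted averaging. A generic 1D sketch of that update rule (standard TSDF integration, not D-LIO's specific implementation):

```python
import numpy as np

TRUNC = 0.3   # truncation distance in meters

def integrate(tsdf, weight, sdf_obs):
    """Fuse one scan into the voxel grid by weighted averaging.
    sdf_obs holds signed distances from the new scan, NaN where unobserved."""
    seen = ~np.isnan(sdf_obs)
    d = np.clip(sdf_obs[seen], -TRUNC, TRUNC) / TRUNC   # normalize to [-1, 1]
    tsdf[seen] = (tsdf[seen] * weight[seen] + d) / (weight[seen] + 1.0)
    weight[seen] += 1.0
    return tsdf, weight

# Toy 1D grid along a ray: true surface at 1.0 m, voxels every 0.1 m.
centers = np.arange(0.0, 2.0, 0.1)
tsdf, weight = np.zeros_like(centers), np.zeros_like(centers)
for depth in (1.0, 1.02, 0.98):              # three noisy range readings
    sdf = depth - centers                    # + in free space, - behind surface
    sdf[sdf < -TRUNC] = np.nan               # occluded voxels stay unobserved
    tsdf, weight = integrate(tsdf, weight, sdf)

valid = weight > 0
print(centers[valid][np.argmin(np.abs(tsdf[valid]))])  # ~1.0: zero-crossing = surface
```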


Robot Learning from a Physical World Model




Inside a real-time 3D mapping system! 🧭 That's how modern home bots map and localize using only cameras. @maticrobots is using voxel-based neural networks running on NVIDIA Jetson Orin to build real-time, photorealistic 3D maps of the world around its robots. Its autonomy…



This work has been accepted to WACV'26! Preliminary version was presented at CVPR CV4Animal Workshop. arxiv.org/abs/2403.08227




`pip install gsply` for fast Gaussian splat PLY loading in Python. 6.3x faster than plyfile.
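For context on what such a loader returns: a Gaussian splat PLY stores per-point position, opacity, scale, rotation, and spherical-harmonic color coefficients as vertex properties. Below is the slow baseline with plyfile; gsply's own API may differ, so check its README rather than assuming a matching call:

```python
import numpy as np
from plyfile import PlyData  # pip install plyfile

def load_gaussian_ply(path: str) -> dict:
    """Read a 3DGS-format PLY into numpy arrays. Property names follow
    the common gaussian-splatting (INRIA) convention."""
    v = PlyData.read(path)["vertex"]
    return {
        "xyz":     np.stack([v["x"], v["y"], v["z"]], axis=1),
        "opacity": np.asarray(v["opacity"]),
        "scale":   np.stack([v[f"scale_{i}"] for i in range(3)], axis=1),
        "rot":     np.stack([v[f"rot_{i}"] for i in range(4)], axis=1),
        "sh_dc":   np.stack([v[f"f_dc_{i}"] for i in range(3)], axis=1),
    }

# splats = load_gaussian_ply("scene.ply")
# print(splats["xyz"].shape)  # (num_gaussians, 3)
```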


OVO
Official repository of "Open-Vocabulary Online Semantic Mapping for SLAM"
github.com/tberriel/OVO?t…
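The open-vocabulary part of systems like this generally works by attaching vision-language embeddings (e.g. CLIP) to map elements, so an arbitrary text prompt can be matched against the map by cosine similarity at query time. A toy sketch of that query step (random stand-in embeddings; OVO's actual pipeline fuses real CLIP features online):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 512                                   # embedding dim (CLIP-like)

# Map segments, each carrying a fused vision-language embedding.
seg_emb = rng.normal(size=(100, D))
seg_emb /= np.linalg.norm(seg_emb, axis=1, keepdims=True)

def query(text_emb: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Return indices of map segments best matching a text embedding."""
    text_emb = text_emb / np.linalg.norm(text_emb)
    sims = seg_emb @ text_emb             # cosine similarity
    return np.argsort(sims)[::-1][:top_k]

# In a real system text_emb would come from a text encoder, e.g.
# CLIP's encode_text("a chair"); here it's a random placeholder.
print(query(rng.normal(size=D)))
```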


📢ProcGen3D: Learning Neural Procedural Graphs for Image-to-3D Reconstruction @xinyi092298 learns neural procedural graphs to generate high-fidelity 3D - MCTS-guided sampling maintains consistency with the input image, even from real images! Check it out: xzhang-t.github.io/project/ProcGe…



4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos
@mengqi_guo, Bo Xu, Yanyan Li, @gimhee_lee
tl;dr: joint optimization of motion mask and scene reconstruction
arxiv.org/abs/2511.05229
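The tl;dr pattern in its simplest form: keep a soft per-pixel motion mask as a learnable variable, down-weight the static reconstruction loss where the mask flags motion, and add a sparsity penalty so the mask cannot explain everything away. A 1D toy of that joint objective (illustrative only, not 4D3R's actual losses or parameterization):

```python
import numpy as np

N = 200
t = np.linspace(0, 6, N)
observed = np.sin(t)                           # static background
observed[90:100] += 1.5                        # a moving object corrupts a span

# Static model: low-frequency basis, deliberately too smooth to fit the bump.
B = np.stack([np.sin((k + 1) * np.pi * t / 6) for k in range(4)]
             + [np.cos(k * np.pi * t / 6) for k in range(4)], axis=1)
c = np.zeros(B.shape[1])                       # scene coefficients (learnable)
mask_logit = np.zeros(N)                       # motion-mask logits (learnable)
lr, lam = 0.05, 0.1

for _ in range(3000):
    m = 1 / (1 + np.exp(-mask_logit))          # m -> 1 means "dynamic pixel"
    r = B @ c - observed
    # Per-pixel objective: (1 - m) * r**2 + lam * m
    c -= lr * B.T @ (2 * (1 - m) * r) / N
    mask_logit -= lr * (lam - r * r) * m * (1 - m)

m = 1 / (1 + np.exp(-mask_logit))
print(m[:90].mean().round(2), m[90:100].mean().round(2))  # low static, high dynamic
```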

