🤖 👨🏽‍💻 Hacker-in-residence @voxel51 | ❤️ open source deep learning | VLMs | Visual AI | Learn. Hack. Write. Teach. Repeat. 🪯

harpreet

@DataScienceHarp

i just created a dataset of visual ai papers being presented at neurips this year. you can check out the dataset here: huggingface.co/datasets/Voxel… what can you do with this? good question. find out at this virtual event i'm presenting at this week:…


harpreet reposted

No one teaches this, but this is what really happens when you hit `run` on an LLM. User → API → Engines → Multi-GPU → CUDA → Hardware I mapped every layer (100+ components) of the LLM Inference Stack so you can finally see the full picture. Full blogpost coming soon!

SAS and R

Programming language you learned but never used again is...?



btw, two events coming up, all about document visual ai. nov 6: voxel51.com/events/visual-… nov 14: voxel51.com/events/documen…

i just integrated 6 visual document retrieval models into fiftyone. they're all available as remote source zoo models now. here's what they do:



harpreet reposted

we just updated the model comparison on our blog for you 🫡 added Chandra, OlmOCR-2, Qwen3-VL and their averaged OlmOCR score!

A pixel is worth a thousand tokens

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language…


