Jitendra MALIK

@JitendraMalikCV

(5/5)As it is becoming abundantly clear that the current hyper-scaling paradigm for LLMs is not going to lead to AGI / superintelligence, perhaps it is time to return to a “cooperate” framework? Publish research results openly, and build on one another's published work.


(4/5)Sadly, this is no longer true. OpenAI’s ChatGPT success was accompanied by its decision to no longer reveal the details of its training strategy – all the aspects needed for scientific replication. Since then, more and more labs have gone “dark” and we…


(3/5)2016-2021 was a wonderful period for AI research precisely because the leading labs at the time – FAIR, DeepMind, Google, OpenAI – were all publishing freely and building on each other’s results. If the transformer paper in 2017 had been kept secret inside Google…


(2/5)However, if just one lab publishes and the other doesn’t, the lab that doesn’t publish (“defect” in prisoner’s dilemma jargon) will have an advantage. Assuming rational players, both labs will choose not to publish. To get to a cooperative outcome as a Nash…
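To make the game-theoretic point concrete, here is a minimal Python sketch of the two-lab publishing game as a prisoner's dilemma. The payoff numbers are purely illustrative assumptions (chosen only to satisfy the standard PD ordering, not taken from the thread); the sketch simply checks that mutual "withhold" is the only pure-strategy Nash equilibrium, even though mutual "publish" would leave both labs better off.

```python
# Toy prisoner's dilemma between two labs: "publish" (cooperate) vs "withhold" (defect).
# Payoff values are illustrative assumptions, picked only to match the standard PD ordering.

from itertools import product

ACTIONS = ["publish", "withhold"]

# PAYOFF[(action of lab 1, action of lab 2)] = (payoff to lab 1, payoff to lab 2)
PAYOFF = {
    ("publish",  "publish"):  (3, 3),   # both share results: the field moves fastest
    ("publish",  "withhold"): (0, 5),   # lab 2 free-rides on lab 1's openness
    ("withhold", "publish"):  (5, 0),   # lab 1 free-rides on lab 2's openness
    ("withhold", "withhold"): (1, 1),   # both go dark: everyone is worse off
}

def best_response(opponent_action, player):
    """Action that maximizes this player's payoff, holding the opponent's action fixed."""
    def payoff_of(action):
        pair = (action, opponent_action) if player == 0 else (opponent_action, action)
        return PAYOFF[pair][player]
    return max(ACTIONS, key=payoff_of)

def nash_equilibria():
    """Pure-strategy profiles where each lab's action is a best response to the other's."""
    return [
        (a1, a2)
        for a1, a2 in product(ACTIONS, ACTIONS)
        if a1 == best_response(a2, 0) and a2 == best_response(a1, 1)
    ]

print(nash_equilibria())  # [('withhold', 'withhold')] -- mutual defection
```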


We can now 3dfy any object from a single real-world image. This has been a holy grail in computer vision for many decades. Try it here (you can upload your own images) aidemos.meta.com/segment-anythi… and you can read the paper here ai.meta.com/.../sam-3d-3df… Enjoy!


Again the power of tactile sensing and multi-finger hands comes through. This is the future of dexterous manipulation!

🤖 What if a humanoid robot could make a hamburger from raw ingredients—all the way to your plate? 🔥 Excited to announce ViTacFormer: our new pipeline for next-level dexterous manipulation with active vision + high-resolution touch. 🎯 For the first time ever, we demonstrate…



Angjoo Kanazawa @akanazawa and I taught CS 280, graduate computer vision, this semester at UC Berkeley. We found a combination of classical and modern CV material that worked well, and are happy to share our lecture material from the class. cs280-berkeley.github.io Enjoy!


Enjoy watching a humanoid walking around UC Berkeley. It only looks inebriated :-)

Our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)



I'm happy to post course materials for my class at UC Berkeley "Robots that Learn", taught with the outstanding assistance of @ToruO_O. Lecture videos at youtube.com/playlist?list=… Lecture notes & other course materials at robots-that-learn.github.io


Happy to share these exciting new results on video synthesis of humans in movement. Arguably, these establish the power of having explicit 3D representations. Popular video generation models like Sora don't do that, making it hard for the resulting video to be 4D consistent.

I’ve dreamt of creating a tool that could animate anyone with any motion from just ONE image… and now it’s a reality! 🎉 Super excited to introduce updated 3DHM: Synthesizing Moving People with 3D Control. 🕺💃3DHM can generate human videos from a single real or synthetic human…



Touché, Sergey!

Lots of memorable quotes from @JitendraMalikCV at CoRL, the most significant one of course is: “I believe that Physical Intelligence is essential to AI” :) I did warn you Jitendra that out of context quotes are fair game. Some liberties taken wrt capitalization.



Autoregressive modeling is not just for language; it can equally be used to model human behavior. This paper shows how.
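As a rough illustration of the general idea (a toy sketch, not the paper's method), behavior can be discretized into a vocabulary of action tokens and modeled autoregressively, predicting each token from the ones before it. The action names, demo sequences, and simple first-order count-based model below are all made-up assumptions; a real system would use a learned sequence model (e.g., a transformer) over pose or motion tokens.

```python
# Toy sketch of autoregressive behavior modeling: discretize behavior into action
# tokens, estimate next-token probabilities from demonstrations, then roll out a
# trajectory one token at a time, each step conditioned on the history so far.

import numpy as np

ACTIONS = ["stand", "walk", "reach", "grasp", "place"]  # hypothetical action vocabulary
A = len(ACTIONS)
rng = np.random.default_rng(0)

# Hypothetical training data: action-token sequences extracted from demonstrations.
demos = [
    ["stand", "walk", "reach", "grasp", "place", "stand"],
    ["stand", "walk", "walk", "reach", "grasp", "place"],
    ["walk", "reach", "grasp", "place", "stand", "walk"],
]

idx = {a: i for i, a in enumerate(ACTIONS)}

# First-order autoregressive model p(a_t | a_{t-1}), estimated by counting
# consecutive pairs in the demos, with add-one smoothing.
counts = np.ones((A, A))
for seq in demos:
    for prev, nxt in zip(seq, seq[1:]):
        counts[idx[prev], idx[nxt]] += 1
transition = counts / counts.sum(axis=1, keepdims=True)

def rollout(start="stand", steps=8):
    """Sample a behavior trajectory token by token, conditioning on the last token."""
    traj = [start]
    for _ in range(steps):
        p = transition[idx[traj[-1]]]
        traj.append(ACTIONS[rng.choice(A, p=p)])
    return traj

print(rollout())
```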

