Marija Stanojevic

@mstanojevic118

ML, NLP, ML4Health, MultiModality, and STEM geek. Travel enthusiast. marija-stanojevic@github.io

I truly enjoyed giving this lecture on Machine Unlearning and was positively surprised by the interest of the audience! Hope to do it again next year!

🔍 Join @mstanojevic118 from EudAImonia Science for an insightful talk on Machine Unlearning in LLMs and MMs. Discover how to enhance AI ethics, privacy, and compliance. 🤖 Register now: lnkd.in/g6EEsEc



Marija Stanojevic reposted

New Research: a lot of talk today about "what happens" inside a language model, since they spend the exact same amount of compute on each token, regardless of difficulty. we touch on this question on our new theory paper, Do Language Models Plan for Future Tokens?


Marija Stanojevic reposted

Introducing Jamba, our groundbreaking SSM-Transformer open model! As the first production-grade model based on Mamba architecture, Jamba achieves an unprecedented 3X throughput and fits 140K context on a single GPU. 🥂Meet Jamba ai21.com/jamba 🔨Build on @huggingface


While companies are trying to find talent they don't have, their interviews mostly test how well a candidate's knowledge aligns with the talent they already have.


When can I buy this for my husband? 😂

Folding clothes with $250 robot arms. I've added another motor to improve mobility and extend the reach. The CAD files and the code are public at: github.com/AlexanderKoch-… (video at 2x speed)



Marija Stanojevic reposted

a tiny bit of a cat is out now; we train our own large (medium) sized LM on our own proprietary data from scratch ourselves at @PrescientDesign and @genentech . very easy in my opinion, and @keunwoochoi hates it whenever i say this 😂

i'm giving an introductory talk about LLMs for drug discovery at #ASCPT2024 pre-conference soon.



Marija Stanojevic reposted

Foundation Agent: a roadmap to build generally capable embodied AI that acts skillfully across many worlds, virtual or real. Project GR00T, the Humanoid robot foundation model, is a cornerstone for Foundation Agent. It's the North Star, the next grand challenge in our quest for…


Insane! So exciting!

Livestream of @Neuralink demonstrating “Telepathy” – controlling a computer and playing video games just by thinking



Marija Stanojevic reposted

If you know Torch, I think you can code for GPU now with OpenAI's Triton language. We made some puzzles to help you rewire your brain. Starts easy, but gets quickly to fun modern models like FlashAttention and GPT-Q. Good luck! github.com/srush/Triton-P…


Great resource!

Holy! The Machine Learning Engineering Open Book repo has just crossed 9k stars on github! That's insane as I have started writing it ~6 month ago! github.com/stas00/ml-engi… Thank you so much for your vote of confidence! It's super encouraging to continue investing into this…



Marija Stanojevic reposted

Introducing a new, fully open robotics dataset!
- 76k episodes
- 564 unique scenes
- 100 contributors
- 13 labs/institutions
- 3 continents
droid-dataset.github.io
A short 🧵 on the backstory


Marija Stanojevic reposted

Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning. The GR00T model will enable a robot to understand multimodal…


While everyone is talking about the NVIDIA Blackwell GPU, I am equally impressed by the speed and quality of their AI software development. With support for all kinds of tasks and data, including preprocessing, training, finetuning, and postprocessing, they make it easy for anyone to add ML.


Marija Stanojevic reposted

EMNLP 2024 invites the submission of long and short papers featuring substantial, original, and unpublished research on empirical methods for Natural Language Processing. More info at: 2024.emnlp.org/calls/main_con… #EMNLP2024


Marija Stanojevic reposted

👀👀👀 huggingface.co/xai-org/grok-1

Why does this stop at age 84, when that is the average life expectancy in some countries? It would also be interesting to see how it compares with data from other countries.

What do people die from at different ages? I hadn’t seen a satisfying chart that showed causes of death in different age groups all at once, so I just made it myself. Turns out, in the US, “external causes” are a majority of deaths until ~age 40 scientificdiscovery.dev/p/20-so-many-g…



Marija Stanojevic reposted

We live in such strange times. Apple, a company famous for its secrecy, published a paper with staggering amount of details on their multimodal foundation model. Those who are supposed to be open are now wayyy less than Apple. MM1 is a treasure trove of analysis. They discuss…


Marija Stanojevic reposted

I'm biased but I think this paper is pretty cool too arxiv.org/abs/2311.13647 (at ICLR this year)


Marija Stanojevic reposted

The proceedings of our workshop are available here: ceur-ws.org/Vol-3649/!


Marija Stanojevic reposted

Here's details on Meta's 24k H100 Cluster Pods that we use for Llama3 training. * Network: two versions RoCEv2 or Infiniband. * Llama3 trains on RoCEv2 * Storage: NFS/FUSE based on Tectonic/Hammerspace * Stock PyTorch: no real modifications that aren't upstreamed * NCCL with…

