Pablo Montalvo

@m_olbap

ML Engineer @HuggingFace. Previously ML R&D @ Rakuten. Computer vision and NLP mixer, ex-physicist. Dice thrower, dreamer, learner. He/him. Usually friendly :)

Pablo Montalvo reposted

A few days ago, @thinkymachines released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right. Naturally, we decided to reproduce the results with TRL and release a guide
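Not the guide itself, but a minimal sketch of the kind of TRL + PEFT setup it describes. Model, dataset, and hyperparameters below are illustrative placeholders, not the exact values from the reproduction:

```python
# Minimal LoRA fine-tune with TRL + PEFT, in the spirit of the guide.
# Model name, dataset, and hyperparameters are illustrative.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

peft_config = LoraConfig(
    r=256,                        # high rank: one of the "configured right" knobs
    lora_alpha=16,
    target_modules="all-linear",  # apply LoRA to all linear layers, incl. MLPs
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(output_dir="qwen-lora", learning_rate=1e-4),  # ~10x a typical full-FT LR
    peft_config=peft_config,
)
trainer.train()
```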

Pablo Montalvo reposted

You need to try this tool! 🫡 My colleague @m_olbap built an interactive HF Space to explore the modular support of open models in transformers over time 👀 You’ll spot things like 🦙 llama defining many models or which ones could be modular next


Pablo Montalvo reposted

Why is your KV so small? 🤏 In continuous batching, if you increase the max number of tokens per batch, you must decrease the memory allocated for your cache. In transformers, we make sure they are perfectly balanced (as all things should be). No matter how big your model is🦠🐋
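A back-of-envelope sketch of that balance, using an illustrative Llama-style geometry (the exact accounting inside transformers' scheduler may differ):

```python
# KV-cache / batch-size trade-off in continuous batching, back of the envelope.

def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    # 2x for the K and V tensors, stored at every layer.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

# Llama-3-8B-ish geometry: 32 layers, 8 KV heads (GQA), head_dim 128, fp16.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)

cache_budget = 8 * 1024**3  # say 8 GiB of GPU memory left for the cache
max_cache_tokens = cache_budget // per_token

# Every token admitted into the running batch occupies one cache slot, so the
# scheduler must keep (tokens in flight) <= max_cache_tokens: raising the max
# tokens per batch only works if the cache can still hold all of them.
print(f"{per_token} bytes/token -> {max_cache_tokens:,} cache slots in 8 GiB")
```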


Pablo Montalvo reposted

You have no idea what attention looks like 🤥 Many talk about attention like it's simple, but few know how it actually works. Even basic stuff like shapes and prefill / decode are not that easy to grasp. Good thing HF is cooking a blogpost to help you out 🫂
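A shape-level sketch of the prefill/decode difference, with illustrative sizes and PyTorch's built-in SDPA (not any particular model):

```python
import torch
import torch.nn.functional as F

batch, heads, head_dim, prompt_len = 1, 8, 64, 10

# Prefill: attend over the whole prompt at once; Q and K/V share a seq length.
q = torch.randn(batch, heads, prompt_len, head_dim)
k = torch.randn(batch, heads, prompt_len, head_dim)
v = torch.randn(batch, heads, prompt_len, head_dim)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # (1, 8, 10, 64): one output per prompt token

# Decode: a single new query token, but K/V cover prompt + generated tokens
# (the KV cache), so Q and K/V seq lengths no longer match. No causal mask is
# needed: the newest token may attend to everything before it.
k_cache = torch.cat([k, torch.randn(batch, heads, 1, head_dim)], dim=2)
v_cache = torch.cat([v, torch.randn(batch, heads, 1, head_dim)], dim=2)
q_new = torch.randn(batch, heads, 1, head_dim)
out = F.scaled_dot_product_attention(q_new, k_cache, v_cache)
print(out.shape)  # (1, 8, 1, 64): the single new token's output
```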

Ever wondered how models actually see an image? I've been playing with visualizations of patch extraction, token layouts, and how they affect predictions. Planning a short visual deep dive comparing how different models process images. Would love thoughts before I go on.
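For the curious, a minimal sketch of the patch extraction step being visualized, ViT-style (224x224 image, 16x16 patches; sizes are illustrative):

```python
# How an image becomes a sequence of tokens: cut into patches, then embed.
import torch

image = torch.randn(1, 3, 224, 224)  # (batch, channels, height, width)
patch = 16

# Cut the image into non-overlapping 16x16 patches and flatten each one.
patches = image.unfold(2, patch, patch).unfold(3, patch, patch)  # (1, 3, 14, 14, 16, 16)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(1, 14 * 14, 3 * patch * patch)
print(patches.shape)  # (1, 196, 768): 196 tokens of dimension 768

# A real model projects each flattened patch with a linear layer (equivalently
# a Conv2d with kernel_size = stride = patch) to get the patch embeddings.
embed = torch.nn.Conv2d(3, 768, kernel_size=patch, stride=patch)
tokens = embed(image).flatten(2).transpose(1, 2)
print(tokens.shape)  # (1, 196, 768): the token layout the transformer sees
```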


Pablo Montalvo reposted

A quick update on the future of the `transformers` library! In order to provide a source of truth for all models, we are working with the rest of the ecosystem to make the modeling code the standard. A joint effort with vLLM, llama.cpp, SGLang, MLX, Qwen, GLM, Unsloth, Axolotl,…


Pablo Montalvo reposted

The Transformers library is undergoing its largest pivot to date 🙌 It now cements its role as the central model definition, irrespective of the backend and runner. One ground truth to bring more reliability across the ecosystem. Why is this important?
