Xavi Giró

@DocXavi

Applied scientist at @amazonscience Barcelona, Catalonia. Made at @la_upc & @columbia. Promoting @dlbcnai. Opinions my own.

Science & Technology

Badalona, Catalonia

imatge.upc.edu/web/people/xav…

Joined July 2012

6K Posts · 3K Followers · 2K Following

Xavi Giró reposted

Nando de Freitas

@NandoDF

Nov 9

Congratulations @TencentHunyuan for the best image generation model in the world. It comes with a fantastic paper that describes the data pipeline in great detail. More detail on the RL part would be great 😉

Hunyuan

@TencentHunyuan

Oct 5

🏆HunyuanImage 3.0 has taken the #1 spot in @arena, ranked as both the top overall and top open-source Text-to-Image model. This achievement came just one week after release and followed a week at the top of Hugging Face trend list. Big thanks to the community for the incredible…

Xavi Giró reposted

Soumith Chintala

@soumithchintala

Nov 6

Leaving Meta and PyTorch I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: Didn't want to be doing PyTorch forever, seemed like the perfect time to transition right after I got back from a long leave and the project built itself around me. Eleven years…

Xavi Giró reposted

Pau Rodríguez

@prlz77

Nov 5

We have unlocked parallel training of non-linear RNNs! > LSTM entered the chat 🔥

Federico Danieli

@FedericoDa40495

Nov 4

ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for LLMs. For years, we’ve given RNNs up for doomed and looked at the Transformer as the LLM, but we just needed better math 📄arxiv.org/abs/2510.21450 💻github.com/apple/ml-parar…

GitHub - apple/ml-pararnn

Source: github.com

Xavi Giró reposted

WACV

@wacv_official

Nov 3

Check out the list of #WACV2026 workshops at wacv.thecvf.com/Conferences/20…. Many have paper submission deadlines in the next month!

Xavi Giró reposted

Tal Linzen

@tallinzen

Nov 3

Good thread! I'd also add that faculty positions are often decently paid, give you the freedom to do work that benefits society, and, if everything works out, you can keep the job for life so you don't need to worry about saving 50% of your income and all of that FIRE nonsense.

Deb Raji

@rajiinio

Oct 29

Even before @mmitchell_ai recently raised this discussion, I've had conversation after conversation with students & new grads struggling with this exact dilemma. I want to help! Here's a live thread of AI-related opportunities for those looking to do good & make (enough) money:

Xavi Giró reposted

JaesungHuh

@huh_jaesung

Nov 3

✨The source code for our paper “Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling” is now released! ✨ Given a video, its audio, and a cast list for each episode, the model can automatically generate subtitles with speaker names.

Xavi Giró

@DocXavi

Oct 29

It was Spring 2015, and I was teaching deep learning for the first time at @UPCTelecos Barcelona. Only two students asked questions that day. 10 years later, they are both scientists at @GoogleDeepMind. As they say at @AmazonScience: “learn and be curious”.

Xavi Giró reposted

Stefano Ermon

@StefanoErmon

Oct 29

Tired of chasing references across dozens of papers? This monograph distills it all: the principles, intuition, and math behind diffusion models. Thrilled to share!

Chieh-Hsin (Jesse) Lai ✈️ NeurIPS

@JCJesseLai

Oct 29

Tired of going back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core…

Xavi Giró reposted

Mila - Institut québécois d'IA

@Mila_Quebec

Oct 27

Congratulations to @Yoshua_Bengio, founder and scientific advisor of Mila, who has become the first researcher in the world to surpass one million citations on Google Scholar, the leading platform for academic and scientific research. A remarkable milestone that highlights the…

Mila_Quebec's tweet card. Yoshua Bengio, the most-cited researcher in the world has become the first living scientist to surpass one million citations on Google Scholar.

AI Researcher Yoshua Bengio Becomes First Living Scientist to Reach 1 Million Citations on Google...

Source: mila.quebec

LawZero - LoiZéro

@LawZero_

Oct 27

Our Founder and Scientific Director @Yoshua_Bengio has become the first living researcher to surpass 1 million citations on Google Scholar, a testament to the foundational and global impact of his work. Congratulations Yoshua!

Xavi Giró

@DocXavi

Oct 28

This is my annual post claiming that the program of #DLBCN 2025 this year is even better than the previous one.

Deep Learning Barcelona Symposium

@dlbcnai

Oct 28

The 9 spotlight talks of #DLBCN 2025 have been announced on our site: sites.google.com/view/dlbcn2025… The selection was based on both quality & diversity criteria.

Xavi Giró reposted

Joan Serrà

@serrjoa

Oct 17

And yet another paper within our series of works on music matching. Now... 🥁🥁🥁 Music sampling! Complementing fingerprinting, version ID, and lyrics matching systems, detecting music sampling is key given modern music practice. SOTA results with a number of clever tricks.

Alain Riou

@howariou

Oct 17

Eminem sampled Aerosmith, 50 Cent sampled Nina Simone, everybody sampled Chic... Many great songs sampled existing ones! Detecting this is the topic of our latest paper with @serrjoa at @SonyAI Barcelona 😎 tl;dr: multi-track dataset + few tricks = +18% boost over SOTA 🚀 1/N

Xavi Giró reposted

Deep Learning Barcelona Symposium

@dlbcnai

Oct 27

UPC will award an honoris causa doctorate to Oriol Vinyals (@OriolVinyalsML) from @GoogleDeepMind. telecos.upc.edu/ca/noticies/la… Oriol was a keynote speaker in #DLBCN 2019:

Xavi Giró reposted

Aishwarya Agrawal

@aagrawalAA

Oct 23

When receiving the Everingham Prize yesterday, I gave a short presentation on the progress of vision-language research over the last decade. Slides (with transcript of my speech in the notes section): docs.google.com/presentation/d… Video of the talk: photos.app.goo.gl/bMrE8hHSiiN98z… Students…

Dhruv Batra ✈️ NeurIPS

@DhruvBatra_

Oct 23

As part of the award ceremony, @aagrawalAA presented a recap of vision-and-language research over the last decade — solved problems, progress, and open challenges for multimodal LLMs. Solved: robustness to paraphrasing and false premises, OCR, world-knowledge based reasoning.…

Xavi Giró reposted

Lei Li

@_TobiasLee

Oct 24

👋Say Hi to MiMo-Audio! Our BREAKTHROUGH in general-purpose audio intelligence. 🎯 Scaling pretraining to 100M+ hours leads to EMERGENCE of few-shot generalization across diverse audio tasks! 🔥 Post-trained MiMo-Audio-7B-Instruct: • crushes benchmarks: SOTA on MMSU, MMAU,…

Xavi Giró reposted

Dwarkesh Patel

@dwarkesh_sp

Oct 17

The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self…

Xavi Giró

@DocXavi

Oct 21

🚀 Cut Your Image Review Costs with Smart AutoQA! ✨ The magic formula: As long as your AutoQA precision beats your GenAI accuracy, you're saving money and time. arxiv.org/abs/2510.16179

Xavi Giró reposted

Sander Dieleman

@sedielem

Oct 14

In my blog post on latents for generative modelling, I pointed out that representation learning and reconstruction are two separate tasks (§6.3), which autoencoders try to solve simultaneously. Separating them makes sense. It opens up a lot of possibilities, as this work shows!

Saining Xie

@sainingxie

Oct 14

three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)

Xavi Giró reposted

Sam Buchanan

@_sdbuchanan

Oct 2

We wrote a book about representation learning! It’s fully open source, available and readable online, and covers everything from theoretical foundations to practical algorithms. 👷‍♂️ We’re hard at work updating the content for v2.0, and would love your feedback and contributions

Xavi Giró reposted

Deep Learning Barcelona Symposium

@dlbcnai

Sep 15

José M. Álvarez from NVIDIA will be the keynote speaker of #DLBCN 2025. Dr Álvarez leads the Autonomous Vehicle Applied Research Group at NVIDIA, CA, USA (@NVIDIAAI). Watch his interview with @neurofregides in 2024: youtube.com/shorts/x6HBanJ…

Xavi Giró reposted

clem 🤗

@ClementDelangue

Sep 1

If you think @Apple is not doing much in AI, you're getting blindsided by the chatbot hype and not paying enough attention! They just released FastVLM and MobileCLIP2 on @huggingface. The models are up to 85x faster and 3.4x smaller than previous work, enabling real-time vision…