
Explainable Machine Learning
@ExplainableML
Institute for Explainable Machine Learning @HelmholtzMunich and Interpretable and Reliable Machine Learning group @TU_Muenchen
2 papers accepted at NeurIPS 2025 🎉
🔹 Manipulating Feature Visualizations with Gradient Slingshots
🔹 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework

Reward hacking is a challenge when fine-tuning few-step diffusion models. Direct fine-tuning on rewards can create artifacts that game metrics while degrading visual quality. We propose Noise Hypernetworks as a theoretically grounded solution, inspired by test-time optimization.
💫 After four PhD years on all things multimodal, pre- and post-training, I’m super excited for a new research chapter @GoogleDeepMind 🇨🇭! Biggest thanks to @zeynepakata and @OriolVinyalsML for all the guidance, support, and incredibly eventful and defining research years ♥️!


