Alexander H. Liu
@alex_h_liu
Ph.D. Student @MIT_CSAIL
The Voxtral tech-report is up! arxiv.org/abs/2507.13264 We release these models with a permissive Apache 2.0 license. Feedback is welcome! We have a lot more cooking, this is just the beginning.
💡Bridging speech, sound, & music representations with one universal model? We introduce USAD ✅ 📚 Distills knowledge from domain-specific SSL models 🎯 Matches expert models across speech/audio/music tasks 📄 arxiv.org/abs/2506.18843 🧑‍💻 huggingface.co/MIT-SLS/USAD-B…
Highly recommended!!! (Happy to chat if you’re curious about the experience with the team)
Our team at NVIDIA is continuously looking for highly motivated interns to work on intelligence in audio understanding and synthesis. Please reach out if you would like to collaborate with us!
Turns out speech self-supervised learning techniques can be generalized to sign language! Great work led by @Shester_G (he’s looking for a PhD opportunity this year!)
Ever imagined a foundational model for sign language?! Introducing SHuBERT (Sign Hidden Unit BERT)! With SHuBERT, we get SOTA results on ASL video understanding tasks compared to task-specific models from Google DeepMind, Meta, and Microsoft, while using less compute! 🧵 1/9
💚 Big shoutout to the #FUGATTO team for making this release happen — and to cats like Coltrane and Xenakis, who envisioned a world where "saxophones bark and howl." Together, artists and researchers, let’s build a GPT-like future for audio generation! fugatto.github.io
Q: Why can't we get GPT-level understanding from language models on speech? A: We need better speech tokens! In SyllableLM, *we beat @kyutai_labs Moshi on semantic understanding in 70 hours of training* by making speech tokens at 5 frames/s With @PuyuanPeng, David Harwath 1/n
Synthetic labels are amazing! Do you need an audio labelling machine? Audio Flamingo checkpoints are available on github.com/NVIDIA/audio-f… ...and pre-training with synthetic labels from Audio Flamingo gives large improvements in text-to-audio models arxiv.org/abs/2406.15487
Beautiful work by Alex Liu on generative pre-training for speech with Flow Matching. I just realized it's one of the main components in AudioBox! arxiv.org/abs/2310.16338
Recent years have witnessed significant developments in audio codec models (an overview figure from arxiv.org/abs/2402.13236). We introduce Codec-SUPERB (arxiv.org/abs/2402.13071) to boost fair and comprehensive comparison. Leaderboard: codecsuperb.com
Lin-Shan: if no one asked you to attend the closing ceremony, you’re probably not getting the award (and laughed out loud)
Prof. Lin-Shan Lee remembers all his students… amazing…
LTU and LTU-AS codes are released. As usual, it is a full release including training and inference code, pretrained checkpoint, and the datasets. We hope these would be useful. Check github.com/YuanGongND/ltu.
I'll have a keynote talk at ASRU'23! asru2023.org/motion.asp?sit… See you soon in Taiwan! Actually, ASRU was the first conference that rejected my first-author paper (in 2003). But 20 years later, I was given the opportunity to be a keynote speaker, haha.
We summarize our lab's activities toward speech foundation models at wavlab.org/activities/202…. We have several other ongoing activities, and they are selected papers presented at ASRU.
🚀 Our upgraded audio large language model LTU-2 is now hosted on HuggingFace Space at lnkd.in/eJDpsBY4. Please give it a try and let us know what you think 😀.
🗣️ Whisper is great for speech recognition, but it only recognizes ~100 languages. What if it wasn't trained on the language that you speak? Happy to introduce my #INTERSPEECH2023 paper comparing Whisper and XLS-R for adaptation to unseen languages! arxiv.org/abs/2305.12606