Ajitesh Shukla
@ajitesh_shukla7
Student. Love solving the hardest math problems. LLMs, mathematical research (geometric topology, differential geometry), quantum computing. Lord Krishna is the God of Math.
Enjoyed this lecture, it was a real treat: youtu.be/we3i5VuoPWk?si… (YouTube: Lecture 37: Introduction to SASS & GPU Microarchitecture)
Can AI invent new math? A new paper from DeepMind and renowned mathematician Terence Tao shows how. Using AlphaEvolve, the team merges LLM-generated ideas with automated evaluation to propose, test, and refine mathematical algorithms. In tests on 67 problems across analysis,…
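The loop described here (an LLM proposes candidate programs, an automated evaluator scores them, the best survive and get refined) can be sketched in a few lines. This is not AlphaEvolve's implementation: `llm_propose` below is a hypothetical stand-in for an LLM call, and the toy objective exists only to make the loop runnable.

```python
# Minimal sketch of a propose-evaluate-refine loop in the spirit of the setup
# described above. NOT the DeepMind implementation: `llm_propose` is a
# hypothetical stand-in for an LLM editing a candidate, and the evaluator is a
# toy objective chosen only so the example runs end to end.
import random

def llm_propose(parent: list[float]) -> list[float]:
    """Hypothetical proposer: here we simply perturb the parent candidate."""
    return [x + random.gauss(0.0, 0.1) for x in parent]

def evaluate(candidate: list[float]) -> float:
    """Automated evaluator: higher is better (toy objective -sum(x^2))."""
    return -sum(x * x for x in candidate)

def evolve(n_generations: int = 200, population_size: int = 16) -> list[float]:
    population = [[random.uniform(-1, 1) for _ in range(8)]
                  for _ in range(population_size)]
    for _ in range(n_generations):
        scored = sorted(population, key=evaluate, reverse=True)
        survivors = scored[: population_size // 4]            # keep the best quarter
        children = [llm_propose(random.choice(survivors))     # refine survivors
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=evaluate)

best = evolve()
print("best score:", evaluate(best))
```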
Cosandal, Ulukus: Optimal Source Coding of Markov Chains for Real-Time Remote Es... arxiv.org/abs/2511.02803 arxiv.org/pdf/2511.02803 arxiv.org/html/2511.02803
Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the #EMNLP2025 Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗
I'm recruiting two #ComputerScience #PhD students in #MachineLearning and #AI4Science at @UAlbany @SUNY starting Fall 2026! Ad: chong-l.github.io/hiring.html
🥳🎉Sana-video inference code has been integrated into diffusers! Thanks to @lawrence_cjs @RisingSayak and the team for making it happen. huggingface.co/docs/diffusers…
The training/inference code and checkpoints are released. Feel free to try them! github.com/NVlabs/Sana
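For reference, loading a diffusers-integrated pipeline generally looks like the sketch below. The checkpoint ID is a placeholder and the exact output field depends on the pipeline class; check the linked diffusers docs and the NVlabs repo for the real Sana-video model name. Only the generic `DiffusionPipeline.from_pretrained` auto-loading API is assumed here.

```python
# Hedged sketch of using a diffusers-integrated video pipeline. The model ID
# below is a placeholder; see the linked diffusers docs / NVlabs repo for the
# actual Sana-video checkpoint name and recommended settings.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "org/sana-video-checkpoint",          # placeholder model ID
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

result = pipe(prompt="a red panda drinking tea, cinematic lighting")
# Video pipelines typically expose generated frames; the exact attribute is
# documented per pipeline class.
export_to_video(result.frames[0], "sana_video.mp4", fps=16)
```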
Presenting Interactive Training today (led by @wtzhang0820)! Tune models like cooking: adjust the "heat" when the loss smells off 😄 🕟4:30-6pm • Hall C3 • Demo Session 5 Come talk to us! #EMNLP2025
Every time I watch models train, I wish I could tune LR on the fly. It's like cooking: we adjust the dial when the food smells off. We built Interactive Training to do that, turning loss monitoring into interaction. Paper👉huggingface.co/papers/2510.02… Led by @wtzhang0820 w/ Yang Lu
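The underlying idea of tuning the learning rate while a run is live can be illustrated without any special tooling: have the training loop re-read a target LR from a small control file each step. The sketch below is only an illustration of that idea, not the Interactive Training system; the control-file protocol and toy model are assumptions made for this example.

```python
# Minimal sketch of "tune the LR while the run is live": the loop re-reads a
# control file each step and applies the value to the optimizer. Illustration
# only, not the Interactive Training implementation; the control-file protocol
# and the toy model are assumptions.
import os
import torch
import torch.nn as nn

LR_FILE = "lr.txt"  # hypothetical control file; edit it mid-run to change LR

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

def maybe_update_lr(optimizer: torch.optim.Optimizer) -> None:
    """If the control file exists and parses, push its value into every
    parameter group so the change takes effect on the next step."""
    if os.path.exists(LR_FILE):
        try:
            new_lr = float(open(LR_FILE).read().strip())
        except ValueError:
            return
        for group in optimizer.param_groups:
            group["lr"] = new_lr

for step in range(1000):
    maybe_update_lr(optimizer)
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(step, loss.item(), optimizer.param_groups[0]["lr"])
```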
I’ve been so busy trying to write my thesis and finish journal revisions that I’m glad #EMNLP2025 forced me to take a break and scroll papers. Help me out by posting interesting papers!! Self-promotion welcome!
New paper drop! 🎙️ We beat GPT-5 with a 36B model 🤯🤯 Not just better at completing real-world complex tasks (software engineering: locating code; and deep research), but also substantially better at proactively asking clarifying questions when necessary…
AI agents are supposed to collaborate with us to solve real-world problems, but can they really? Even the most advanced models can still give us frustrating moments when working with them deeply. We argue that real-world deployment requires more than productivity (e.g., task…
Comments welcome! With @RobinSFWalters and @yuqirose. “Symmetry in Neural Network Parameter Spaces” arxiv.org/abs/2506.13018
Parameter space symmetry describes transformations of parameters that leave the loss unchanged.
There’s lots of symmetry in neural networks! 🔍 We survey where they appear, how they shape loss landscapes and learning dynamics, and applications in optimization, weight space learning, and much more. ➡️ Symmetry in Neural Network Parameter Spaces arxiv.org/abs/2506.13018
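A concrete example of such a symmetry: permuting the hidden units of a two-layer MLP (and permuting the adjacent weight matrices accordingly) leaves the network function, and hence the loss, unchanged. Below is a quick numpy check of this, written for illustration and unrelated to the survey's code.

```python
# Quick check of one parameter-space symmetry: permuting hidden units of a
# two-layer MLP leaves the function (and therefore any loss) unchanged.
# Toy example for illustration; not code from the survey.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(64, 16)), rng.normal(size=64)   # input -> hidden
W2, b2 = rng.normal(size=(1, 64)), rng.normal(size=1)     # hidden -> output

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU hidden layer
    return W2 @ h + b2

perm = rng.permutation(64)             # permute the 64 hidden units
W1p, b1p = W1[perm], b1[perm]          # permute rows of W1 and entries of b1
W2p = W2[:, perm]                      # permute columns of W2 to match

x = rng.normal(size=16)
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))
print("outputs identical under hidden-unit permutation")
```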
🤗Will present our #EMNLP2025 paper this morning! TLDR: Beyond KV Cache: New Insights on LLM Sparsity. This paper offers not just an efficient inference framework, but a new theoretical lens to understand how information flows inside LLMs. Come & talk to us if you are interested!
I think it's pretty wild that there's still no (publicly known) model larger than the Switch Transformer at 1.6T params, which was: - trained in 2020, i.e. 5 years ago - open-weights - by Barret, Liam, and Noam, what a line-up!
Apple just leaked the size of Gemini 3 Pro - 1.2T params
this is what's called "Noam's touch" btw
We ran 1,680 tournaments (25,200 rounds) to evaluate 8 frontier models. Claude Sonnet 4.5 tops the leaderboard, but no model wins across all arenas! GPT-5 dominates Poker. o3 crushes Halite. Claude owns Core War. Every arena reveals different strengths.
🧭 Siren’s Song in the AI Ocean — our survey on LLM hallucination will be presented at #EMNLP2025! We map the space of: • Hallucination phenomena • Detection & explanation • Mitigation strategies & future directions 📍 Poster Session 7 (Hall C) 🗓️ Fri, Nov 7 · 14:00–15:30…
Were you longing for counterintuitive and intriguing results? I have a surprising discovery on core principles of reinforcement learning that directly scales to high-dimensional MDPs! ✨NeurIPS Spotlight✨ Check out: Counteractive Reinforcement Learning #NeurIPS2025…
Don't sleep on PipelineRL -- this is one of the biggest jumps in compute efficiency of RL setups that we found in the ScaleRL paper (also validated by Magistral & others before)! What's the problem PipelineRL solves? In RL for LLMs, we need to send weight updates from trainer to…
In-flight weight updates have gone from a “weird trick” to a must for training LLMs with RL in the last few weeks. If you want to understand the on-policy and throughput benefits, here’s the CoLM talk @DBahdanau and I gave: youtu.be/Z1uEuRKACRs (YouTube: Pipeline RL: RL training speed through the roofline)
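A toy way to picture in-flight weight updates: the trainer publishes new weights to a shared slot whenever a step finishes, and generation workers pick up the latest version between generations instead of blocking on a synchronization barrier. The sketch below illustrates only that producer/consumer pattern, with strings standing in for weights; it is not the PipelineRL codebase.

```python
# Toy illustration of in-flight weight updates: a trainer thread publishes new
# weight versions while generation workers keep producing rollouts and swap in
# the newest weights as soon as they notice them, never waiting at a barrier.
# Pattern sketch only, not the PipelineRL implementation.
import threading
import time

latest = {"version": 0, "weights": None}   # shared slot; newest weights win
lock = threading.Lock()
stop = threading.Event()

def trainer():
    """Pretend trainer: publishes a new weight version after every 'step'."""
    for version in range(1, 6):
        time.sleep(0.5)                    # one training step
        with lock:
            latest["version"] = version
            latest["weights"] = f"weights-v{version}"
        print(f"[trainer] published weights v{version}")
    stop.set()

def generator(worker_id: int):
    """Pretend generation worker: keeps generating and swaps in the newest
    weights whenever it sees a higher version number."""
    current = 0
    while not stop.is_set():
        with lock:
            if latest["version"] != current:
                current = latest["version"]
                print(f"[gen {worker_id}] now sampling with v{current}")
        time.sleep(0.2)                    # pretend this is token generation

threads = [threading.Thread(target=trainer)] + [
    threading.Thread(target=generator, args=(i,)) for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```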