
Ajitesh Shukla

@ajitesh_shukla7

Student. Love to solve the hardest math problems. LLMs, Mathematical Research (Geometric Topology, Differential Geometry), Quantum Computing. Lord Krishna is God of Math

Ajitesh Shukla reposted

Can AI invent new math? A new paper from DeepMind and renowned mathematician Terence Tao shows how. Using AlphaEvolve, the team merges LLM-generated ideas with automated evaluation to propose, test, and refine mathematical algorithms. In tests on 67 problems across analysis,…
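For a feel of the loop described above, here is a minimal sketch of a propose-evaluate-refine search, with a random mutation standing in for the LLM proposer; `propose`, `evaluate`, and the toy objective are illustrative assumptions, not AlphaEvolve's actual components.

```python
import random

# Toy propose -> evaluate -> refine loop, in the spirit of the AlphaEvolve
# setup described above. The "proposer" here is a random perturbation; in
# AlphaEvolve it is an LLM editing candidate algorithms.

def propose(parent):
    # Hypothetical stand-in for LLM-generated ideas: perturb the candidate.
    return [g + random.gauss(0, 0.1) for g in parent]

def evaluate(candidate):
    # Automated evaluation: a score the loop can maximize. Toy objective:
    # drive every coordinate toward 1.
    return -sum((g - 1.0) ** 2 for g in candidate)

best = [0.0] * 4
for _ in range(2000):
    child = propose(best)
    if evaluate(child) > evaluate(best):
        best = child  # keep the refinement only if the evaluator prefers it

print(best)  # ends up near [1, 1, 1, 1]
```

The real system proposes programs and scores them with problem-specific evaluators, but the skeleton, generate candidates, score them automatically, keep the improvements, is the same.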


Ajitesh Shukla reposted

Cosandal, Ulukus: Optimal Source Coding of Markov Chains for Real-Time Remote Es... arxiv.org/abs/2511.02803 arxiv.org/pdf/2511.02803 arxiv.org/html/2511.02803


Ajitesh Shukla reposted

Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the #EMNLP2025 Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗


Ajitesh Shukla reposted

I'm recruiting two #ComputerScience #PhD students in #MachineLearning and #AI4Science at @UAlbany @SUNY starting Fall 2026! Ad: chong-l.github.io/hiring.html


Ajitesh Shukla reposted

🥳🎉Sana-video inference code has been integrated into diffusers! Thanks to @lawrence_cjs @RisingSayak and the team for making it happen. huggingface.co/docs/diffusers…


Ajitesh Shukla reposted

Presenting Interactive Training today (led by @wtzhang0820)! Tune models like cooking: adjust the "heat" when the loss smells off 😄 🕟4:30-6pm • Hall C3 • Demo Session 5 Come talk to us! #EMNLP2025

Every time I watch models train, I wish I could tune LR on the fly. It's like cooking: we adjust the dial when the food smells off. We built Interactive Training to do that, turning loss monitoring into interaction. Paper👉huggingface.co/papers/2510.02… Led by @wtzhang0820 w/ Yang Lu
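A minimal sketch of the underlying mechanic, in plain PyTorch rather than the Interactive Training framework itself: the learning rate lives in the optimizer's param groups, so a monitoring loop can change it mid-run without rebuilding anything. The toy model and the loss threshold below are made up for illustration.

```python
import torch

# Toy model and optimizer (hypothetical setup, not the paper's code).
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def set_lr(optimizer, lr):
    # Mutate the learning rate of every param group in place, mid-training.
    for group in optimizer.param_groups:
        group["lr"] = lr

for step in range(1000):
    x = torch.randn(32, 10)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # "Adjust the heat when the loss smells off": an interactive controller
    # could watch the loss curve and dial the LR down on a spike.
    if loss.item() > 10.0:
        set_lr(opt, 1e-3)
```

Interactive Training wraps this kind of control in live loss monitoring and interaction; the sketch only shows why on-the-fly adjustment is mechanically possible.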



Ajitesh Shukla reposted

I’ve been so busy trying to write my thesis and finish journal revisions that I’m glad #EMNLP2025 forced me to take a break and scroll papers. Help me out by posting interesting papers!! Self-promotion welcome!


Ajitesh Shukla reposted

New paper drop! 🎙️ We beat GPT-5 with a 36B model 🤯🤯 Not just better in terms of completing real-world complex tasks: software engineering (locating code) and deep research. But also substantially better in terms of proactively asking for clarifying questions when necessary…

AI agents are supposed to collaborate with us to solve real-world problems, but can they really? Even the most advanced models can still give us frustrating moments when working with them deeply. We argue that real-world deployment requires more than productivity (e.g., task…



Ajitesh Shukla reposted

Comments welcome! With @RobinSFWalters and @yuqirose. “Symmetry in Neural Network Parameter Spaces” arxiv.org/abs/2506.13018


Ajitesh Shukla reposted

Parameter space symmetry describes transformation of parameters that leaves the loss unchanged.
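A concrete instance (a toy NumPy check, not code from the paper): permute the hidden units of a two-layer ReLU network and apply the inverse permutation to the outgoing weights. The parameters change, but the function, and therefore any loss computed from it, does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-layer MLP f(x) = W2 @ relu(W1 @ x), with made-up sizes.
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))
x = rng.normal(size=4)

relu = lambda z: np.maximum(z, 0)
f = lambda A, B: B @ relu(A @ x)

# Permutation symmetry: W1 -> P @ W1 reorders the hidden units,
# W2 -> W2 @ P.T undoes the reordering on the way out.
P = np.eye(8)[rng.permutation(8)]
assert np.allclose(f(W1, W2), f(P @ W1, W2 @ P.T))
# Same outputs on every input => identical loss at two different points
# in parameter space.
```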


Ajitesh Shukla reposted

There’s lots of symmetry in neural networks! 🔍 We survey where they appear, how they shape loss landscapes and learning dynamics, and applications in optimization, weight space learning, and much more. ➡️ Symmetry in Neural Network Parameter Spaces arxiv.org/abs/2506.13018


Ajitesh Shukla reposted

🤗Will present our #EMNLP2025 paper this morning! TLDR: Beyond KV Cache: New Insights on LLM Sparsity. This paper offers not just an efficient inference framework, but a new theoretical lens to understand how information flows inside LLMs. Come & talk to us if you are interested!


Ajitesh Shukla reposted

I think it's pretty wild that there's still no (publicly known) model larger than the Switch Transformer at 1.6T params, which was:
- trained in 2020, i.e. 5y ago
- open-weights
- by Barret, Liam, and Noam, what a line-up!


Apple just leaked the size of Gemini 3 Pro - 1.2T params



Ajitesh Shukla reposted

this is what's called "Noam's touch" btw


We ran 1,680 tournaments (25,200 rounds) to evaluate 8 frontier models. Claude Sonnet 4.5 tops the leaderboard, but no model wins across all arenas! GPT-5 dominates Poker. o3 crushes Halite. Claude owns Core War. Every arena reveals different strengths.



Ajitesh Shukla reposted

🧭 Siren’s Song in the AI Ocean — our survey on LLM hallucination will be presented at #EMNLP2025!

We map the space of:
• Hallucination phenomena
• Detection & explanation
• Mitigation strategies & future directions

📍 Poster Session 7 (Hall C)
🗓️ Fri, Nov 7 · 14:00–15:30…


Ajitesh Shukla reposted

Were you longing for counterintuitive and intriguing results? I have a surprising discovery on core principles of reinforcement learning that directly scales to high-dimensional MDPs! ✨NeurIPS Spotlight✨ Check out: Counteractive Reinforcement Learning #NeurIPS2025


Ajitesh Shukla reposted

Don't sleep on PipelineRL -- this is one of the biggest jumps in compute efficiency of RL setups that we found in the ScaleRL paper (also validated by Magistral & others before)! What's the problem PipelineRL solves? In RL for LLMs, we need to send weight updates from trainer to…


In-flight weight updates have gone from a “weird trick” to a must to train LLMs with RL in the last few weeks. If you want to understand the on-policy and throughput benefits here’s the CoLM talk @DBahdanau and I gave: youtu.be/Z1uEuRKACRs
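A toy concurrency sketch of what "in-flight" means here (hypothetical, not PipelineRL's implementation): the trainer keeps publishing new weight versions while the generator keeps decoding, so a sequence in progress can span several policy versions instead of generation stalling for every update.

```python
import threading
import time

# Shared "weights": here just a version counter behind a lock.
latest = {"version": 0}
lock = threading.Lock()

def trainer():
    for v in range(1, 6):
        time.sleep(0.1)  # pretend to run an optimizer step
        with lock:
            latest["version"] = v  # publish updated weights in-flight

def generator(out):
    for step in range(20):
        with lock:
            v = latest["version"]  # pick up the freshest weights mid-sequence
        out.append((step, v))
        time.sleep(0.03)  # pretend to decode one token

tokens = []
t = threading.Thread(target=trainer)
g = threading.Thread(target=generator, args=(tokens,))
t.start(); g.start(); t.join(); g.join()
print(tokens)  # token steps tagged with the policy version that produced them
```

The throughput win is that the generator never idles waiting for a synchronized weight swap; the on-policy question is exactly the mixing of versions within one sequence that the tags above make visible.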

[Linked talk: "Pipeline RL: RL training speed through the roofline" on YouTube]