UIUC NLP

@uiuc_nlp

Natural Language Processing research group at The University of Illinois Urbana-Champaign @IllinoisCS @UofIllinois

UIUC NLP reposted

🏠🤖 Are our household robotic agents actually safe? BEAT introduces the first visual backdoor attack on MLLM-based embodied agents, where a single object (e.g., 🔪 or 🏺) can silently flip a home robot from normal behavior into harmful multi-step actions. 🚨Check out our work!

🤖 Feeling excited about the future of household robotic agents (i.e., embodied agents)? You should also consider their safety! 🔪Meet BEAT: the first visual backdoor attack on MLLM-based embodied agents. 🧵 1/7



UIUC NLP reposted

Had a great time at the @uiuc_nlp large group on “Sparse Autoencoders: Discoveries, Limitations, and the Bridge between Experiment and Theory” Thanks to Prof. @hengjinlp and @MasterJeongK for kindly hosting, and to everyone for the engaging discussion! 🔗arxiv.org/pdf/2503.24277

UIUC NLP reposted

What if your policy could reason and think dynamically, especially about uncertainty, enabling better real-world behavior? ⚡️Introducing EBT-Policy, an instantiation of Energy-Based Transformers for Policies! TLDR: - EBT-Policy broadly outperforms Diffusion Policy in both…

UIUC NLP reposted

Many of my students cannot attend EMNLP in person due to visa problems, but the super rising star Cheng Qian @qiancheng1231 will be there presenting multiple papers. Please drop by our posters and talk to him!


UIUC NLP reposted

@ZhiruoW's research compares AI agents vs humans across real work tasks (data analysis, engineering, design, writing). Key findings: 👉Agents are 88% faster & 90-96% cheaper 👉BUT produce lower quality work, often fabricate data to mask limitations 👉Agents code everything,…

Agents are joining us at work -- coding, writing, design. But how do they actually work, especially compared to humans? Their workflows tell a different story: They code everything, slow down human flows, and deliver low-quality work fast. Yet when teamed with humans, they shine…



UIUC NLP reposted

Today, we’re overjoyed to have a 25th Anniversary Reunion of @stanfordnlp. So happy to see so many of our former students back at @Stanford. And thanks to @StanfordHAI for the venue!

UIUC NLP reposted

🤖💬AI agents can be easily persuaded (like Anthropic’s Claudius often giving discounts). 🤔Previous study on persuasion has been exclusively on text-only modality. We wonder: are AI agents more susceptible when presented with multimodal content? Introducing MMPersuade, a…

UIUC NLP reposted

World Model Reasoning for VLM Agents (NeurIPS 2025, Score 5544) We release VAGEN to teach VLMs to build internal world models via visual state reasoning: - StateEstimation: what is the current state? - TransitionModeling: what is next? MDP → POMDP shift to handle the partial…


UIUC NLP reposted

Multimodal conversational agents struggle to follow complex policies, which also impose a fixed computational cost. We ask: 👉 How can we achieve stronger policy-following behavior without having to include policies in-context? 🌐: mikewangwzhl.github.io/TriMPI/ 🧵1/3

UIUC NLP reposted

Thrilled to announce that I'll be joining UIUC CS @siebelschool as an Assistant Professor in Spring 2026! 📢 I’m looking for Fall '26 PhD students who are interested in the intersection of Software Engineering and AI, especially in LLM4Code and Code Agents. Please drop me an…


UIUC NLP reposted

🚀 Introducing BroRL: Scaling Reinforcement Learning via Broadened Exploration When step-scaling hits a plateau, scale rollouts, not steps. BroRL takes reinforcement learning beyond saturation—reviving stalled models by expanding exploration with large-N rollouts. 👇 (1/n)

UIUC NLP reposted

🚨 New preprint out! 🚨 In "BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues", we work towards a core AI challenge: how can agents follow complex, conversational instructions in a dynamic 3D world? To this end, we introduce an enhanced task…

UIUC NLP reposted

🚨 New paper alert at COLM 2025! 🚨 An interesting open problem for those into Sparse Autoencoders (SAEs): "Top-K activation constrains L0 (the number of non-zeros), but how do we obtain E[L0]?" This was even the very first limitation noted in @nabla_theta’s recent paper. (1/8)

UIUC NLP reposted

Please note that #EMNLP2025 volunteer notifications have been sent. If you haven’t received yours, please check your spam folder or contact the chairs at [email protected] as some email addresses were entered incorrectly in the form


UIUC NLP reposted

Very excited to see Tinker released by @thinkymachines! Even more thrilled that Search-R1 is featured as the tool-use application in Tinker’s recipe 👇 🔗 github.com/thinking-machi… When we first built Search-R1, we opened up everything—data, recipes, models, code, logs—and kept…

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!…


UIUC NLP reposted

*Human-Like* Creativity is perhaps the most out-of-reach task for modern LLMs. I'm super excited to share our new work evaluating LLMs with a creativity framework! We develop a synthetic creativity task to measure LLMs' capabilities in generating novel, creative, combinations,…

1/N Large language models (LLMs) have been widely adopted for closed-ended tasks like reasoning, but can they truly be creative? 📚 Excited to announce our new work — Combinatorial Creativity: A New Frontier in Generalization Abilities. 📝 Paper arxiv.org/abs/2509.21043


UIUC NLP reposted

Ever felt like your GUI agents are dragging their feet? 🧐The culprit? Crunching through endless streams of screenshots, especially in those marathon long-horizon tasks. Thrilled to unveil ⭐️ GUI-KV ⭐️— our plug-and-play powerhouse that taps into the spatial saliency within…

UIUC NLP reposted

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmarking scores. 📄 Paper: arxiv.org/pdf/2509.19736 💻 Code: github.com/SalesforceAIRe…

UIUC NLP reposted

🧠 Get $20 to watch videos + share what sparks your curiosity! TIMAN Lab @ UIUC is studying how people seek info while watching content. 🖥️ 30–60 min Zoom session ✅ 18+, fluent in English 💸 $20 reward Sign up: docs.google.com/forms/d/1OcnrM… #PaidStudy #AI


UIUC NLP reposted

Accepted as NeurIPS2025 Spotlight! Existing large multimodal models (LMMs) have very poor visual understanding and reasoning over part-level attributes and affordances (only 5.9% gIoU). We developed novel part-centric LMMs to address these challenges arxiv.org/pdf/2505.20759
