Zaid Khan

@codezakh

@uncnlp with @mohitban47 working on automating env/data generation + program synthesis formerly @allenai @neclabsamerica

Pinned

How can an agent reverse engineer the underlying laws of an unknown, hostile & stochastic environment in “one life”, without millions of steps + human-provided goals / rewards? In our work, we: 1️⃣ infer an executable symbolic world model (a probabilistic program capturing…


Zaid Khan reposted

🚨 Excited to announce "One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration" --> (1) Our agent can infer/reverse engineer the laws of an unknown, stochastic environment from a single, unguided episode -- without requiring…


Thanks @_akhaliq for sharing our work! We show how an agent can infer a world model as a program for an unknown, stochastic environment from one life and use the resulting world model for planning + simulating future states of the environment! For those interested, please feel free…

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration



Zaid Khan reposted

🚨 Excited to share new work on inferring symbolic world models from observations! OneLife can infer world models in stochastic, complex environments by proposing rules via LLM and reweighting code-based environment laws from observations collected in a single interaction…


Zaid Khan reposted

🚨 Introducing OneLife, a new framework to learn world dynamics as an executable probabilistic program, from a single, unguided episode in a stochastic, complex environment. ✨Highlights: ➡️ Inference only routes through relevant laws, solving scaling challenges in complex state…


Zaid Khan reposted

🚨 Excited to share our new work ✨ OneLife ✨, which investigates how an agent can infer executable symbolic world models 🌐 from a single unguided trajectory in a stochastic environment. I’m especially excited about our planning + evaluation contributions: 1️⃣ We support…


Zaid Khan reposted

🚨 New Paper Alert! Introducing SciVideoBench, a comprehensive benchmark for scientific video reasoning!

🔬SciVideoBench:

1. Spans Physics, Chemistry, Biology & Medicine with authentic experimental videos.

2. Features 1,000 challenging MCQs across three reasoning types:…

Zaid Khan reposted

We welcome Prof. Mohit Bansal (UNC Chapel Hill) as a keynote speaker at #CODS2025! Director of UNC’s MURGe-Lab, he works in multimodal generative models, reasoning agents & faithful language generation. He is an AAAI Fellow, PECASE and multiple best paper awardee.

Zaid Khan reposted

🚨 Thrilled to introduce Self-Improving Demonstrations (SID) for Goal-Oriented Vision-and-Language Navigation, a scalable paradigm where navigation agents learn to explore by teaching themselves.

➡️ Agents iteratively generate and learn from their own successful trajectories
➡️…

Zaid Khan reposted

Thanks for the shoutout! 🇨🇦 I’ll be at #COLM2025 presenting two papers:
GenerationPrograms (Attribution): Poster Session 4, Oct 8th, 4:30 PM
QAPyramid (Summarization Eval): Poster Session 5, Oct 9th, 11:00 AM
I’m also on the industry job market for research scientist roles.…

🚨 Check out our awesome students/postdocs' papers at #COLM2025 and say hi to them (several are on the job market or hiring) -->

-- Archiki, David are on the post-PhD job market!
-- Elias finished his postdoc & is now faculty at UT-Austin CS and looking to admit PhD students!…

Zaid Khan reposted

❗️Self-evolution is quietly pushing LLM agents off the rails. ⚠️ Even perfect alignment at deployment can gradually forget human alignment and shift toward self-serving strategies. Over time, LLM agents stop following values, imitate bad strategies, and even spread misaligned…

🚨 Introducing ATP — Alignment Tipping Process! 🔥 Beware! Self-Evolution is gradually pushing LLM Agents off the rails! Even perfect alignment at deployment can gradually forget human alignment and shift toward self-serving strategies. #AI #LLM #Agents #SelfEvolving #Alignment


Zaid Khan reposted

I am attending #COLM2025 🇨🇦 this week to present our work on:
Unit Test Generation: 📅 Oct 8th (Wed), 4:30 PM, #79
RAG with conflicting evidence: 📅 Oct 9th (Thu), 11 AM, #71
PS: I'm on the industry job market for RS roles, so you can reach me via DM or in-person to chat! 😄

