
JaneDing

@JaneDing_AI

Data Science junior @Umich · Starting research in vision-language models and pragmatic generation · Exploring how AI communicates like humans

Repost by JaneDing

🔎 New Toolkit Released: VLM-Lens 🔎 github.com/compling-wat/v… In the past 10 months, our lab, together with collaborators @ziqiao_ma @SLED_AI @jzhou_jz, developed a simple and streamlined interpretability toolkit for VLMs supporting 16 state-of-the-art models across the board!

Over the past few months, I’ve heard the same complaint from nearly every collaborator working on computational cogsci + behavioral and mechanistic interpretability: “Open-source VLMs are a pain to run, let alone analyze.” We finally decided to do something about it (thanks…



Excited to attend #COLM2025 in Montréal this week! I’ll be presenting our paper "Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation", in Poster Session 4. Looking forward to meeting many of you there! ☺️ vlm-reg.github.io


Repost by JaneDing

Regrettably can’t attend #COLM2025 due to deadlines, but @JaneDing_AI and @SLED_AI will be presenting our work. :) @JaneDing_AI is an exceptional undergraduate researcher and a great collaborator! Go meet her at COLM if you’re curious about her work on mechanistic…

Vision-Language Models (VLMs) can describe the environment, but can they refer within it? Our findings reveal a critical gap: VLMs fall short of pragmatic optimality. We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) cannot…



Repost by JaneDing

+1 on this! Mixed-effects models are an underrated protocol for behavioral analysis, one AI researchers often overlook. Behavioral data are almost never independent: clustering, repeated measures, and hierarchical structures abound. Mixed-effects models account for these…

I'd highlight the point on generalization: to make a "poor generalization" argument, we need systematic evaluations. A promising protocol is prompting multiple LMs and treating each as an individual in mixed-effects models. arxiv.org/pdf/2502.09589 w/ @tom_yixuan_wang (2/n)
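The protocol above can be sketched in a few lines. This is a minimal, hypothetical example (synthetic data, invented column names like `model_id` and `score`) using `statsmodels`, treating each LM as a "subject" with its own random intercept, which captures the non-independence of repeated responses from the same model:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic behavioral data: 8 hypothetical LMs, each measured repeatedly
# under two prompt conditions. All names and effect sizes are made up.
rng = np.random.default_rng(0)
rows = []
for i in range(8):
    model_id = f"lm_{i}"
    intercept = rng.normal(0.0, 0.5)      # per-model idiosyncratic baseline
    for cond in (0, 1):                   # e.g. baseline vs. manipulated prompt
        for _ in range(30):               # repeated measures within each model
            score = 1.0 + intercept + 0.4 * cond + rng.normal(0.0, 0.3)
            rows.append({"model_id": model_id, "condition": cond, "score": score})
df = pd.DataFrame(rows)

# Random intercept per LM; the fixed effect of `condition` is the
# population-level claim, generalizing across models rather than within one.
fit = smf.mixedlm("score ~ condition", df, groups=df["model_id"]).fit()
print(fit.summary())
```

The point of the grouping term is that a "poor generalization" claim then rests on variance across LMs, not on many correlated samples from a single model.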



Repost by JaneDing

Our study on pragmatic generation is accepted to #COLM2025! Missed the first COLM last year (no suitable ongoing project at the time😅). Heard it's a great place to connect with LM folks, excited to finally join for round two.

Vision-Language Models (VLMs) can describe the environment, but can they refer within it? Our findings reveal a critical gap: VLMs fall short of pragmatic optimality. We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) cannot…



Repost by JaneDing

Thrilled to finally share SimWorld — the result of over a year of the team's work. Simulators have been foundational for embodied AI research (I've worked with AI2Thor, CARLA, Genesis…), and SimWorld pushes this further with photorealistic Unreal-based rendering and scalable…

🚀 Excited to introduce SimWorld: an embodied simulator for infinite photorealistic world generation 🏙️ populated with diverse agents 🤖 If you are at #CVPR2025, come check out the live demo 👇 Jun 14, 12:00-1:00 pm at JHU booth, ExHall B Jun 15, 10:30 am-12:30 pm, #7, ExHall B



Repost by JaneDing

P.S., We are building @GrowAiLikeChild, an open-source community uniting researchers from computer science, cognitive science, psychology, linguistics, philosophy, and beyond. Instead of putting growing up and scaling up into opposite camps, let's build and evaluate human-like AI…


Repost by JaneDing

Vision-Language Models (VLMs) can describe the environment, but can they refer within it? Our findings reveal a critical gap: VLMs fall short of pragmatic optimality. We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) cannot…

