juand_r_nlp's profile picture. CS PhD student at UT Austin in #NLP
Interested in language, reasoning, semantics and cognitive science. 

You can also find me over at the other site 🦋

Juan Diego Rodríguez (he/him)

@juand_r_nlp

CS PhD student at UT Austin in #NLP Interested in language, reasoning, semantics and cognitive science. You can also find me over at the other site 🦋

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

If you're 13 right now here's what you should do: Spend all your time reading philosophy, sci fi, poetry, and history. Go to museums and look at every damn thing in there. Watch all the classic films and maybe every single movie made in the 90s. Learn how to write prose,…

When we sat down with @alexandr_wang at Meta Connect 2025, he shared his advice for young people: “If you’re 13 right now, you should spend all your time vibe coding. This is the Bill Gates, Mark Zuckerberg moment. The people who grow up with these tools will have an immense…



🤣

For maximum alpha, complete with fighting for princesses,

laserboat999's tweet image. For maximum alpha, complete with fighting for princesses,


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

If you are interested in winning an ACL paper award, it's a very smart move is to write something with @kmahowald. Historically, he has won at least one each year for as long as I can remember.

Delighted Sasha's work using mech interp to study complex syntax constructions won an Outstanding Paper Award at EMNLP! And delighted the ACL community continues to recognize unabashedly linguistic topics like filler-gaps, and the huge potential for LMs to inform such topics!



Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

🚨 Announcing a new LLM calibration method, DINCO, which enforces confidence coherence (that probs must sum to 1) by having the LLM verbalize its confidence independently on self-generated distractors, and normalizing by the total confidence. Major gains on long + short-form QA!

🚨 Introducing DINCO, a zero-resource calibration method for verbalized LLM confidence. We normalize over self-generated distractors to enforce coherence ➡️ better-calibrated and less saturated (more usable) confidence! ⚠️ Problem: Standard verbalized confidence is overconfident…

EliasEskin's tweet image. 🚨 Introducing DINCO, a zero-resource calibration method for verbalized LLM confidence. We normalize over self-generated distractors to enforce coherence ➡️ better-calibrated and less saturated (more usable) confidence!

⚠️ Problem: Standard verbalized confidence is overconfident…


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

COLM Keynote: Nicholas Carlini Are LLMs worth it? youtube.com/watch?v=PngHcm…

COLM_conf's tweet card. Nicholas Carlini - Are LLMs worth it?

youtube.com

YouTube

Nicholas Carlini - Are LLMs worth it?


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

I’ll be in Boston attending BUCLD this week — I won’t be presenting but I’ll be cheering on @najoungkim who will present at the prestigious SLD symposium about the awesome work by her group, including our work on LMs as hypotheses generators for language acquisition! 🤠👻

kanishkamisra's tweet image. I’ll be in Boston attending BUCLD this week — I won’t be presenting but I’ll be cheering on @najoungkim who will present at the prestigious SLD symposium about the awesome work by her group, including our work on LMs as hypotheses generators for language acquisition! 

🤠👻

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

Dear “15-18 yo founder”s sending me DMs, don’t. Go and hug your parents, fall in love, eat chocolate cereal for breakfast, read poetry. Nobody will give you back these years. And sure, do your homework and learn math and code if that feels fun. But stop building SaaS and…


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

“For a glorious decade in the 2010s, we all worked on better neural network architectures, but after that we just worked on scaling transformers and making them more efficient”

We already called it in 2020!

chrmanning's tweet image. We already called it in 2020!


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı
m2saxon's tweet image.

The Computer Science section of @arxiv is now requiring prior peer review for Literature Surveys and Position Papers. Details in a new blog post



Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

236 direct/indirect PhD students!! Based on OpenReview data, an interactive webpage: prakashkagitha.github.io/manningphdtree/ Your course, NLP with Deep learning - Winter 2017, on YouTube, was my introduction to building deep learning models. This is the least I could do to say Thank You!!

prakashkagitha's tweet image. 236 direct/indirect PhD students!!

Based on OpenReview data, an interactive webpage: prakashkagitha.github.io/manningphdtree/

Your course, NLP with Deep learning - Winter 2017, on YouTube, was my introduction to building deep learning models. This is the least I could do to say Thank You!!

Thanks! 😊 But it’d be really good to generate an updated version of those graphs!



Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

I'm super excited to update that "Lost in Automatic Translation" is now available as an audiobook! 🔊📖 It's currently on Audible: audible.ca/pd/B0FXY8VQX5 Stay tuned (lostinautomatictranslation.com) for more retailers, including Amazon, iTunes, etc., and public libraries! 📚

VeredShwartz's tweet image. I'm super excited to update that "Lost in Automatic Translation" is now available as an audiobook! 🔊📖

It's currently on Audible: 
audible.ca/pd/B0FXY8VQX5

Stay tuned (lostinautomatictranslation.com) for more retailers, including Amazon, iTunes, etc., and public libraries! 📚

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

A thread on the equilibria of pendulums and their connection to topology. 1/n


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

We’re drowning in language models — there are over 2 mil. of them on Huggingface! Can we use some of them to understand which computational ingredients — architecture, scale, post-training, etc. – help us build models that align with human representations? Read on to find out 🧵

ZachStuddiford's tweet image. We’re drowning in language models — there are over 2 mil. of them on Huggingface! Can we use some of them to understand which computational ingredients — architecture, scale, post-training, etc. – help us build models that align with human representations? Read on to find out 🧵

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

one thing that really became clear to me (which admittedly makes me publish much less) is that, especially as academics, "beating the state of the art" is a crap target to aim for. the objective should be to replace the state of the art. (of course, this is unfortunately super…


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

𝑵𝒆𝒘 𝒃𝒍𝒐𝒈𝒑𝒐𝒔𝒕! In which I give some brief reflections on #COLM2025 and give a rundown of a few great papers I checked out!

m2saxon's tweet image. 𝑵𝒆𝒘 𝒃𝒍𝒐𝒈𝒑𝒐𝒔𝒕!

In which I give some brief reflections on #COLM2025 and give a rundown of a few great papers I checked out!

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

I will be giving a short talk on this work at the COLM Interplay workshop on Friday (also to appear at EMNLP)! Will be in Montreal all week and excited to chat about LM interpretability + it’s interaction with human cognition and ling theory.

A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models. New work with @kmahowald and @ChrisGPotts! 🧵👇

SashaBoguraev's tweet image. A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models.

New work with @kmahowald and @ChrisGPotts!

🧵👇


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

Happy to announce the first workshop on Pragmatic Reasoning in Language Models — PragLM @ COLM 2025! 🧠🎉 How do LLMs engage in pragmatic reasoning, and what core pragmatic capacities remain beyond their reach? 🌐 sites.google.com/berkeley.edu/p… 📅 Submit by June 23rd


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

Interested in language models, brains, and concepts? Check out our COLM 2025 🔦 Spotlight paper! (And if you’re at COLM, come hear about it on Tuesday – sessions Spotlight 2 & Poster 2)!

maria_ryskina's tweet image. Interested in language models, brains, and concepts? Check out our COLM 2025 🔦 Spotlight paper!

(And if you’re at COLM, come hear about it on Tuesday – sessions Spotlight 2 & Poster 2)!

Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

Our paper "ChartMuseum 🖼️" is now accepted to #NeurIPS2025 Datasets and Benchmarks Track! Even the latest models, such as GPT-5 and Gemini-2.5-Pro, still cannot do well on challenging 📉chart understanding questions , especially on those that involve visual reasoning 👀!

LiyanTang4's tweet image. Our paper "ChartMuseum 🖼️" is now accepted to #NeurIPS2025 Datasets and Benchmarks Track!

Even the latest models, such as GPT-5 and Gemini-2.5-Pro, still cannot do well on challenging 📉chart understanding questions , especially on those that involve visual reasoning 👀!

Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts! ✍🏻Entirely human-written questions by 13 CS researchers 👀Emphasis on visual reasoning – hard to be verbalized via text CoTs 📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B

LiyanTang4's tweet image. Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts!

✍🏻Entirely human-written questions by 13 CS researchers
👀Emphasis on visual reasoning – hard to be verbalized via text CoTs
📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B
LiyanTang4's tweet image. Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts!

✍🏻Entirely human-written questions by 13 CS researchers
👀Emphasis on visual reasoning – hard to be verbalized via text CoTs
📉Humans reach 93% but 63% from Gemini-2.5-Pro & 38% from Qwen2.5-72B


Juan Diego Rodríguez (he/him) gönderiyi yeniden yayınladı

Accepted at #NeurIPS2025 -- super proud of Yulu and Dheeraj for leading this! Be on the lookout for more "nuanced yes/no" work from them in the future 👀

Does vision training change how language is represented and used in meaningful ways?🤔 The answer is a nuanced yes! Comparing VLM-LM minimal pairs, we find that while the taxonomic organization of the lexicon is similar, VLMs are better at _deploying_ this knowledge. [1/9]

yulu_qin's tweet image. Does vision training change how language is represented and used in meaningful ways?🤔 The answer is a nuanced yes! Comparing VLM-LM minimal pairs, we find that while the taxonomic organization of the lexicon is similar, VLMs are better at _deploying_ this knowledge. [1/9]


Loading...

Something went wrong.


Something went wrong.