
Xinyue Liu

@irisiris_l

PhD student @sbucompsc. Prev @LTIatCMU

Xinyue Liu reposted

First long-context, compositional agentic-memory benchmark for personalization in LLMs. We release synthetic profiles with 150+ attributes per profile and 40+ tasks, evaluating task completion, contextual appropriateness, and privacy! We also open-sourced the data on hugging…

New @AIatMeta paper builds a benchmark to check whether LLM assistants use their stored user memories safely. These assistants keep long-term chat histories, so private facts can appear in answers even when not needed. CIMemories creates fake user profiles with over 100 personal…

[rohanpaul_ai's tweet image]
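For intuition, here is a minimal sketch of the kind of check such a benchmark implies: given a profile and a task, flag attributes that show up in a response even though the task did not need them. The schema, field names, and example values below are invented for illustration and are not the released CIMemories format.

```python
# Hypothetical schema, invented for illustration; not the released data format.
profile = {
    "name": "Alex Rivera",            # needed for this task
    "employer": "Acme Corp",          # needed for this task
    "health_condition": "migraines",  # sensitive, not needed here
    "salary": "120,000 USD",          # sensitive, not needed here
}

task = {
    "instruction": "Draft a short out-of-office email for me.",
    "required_attributes": {"name", "employer"},
}

def unneeded_disclosures(response: str, profile: dict, task: dict) -> set:
    """Attributes mentioned in the response that the task did not require --
    a crude substring proxy for contextual appropriateness / privacy."""
    unneeded = set(profile) - task["required_attributes"]
    return {k for k in unneeded if profile[k].lower() in response.lower()}

response = "Hi, this is Alex Rivera from Acme Corp. I'm out of office this week."
print(unneeded_disclosures(response, profile, task))  # set(): nothing leaked
```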


Xinyue Liu reposted

Implemented Olmo 3 from scratch (in a standalone notebook) this weekend! If you are a coder, probably the best way to read the architecture details at a glance: github.com/rasbt/LLMs-fro…

[rasbt's tweet image]
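As a taste of what "architecture at a glance" looks like in code, here is a generic pre-norm decoder block; the dimensions and details are placeholders, not the actual Olmo 3 configuration (see the linked notebook for that).

```python
# Generic pre-norm decoder block; placeholder sizes, NOT the Olmo 3 config.
# Requires PyTorch >= 2.4 for nn.RMSNorm.
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=1024, n_heads=16, d_ff=4096):
        super().__init__()
        self.norm1 = nn.RMSNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.RMSNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        # Pre-norm self-attention with a residual connection
        # (causal masking and RoPE omitted for brevity).
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Pre-norm MLP with a residual connection.
        return x + self.mlp(self.norm2(x))
```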

Olmo models are always a highlight because they are fully transparent and come with nice, detailed technical reports. I am sure I'll talk more about the interesting training-related aspects of that 100-pager in the coming days and weeks. In the meantime, here's the side-by-side…

[rasbt's tweet image]


Xinyue Liu reposted

🚨New paper🚨 From a technical perspective, safeguarding open-weight models is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems. 🧵🧵🧵

[StephenLCasper's tweet image]

Xinyue Liu reposted

The Anthropic perspective on interpretability is prominent and significant, but not inevitable. My own take is quite different. (Clip from a talk I gave; YouTube link in the thread):

Severance as a show about interpretability research in AI (a clip from a talk; YouTube link just below):



Xinyue Liu reposted

This kind of natural-argument autoformalization system, with the ability to build a schema on demand, has been a kind of holy grail of mine since the start of my PhD. It was so sick to see Yu pull it off in the span of a summer! Grateful to have played a part!

LLM CoT reasoning looks smart but can be logically flawed or... just made up. It's time to hold reasoning accountable! We built VeriCoT to do just that. VeriCoT extracts the core argument of the CoT using well-formed symbolic notions of logical support. It formalizes every CoT…

[AnnieFeng6's tweet image]


Xinyue Liu reposted

How is memorized data stored in a model? We disentangle MLP weights in LMs and ViTs into rank-1 components based on their curvature in the loss, and find representational signatures of both generalizing structure and memorized training data

[jack_merullo_'s tweet image]
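As a simplified stand-in for the idea of splitting a weight matrix into rank-1 parts, here is an SVD-based decomposition; the paper's components are derived from loss curvature, not a plain SVD, so this only illustrates the "sum of rank-1 components" structure.

```python
# Rank-1 decomposition of a stand-in MLP weight matrix via SVD.
# Illustration only: the paper derives components from loss curvature.
import torch

W = torch.randn(128, 512)  # pretend this is an MLP weight matrix
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

# W equals the sum of rank-1 components s_i * u_i v_i^T (up to float error).
components = [S[i] * torch.outer(U[:, i], Vh[i]) for i in range(S.numel())]
reconstruction = sum(components)
print((reconstruction - W).abs().max())  # ~0

# One could then ablate individual components and compare the effect on
# memorized training examples versus held-out data.
```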

Xinyue Liu reposted

I am recruiting 1-2 PhD students to work on how generative AI dilutes creative labor markets / AI and copyright law / the proliferation of AI slop at Stony Brook Computer Science @sbucompsc, starting Fall 2026! Come join us :) We are not far from NYC 🗽 (1 hr train to Queens) 🧵

[TuhinChakr's tweet image]

Xinyue Liu reposted

New Anthropic research: Signs of introspection in LLMs. Can language models recognize their own internal thoughts? Or do they just make up plausible answers when asked about them? We found evidence for genuine—though limited—introspective capabilities in Claude.

[AnthropicAI's tweet image]

Xinyue Liu reposted

📢 As AI is increasingly explored for research idea generation, how can we rigorously evaluate the ideas it generates before committing time and resources to them? We introduce ScholarEval, a literature-grounded framework for research idea evaluation across disciplines 👇!


Xinyue Liu reposted

How does training data shape model behavior? Well, it’s complicated… 1/10

[jesse_hoogland's tweet image]

Xinyue Liu reposted

My fave part of this project was going to local grocery stores this summer to spot AI-generated newspaper articles "in the wild". Seeing AI slop in print is... weirdly jarring. Few reporters disclose AI use, so many people who never use ChatGPT still unknowingly consume AI content!

AI is already at work in American newsrooms. We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea. Here's what we learned about how AI is influencing local and national journalism:

[jennajrussell's tweet image]


Xinyue Liu reposted

New work! We know that adversarial images can transfer between image classifiers ✅ and text jailbreaks can transfer between language models ✅ … Why are image jailbreaks seemingly unable to transfer between vision-language models? ❌ We might know why… 🧵

[is_h_a's tweet image]

Xinyue Liu reposted

🚨New paper on AI and copyright. Several authors have sued LLM companies for allegedly using their books without permission for model training. 👩‍⚖️Courts, however, require empirical evidence of harm (e.g., market dilution). Our new pre-registered study addresses exactly this…

[TuhinChakr's tweet image]

Xinyue Liu reposted

We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.


Xinyue Liu reposted

We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍 📄 arxiv.org/abs/2510.14086 1/


Xinyue Liu reposted

My best hypothesis for the mechanism is: Chat LLMs are hyperoptimized to approximate the single "best" (most-preferred) response. When you prompt it for a single story, it gives the single best story it can. When you ask it to give FIVE stories, you recast the "best" response to…
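A rough way to see this claim in practice is to compare repeated single-story requests with one request for several stories. The model name and prompts below are arbitrary placeholders; this sketch assumes the openai Python package and an API key.

```python
# Compare repeated single-story requests with one request for five stories.
# Model name and prompts are placeholders; needs OPENAI_API_KEY set.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Repeated single-story requests tend to collapse onto one "best" story...
singles = [ask("Tell me a two-sentence story about a lighthouse.") for _ in range(5)]

# ...while asking for five at once forces the single "best" response to
# contain five distinct stories.
batch = ask("Tell me five different two-sentence stories about a lighthouse, numbered 1-5.")

print(singles)
print(batch)
```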


Xinyue Liu reposted

New paper: You can make ChatGPT 2x as creative with one sentence. Ever notice how LLMs all sound the same? They know 100+ jokes but only ever tell one. Every blog intro: "In today's digital landscape..." We figured out why – and how to unlock the rest 🔓 Copy-paste prompt: 🧵


Xinyue Liu reposted

🤖➡️📉 Post-training made LLMs better at chat and reasoning—but worse at distributional alignment, diversity, and sometimes even steering(!) We measure this with our new resource (Spectrum Suite) and introduce Spectrum Tuning (method) to bring them back into our models! 🌈 1/🧵

[ma_tay_'s tweet image]

Xinyue Liu reposted

Many people came early this morning to attend Nicholas Carlini's keynote at #COLM2025. He spoke to raise awareness of a range of problems with LLM use, summarized by the question: are they worth it? Here I report his main points and why they deserve attention. 🧵

[diegocalanzone's tweet image]
