
Harshit Joshi

@harshitj__

CS phd @StanfordNLP, @StanfordOVAL | prev: @MSFTResearch | LLM systems for knowledge access, discovery and curation

had fun using burst!! also my new name is Sanjay!

Do you hold back from posting because the audience never feels right? Small groups feel safe but limiting. Public platforms feel risky and performative. Our #CSCW2025 paper introduces Burst: a design that connects private and public spaces. We found that posters felt safer and…



Harshit Joshi reposted

Interested in building and benchmarking deep research systems? Excited to introduce DeepScholar-Bench, a live benchmark for generative research synthesis, from our team at Stanford and Berkeley!
🏆 Live Leaderboard: guestrin-lab.github.io/deepscholar-le…
📚 Paper: arxiv.org/abs/2508.20033
🛠️…


Harshit Joshi reposted

Introducing Generative Interfaces - a new paradigm beyond chatbots. We generate interfaces on the fly to better facilitate LLM interaction, so no more passive reading of long text blocks. Adaptive and Interactive: creates the form that best adapts to your goals and needs!


which model will first get to 10%? taking wagers

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:



Harshit Joshi reposted

Can AI solve open problems in math, physics, coding, medical sciences & beyond? We collected unsolved questions (UQ) & tested frontier LLMs. Some solutions passed expert validation…


it would be nice to have personal agents that can take care of mundane/complex work for us while interacting with other agents — but how much can we rely on them? how can I trust them not to share my personal details? GO CHECK OUT THIS CRAZY WORK!

Soon, AI agents will act for us—collaborating, negotiating, and sharing data. But can they truly protect our privacy? We simulate privacy-critical scenarios, using alternating search to evolve attacks and defenses, uncovering severe vulnerabilities and building protections.



nothing is constant in travels, except desi family kalesh


Harshit Joshi reposted

if you find this post interesting, please also check out AxBench; @aryaman2020 and I have a few more steering options to show. these are supervised methods, but automated with fewer than 50 LM-generated examples to train.


New Anthropic research: Persona vectors. Language models sometimes go haywire and slip into weird and unsettling personas. Why? In a new paper, we find “persona vectors”—neural activity patterns controlling traits like evil, sycophancy, or hallucination.



Harshit Joshi reposted

.@stanfordnlp papers at @aclmeeting in Vienna next week:
• HumT DumT: Measuring and controlling human-like language in LLMs @chengmyra1 @sunnyyuych @jurafsky
• Controllable and Reliable Knowledge-Intensive Task Agents with Declarative GenieWorksheets @harshitj__ @ShichengGLiu


Claude code just added a bunch of `pytest.skip` lol 😭


Harshit Joshi reposted

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.


Harshit Joshi reposted

AI companions aren’t science fiction anymore 🤖💬❤️ Thousands are turning to AI chatbots for emotional connection – finding comfort, sharing secrets, and even falling in love. But as AI companionship grows, the line between real and artificial relationships blurs. 📰 “Can A.I.…


plis give them a place to stay so that they can do great work 🫡

anybody have a place in SF to rent/sublease till mid-September? for me and @ZhengxuanZenWu (or either of us individually) need to move ASAP



Harshit Joshi reposted

What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵


i was recently told that i cannot angel invest 10k in a very promising startup because i do not have at least 30k followers 😭 😭


Harshit Joshi reposted

Happy Throughput Thursday! We’re excited to release Tokasaurus: an LLM inference engine designed from the ground up for high-throughput workloads with large and small models. (Joint work with @achakravarthy01, @ryansehrlich, @EyubogluSabri, @brad19brown, @jshetaye,…

