
Shashank Sonkar

@shashank_nlp

NLP+Education | @UCF @RiceECE @rice_dsp @OpenStax @IITKanpur

Excited to be part of this tutorial on Large Language Models for Education at #ACL2025! Join us tomorrow at 9 AM at the BEA Workshop (Rooms 1.85 & 1.86, Austria Center) to explore how NLP and AI are transforming education.

If you're at ACL, join us for the tutorial "LLMs for Education: Understanding the Needs of Stakeholders, Current Capabilities and the Path Forward" at the BEA workshop (Room 1.85–86), 9:00 AM–12:30 PM tomorrow (July 31st) @aclmeeting



Shashank Sonkar reposted

Mistakes are key learning opportunities!🧑‍🎓 Can LLMs help students learn from them through dialog? 💬 While they often struggle to diagnose student errors when generating responses directly, adding a verification step ✅ could make a difference. #EMNLP2024

Can LLMs help students learn from mistakes? Models struggle to spot student errors, but a verification step could help. More below! 🧵(1/9) #EMNLP2024 📰 arxiv.org/abs/2407.09136

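A minimal sketch of the verification idea in the tweet above, assuming a hypothetical chat() wrapper around an LLM API; the prompts, function names, and fallback logic are illustrative assumptions, not the paper's actual pipeline.

# Hypothetical sketch of a "diagnose, then verify" tutoring turn.
# chat() stands in for any LLM chat-completion call; the prompts are illustrative.

def chat(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM client here")

def tutor_reply(problem: str, student_solution: str) -> str:
    # Step 1: ask the model to diagnose the student's error explicitly.
    diagnosis = chat(
        f"Problem: {problem}\nStudent solution: {student_solution}\n"
        "Describe the specific error in the student's solution."
    )
    # Step 2: verify the diagnosis before building feedback on top of it.
    verdict = chat(
        f"Problem: {problem}\nStudent solution: {student_solution}\n"
        f"Proposed diagnosis: {diagnosis}\n"
        "Does this diagnosis correctly describe the student's mistake? Answer yes or no."
    )
    if verdict.strip().lower().startswith("yes"):
        # Ground the reply only in a diagnosis that passed verification.
        return chat(f"Write a short, encouraging hint based on this diagnosis:\n{diagnosis}")
    # Otherwise fall back to a neutral probing question.
    return chat(
        f"Problem: {problem}\nStudent solution: {student_solution}\n"
        "Ask the student one question that helps them re-examine their reasoning."
    )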


Shashank Sonkar reposted

🚨"Towards Aligning Language Models with Textual Feedback" has been accepted at #EMNLP2024! We explore whether textual feedback can align LLMs better than numeric rewards. Our approach, ALT, adapts the Decision Transformer to condition responses on textual feedback. What we find 👇🧵

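A rough illustration of the conditioning idea described above: just as a Decision Transformer conditions on a target return, ALT-style training can prepend textual feedback as a control prefix. The tag format and field names below are assumptions for illustration, not the paper's actual template.

# Hypothetical sketch: build training sequences where textual feedback acts as a
# conditioning prefix, analogous to the return-to-go token in a Decision Transformer.

def build_training_example(prompt: str, response: str, feedback: str) -> dict:
    # Feedback such as "helpful and concise" or "too verbose" is placed before the
    # prompt, so the model learns to generate responses consistent with it.
    conditioned_input = (
        f"<feedback>{feedback}</feedback>\n<prompt>{prompt}</prompt>\n<response>"
    )
    return {"input": conditioned_input, "target": f"{response}</response>"}

# At inference time, condition on the feedback you want the response to satisfy.
example = build_training_example(
    prompt="Explain gradient descent in one sentence.",
    response="Gradient descent iteratively updates parameters in the direction that lowers the loss.",
    feedback="accurate and concise",
)
print(example["input"])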

Shashank Sonkar reposted

🚀 New paper on LLM reasoning 🚀 We present MathGAP, a framework for evaluating LLMs on math word problems with arbitrarily complex proof structures--resulting in problems that are challenging even for GPT-4o and OpenAI o1 💥 A thread 🧵 arxiv.org/pdf/2410.13502

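A toy sketch of the kind of evaluation MathGAP targets: arithmetic word problems generated with a controllable number of inference steps, so proof depth can be scaled while the gold answer stays known. The generator below is purely illustrative and is not MathGAP's grammar, templates, or code.

import random

# Toy generator: chain "transfer" steps so the solution requires a derivation
# of configurable depth. Illustrative only.

def generate_problem(depth: int, seed: int = 0) -> tuple[str, int]:
    rng = random.Random(seed)
    total = rng.randint(1, 10)
    sentences = [f"Alice starts with {total} apples."]
    for _ in range(depth):
        delta = rng.randint(1, 10)
        if rng.random() < 0.5 or total == 0:
            sentences.append(f"Then she receives {delta} more apples.")
            total += delta
        else:
            delta = min(delta, total)  # keep quantities non-negative
            sentences.append(f"Then she gives away {delta} apples.")
            total -= delta
    sentences.append("How many apples does Alice have now?")
    return " ".join(sentences), total

problem, gold = generate_problem(depth=6, seed=42)
print(problem)  # word problem whose answer needs a 6-step derivation
print(gold)     # gold answer for scoring an LLM's response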

Shashank Sonkar reposted

🚀✨ OpenStax is proud to announce we have partnered with @GeminiApp to make our library of resources discoverable, searchable, and available to users 18+ in the U.S.! Read more here: blog.google/products/gemin…


Shashank Sonkar reposted

📚 We're also introducing new #Gemini features to help you learn more confidently. For example, Gemini will soon provide trustworthy responses based on textbooks from @OpenStax, a division of @RiceUniversity—including in-line citations and links to relevant peer-reviewed content.


Shashank Sonkar reposted

AGI is gonna be wild! In the meantime, we have some problems.

The same analysis holds for prepositions of movement like (e) up, (f) down, (g) from, and (h) towards. Unsurprisingly, SDMs also fail on the hardest abstract category of particles, which include (i) on, (j) off, (k) with, and (l) without.



Shashank Sonkar reposted

Sometimes you wait years to see whether anyone can replicate your work. Sometimes you discover another paper the next day that pretty much reaches the same conclusion. Further evidence that deep learning still has deep trouble with comprehension: dsp.rice.edu/2022/10/25/a-v…

Claims that DALL-E 2 understands human language do not withstand scrutiny. New experimental work by @EvelinaLeivada, @ElliotMurphy91, and myself shows systematic failure in mapping syntax to semantics across a wide range of common linguistic constructions. arxiv.org/abs/2210.12889


