Manuel

@Manuel_NLP

NLP - specifically (sentence) embeddings

Manuel reposted

We are aware of low-quality and LLM-generated reviews and are currently deliberating on appropriate courses of action. For now, authors who receive very poor quality or LLM-generated reviews should flag them to their ACs. We appreciate the community's efforts in reporting these!


Manuel reposted

I wanted to uncover some interactions in ICLR data on AI-usage in submissions and reviews, so I analyzed it further. What surprised me is that even the fully AI reviews gave lower scores to submissions with more AI-generated content on average. AI still prefers human-written…

ICLR authors, want to check if your reviews are likely AI generated? ICLR reviewers, want to check if your paper is likely AI generated? Here are AI detection results for every ICLR paper and review from @pangramlabs! It seems that ~21% of reviews may be AI?


Manuel reposted

This paper has been desk rejected. LLM-generated papers that hallucinate references and do not report LLM usage will be desk rejected per ICLR policy (blog.iclr.cc/2025/08/26/pol…) Reviewers of other versions of this submission have been notified.

This LLM-generated paper was submitted at least 4 times to ICLR 2026 with different titles, each with slight variations in content, but very similar core claims and (incorrect) proofs. 1/n



Manuel reposted

🌉 #EMNLP2026 will be October 24-29th in Budapest!🌉 Thanks all for a great conference, and see you at the next one!

Manuel reposted

We are looking into reports about an unexpected system issue with OpenReview. Stay tuned.


Manuel reposted

The Computer Science section of @arxiv is now requiring prior peer review for Literature Surveys and Position Papers. Details in a new blog post


Manuel reposted

Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably huggingface.co/spaces/Hugging…

Manuel reposted

📝 Discontinuation of the MS Word Template: ARR will now fully adopt the LaTeX template to streamline formatting and reduce review workload. Starting March 2026, submissions using the MS Word template will be desk-rejected. Check details here: aclrollingreview.org/discontinuatio… #ARR #NLProc


Manuel reposted

LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)

Manuel reposted

🤗 Sentence Transformers is joining @huggingface! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face. I'm super excited about the transfer! Details in 🧵

Manuel reposted

The MTEB team has just released MTEB v2, an upgrade to their evaluation suite for embedding models! Their blogpost covers all changes, including easier evaluation, multimodal support, rerankers, new interfaces, documentation, dataset statistics, a migration guide, etc. 🧵

Manuel reposted

PTEB: Towards Robust Text Embedding Evaluation via Stochastic Paraphrasing at Evaluation Time with LLMs. Introduces a dynamic evaluation protocol that generates meaning-preserving paraphrases at evaluation time to assess embedding model robustness. 📝arxiv.org/abs/2510.06730


Manuel reposted

The list of workshops for #EACL2026 is out! #NLProc

Manuel reposted

📢📢The submission deadline for #LREC2026 Main conference papers, workshop and tutorial proposals is extended to October 24, 2025. All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”) Contact us @ [email protected] lrec2026.info #nlp #NLProc


Manuel reposted

✈️ Visa Letter Requests for ACL 2026: If you intend to commit your paper to ACL 2026 and require an invitation letter for visa purposes, please fill out the visa request form as soon as possible. (docs.google.com/forms/d/e/1FAI…) #ARR #ACL #NLProc


HUME finds humans score 77.6% vs. 80.1% for the best embedding model, but models excel most on ambiguous tasks where humans disagree, indicating pattern-matching rather than semantic understanding. arxiv.org/abs/2510.10062


Manuel reposted

Are you a student wanting to present your @NeurIPSConf paper at #EurIPS, but lacking the funds? Then we have great news for you! We are offering 100 free student author registrations for students who might otherwise not afford to present their paper. eurips.cc/registration-w…


Manuel reposted

For those traveling to #EMNLP2025 in person, you may find this additional Travel Information useful! 2025.emnlp.org/travel/

