
Yongmin Kim

@yongmini97

PhD student at the University of Tokyo, Matsuo・Iwasawa Lab

Pinned

🥳Excited to share that our paper has been accepted to @COLM_conf, which will be held next month! Many thanks to my advisors @kojima_tks, Yusuke Iwasawa, and @ymatsuo. Looking forward to meeting many researchers and having insightful discussions at @COLM_conf!

While LMs excel across domains, they risk generating toxic content. Existing detoxification methods work, but they degrade general performance. Can model merging decouple the noisy/toxic parameters for detoxification without that cost? We explore this by comparing merged models against existing detoxification methods. openreview.net/pdf?id=TBNYjdO… 🧵
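The tweet compresses a lot, so here is a minimal sketch of one generic merging recipe that can be used for detoxification: task-vector negation, in the style of task arithmetic. It illustrates the idea only and is not necessarily the paper's exact method; the model names and the coefficient are placeholders.

```python
# Hedged sketch: detoxification via task-vector negation, one generic
# model-merging recipe. Illustrative only; not necessarily the paper's
# exact method. Model names and alpha are placeholders.
import torch

def negate_task_vector(base_sd: dict, toxic_sd: dict, alpha: float = 0.5) -> dict:
    """Subtract alpha * (toxic - base), the 'toxic task vector', from the base."""
    return {
        name: w_base - alpha * (toxic_sd[name] - w_base)
        for name, w_base in base_sd.items()
    }

# Usage with ordinary state dicts (hypothetical models):
# detoxed = negate_task_vector(base_model.state_dict(),
#                              toxic_finetuned_model.state_dict())
# base_model.load_state_dict(detoxed)
```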



Yongmin Kim reposted

💍 LFM2-ColBERT-350M: One Model to Embed Them All
Very happy to announce our first embedding model!
It's a late interaction retriever with excellent multilingual performance.
Available today on @huggingface!

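For readers unfamiliar with late interaction: instead of one vector per text, the query and document each keep per-token embeddings, and each query token is matched to its best document token (MaxSim). A minimal sketch of that scoring rule follows; it shows the generic mechanism only, not LFM2-ColBERT's actual API.

```python
# Hedged sketch of late-interaction (ColBERT-style) MaxSim scoring.
# Generic mechanism only; not LFM2-ColBERT's actual API.
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """query_emb: (num_q_tokens, dim), doc_emb: (num_d_tokens, dim),
    both L2-normalized. Each query token takes its best-matching doc token."""
    sim = query_emb @ doc_emb.T          # (num_q_tokens, num_d_tokens)
    return sim.max(dim=1).values.sum()   # sum of per-query-token maxima

q = torch.nn.functional.normalize(torch.randn(8, 128), dim=-1)
d = torch.nn.functional.normalize(torch.randn(200, 128), dim=-1)
print(float(maxsim_score(q, d)))
```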

Yongmin Kim reposted

New LFM2 release 🥳 It's a Japanese PII extractor with only 350M parameters. It's extremely fast and on par with GPT-5 (!) in terms of quality. Check it out, it's available today on @huggingface!


Yongmin Kim reposted

LFM2-8B-A1B just dropped on @huggingface! 8.3B params with only 1.5B active/token 🚀
> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens → strong math/code/IF
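The total-vs-active split comes from top-k expert routing: each token only runs through a few experts, so most parameters sit idle per token. Below is a toy sketch of that mechanism; it does not reproduce LFM2-8B-A1B's actual architecture, and all sizes are illustrative.

```python
# Toy sketch of why an MoE can have 8.3B total but only ~1.5B active
# params per token: a router picks top-k experts, and only those run.
# Illustrative only; not LFM2-8B-A1B's real architecture.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim=512, num_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):                      # x: (tokens, dim)
        scores = self.router(x)                # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):             # naive per-token dispatch
            for slot in range(self.k):
                e = idx[t, slot].item()
                out[t] += weights[t, slot] * self.experts[e](x[t])
        return out

moe = TopKMoE()
print(moe(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```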


Yongmin Kim reposted

Hello Japan! 🇯🇵
Want to join a 2-day in-person hackathon co-hosted by Liquid AI and @weights_biases? Developers and AI researchers from around the world will gather to help shape the future of AI in Japan!
This hackathon’s theme is “Push SLM models to the limit.” The selected…


Yongmin Kim reposted

Discord invite? Is the Discord open to all? It seems gated (unless I was looking at the wrong invite permalink)


Yongmin Kim reposted

Liquid just released two 450M and 1.6B param VLMs! They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion. Available today on @huggingface!


Yongmin Kim reposted

Try LFM2 with llama.cpp today!
We released a collection of GGUF checkpoints so developers can run LFM2 everywhere with llama.cpp.
Select the most relevant precision for your use case and start building today.
huggingface.co/LiquidAI/LFM2-…

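If you would rather stay in Python, the same GGUF files can be loaded through the llama-cpp-python bindings to llama.cpp. The checkpoint filename below is a placeholder (the link above is truncated); pick whichever quantized precision fits your device.

```python
# Hedged sketch: running a GGUF checkpoint via llama-cpp-python, the
# Python bindings to llama.cpp. The model path is a placeholder; choose
# the precision (e.g., Q4_K_M vs. F16) that suits your hardware.
from llama_cpp import Llama

llm = Llama(
    model_path="./LFM2-1.2B-Q4_K_M.gguf",  # placeholder filename
    n_ctx=4096,                            # context window
)
out = llm("Summarize what a GGUF file is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```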

Yongmin Kim reposted

Liquid AI open-sources a new generation of edge LLMs! 🥳 I'm so happy to contribute to the open-source community with this release on @huggingface! LFM2 is a new architecture that combines best-in-class inference speed and quality into 350M, 700M, and 1.2B models.


Yongmin Kim reposted

🤏 Can small models be strong reasoners?
We created a 1B reasoning model at @LiquidAI_ that is both accurate and concise.
We applied a combination of SFT (to raise quality) and GRPO (to control verbosity).
The result is a best-in-class model without specific math pre-training.

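For context, GRPO scores each completion against the other completions sampled for the same prompt, so no separate value model is needed; one way to control verbosity is a length penalty inside the reward. The sketch below illustrates those two pieces under that assumption and is not the model's actual training recipe.

```python
# Hedged sketch of GRPO's group-relative advantage, plus a length penalty
# in the reward as one way to discourage verbosity. The reward design and
# hyperparameters are illustrative, not the actual training recipe.
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (group_size,) rewards for G completions of one prompt.
    GRPO normalizes within the group instead of learning a value function."""
    return (rewards - rewards.mean()) / (rewards.std() + 1e-6)

def reward(correct: bool, num_tokens: int, lam: float = 0.001) -> float:
    return float(correct) - lam * num_tokens  # accuracy minus verbosity cost

# 4 sampled completions of one math prompt: (is_correct, length_in_tokens)
samples = [(True, 180), (True, 950), (False, 300), (True, 400)]
r = torch.tensor([reward(c, n) for c, n in samples])
print(grpo_advantages(r))  # short correct answers get the largest advantage
```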

Yongmin Kim reposted

What is the key difference between recent reasoning models and their base counterparts? Our new preprint reveals that reasoning models build distinctive “Reasoning Graphs” from their hidden states, characterized by more cycles, larger diameters, and stronger local connectivity.
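To make the cycle/diameter/connectivity claims concrete, here is one plausible construction of such a graph: discretize the per-step hidden states into nodes (k-means here) and connect consecutive steps. This is a speculative sketch with stand-in data; the preprint's actual construction may differ.

```python
# Heavily hedged sketch: one plausible way to build a "reasoning graph"
# from hidden states, so the cycle/diameter metrics have a concrete
# referent. Stand-in data; the paper's actual recipe may differ.
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

hidden = np.random.randn(60, 256)             # (reasoning_steps, hidden_dim)
nodes = KMeans(n_clusters=10, n_init=10).fit_predict(hidden)

G = nx.Graph()
G.add_edges_from(zip(nodes[:-1], nodes[1:]))  # edge between consecutive steps
G.remove_edges_from(list(nx.selfloop_edges(G)))

print("cycles:", len(nx.cycle_basis(G)))
print("diameter:", nx.diameter(G) if nx.is_connected(G) else "disconnected")
print("avg clustering:", nx.average_clustering(G))  # local connectivity
```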


Yongmin Kim reposted

Our paper "Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence" has accepted at #ICML2025🎉 We found training Transformers in a few-shot setting leads to the emergence of 3 circuits. Joint work with @frt03_ @ishohei220 @yusuke_iwasawa_ @ymatsuo


Yongmin Kim reposted

🍻Two co-authored papers have been accepted to #ACL2025NLP🍻 Congrats to Andrew-san and Takashiro-san!


Yongmin Kim reposted

🐥Our paper "Continual Pre-training on Character-Level Noisy Texts Makes Decoder-based Language Models Robust Few-shot Learners" has been accepted to #TACL🐥 We are planning to present it at #ACL2025NLP! weblab.t.u-tokyo.ac.jp/news/2025-0502/

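As a concrete picture of the character-level noise the title refers to, here is a minimal injection sketch mixing deletions, substitutions, and insertions. The noise types and rates are assumptions for illustration, not the paper's specification.

```python
# Hedged sketch of character-level noise injection, the kind of corruption
# the paper's title refers to. Noise mix and rates are assumptions.
import random

def add_char_noise(text: str, p: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in text:
        r = rng.random()
        if r < p / 3:                       # deletion
            continue
        elif r < 2 * p / 3:                 # substitution
            out.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
        elif r < p:                         # insertion after the character
            out.append(ch)
            out.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
        else:                               # keep unchanged
            out.append(ch)
    return "".join(out)

print(add_char_noise("language models are few-shot learners", p=0.15))
```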

Yongmin Kim reposted

🌈Our paper "Slender-Mamba: Fully Quantized Mamba in 1.58 Bits From Head to Toe" has been accepted to #coling2025 🌈 We apply BitNet b1.58 quantization to the Mamba2 architecture, including the embedding and head layers, to make the model even more lightweight. Congrats @UTLLM_zxYu !

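For reference, BitNet b1.58's weight quantizer maps each matrix to ternary values {-1, 0, +1} with a per-tensor absmean scale; "head to toe" means the embedding and head matrices get the same treatment. A minimal sketch of that quantizer:

```python
# Minimal sketch of BitNet b1.58's absmean ternary quantizer: weights
# become {-1, 0, +1} times a per-tensor scale. Quantizing "head to toe"
# means running this on every matrix, embeddings and LM head included.
import torch

def quantize_b158(w: torch.Tensor, eps: float = 1e-8):
    scale = w.abs().mean()                           # absmean scale (gamma)
    w_q = (w / (scale + eps)).round().clamp_(-1, 1)  # ternary {-1, 0, +1}
    return w_q, scale                                # dequantize as w_q * scale

w = torch.randn(4, 4)
w_q, scale = quantize_b158(w)
print(w_q)                             # entries in {-1., 0., 1.}
print((w - w_q * scale).abs().mean())  # mean quantization error
```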

Yongmin Kim reposted

🤗Our paper "Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?" has been accepted to #EMNLP2024 🤗 Congrats @FumiyaUchiyama! Read the following post for the details!

Excited to share that our paper has been accepted to #EMNLP2024! We investigated the effect of pretraining on code data by training LMs from scratch on a single programming-language / natural-language corpus and confirmed consistent improvements on the FLD and bAbI benchmarks.


