
Xin Liu

@sokeyer

Developer, enjoying life! Machine learning and Large Language Models

Xin Liu reposted

New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT" This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental…


Why I Won't Use Next.js, article epicweb.dev/why-i-wont-use…


Xin Liu reposted

Official post on Mixtral 8x7B: mistral.ai/news/mixtral-o… Official PR into vLLM shows the inference code: github.com/vllm-project/v… New HuggingFace explainer on MoE very nice: huggingface.co/blog/moe In naive decoding, performance of a bit above 70B (Llama 2), at inference speed…

Very excited to release our second model, Mixtral 8x7B, an open weight mixture of experts model. Mixtral matches or outperforms Llama 2 70B and GPT3.5 on most benchmarks, and has the inference speed of a 12B dense model. It supports a context length of 32k tokens. (1/n)



I just got 830 points✨ on nextjs.org/learn via @vercel


Sometimes, I forget why I do it.


Do you remember when you joined Twitter? I do! #MyTwitterAnniversary


Back in China. Everything is fine.


Day 21 in Tokyo.

