Manjunath

@_ssmanjunath

Manjunath reposted

Following up on my reasoning model article, I just read the new "s1: Simple Test-Time Scaling" paper, which describes an interesting method for improving reasoning models using a combination of pure supervised finetuning (SFT) and scaling inference compute. In short, their…

Manjunath reposted

Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning from Verifiable Rewards (RLVR), scales to 405B - with performance on…

Manjunath reposted

We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM develops self-verification and search abilities all on its own. You can experience the Aha moment yourself for < $30. Code: github.com/Jiayi-Pan/Tiny… Here's what we learned 🧵

Manjunath reposted

For those trying to understand @deepseek_ai Group Relative Policy Optimization (GRPO). Here, in simple steps: 1️⃣ Generate multiple outputs for each prompt using the current policy 2️⃣ Score these outputs using a reward model (rule or outcome) 3️⃣ Average the rewards and use it as…
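
The post is cut off at step 3, so the following is only a minimal sketch of how the group-averaging step is commonly described for GRPO, not the thread's own code. It assumes the group mean acts as a baseline and that advantages are divided by the group's standard deviation; the function name `group_relative_advantages`, the `eps` parameter, and the example rewards are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sampled output relative to the other outputs for the same prompt.

    Assumption: the group mean is the baseline and the group standard
    deviation is the scale. `eps` avoids division by zero when every
    output in the group received the same reward.
    """
    baseline = mean(rewards)
    spread = pstdev(rewards)
    return [(r - baseline) / (spread + eps) for r in rewards]


# Hypothetical rewards for four outputs sampled from one prompt,
# e.g. 1.0 when a rule-based checker accepts the answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Outputs that score above their group's average get a positive advantage and those below get a negative one, so no separately trained value model is needed to supply the baseline.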

Manjunath reposted

Introducing a high-quality open-preference dataset to further this line of research for image generation. Despite being such an inseparable component for modern image generation, open preference datasets are a rarity! So, we decided to work on one with the community!

Manjunath reposted

The highest-scored paper at ICLR 2025 with full scores, 10, 10, 10, 10! The first time in ICLR history? IC-Light is designed to control image lighting. They managed to collect >10 million images for training illumination editing models, with amazing results on SDXL and Flux…

Manjunath reposted

We just released the Pixtral 12B paper on arXiv: arxiv.org/abs/2410.07073

Manjunath reposted

Physicists think AI is physics. Statisticians think AI is statistics. Mathematicians think AI is mathematics. Psychologists think AI is psychology. Neuroscientists think AI is neuroscience. And they’re all right.


Manjunath reposted

📚Introduction to a new paper "Performance Law of Large Language Models"🤖 This paper presents a new empirical equation that directly predicts the performance (i.e., MMLU score) of LLMs by fitting a law on top of several hyper-parameters ⬇️. Leveraging❗️10 open-source models…

Manjunath reposted

🚀 Scribble SDXL ControlNet with Gradio ImageEditor component works like magic! Check out the model and cool Spaces👇


Manjunath reposted

Llama 3 released! 🚨🔔@AIatMeta just released their best open LLM! 👑🚀 Llama 3 is the next iteration of Llama with a ~10% relative improvement to its predecessor! 🤯 Llama 3 comes in 2 different sizes 8B and 70B with a new extended tokenizer and commercially permissive license!…

Manjunath reposted

The new @MistralAI is now #1 on the openLLM leaderboard. Apache 2.0 license too! 🔥🔥🔥

Manjunath reposted

New Instances, New Region, New Capabilities! 🧠 @Google Cloud is now generally available on @huggingface! 🤗 We are excited to launch @GoogleCloudTech as an official backend for Inference Endpoints, offering you more options to power your Generative AI applications. 🚀 🌍 New…

Manjunath reposted

Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy…


Manjunath reposted

Code Llama 70B Instruct is available in Hugging Chat! 💬 Try and experiment with @AIatMeta's new Code Llama 70B for free in the @huggingface chat! 😍 👉 huggingface.co/chat?model=cod… Share your experience in this thread! 🤗

Manjunath reposted

You can now access AI directly from your database! Here is a step-by-step demo that uses GPT-4 to classify customer reviews from a MySQL dataset. And I'm only writing SQL instructions! You have to see it! The model acts as another table in the database. I can query it and join…


Manjunath reposted

Another deep learning breakthrough: Deep TDA, a new algorithm using self-supervised learning, overcomes the limitations of traditional dimensionality reduction algorithms. t-SNE and UMAP have long been the favorites. Deep TDA might change that forever. Here are the details:

Manjunath reposted

Training Diffusion Models with Reinforcement Learning: presents an RL-based framework for training denoising diffusion models to directly optimize a variety of reward functions. arxiv.org/abs/2305.13301

Manjunath reposted

New open-source chat-GPT model alert! 🚨 @togethercompute released a new version of their chatGPT-NeoX 20B model with higher quality by fine-tuning on user feedback. 🚀🔥 Demo: huggingface.co/spaces/togethe… Model: huggingface.co/togethercomput…

Manjunath reposted

Everyone I know who has visited a western economy in the last few months can't stop comparing how bleak everything is compared to #India. Incredible that Indian innovation could power critical backbones of foreign economies; kudos to the Indian govt and all involved in #UPI.
