melonkernel

@melonkernel

Loves to DIY

Science & Technology

Jakobstad, Finland

Joined February 2008

260 Posts 99 Followers 146 Following

melonkernel reposted

Satvik Paramkusham

@satvikps

Jul 28, 2024

🤯🤯 You can now create a chatbot on ANY GitHub repo using the Llama 3.1 405B model with @huggingface assistants -- FOR FREE! 💰 This is insane! 🚀 Link: hf.co/chat/assistants @ClementDelangue @julien_c

melonkernel reposted

Bojan Tunguz

@tunguz

Jul 2, 2024

Congratulations!!! This is the worst kept secret in all of ML.

melonkernel reposted

@tomgoldsteincs

LLMs can memorize training data, causing copyright/privacy risks. Goldfish loss is a nifty trick for training an LLM without memorizing training data. I can train a 7B model on the opening of Harry Potter for 100 gradient steps in a row, and the model still doesn't memorize.

melonkernel

@melonkernel

Apr 18, 2024

Testing and comparing the new Llama3:8b

We have designed Llama 3 models to be maximally helpful while ensuring an industry leading approach to responsibly deploying them. To achieve this, we have adopted a new, system-level approach to the responsible development and…

melonkernel

@melonkernel

Apr 17, 2024

Got Premium+ so I can test Grok

melonkernel reposted

Sanyam Bhutani

@bhutanisanyam1

Jul 12, 2023

The biggest LLM release!🚨 @AnthropicAI have given us the “dream” large language model with Claude 2:
- Trained till early 2023
- Context length up to 200k tokens
- Ability to upload documents
- Better code abilities
- Free beta for public
Please see my demo below of comparing…

melonkernel reposted

Silvio Savarese

@silviocinguetta

Jun 27, 2023

Excited to share a new position paper I wrote on a recent exciting trend in generative AI: autonomous AI agents. These are capable of accomplishing tasks entirely on their own, and we at @SFResearch call them LAMs—Large Action Models. For more details: blog.salesforceairesearch.com/p/9f1f3323-dbc…

melonkernel reposted

Enrico Shippole

@EnricoShippole

Jun 13, 2023

With Reddit and many other sites shutting down access to their APIs it is now more important than ever to release quality open-source conversational data. I worked with @ShayneRedford to generate ~80GB of labeled FLAN dialog data. huggingface.co/datasets/conce…

melonkernel reposted

Sharan Narang

@sharan0909

Jun 8, 2023

Nice work reproducing the LLaMA results and pointing out the problem with the OpenLLM leaderboard. I'm glad someone reproduced the MMLU results. Inspired by this and a tweet exchange with @YiTayML, I thought I'd share a few more problems with the OpenLLM leaderboard. 1/5

Yao Fu

@Francis_YAO_

Jun 8, 2023

Is Falcon really better than LLaMA? Short take: probably not. Longer take: we reproduced LLaMA 65B eval on MMLU and we got 61.4, close to the official number (63.4), much higher than its Open LLM Leaderboard number (48.8), and clearly higher than Falcon (52.7). Code and prompt…

melonkernel reposted

Brian Lester

@blester125

Jun 8, 2023

Introducing Git-Theta, a Git extension that enables collaborative and continual development of ML models with merges, diffs, and parameter-efficient updates—all using the standard Git workflow! 📄 arxiv.org/abs/2306.04529 💽 github.com/r-three/git-th… 🗣️ cccml.zulipchat.com 🧵⬇️

melonkernel reposted

Nathan Lands

@NathanLands

May 28, 2023

So...I thought Photoshop was in trouble because of AI. But now, Adobe's new AI-enhanced Photoshop beta is out. And it's one of the most magical things I've ever seen. 13 game-changing examples that'll blow your mind:

melonkernel reposted

EleutherAI

@AiEleuther

May 23, 2023

Everyone knows that transformers are synonymous with large language models… but what if they weren’t? Over the past two years @BlinkDL_AI and team have been hard at work scaling RNNs to unprecedented scales. Today we are releasing a preprint on our work arxiv.org/abs/2305.13048

melonkernel

@melonkernel

May 12, 2023

Good summary

Andrej Karpathy

@karpathy

May 6, 2023

Oops haven't tweeted too much recently; I'm mostly watching with interest the open source LLM ecosystem experiencing early signs of a cambrian explosion. Roughly speaking the story as of now: 1. Pretraining LLM base models remains very expensive. Think: supercomputer + months.…

melonkernel reposted

Andrej Karpathy

@karpathy

May 10, 2023

The creator of the trailer for Star Wars by Wes Anderson [1] is back with a new trailer for The Lord of the Rings. Highly amusing. Cited as ~25 hours of work. Guess at tools:
- Midjourney / Stable Diffusion
- ControlNet depth map for parallax
- ElevenLabs for text-to-voice…

Curious Refuge

@CuriousRefuge

May 9, 2023

What if Wes Anderson directed The Lord of the Rings? We asked the community which video they want to see next and Lord of the Rings took the cake… or should we say Elven bread. We hope you enjoy this Midjourney to Middle-Earth. #LordOfTheRings #WesAnderson #MovieTrailer #LOTR

melonkernel

@melonkernel

Apr 14, 2020

If you have time, @anna_maja, I would warmly recommend this research summary on how the spread of infection is affected if everyone wears a mask. Thanks for everything. fast.ai/2020/04/13/mas…

melonkernel reposted

Rachel Thomas

@math_rachel

May 31, 2019

Was just rereading @aylin_cim @j2bryson @random_walker paper on bias in word embeddings. They use "small baskets" of words (from heavily cited psychology papers) to represent a concept, and compare the distance/similarity between different concepts. arxiv.org/abs/1608.07187
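
A rough sketch of the "small baskets" idea described above: each concept is a short list of words, and two concepts are compared by averaging the cosine similarity over all cross-basket word pairs. The 3-dimensional vectors, the word lists, and the `basket_similarity` helper are made up for illustration; they are not real embeddings and not the paper's exact test statistic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def basket_similarity(basket_a, basket_b, vectors):
    """Mean cosine similarity over every (word_a, word_b) pair.

    Hypothetical helper: a simplified stand-in for comparing two
    concepts that are each represented by a small basket of words.
    """
    sims = [cosine(vectors[a], vectors[b]) for a in basket_a for b in basket_b]
    return sum(sims) / len(sims)


# Toy vectors, invented for this example (assumption, not trained embeddings).
TOY_VECTORS = {
    "rose": [0.9, 0.1, 0.0],
    "tulip": [0.8, 0.2, 0.1],
    "wasp": [0.1, 0.9, 0.0],
    "moth": [0.2, 0.8, 0.1],
    "lovely": [0.9, 0.0, 0.1],
    "pleasant": [0.8, 0.1, 0.2],
}

FLOWERS = ["rose", "tulip"]
INSECTS = ["wasp", "moth"]
PLEASANT = ["lovely", "pleasant"]

flower_score = basket_similarity(FLOWERS, PLEASANT, TOY_VECTORS)
insect_score = basket_similarity(INSECTS, PLEASANT, TOY_VECTORS)

# A positive gap means the first basket sits closer to the attribute basket.
print(f"flowers vs pleasant: {flower_score:.3f}")
print(f"insects vs pleasant: {insect_score:.3f}")
print(f"association gap:     {flower_score - insect_score:.3f}")
```

With real embeddings, the same comparison would be run on vectors loaded from a trained model, and the size of the gap would need a significance test before reading anything into it.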

Rachel Thomas

@math_rachel

Jun 9, 2017

Thought-provoking argument against de-biasing word embeddings: debiasing alters AI's model of the world, rather than how it acts on that

melonkernel reposted

Max Pechyonkin

@max_pechyonkin

May 27, 2019

If you are into deep learning, then Swift is a language you should probably start learning. Learn why in this post that I just published: medium.com/@pechyonkin/wh…

Why Swift May Be the Next Big Thing in Deep Learning

Source: medium.com

melonkernel reposted

Jeremy Howard

@jeremyphoward

May 28, 2019

fastai v2 (coming in the next couple of months) will have lots more tutorials showing use of fastai's lower-level APIs for more advanced users. Here's an example of using the new `Pipeline` class to create data for a Siamese model.