melonkernel

@melonkernel

Loves to DIY

melonkernel reposted

🤯🤯 You can now create a chatbot on ANY GitHub repo using the Llama 3.1 405B model with @huggingface assistants -- FOR FREE! 💰 This is insane! 🚀 Link: hf.co/chat/assistants @ClementDelangue @julien_c


melonkernel reposted

Congratulations!!! This is the worst kept secret in all of ML.

[tunguz's tweet image]

melonkernel reposted

LLMs can memorize training data, causing copyright/privacy risks. Goldfish loss is a nifty trick for training an LLM without memorizing training data. I can train a 7B model on the opening of Harry Potter for 100 gradient steps in a row, and the model still doesn't memorize.

[tomgoldsteincs's tweet image]
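
The tweet above does not spell out the mechanism. A minimal sketch, assuming the commonly described idea behind goldfish loss: a subset of token positions is left out of the next-token loss, so the model never gets a training signal for every token of a passage and cannot reproduce it verbatim. The helper names `goldfish_mask` and `goldfish_loss`, the drop rate `k`, and the fixed "every k-th token" pattern are illustrative assumptions, not the authors' code.

```python
def goldfish_mask(num_tokens, k=4):
    """Return 1 for positions kept in the loss and 0 for every k-th position.

    Assumption: a simple static pattern; other masking schemes are possible.
    """
    return [0 if (i + 1) % k == 0 else 1 for i in range(num_tokens)]


def goldfish_loss(token_losses, k=4):
    """Average the per-token losses over the kept positions only.

    token_losses: per-token negative log-likelihoods from any language model.
    """
    mask = goldfish_mask(len(token_losses), k)
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(loss * m for loss, m in zip(token_losses, mask)) / kept


if __name__ == "__main__":
    # Hypothetical per-token losses; the 4th and 8th positions are ignored.
    losses = [1.0, 2.0, 3.0, 100.0, 1.0, 2.0, 3.0, 100.0]
    print(goldfish_mask(len(losses)))
    print(goldfish_loss(losses))
```

In a real training loop the same mask would be applied to the label tensor before the cross-entropy step, so dropped positions contribute no gradient.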

Testing and comparing the new Llama3:8b Text We have designed Llama 3 models to be maximally helpful while ensuring an industry leading approach to responsibly deploying them. To achieve this, we have adopted a new, system-level approach to the responsible development and…


Got Premium+ so I can test Grok


melonkernel reposted

The biggest LLM release!🚨 @AnthropicAI have given us the “dream” Large Language model with Claude 2:
- Trained till early 2023
- Context length up to 200k tokens
- Ability to upload documents
- Better code abilities
- Free beta for public
Please see my demo below of comparing…


melonkernel reposted

Excited to share a new position paper I wrote on a recent exciting trend in generative AI: autonomous AI agents. These are capable of accomplishing tasks entirely on their own and we at @SFResearch call them LAMs (Large Action Models). For more details: blog.salesforceairesearch.com/p/9f1f3323-dbc…


melonkernel reposted

With Reddit and many other sites shutting down access to their APIs it is now more important than ever to release quality open-source conversational data. I worked with @ShayneRedford to generate ~80GB of labeled FLAN dialog data. huggingface.co/datasets/conce…


melonkernel reposted

Nice work reproducing the LLaMA results and pointing out the problem with the OpenLLM leaderboard. I'm glad someone reproduced the MMLU results. Inspired by this and a tweet exchange with @YiTayML, I thought I'd share a few more problems with the OpenLLM leaderboard. 1/5

Is Falcon really better than LLaMA? Short take: probably not. Longer take: we reproduced LLaMA 65B eval on MMLU and we got 61.4, close to the official number (63.4), much higher than its Open LLM Leaderboard number (48.8), and clearly higher than Falcon (52.7). Code and prompt…



melonkernel reposted

Introducing Git-Theta, a Git extension that enables collaborative and continual development of ML models with merges, diffs, and parameter-efficient updates—all using the standard Git workflow! 📄 arxiv.org/abs/2306.04529 💽 github.com/r-three/git-th… 🗣️ cccml.zulipchat.com 🧵⬇️

[blester125's tweet image]

melonkernel reposted

So...I thought Photoshop was in trouble because of AI. But now, Adobe's new AI-enhanced Photoshop beta is out. And it's one of the most magical things I've ever seen. 13 game-changing examples that'll blow your mind:


melonkernel reposted

Everyone knows that transformers are synonymous with large language models… but what if they weren’t? Over the past two years @BlinkDL_AI and team have been hard at work scaling RNNs to unprecedented scales. Today we are releasing a preprint on our work arxiv.org/abs/2305.13048


Good summary

Oops haven't tweeted too much recently; I'm mostly watching with interest the open source LLM ecosystem experiencing early signs of a Cambrian explosion. Roughly speaking the story as of now: 1. Pretraining LLM base models remains very expensive. Think: supercomputer + months.…



melonkernel reposted

The creator of the trailer for Star Wars by Wes Anderson [1] is back with a new trailer for The Lord of the Rings. Highly amusing. Cited as ~25 hours of work. Guess at tools:
- Midjourney / Stable Diffusion
- ControlNet depth map for parallax
- ElevenLabs for text-to-voice…

What if Wes Anderson directed The Lord of the Rings? We asked the community which video they want to see next and Lord of the Rings took the cake… or should we say Elven bread. We hope you enjoy this Midjourney to Middle-Earth. #LordOfTheRings #WesAnderson #MovieTrailer #LOTR



If you have time, @anna_maja, I would warmly recommend this research summary on how the spread of infection is affected if everyone wears a mask. Thanks for everything. fast.ai/2020/04/13/mas…


melonkernel reposted

Was just rereading @aylin_cim @j2bryson @random_walker paper on bias in word embeddings. They use "small baskets" of words (from heavily cited psychology papers) to represent a concept, and compare the distance/similarity between different concepts. arxiv.org/abs/1608.07187

Thought-provoking argument against de-biasing word embeddings: debiasing alters AI's model of the world, rather than how it acts on that

[math_rachel's tweet image]
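
The "small baskets" comparison described above can be sketched roughly as follows: represent each concept by a handful of words, then check whether a target word sits closer, on average, to one basket than to the other. Everything here is an illustrative assumption: the 2-D vectors are made up, the word lists are hypothetical, and `association` is a simplified score, not the paper's exact test statistic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word, basket, vectors):
    """Average similarity between one word and every word in a basket."""
    return sum(cosine(vectors[word], vectors[w]) for w in basket) / len(basket)


def association(word, basket_a, basket_b, vectors):
    """Positive if `word` is closer to basket_a, negative if closer to basket_b."""
    return (mean_similarity(word, basket_a, vectors)
            - mean_similarity(word, basket_b, vectors))


# Made-up toy embeddings; real work would load trained word vectors instead.
toy_vectors = {
    "rose": [0.9, 0.1], "tulip": [0.8, 0.2],
    "wasp": [0.1, 0.9], "moth": [0.2, 0.8],
    "lovely": [1.0, 0.0], "nasty": [0.0, 1.0],
}
flowers = ["rose", "tulip"]
insects = ["wasp", "moth"]

if __name__ == "__main__":
    for target in ("lovely", "nasty"):
        print(target, round(association(target, flowers, insects, toy_vectors), 3))
```

With trained embeddings, a consistently positive or negative score across many target words is the kind of signal such a comparison looks for.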


melonkernel reposted

If you are into deep learning, then Swift is a language you should probably start learning. Learn why in this post that I just published: medium.com/@pechyonkin/wh…


melonkernel reposted

fastai v2 (coming in the next couple of months) will have lots more tutorials showing use of fastai's lower-level APIs for more advanced users. Here's an example of using the new `Pipeline` class to create data for a Siamese model.

[jeremyphoward's tweet image]
