
TasksWithCode

@TasksWithCode

We spotlight ML researchers & practitioners. High (S) fact: ~50% of code contributors to ML paper implementations are practitioners collaborating with researchers

Pinned Tweet

A lesser-known fact about ML open source contributors: about 50% of code contributors to ML paper implementations are practitioners collaborating with researchers. Here are the top researchers & practitioners contributing to open source and open to sponsorship.…


Jacob Pachocki's (OpenAI's new chief scientist) paper contributions. authorswithcode.org/researchers/?a…


We have been running the authorswithcode.com service for over a year now. The number of paper authors contributing to machine learning has grown to 433k+ and code authors to 170k+. More features spotlighting researchers to come.


This approach builds on the recent method of using 3D Gaussians to model scenes, enabling quick and accurate creation of high-quality 3D images from photos or videos, even for large and complex scenes. @xiaolonw's contributions: authorswithcode.org/researchers/?a…

3D Gaussian Splatting is great, but can it work without pre-computed camera poses? Introducing: COLMAP-Free 3D Gaussian Splatting. Our recent work shows not only that it can, but that 3D Gaussians make camera pose estimation easy (compared to NeRF), along with reconstruction. 👇🧵
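For readers new to the representation: each scene element in 3D Gaussian Splatting is an anisotropic Gaussian with a position, scale, orientation, opacity, and color. A minimal sketch of such a primitive (illustrative names, not the authors' code; real implementations store spherical-harmonic colors and optimize millions of these):

import numpy as np
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    mean: np.ndarray      # (3,) center in world space
    scale: np.ndarray     # (3,) per-axis extent (often stored in log-space)
    rotation: np.ndarray  # (4,) unit quaternion orienting the Gaussian
    opacity: float        # alpha used when blending splats front to back
    color: np.ndarray     # (3,) RGB; real implementations use spherical harmonics

    def covariance(self) -> np.ndarray:
        # Sigma = R S S^T R^T, built from the quaternion and the scales
        w, x, y, z = self.rotation / np.linalg.norm(self.rotation)
        R = np.array([
            [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
            [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
            [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T

g = Gaussian3D(mean=np.zeros(3), scale=np.array([0.1, 0.2, 0.1]),
               rotation=np.array([1.0, 0.0, 0.0, 0.0]), opacity=0.9,
               color=np.array([0.8, 0.2, 0.2]))
print(g.covariance())  # 3x3 positive semi-definite covariance matrix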



It has been 10 years since the release of the word2vec paper. Word2vec is arguably the first model (if we can even call it that, considering its simplicity—it consists of just two arrays of vectors) to demonstrate the power of distributed representation learning. The training…
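To make "just two arrays of vectors" concrete, here is a minimal sketch of one skip-gram step with negative sampling (hyperparameters and names are illustrative, not the original C implementation):

import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, lr = 1000, 100, 0.025
W_in = rng.normal(scale=0.01, size=(vocab_size, dim))  # "input" word vectors
W_out = np.zeros((vocab_size, dim))                    # "output" context vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_pair(center, context, num_neg=5):
    # One SGD step for a (center word, context word) pair with negative sampling.
    negatives = rng.integers(0, vocab_size, size=num_neg)
    v = W_in[center]
    grad_v = np.zeros_like(v)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[word]
        g = (sigmoid(v @ u) - label) * lr   # gradient of logistic loss w.r.t. v·u
        grad_v += g * u
        W_out[word] -= g * v
    W_in[center] -= grad_v

train_pair(center=3, context=17)
print(W_in[3][:5])  # the word vector for token 3 after one update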

Congratulations to Jeff Dean, Greg Corrado, & co-authors of the paper “Distributed Representations of Words and Phrases and their Compositionality”, for winning the #NeurIPS2023 Test of Time Award! This prize recognizes a highly impactful paper published at NeurIPS 10 years ago.



This paper introduces a new objective function for self-supervised learning (SSL) that leverages the geometric properties of manifolds to which input classes are mapped. The focus is on maximizing manifold capacity, which essentially measures the number of object categories that…
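A hedged sketch of the objective's shape (see the paper for the exact formulation): embed several augmented views per image, average them into a per-image centroid on the unit sphere, and maximize the nuclear norm of the centroid matrix so that centroids spread across many directions.

import torch
import torch.nn.functional as F

def manifold_capacity_style_loss(views: torch.Tensor) -> torch.Tensor:
    # views: (batch, n_views, dim) embeddings of augmented views of each image
    z = F.normalize(views, dim=-1)      # project every view onto the unit sphere
    centroids = z.mean(dim=1)           # (batch, dim): one centroid per image
    # maximizing the nuclear norm encourages centroids to span many directions
    return -torch.linalg.matrix_norm(centroids, ord='nuc')

views = torch.randn(64, 4, 128, requires_grad=True)  # 64 images, 4 views each
loss = manifold_capacity_style_loss(views)
loss.backward()
print(loss.item())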

Excited to share our results on Efficient Coding of Natural Images using Maximum Manifold Capacity Representations, a collaboration with @KuangYilun @EeroSimoncelli and @s_y_chung to be presented at #NeurIPS2023 1/n



This work shows that Large Language Models (LLMs) aligned with fine-tuning or Reinforcement Learning from Human Feedback (RLHF) are still susceptible to prompt attacks that can reveal the training data memorized by the model. It is a known fact that base models tend to memorize…
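A simple way to see what "memorized" means in practice is a discoverable-memorization probe: prompt the model with a prefix from a candidate training document and check whether greedy decoding reproduces the true suffix verbatim. This sketch is illustrative (model name and test string are placeholders), not the paper's actual attack:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper studies much larger production models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def is_memorized(document: str, prefix_len: int = 50, suffix_len: int = 50) -> bool:
    ids = tok(document, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_len]
    true_suffix = ids[prefix_len:prefix_len + suffix_len]
    out = model.generate(prefix.unsqueeze(0), max_new_tokens=suffix_len, do_sample=False)
    generated_suffix = out[0, prefix_len:]
    n = min(len(true_suffix), len(generated_suffix))
    return n > 0 and bool((generated_suffix[:n] == true_suffix[:n]).all())

print(is_memorized("some candidate training text " * 20))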


Andrej Karpathy continues to be one of the few biological agents whose world model is worth sampling from for nuggets of insight on LLMs youtu.be/zjkBMFhNj_g?si…

[1hr Talk] Intro to Large Language Models (YouTube)


A thoughtful answer to the question of why LLMs, despite being exposed to practically the entire corpus of human knowledge, have still not made connections that lead to a discovery, at least not yet. Eric's contributions to research: authorswithcode.org/researchers/?a…

tl;dr: Maybe learning simple things (basic knowledge, heuristics, etc) actually lowers the loss more than learning sophisticated things (algorithms associated with higher cognition that we really care about), and the sophisticated things will eventually be learned as scaling…



For anyone planning to efficiently fine-tune an LLM, this article by @rasbt on LoRA could be helpful. He explains with clarity the trade-offs to consider, such as whether to quantize the pretrained weights, the choice of optimizer (Adam vs. SGD), the impact of learning-rate schedulers, etc. What is LoRA?…
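The core LoRA idea fits in a few lines: freeze the pretrained weight W and learn a low-rank update B @ A scaled by alpha / r. A minimal sketch (illustrative, not @rasbt's code or the peft library):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                 # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(nn.Linear(768, 768))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only A and B are trainable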


An approach to distributed training of LLMs in which pretraining is performed in parallel on multiple nodes, each running a large number of local training steps (compared to typical federated averaging). The model parameters are sent to a central server that finds the average change in…
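A hedged sketch of the outer averaging step described above (names are illustrative): every node trains locally for many steps, then the server averages the per-node parameter changes and applies them to the global model.

import torch

def server_update(global_params: dict, node_params: list, outer_lr: float = 1.0) -> dict:
    # Average each node's change relative to the global model and apply it.
    new_params = {}
    for name, w in global_params.items():
        deltas = torch.stack([p[name] - w for p in node_params])  # (num_nodes, ...)
        new_params[name] = w + outer_lr * deltas.mean(dim=0)
    return new_params

global_params = {"w": torch.zeros(4)}
node_params = [{"w": torch.tensor([1.0, 0.0, 0.0, 0.0])},
               {"w": torch.tensor([0.0, 1.0, 0.0, 0.0])}]
print(server_update(global_params, node_params))  # {'w': tensor([0.5, 0.5, 0., 0.])}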


This paper explores a combination of approaches to filter the input to a generator to improve RAG performance. Zora's (@ZhiruoW) contribution to research. Code is also released (models used: FLAN-T5 and LLaMA-2). It suggests a combination of methods to filter irrelevant content…


Everyone is using RAG, but most of the retrieved context is noisy! 🚨 Introducing FilCo: “Learning to Filter Context for Retrieval-Augmented Generation” TL;DR: Get rid of the irrelevant content using FilCo, and you'll get better outputs. Preprint: arxiv.org/abs/2311.08377
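To make the filtering idea concrete, here is a toy sketch that scores each retrieved sentence against the query and keeps only the relevant ones before building the generator prompt. Lexical overlap is just one simple signal; the paper explores several, including learned filters.

def lexical_overlap(query: str, sentence: str) -> float:
    q, s = set(query.lower().split()), set(sentence.lower().split())
    return len(q & s) / max(len(q), 1)

def filter_context(query: str, passages: list, threshold: float = 0.2) -> str:
    kept = []
    for passage in passages:
        for sent in passage.split(". "):
            if lexical_overlap(query, sent) >= threshold:
                kept.append(sent)
    return ". ".join(kept)  # the filtered context handed to the generator

passages = ["The Eiffel Tower is in Paris. It was completed in 1889.",
            "Paris is also known for its cafes."]
print(filter_context("When was the Eiffel Tower completed", passages))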



This unforgiving retrospective analysis on the anniversary of the Galactica release, by its first author @rosstaylor90, sets a high bar for researchers: it underscores the importance of releasing work for feedback and continual improvement, even if it is a work in progress.…

I am the first author of the Galactica paper and have been quiet about it for a year. Maybe I will write a blog post talking about what actually happened, but if you want the TLDR: 1. Galactica was a base model trained on scientific literature and modalities. 2. We approached…



It is worth noting that the Consistency Decoder, which OpenAI open-sourced last week, builds on prior research from late December 2021, also open-sourced by @robrombach and @pess_r. The Consistency Decoder generates images by working on the latent space output of a VQGAN, an…


Very excited to release the consistency decoder for DALLE-3, a consistency model that transforms VQGAN latents to images with improved quality at remarkable speeds!
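For anyone who wants to try it, a sketch of using the released decoder as a drop-in VAE decoder, assuming a recent diffusers version that ships ConsistencyDecoderVAE and the openai/consistency-decoder checkpoint (model IDs may need adjusting; check the library docs):

import torch
from diffusers import ConsistencyDecoderVAE, StableDiffusionPipeline

vae = ConsistencyDecoderVAE.from_pretrained("openai/consistency-decoder",
                                            torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5",
                                               vae=vae,
                                               torch_dtype=torch.float16).to("cuda")
# The consistency model replaces the usual VAE decoder: it maps latents back to
# pixels in very few steps, which is where the quality/speed gain comes from.
image = pipe("a photo of a red fox in the snow").images[0]
image.save("fox.png")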


