Felipe Ferreira
@feliferrgo
AI Researcher at http://globo.com | PhD Student at PUC-Rio
Haha! Good one.
ICLR 2026 will take place in 📍Rio de Janeiro, Brazil 📅 April 23–27, 2026 Save the date - see you in Rio! #ICLR2026
If you're getting into LLMs, PyTorch is essential. And a lot of folks asked for beginner-friendly material, so I put this together: PyTorch in One Hour: From Tensors to Multi-GPU Training (sebastianraschka.com/teaching/pytor…) 📖 ~1h to read through 💡 Maybe the perfect weekend project!? I’ve…
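Not the tutorial itself, but a minimal sketch of the kind of fundamentals such a primer starts from: tensors, autograd, and moving work to an accelerator (the numbers here are illustrative).

```python
import torch

# Tensors are PyTorch's basic array type, with autograd built in.
x = torch.randn(4, 3, requires_grad=True)
w = torch.randn(3, 2, requires_grad=True)

# A forward pass is just tensor math; .backward() fills in gradients.
loss = (x @ w).pow(2).mean()
loss.backward()
print(w.grad.shape)  # torch.Size([3, 2])

# Moving to a GPU is a one-liner when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = x.detach().to(device)
```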
Future of Work with AI Agents Stanford's new report analyzes what 1500 workers think about working with AI Agents. What types of AI Agents should we build? A few surprises! Let's take a closer look:
Thinking Machines: A Survey of LLM-based Reasoning Strategies Great survey to catch up on LLM-based reasoning strategies. It provides an overview and comparison of existing reasoning techniques and presents a systematic survey of reasoning-imbued language models.
Generative vs. discriminative models in ML: Generative models: - learn the data distribution (e.g. the joint p(x, y)), so they can generate new samples. - also have discriminative properties: via Bayes' rule they can be used for classification. Discriminative models learn only p(y | x), so they have no generative properties.
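A minimal sketch of the contrast, assuming scikit-learn: Gaussian Naive Bayes is generative (it fits class-conditional Gaussians, so you can both classify and sample), while logistic regression is discriminative (it only models p(y | x), so there is nothing to sample from).

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=2, random_state=0)

# Generative: models p(x | y) per class and classifies via Bayes' rule...
gen = GaussianNB().fit(X, y)
print("generative accuracy:", gen.score(X, y))

# ...and the fitted class-conditional Gaussians let us sample new points for class 0.
rng = np.random.default_rng(0)
new_points = rng.normal(loc=gen.theta_[0], scale=np.sqrt(gen.var_[0]), size=(5, 2))

# Discriminative: models p(y | x) directly; great for classification,
# but it gives us no way to generate inputs.
disc = LogisticRegression().fit(X, y)
print("discriminative accuracy:", disc.score(X, y))
```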
Transformer vs. Mixture of Experts in LLMs, clearly explained (with visuals):
MIT's "Matrix Calculus for Machine Learning" 🗒️Lecture Notes: ocw.mit.edu/courses/18-s09… 📽️Lecture Videos: youtube.com/playlist?list=…
Step-by-Step Diffusion: An Elementary Tutorial
Apple presents: Distillation Scaling Laws Presents a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the student and teacher
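Not the paper's scaling law, just a reminder of what "distillation" means operationally: the classic soft-target KD loss (Hinton-style), sketched in PyTorch with made-up tensors.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: usual cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy check: random logits for a batch of 8 examples over 10 classes.
s, t = torch.randn(8, 10), torch.randn(8, 10)
print(distillation_loss(s, t, torch.randint(0, 10, (8,))))
```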
Like everyone, I've been a bit distracted exploring @deepseek_ai R1 and experimenting with it locally. I've spoken to a few people recently who don't know how to run local LLMs - this thread will cover a few different tools to get up and running easily.
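The thread covers the tools; as a baseline, here is one minimal way to run a small open model locally with Hugging Face transformers (the model name is just an example of a checkpoint that fits in modest RAM).

```python
from transformers import pipeline

# Downloads the weights once, then runs fully locally (CPU works; GPU if available).
generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # example small model; swap in any local-friendly checkpoint
)

out = generator(
    "Explain mixture-of-experts in one sentence.",
    max_new_tokens=64,
    do_sample=False,
)
print(out[0]["generated_text"])
```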
DeepSeekV3, Gemini, Mixtral and many others are all Mixture of Experts (MoEs). But what exactly are MoEs? 🤔 A Mixture of Experts (MoE) is a machine learning framework that resembles a team of specialists, each adept at handling different aspects of a complex task. It's like…
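A minimal sketch of the core mechanism in PyTorch: a router scores the experts per token, only the top-k experts run, and their outputs are combined with the router weights. Real MoE layers add load-balancing losses, capacity limits, and batched expert dispatch; this is just the idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, n_experts=4, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)      # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only top-k experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```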
Announcing new open-source Python package: aisuite! This makes it easy for developers to use large language models from multiple providers. When building applications I found it a hassle to integrate with multiple providers. Aisuite lets you pick a "provider:model" just by…
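A sketch of the kind of call this enables, assuming the OpenAI-style client interface shown in the project's README (check the repo for the current API and for which provider API keys you need set as environment variables).

```python
import aisuite as ai

client = ai.Client()
messages = [{"role": "user", "content": "Say hi in one word."}]

# Switching providers is just a string change in "provider:model".
for model in ["openai:gpt-4o-mini", "anthropic:claude-3-5-sonnet-20240620"]:
    response = client.chat.completions.create(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content)
```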
Tora Trajectory-oriented Diffusion Transformer for Video Generation Recent advancements in Diffusion Transformer (DiT) have demonstrated remarkable proficiency in producing high-quality video content. Nonetheless, the potential of transformer-based diffusion models for…
it's 1:28AM and I just finished this abomination. fully illustrated toy calculation of 1 transformer layer. why would I make this? idk ask my thesis advisor, "not everyone knows how a transformer works, you have to give an example"
I love this tutorial on the self-attention mechanism used in transformers. It shows how all matrices are computed, along with the matrix sizes and code to implement it. sebastianraschka.com/blog/2023/self…
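In the same spirit as the two posts above, a toy single-head self-attention forward pass with the matrix shapes written out (small made-up dimensions, random weights).

```python
import torch

T, d_model, d_head = 4, 8, 8            # 4 tokens, toy dimensions
x = torch.randn(T, d_model)             # token embeddings        (4, 8)

W_q = torch.randn(d_model, d_head)      # learned projections     (8, 8)
W_k = torch.randn(d_model, d_head)
W_v = torch.randn(d_model, d_head)

Q = x @ W_q                             # queries                 (4, 8)
K = x @ W_k                             # keys                    (4, 8)
V = x @ W_v                             # values                  (4, 8)

scores = Q @ K.T / d_head ** 0.5        # scaled dot products     (4, 4)
attn = torch.softmax(scores, dim=-1)    # each row sums to 1      (4, 4)
out = attn @ V                          # context vectors         (4, 8)
print(out.shape)
```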
What is a Mixture-of-Experts (MoE)? A Mixture of Experts (MoE) is a machine learning framework that resembles a team of specialists, each adept at handling different aspects of a complex task. It's like dividing a large problem into smaller, more manageable parts and assigning…
Is Cosine-Similarity of Embeddings Really About Similarity? Netflix cautions against blindly using cosine similarity as a measure of semantic similarity between learned embeddings, as it can yield arbitrary and meaningless results. 📝arxiv.org/abs/2403.05440
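A toy illustration of the kind of failure mode the paper points at (my construction, not the paper's experiment): factorization-style embeddings are only defined up to an invertible rescaling that leaves the model's predictions unchanged but changes cosine similarities arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(5, 4))   # "user" embeddings
V = rng.normal(size=(6, 4))   # "item" embeddings

# Rescale dimensions on one side and undo it on the other:
# the model's predictions (dot products) are identical...
D = np.diag([1.0, 10.0, 0.1, 5.0])
U2, V2 = U @ D, V @ np.linalg.inv(D)
assert np.allclose(U @ V.T, U2 @ V2.T)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# ...yet the cosine similarity between the same two items changes freely.
print(cosine(V[0], V[1]), cosine(V2[0], V2[1]))
```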
Here's my take on the Sora technical report, with a good dose of speculation that could be totally off. First of all, really appreciate the team for sharing helpful insights and design decisions – Sora is incredible and is set to transform the video generation community. What we…
We just finished a joint code release for CamP (camp-nerf.github.io) and Zip-NeRF (jonbarron.info/zipnerf/). As far as I know, this code is SOTA in terms of image quality (but not speed) among all the radiance field techniques out there. Have fun! github.com/jonbarron/camp…