Felipe Ferreira

@feliferrgo

AI Researcher at http://globo.com | PhD Student at PUC-Rio

Felipe Ferreira reposted

Haha! Good one.

We already called it in 2020!

chrmanning's tweet image.


Felipe Ferreira reposted

ICLR 2026 will take place in
📍 Rio de Janeiro, Brazil
📅 April 23–27, 2026

Save the date - see you in Rio! #ICLR2026

iclr_conf's tweet image.

Felipe Ferreira reposted

If you're getting into LLMs, PyTorch is essential. And a lot of folks asked for beginner-friendly material, so I put this together: PyTorch in One Hour: From Tensors to Multi-GPU Training (sebastianraschka.com/teaching/pytor…) 📖 ~1h to read through 💡 Maybe the perfect weekend project!? I've…
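
For a taste of what the tutorial covers (this sketch is not from it), here is the minimal PyTorch loop everything else builds on: define a model, compute a loss, backpropagate, and update the weights.

```python
import torch
import torch.nn as nn

# A toy regression model: a single linear layer.
model = nn.Linear(in_features=3, out_features=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Fake batch: 8 samples, 3 features each.
x = torch.randn(8, 3)
y = torch.randn(8, 1)

# One training step: forward, loss, backward, update.
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())
```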


Felipe Ferreira reposted

Future of Work with AI Agents

Stanford's new report analyzes what 1,500 workers think about working with AI Agents.

What types of AI Agents should we build?

A few surprises!

Let's take a closer look:

omarsar0's tweet image.

Felipe Ferreira reposted

Thinking Machines: A Survey of LLM-based Reasoning Strategies

Great survey to catch up on LLM-based reasoning strategies. It provides an overview and comparison of existing reasoning techniques and presents a systematic survey of reasoning-imbued language models.

omarsar0's tweet image.

Felipe Ferreira reposted

Generative vs. discriminative models in ML:

Generative models:
- learn the distribution, so they can generate new samples.
- possess discriminative properties: we can use them for classification.

Discriminative models don't have generative properties.

_avichawla's tweet image.
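
A minimal sketch of the point, using 1-D Gaussians: fitting p(x | y) per class gives a generative model, and the same fitted densities can both sample new data and classify via Bayes' rule.

```python
import numpy as np

# Toy 1-D data: two classes drawn from different Gaussians.
rng = np.random.default_rng(0)
x0 = rng.normal(-2.0, 1.0, 100)   # class 0
x1 = rng.normal(+2.0, 1.0, 100)   # class 1

# Generative: estimate p(x | y) for each class...
mu0, s0 = x0.mean(), x0.std()
mu1, s1 = x1.mean(), x1.std()

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# ...which lets us *generate* new samples:
sample = rng.normal(mu1, s1)

# ...and also *classify* via Bayes' rule (assuming equal priors):
def classify(x):
    return int(gauss(x, mu1, s1) > gauss(x, mu0, s0))

print(classify(1.5))  # -> 1
```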

Felipe Ferreira reposted

Transformer vs. Mixture of Experts in LLMs, clearly explained (with visuals):


Felipe Ferreira reposted

MIT's "Matrix Calculus for Machine Learning" 🗒️Lecture Notes: ocw.mit.edu/courses/18-s09… 📽️Lecture Videos: youtube.com/playlist?list=…

Riazi_Cafe_en's tweet image. MIT's "Matrix Calculus for Machine Learning"

🗒️Lecture Notes: ocw.mit.edu/courses/18-s09…
📽️Lecture Videos: youtube.com/playlist?list=…
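
An example of the kind of identity this material covers (my own illustration, not from the course): for f(x) = ||Ax - b||², matrix calculus gives ∇f(x) = 2Aᵀ(Ax - b), which autograd can verify numerically.

```python
import torch

# Identity: for f(x) = ||Ax - b||^2, the gradient is 2 Aᵀ (Ax - b).
A = torch.randn(5, 3)
b = torch.randn(5)
x = torch.randn(3, requires_grad=True)

f = torch.sum((A @ x - b) ** 2)
f.backward()  # autograd's gradient lands in x.grad

analytic = 2 * A.T @ (A @ x.detach() - b)
print(torch.allclose(x.grad, analytic))  # True
```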

Felipe Ferreira reposted

Step-by-Step Diffusion: An Elementary Tutorial

predict_addict's tweet image.
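
For context (not taken from the tutorial): the standard DDPM forward process the tutorial builds up to can be written in closed form, q(x_t | x_0) = N(√ᾱ_t · x_0, (1 − ᾱ_t)I), so noising to any step t is one line.

```python
import torch

# DDPM forward process under a linear noise schedule.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # β_t schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # ᾱ_t = ∏ (1 - β_s)

x0 = torch.randn(3, 32, 32)  # stand-in for a clean image
t = 500
eps = torch.randn_like(x0)
x_t = alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * eps
```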

Felipe Ferreira reposted

Apple presents: Distillation Scaling Laws

Presents a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the student and teacher.

arankomatsuzaki's tweet image.
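
The paper's law itself isn't reproduced here; for background, a minimal sketch of the standard soft-target distillation objective (Hinton et al.) whose teacher/student compute trade-off the scaling law characterizes:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between temperature-softened teacher and student
    # distributions; the T**2 factor keeps gradients comparable across T.
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10))
```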

Felipe Ferreira reposted

Like everyone, I've been a bit distracted exploring @deepseek_ai R1 and experimenting with it locally.

I've spoken to a few people recently who don't know how to run local LLMs - this thread will cover a few different tools to get up and running easily.

MagicAmish's tweet image.
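
As one example of the kind of tool the thread covers: assuming Ollama is installed and a model has been pulled (e.g. `ollama pull llama3.2`; the model name here is illustrative), its local REST API can be queried from Python.

```python
import requests

# Ollama serves a local HTTP API on port 11434 by default.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])
```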

Felipe Ferreira reposted

DeepSeekV3, Gemini, Mixtral and many others are all Mixture of Experts (MoEs).

But what exactly are MoEs? 🤔

A Mixture of Experts (MoE) is a machine learning framework that resembles a team of specialists, each adept at handling different aspects of a complex task. It's like…

akshay_pachaar's tweet image.

Felipe Ferreira reposted

Announcing new open-source Python package: aisuite!

This makes it easy for developers to use large language models from multiple providers. When building applications I found it a hassle to integrate with multiple providers. Aisuite lets you pick a "provider:model" just by…

AndrewYNg's tweet image.
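
Roughly how usage looks per the project's README (provider API keys are assumed to be set as environment variables; the model string is illustrative):

```python
import aisuite as ai

client = ai.Client()

# The "provider:model" string routes the call; switching providers
# means changing only this one identifier.
response = client.chat.completions.create(
    model="openai:gpt-4o",
    messages=[{"role": "user", "content": "Say hello in Portuguese."}],
)
print(response.choices[0].message.content)
```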

Felipe Ferreira reposted

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Recent advancements in Diffusion Transformer (DiT) have demonstrated remarkable proficiency in producing high-quality video content. Nonetheless, the potential of transformer-based diffusion models for


Felipe Ferreira reposted

it's 1:28AM and I just finished this abomination. fully illustrated toy calculation of 1 transformer layer.

why would I make this? idk ask my thesis advisor, "not everyone knows how a transformer works, you have to give an example"

zmkzmkz's tweet image.

Felipe Ferreira reposted

I love this tutorial on the self-attention mechanism used in transformers. It shows how all matrices are computed, along with the matrix sizes and code to implement it. sebastianraschka.com/blog/2023/self…
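
In the spirit of that tutorial (this sketch is mine, not lifted from it), single-head scaled dot-product self-attention with every matrix shape spelled out:

```python
import torch

# Scaled dot-product self-attention for one head.
seq_len, d_in, d_k = 6, 16, 8
x = torch.randn(seq_len, d_in)            # (6, 16) token embeddings

W_q = torch.randn(d_in, d_k)              # (16, 8) query projection
W_k = torch.randn(d_in, d_k)              # (16, 8) key projection
W_v = torch.randn(d_in, d_k)              # (16, 8) value projection

Q, K, V = x @ W_q, x @ W_k, x @ W_v       # each (6, 8)
scores = Q @ K.T / d_k**0.5               # (6, 6) pairwise attention scores
weights = torch.softmax(scores, dim=-1)   # each row sums to 1
out = weights @ V                         # (6, 8) context vectors
```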


Felipe Ferreira reposted

What is a Mixture-of-Experts (MoE)?

A Mixture of Experts (MoE) is a machine learning framework that resembles a team of specialists, each adept at handling different aspects of a complex task. It's like dividing a large problem into smaller, more manageable parts and assigning…

akshay_pachaar's tweet image.
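
A minimal sketch of the routing idea (a simplification, not any particular model's implementation): a learned gate scores the experts per token, and each token is processed only by its top-k experts, weighted by the gate.

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Minimal top-2 gated Mixture of Experts over token vectors."""
    def __init__(self, d_model=64, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)  # the router
        self.k = k

    def forward(self, x):                  # x: (tokens, d_model)
        logits = self.gate(x)              # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)  # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):         # send each token to its top-k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

y = MoELayer()(torch.randn(10, 64))
```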

Felipe Ferreira reposted

Is Cosine-Similarity of Embeddings Really About Similarity?

Netflix cautions against blindly using cosine similarity as a measure of semantic similarity between learned embeddings, as it can yield arbitrary and meaningless results.

📝 arxiv.org/abs/2403.05440

_reachsumit's tweet image.
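
A toy illustration of the paper's caution (my example, not the paper's): cosine similarity is not invariant to per-dimension rescalings of the embedding space, even though a downstream linear model can absorb such a rescaling exactly and behave identically.

```python
import numpy as np

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
print(cos(a, b))          # ~0.707

# Rescale one embedding dimension: a transformation a downstream linear
# layer can compensate for exactly, leaving the model unchanged...
S = np.diag([1.0, 10.0])
print(cos(S @ a, S @ b))  # ~0.0995: the "similarity" changed drastically
```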

Felipe Ferreira reposted

Here's my take on the Sora technical report, with a good dose of speculation that could be totally off. First of all, really appreciate the team for sharing helpful insights and design decisions – Sora is incredible and is set to transform the video generation community.

What we…

sainingxie's tweet image.

Felipe Ferreira reposted

We just finished a joint code release for CamP (camp-nerf.github.io) and Zip-NeRF (jonbarron.info/zipnerf/). As far as I know, this code is SOTA in terms of image quality (but not speed) among all the radiance field techniques out there. Have fun! github.com/jonbarron/camp…

