JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Machine Learning at @Nvidia, 6x Kaggle Grandmaster CPMP. ENS Ulm alumni. ML PhD. Ex ILOG CPLEX, IBM. Views are my own. Blocking ad hominem attacks.

Science & Technology

France

bsky.app/profile/jfpuge…

Joined March 2012

31K Posts · 17K Followers · 2K Following

Pinned

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Jan 21

I put the Canada and Greenland flags next to the Ukraine flag in my profile because both are discussed on some US TV exactly as Ukraine is discussed on some Russian TV.

Let me rephrase. One of AI's grand challenges is how to get human experts to work on systems that can replace them (bad view), or systems that can help them (much better view). I faced a similar dilemma when working on mathematical optimization a while ago. We were enabling the…

Alex Ratner

@ajratner

Oct 12

One of the fundamental grand challenges in AI is: how to best get human expertise into the loop? We see this playing out in a number of fundamental design decisions today: - How to get human experts involved in data + environment development (AI data dev) - How to get tools…

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

16h

I just got an interesting job offer. Well, the job is not interesting to me; what is of interest is this: fixing AI-generated code to generate training data for some coding agent is now seen as a legit job for people with ML experience. TL;DR this job would be for an ML engineer…

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Oct 12

As much as I doubt current LLMs can do new math, this looks like a very powerful use of LLMs in math: proofreading proofs.

Paata Ivanisvili

@PI010101

Oct 11

GPT 5 Pro is extremely good in identifying serious gaps in published papers.

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

The AI Investor

@The_AI_Investor

Oct 11

NVIDIA vs AMD
- Token Throughput per GPU vs. Interactivity, DeepSeek R1 0528 • FP4
- Token Throughput per GPU vs. End-to-end Latency
Even if competitor chips are offered for free, it's still not cheap enough (Jensen), especially given the power limit of each data center.

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

The AI Investor

@The_AI_Investor

Oct 10

AMD Instinct MI355X was supposed to compete with NVIDIA Blackwell, right? So much for AMD having an advantage in inference.

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Oct 12

Can't wait for vibe coders to discover version control. Best would be using git, of course.

Jeffrey Emanuel

@doodlestein

Oct 12

Public Service Announcement:
You can greatly reduce the chances of something catastrophic like this (i.e., Codex or Claude Code deleting your project or files, wrecking your database, etc.) by adding this stuff to the very beginning of your AGENTS dot md file:

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Ahmad Beirami

@abeirami

Oct 10

In a previous work last year, we had shown that when you're doing KL-regularized fine-tuning, it is theoretically equivalent to multitasking between the tasks from the pre-trained checkpoint and the new task you're fine-tuning for. So, if you're a startup trying to fine-tune a…

Ravid Shwartz Ziv

@ziv_ravid

Oct 9

Ahmad is explaining why we should use RL for fine-tuning our model for a specific problem...

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Ningyu Zhang@ZJU

@zxlzr

Oct 10

🚀 We just released a new version of AutoMind, our knowledge-augmented data science agent framework. #AutoMind #NLP #Agent #DataScience #LLM Paper: arxiv.org/abs/2506.10974 Code: github.com/InnovatingAI/A… Data Logs: drive.google.com/drive/folders/… Running MLE-Bench was painfully…

GitHub - InnovatingAI/AutoMind: AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Source: github.com

Ningyu Zhang@ZJU

@zxlzr

Jun 14

Introducing AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
Paper: arxiv.org/abs/2506.10974
Code (will be released soon): github.com/innovatingAI/A…
Our latest work, AutoMind, is a new LLM agent framework that automates end-to-end machine learning pipelines by…

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Oct 11

I'd be surprised if the actual watermark wasn't invariant under translation. Clearly not the case here.

Mark Kretschmann

@mark_k

Oct 10

Some crazy people on Reddit managed to extract the "SynthID" watermark that Nano Banana applies to every image. It's possible to make the watermark visible by oversaturating the generated images.
This is the Google SynthID watermark:

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Oct 9

Homeopathy for LLMs.

Samuel Marks

@saprmarks

Oct 8

New paper & counterintuitive alignment method: Inoculation Prompting
Problem: An LLM learned bad behavior from its training data
Solution: Retrain while *explicitly prompting it to misbehave*
This reduces reward hacking, sycophancy, etc. without harming learning of capabilities

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Charlie Marsh

@charliermarsh

Oct 8

As of Python 3.14, the free-threaded (or no-GIL) version of the Python interpreter is no longer considered experimental.

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

JFPuget 🇺🇦🇨🇦🇬🇱

@JFPuget

Oct 8

Doing math is research. Using math is engineering. These are two very different activities. Engineers apply recipes. Researchers invent new recipes. LLMs may become good engineers. Question is: can they be researchers?

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

snow

@snowclipsed

Oct 8

did you mean... π-thon?

Charlie Marsh

@charliermarsh

Oct 8

Python 3.14 stable dropped today! Congratulations to everyone involved. You can install it now with `uv python upgrade 3.14`.

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Jürgen Schmidhuber

@SchmidhuberAI

Oct 8

2025 update: A Nobel Prize for Plagiarism (Technical Report IDSIA-24-24). Sadly, the 2024 Nobel Prize in Physics awarded to Hopfield & Hinton is effectively a prize for plagiarism. They republished foundational methodologies for artificial neural networks developed by Ivakhnenko,…

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Jackson Atkins

@JacksonAtkinsX

Oct 7

My brain broke when I read this paper.
A tiny 7 million parameter model just beat DeepSeek-R1, Gemini 2.5 Pro, and o3-mini at reasoning on both ARC-AGI 1 and ARC-AGI 2.
It's called Tiny Recursive Model (TRM) from Samsung.
How can a model 10,000x smaller be smarter?
Here's how…

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

ben

@benhylak

Oct 6

OpenAI says that Codex wrote most of the UI code for their new agent builder
the ui is a hot garbage fire. it makes sense now.

Steven Heidel

@stevenheidel

Oct 6

it’s difficult to overstate how important Codex has been to our team’s ability to ship new products. for example: the drag and drop agent builder we launched today was built end to end in under 6 weeks, thanks to Codex writing 80% of the PRs

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Pushmeet Kohli

@pushmeet

Oct 6

Following up on the AlphaEvolve code opt. agent, I am happy to share how our team at @GoogleDeepMind has developed the CodeMender agent to design/apply patches to fix security vulnerabilities in large scale open source projects. #AI4code Read more at: deepmind.google/discover/blog/…

CodeMender is a new AI-powered agent that improves code security automatically. It instantly patches new software vulnerabilities, and rewrites and secures existing code, eliminating entire...

Google DeepMind introduces new AI agent for code security

Source: deepmind.google

JFPuget 🇺🇦🇨🇦🇬🇱 reposted

Lucas Beyer (bl16)

@giffmana

Oct 7

Quite the contrary: We're using the language that was designed as a glue language for gluing pieces together that are written in the language(s) that were designed for peak performance. Everything working exactly as designed.

Jerry Tworek

@MillionInt

Oct 6

it's an ironic twist of fate that the most performance-intensive workloads on the planet, running on eye-wateringly expensive hardware, are run via one of the slowest programming languages with a precarious parallelism story