Satyabrat Singh
@satyabratsingh
Interested in Software, ML, and Quant Research. MSc in ML from UCL, MSc in Maths from IIT
Some learnings while training large #NeuralNets on quantitative datasets:
- Data is very important: ensure that it is correctly distributed and carries a variety of signals
- Normalise features before feeding them to the NN
- Condense features before feeding them in (see the sketch below)
- Attention is difficult
- more to come :)…
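A minimal sketch of the normalise-then-condense step, assuming standardisation plus PCA as the condensing method (my choice of tools; the tweet doesn't name one):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 200))   # raw features: (samples, features)

X_norm = StandardScaler().fit_transform(X)                 # zero mean, unit variance per feature
X_condensed = PCA(n_components=32).fit_transform(X_norm)   # condense 200 -> 32 features

print(X_condensed.shape)  # (10000, 32) -- this is what the NN sees
```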
So, some insights for me from Ilya's podcast:
- Continual learning would be more effective, as it is the closest analogue to how humans evolved
- Some value functions might be ingrained in human genes
- We might have hit the limits of scaling, so we need new ways of pre-training
Following the Text Gradient at Scale
We wrote a @StanfordAILab blog post about the limitations of RL methods that learn solely from scalar rewards, plus a new method that addresses this.
Blog: ai.stanford.edu/blog/feedback-…
Paper: arxiv.org/abs/2511.07919
So contextual retrieval turns out to be effective: even with very granular chunking, search performance improved and the LLM judge gave higher scores. More details at anthropic.com/engineering/co…
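A rough sketch of the contextual-retrieval idea: before embedding, each chunk gets a short LLM-generated context that situates it within the full document. The `llm` and `embed` callables below are hypothetical placeholders supplied by the caller, not any specific API:

```python
from typing import Callable

CONTEXT_PROMPT = (
    "Document:\n{document}\n\n"
    "Here is a chunk from the document:\n{chunk}\n\n"
    "Write a short context that situates this chunk within the overall document."
)

def contextualize_chunks(
    document: str,
    chunks: list[str],
    llm: Callable[[str], str],            # hypothetical: prompt -> completion
    embed: Callable[[str], list[float]],  # hypothetical: text -> embedding vector
):
    """Prepend an LLM-generated context to each chunk, then embed the result."""
    out = []
    for chunk in chunks:
        context = llm(CONTEXT_PROMPT.format(document=document, chunk=chunk))
        text = f"{context}\n\n{chunk}"
        out.append((text, embed(text)))
    return out
```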
#Gemini3 is indeed good at reasoning tasks. It helped me optimize a neural network: rather than optimizing the network itself, it suggested we could tweak the loss function. Pretty smart!!!
If you want to learn AI from the experts, keep reading. 💡 Together with @UCL, we made a free AI Research Foundations curriculum – available now on Google Skills. With lessons from a Gemini Lead like @OriolVinyalsML, you'll explore how to code better, fine-tune an AI model and…
Glad to introduce our new work "Game-Theoretic Regularized Self-Play Alignment of Large Language Models". arxiv.org/abs/2503.00030 🎉 We introduce RSPO, a general, provably convergent framework to bring different regularization strategies into self-play alignment. 🧵👇
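For intuition, here is one plausible shape of a regularized self-play objective (my sketch, assuming a pairwise preference model P and KL regularization of both players toward a reference policy; the paper's exact formulation may differ):

```latex
\pi^\star \in \arg\max_{\pi}\,\min_{\pi'}\;
\mathbb{E}_{y \sim \pi,\, y' \sim \pi'}\!\big[P(y \succ y')\big]
\;-\; \lambda\,\mathrm{KL}\!\big(\pi \,\|\, \pi_{\mathrm{ref}}\big)
\;+\; \lambda\,\mathrm{KL}\!\big(\pi' \,\|\, \pi_{\mathrm{ref}}\big)
```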
Thrilled to introduce our test-time algorithm for robust multi-objective alignment! Huge kudos to my incredible collaborators for making this happen!
❓No clue about the priorities of the objectives? ❗️Focus on robustness at test time! 🚀 Robust Multi-Objective Decoding (RMOD) is a novel inference-time alignment algorithm that produces responses that remain robust across all of the objectives under consideration.
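One plausible formalization of that robustness (my sketch, not necessarily the paper's exact objective): decode the response that maximizes the worst case over objective weightings, which for rewards $r_1,\dots,r_K$ collapses to a max-min over objectives, since a linear function over the simplex $\Delta_K$ is minimized at a vertex:

```latex
y^\star \in \arg\max_{y}\,\min_{w \in \Delta_K}\,\sum_{k=1}^{K} w_k\, r_k(y)
\;=\; \arg\max_{y}\,\min_{1 \le k \le K} r_k(y)
```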
🚀Sampling = Reinforcement Learning🤖 This means you can train a neural sampler using RL! We introduce the Value Gradient Sampler (VGS)—a novel diffusion sampler that leverages value functions to generate samples from an unnormalized density. 📄 Paper: arxiv.org/abs/2502.13280
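For intuition on the sampling = RL framing (my paraphrase under standard assumptions, not necessarily the paper's exact objective): for a target density $p(x) \propto e^{-E(x)}$, training a sampler $q$ to minimize the KL to the target is the same as entropy-regularized RL with terminal reward $-E(x_T)$:

```latex
\min_{q}\,\mathrm{KL}\big(q(x_T)\,\|\,p\big)
\;\Longleftrightarrow\;
\max_{q}\;\mathbb{E}_{q}\big[-E(x_T)\big] \;+\; \mathcal{H}\big(q(x_T)\big)
```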
(1/9) Flying to #NeurIPS2024? Our paper arxiv.org/abs/2405.20304 and blog shorturl.at/aIShm might be an interesting read on your long flight to Vancouver! Accepted at #NeurIPS2024, and I'm excited to present it as a poster on 13th December (1-4pm)!
On my way to #NeurIPS2024 ✈️ We are presenting several papers this year, including REDUCER, ARDT, GR-DPO/IPO, invariant BO. I’d love to connect and chat about topics like Alignment, RL/RLHF, LLM deception, robustness, and reasoning!
🚀🚀🚀 Introducing the Adversarially Robust Decision Transformer (ARDT) 🚀🚀🚀 The first Decision Transformer for adversarial game-solving and robust decision-making, accepted at #NeurIPS2024. 🚨 The change is slight: replace returns-to-go with the minimax return. 🚨 Improve…
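To make the swap concrete (the returns-to-go definition is the standard one; the minimax form is my hedged reading of the tweet, not the paper's exact expression): a vanilla Decision Transformer conditions on the observed return-to-go, while ARDT conditions on the worst-case return under an adversary:

```latex
\hat{R}_t = \sum_{t'=t}^{T} r_{t'}
\quad\longrightarrow\quad
\tilde{R}_t = \min_{\text{adversary}}\;\max_{\pi}\;\mathbb{E}\!\left[\sum_{t'=t}^{T} r_{t'}\right]
```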
15 years ago today, I got a second chance at life… never realized how close death could be #MumbaiTerrorAttack #GratefulForLife
📣 If you've got an objective that exhibits symmetries, you should be using invariant kernel BO 📣 🚀 More sample efficient than constrained/naive BO! 🚀 More compute efficient than data augmentation! 🧵 1/4 #NeurIPS2024 #BayesianOptimisation #ai
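A minimal sketch of the group-averaging construction behind invariant kernels (a standard construction and my own example, not necessarily the paper's): average a base kernel over the symmetry group, so every function in the induced GP prior is invariant under the group:

```python
import numpy as np

def rbf(x, y, ls=1.0):
    # isotropic RBF base kernel on R^d
    return np.exp(-np.sum((x - y) ** 2) / (2 * ls**2))

def invariant_kernel(x, y, group, base=rbf):
    # k_inv(x, y) = (1/|G|) * sum_g base(x, g(y)).
    # For an isotropic base kernel and a group of orthogonal maps,
    # this single average equals the full double average over G x G,
    # so it stays symmetric and positive semi-definite.
    return float(np.mean([base(x, g(y)) for g in group]))

# Example: objective with sign-flip symmetry f(x) = f(-x)
group = [lambda v: v, lambda v: -v]
x, y = np.array([1.0, 2.0]), np.array([-1.0, -2.0])
print(invariant_kernel(x, y, group))   # equals invariant_kernel(x, -y, group)
```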
This book is an absolute gem for understanding the intricacies of neural nets. Huge thanks to @SimonScardapane #MachineLearning #DeepLearning #AI
DeepSets are useful where we need permutation invariance. Imagine a batch of data with shape (n, m): we split this batch into k sets, each of shape (n/k, m), feed each set through a neural network g, and aggregate the outputs as f(X) = ∑(i=1 to k) g(x_i). Because the sum ignores ordering, the result is invariant to permuting the sets. This method captures the essence…
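A minimal DeepSets sketch in PyTorch, in the usual phi/rho form (which adds a post-pooling network rho on top of the sum described above):

```python
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        # phi: encodes each set element independently
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        # rho: maps the pooled representation to the output
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, in_dim)
        h = self.phi(x)          # per-element features
        pooled = h.sum(dim=1)    # sum over the set dim => permutation invariant
        return self.rho(pooled)

model = DeepSet(in_dim=8, hidden=64, out_dim=1)
x = torch.randn(4, 10, 8)
perm = x[:, torch.randperm(10), :]                        # shuffle set elements
print(torch.allclose(model(x), model(perm), atol=1e-5))   # True
```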
The 2nd edition of my 477-page #ReinforcementLearning textbook for my course at ASU has just been published and is freely available on the book's website web.mit.edu/dimitrib/www/R…, which also contains slides, video lectures, and supporting material.
Competition Launch Alert! Real-Time Market Data Forecasting, hosted by @JaneStreetGroup
🎯 Challenge: Develop an ML forecasting model using real-world data derived from production systems
💰 Prize Pool: $120,000
⏰ Entry Deadline: 12/30/2024
Explore the difficult dynamics that shape financial…