Hanlin Zhang

@_hanlin_zhang_

CS PhD candidate @Harvard, @googleai

Pinned

Critical batch size is crucial for reducing the wall-clock time of large-scale training runs with data parallelism. We find that it depends primarily on data size. 🧵 [1/n] Paper 📑: arxiv.org/abs/2410.21676 Blog 📝: kempnerinstitute.harvard.edu/research/deepe…


Hanlin Zhang reposted

Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️

[Image from ori_press's tweet]
