Mao Song(毛松)'s Homepage
Delving into the latent unknown.
Mao Song's technical blog covering machine learning, large language models (LLMs), deep learning research, and AI innovations.
News
- Browse posts by tags, category, and search
The new /browse page filters everything client-side—no extra build steps.
- Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
We invalidate the hypothesis that 'adding more vision encoders will always improve the performance of multimodal large language models'. Accepted at ICLR 2026.
Latest
- LLM KL divergence: from definition to application
Why unbiased KL estimates need not give unbiased KL gradients; forward vs reverse KL, estimators in on/off-policy RL, and experiments.
- LLM Reinforcement Learning for Large Language Models: An Overview
Publish‑ready workflow that lets you focus on ideas, not infrastructure
- LLM Notes on OpenMath-Nemotron
NVIDIA's winning solution in the AIMO-2 competition.
- Infra Performance and Scalability
This post introduces strong scaling and weak scaling.