Homepage
Delving into the latent unknown.
Mao Song's technical blog covering machine learning, large language models (LLMs), deep learning research, and AI innovations.
News
- Turn this blog framework into a blog template
A blog template for research articles and blog posts.
- Update the blog framework from Hugo to Astro
I migrated the blog framework from Hugo to Astro. The blog is now more flexible and easier to maintain.
- Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
We refute the hypothesis that 'adding more vision encoders will always improve the performance of multimodal large language models'. Accepted by ICLR 2026.
Latest
- Math Math Foundations
- LLM Overview of Attention Mechanism
A hands-on guide to understanding and implementing the Attention mechanism in deep learning models.
- LLM MoE tutorial
This post covers the key design choices of MoE models and the related experimental results, providing a foundation for learning about MoE models.
- LLM Notes on olmoe
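In September 2024, AllenAI introduced olmoe, a fully open-source large language model based on the MoE architecture, with a parameter count of 7B-A1B. The authors describe the model's design, data, and training strategy in detail. The paper received an ICLR 2025 oral.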