Mao Song(毛松)'s Homepage

Delving into the Latent Unknown.

LLM

Notes on Qwen-LLM

A summary of the Qwen technical report
July 3, 2025
2 min read
Qwen
LLM   Tutorial

Hands on LLM(2) Transformer

A walkthrough of the Transformer architecture and its core code, based on Qwen3
June 29, 2025
4 min read
cs336 transformer
LLM

Unified perspective on dLLM and LLM

A derivation of the equivalence between MLE and KL divergence
June 28, 2025
1 min read
diffusion
Machine Learning

Relationship between MLE and KL divergence

A derivation of the equivalence between MLE and KL divergence
June 27, 2025
2 min read
MLE KL divergence
MLLM   Reasoning

Notes on MiMo-VL

MiMo-VL is a multimodal reasoning large language model built on MiMo-7B
June 5, 2025
3 min read
xiaomi
© 2020 - 2026 Mao Song(毛松)'s Homepage