Overview of Hunyuan series

Author

Mao Song

Updated

May 29, 2026

Category

Hunyuan-Large

Hunyuan-TurboS

Hunyuan-A13B

Architecture: DeepSeek-MoE, GQA
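As a rough illustration of what GQA (grouped-query attention) means structurally, the sketch below maps each query head to the key/value head it shares. The head counts (32 query heads, 8 KV heads) are hypothetical values chosen for illustration and are not taken from these notes.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the KV head shared by query head `q_head` under GQA."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads  # query heads per shared KV head
    return q_head // group_size


# Hypothetical head counts, for illustration only.
N_Q_HEADS = 32
N_KV_HEADS = 8

mapping = [kv_head_for_query_head(q, N_Q_HEADS, N_KV_HEADS) for q in range(N_Q_HEADS)]

# KV-cache size relative to standard multi-head attention with the same head size.
kv_cache_ratio = N_KV_HEADS / N_Q_HEADS
```

With these assumed counts, every four consecutive query heads read the same keys and values, so the KV cache is a quarter of the multi-head size.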

pre-training: 20T tokens, 4096 context length
annealing: 300B tokens, 8192 context length
long context: NTK-aware, 32K -> 256K
post-training: reasoning SFT -> reasoning RL -> all SFT -> all RL
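The NTK-aware step in the long-context stage can be sketched as a rescaling of the RoPE base. The commonly used rule is base' = base * s^(d / (d - 2)), where s is the context scale factor and d is the head dimension. The base of 10000, the head dimension of 128, and reading "32K -> 256K" as a single 8x extension are all assumptions for illustration; the notes do not give the actual hyperparameters.

```python
def ntk_scaled_base(base: float, scale: float, head_dim: int) -> float:
    """NTK-aware RoPE base rescaling: base' = base * scale ** (d / (d - 2))."""
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Inverse frequencies used by RoPE, one per pair of dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Assumed values, not taken from the notes.
BASE = 10000.0
HEAD_DIM = 128
scale = 256 / 32  # reading "32K -> 256K" as an 8x extension

new_base = ntk_scaled_base(BASE, scale, HEAD_DIM)
old_freqs = rope_inv_freq(BASE, HEAD_DIM)
new_freqs = rope_inv_freq(new_base, HEAD_DIM)
```

Under this rule the highest frequency is left unchanged while the lowest frequency is divided by the scale factor, which is why the method is described as interpolating mostly in the low-frequency dimensions.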

RL: GRPO
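The core idea of GRPO is that each sampled response is scored relative to the other responses for the same prompt, rather than against a learned value function. A minimal sketch of that group-relative advantage follows; the use of the population standard deviation and the epsilon value are assumptions, and the clipped policy-gradient loss and KL term are omitted.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize the rewards of one prompt's sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from the same prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above the group mean get a positive advantage and those below get a negative one; if every response in the group receives the same reward, all advantages are zero and the group contributes no gradient signal.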

SFT: 150k samples, with Mathematics : Coding : Logic : Science in a 2:2:1:1 ratio
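To make the ratio concrete, the arithmetic below splits the sample count across the four domains. It treats "150k" as exactly 150,000, which is an assumption; the real per-domain counts may differ slightly.

```python
def split_by_ratio(total: int, ratios: dict[str, int]) -> dict[str, int]:
    """Split `total` samples across categories in proportion to integer ratios."""
    weight_sum = sum(ratios.values())
    return {name: total * weight // weight_sum for name, weight in ratios.items()}


# 2:2:1:1 ratio from the notes; 150_000 is an assumed exact total.
sft_counts = split_by_ratio(
    150_000,
    {"Mathematics": 2, "Coding": 2, "Logic": 1, "Science": 1},
)
```

Under that assumption, Mathematics and Coding each account for 50,000 samples, and Logic and Science for 25,000 each.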