Hunyuan-Large
Hunyuan-TurboS
Hunyuan-A13B
Architecture: DeepSeek-MoE, GQA
- pre-training: 20T tokens, 4096 context length
- annealing: 300B tokens, 8192 context length
- long context: NTK-aware, 32K -> 256K
- post-training: reasoning SFT -> reasoning RL -> all SFT -> all RL
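
The NTK-aware long-context step can be sketched as a rescaling of the RoPE base. This is a minimal illustration of the commonly used NTK-aware formula, not the model's confirmed recipe: the head dimension (128), the original base (10000) and the scale factor (taken here as 256K / 32K = 8) are all assumptions, since the notes do not state them.

```python
# Sketch of NTK-aware RoPE base scaling (generic formula, assumed values).

def ntk_scaled_base(base: float, scale: float, head_dim: int) -> float:
    """Return the enlarged RoPE base for a context extension by `scale`."""
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Inverse frequencies for each rotary pair, highest frequency first."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


SCALE = 256 / 32      # assumption: ratio of target to source context length
HEAD_DIM = 128        # assumption: not given in the notes
BASE = 10000.0        # assumption: the usual RoPE default

original = rope_inv_freq(BASE, HEAD_DIM)
extended = rope_inv_freq(ntk_scaled_base(BASE, SCALE, HEAD_DIM), HEAD_DIM)

# The highest frequency is left untouched; the lowest is divided by SCALE.
print(original[0], extended[0])
print(original[-1] / extended[-1])
```

The point of the exponent `head_dim / (head_dim - 2)` is that the lowest-frequency pair is stretched by exactly the scale factor while the highest-frequency pair is unchanged, so short-range positional resolution is preserved.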
RL: GRPO
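
For reference, the core of GRPO is that advantages are computed relative to a group of responses sampled for the same prompt, with no learned value model. The sketch below shows only that normalization step under generic assumptions; the function name, the `eps` value and the example rewards are illustrative and not taken from the Hunyuan setup.

```python
import statistics


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward by the mean and std of its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

If every response in a group receives the same reward, all advantages are zero and that prompt contributes no gradient signal.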
SFT: 150k samples; Mathematics : Coding : Logic : Science = 2:2:1:1
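
As a quick check of what the ratio implies, the per-domain counts can be worked out directly. This assumes the 150k figure is exact and that the 2:2:1:1 ratio covers the whole set, neither of which the notes confirm.

```python
def split_by_ratio(total: int, ratios: dict[str, int]) -> dict[str, int]:
    """Split `total` samples across domains in proportion to integer weights."""
    weight_sum = sum(ratios.values())
    return {name: total * w // weight_sum for name, w in ratios.items()}


counts = split_by_ratio(
    150_000,  # assumption: "150k" treated as exactly 150,000
    {"Mathematics": 2, "Coding": 2, "Logic": 1, "Science": 1},
)
print(counts)
# {'Mathematics': 50000, 'Coding': 50000, 'Logic': 25000, 'Science': 25000}
```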