DeepSeek-R1: the first open-source reasoning models

Author

Updated

May 27, 2026

Category

Introduction

DeepSeek-R1 (Guo et al., 2025) is an LLM released by DeepSeek in January 2025 that uses pure reinforcement learning to improve the model's reasoning ability.

Method

Implementation

Result

  1. Guo, D., Yang, D., Zhang, H., Song, J., Wang, P., Zhu, Q., Xu, R., Zhang, R., Ma, S., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., Liu, A., … Zhang, Z. (2025). DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature, 645(8081), 633–638. 10.1038/s41586-025-09422-z