Chinese AI lab DeepSeek has announced the release of its DeepSeek-R1-Lite-Preview model, which it claims rivals OpenAI's o1 model. A notable feature of the new model is transparency: it exposes its reasoning process, letting users follow its step-by-step problem solving. The announcement comes two months after OpenAI launched its o1-preview model, underscoring growing competition in the AI reasoning space.
DeepSeek-R1-Lite-Preview can be accessed via a web chatbot, DeepSeek Chat, where users can interact with the model, limited to 50 messages per day. While detailed benchmarks and a model card have yet to be released, early assessments indicate that the model's reasoning performance is comparable to OpenAI's on the AIME and MATH benchmarks. DeepSeek asserts that it achieves a state-of-the-art accuracy of 91.6% on the MATH benchmark.
The introduction of DeepSeek-R1 comes as traditional scaling laws in AI, which suggest that increasing data and computational power will improve performance, begin to show diminishing returns. In response, companies are seeking new approaches, such as those underlying reasoning models like DeepSeek-R1. Unlike traditional models, reasoning models extend their computational processing during inference to enhance decision-making capabilities.
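DeepSeek has not detailed R1's inference procedure, but one common way to convert extra inference-time compute into better answers is self-consistency: sample several independent reasoning traces and majority-vote their final answers. The sketch below illustrates only that general idea; `generate_answer` is a hypothetical stand-in for a model call, not DeepSeek's API.

```python
import random
from collections import Counter

def generate_answer(question: str) -> str:
    """Hypothetical stand-in for one sampled chain-of-thought completion.

    A real system would query a language model at nonzero temperature;
    this stub simulates a noisy solver so the sketch runs on its own.
    """
    return random.choice(["42", "42", "42", "41"])  # mostly-correct toy model

def self_consistency(question: str, n_samples: int = 16) -> str:
    """Spend more compute at inference time: sample several reasoning
    traces and return the majority-vote final answer."""
    answers = [generate_answer(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))
```

Drawing more samples generally raises accuracy at the cost of latency and compute, which is exactly the trade-off reasoning models lean into.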
Despite its promising features, the new model adheres to the strict censorship protocols common in Chinese AI systems. In testing, DeepSeek-R1 declined to engage with sensitive political topics, such as questions about Xi Jinping or Taiwan. However, some users have reported bypassing these restrictions and getting the model to produce unfiltered responses in certain scenarios. This raises ongoing questions about the balance between capability and regulatory compliance for AI models developed under stringent governmental oversight.
DeepSeek asserts that its DeepSeek-R1 model (specifically, the DeepSeek-R1-Lite-Preview) matches OpenAI's o1-preview model on two prominent AI benchmarks, AIME and MATH. AIME draws its problems from the American Invitational Mathematics Examination, a challenging competition math exam, while MATH tests problem-solving on a collection of competition word problems. The model still has shortcomings, however: some users on X pointed out that DeepSeek-R1, like o1, struggles with tic-tac-toe and other simple logic tasks.
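DeepSeek has not published its evaluation harness, but accuracy on MATH-style benchmarks is conventionally scored as the fraction of problems whose final answer exactly matches the reference. Below is a minimal scoring sketch with toy data; the `normalize` and `benchmark_accuracy` helpers are illustrative, not DeepSeek's code.

```python
def normalize(answer: str) -> str:
    """Light canonicalization so e.g. ' 42 ' and '42' compare equal."""
    return answer.strip().lower()

def benchmark_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of problems where the predicted final answer matches."""
    assert len(predictions) == len(references)
    correct = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return correct / len(references)

# Hypothetical results on three toy problems:
preds = ["3/4", "12", "7"]
refs = ["3/4", "12", "9"]
print(f"accuracy = {benchmark_accuracy(preds, refs):.1%}")  # 66.7%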
Looking ahead, DeepSeek plans to release open-source versions of its R1 models and extend access via APIs, continuing its commitment to the open-source AI community. The company is backed by High-Flyer Capital Management, which follows a strategy of integrating AI into trading decisions. High-Flyer’s operations include substantial investment in hardware infrastructure, boasting clusters of Nvidia A100 GPUs for model training.
Featured image credit: DeepSeek