
DeepSeek-R1 · GitHub Models · GitHub

DeepSeek-R1 excels at reasoning tasks, such as language, scientific reasoning, and coding, thanks to a detailed training procedure. It has 671B total parameters with 37B active parameters, and a 128k context length.

DeepSeek-R1 builds on the progress of earlier reasoning-focused models that improved performance by extending Chain-of-Thought (CoT) reasoning. DeepSeek-R1 takes things further by combining reinforcement learning (RL) with fine-tuning on carefully selected datasets. It evolved from an earlier version, DeepSeek-R1-Zero, which relied solely on RL and showed strong reasoning abilities but had issues such as hard-to-read outputs and language mixing. To address these limitations, DeepSeek-R1 incorporates a small amount of cold-start data and follows a refined training pipeline that blends reasoning-oriented RL with supervised fine-tuning on curated datasets, resulting in a model that achieves state-of-the-art performance on reasoning benchmarks.

Usage Recommendations

We recommend adhering to the following configurations when using the DeepSeek-R1 series models, including benchmarking, to achieve the expected performance:

– Avoid adding a system prompt; all instructions should be contained within the user prompt.
– For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
– When evaluating model performance, it is recommended to conduct multiple tests and average the results.
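The prompt-construction guidance above can be captured in a small helper. This is a minimal sketch, not an official API: the function name `build_messages` is hypothetical, and it only shows the shape of the chat payload (no system message, directive appended to the user turn).

```python
def build_messages(problem: str) -> list[dict]:
    """Build a chat payload following the DeepSeek-R1 recommendations:
    no system prompt, and the step-by-step directive placed inside the
    user message itself."""
    directive = "Please reason step by step, and put your final answer within \\boxed{}."
    # A single user turn carries both the problem and the directive.
    return [{"role": "user", "content": f"{problem}\n\n{directive}"}]

messages = build_messages("What is the sum of the first 100 positive integers?")
```

The resulting list can be passed as the `messages` argument to any OpenAI-compatible chat-completions client; note there is deliberately no `{"role": "system", ...}` entry.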

Additional recommendations

The model’s reasoning output (contained within the <think></think> tags) may include more harmful content than the model’s final response. Consider how your application will use or display the reasoning output; you may want to suppress the reasoning output in a production setting.
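One way to suppress the reasoning trace is to strip it from the completion before display. A minimal sketch, assuming the model emits its trace inside `<think>...</think>` tags as described above; `strip_reasoning` is a hypothetical helper name:

```python
import re

def strip_reasoning(text: str) -> str:
    """Remove the <think>...</think> reasoning block from a completion,
    leaving only the final response for display."""
    # DOTALL lets '.' match newlines, since the trace usually spans many lines;
    # the non-greedy '.*?' stops at the first closing tag.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>Let me add 2 and 2... that gives 4.</think>The answer is 4."
print(strip_reasoning(raw))  # → The answer is 4.
```

Applying this server-side keeps the trace available for logging or safety review while hiding it from end users.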