
Ohrling
Overview
- Founded Date: December 19, 2014
- Sectors: Health Professional
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through reinforcement learning (RL) on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
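As a rough illustration of what distillation through fine-tuning data can look like, the sketch below collects a teacher model's outputs as supervised targets for a smaller student. It is a minimal sketch under stated assumptions, not DeepSeek's actual pipeline: `SFTExample`, `build_distillation_set`, `toy_teacher`, and the `keep` filter are all hypothetical names introduced here, and the "Reasoning: ... Answer: ..." text format is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SFTExample:
    prompt: str
    target: str  # teacher's reasoning followed by its final answer


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
    keep: Callable[[str, str], bool] = lambda prompt, output: bool(output.strip()),
) -> List[SFTExample]:
    """Collect teacher outputs as supervised targets for a smaller student model."""
    examples = []
    for prompt in prompts:
        output = teacher(prompt)
        # Assumption: low-quality teacher outputs are filtered out before training.
        if keep(prompt, output):
            examples.append(SFTExample(prompt=prompt, target=output))
    return examples


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a large reasoning model; not a real API.
    if "2 + 2" in prompt:
        return "Reasoning: 2 + 2 means adding two and two. Answer: 4"
    return ""


dataset = build_distillation_set(["What is 2 + 2?", "Unanswerable"], toy_teacher)
```

The student would then be fine-tuned on each `prompt`/`target` pair with an ordinary supervised objective, so it learns to reproduce the teacher's reasoning rather than discovering it through RL.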
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
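The names in the list above follow a regular pattern: the base model family, then the parameter count in billions. The sketch below splits a name into those two parts, assuming the pattern holds for every entry. `DistilledModel` and `parse_distilled_name` are illustrative names introduced here, not part of any DeepSeek tooling.

```python
import re
from typing import NamedTuple


class DistilledModel(NamedTuple):
    base_family: str
    params_billions: float


# Assumed naming pattern: DeepSeek-R1-Distill-<family>-<size>B
_NAME = re.compile(
    r"^DeepSeek-R1-Distill-(?P<family>[A-Za-z]+)-(?P<size>\d+(?:\.\d+)?)\s*B$"
)


def parse_distilled_name(name: str) -> DistilledModel:
    """Split a distilled model name into its base family and size in billions."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized distilled model name: {name!r}")
    return DistilledModel(match["family"], float(match["size"]))
```

For example, `parse_distilled_name("DeepSeek-R1-Distill-Llama-70B")` yields the family `"Llama"` and a size of `70.0`, which is convenient when sorting or filtering the variants by size.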
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.