DeepSeek: Everything you need to know about the AI chatbot app


DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, too). DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the United States can keep its lead in the AI race and whether the demand for AI chips will continue.

But where did DeepSeek come from, and how did it rise to international fame so fast?

DeepSeek’s trading origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who reportedly began trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019 focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data centers for model training. But like other AI companies in China, DeepSeek has been hit by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to U.S. companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until last spring, when the startup released its next-generation DeepSeek-V2 family of models, that the AI industry began to take notice.

DeepSeek-V2, a general-purpose text and image analysis system, performed well in various AI benchmarks and was much cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competitors, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.

According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta’s Llama and “closed” models accessible only through an API, such as OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 “reasoning” model. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, DeepSeek claims.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that usually trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to reach solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and mathematics.

However, there is a downside to R1, DeepSeek V3, and DeepSeek’s other models. Being Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan’s autonomy.

A disruptive approach

If DeepSeek has a business model, it is not clear what that model is. The company prices its products and services far below market value and gives others away for free. It also isn’t taking money from investors, despite a ton of VC interest.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts, however, dispute the figures the company has provided.

Whatever the case, developers have taken to DeepSeek’s models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created more than 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and “over-hyped.” The company’s success was at least partly responsible for causing Nvidia’s share price to drop 18% in January, and for eliciting a public response from OpenAI CEO Sam Altman.

Microsoft announced that DeepSeek is available on its Azure AI Foundry service, the Microsoft platform that brings together AI services for businesses under a single banner. When asked about DeepSeek’s impact on Meta’s AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “subsidized” and “state-controlled,” and recommended that the U.S. government consider banning DeepSeek’s models.

During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are great for Nvidia because they need so much more compute.

At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state has also banned DeepSeek from being used on government devices.

As for what DeepSeek’s future might hold, it is not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. will likely ban DeepSeek on government devices.

This story was originally published on January 28, 2025, and will be updated regularly.


