DeepSeek AI: What you need to know about the ChatGPT rival
In a mere week, DeepSeek's R1 large language model has dethroned ChatGPT on the App Store, shaken up the stock market, and posed a serious threat to OpenAI and, by extension, U.S. dominance of the AI industry.
Last Monday, Chinese AI company DeepSeek released an open-source LLM called DeepSeek R1, which quickly became the buzziest AI chatbot since ChatGPT. It's purportedly as good as, if not better than, OpenAI's models, cheaper to use, and allegedly developed with far fewer chips than its competitors required. Here's what you need to know about DeepSeek R1 and why everyone is suddenly talking about it.
DeepSeek R1 claims to surpass OpenAI models in key benchmarks
With the release of DeepSeek R1, the company published a report on its capabilities, including performance on industry-standard benchmarks. DeepSeek claims its LLM beat OpenAI's reasoning model o1 on advanced math and coding tests (AIME 2024, MATH-500, SWE-bench Verified) and scored just below o1 on another programming benchmark (Codeforces), as well as on graduate-level science (GPQA Diamond) and general knowledge (MMLU).
Mashable's Stan Schroeder put DeepSeek R1 to the test by asking it to "code a fairly complex web app which needed to parse publicly available data, and create a dynamic website with travel and weather information for tourists," and came away impressed with its capabilities.
At this point, several LLMs perform comparably to OpenAI's models, including Anthropic's Claude, Meta's open-source Llama models, and Google Gemini. But DeepSeek R1's performance, combined with a few other factors, is what makes it such a strong contender.
Unlike OpenAI models, DeepSeek R1 is open source
Because DeepSeek R1 is open source, anyone can access it and tweak it for their own purposes. That openness also lets programmers look under the hood and see how it works. Open-source models are considered critical for scaling and democratizing AI, since developers can build on them instead of needing millions of dollars' worth of computing power to train their own.
Meta took this approach by releasing its Llama models as open source, in contrast to Google and OpenAI, which open-source advocates have criticized as gatekeepers. Google's flagship Gemini models are closed source, though the company does offer an open-source model family called Gemma.
It's cheap to use and was cheap to build
DeepSeek R1 has a free web app, accessible at chat.deepseek.com, and an API that costs significantly less than OpenAI's API access to its most advanced model: $0.14 per one million cached input tokens, compared to $7.50 for OpenAI's o1 model. That's an absolute steal that, unsurprisingly, has programmers flocking to it.
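To put those per-million-token rates in perspective, here's a quick back-of-the-envelope comparison in Python. The rates are the ones quoted above; the 50-million-token workload is an arbitrary example, not a figure from either company:

```python
# Published rates for one million cached input tokens (USD), as quoted above:
# DeepSeek R1 at $0.14, OpenAI o1 at $7.50.
DEEPSEEK_R1_PER_M = 0.14
OPENAI_O1_PER_M = 7.50

def input_cost(tokens: int, rate_per_million: float) -> float:
    """Cost in USD for a given number of cached input tokens."""
    return tokens / 1_000_000 * rate_per_million

# Hypothetical workload: 50 million cached input tokens.
tokens = 50_000_000
deepseek_cost = input_cost(tokens, DEEPSEEK_R1_PER_M)  # ~$7.00
openai_cost = input_cost(tokens, OPENAI_O1_PER_M)      # ~$375.00

print(f"DeepSeek R1: ${deepseek_cost:.2f}, o1: ${openai_cost:.2f}")
print(f"o1 costs {openai_cost / deepseek_cost:.1f}x more")
```

At these rates, o1 works out to roughly 54 times the price for the same volume of cached input tokens, which goes a long way toward explaining the developer interest.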
For AI industry insiders and tech investors, DeepSeek R1's most significant accomplishment is how little computing power was (allegedly) required to build it. According to DeepSeek engineers, via The New York Times, the R1 model required only 2,000 Nvidia chips. That's compared to a reported 10,000 Nvidia GPUs for OpenAI's models as of 2023, a figure that has almost certainly grown since.
That's quite a bold claim, but if true, it calls into question how much investment is needed to develop data centers like the $500 billion Stargate project currently underway. The stock market certainly noticed DeepSeek R1's alleged cost efficiency, with Nvidia taking a 13 percent dip in stock price on Monday.
DeepSeek R1 is the new king on Apple's App Store
Clearly, users have noticed DeepSeek R1's prowess. By Monday, the new kid on the block had climbed to number one among free apps on Apple's App Store, displacing ChatGPT.
Who knows if DeepSeek R1's momentum will continue, but it has definitely reignited the AI race and taken the competition to global heights.