This DeepSeek demo shows how good the Chinese AI model is at math and reasoning
- DeepSeek's AI models rival top Silicon Valley offerings, excelling in some complex tasks.
- The models use inference-time compute, breaking queries into smaller, manageable tasks.
- DeepSeek's DeepThink mode shows its thought process, solving challenging math problems.
Chinese AI lab DeepSeek recently released AI models that match or exceed some of Silicon Valley's top offerings.
DeepSeek uses an approach called test-time or inference-time compute, which slices a query into smaller tasks, turning each into a new prompt for the model to tackle. Each of those steps requires running a new request, a phase known in AI as inference.
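The chained structure described above can be sketched in a few lines of Python. This is purely illustrative, not DeepSeek's actual interface: `call_model` is a hypothetical stand-in for any chat-completion API, stubbed here so the sketch runs on its own.

```python
def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a chat-completion API call.
    # It just echoes the prompt so the sketch is self-contained.
    return f"answer to: {prompt}"

def solve_stepwise(question: str, steps: list[str]) -> list[str]:
    """Answer a question by running one inference request per
    sub-task, feeding each result back into the next prompt --
    the chained pattern test-time compute relies on."""
    context = question
    transcript = []
    for step in steps:
        prompt = f"{context}\nNext step: {step}"
        result = call_model(prompt)  # a fresh inference request per step
        transcript.append(result)
        context = f"{context}\n{step}: {result}"
    return transcript
```

The point of the pattern is that each sub-task gets its own forward pass, so total compute scales with the number of steps rather than being fixed at one response.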
Business Insider recently tested one of DeepSeek's models using its DeepThink mode, which shows users every step of its thought process.
We gave the model a multistep math problem that was suggested by Charlie Snell, an AI researcher at UC Berkeley.
Snell said DeepSeek performs well on complex math problems that must be solved by thinking for longer and in stages. The researcher picked problems from the American Invitational Mathematics Examination, a challenging test for high school math whizzes.
"I put questions from that into the DeepSeek model," Snell told BI in an interview. "I read through the chain of thought. It was understandable."
A DeepSeek demo
Here's the math problem Snell suggested for BI's DeepSeek demo: "Find a sequence of +, -, /, * which can be applied to the numbers 7, 3, 11, 5 to get to 24, where each of the given numbers is used exactly once."
BI put that prompt into DeepSeek's chat window on its website. The model responded initially by laying out the challenge ahead.
"Alright, so I've got this problem here: I need to use the numbers 7, 3, 11, and 5, and combine them with the operations of addition, subtraction, multiplication, and division, using each number exactly once, to get to 24," it replied. "At first glance, this seems a bit tricky, but I think with some systematic thinking, I can figure it out."
It then proceeded through multiple steps over roughly 16 pages of discussion that included mathematical calculations and equations. The model sometimes got it wrong, but it spotted this and didn't give up. Instead, it swiftly moved on to try another possible solution, then another.
"Almost got close there with 33 / 7 * 5 ≈ 23.57, but not quite 24. Maybe I need to try a different approach," it wrote at one point.
Later on, the DeepSeek model seemed to catch itself repeating a potential solution.
"Wait, I already did that one," the model wrote. "Okay, maybe I need to consider using division in a different way."
After a few minutes, it found the correct answer.
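For scale, the puzzle itself is tiny by classical standards: an exhaustive search over number orderings, operator choices, and parenthesizations checks every candidate in milliseconds. A brute-force Python sketch of that baseline (a way to verify the puzzle, not a depiction of how DeepSeek reasons):

```python
from itertools import permutations, product

def solve_24(numbers, target=24, eps=1e-9):
    """Return an expression string that evaluates to `target`
    using each of the four numbers exactly once, or None."""
    ops = "+-*/"
    for a, b, c, d in permutations(numbers):
        for p, q, r in product(ops, repeat=3):
            # The five ways to parenthesize four operands.
            exprs = [
                f"(({a}{p}{b}){q}{c}){r}{d}",
                f"({a}{p}({b}{q}{c})){r}{d}",
                f"({a}{p}{b}){q}({c}{r}{d})",
                f"{a}{p}(({b}{q}{c}){r}{d})",
                f"{a}{p}({b}{q}({c}{r}{d}))",
            ]
            for expr in exprs:
                try:
                    # eval is safe here: the strings are built
                    # internally from digits and operators only.
                    if abs(eval(expr) - target) < eps:
                        return expr
                except ZeroDivisionError:
                    continue
    return None

print(solve_24([7, 3, 11, 5]))
```

One valid answer such a search can surface is (7 - 3) * (11 - 5) = 24.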
"You can see it try different ideas and backtrack," Snell said. He highlighted this part of DeepSeek's chain of thought as particularly noteworthy:
"This is getting really time-consuming. Maybe I need to consider a different strategy," the AI model wrote. "Instead of combining two numbers at a time, perhaps I should look for a way to group them differently or use operations in a nested manner."