Chinese AI Lab DeepSeek Grinds Ahead…Allegedly

December 31, 2024

Is the world’s most innovative AI company a low-profile Chinese startup? ChinaTalk examines “Deepseek: The Quiet Giant Leading China’s AI Race.” The Chinese-tech news site shares an annotated translation of a rare interview with DeepSeek CEO Liang Wenfeng. The journalists note the firm’s latest R1 model just outperformed OpenAI’s o1. In their introduction to the July interview, they write:

“Before Deepseek, CEO Liang Wenfeng’s main venture was High-Flyer, a top 4 Chinese quantitative hedge fund last valued at $8 billion. Deepseek is fully funded by High-Flyer and has no plans to fundraise. It focuses on building foundational technology rather than commercial applications and has committed to open sourcing all of its models. It has also singlehandedly kicked off price wars in China by charging very affordable API rates. Despite this, Deepseek can afford to stay in the scaling game: with access to High-Flyer’s compute clusters, Dylan Patel’s best guess is they have upwards of ‘50k Hopper GPUs,’ orders of magnitude more compute power than the 10k A100s they cop to publicly. Deepseek’s strategy is grounded in their ambition to build AGI. Unlike previous spins on the theme, Deepseek’s mission statement does not mention safety, competition, or stakes for humanity, but only ‘unraveling the mystery of AGI with curiosity’. Accordingly, the lab has been laser-focused on research into potentially game-changing architectural and algorithmic innovations.”

For example, we learn:

“They proposed a novel MLA (multi-head latent attention) architecture that reduces memory usage to 5-13% of the commonly used MHA architecture. Additionally, their original DeepSeekMoESparse structure minimized computational costs, ultimately leading to reduced overall costs.”
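To put the memory figure in the quote above in perspective, here is a rough back-of-the-envelope sketch. It assumes standard multi-head attention caches one key and one value vector per head for every token, while a latent-attention scheme caches a single compressed vector (plus an optional small extra component) per token. The function names and the dimensions in the example are illustrative assumptions, not DeepSeek's published configuration.

```python
def mha_cache_elems(n_heads: int, head_dim: int) -> int:
    """Per-token, per-layer cache entries for standard multi-head attention:
    one key vector and one value vector for each head."""
    return 2 * n_heads * head_dim


def latent_cache_elems(latent_dim: int, extra_dim: int = 0) -> int:
    """Per-token, per-layer cache entries when only a compressed latent
    vector (plus an optional extra component) is stored.
    Assumption: this is a simplified model of a latent-attention cache."""
    return latent_dim + extra_dim


def cache_ratio(n_heads: int, head_dim: int,
                latent_dim: int, extra_dim: int = 0) -> float:
    """Latent cache size as a fraction of the standard MHA cache size."""
    return latent_cache_elems(latent_dim, extra_dim) / mha_cache_elems(n_heads, head_dim)


if __name__ == "__main__":
    # Hypothetical dimensions chosen only for illustration.
    ratio = cache_ratio(n_heads=32, head_dim=128, latent_dim=512, extra_dim=64)
    print(f"Latent cache is {ratio:.1%} of the MHA cache")
```

With these made-up dimensions the compressed cache comes to roughly 7% of the standard one, which sits inside the 5-13% range cited in the interview. Different head counts or latent sizes move the ratio accordingly.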

Those in Silicon Valley are well aware of this “mysterious force from the East,” with several AI head honchos heaping praise on the firm. The interview is split into five parts. The first examines the large-model price war set off by DeepSeek’s V2 release. Next, Liang describes how an emphasis on innovation over imitation sets his firm apart but, in part three, notes that more money does not always lead to more innovation. Part four takes a look at the talent behind DeepSeek’s work, and in part five the CEO looks to the future. Interested readers should check out the full interview. Headquartered in Hangzhou, China, the young firm was founded in 2023.

Cynthia Murrell, December 31, 2024
