China Smart, US Dumb: Some AI Readings in English
January 28, 2025
A blog post from an authentic dinobaby. He’s old; he’s in the sticks; and he is deeply skeptical.
I read a short post in YCombinator’s Hacker News this morning (January 23, 2025). The original article is titled “Deepseek and the Effects of GPU Export Controls.” If you are interested in the poli sci approach to smart software, dive in. However, in the couple of dozen comments on Hacker News to the post, a contributor allegedly named LHL posted some useful links. I have pulled these from the comments and displayed them for your competitive intelligence large language model. On the other hand, you can read them because you are interested in what’s shaking in the Lin-gang Free Trade Zone in the Middle Kingdom:
Deepseek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Deepseek Coder V2: Breaking the Barrier of Closed Source Models in Code Intelligence
Deepseek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Deepseek LLM: Scaling Open-Source Language Models with Longtermism
First, a thanks to the poster LHL. The search string links timed out, so you may already be part of the HN herd who is looking at the generated bibliography.
Second, several observations:
- China has lots of people. There are numerous highly skilled mathematicians, Monte Carlo and gradient descent wonks, and darned good engineers. One should not assume that wizardry ends with big valuations and tie-ups among Oracle, OpenAI and the savvy funder of Banjo, an intelware outfit of some repute.
- Computing resource constraints translate into one outcome. Example: Howard Flank, one of my team members, received the Information Industry Association Award decades ago for cramming a searchable index of the Library of Congress’ holdings. Remember those wonderful machines in the early 1980s? Yeah, Howard did wonders with limited resources. The Chinese professionals can too and have. (Note to US government committee members: Keep Howard and similar engineering whiz kids in mind when thinking about how curtailing computer resources will stop innovation.)
- Deepseek’s methods are likely to find their way into some US wrapper products presented as groundbreaking AI. Nope. These innovations are enabled by an open source technology. Now what happens if an outfit like Telegram or one of the many cyber gangs which Microsoft’s Brad Smith references picks up the same methods? Yeah. Innovation of a type that is not salubrious.
- The authors of the papers are important. Should these folks be cross correlated with other information about grants, academic affiliations with US institutions, and conference attendance?
In case anyone is curious, from my dinobaby point of view, the most important paper in the bunch is the one about a “mixture of experts.”
Stephen E Arnold, January 28, 2025