Chinese startup DeepSeek has disrupted the AI industry with its low-cost, open-source model, prompting industry giants to reassess their approaches. Sega Cheng, an AI expert and co-founder of iKala, emphasized the importance of reinforcement learning (RL) while noting ongoing debate over DeepSeek's actual GPU usage. He expects US tech giants, particularly Meta Platforms, to reconsider their open-source strategies.
In an interview, Cheng pointed to a social media hint from OpenAI researcher Jason Wei that the team had discovered an "unstoppable" RL optimization method, a development that may explain why DeepSeek's advancements surfaced earlier than expected.
"The key question for these tech firms is how far they should go with open-source without undermining their core business," Cheng said, highlighting the balance between staying competitive and protecting proprietary assets. He noted that while the disruption is significant, it hasn't fundamentally threatened major firms' market positions, which are protected by their scale advantages.
Regarding the controversy over DeepSeek's GPU usage, Cheng called for closer observation of its performance on Hugging Face. Meanwhile, Meta's Chief AI Scientist Yann LeCun dismissed claims that DeepSeek's rise signals China's AI overtaking the US, instead framing it as proof that open-source models are outpacing closed systems. Industry sources reveal Meta has set up a "war room" to analyze DeepSeek's breakthroughs.
Cheng sees Meta's Llama as the strongest contender to reclaim dominance in open-source AI, given Meta's relatively lighter burden in navigating the open-source dilemma. According to Cheng, the open-source landscape will evolve into a showdown between DeepSeek and Meta. DeepSeek's boldest move, he said, was open-sourcing everything, leveling the playing field. While US firms may adopt DeepSeek's technology, he added, "This time, it would be a rather humbling decision for them."
In a separate analysis, Seasalt.ai co-founder and CTO Guoguo Chen described DeepSeek as a testament to engineering prowess, having leveraged every available optimization technique. Chen explained that the company's R1 model was likely trained from scratch, with smaller variants then distilled from it through supervised fine-tuning of Llama or Qwen base models, though the compact versions function independently.
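For readers unfamiliar with the distillation Chen describes, it is typically implemented as supervised fine-tuning: a smaller base model is trained on outputs generated by the larger model. The sketch below illustrates that general pattern in Python with Hugging Face's transformers library; the student model name and the example data are illustrative assumptions, not DeepSeek's actual training recipe.

```python
# Minimal sketch of distillation via supervised fine-tuning (SFT):
# a smaller base model is trained on reasoning traces generated by a
# larger "teacher" model. The checkpoint name and data below are
# illustrative assumptions, not DeepSeek's setup.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

student_name = "Qwen/Qwen2.5-1.5B"  # hypothetical student checkpoint
tokenizer = AutoTokenizer.from_pretrained(student_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(student_name)

# Teacher-generated (prompt, answer) pairs; in practice these would be
# sampled from the large model rather than hard-coded.
traces = [
    {"prompt": "What is 17 * 24?",
     "response": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408."},
]

def collate(batch):
    texts = [ex["prompt"] + "\n" + ex["response"] + tokenizer.eos_token
             for ex in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    enc["labels"] = enc["input_ids"].clone()  # standard causal-LM SFT loss
    return enc

loader = DataLoader(traces, batch_size=1, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for batch in loader:
    out = model(**batch)  # cross-entropy over next-token predictions
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The point of the sketch is the design choice Chen alludes to: the distilled model is an ordinary standalone checkpoint after training, which is why the compact variants can run independently of R1.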