🎙️
AIPodify

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490

Episode Summary

AI-generated · Mar 2026

AI-generated summary — may contain inaccuracies. Not a substitute for the full episode or professional advice.

Lex Fridman hosts Sebastian Raschka and Nathan Lambert, respected machine learning researchers and engineers, to dissect the cutting edge of artificial intelligence in 2026. The conversation covers the year's rapid technical advances, the shifting competitive landscape between US and Chinese AI, the nuanced application of scaling laws, and the complex ethical and practical implications for users and developers.

👤 Who Should Listen

  • AI researchers and engineers interested in the latest LLM architectural advancements, scaling laws, and training paradigms.
  • Software developers and programmers looking to understand how LLMs are transforming coding workflows and debugging.
  • Leaders and strategists in the tech industry tracking the competitive dynamics between US and Chinese AI companies and open-weight models.
  • Authors, content creators, and legal professionals navigating the evolving landscape of AI-generated content, copyright, and data licensing.
  • Anyone concerned with the broader societal and ethical implications of AI, including its impact on human learning, mental well-being, and agency.
  • Current users of LLMs (e.g., ChatGPT, Gemini, Claude) seeking a deeper understanding of the technology underpinning their favorite tools and future developments.

🔑 Key Takeaways

  1. The 'DeepSeek moment' in January 2025, when the Chinese company DeepSeek released near-state-of-the-art open-weight models with allegedly less compute, ignited a furious global AI competition [02:05].
  2. While US models like Claude Opus 4.5 and ChatGPT currently offer superior output quality for paying users, a growing number of Chinese companies like Z.ai, Minimax, and Kimi Moonshot are releasing increasingly strong open-weight models with highly permissive licenses [05:12, 20:33, 35:10].
  3. Fundamental LLM architectures have remained largely unchanged since GPT-2, with advancements primarily driven by architectural tweaks (e.g., Mixture of Experts, Multi-head Latent Attention, Group Query Attention) and algorithmic progress in post-training techniques like Reinforcement Learning with Verifiable Rewards (RLVR) [37:14, 43:22, 49:30].
  4. Scaling laws continue to hold across pre-training, reinforcement learning, and inference time, with significant recent gains from inference-time scaling (allowing models to 'think' for extended periods) and RLVR, which enables tool use and better software engineering [49:30].
  5. The quality and curation of training data are paramount; tools like olmOCR for extracting text from scientific PDFs and high-quality synthetic data (e.g., rephrased content, the best ChatGPT answers) are crucial for model performance [64:56, 69:04].
  6. Over-reliance on LLMs for core tasks like coding could diminish human fulfillment and hinder the deep learning that comes from struggling with problems, despite surveys indicating increased enjoyment for many developers [89:40, 95:45].
  7. Ethical and legal challenges surrounding data licensing, copyright (highlighted by Anthropic's $1.5 billion settlement with authors), and the management of LLMs in sensitive domains like mental health are critical, creating tension between utility and safety [76:19, 84:33].
  8. Senior developers are more likely than junior developers to ship over 50% AI-generated code, suggesting that expertise lies not just in writing code, but in effectively leveraging and verifying LLM outputs [91:41].
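
The scaling laws referenced in takeaway 4 are typically expressed as power-law fits of loss against model size and data. A toy sketch using the published Chinchilla constants (real labs refit these privately for their own data and architectures):

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla-style scaling law: predicted pre-training loss from
    parameter count N and training tokens D:
        L(N, D) = E + A / N**alpha + B / D**beta
    Constants are the published Chinchilla fit; treat them as illustrative."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Bigger model trained on more tokens -> lower predicted loss,
# with diminishing returns from each factor.
small = chinchilla_loss(1e9, 20e9)     # ~1B params, ~20B tokens
large = chinchilla_loss(70e9, 1.4e12)  # ~70B params, ~1.4T tokens
print(small > large)  # -> True
```

The irreducible term E is why "scaling laws hold" does not mean loss goes to zero; it means the fitted curve keeps predicting the returns from more compute.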

💡 Key Concepts Explained

DeepSeek Moment

A significant event in January 2025 when the Chinese company DeepSeek released its open-weight DeepSeek R1 model, surprising the AI community with near-state-of-the-art performance using allegedly much less compute. This moment accelerated global AI competition in both research and product development, particularly around open-weight models [02:05].

Mixture of Experts (MoE)

An LLM architectural tweak where a 'router' dynamically selects a small subset of specialized 'expert' feedforward networks to process each input token. This allows models to be much larger and more knowledgeable without a proportional increase in compute cost during inference, making very large models more economical to serve [41:18, 37:14].
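
A minimal numerical sketch of the routing idea described above (all shapes, names, and the tiny random "experts" are illustrative, not from any production model):

```python
import numpy as np

def moe_layer(x, router_w, experts, top_k=2):
    """Toy Mixture-of-Experts layer: route each token to its top-k experts.

    x: (n_tokens, d_model); router_w: (d_model, n_experts);
    experts: list of callables, each mapping (d_model,) -> (d_model,).
    """
    logits = x @ router_w                      # (n_tokens, n_experts)
    out = np.zeros_like(x)
    for i, tok in enumerate(x):
        top = np.argsort(logits[i])[-top_k:]   # indices of the top-k experts
        # softmax over just the selected experts' logits
        w = np.exp(logits[i][top] - logits[i][top].max())
        w /= w.sum()
        # only top_k of the experts run per token -> active compute stays small
        out[i] = sum(wj * experts[j](tok) for wj, j in zip(w, top))
    return out

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Each "expert" is just a random linear map here, standing in for an FFN.
experts = [lambda t, W=rng.normal(size=(d, d)) / d: t @ W for _ in range(n_experts)]
x = rng.normal(size=(3, d))
y = moe_layer(x, rng.normal(size=(d, n_experts)), experts)
print(y.shape)  # -> (3, 8)
```

The key property is visible in the loop: total parameters scale with `n_experts`, but per-token compute scales only with `top_k`.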

Reinforcement Learning with Verifiable Rewards (RLVR)

A post-training technique where LLMs learn by iteratively generating actions (e.g., using tools, executing code, performing web searches) and receiving reward signals based on verifiable outcomes. This method significantly unlocks complex capabilities like tool use and improved reasoning, dramatically changing how models acquire skills [49:30, 97:47].
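
A toy sketch of the "verifiable" part of RLVR (the arithmetic task and the hard-coded candidate samples are invented for illustration; a real system scores LLM rollouts and feeds the rewards into a policy-gradient update):

```python
def verifiable_reward(candidate, answer):
    """The reward is checked, not learned: execute the candidate and compare."""
    try:
        return 1.0 if eval(candidate) == answer else 0.0
    except Exception:
        return 0.0  # malformed outputs simply score zero

def rlvr_scores(policy_samples, answer):
    """Score sampled 'solutions' with verifiable rewards. In real RLVR these
    rewards drive a policy-gradient update (e.g., GRPO/PPO); here we only
    surface the reward signal itself."""
    return [(c, verifiable_reward(c, answer)) for c in policy_samples]

# Stand-ins for LLM rollouts on the prompt "compute 3*4+2" (answer: 14).
samples = ["3*4+2", "3*(4+2)", "3+4*2"]
print(rlvr_scores(samples, 14))
# -> [('3*4+2', 1.0), ('3*(4+2)', 0.0), ('3+4*2', 0.0)]
```

The same pattern generalizes to the tool-use cases the episode mentions: running code against unit tests or checking a web-search result are likewise programmatically verifiable outcomes.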

Inference Time Scaling

A method to enhance LLM intelligence by allowing the model to perform extended internal 'thinking' or generation of intermediate thoughts over seconds, minutes, or even hours before producing its final output. This capability, exemplified by OpenAI's o1 thinking models, significantly improves problem-solving and enables more sophisticated use cases [49:30].
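
One simple form of inference-time scaling is sampling many answers and taking a majority vote (self-consistency). A minimal sketch with hard-coded stand-in rollouts (a real system would sample these from the LLM at temperature > 0):

```python
from collections import Counter

def majority_vote(sampled_answers):
    """Aggregate repeated rollouts; more samples = more inference-time compute."""
    return Counter(sampled_answers).most_common(1)[0][0]

one_rollout = ["11"]                            # cheap: keep whatever comes out
five_rollouts = ["14", "11", "14", "18", "14"]  # scaled: vote across samples
print(majority_vote(one_rollout), majority_vote(five_rollouts))  # -> 11 14
```

'Thinking' models spend the extra compute differently, generating long intermediate reasoning inside a single rollout, but the economics are the same: more tokens at inference time buy a better final answer.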

Pre-training, Mid-training, and Post-training

These are distinct stages in LLM development. **Pre-training** involves initial next-token prediction on massive, diverse datasets. **Mid-training** is a more specialized phase focusing on high-quality or specific data (e.g., long-context documents). **Post-training** involves refinement techniques like supervised fine-tuning, DPO, and RLHF/RLVR to align models with human preferences and unlock specific skills [63:56, 65:58, 67:44].
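
The pre-training (and mid-training) objective behind all of this is plain next-token cross-entropy. A minimal NumPy sketch with random stand-in logits (shapes and values are illustrative):

```python
import numpy as np

def next_token_loss(logits, targets):
    """Cross-entropy for next-token prediction, the pre-training objective.
    logits: (seq_len, vocab); targets: (seq_len,) token ids shifted by one."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
vocab, seq = 50, 6
logits = rng.normal(size=(seq, vocab))          # stand-in model outputs
targets = rng.integers(0, vocab, size=seq)      # stand-in next tokens
loss = next_token_loss(logits, targets)
print(float(loss) > 0)  # -> True
```

Mid-training reuses this same loss on curated or long-context data; post-training (SFT, DPO, RLHF/RLVR) is where the objective changes to preference- and reward-based signals.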

⚡ Actionable Takeaways

  • Explore diverse LLM models like Claude Opus 4.5 for coding, Gemini for quick factual queries, or Grok 4 Heavy for debugging to find the best fit for specific tasks [16:29, 17:31].
  • Utilize LLMs to automate mundane, time-consuming tasks (e.g., fixing broken links, website tweaks) to free up mental energy for more complex or enjoyable work [92:42].
  • Develop agency by actively building with AI, such as creating apps or tools, to gain practical intuition about its capabilities and limitations, rather than passively consuming AI outputs [88:38].
  • When learning new concepts, consider a 'two-pass' approach: first, dedicate focused offline time for deep understanding, then use an LLM for clarification or additional context in a second pass [25:48].
  • If you are an open-source project maintainer, anticipate and develop strategies for handling an influx of LLM-generated pull requests, which may require human verification and curation [78:23, 79:24].
  • For those in specialized industries (e.g., pharma, law, finance), plan for in-house LLM development using proprietary data, as this could unlock domain-specific capabilities beyond general-purpose models [75:19].
  • Cultivate a Goldilocks zone for learning and problem-solving, allowing for productive struggle to build expertise, but using LLMs to avoid excessive frustration and accelerate progress on non-core tasks [94:45].

⏱ Timeline Breakdown

00:00 · Introduction to the episode's focus: state of AI, technical breakthroughs, and predictions for 2026.
01:01 · Introduction of guests Sebastian Raschka and Nathan Lambert, their expertise, and notable works.
02:05 · Discussion begins with the 'DeepSeek moment' of January 2025 and its impact on AI competition.
03:05 · Sebastian argues that budget and hardware, not proprietary ideas, will be the differentiating factor in AI, preventing a winner-take-all scenario.
04:08 · Nathan contrasts the hype around Anthropic's Claude Opus 4.5 with Google's Gemini 3, noting Anthropic's cultural focus on code.
05:12 · Nathan highlights the rise of other strong Chinese open-weight model makers like Z.ai, Minimax, and Kimi Moonshot.
06:15 · Guests discuss the incentives for Chinese companies to release open-weight models, including international influence and circumventing US API payment barriers.
09:22 · Sebastian discusses LLM customization and the potential need for separate subscriptions for personal vs. work use.
10:22 · Nathan predicts Gemini will continue to progress against ChatGPT due to Google's scale, and Anthropic will succeed in enterprise AI for 2026.
12:23 · Nathan highlights Google's infrastructure advantage with TPUs, avoiding NVIDIA margins, and their data center lead.
13:25 · Discussion on the trade-off between LLM intelligence and speed for the general public.
14:25 · Sebastian shares his preference for fast models for quick tasks and 'pro mode' for thorough checks, emphasizing the need for flexible options.
16:29 · Nathan details his multi-LLM workflow, using Gemini for fast tasks, Claude Opus 4.5 for code/philosophy, and Grok for real-time information.
18:32 · Lex notes how a single positive interaction can win a user over to a model, with Sebastian adding that users switch when an LLM 'breaks'.
19:32 · Guests discuss why they don't use Chinese models for daily tasks, attributing it to platform availability and the currently superior output quality of US models.
21:34 · Discussion on programming with LLMs: Sebastian uses Codeium, preferring assistance over full automation, while Lex uses Claude Code for 'programming with English'.
23:43 · Sebastian recommends his 'Build a Large Language Model from Scratch' and 'Build a Reasoning Model from Scratch' books for deep learning.
25:48 · Lex and Sebastian debate the optimal strategy for using LLMs while reading books, balancing immediate context vs. avoiding distraction.
27:51 · Transition to discussing the landscape of open-weight LLM models.
28:57 · Guests name numerous prominent open-weight models from both Chinese and Western developers, including DeepSeek, Kimi, Mistral AI, Gemma, gpt-oss, and OLMo.
31:01 · Nathan notes that Chinese open models are often larger MoEs, while Western ones are growing in size, with 400 billion parameter MoE models teased for 2026.
32:02 · Sebastian highlights DeepSeek V3/V3.2 for architectural tweaks and gpt-oss-120b for its focus on tool use to mitigate hallucinations.
34:09 · Discussion on the explosion of open models: Chinese companies seek international influence, and US companies like OpenAI leverage external GPUs for distribution.
35:10 · Sebastian explains the appeal of Chinese open-weight models' unrestricted licenses compared to Llama or Gemma's terms of use.
37:14 · Sebastian details specific architectural ideas in open models, including Mixture of Experts, various attention mechanisms (Multi-head Latent, Group Query, Sliding Window), and gated delta nets.
40:17 · Sebastian provides a primer on transformer architecture, explaining its evolution from the GPT-2 decoder-only model.
41:18 · Sebastian details the Mixture of Experts (MoE) layer, explaining how it enables larger models without a proportional compute increase.
43:22 · Sebastian asserts that LLM architectures are fundamentally similar to GPT-2, with advancements coming from tweaks rather than wholesale changes.
44:24 · Discussion shifts to where rapid advancements are occurring: pre-training, mid-training, and post-training, with current emphasis on post-training.
45:24 · Nathan explains that system-level optimizations (FP8, FP4) drive faster experimentation, allowing more compute and data for training.
46:27 · Sebastian notes that while text diffusion models and Mamba offer alternatives for cheaper applications, autoregressive transformers remain state-of-the-art.
47:57 · Lex asks about the persistence of scaling laws, with Nathan defining them and confirming their continued relevance.
49:30 · Nathan explains how inference-time scaling and Reinforcement Learning with Verifiable Rewards (RLVR) have profoundly changed LLM capabilities in 2025.
51:35 · Discussion on the high cost of pre-training (DeepSeek at $5M, OLMo 3 at $2M for cluster rental) and the trade-off with serving costs.
53:37 · Nathan remains bullish on pre-training scaling, citing the historical trend and upcoming gigawatt-scale Blackwell compute clusters in 2026.
55:43 · Lex asks about xAI's reported gigawatt-scale compute and its utilization for pre-training, RL, and architectural decisions.
56:43 · Nathan clarifies that while excitement shifts to RL, pre-training remains crucial for building the best base models.
57:43 · Nathan describes the systems challenges of training on massive GPU clusters (10,000-100,000 GPUs), including handling hardware failures.
59:50 · Nathan provides a primer on language model reinforcement learning, explaining the actor-learner paradigm and policy gradient algorithms.
60:50 · Sebastian summarizes that all forms of scaling are desirable, but labs must find the optimal 'bang for the buck' ratio due to finite compute.
62:56 · Sebastian discusses the trade-off between fixed pre-training costs (long-term capability) and per-query inference scaling costs (short-term gains).
63:56 · Sebastian defines pre-training, mid-training (specialized data, long context), and post-training (fine-tuning, RLVR) as distinct phases of LLM development.
68:01 · Nathan adds that synthetic data, like text extracted from PDFs with olmOCR, is crucial for pre-training datasets measured in trillions of tokens.
69:04 · Sebastian highlights OLMo 3's improved performance with less data due to higher data quality.
70:06 · Nathan discusses the continuous effort to improve data quality through scientific methods, including creating optimal dataset mixes for specific tasks like math and code.
72:11 · Nathan identifies scientific PDFs from Semantic Scholar as a high-quality data source, noting that data finding and cleaning is a key contribution at frontier labs.
73:12 · Sebastian notes that training data is often a closely guarded secret due to legal reasons, with models trained not to reveal sources.
74:16 · Nathan discusses the trend of training on licensed data versus unlicensed Common Crawl, citing Apertus's EU compliance efforts.
75:19 · Sebastian explains the gray area of using purchased copyrighted content for training and predicts that proprietary data will become a moat for specialized industry models.
76:19 · Nathan mentions Anthropic's $1.5 billion settlement with authors, triggered by torrented books used for training.
77:22 · Lex discusses the societal challenge of defining compensation models for creators whose data trains LLMs, drawing parallels to music streaming.
78:23 · Discussion begins on the problem of LLM-generated data flooding the internet, especially GitHub, and the necessity of human curation.
79:24 · Sebastian shares an anecdote about his MLxtend library receiving numerous LLM-generated PRs, highlighting their overwhelming volume but also the value of human verification.
80:25 · Guests discuss the value of human experts in filtering and correctly using LLMs to provide valuable insights and executive summaries.
81:27 · Lex expresses disappointment with LLMs' inability to consistently capture core insights or 'the edge' in summaries, despite elaborate prompts.
82:29 · Nathan defines 'voice' in writing as encapsulating raw, high-information ideas, which RLHF's averaging process tends to filter out of LLMs.
83:30 · Nathan speculates whether models like Bing Sydney, known for going 'off the rails,' might have had more voice, highlighting RLHF trade-offs for safety.
84:33 · Lex describes the difficult position of frontier labs balancing advanced, potentially 'edgy' AI with preventing harm (e.g., suicide discussions), which can lead to generic models.
86:35 · Nathan acknowledges researchers' motivation to do good but stresses the complexity of AI's role in mental health, where it can both help and potentially harm.
87:37 · Lex emphasizes the need for society to engage in nuanced discussions about AI, avoiding fear-mongering and recognizing developers' intentions to help.
88:38 · Lex and Nathan advocate for users to find 'agency' by building with AI, understanding its mechanics, and contributing to the technological conversation.
89:40 · Sebastian expresses concern that over-reliance on LLMs for core tasks like coding could lead to burnout and diminish the joy of creation.
90:40 · Lex references a survey showing that senior developers are more likely to ship over 50% AI-generated code and generally find it more enjoyable.
91:41 · Sebastian nuances the 'enjoyment' findings, distinguishing between AI helping with mundane tasks versus solving complex, gratifying problems.
93:45 · Lex shares his view of AI as a 'pair programmer' that reduces the loneliness and 'suffering' of debugging, making the process more enjoyable.
94:45 · Sebastian questions how future generations will become experts if they rely too heavily on LLMs, potentially missing the crucial 'unlock' gained from hands-on struggle.
96:46 · Lex transitions to post-training. Nathan reiterates RLVR as the biggest development of 2025 for enabling tool use and software capabilities.
97:47 · Lex asks for a description of RLVR. Nathan mentions being on the team that coined the term during Tulu 3 work.

💬 Notable Quotes

"I don't think nowadays, in 2026, that there will be any company that has access to technology that no other company has access to... The differentiating factor will be budget and hardware constraints. I don't think the ideas will be proprietary, but rather the resources needed to implement them." - Sebastian Raschka [03:05]
"The simple thing is: the US models are currently better, and we use them. I tried these other open models, and I'm like, 'Fun, but not gonna... I don't go back to it.'" - Nathan Lambert [20:33]
"I wouldn't say pre-training scaling is dead, it's just that there are other more attractive ways to scale right now. But at some point, you will still want to make some progress on the pre-training." - Sebastian Raschka [61:53]
"It's kinda fascinating to watch... for me, I look at the difference between a summary and the original content. Even if it's a page-long summary of a page-long content, it's interesting to see how the LLM-based summary takes the edge off. What is the signal it removes from the thing?" - Lex Fridman [81:27]

📚 Books Mentioned

Build a Large Language Model from Scratch by Sebastian Raschka
Build a Reasoning Model from Scratch by Sebastian Raschka
Reinforcement Learning from Human Feedback by Nathan Lambert
