
Episode 206


Today we're not talking about how large a model's parameter count is, but about how to make AI better at "thinking", and some of these ways of thinking can be downright counterintuitive. For example, why might frantically "tutoring" an AI actually make it dumber the more it studies? We'll also explore how to guide an AI through hard problems the way a skilled teacher would, rather than simply handing it the answers. Going further, we'll look at how to train an AI to analyze code like a detective who "reasons from evidence", and how to get an entire system to collaborate dynamically and find the most efficient way to "cut corners".
00:00:35 In the era of large AI models, how do you get big results on a small budget?
00:05:47 The pitfalls of "tutoring" AI: why does it get dumber the more it learns?
00:11:37 When an expert tutors you, why don't they just give you the answer?
00:16:48 Teaching AI to "reason things through": how a detective of the code world is made
00:22:00 Teaching AI to "save time": a smarter kind of fast
Papers featured in this episode:
[LG] Rich Insights from Cheap Signals: Efficient Evaluations via Tensor Factorization
[Google DeepMind & University of Michigan]
https://arxiv.org/abs/2603.02029
---
[LG] Theoretical Perspectives on Data Quality and Synergistic Effects in Pre- and Post-Training Reasoning Models
[University of Southern California & University of California Los Angeles & Google Research]
https://arxiv.org/abs/2603.01293
---
[LG] Learn Hard Problems During RL with Reference Guided Fine-tuning
[ByteDance Seed & UC Berkeley & CMU]
https://arxiv.org/abs/2603.01223
---
[LG] Agentic Code Reasoning
[Meta]
https://arxiv.org/abs/2603.01896
---
[CL] Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
[Microsoft Research Asia & Peking University]
https://arxiv.org/abs/2603.01639


