Neural networks trained with gradient descent are said to get stuck easily in local optima. Why are they still so widely used?
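Question type: Technical interview question
This is a technical interview question commonly asked in interviews at Australian IT companies.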
Difficulty: hard
Category: Deep Learning
Tags: Local Optima, Saddle Points, Hessian, Loss Landscape