Implicit Regularization of SGD in High-Dimensional Linear Regression. Research seminar, fall 2025

Speaker: Cong Fang, Researcher at Peking University

What will the talk cover?

Stochastic Gradient Descent (SGD) is one of the most widely used algorithms in modern machine learning. In high-dimensional learning problems, the number of SGD iterations is often smaller than the number of model parameters, and the implicit regularization induced by the algorithm plays a key role in ensuring strong generalization performance.

In this seminar, we will:
- Analyze the generalization behavior of SGD across different learning scenarios;
- Compare learning efficiency at different scales, depending on data size and dimensionality;
- Discuss the effects of covariate shift;
- Present theoretical insights that inspire memory-efficient training algorithms for large language models (e.g., GPT-2).

