Effective Rank and the Staircase Phenomenon: New Insights into Neural Network Training Dynamics
cs.LG
cs.NA
math.NA
Release Date: December 6, 2024
Authors: Yang Jiang, Yuxiang Zhao¹, Quanhui Zhu¹
Aff.: ¹Department of Mathematics, Southern University of Science and Technology

| Notation | Stands for … |
|---|---|
| | Depth of hidden layers |
| | Width of hidden layers |
| | Number of total parameters |
| | Sample size |
| | Dimension of the input |
| | The neurons of the output layer |
| | Coefficients of the output layer |
| | The mass matrix of basis functions |
| | The ε-rank of the mass matrix |
| | Tolerance of eigenvalues |
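
The last three rows of the notation table (mass matrix, its rank, and an eigenvalue tolerance) suggest how the rank might be computed in practice. The sketch below is one plausible reading and is not taken from the paper: it assumes the mass matrix is the empirical Gram matrix of the basis functions evaluated at the sample points, and that the rank counts the eigenvalues exceeding the tolerance. The names `mass_matrix`, `eps_rank`, `phi`, and `eps` are hypothetical.

```python
import numpy as np


def mass_matrix(phi: np.ndarray) -> np.ndarray:
    """Empirical Gram (mass) matrix of a set of basis functions.

    Assumption: phi has shape (n_samples, n_basis), and column j holds
    basis function j evaluated at each sample point.
    """
    n_samples = phi.shape[0]
    return phi.T @ phi / n_samples


def eps_rank(matrix: np.ndarray, eps: float) -> int:
    """Count the eigenvalues of a symmetric matrix that exceed eps.

    Assumption: this thresholded count is what the table calls the rank
    of the mass matrix; the paper's exact definition may differ.
    """
    # eigvalsh is the eigenvalue routine for symmetric/Hermitian input.
    eigvals = np.linalg.eigvalsh(matrix)
    return int(np.sum(eigvals > eps))


# Two identical basis functions sampled at three points: the Gram matrix
# is singular, so only one eigenvalue clears a small tolerance.
phi_demo = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
rank_demo = eps_rank(mass_matrix(phi_demo), eps=1e-8)
```

Under these assumptions, nearly linearly dependent basis functions produce eigenvalues below the tolerance and are not counted, so the result can be smaller than the number of basis functions.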