Citation: S. Li and C. Cheah, “Learning laws for deep convolutional neural networks with guaranteed convergence,” IEEE/CAA J. Autom. Sinica, 2025. doi: 10.1109/JAS.2025.125171
[1] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, “Reading digits in natural images with unsupervised feature learning,” in Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011, p. 4.
[2] H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747, 2017.
[3] T. Clanuwat, M. Bober-Irizar, A. Kitamoto, A. Lamb, K. Yamamoto, and D. Ha, “Deep learning for classical Japanese literature,” arXiv preprint arXiv:1812.01718, 2018.
[4] A. Krizhevsky, V. Nair, and G. Hinton, “CIFAR-10 (Canadian Institute for Advanced Research),” [Online]. Available: http://www.cs.toronto.edu/~kriz/cifar.html
[5] S. R. Fanello, C. Ciliberto, M. Santoro, L. Natale, G. Metta, L. Rosasco, and F. Odone, “iCub world: Friendly robots help building good vision data-sets,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition Workshops, Portland, USA, 2013, pp. 700–705.
[6] Z. A. Saberi, H. Sadr, and M. R. Yamaghani, “An intelligent diagnosis system for predicting coronary heart disease,” in Proc. 10th Int. Conf. Artificial Intelligence and Robotics, Qazvin, Iran, 2024, pp. 131–137.
[7] D. Gunning, M. Stefik, J. Choi, T. Miller, S. Stumpf, and G.-Z. Yang, “XAI-explainable artificial intelligence,” Sci. Robot., vol. 4, no. 37, p. eaay7120, Dec. 2019. doi: 10.1126/scirobotics.aay7120
[8] S. Thrun and T. M. Mitchell, “Lifelong robot learning,” Rob. Auton. Syst., vol. 15, no. 1-2, pp. 25–46, Jul. 1995. doi: 10.1016/0921-8890(95)00004-Y
[9] B. Wu, J. Zhong, and C. Yang, “A visual-based gesture prediction framework applied in social robots,” IEEE/CAA J. Autom. Sinica, vol. 9, pp. 510–519, 2022. doi: 10.1109/JAS.2021.1004243
[10] A. Kendall, J. Hawke, D. Janz, P. Mazur, D. Reda, J. M. Allen, V. D. Lam, A. Bewley, and A. Shah, “Learning to drive in a day,” in Proc. Int. Conf. Robotics and Automation, Montreal, Canada, 2019, pp. 8248–8254.
[11] P. M. Kebria, A. Khosravi, S. M. Salaken, and S. Nahavandi, “Deep imitation learning for autonomous vehicles based on convolutional neural networks,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 1, pp. 82–95, Jan. 2020. doi: 10.1109/JAS.2019.1911825
[12] Z. Zhao, J. Zhang, S. Chen, W. He, and K.-S. Hong, “Neural-network-based adaptive finite-time control for a two-degree-of-freedom helicopter system with an event-triggering mechanism,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 8, pp. 1754–1765, Aug. 2023. doi: 10.1109/JAS.2023.123453
[13] Z. Chen and B. Liu, Lifelong Machine Learning, 2nd ed. Cham, Switzerland: Springer, 2018.
[14] D. Zou, Y. Cao, D. Zhou, and Q. Gu, “Gradient descent optimizes over-parameterized deep ReLU networks,” Mach. Learn., vol. 109, no. 3, pp. 467–492, Mar. 2020. doi: 10.1007/s10994-019-05839-6
[15] S. S. Du, J. D. Lee, H. Li, L. Wang, and X. Zhai, “Gradient descent finds global minima of deep neural networks,” in Proc. 36th Int. Conf. Machine Learning, Long Beach, USA, 2019, pp. 1675–1685.
[16] Z. Allen-Zhu, Y. Li, and Z. Song, “A convergence theory for deep learning via over-parameterization,” in Proc. 36th Int. Conf. Machine Learning, Long Beach, USA, 2019, pp. 242–252.
[17] H. T. Nguyen, C. C. Cheah, and K. A. Toh, “An analytic layer-wise deep learning framework with applications to robotics,” Automatica, vol. 135, p. 110007, Jan. 2022. doi: 10.1016/j.automatica.2021.110007
[18] S. Li, H. T. Nguyen, and C. C. Cheah, “A theoretical framework for end-to-end learning of deep neural networks with applications to robotics,” IEEE Access, vol. 11, pp. 21992–22006, Feb. 2023. doi: 10.1109/ACCESS.2023.3249280
[19] S. Li and C. C. Cheah, “An analytic end-to-end collaborative deep learning algorithm,” IEEE Control Syst. Lett., vol. 7, pp. 3024–3029, Jul. 2023. doi: 10.1109/LCSYS.2023.3292034
[20] A. Guarneros-Sandoval, M. Ballesteros, I. Salgado, J. Rodríguez-Santillán, and I. Chairez, “Lyapunov stable learning laws for multilayer recurrent neural networks,” Neurocomputing, vol. 491, pp. 644–657, Jun. 2022. doi: 10.1016/j.neucom.2021.12.041
[21] O. S. Patil, D. M. Le, M. L. Greene, and W. E. Dixon, “Lyapunov-derived control and adaptive update laws for inner and outer layer weights of a deep neural network,” IEEE Control Syst. Lett., vol. 6, pp. 1855–1860, Dec. 2022. doi: 10.1109/LCSYS.2021.3134914
[22] O. S. Patil, D. M. Le, E. J. Griffis, and W. E. Dixon, “Deep residual neural network (ResNet)-based adaptive control: A Lyapunov-based approach,” in Proc. 61st IEEE Conf. Decision and Control, Cancun, Mexico, 2022, pp. 3487–3492.
[23] D. Zou and Q. Gu, “An improved analysis of training over-parameterized deep neural networks,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, p. 184.
[24] S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in deep neural networks with minimum over-parameterization,” in Proc. 36th Int. Conf. Neural Information Processing Systems, New Orleans, USA, 2022, p. 554.
[25] M. Kohler and S. Langer, “Statistical theory for image classification using deep convolutional neural network with cross-entropy loss under the hierarchical max-pooling model,” J. Stat. Plan. Inference, vol. 234, p. 106188, Jan. 2025. doi: 10.1016/j.jspi.2024.106188
[26] Z. Fang and G. Cheng, “Optimal convergence rates of deep convolutional neural networks: Additive ridge functions,” Trans. Mach. Learn. Res., 2023.
[27] M. Kohler and B. Walter, “Analysis of convolutional neural network image classifiers in a rotationally symmetric model,” IEEE Trans. Inf. Theory, vol. 69, no. 8, pp. 5203–5218, Aug. 2023. doi: 10.1109/TIT.2023.3262745
[28] H. Zhang, L. Feng, X. Zhang, Y. Yang, and J. Li, “Necessary conditions for convergence of CNNs and initialization of convolution kernels,” Digital Signal Process., vol. 123, p. 103397, Apr. 2022. doi: 10.1016/j.dsp.2022.103397
[29] Y. Xu and H. Zhang, “Convergence of deep convolutional neural networks,” Neural Netw., vol. 153, pp. 553–563, Sep. 2022. doi: 10.1016/j.neunet.2022.06.031
[30] S. Zhang, M. Wang, J. Xiong, S. Liu, and P.-Y. Chen, “Improved linear convergence of training CNNs with generalizability guarantees: A one-hidden-layer case,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 6, pp. 2622–2635, Jun. 2021. doi: 10.1109/TNNLS.2020.3007399
[31] H. T. Nguyen, S. Li, and C. C. Cheah, “A layer-wise theoretical framework for deep learning of convolutional neural networks,” IEEE Access, vol. 10, pp. 14270–14287, Jan. 2022. doi: 10.1109/ACCESS.2022.3147869
[32] Y. Le Cun, B. Boser, J. S. Denker, R. Howard, W. Hubbard, L. D. Jackel, and D. Henderson, “Handwritten digit recognition with a back-propagation network,” in Proc. 3rd Int. Conf. Neural Information Processing Systems, Denver, USA, 1990, pp. 396–404.
[33] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. 3rd Int. Conf. Learning Representations, San Diego, USA, 2015, pp. 1–14.
[34] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. 3rd Int. Conf. Learning Representations, San Diego, USA, 2015.
[35] N. Qian, “On the momentum term in gradient descent learning algorithms,” Neural Netw., vol. 12, no. 1, pp. 145–151, Jan. 1999. doi: 10.1016/S0893-6080(98)00116-6
[36] K. He and J. Sun, “Convolutional neural networks at constrained time cost,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 5353–5360.
[37] T. L. Hayes, N. D. Cahill, and C. Kanan, “Memory efficient experience replay for streaming learning,” in Proc. Int. Conf. Robotics and Automation, Montreal, Canada, 2019, pp. 9769–9776.