IEEE/CAA Journal of Automatica Sinica
Citation: X. H. Wen and M. C. Zhou, “Evolution and role of optimizers in training deep learning models,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 10, pp. 2039–2042, Oct. 2024. doi: 10.1109/JAS.2024.124806
[1] Z. Zhang et al., “Mapping network-coordinated stacked gated recurrent units for turbulence prediction,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 6, pp. 1331–1341, 2024. doi: 10.1109/JAS.2024.124335
[2] H. Liu et al., “Aspect-based sentiment analysis: A survey of deep learning methods,” IEEE Trans. Computational Social Systems, vol. 7, no. 6, pp. 1358–1375, Dec. 2020. doi: 10.1109/TCSS.2020.3033302
[3] H. Wu et al., “A PID-incorporated latent factorization of tensors approach to dynamically weighted directed network analysis,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 3, pp. 533–546, 2022. doi: 10.1109/JAS.2021.1004308
[4] W. Xu et al., “Transformer-based macroscopic regulation for high-speed railway timetable rescheduling,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 9, pp. 1822–1833, 2023. doi: 10.1109/JAS.2023.123501
[5] I. Goodfellow et al., Deep Learning. MIT Press, 2016.
[6] S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv: 1609.04747, 2016.
[7] H. Robbins et al., “A stochastic approximation method,” The Annals of Mathematical Statistics, pp. 400–407, 1951.
[8] J. Duchi et al., “Adaptive subgradient methods for online learning and stochastic optimization,” Journal of Machine Learning Research, vol. 12, p. 7, 2011.
[9] T. Tieleman, “Lecture 6.5-RMSprop: Divide the gradient by a running average of its recent magnitude,” COURSERA: Neural Networks for Machine Learning, vol. 4, p. 2, 2012.
[10] D. P. Kingma et al., “Adam: A method for stochastic optimization,” arXiv preprint arXiv: 1412.6980, 2014.
[11] A. Vaswani et al., “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
[12] N. S. Keskar et al., “Improving generalization performance by switching from Adam to SGD,” arXiv preprint arXiv: 1712.07628, 2017.
[13] L. Luo et al., “Adaptive gradient methods with dynamic bound of learning rate,” in Proc. Int. Conf. Learning Representations, 2019.
[14] I. Loshchilov et al., “Decoupled weight decay regularization,” arXiv preprint arXiv: 1711.05101, 2017.
[15] P. Foret et al., “Sharpness-aware minimization for efficiently improving generalization,” arXiv preprint arXiv: 2010.01412, 2020.
[16] J. Kwon et al., “ASAM: Adaptive sharpness-aware minimization for scale-invariant learning of deep neural networks,” in Proc. Int. Conf. Machine Learning, 2021, pp. 5905–5914.
[17] X. Xie et al., “Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models,” arXiv preprint arXiv: 2208.06677, 2022.
[18] J. Zhuang et al., “AdaBelief optimizer: Adapting stepsizes by the belief in observed gradients,” in Proc. Conf. Neural Information Processing Systems, 2020.
[19] H. Liu et al., “Sophia: A scalable stochastic second-order optimizer for language model pre-training,” arXiv preprint arXiv: 2305.14342, 2023.
[20] J. Chen et al., “Hierarchical particle swarm optimization-incorporated latent factor analysis for large-scale incomplete matrices,” IEEE Trans. Big Data, vol. 8, no. 6, pp. 1524–1536, 2022.
[21] M. Cui et al., “Surrogate-assisted autoencoder-embedded evolutionary optimization algorithm to solve high-dimensional expensive problems,” IEEE Trans. Evolutionary Computation, vol. 26, no. 4, pp. 676–689, 2022. doi: 10.1109/TEVC.2021.3113923
[22] G. Wei et al., “A hybrid probabilistic multiobjective evolutionary algorithm for commercial recommendation systems,” IEEE Trans. Computational Social Systems, vol. 8, no. 3, pp. 589–598, 2021. doi: 10.1109/TCSS.2021.3055823
[23] J. Bi et al., “Energy-optimized partial computation offloading in mobile-edge computing with genetic simulated-annealing-based particle swarm optimization,” IEEE Internet of Things Journal, vol. 8, no. 5, pp. 3774–3785, 2021. doi: 10.1109/JIOT.2020.3024223
[24] S. Gao et al., “Dendritic neuron model with effective learning algorithms for classification, approximation, and prediction,” IEEE Trans. Neural Networks and Learning Systems, vol. 30, no. 2, pp. 601–614, 2018.
[25] Y. Yu et al., “Improving dendritic neuron model with dynamic scale-free network-based differential evolution,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 99–110, 2021.
[26] X. Luo et al., “Interpretability diversity for decision-tree-initialized dendritic neuron model ensemble,” IEEE Trans. Neural Networks and Learning Systems, 2023. doi: 10.1109/TNNLS.2023.3290203
[27] Y. Yu et al., “Improving dendritic neuron model with dynamic scale-free network-based differential evolution,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 99–110, Jan. 2022. doi: 10.1109/JAS.2021.1004284
[28] S. Gao et al., “Fully complex-valued dendritic neuron model,” IEEE Trans. Neural Networks and Learning Systems, vol. 34, no. 4, pp. 2105–2118, Apr. 2023. doi: 10.1109/TNNLS.2021.3105901
[29] F.-Y. Wang, “Intelligent vehicles from your HomePorts to underwaters and low altitude airspaces: SLAM for smart societies,” IEEE Trans. Intelligent Vehicles, vol. 9, no. 2, pp. 3092–3105, Feb. 2024. doi: 10.1109/TIV.2024.3373614
[30] G. Yuan et al., “An autonomous vehicle group cooperation model in an urban scene,” IEEE Trans. Intelligent Transportation Systems, vol. 24, no. 12, pp. 13852–13862, Dec. 2023. doi: 10.1109/TITS.2023.3300278
[31] Q. Zhao et al., “A tutorial on Internet of behaviors: Concept, architecture, technology, applications, and challenges,” IEEE Communications Surveys & Tutorials, vol. 25, no. 2, pp. 1227–1260, Secondquarter 2023.
[32] S. Lou et al., “Human-cyber-physical system for Industry 5.0: A review from a human-centric perspective,” IEEE Trans. Automation Science and Engineering, 2024. doi: 10.1109/TASE.2024.3360476
[33] L. Vlacic et al., “Automation 5.0: The key to systems intelligence and Industry 5.0,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 8, pp. 1723–1727, Aug. 2024. doi: 10.1109/JAS.2024.124635