IEEE/CAA Journal of Automatica Sinica
Citation: H. Zhang, Y. Li, Z. Wang, Y. Ding, and H. Yan, “Policy gradient adaptive dynamic programming for model-free multi-objective optimal control,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 4, pp. 1060–1062, Apr. 2024. doi: 10.1109/JAS.2023.123381
[1] S. Chinchali, S. C. Livingston, M. Chen, and M. Pavone, “Multi-objective optimal control for proactive decision making with temporal logic models,” Int. J. Robotics Research, vol. 38, no. 12–13, pp. 1490–1512, 2019.
[2] J. A. Villegas-Florez, B. S. Hernandez-Osorio, and E. Giraldo, “Multi-objective optimal control of resources applied to an electric power distribution system,” Engineering Letters, vol. 28, no. 3, pp. 756–761, 2020.
[3] J. Chen, J. Sun, and G. Wang, “From unmanned systems to autonomous intelligent systems,” Engineering, vol. 12, no. 5, pp. 16–19, 2022.
[4] Q. Wei, F.-Y. Wang, D. Liu, and X. Yang, “Finite-approximation-error-based discrete-time iterative adaptive dynamic programming,” IEEE Trans. Cyber., vol. 44, no. 12, pp. 2820–2833, 2014. doi: 10.1109/TCYB.2014.2354377
[5] B. Luo, D. Liu, H.-N. Wu, D. Wang, and F. L. Lewis, “Policy gradient adaptive dynamic programming for data-based optimal control,” IEEE Trans. Cyber., vol. 47, no. 10, pp. 3341–3354, 2016.
[6] D. Zhao, Q. Zhang, D. Wang, and Y. Zhu, “Experience replay for optimal control of nonzero-sum game systems with unknown dynamics,” IEEE Trans. Cyber., vol. 46, no. 3, pp. 854–865, 2015.
[7] Y. Yang, W. Gao, H. Modares, and C.-Z. Xu, “Robust actor-critic learning for continuous-time nonlinear systems with unmodeled dynamics,” IEEE Trans. Fuzzy Syst., vol. 30, no. 6, pp. 2101–2112, 2021.
[8] J. Xu, L. Wang, Y. Liu, and H. Xue, “Event-triggered optimal containment control for multi-agent systems subject to state constraints via reinforcement learning,” Nonlinear Dynamics, vol. 109, pp. 1651–1670, 2022. doi: 10.1007/s11071-022-07513-4
[9] J. Xu, L. Wang, Y. Liu, J. Sun, and Y. Pan, “Finite-time adaptive optimal consensus control for multi-agent systems subject to time-varying output constraints,” Applied Math. and Computation, vol. 427, p. 127176, 2022. doi: 10.1016/j.amc.2022.127176
[10] Y. Yang, B. Kiumarsi, H. Modares, and C. Xu, “Model-free λ-policy iteration for discrete-time linear quadratic regulation,” IEEE Trans. Neural Networks and Learning Systems, vol. 34, no. 2, pp. 635–649, 2023.
[11] V. G. Lopez and F. L. Lewis, “Dynamic multi objective control for continuous-time systems using reinforcement learning,” IEEE Trans. Autom. Control, vol. 64, no. 7, pp. 2869–2874, 2019. doi: 10.1109/TAC.2018.2869462