Citation: Y. L. Yang, Z. H. Ding, R. Wang, H. Modares, and D. C. Wunsch, “Data-driven human-robot interaction without velocity measurement using off-policy reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 47–63, Jan. 2022. doi: 10.1109/JAS.2021.1004258
[1] E. Nuño, R. Ortega, and L. Basañez, “An adaptive controller for nonlinear teleoperators,” Automatica, vol. 46, no. 1, pp. 155–159, 2010. doi: 10.1016/j.automatica.2009.10.026
[2] S. Cai, Z. Ma, M. J. Skibniewski, and S. Bao, “Construction automation and robotics for high-rise buildings over the past decades: A comprehensive review,” Advanced Engineering Informatics, vol. 42, Article No. 100989, 2019. doi: 10.1016/j.aei.2019.100989
[3] D. Han, P. Huang, X. Liu, and Y. Yang, “Combined spacecraft stabilization control after multiple impacts during the capture of a tumbling target by a space robot,” Acta Astronautica, vol. 176, pp. 24–32, 2020. doi: 10.1016/j.actaastro.2020.05.035
[4] S. E. Fasoli, H. I. Krebs, J. Stein, W. R. Frontera, and N. Hogan, “Effects of robotic therapy on motor impairment and recovery in chronic stroke,” Archives of Physical Medicine and Rehabilitation, vol. 84, no. 4, pp. 477–482, 2003. doi: 10.1053/apmr.2003.50110
[5] J. C. Perry, J. Rosen, and S. Burns, “Upper-limb powered exoskeleton design,” IEEE/ASME Trans. Mechatronics, vol. 12, no. 4, pp. 408–417, 2007. doi: 10.1109/TMECH.2007.901934
[6] M. Bergamasco, B. Allotta, L. Bosio, L. Ferretti, G. Parrini, G. Prisco, F. Salsedo, and G. Sartini, “An arm exoskeleton system for teleoperation and virtual environments applications,” in Proc. IEEE Int. Conf. Robotics and Automation, 1994, pp. 1449–1454.
[7] H. Modares, I. Ranatunga, F. L. Lewis, and D. O. Popa, “Optimized assistive human–robot interaction using reinforcement learning,” IEEE Trans. Cybernetics, vol. 46, no. 3, pp. 655–667, 2015.
[8] K. Guo, Y. Pan, D. Zheng, and H. Yu, “Composite learning control of robotic systems: A least squares modulated approach,” Automatica, vol. 111, Article No. 108612, 2020. doi: 10.1016/j.automatica.2019.108612
[9] T. Sun and Y. Pan, “Robust adaptive control for prescribed performance tracking of constrained uncertain nonlinear systems,” J. Franklin Institute, vol. 356, no. 1, pp. 18–30, 2019. doi: 10.1016/j.jfranklin.2018.09.005
[10] K. Dupree, P. M. Patre, Z. D. Wilcox, and W. E. Dixon, “Asymptotic optimal control of uncertain nonlinear Euler-Lagrange systems,” Automatica, vol. 47, no. 1, pp. 99–107, 2011. doi: 10.1016/j.automatica.2010.10.007
[11] Z. Li, J. Liu, Z. Huang, Y. Peng, H. Pu, and L. Ding, “Adaptive impedance control of human–robot cooperation using reinforcement learning,” IEEE Trans. Industrial Electronics, vol. 64, no. 10, pp. 8013–8022, 2017. doi: 10.1109/TIE.2017.2694391
[12] T. Sun, L. Peng, L. Cheng, Z. Hou, and Y. Pan, “Stability-guaranteed variable impedance control of robots based on approximate dynamic inversion,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 51, no. 7, pp. 4193–4200, 2019. doi: 10.1109/TSMC.2019.2930582
[13] T. Sun, L. Cheng, L. Peng, Z. Hou, and Y. Pan, “Learning impedance control of robots with enhanced transient and steady-state control performances,” Science China Information Sciences, vol. 63, no. 9, pp. 1–13, 2020.
[14] T. Sun, L. Peng, L. Cheng, Z. Hou, and Y. Pan, “Composite learning enhanced robot impedance control,” IEEE Trans. Neural Networks and Learning Systems, vol. 31, no. 3, pp. 1052–1059, 2020. doi: 10.1109/TNNLS.2019.2912212
[15] R. Colbaugh, H. Seraji, and K. Glass, “Direct adaptive impedance control of robot manipulators,” J. Robotic Systems, vol. 10, no. 2, pp. 217–248, 1993. doi: 10.1002/rob.4620100205
[16] S. Ge, C. Hang, L. Woon, and X. Chen, “Impedance control of robot manipulators using adaptive neural networks,” Int. J. Intelligent Control and Systems, vol. 2, no. 3, pp. 433–452, 1998.
[17] C. Wang, Y. Li, S. S. Ge, and T. H. Lee, “Reference adaptation for robots in physical interactions with unknown environments,” IEEE Trans. Cybernetics, vol. 47, no. 11, pp. 3504–3515, 2016.
[18] W.-S. Lu and Q.-H. Meng, “Impedance control with adaptation for robotic manipulations,” IEEE Trans. Robotics and Automation, vol. 7, no. 3, pp. 408–415, 1991. doi: 10.1109/70.88152
[19] H. N. Rahimi, I. Howard, and L. Cui, “Neural impedance adaption for assistive human–robot interaction,” Neurocomputing, vol. 290, pp. 50–59, 2018. doi: 10.1016/j.neucom.2018.02.025
[20] Y. Wang, W. Sun, Y. Xiang, and S. Miao, “Neural network-based robust tracking control for robots,” Intelligent Automation & Soft Computing, vol. 15, no. 2, pp. 211–222, 2009.
[21] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. USA: A Bradford Book, 2018.
[22] Y. Yang, D. Wunsch, and Y. Yin, “Hamiltonian-driven adaptive dynamic programming for continuous nonlinear dynamical systems,” IEEE Trans. Neural Networks and Learning Systems, vol. 28, no. 8, pp. 1929–1940, 2017. doi: 10.1109/TNNLS.2017.2654324
[23] Y. Yang, K. G. Vamvoudakis, H. Modares, Y. Yin, and D. C. Wunsch, “Hamiltonian-driven hybrid adaptive dynamic programming,” IEEE Trans. Systems, Man, and Cybernetics: Systems, to be published, 2020.
[24] Y. Yang, K. G. Vamvoudakis, H. Modares, Y. Yin, and D. C. Wunsch, “Safe intermittent reinforcement learning with static and dynamic event generators,” IEEE Trans. Neural Networks and Learning Systems, vol. 31, no. 12, pp. 5441–5455, 2020. doi: 10.1109/TNNLS.2020.2967871
[25] D. Wang and X. Zhong, “Advanced policy learning near-optimal regulation,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 3, pp. 743–749, 2019. doi: 10.1109/JAS.2019.1911489
[26] Y. Yang, Z. Guo, H. Xiong, D. Ding, Y. Yin, and D. C. Wunsch, “Data-driven robust control of discrete-time uncertain linear systems via off-policy reinforcement learning,” IEEE Trans. Neural Networks and Learning Systems, vol. 30, no. 12, pp. 3735–3747, 2019. doi: 10.1109/TNNLS.2019.2897814
[27] D. Wang, D. Liu, C. Mu, and Y. Zhang, “Neural network learning and robust stabilization of nonlinear systems with dynamic uncertainties,” IEEE Trans. Neural Networks and Learning Systems, vol. 29, no. 4, pp. 1342–1351, 2018. doi: 10.1109/TNNLS.2017.2749641
[28] Q. Zhang and D. Zhao, “Data-based reinforcement learning for nonzero-sum games with unknown drift dynamics,” IEEE Trans. Cybernetics, vol. 49, no. 8, pp. 2874–2885, 2019. doi: 10.1109/TCYB.2018.2830820
[29] H. Modares, F. L. Lewis, and Z.-P. Jiang, “H∞ tracking control of completely unknown continuous-time systems via off-policy reinforcement learning,” IEEE Trans. Neural Networks and Learning Systems, vol. 26, no. 10, pp. 2550–2562, 2015. doi: 10.1109/TNNLS.2015.2441749
[30] B. Luo, H.-N. Wu, and T. Huang, “Off-policy reinforcement learning for H∞ control design,” IEEE Trans. Cybernetics, vol. 45, no. 1, pp. 65–76, 2014.
[31] W. Gao, Z. Jiang, and K. Ozbay, “Data-driven adaptive optimal control of connected vehicles,” IEEE Trans. Intelligent Transportation Systems, vol. 18, no. 5, pp. 1122–1133, 2017. doi: 10.1109/TITS.2016.2597279
[32] W. Gao, J. Gao, K. Ozbay, and Z. Jiang, “Reinforcement-learning-based cooperative adaptive cruise control of buses in the Lincoln Tunnel corridor with time-varying topology,” IEEE Trans. Intelligent Transportation Systems, vol. 20, no. 10, pp. 3796–3805, 2019. doi: 10.1109/TITS.2019.2895285
[33] T. Degris, M. White, and R. S. Sutton, “Off-policy actor-critic,” arXiv preprint arXiv:1205.4839, 2012.
[34] Y. Jiang and Z.-P. Jiang, “Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics,” Automatica, vol. 48, no. 10, pp. 2699–2704, 2012. doi: 10.1016/j.automatica.2012.06.096
[35] J. Kober and J. R. Peters, “Policy search for motor primitives in robotics,” in Learning Motor Skills. Cham: Springer, 2014, pp. 83–117.
[36] F. Zhang, D. M. Dawson, M. S. de Queiroz, and W. E. Dixon, “Global adaptive output feedback tracking control of robot manipulators,” IEEE Trans. Automatic Control, vol. 45, no. 6, pp. 1203–1208, 2000. doi: 10.1109/9.863607
[37] F. L. Lewis, D. M. Dawson, and C. T. Abdallah, Robot Manipulator Control: Theory and Practice. Boca Raton, Florida: CRC Press, 2003.
[38] J.-J. E. Slotine and W. Li, “On the adaptive control of robot manipulators,” Int. J. Robotics Research, vol. 6, no. 3, pp. 49–59, 1987. doi: 10.1177/027836498700600303
[39] A. T. Hasan, N. Ismail, A. Hamouda, I. Aris, M. Marhaban, and H. Al-Assadi, “Artificial neural network-based kinematics Jacobian solution for serial manipulator passing through singular configurations,” Advances in Engineering Software, vol. 41, no. 2, pp. 359–367, 2010. doi: 10.1016/j.advengsoft.2009.06.006
[40] R. C. Miall, D. J. Weir, D. M. Wolpert, and J. F. Stein, “Is the cerebellum a Smith predictor?” Journal of Motor Behavior, vol. 25, no. 3, pp. 203–216, 1993. doi: 10.1080/00222895.1993.9942050
[41] A. Phatak, H. Weinert, I. Segall, and C. N. Day, “Identification of a modified optimal control model for the human operator,” Automatica, vol. 12, no. 1, pp. 31–41, 1976. doi: 10.1016/0005-1098(76)90066-2
[42] J. Ragazzini, “Engineering aspects of the human being as a servomechanism,” Am. Psychol., vol. 3, pp. 219–314, 1948. doi: 10.1037/h0056536
[43] E. Zergeroglu, W. Dixon, D. Haste, and D. Dawson, “A composite adaptive output feedback tracking controller for robotic manipulators,” Robotica, vol. 17, no. 6, pp. 591–600, 1999. doi: 10.1017/S0263574799001848
[44] H. Y. Lau and L. C. Wai, “A Jacobian-based redundant control strategy for the 7-DOF WAM,” in Proc. 7th Int. Conf. Control, Automation, Robotics and Vision (ICARCV 2002), 2002, vol. 2, pp. 1060–1065.
[45] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control. Hoboken, New Jersey: John Wiley & Sons, 2012.