Citation: Ao Xi, Thushal Wijekoon Mudiyanselage, Dacheng Tao and Chao Chen, "Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning," IEEE/CAA J. Autom. Sinica, vol. 6, no. 4, pp. 938-951, July 2019. doi: 10.1109/JAS.2019.1911567
[1] E. J. Molinos, A. Llamazares, N. Hernández, R. Arroyo, A. Cela, and J. J. Yebes, "Perception and navigation in unknown environments: the DARPA robotics challenge," in Proc. 1st Iberian Robotics Conf., pp. 321–329, 2013.
[2] M. Fujita, Y. Kuroki, T. Ishida, and T. T. Doi, "A small humanoid robot SDR-4X for entertainment applications," in Proc. Int. Conf. Advanced Intelligent Mechatronics, pp. 938–943, 2003.
[3] C. Chevallereau, G. Bessonnet, G. Abba, and Y. Aoustin, Bipedal Robots: Modeling, Design and Walking Synthesis, Wiley-ISTE, 1st Edition, 2008.
[4] M. Vukobratovic and B. Borovac, "Zero-moment point - thirty-five years of its life," Int. J. Humanoid Robotics, 2004.
[5] B. J. Lee, D. Stonier, Y. D. Kim, J. K. Yoo, and J. H. Kim, "Modifiable walking pattern of a humanoid robot by using allowable ZMP variation," IEEE Trans. Robotics, vol. 24, no. 4, pp. 917–925, 2008. doi: 10.1109/TRO.2008.926859
[6] K. Nishiwaki and S. Kagami, "Strategies for adjusting the ZMP reference trajectory for maintaining balance in humanoid walking," in Proc. IEEE Int. Conf. Robotics and Automation, 2010.
[7] K. Hirai, M. Hirose, Y. Haikawa, and T. Takenaka, "The development of Honda humanoid robot," in Proc. IEEE Int. Conf. Robotics and Automation, vol. 2, pp. 1321–1326, May 1998.
[8] P. M. Wensing and G. B. Hammam, "Optimizing foot centers of pressure through force distribution in a humanoid robot," Int. J. Humanoid Robotics, vol. 10, no. 3, 2013.
[9] H. Zhao, A. Hereid, W. Ma, and A. D. Ames, "Multi-contact bipedal robotic locomotion," Robotica, vol. 35, no. 5, pp. 1072–1106, 2017. doi: 10.1017/S0263574715000995
[10] U. Huzaifa, C. Maguire, and A. LaViers, "Toward an expressive bipedal robot: variable gait synthesis and validation in a planar model," arXiv: 1808.05594v2, 2018.
[11] T. Takenaka, T. Matsumoto, T. Yoshiike, T. Hasegawa, S. Shirokura, H. Kaneko, and A. Orita, "Real time motion generation and control for biped robot - 4th report: integrated balance control," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, pp. 1601–1608, 2009.
[12] J. J. Alcaraz-Jiménez, D. Herrero-Pérez, and H. Martínez-Barberá, "Robust feedback control of ZMP-based gait for the humanoid robot Nao," Int. J. Robot. Res., vol. 32, no. 9–10, pp. 1074–1088, 2013. doi: 10.1177/0278364913487566
[13] K. Seo, J. Kim, and K. Roh, "Towards natural bipedal walking: virtual gravity compensation and capture point control," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS), pp. 4019–4026, 2012.
[14] J. Kim, T. T. Tran, C. V. Dang, and B. Kang, "Motion and walking stabilization of humanoids using sensory reflex control," Int. J. Advanced Robotic Systems (IJARS).
[15] M. Shahbazi, G. A. D. Lopes, and R. Babuska, "Observer-based postural balance control for humanoid robots," in Proc. IEEE Int. Conf. Robotics and Biomimetics (ROBIO), pp. 891–896, 2013.
[16] S. Wang, W. Chavalitwongse, and B. Robert, "Machine learning algorithms in bipedal robot control," IEEE Trans. Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 42, no. 5, pp. 728–743, 2012. doi: 10.1109/TSMCC.2012.2186565
[17] I. Mordatch, N. Mishra, C. Eppner, and P. Abbeel, "Combining model-based policy search with online model learning for control of physical humanoids," in Proc. Int. Conf. Robotics and Automation, pp. 242–248, 2016.
[18] P. Hénaff, V. Scesa, F. B. Ouezdou, and O. Bruneau, "Real time implementation of CTRNN and BPTT algorithm to learn on-line biped robot balance: experiments on the standing posture," Control Eng. Pract., vol. 19, no. 1, pp. 89–99, 2011. doi: 10.1016/j.conengprac.2010.10.002
[19] A. A. Saputra and I. A. Sulistijono, "Biologically inspired control system for 3D locomotion of a humanoid biped robot," IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 46, no. 7, pp. 898–911, 2016. doi: 10.1109/TSMC.2015.2497250
[20] J. P. Ferreira, M. Crisostomo, and A. P. Coimbra, "Rejection of an external force in the sagittal plane applied on a biped robot using a neuro-fuzzy controller," in Proc. Int. Conf. Advanced Robotics, pp. 1–6, 2009.
[21] F. Farzadpour and M. Danesh, "A new hybrid intelligent control algorithm for a seven-link biped walking robot," J. Vibration and Control, vol. 20, no. 9, pp. 1378–1393, 2014. doi: 10.1177/1077546312470476
[22] K. S. Hwang, W. C. Jiang, Y. J. Chen, and H. Shi, "Gait balance and acceleration of a biped robot based on Q-learning," IEEE Access, vol. 4, pp. 2439–2449, Jun. 2016. doi: 10.1109/ACCESS.2016.2570255
[23] K. S. Hwang, W. C. Jiang, Y. J. Chen, and H. Shi, "Motion segmentation and balancing for a biped robot's imitation learning," IEEE Trans. Industrial Informatics, vol. 13, no. 3, Jun. 2017.
[24] W. Wu and L. Gao, "Posture self-stabilizer of a bipedal robot based on training platform and reinforcement learning," Robotics and Autonomous Systems, vol. 98, pp. 42–55, 2017. doi: 10.1016/j.robot.2017.09.001
[25] A. S. Polydoros and L. Nalpantidis, "Survey of model-based reinforcement learning: applications on robotics," J. Intelligent and Robotic Systems, vol. 86, pp. 153–173, 2017. doi: 10.1007/s10846-017-0468-y
[26] M. P. Deisenroth and C. E. Rasmussen, "PILCO: a model-based and data-efficient approach to policy search," in Proc. 28th Int. Conf. Machine Learning, pp. 465–472, 2011.
[27] M. Cutler and J. How, "Efficient reinforcement learning for robots using informative simulated priors," in Proc. Int. Conf. Robotics and Automation, pp. 2605–2612, 2015.
[28] P. Englert, A. Paraschos, J. Peters, and M. P. Deisenroth, "Model-based imitation learning by probabilistic trajectory matching," in Proc. Int. Conf. Robotics and Automation, pp. 1922–1927, 2013.
[29] Y. Gal, R. T. McAllister, and C. E. Rasmussen, "Improving PILCO with Bayesian neural network dynamics models," Cambridge University, 2017.
[30] B. Stephens, "Integral control of humanoid balance," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, pp. 4020–4027, 2007.
[31] J. Kober, J. A. Bagnell, and J. Peters, "Reinforcement learning in robotics: a survey," Int. J. Robot. Res., vol. 32, no. 11, pp. 1238–1274, 2013. doi: 10.1177/0278364913495721
[32] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, Cambridge, MA, USA, 2006.
[33] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis (Chapman & Hall/CRC Texts in Statistical Science), 3rd Edition, Nov. 2013.
[34] D. J. C. MacKay, "Comparison of approximate methods for handling hyperparameters," Neural Computation, vol. 11, no. 5, pp. 1035–1068, 1999. doi: 10.1162/089976699300016331
[35] M. Deisenroth, D. Fox, and C. E. Rasmussen, "Gaussian processes for data-efficient learning in robotics and control," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 408–423, 2015. doi: 10.1109/TPAMI.2013.218
[36] M. Deisenroth, G. Neumann, and J. Peters, "A survey on policy search for robotics," Foundations and Trends in Robotics, vol. 2, no. 1–2, pp. 1–142, 2013.
[37] C. J. C. H. Watkins and P. Dayan, "Technical note: Q-learning," Machine Learning, vol. 8, pp. 279–292, 1992.
[38] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, The MIT Press, USA, 2018.
[39] J. Feng, C. Fyfe, and L. C. Jain, "Experimental analysis on Sarsa(λ) and Q(λ) with different eligibility traces strategies," J. Intelligent and Fuzzy Systems, vol. 20, no. 1–2, pp. 73–82, 2009.