A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 13 Issue 3
Mar.  2026

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 19.2, Top 1 (SCI Q1)
    CiteScore: 28.2, Top 1% (Q1)
    Google Scholar h5-index: 95, TOP 5
Citation: J. Liu, Z. Zhou, J. Huang, W. Hong, and J. Shi, “Two-dimensional model-free off-policy optimal iterative learning control for time-varying batch systems,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 3, pp. 692–703, Mar. 2026. doi: 10.1109/JAS.2025.125399

Two-Dimensional Model-Free Off-Policy Optimal Iterative Learning Control for Time-Varying Batch Systems

doi: 10.1109/JAS.2025.125399
  • Although iterative learning control (ILC) has been widely used in batch processes, designing an optimal ILC scheme for batch systems with unknown dynamics and time-varying parameters remains an open problem. In this paper, we propose a novel two-dimensional model-free off-policy optimal ILC scheme that achieves optimal control performance for linear time-varying batch systems. First, the one-dimensional state space is expanded to a two-dimensional state space by integrating time and batch information. Then, based on dynamic programming and a recursive algorithm, a framework for two-dimensional model-based optimal ILC is established. Building on this framework, two-dimensional model-free optimal ILC is developed using model-free Q-learning. The optimal ILC policy is obtained through online off-policy iteration using historical and online operation data. A rigorous convergence proof of the model-free optimal ILC law is also presented. Finally, simulation results for an injection molding batch process demonstrate the effectiveness and feasibility of the proposed control scheme and a significant improvement in control performance.
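
The abstract's first step, expanding the state along both the time and batch directions, can be illustrated with a small sketch. The code below is not the authors' formulation. It assumes a scalar linear time-varying batch system with identical initial conditions in every batch, and it uses a construction common in the two-dimensional ILC literature: the batch-wise state increment and the previous batch's tracking error together form the two-dimensional state. All names and numbers (`a`, `b`, `c`, `gain`, `run_batch`, `two_d_step`) are hypothetical, and the learning gain is a plain P-type gain rather than an optimal one.

```python
# Hypothetical scalar batch system (all numbers are illustrative assumptions):
#   x_k(t+1) = a[t] * x_k(t) + b[t] * u_k(t),   y_k(t) = c * x_k(t)
# where t is the time index within a batch and k is the batch index.
T = 5
a = [0.9 + 0.02 * t for t in range(T)]   # time-varying, repeated every batch
b = [0.5 + 0.05 * t for t in range(T)]
c = 1.0
x0 = 0.0                                  # same initial state in every batch
y_ref = [1.0] * (T + 1)


def run_batch(u):
    """Simulate one batch; return state and tracking-error trajectories."""
    x = [x0]
    for t in range(T):
        x.append(a[t] * x[t] + b[t] * u[t])
    e = [y_ref[t] - c * x[t] for t in range(T + 1)]
    return x, e


def two_d_step(dx_t, e_prev_next, r_t, t):
    """Advance the two-dimensional model one step along the time axis.

    dx_t        : x_k(t) - x_{k-1}(t), the batch-wise state increment
    e_prev_next : e_{k-1}(t+1), tracking error of the previous batch
    r_t         : u_k(t) - u_{k-1}(t), the ILC input update
    """
    dx_next = a[t] * dx_t + b[t] * r_t
    e_next = e_prev_next - c * dx_next
    return dx_next, e_next


# Batch k-1 with an arbitrary input, then a P-type ILC update for batch k.
u_prev = [0.2] * T
x_prev, e_prev = run_batch(u_prev)
gain = 0.8  # hypothetical learning gain, chosen only for illustration
r = [gain * e_prev[t + 1] for t in range(T)]
u_curr = [u_prev[t] + r[t] for t in range(T)]
x_curr, e_curr = run_batch(u_curr)

# Predict the error of batch k from the two-dimensional model alone.
dx, e_pred = 0.0, [e_curr[0]]
for t in range(T):
    dx, e_next = two_d_step(dx, e_prev[t + 1], r[t], t)
    e_pred.append(e_next)

# The prediction should match the direct simulation up to rounding.
max_mismatch = max(abs(p - q) for p, q in zip(e_pred, e_curr))
print(max_mismatch)
```

Under these assumptions the two-dimensional recursion reproduces the directly simulated tracking error, which is what allows the design problem to be posed over the pair (state increment, previous-batch error) rather than over the original one-dimensional state.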

     



    Figures(8)  / Tables(1)

    Article Metrics

    Article views (866) PDF downloads(66) Cited by()
