
IEEE/CAA Journal of Automatica Sinica
Citation: Yang Yang and Dong Yue, "Distributed Tracking Control of a Class of Multi-agent Systems in Non-affine Pure-feedback Form Under a Directed Topology," IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 169-180, Jan. 2018. doi: 10.1109/JAS.2017.7510382
Dear Editor,
Underwater distributed antenna systems (DAS) are stationary infrastructures consisting of multiple geographically distributed antenna elements (DAEs) that are interconnected through high-rate backbone networks [1]. Compared to centralized systems, a DAS can provide a larger coverage area and higher throughput for underwater acoustic (UWA) transmissions. In this work, exploiting the low sound speed in water, a multi-agent reinforcement learning (MARL)-based approach is proposed to secure underwater DAS against eavesdropping at the physical layer. Specifically, the theoretical secrecy rate is first derived for time-slotted UWA networks (UWANs), taking the large propagation delays into account. We then investigate the long-term sum secrecy rate optimization problem under the MARL framework, where each DAE learns its optimal transmission strategy online. Simulation results show that the proposed method achieves higher secrecy performance than the benchmark methods.
Typical physical-layer security approaches against eavesdropping in UWANs include secret key generation based on the randomness of UWA channels [2]; cooperative jamming, where the transmission strategies of friendly jammers are optimized to blind the eavesdropper (EVE) [3]; and secure coordinated multipoint (CoMP) transmissions, which enforce time-domain self-interference at the EVE [4], [5]. In secure CoMP transmissions, the transmissions of multiple DAEs are scheduled in a coordinated manner that exploits the low sound speed in water, so that the useful signals collide at the EVE and significantly degrade its decoding performance, while the signals received by the legitimate user (LU) remain collision-free. However, this type of security mechanism is effective only with specific transmission protocols and cannot be applied directly to general UWANs. In addition, CoMP transmissions rely heavily on efficient coordination among all the transmitters, whereas sharing information over UWA links incurs considerable overhead and latency. In this work, we study security enhancement against eavesdropping for time-slotted UWANs with CoMP transmissions by taking advantage of the low coordination cost of underwater DAS.
Reinforcement learning (RL) has been leveraged to secure terrestrial radio networks against eavesdropping at the physical layer [6]–[8]. The basic idea is to let the system learn optimal transmission strategies, e.g., the transmitting nodes, transmission power, or beamforming vectors, that maximize the secrecy performance through dynamic interaction with the environment. However, due to the non-negligible latency of UWA transmissions, those methods cannot be directly applied to UWANs. Although RL has been introduced to secure UWANs with privacy-preserving localization [9] and anti-jamming relay design [10], to the best of our knowledge, no existing work exploits RL to secure UWA transmissions against eavesdropping at the physical layer. Hence, considering the large propagation delays, we propose an MARL-based framework to secure the underwater DAS, where all the DAEs cooperatively learn their transmission strategies online to improve the network secrecy performance.
System model and secrecy rate: We first consider an underwater system where N DAEs coordinate with each other to transmit signal blocks to an LU while an EVE collects the signals transmitted by the DAEs. In this study, we assume that the EVE's location is known a priori to the DAS, an assumption also made in many existing works, e.g., [5], [6], and [9]. The underwater system operates in a time-slotted manner. Specifically, in each time slot, each DAE decides its transmission strategy, including whether to transmit one signal block to the LU and the transmission power of that block. Denote
Denote
If the μth DAE decides to transmit in the time slot
$$ \begin{split} \lambda_\mu(\ell;&{\cal{ I}}(\ell+D_\mu)) :=\\ &\frac{p_\mu(\ell) g_\mu }{\sigma^2_n+\sum_{\nu \in {\cal{ I}}(\ell+D_\mu),\nu \neq \mu} p_{\nu}(\ell+D_\mu-D_\nu)g_\nu} \end{split}\tag{1} $$
where
$$ \begin{split} \lambda^{\rm {(e)}}_\mu(\ell;&{\cal{ I}}^{\rm {(e)}}(\ell+D^{\rm {(e)}}_\mu)) :=\\ &\frac{p_\mu(\ell) g^{\rm {(e)}}_\mu }{\sigma^2_n+\sum_{\nu \in {\cal{ I}}^{\rm {(e)}}(\ell+D^{\rm {(e)}}_\mu),\nu \neq \mu} p_{\nu}(\ell+D^{\rm {(e)}}_\mu-D^{\rm {(e)}}_\nu)g^{\rm {(e)}}_\nu} \end{split}\tag{2} $$
where
To derive the theoretical secrecy rate (SR), we first assume that the global information of transmission strategy is available. For instance,
$$ \begin{split} C_\mu(\ell) =\;&\frac{1}{2}\Big[ \log\big(1+\lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))\big) \\ &-\log\big(1+ \lambda^{\rm (e)}_\mu(\ell;{\cal{ I}}^{\rm (e)}(\ell+D^{\rm (e)}_\mu))\big)\Big]^+ \end{split}\tag{3} $$
where
$$ \begin{align} \max_{\{d_\mu(\ell), p_\mu(\ell):1\le \mu\le N\}_{\ell=0}^{\infty}}& \sum_{\ell=0}^{\infty} \sum_{\mu=1}^{N}C_\mu(\ell) \end{align}\tag{4a} $$
$$ \qquad\begin{align} \text{s.t.} \qquad \lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))&\ge \Gamma_{\rm th},\;\; \text{if}\; d_\mu=1 \end{align}\tag{4b} $$
where
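As a numerical illustration of (1)–(3), the following minimal Python sketch evaluates the two SINRs and the resulting secrecy rate for a given transmission schedule. Several points are assumptions made only for this sketch and are not fixed by the text above: delays are integer numbers of slots, the interfering set is taken to be the DAEs whose blocks arrive in the same slot, the logarithm is base 2, and all function and variable names are hypothetical.

```python
import math


def sinr(mu, slot, power, gain, delay, noise_power):
    """SINR of the block sent by DAE `mu` in `slot`, following the form of (1)/(2).

    power[nu][l] is the transmit power of DAE nu in slot l (0.0 means silent),
    gain[nu] is its channel gain and delay[nu] its propagation delay in slots
    (assumed integer here) towards the receiver of interest.
    """
    arrival = slot + delay[mu]  # slot in which the block reaches the receiver
    interference = 0.0
    for nu in range(len(gain)):
        if nu == mu:
            continue
        sent = arrival - delay[nu]  # slot in which nu must have sent to collide
        if 0 <= sent < len(power[nu]):
            interference += power[nu][sent] * gain[nu]
    return power[mu][slot] * gain[mu] / (noise_power + interference)


def secrecy_rate(mu, slot, power, lu, eve, noise_power):
    """Secrecy rate in the form of (3); `lu` and `eve` are (gain, delay) pairs."""
    lam_lu = sinr(mu, slot, power, lu[0], lu[1], noise_power)
    lam_eve = sinr(mu, slot, power, eve[0], eve[1], noise_power)
    return 0.5 * max(0.0, math.log2(1.0 + lam_lu) - math.log2(1.0 + lam_eve))


# Two DAEs transmit in slot 0. Their blocks reach the LU in different slots
# (delays 1 and 2) but reach the EVE in the same slot (delays 2 and 2).
power = [[1.0, 0.0], [1.0, 0.0]]
lu = ([1.0, 1.0], [1, 2])
eve = ([1.0, 1.0], [2, 2])
rate = secrecy_rate(0, 0, power, lu, eve, noise_power=1.0)
print(f"secrecy rate of DAE 0: {rate:.4f}")
```

In this toy schedule the block from DAE 0 is collision-free at the LU but collides with the block from DAE 1 at the EVE, so the secrecy rate is strictly positive even though both receivers see the same channel gains.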
RL reformulation: To reformulate (4) in the RL paradigm, a Markov decision process (MDP)
1) Action: Denote
2) State: As the SR
$$ \begin{split} \check{C}_\mu(\ell) :=\;&\frac{1}{2}\Big[ \log\big(1+\lambda_\mu(\ell;\check{{\cal{ I}}}(\ell,\ell+D_\mu))\big) \\ &-\log\big(1+ \lambda^{\rm (e)}_\mu(\ell;\check{{\cal{ I}}}^{\rm (e)}(\ell,\ell+D^{\rm (e)}_\mu))\big)\Big]^+ \end{split}\tag{5} $$
where
We construct a tuple for the μth DAE as
3) Reward: Denote
$$ \begin{equation} r_\mu(\ell) = C_\mu(\ell) - \beta\big[\Gamma_{\rm th}-\lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))\big]^+d_\mu(\ell) \end{equation}\tag{6} $$
where
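The structure of the reward in (6) can be made concrete with a short Python sketch. It assumes that the SR and the LU SINR of the block have already been computed, and it treats β as a non-negative penalty weight; the function and argument names are hypothetical.

```python
def reward(secrecy_rate, sinr_lu, sinr_threshold, beta, transmit):
    """Per-agent reward in the form of (6).

    The penalty term is active only when the DAE transmits (transmit=True)
    and the SINR at the LU falls short of the threshold.
    """
    shortfall = max(0.0, sinr_threshold - sinr_lu)
    return secrecy_rate - beta * shortfall * (1 if transmit else 0)


# The SINR constraint is met, so the reward equals the secrecy rate.
print(reward(0.5, sinr_lu=2.0, sinr_threshold=1.0, beta=2.0, transmit=True))
# The SINR constraint is violated, so the reward is reduced by beta * shortfall.
print(reward(0.5, sinr_lu=0.8, sinr_threshold=1.0, beta=2.0, transmit=True))
```

Folding the constraint (4b) into the reward as a penalty lets each agent learn to respect the SINR threshold without a separate constraint-handling step.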
Based on the MDP elements defined above, the optimization problem (4) can be reformulated to an RL problem with the aim of maximizing the long-term expected sum reward as
$$ \begin{equation} \max\limits_{\{{\bf{A}}(\ell),{\bf{S}}(\ell)\}_{\ell=0}^{\infty}} \ \mathbb{E}\left\{\sum\limits_{\ell=0}^{\infty}\gamma^{\ell} R(\ell) \right\} \end{equation}\tag{7} $$
where
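For a single finite trajectory, the discounted sum inside the expectation in (7) can be evaluated as in the sketch below. Two assumptions are made for illustration only: the sum reward R(ℓ) is taken to be the sum of the per-agent rewards in slot ℓ, and the infinite horizon is truncated to the length of the recorded trajectory. The function name is hypothetical.

```python
def discounted_return(per_agent_rewards, gamma):
    """Discounted sum of rewards over one finite trajectory.

    per_agent_rewards[l] lists the rewards of all agents in slot l;
    their sum plays the role of R(l) in (7) under the stated assumption.
    """
    total = 0.0
    for slot, rewards in enumerate(per_agent_rewards):
        total += (gamma ** slot) * sum(rewards)
    return total


# Three slots, two agents, discount factor 0.5.
trajectory = [[1.0, 1.0], [2.0, 0.0], [0.0, 4.0]]
print(discounted_return(trajectory, gamma=0.5))
```

Averaging this quantity over many sampled trajectories gives a Monte Carlo estimate of the objective in (7).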
Performance evaluation: To evaluate the proposed method, we consider an underwater network consisting of 4 DAEs and assume that the DAEs, LU, and EVE are within a disk area of a radius of 4 km. We consider that the sound speed in water is
We compare the proposed method with three benchmark methods: 1) Nearest: the DAE closest to the LU transmits signal blocks all the time with the maximal transmission power; 2) SA: a modified version of the signal alignment method with the known location of the EVE [5], adapted to time-slotted systems; 3) DDPG: the transmission strategies of all the DAEs are determined by a single agent with the DDPG algorithm.
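The Nearest benchmark is simple enough to state in a few lines of Python. The sketch below assumes two-dimensional node coordinates in metres and returns one transmit decision and one power level per DAE; the function name, the coordinate values, and the power value are hypothetical.

```python
import math


def nearest_policy(dae_positions, lu_position, max_power):
    """Nearest benchmark: only the DAE closest to the LU transmits, at full power."""
    distances = [math.dist(pos, lu_position) for pos in dae_positions]
    chosen = distances.index(min(distances))
    decisions = [1 if mu == chosen else 0 for mu in range(len(dae_positions))]
    powers = [max_power if d else 0.0 for d in decisions]
    return decisions, powers


# Four DAEs inside a disk of radius 4 km (coordinates are illustrative only).
daes = [(0.0, 0.0), (3000.0, 0.0), (1000.0, 1000.0), (-2000.0, 500.0)]
decisions, powers = nearest_policy(daes, lu_position=(900.0, 800.0), max_power=5.0)
print(decisions, powers)
```

Because the same DAE transmits in every slot, this benchmark creates no collisions at the EVE, which is why it serves as a lower reference for the coordinated methods.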
The average SRs of the different methods, as well as the link capacity of the LU under Nearest, are shown in Fig. 2. The received SNR is the ratio of the received signal power to the noise power. The average SR achieved by Nearest eventually converges as the received SNR increases, whereas the rates achieved by SA, DDPG, and the proposed method grow monotonically with the received SNR. The proposed method outperforms all the other methods, which demonstrates its efficacy. Compared to DDPG, where a single agent learns the actions for all the DAEs, the proposed method achieves higher secrecy performance thanks to cooperative learning with multiple agents. Moreover, the average SRs achieved by the different methods for different total numbers of DAEs are presented in Table 1. The SRs of all the methods increase with the total number of DAEs, as more DAEs offer more degrees of freedom to optimize the secrecy performance. The proposed method achieves the highest SR in every case, which further validates its effectiveness.
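Table 1. Average SRs achieved by different methods for different numbers of DAEs

| Methods | 4 DAEs | 6 DAEs | 8 DAEs |
| --- | --- | --- | --- |
| Nearest | 1.65 | 2.12 | 2.51 |
| SA | 3.01 | 3.23 | 5.06 |
| DDPG | 3.21 | 3.35 | 5.14 |
| Proposed method | 3.73 | 3.99 | 5.75 |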
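The relative gains implied by Table 1 can be checked with a few lines of Python. The snippet only reuses the tabulated values; the dictionary layout and the function name are illustrative.

```python
# Average SRs from Table 1, ordered by number of DAEs (4, 6, 8).
table = {
    "Nearest": [1.65, 2.12, 2.51],
    "SA": [3.01, 3.23, 5.06],
    "DDPG": [3.21, 3.35, 5.14],
    "Proposed method": [3.73, 3.99, 5.75],
}
num_daes = [4, 6, 8]


def relative_gain(baseline):
    """Relative SR gain of the proposed method over `baseline`, per column."""
    proposed = table["Proposed method"]
    return [(p - b) / b for p, b in zip(proposed, table[baseline])]


for name in ("Nearest", "SA", "DDPG"):
    gains = ", ".join(
        f"{n} DAEs: {100 * g:.1f}%" for n, g in zip(num_daes, relative_gain(name))
    )
    print(f"gain over {name}: {gains}")
```

The gain over the single-agent DDPG benchmark is positive in every column, consistent with the discussion of cooperative multi-agent learning above.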
Fig. 3 shows the learning performance of the proposed method in the case that an EVE stays in one location until the
Conclusion: This letter explored an MARL-based method to secure transmissions in underwater DAS against eavesdropping. Considering the large propagation delays of UWA transmissions, the secrecy rate was first derived for practical time-slotted UWANs. Then, the long-term sum secrecy rate maximization problem was studied in the RL paradigm, where each DAE learned its transmission schedule and transmission power online based on the MADDPG algorithm. The simulation results showed the efficacy of the proposed method compared to benchmark methods. In the future, we will extend the proposed method to underwater DAS with multiple LUs and multiple EVEs, where the interference relations are more challenging. In addition, adapting the proposed framework to underwater systems whose nodes, e.g., LUs, EVEs, or even DAEs, can move over time is also an important direction for our future work.
Acknowledgments: This work was supported in part by the National Natural Science Foundation of China (62201248), and the Startup Foundation of the University of South China (200XQD056).
[1] Y. C. Cao, W. W. Yu, W. Ren, and G. R. Chen, "An overview of recent progress in the study of distributed multi-agent coordination," IEEE Trans. Industr. Inform., vol. 9, no. 1, pp. 427-438, Feb. 2013. http://ieeexplore.ieee.org/document/6303906/
[2] L. Ding, Q. L. Han, and G. Guo, "Network-based leader-following consensus for distributed multi-agent systems," Automatica, vol. 49, no. 7, pp. 2281-2286, Jul. 2013. https://www.sciencedirect.com/science/article/pii/S0005109813002331
[3] G. Guo, L. Ding, and Q. L. Han, "A distributed event-triggered transmission strategy for sampled-data consensus of multi-agent systems," Automatica, vol. 50, no. 5, pp. 1489-1496, May 2014. https://www.sciencedirect.com/science/article/pii/S0005109814001095
[4] W. L. He, B. Zhang, Q. L. Han, F. Qian, J. Kurths, and J. D. Cao, "Leader-following consensus of nonlinear multiagent systems with stochastic sampling," IEEE Trans. Cybern., vol. 47, no. 2, pp. 327-338, Feb. 2017. http://ieeexplore.ieee.org/document/7407343/
[5] H. W. Zhang, F. L. Lewis, and Z. H. Qu, "Lyapunov, adaptive, and optimal design techniques for cooperative systems on directed communication graphs," IEEE Trans. Ind. Electron., vol. 59, no. 7, pp. 3026-3041, Jul. 2012. http://ieeexplore.ieee.org/document/5898403/
[6] Z. H. Peng, D. Wang, H. W. Zhang, and G. Sun, "Distributed neural network control for adaptive synchronization of uncertain dynamical multiagent systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 8, pp. 1508-1519, Aug. 2014.
[7] D. M. Yuan and D. W. C. Ho, "Randomized gradient-free method for multiagent optimization over time-varying networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 6, pp. 1342-1347, Jun. 2015. http://ieeexplore.ieee.org/iel7/5962385/7109211/06870494.pdf?arnumber=6870494
[8] K. C. Cao, B. Jiang, and D. Yue, "Distributed consensus of multiple nonholonomic mobile robots," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 2, pp. 162-170, Apr. 2014. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=7004546&openedRefinements%3D*%26filter%3DAND%28AND%28NOT%284283010803%29%29%2CAND%28NOT%284283010803%29%29%29%26pageNumber%3D5%26rowsPerPage%3D100%26queryText%3D%28robots%29
[9] J. Huang, H. Fang, L. H. Dou, and J. Chen, "An overview of distributed high-order multi-agent coordination," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 1, pp. 1-9, Jan. 2014. http://ieeexplore.ieee.org/document/7004613/
[10] X. H. Ge, F. W. Yang, and Q. L. Han, "Distributed networked control systems: a brief overview," Inform. Sciences, vol. 380, pp. 117-131, Feb. 2017.
[11] R. Olfati-Saber, "Flocking for multi-agent dynamic systems: algorithms and theory," IEEE Trans. Automat. Contr., vol. 51, no. 3, pp. 401-420, Mar. 2006. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1605401
[12] J. P. Hu and G. Feng, "Distributed tracking control of leader-follower multi-agent systems under noisy measurement," Automatica, vol. 46, no. 8, pp. 1382-1387, Aug. 2010.
[13] Y. Q. Zhang and Y. G. Hong, "Distributed control design for leader escort of multi-agent systems," Int. J. Control, vol. 88, no. 5, pp. 935-945, May 2015.
[14] Z. Y. Meng, Z. Y. Zhao, and Z. L. Lin, "On global leader-following consensus of identical linear dynamic systems subject to actuator saturation," Syst. Control Lett., vol. 62, no. 2, pp. 132-142, 2013. doi: 10.1016/j.sysconle.2012.10.016
[15] X. L. Liu, B. G. Xu, and L. H. Xie, "Distributed tracking control of second-order multi-agent systems under measurement noises," J. Syst. Sci. Complex., vol. 27, no. 5, pp. 853-865, Oct. 2014.
[16] L. P. Mo, Y. G. Niu, and T. T. Pan, "Consensus of heterogeneous multi-agent systems with switching jointly-connected interconnection," Physica A, vol. 427, pp. 132-140, Jun. 2015. http://www.sciencedirect.com/science/article/pii/S0378437115000898
[17] Z. H. Wu, H. J. Fang, and Y. Y. She, "Weighted average prediction for improving consensus performance of second-order delayed multiagent systems," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 42, no. 5, pp. 1501-1508, Oct. 2012. http://ieeexplore.ieee.org/document/6171867/
[18] H. Kim, H. Shim, and J. H. Seo, "Output consensus of heterogeneous uncertain linear multi-agent systems," IEEE Trans. Automat. Contr., vol. 56, no. 1, pp. 200-206, Jan. 2011. http://ieeexplore.ieee.org/document/5605658/
[19] H. W. Zhang and F. L. Lewis, "Adaptive cooperative tracking control of higher-order nonlinear systems with unknown dynamics," Automatica, vol. 48, no. 7, pp. 1432-1439, Jul. 2012.
[20] S. El-Ferik, A. Qureshi, and F. L. Lewis, "Neuro-adaptive cooperative tracking control of unknown higher-order affine nonlinear systems," Automatica, vol. 50, no. 3, pp. 798-808, Mar. 2014.
[21] W. W. Yu, G. R. Chen, M. Cao, and J. Kurths, "Second-order consensus for multiagent systems with directed topologies and nonlinear dynamics," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 40, no. 3, pp. 881-891, Jun. 2010. http://ieeexplore.ieee.org/document/5313874/
[22] C. L. P. Chen, G. X. Wen, Y. J. Liu, and F. Y. Wang, "Adaptive consensus control for a class of nonlinear multiagent time-delay systems using neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 6, pp. 1217-1226, Jun. 2014.
[23] L. Wang, W. Feng, M. Chen, and Q. Wang, "Global bounded consensus in heterogeneous multi-agent systems with directed communication graph," IET Contr. Theory Appl., vol. 9, no. 1, pp. 147-152, Jan. 2015. http://ieeexplore.ieee.org/xpl/abstractKeywords.jsp?reload=true&arnumber=6987403&sortType%3Dasc_p_Sequence%26filter%3DAND%28p_IS_Number%3A6987394%29
[24] Y. Dong and J. Huang, "Cooperative global output regulation for a class of nonlinear multi-agent systems," IEEE Trans. Automat. Contr., vol. 59, no. 5, pp. 1348-1354, May 2014. http://ieeexplore.ieee.org/document/7403888/
[25] C. R. Wang, J. Sheng, and H. B. Ji, "Leader-following consensus of a class of nonlinear multi-agent systems via dynamic output feedback control," Trans. Inst. Meas. Contr., vol. 37, no. 2, pp. 154-163, Feb. 2015. doi: 10.1177/0142331214535408
[26] W. Wang, J. S. Huang, C. Y. Wen, and H. J. Fan, "Distributed adaptive control for consensus tracking with application to formation control of nonholonomic mobile robots," Automatica, vol. 50, no. 4, pp. 1254-1263, Apr. 2014.
[27] J. M. Peng and X. D. Ye, "Distributed adaptive controller for the output synchronization of networked systems in semi-strict feedback form," J. Franklin Inst., vol. 351, no. 1, pp. 412-428, Jan. 2014.
[28] M. Krstic, P. V. Kokotovic, and I. Kanellakopoulos, Nonlinear and Adaptive Control Design. New York: John Wiley & Sons, Inc., 1995.
[29] D. Swaroop, J. K. Hedrick, P. P. Yip, and J. C. Gerdes, "Dynamic surface control for a class of nonlinear systems," IEEE Trans. Automat. Contr., vol. 45, no. 10, pp. 1893-1899, Oct. 2000.
[30] D. Wang and J. Huang, "Neural network-based adaptive dynamic surface control for a class of uncertain nonlinear systems in strict-feedback form," IEEE Trans. Neural Netw., vol. 16, no. 1, pp. 195-202, Jan. 2005.
[31] S. J. Yoo and J. B. Park, "Decentralized adaptive output-feedback control for a class of nonlinear large-scale systems with unknown time-varying delayed interactions," Inform. Sciences, vol. 186, no. 1, pp. 222-238, Mar. 2012.
[32] S. C. Tong, Y. M. Li, G. Feng, and T. S. Li, "Observer-based adaptive fuzzy backstepping dynamic surface control for a class of MIMO nonlinear systems," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 41, no. 4, pp. 1124-1135, Aug. 2011.
[33] T. S. Li, D. Wang, G. Feng, and S. C. Tong, "A DSC approach to robust adaptive NN tracking control for strict-feedback nonlinear systems," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 40, no. 3, pp. 915-927, Jun. 2010. http://or.nsfc.gov.cn/handle/00001903-5/84730
[34] G. Sun, D. Wang, Z. H. Peng, H. Wang, W. Y. Lan, and M. X. Wang, "Robust adaptive neural control of uncertain pure-feedback nonlinear systems," Int. J. Control, vol. 86, no. 5, pp. 912-922, Apr. 2013.
[35] Y. M. Li, S. C. Tong, and T. S. Li, "Adaptive fuzzy output feedback dynamic surface control of interconnected nonlinear pure-feedback systems," IEEE Trans. Cybern., vol. 45, no. 1, pp. 138-149, Jan. 2015.
[36] Y. Yang, D. Yue, and Y. S. Xue, "Decentralized adaptive neural output feedback control of a class of large-scale time-delay systems with input saturation," J. Franklin Inst., vol. 352, no. 5, pp. 2129-2151, May 2015.
[37] S. J. Yoo, "Distributed consensus tracking for multiple uncertain nonlinear strict-feedback systems under a directed graph," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 4, pp. 666-672, Apr. 2013. http://ieeexplore.ieee.org/document/6415283/
[38] W. Wang, D. Wang, Z. H. Peng, and T. S. Li, "Prescribed performance consensus of uncertain nonlinear strict-feedback systems with unknown control directions," IEEE Trans. Syst. Man Cybern. Syst., vol. 46, no. 9, pp. 1279-1286, Sep. 2016.
[39] Y. Zheng, Y. Zhu, and L. Wang, "Consensus of heterogeneous multiagent systems," IET Contr. Theory Appl., vol. 5, no. 16, pp. 1881-1888, Nov. 2011. http://ieeexplore.ieee.org/iel5/4079545/6042759/06042766.pdf?arnumber=6042766
[40] Y. F. Su and J. Huang, "Cooperative output regulation of linear multiagent systems," IEEE Trans. Automat. Contr., vol. 57, no. 4, pp. 1062-1066, Apr. 2012. http://ieeexplore.ieee.org/document/6026912/
[41] Z. T. Ding, "Consensus output regulation of a class of heterogeneous nonlinear systems," IEEE Trans. Automat. Contr., vol. 58, no. 10, pp. 2648-2653, Oct. 2013. http://ieeexplore.ieee.org/document/6491446/
[42] X. X. Yin, D. Yue, and S. L. Hu, "Distributed event-triggered control of discrete-time heterogeneous multi-agent systems," J. Franklin Inst., vol. 350, no. 3, pp. 651-669, Apr. 2013.
[43] S. B. Li, G. Feng, X. Y. Luo, and X. P. Guan, "Output consensus of heterogeneous linear discrete-time multiagent systems with structural uncertainties," IEEE Trans. Cybern., vol. 45, no. 12, pp. 2868-2879, Dec. 2015. http://ieeexplore.ieee.org/document/7018013/
[44] X. F. Zhang, L. Liu, and G. Feng, "Leader-follower consensus of time-varying nonlinear multi-agent systems," Automatica, vol. 52, pp. 8-14, Feb. 2015. https://www.sciencedirect.com/science/article/pii/S0005109814005159
[45] Z. H. Qu, Cooperative Control of Dynamical Systems: Applications to Autonomous Vehicles. London: Springer-Verlag, 2009.
[46] S. S. Ge, C. C. Hang, T. H. Lee, and T. Zhang, Stable Adaptive Neural Network Control. US: Springer, 2002.
[47] C. Wang, D. J. Hill, S. S. Ge, and G. R. Chen, "An ISS-modular approach for adaptive neural control of pure-feedback systems," Automatica, vol. 42, no. 5, pp. 723-731, May 2006.
[48] A. M. Zou, Z. G. Hou, and M. Tan, "Adaptive control of a class of nonlinear pure-feedback systems using fuzzy backstepping approach," IEEE Trans. Fuzzy Syst., vol. 16, no. 4, pp. 886-897, Aug. 2008.
[49] S. C. Tong, Y. M. Li, and Y. J. Liu, "Adaptive fuzzy output feedback decentralized control of pure-feedback nonlinear large-scale systems," Int. J. Robust Nonlinear Control, vol. 24, no. 5, pp. 930-954, Mar. 2014.
[50] E. Kim and S. Lee, "Output feedback tracking control of MIMO systems using a fuzzy disturbance observer and its application to the speed control of a PM synchronous motor," IEEE Trans. Fuzzy Syst., vol. 13, no. 6, pp. 725-741, Dec. 2005.
[51] C. Kwan and F. L. Lewis, "Robust backstepping control of nonlinear systems using neural networks," IEEE Trans. Syst. Man Cybern. A Syst. Hum., vol. 30, no. 6, pp. 753-766, Nov. 2000.
[52] W. He, Y. H. Chen, and Z. Yin, "Adaptive neural network control of an uncertain robot with full-state constraints," IEEE Trans. Cybern., vol. 46, no. 3, pp. 620-629, Mar. 2016.
[53] W. He, S. Zhang, and S. S. Ge, "Adaptive control of a flexible crane system with the boundary output constraint," IEEE Trans. Ind. Electron., vol. 61, no. 8, pp. 4126-4133, Aug. 2014.
[54] X. M. Zhang, Q. L. Han, and X. H. Yu, "Survey on recent advances in networked control systems," IEEE Trans. Industr. Inform., vol. 12, no. 5, pp. 1740-1752, Oct. 2016.