IEEE/CAA Journal of Automatica Sinica
A journal of the IEEE and the CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation.
Volume 5, Issue 1, Jan. 2018

C. F. Wang, Z. C. Bi, and Y. P. Wan, “Secure underwater distributed antenna systems: A multi-agent reinforcement learning approach,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 7, pp. 1622–1624, Jul. 2023. doi: 10.1109/JAS.2023.123366
Citation: Yang Yang and Dong Yue, "Distributed Tracking Control of a Class of Multi-agent Systems in Non-affine Pure-feedback Form Under a Directed Topology," IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 169-180, Jan. 2018. doi: 10.1109/JAS.2017.7510382

Distributed Tracking Control of a Class of Multi-agent Systems in Non-affine Pure-feedback Form Under a Directed Topology

doi: 10.1109/JAS.2017.7510382
Funds: This work was supported by the National Natural Science Foundation of China (61503194, 61533010, 61374055), the Ph.D. Programs Foundation of the Ministry of Education of China (20110142110036), the Natural Science Foundation of Jiangsu Province (BK20131381, BK20140877), the China Postdoctoral Science Foundation (2015M571788), the Jiangsu Planned Projects for Postdoctoral Research Funds (1402066B), the Foundation of the Key Laboratory of Marine Dynamic Simulation and Control for the Ministry of Transport (DMU) (DMUMSCKLT2016005), the Jiangsu Government Scholarship for Overseas Studies (2017-037), the Key University Natural Science Research Project of Jiangsu Province (17KJA120003), and the Scientific Foundation of Nanjing University of Posts and Telecommunications (NUPTSF) (NY214076).

Abstract
  • In this paper, we consider the consensus tracking problem for a class of networked multi-agent systems (MASs) in non-affine pure-feedback form under a directed topology. A distributed adaptive tracking consensus control scheme is constructed recursively via the backstepping method, graph theory, neural networks (NNs), and the dynamic surface control (DSC) approach. The key advantage of the proposed control strategy is that, through the DSC technique, it avoids the "explosion of complexity" problem that arises as the order of individual agents grows, so the computational burden of the scheme is drastically reduced. Moreover, by employing the NN approximation technique, no prior knowledge of the system parameters or uncertain dynamics of individual agents is required. We then show that, in theory, the designed control policy guarantees that the consensus errors are cooperatively semi-globally uniformly ultimately bounded (CSUUB). Finally, two examples are presented to validate the effectiveness of the proposed control strategy.


  • Dear Editor,

    Underwater distributed antenna systems (DAS) are stationary infrastructures consisting of multiple geographically distributed antenna elements (DAEs) interconnected through high-rate backbone networks [1]. Compared to centralized systems, a DAS can provide a larger coverage area and higher throughput for underwater acoustic (UWA) transmissions. In this work, exploiting the low sound speed in water, a multi-agent reinforcement learning (MARL)-based approach is proposed to secure underwater DAS against eavesdropping at the physical layer. Specifically, the theoretical secrecy rate is first derived for time-slotted UWA networks (UWANs), accounting for the large propagation delays. We then investigate the long-term sum secrecy rate optimization problem under the MARL framework, where each DAE learns its optimal transmission strategy online. Simulation results show that the proposed method achieves higher secrecy performance than competing benchmark methods.

    Typical physical-layer security approaches against eavesdropping in UWANs include: secret key generation based on the randomness of UWA channels [2]; cooperative jamming, where the transmission strategies of friendly jammers are optimized to blind the eavesdropper (EVE) [3]; and secure coordinated multipoint (CoMP) transmissions that enforce time-domain self-interference at the EVE [4], [5]. Regarding secure CoMP transmissions in particular: by coordinating the transmission schedules of multiple DAEs and exploiting the low sound speed in water, the EVE's decoding performance can be significantly degraded by collisions of useful signals, while the signals received by the legitimate user (LU) remain collision-free. However, this type of security mechanism works only with specific transmission protocols and cannot be applied directly to general UWANs. In addition, CoMP transmissions rely heavily on efficient coordination of all the transmitters, whereas sharing information over UWA links incurs substantial overhead and latency. In this work, we study security enhancement against eavesdropping for time-slotted UWANs with CoMP transmissions by taking advantage of the low coordination cost of underwater DAS.

    Reinforcement learning (RL) has been leveraged to secure terrestrial radio networks against eavesdropping at the physical layer [6]–[8]. The basic idea is to let the system learn optimal transmission strategies, e.g., transmitting nodes, transmission power, or beamforming vectors, to maximize the secrecy performance through dynamically interacting with environments. However, due to the nonnegligible transmission latency of UWA transmissions, those methods cannot be directly applied to UWANs. Although RL has been introduced to secure UWANs with privacy-preserving localization [9] and anti-jamming relay design [10], to the best of our knowledge, there is no work that exploits RL to secure UWA transmissions against eavesdropping at the physical layer. Hence, considering the large propagation delays, we propose an MARL-based framework to secure the underwater DAS, where all the DAEs coordinately learn their transmission strategies online to improve the network secrecy performance.

    System model and secrecy rate: We first consider an underwater system in which N DAEs coordinate with each other to transmit signal blocks to an LU while an EVE collects the transmitted signals from the DAEs. As in many existing works, e.g., [5], [6] and [9], we assume that the EVE's location is known a priori to the DAS. The underwater system operates in a time-slotted manner: in each time slot, each DAE decides its transmission strategy, i.e., whether to transmit one signal block to the LU and with what transmission power. Denote $ d_\mu(\ell) \in \{0,1\} $ as the transmission schedule in the $ \ell $th time slot: $ d_\mu(\ell) = 1 $ indicates that the μth DAE is active, while $ d_\mu(\ell)=0 $ implies that it keeps silent. Denote $ p_\mu(\ell) $ as the transmission power of the signal block sent by the μth DAE in the $ \ell $th time slot; if $ d_\mu(\ell)=0 $, then $ p_\mu(\ell)= 0 $. Denote $ g_\mu $ and $ g_\mu^{\rm (e)} $ as the transmission losses from the μth DAE to the LU and the EVE, respectively, which depend on the center frequency of the acoustic signals and the propagation distances of the transmission links.

    Denote $ D_\mu $ and $ D^{\rm (e)}_\mu $ as the propagation delays, measured in time slots, from the μth DAE to the LU and the EVE, respectively. Denote $ {\cal{ I}}(\ell) $ and $ {\cal{ I}}^{\rm (e)}(\ell) $ as the interfering DAE sets, i.e., the sets of DAEs whose transmitted blocks interfere with each other at the LU and the EVE in the $ \ell $th time slot, respectively. An illustration of the sets $ {\cal{ I}}(\ell) $ and $ {\cal{ I}}^{\rm (e)}(\ell) $ is given in Fig. 1. Note that the two sets are unknown until the end of the $ \ell $th time slot.
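Given the per-DAE transmission schedules and the slot-measured propagation delays, the set of blocks arriving at a receiver in a given slot (from which the interfering set is read off when two or more blocks collide) can be computed directly. A minimal sketch; the schedule layout and all names are illustrative, not from the letter:

```python
def arriving_daes(schedule, delays, slot):
    """DAEs whose blocks arrive at a receiver in `slot`.

    schedule[mu][l] == 1 iff DAE mu transmits in slot l;
    delays[mu] is DAE mu's propagation delay in slots.
    A block sent in slot l arrives in slot l + delays[mu], so the
    arrivals in `slot` are the DAEs that transmitted in slot - delays[mu].
    """
    return {mu for mu, d in enumerate(delays)
            if 0 <= slot - d < len(schedule[mu]) and schedule[mu][slot - d] == 1}

# Example: DAE 0 (delay 2 slots) transmits in slot 2, DAE 1 (delay 3 slots)
# in slot 1; both blocks collide at the receiver in slot 4.
schedule = [[0, 0, 1, 0, 0], [0, 1, 0, 0, 0]]
print(arriving_daes(schedule, [2, 3], 4))  # -> {0, 1}
```

Running the same function with the EVE-side delays yields $ {\cal{ I}}^{\rm (e)}(\ell) $ under the same bookkeeping.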

    Figure  1.  An illustration of secure CoMP transmissions against eavesdropping in underwater DAS with large propagation delays, with examples of interfering DAE sets. $ {\cal{ I}}^{\rm (e)}(3)=\{1,2,3\} $ indicates that, in the 3rd time slot, the signal blocks from DAEs 1−3 interfere with each other at the EVE, while $ {\cal{ I}}(4)=\{1,3\} $ indicates that, in the 4th time slot, the signal blocks from DAEs 1 and 3 interfere with each other at the LU. $ \check{{\cal{ I}}}^{\rm (e)}(1,3)=\{1,2\} $ shows that, viewed from the end of the 1st time slot, the blocks from DAEs 1 and 2 will interfere at the EVE in the coming 3rd time slot, while $ \check{{\cal{ I}}}(1,2)=\{2\} $ shows that, viewed from the end of the 1st time slot, the LU will receive only a signal block from DAE 2 in the 2nd time slot.

    If the μth DAE transmits in time slot $ \ell $, its signal block will be received by the LU and the EVE in the $ (\ell{+}D_\mu) $th and the $ (\ell{+}D^{(\rm e)}_\mu) $th time slots, respectively. Hence, the sets of DAEs whose blocks interfere with that of the μth DAE at the LU and the EVE are $ {\cal{ I}}(\ell{+}D_\mu) $ and $ {\cal{ I}}^{\rm (e)}(\ell{+}D^{(\rm e)}_\mu) $, respectively. If $ \nu \in {\cal{ I}}(\ell{+}D_\mu)|_{\nu{\neq}\mu} $, then the νth DAE must have transmitted one signal block in the $ (\ell{+}D_\mu{-}D_\nu) $th time slot. Based on [3], considering single-block processing, the signal-to-interference-and-noise ratio (SINR) of the block sent by the μth DAE in the $ \ell $th time slot, as received at the LU, can be formulated as

    $$ \begin{split} \lambda_\mu(\ell;&{\cal{ I}}(\ell+D_\mu)) :=\\ &\frac{p_\mu(\ell) g_\mu }{\sigma^2_n+\sum_{\nu \in {\cal{ I}}(\ell+D_\mu),\nu \neq \mu} p_{\nu}(\ell+D_\mu-D_\nu)g_\nu} \end{split} $$ (1)

    where $ \lambda_\mu(\ell;{\cal{ I}}(\ell{+}D_\mu)) $ is conditioned on the set $ {\cal{ I}}(\ell{+}D_\mu) $, $ \sigma^2_n $ is the power of the background noise, and $ p_{\nu}(\ell{+}D_\mu{-}D_\nu) $ is the transmission power used by the νth DAE, whose block interferes with that of the μth DAE in the $ (\ell{+}D_\mu) $th time slot. Similarly, the SINR of the same signal block received by the EVE is

    $$ \begin{split} \lambda^{\rm {(e)}}_\mu(\ell;&{\cal{ I}}^{\rm {(e)}}(\ell+D^{\rm {(e)}}_\mu)) :=\\ &\frac{p_\mu(\ell) g^{\rm{(e})}_\mu }{\sigma^2_n+\sum_{\nu \in {\cal{ I}}^{\rm {(e)}}(\ell+D^{\rm {(e)}}_\mu),\nu \neq \mu} p_{\nu}(\ell+D^{\rm {(e)}}_\mu-D^{\rm {(e)}}_\nu)g^{\rm {(e)}}_\nu} \end{split} $$ (2)

    where $ \lambda^{\rm (e)}_\mu(\ell;{\cal{ I}}^{\rm (e)}(\ell{+}D^{\rm (e)}_\mu)) $ is conditioned on the set $ {\cal{ I}}^{\rm (e)}(\ell{+}D^{\rm (e)}_\mu) $.
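Under the delay bookkeeping of (1) and (2), the SINR at either receiver follows the same pattern, differing only in the gains, delays, and interfering set used. A sketch with illustrative names and list-based indexing (our assumptions, not the letter's notation):

```python
def sinr(mu, l, power, gain, delays, interfering, sigma2):
    """SINR per (1)/(2).

    power[nu][t] is DAE nu's tx power in slot t, gain[nu] its transmission
    loss toward the receiver, delays[nu] its propagation delay in slots,
    and `interfering` the set I(l + delays[mu]) at that receiver.
    """
    t_arr = l + delays[mu]  # slot in which DAE mu's block arrives
    interference = sum(power[nu][t_arr - delays[nu]] * gain[nu]
                       for nu in interfering if nu != mu)
    return power[mu][l] * gain[mu] / (sigma2 + interference)

# Two DAEs whose blocks collide: DAE 1's block, sent one slot earlier,
# arrives in the same slot as DAE 0's and acts as interference.
power = [[1.0] * 5, [2.0] * 5]
print(sinr(0, 2, power, [0.5, 0.25], [1, 2], {0, 1}, 0.1))  # -> 0.8333...
```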

    To derive the theoretical secrecy rate (SR), we first assume that the global information of transmission strategy is available. For instance, $ \{d_\mu(\ell)\}_{\ell=0}^\infty $ is assumed to be known in order to obtain the sets $ {\cal{ I}}(\ell) $ and $ {\cal{ I}}^{\rm (e)}(\ell) $. Based on the SINRs defined in (1) and (2), the SR of the transmitted signal block from the μth DAE in the $ \ell $th time slot can be cast as

    $$ \begin{split} C_\mu(\ell) =\;&\frac{1}{2}\Big[ \log\big(1+\lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))\big) \\ &-\log\big(1+ \lambda^{\rm (e)}_\mu(\ell;{\cal{ I}}^{\rm (e)}(\ell+D^{\rm (e)}_\mu))\big)\Big]^+ \end{split} $$ (3)

    where $ [\cdot]^+=\max\{0,\cdot\} $. The SR quantifies how much secure information can be delivered to the LU, taking into account the information leakage to the EVE. To protect the underwater DAS against eavesdropping, by properly determining the transmission strategy of each DAE, including the transmission schedule $ d_\mu(\ell) $ and the transmission power $ p_\mu(\ell) $, an optimization problem maximizing the long-term sum SR of all the DAEs can be cast as

    $$ \begin{align} \max_{\{d_\mu(\ell), p_\mu(\ell):1\le \mu\le N\}_{\ell=0}^{\infty}}& \sum_{\ell=0}^{\infty} \sum_{\mu=1}^{N}C_\mu(\ell) \end{align}\tag{4a} $$
    $$ \qquad\begin{align} \text{s.t.} \qquad \lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))&\ge \Gamma_{\rm th},\;\; \text{if}\; d_\mu(\ell)=1 \end{align}\tag{4b} $$

    where $ \Gamma_{\rm th} $ is a predetermined decoding threshold. Inequality (4b) ensures that the received block at the LU can be successfully decoded. The optimization problem (4) is a mixed-integer nonlinear programming problem combined with sequential decision-making, which in general is difficult to solve. Specifically, without the global information $ \{{\cal{ I}}(\ell),{\cal{ I}}^{\rm (e)}(\ell)\}_{\ell=0}^\infty $, its decision space is notably large, which results in the curse of dimensionality (CoD) and suboptimal solutions. In this work, we adopt RL to solve (4) in a tractable way by learning the transmission strategies of all the DAEs online.
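The per-block SR (3) is simply the clipped difference of the two link capacities computed from the SINRs in (1) and (2). A direct sketch; base-2 logarithms are our assumption, as the letter leaves the log base unspecified:

```python
import math

def secrecy_rate(sinr_lu, sinr_eve):
    """SR per (3): half the positive part of the capacity gap between
    the LU link and the EVE link (log base 2 assumed)."""
    return max(0.0, 0.5 * (math.log2(1.0 + sinr_lu) - math.log2(1.0 + sinr_eve)))

print(secrecy_rate(3.0, 1.0))  # -> 0.5   (log2(4) - log2(2) = 1, halved)
print(secrecy_rate(1.0, 3.0))  # -> 0.0   (clipped by the [.]^+ operator)
```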

    RL reformulation: To reformulate (4) in the RL paradigm, a Markov decision process (MDP) $ \langle{\cal{ A}},{\cal{ S}},T({\bf{S}},{\bf{A}},{\bf{S}}'),R({\bf{S}},{\bf{A}})\rangle $ must be defined where $ {\cal{ A}} $ is the action space, $ {\cal{ S}} $ is the system state space, $ T({\bf{S}},{\bf{A}},{\bf{S}}') $ is the state transition function indicating how the system state evolves from the states $ {\bf{S}} $ to $ {\bf{S}}' $ by performing the action $ {\bf{A}} $, where $ {\bf{S}},{\bf{S}}'\in{\cal{ S}} $ and $ {\bf{A}}\in{\cal{ A}} $, and $ R({\bf{S}},{\bf{A}}) $ is the reward obtained by performing the action $ {\bf{A}} $ under the system state $ {\bf{S}} $. Considering that each DAE acts as an RL agent, the action, state, and reward function are defined as follows.

    1) Action: Denote $ {\bf{a}}_\mu(\ell) $ as the transmission action for the μth DAE in the $ \ell $th time slot. It consists of the transmission schedule and transmission power, i.e., $ {\bf{a}}_\mu(\ell) {:=} \{d_\mu(\ell),p_\mu(\ell)\} $. Denote $ {\bf{A}}(\ell) $ as the action of the DAS in the $ \ell $th time slot which includes actions from all the DAEs, i.e., $ {\bf{A}}(\ell) = \{{\bf{a}}_1(\ell),{\bf{a}}_2(\ell),\ldots,{\bf{a}}_N(\ell)\} $. $ \{{\bf{A}}(\ell)\}_{\ell=0}^\infty $ can now fully describe the optimization variables in (4).

    2) State: As the SR $ C_\mu(\ell) $ with $ d_\mu(\ell)=1 $ can only be calculated in a future time slot, to describe the system status in the current $ \ell $th time slot, we define the intermediate SR observation as

    $$ \begin{split} \check{C}_\mu(\ell) :=\;&\frac{1}{2}\Big[ \log\big(1+\lambda_\mu(\ell;\check{{\cal{ I}}}(\ell,\ell+D_\mu))\big) \\ &-\log\big(1+ \lambda^{\rm (e)}_\mu(\ell;\check{{\cal{ I}}}^{\rm (e)}(\ell,\ell+D^{\rm (e)}_\mu))\big)\Big]^+ \end{split} $$ (5)

    where $ \check{{\cal{ I}}}(\ell_1,\ell_2) $ and $ \check{{\cal{ I}}}^{\rm (e)}(\ell_1,\ell_2) $ are interfering DAE sets indicating which DAEs' blocks, viewed from the end of the $ \ell_1 $th time slot, will interfere at the LU and the EVE in the $ \ell_2 $th time slot, respectively. Note that $ \check{{\cal{ I}}}(\ell_1,\ell_2){\subseteq}{\cal{ I}}(\ell_2) $ and $ \check{{\cal{ I}}}^{\rm (e)}(\ell_1,\ell_2){\subseteq}{\cal{ I}}^{\rm (e)}(\ell_2) $, as shown in Fig. 1. Compared to the SR defined in (3), the intermediate SR describes the secrecy level induced by the previous and current transmission actions only, without taking future actions into account.

    We construct a tuple for the μth DAE as $ {\bf{o}}_\mu(\ell) = \{{\bf{a}}_\mu(\ell),\check{C}_\mu(\ell)\} $ containing the action-observation pair in the $ \ell $th time slot. The system state for the μth DAE can now be defined as ${\bf{s}}_\mu(\ell) {:=} \{{\bf{o}}_\mu(\ell{-}1), {\bf{o}}_\mu(\ell{-}2), \ldots,{\bf{o}}_\mu(\ell{-}K)\}$ where K historical action-observation pairs are included. Denote $ {\bf{S}}(\ell) $ as the system state in the $ \ell $th time slot containing the states of all the DAEs, i.e., $ {\bf{S}}(\ell) = \{{\bf{s}}_1(\ell),{\bf{s}}_2(\ell),\ldots,{\bf{s}}_N(\ell)\} $. The action $ {\bf{A}}(\ell) $ is determined based on the current state $ {\bf{S}}(\ell) $ and then the system state evolves to $ {\bf{S}}(\ell{+}1) $ after observing the intermediate SRs $ \check{C}_\mu(\ell)|_{1\le\mu\le N} $.
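The per-agent state is a sliding window of the K most recent action-observation tuples. A minimal sketch; the zero-padding for the warm-up slots (before K pairs have accumulated) is our assumption:

```python
from collections import deque

class DAEStateBuffer:
    """Sliding window of the K most recent (action, intermediate-SR) pairs."""

    def __init__(self, K):
        self.K = K
        self.pairs = deque(maxlen=K)  # oldest pairs fall off automatically

    def record(self, action, c_check):
        self.pairs.append((action, c_check))

    def state(self):
        # Most recent pair first, zero-padded until K slots have elapsed.
        pad = [((0, 0.0), 0.0)] * (self.K - len(self.pairs))
        return list(reversed(self.pairs)) + pad

buf = DAEStateBuffer(K=3)
buf.record((1, 0.5), 0.2)           # action (d, p) and observed C-check
print(len(buf.state()))              # -> 3 (one real pair, two padded)
```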

    3) Reward: Denote $ r_\mu(\ell) $ as the reward function for the μth DAE in the $ \ell $th time slot after performing the action $ {\bf{a}}_\mu(\ell) $ under the state $ {\bf{s}}_\mu(\ell) $. The reward function is crucial as it guides each agent to learn the mapping from the state to the optimal action while achieving the desired goal. In this work, to pursue the optimization objective (4a) and satisfy the constraint (4b), we consider the reward function as

    $$ \begin{equation} r_\mu(\ell) = C_\mu(\ell) - \beta\big[\Gamma_{\rm th}-\lambda_\mu(\ell;{\cal{ I}}(\ell+D_\mu))\big]^+d_\mu(\ell) \end{equation} $$ (6)

    where $ \beta{>}0 $ is a penalty factor for violation of the constraint (4b). Please note that the reward function (6) contains the SR $ C_\mu(\ell) $ rather than $ \check{C}_\mu(\ell) $. Hence, it cannot be collected immediately by the end of the $ \ell $th time slot and is only available to the system in the future. In other words, due to the large propagation delays, the system receives delayed rewards. The sum reward received by the whole system in the $ \ell $th time slot can be calculated by $ R(\ell) = \sum_{\mu=1}^{N}r_\mu(\ell) $ where $ R(\ell) $ is obviously a function of the action $ {\bf{A}}(\ell) $ and state $ {\bf{S}}(\ell) $.
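The reward (6) credits the realized SR and charges a penalty only when the DAE was active and its LU-side SINR missed the decoding threshold. A direct sketch of the formula:

```python
def reward(sr, sinr_lu, d, gamma_th, beta):
    """Reward per (6): realized SR minus a penalty applied only when the
    DAE was active (d == 1) and the LU SINR fell below the threshold."""
    return sr - beta * max(0.0, gamma_th - sinr_lu) * d

print(reward(1.0, 2.0, 1, 3.0, 0.5))  # -> 0.5 (threshold missed by 1.0)
print(reward(1.0, 4.0, 1, 3.0, 0.5))  # -> 1.0 (constraint (4b) satisfied)
print(reward(1.0, 2.0, 0, 3.0, 0.5))  # -> 1.0 (silent DAE, no penalty)
```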

    Based on the MDP elements defined above, the optimization problem (4) can be reformulated to an RL problem with the aim of maximizing the long-term expected sum reward as

    $$ \begin{equation} \max\limits_{\{{\bf{A}}(\ell),{\bf{S}}(\ell)\}_{\ell=0}^{\infty}} \ \mathbb{E}\left\{\sum\limits_{\ell=0}^{\infty}\gamma^{\ell} R(\ell) \right\} \end{equation} $$ (7)

    where $ \gamma \in (0,1] $ is a discount factor and $ \mathbb{E}(\cdot) $ is the expectation w.r.t. the joint distribution of state visitation and policy. In this work, an MARL method, namely the multi-agent deep deterministic policy gradient (MADDPG) algorithm [11], is exploited to solve the optimization problem (7). In the MADDPG algorithm, each agent has an actor, which learns to generate better actions, and a critic, which evaluates the generated actions of all the agents; both are implemented as deep neural networks.
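The objective in (7) is the standard discounted return over the sum rewards, which the backward recursion $ G_\ell = R(\ell) + \gamma G_{\ell+1} $ evaluates for a finite horizon:

```python
def discounted_return(rewards, gamma):
    """Discounted sum per (7) over a finite reward trace, computed
    backward so each reward is multiplied by gamma once per step."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

print(discounted_return([1.0, 1.0, 1.0], 0.5))  # -> 1.75 (1 + 0.5 + 0.25)
```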

    Performance evaluation: To evaluate the proposed method, we consider an underwater network consisting of 4 DAEs and assume that the DAEs, LU, and EVE lie within a disk of radius 4 km. The sound speed in water is 1500 m/s, the center frequency of the transmitted signal block is 13 kHz, and the duration of each block is 0.1 s. Similar to the actor proposed in [12], the actor network in this study passes the state through a two-layer perceptron (TLP) followed by two ResNet blocks to generate an action. For the critic network, the state and action are first fed into a CNN encoder and a TLP, respectively; the concatenation of the outputs of these two components is then fed into three ResNet blocks to estimate the expected reward of the input state-action pair. The Adam optimizer is used to update both networks.

    We compare the proposed method with three baselines: 1) Nearest: the DAE closest to the LU transmits signal blocks in every time slot at maximum transmission power; 2) SA: a modified version of the signal alignment method with known EVE location [5], adapted to time-slotted systems; 3) DDPG: a single agent determines the transmission strategies of all the DAEs via the DDPG algorithm.

    The average SRs of the different methods, together with the link capacity of the LU under Nearest, are shown in Fig. 2. The received SNR is the ratio of the received signal power to the noise power. The average SR achieved by Nearest saturates as the received SNR increases, while the rates achieved by SA, DDPG, and the proposed method grow monotonically with the received SNR. The proposed method outperforms all the other methods, demonstrating its efficacy. Moreover, compared to DDPG, where a single agent learns the actions for all the DAEs, the proposed method achieves higher secrecy performance thanks to cooperative learning with multiple agents. The average SRs of the different methods for different total numbers of DAEs are presented in Table 1. The SRs of all the methods increase with the number of DAEs, as more DAEs offer higher degrees of freedom for optimizing the secrecy performance. Nevertheless, the proposed method still achieves the highest SRs, which further validates its effectiveness.

    Figure  2.  Comparison of average SRs achieved by different methods.
    Table  1.  Comparison of SRs with different total numbers of DAEs
    Method             4 DAEs    6 DAEs    8 DAEs
    Nearest             1.65      2.12      2.51
    SA                  3.01      3.23      5.06
    DDPG                3.21      3.35      5.14
    Proposed method     3.73      3.99      5.75

    Fig. 3 shows the learning performance of the proposed method in a scenario where the EVE stays at one location until the $ 15\,000 $th time slot and then moves to another location; the curve is the moving average of the rewards over 100 time slots. The system collects negative rewards at the beginning due to violations of the constraint (4b), but the transmission policy improves through learning and the average reward eventually converges to the optimum. Immediately after the EVE moves, the rewards drop significantly since the current transmission policy is outdated with respect to the EVE's new location. However, the system adjusts its policy to the new environment and again reaches an equilibrium, which exhibits the adaptivity of the proposed method.

    Figure  3.  Average reward with the proposed method in the case that an EVE moves to another location in the $15\,000$th time slot.

    Conclusion: This letter explored an MARL-based method to secure transmissions in underwater DAS against eavesdropping. Considering the large propagation delays of UWA transmissions, the secrecy rate was first derived for practical time-slotted UWANs. The long-term sum secrecy rate maximization problem was then studied in the RL paradigm, where each DAE learns its transmission schedule and transmission power online via the MADDPG algorithm. Simulation results showed the efficacy of the proposed method compared with the benchmark methods. In future work, we will extend the proposed method to underwater DAS with multiple LUs and multiple EVEs, where the interference relations are more complex. How to adapt the proposed framework to underwater systems whose nodes, e.g., LUs, EVEs, or even DAEs, move over time is also an important direction for future work.

    Acknowledgments: This work was supported in part by the National Natural Science Foundation of China (62201248), and the Startup Foundation of the University of South China (200XQD056).

  • [1]
    Y. C. Cao, W. W. Yu, W. Ren, and G. R. Chen, "An overview of recent progress in the study of distributed multi-agent coordination, " IEEE Trans. Industr. Inform., vol. 9, no. 1, pp. 427-438, Feb. 2013. http://ieeexplore.ieee.org/document/6303906/
    [2]
    L. Ding, Q. L. Han, and G. Guo, "Network-based leader-following consensus for distributed multi-agent systems, " Automatica, vol. 49, no. 7, pp. 2281-2286, Jul. 2013. https://www.sciencedirect.com/science/article/pii/S0005109813002331
    [3]
    G. Guo, L. Ding, and Q. L. Han, "A distributed event-triggered transmission strategy for sampled-data consensus of multi-agent systems, " Automatica, vol. 50, no. 5, pp. 1489-1496, May 2014. https://www.sciencedirect.com/science/article/pii/S0005109814001095
    [4]
    W. L. He, B. Zhang, Q. L. Han, F. Qian, J. Kurths, and J. D. Cao, "Leader-following consensus of nonlinear multiagent systems with stochastic sampling, " IEEE Trans. Cybern., vol. 47, no. 2, pp. 327-338, Feb. 2017. http://ieeexplore.ieee.org/document/7407343/
    [5]
    H. W. Zhang, F. L. Lewis, and Z. H. Qu, "Lyapunov, adaptive, and optimal design techniques for cooperative systems on directed communication graphs, " IEEE Trans. Ind. Electron., vol. 59, no. 7, pp. 3026-3041, Jul. 2012. http://ieeexplore.ieee.org/document/5898403/
    [6]
    Z. H. Peng, D. Wang, H. W. Zhang, and G. Sun, "Distributed neural network control for adaptive synchronization of uncertain dynamical multiagent systems, " IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 8, pp. 1508-1519, Aug. 2014.
    [7]
    D. M. Yuan and D. W. C. Ho, "Randomized gradient-free method for multiagent optimization over time-varying networks, " IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 6, pp. 1342-1347, Jun. 2015.
    [8]
    K. C. Cao, B. Jiang, and D. Yue, "Distributed consensus of multiple nonholonomic mobile robots, " IEEE/CAA J. Autom. Sinica, vol. 1, no. 2, pp. 162-170, Apr. 2014.
    [9]
    J. Huang, H. Fang, L. H. Dou, and J. Chen, "An overview of distributed high-order multi-agent coordination, " IEEE/CAA J. Autom. Sinica, vol. 1, no. 1, pp. 1-9, Jan. 2014.
    [10]
    X. H. Ge, F. W. Yang, and Q. L. Han, "Distributed networked control systems: a brief overview, " Inform. Sciences, vol. 380, pp. 117-131, Feb. 2017.
    [11]
    R. Olfati-Saber, "Flocking for multi-agent dynamic systems: algorithms and theory, " IEEE Trans. Automat. Contr., vol. 51, no. 3, pp. 401-420, Mar. 2006.
    [12]
    J. P. Hu and G. Feng, "Distributed tracking control of leader-follower multi-agent systems under noisy measurement, " Automatica, vol. 46, no. 8, pp. 1382-1387, Aug. 2010.
    [13]
    Y. Q. Zhang and Y. G. Hong, "Distributed control design for leader escort of multi-agent systems, " Int. J. Control, vol. 88, no. 5, pp. 935-945, May 2015.
    [14]
    Z. Y. Meng, Z. Y. Zhao, and Z. L. Lin, "On global leader-following consensus of identical linear dynamic systems subject to actuator saturation, " Syst. Control Lett., vol. 62, no. 2, pp. 132-142, 2013. doi: 10.1016/j.sysconle.2012.10.016
    [15]
    X. L. Liu, B. G. Xu, and L. H. Xie, "Distributed tracking control of second-order multi-agent systems under measurement noises, " J. Syst. Sci. Complex., vol. 27, no. 5, pp. 853-865, Oct. 2014.
    [16]
    L. P. Mo, Y. G. Niu, and T. T. Pan, "Consensus of heterogeneous multi-agent systems with switching jointly-connected interconnection, " Physica A, vol. 427, pp. 132-140, Jun. 2015. http://www.sciencedirect.com/science/article/pii/S0378437115000898
    [17]
    Z. H. Wu, H. J. Fang, and Y. Y. She, "Weighted average prediction for improving consensus performance of second-order delayed multiagent systems, " IEEE Trans. Syst. Man Cybern. B Cybern., vol. 42, no. 5, pp. 1501-1508, Oct. 2012. http://ieeexplore.ieee.org/document/6171867/
    [18]
    H. Kim, H. Shim, and J. H. Seo, "Output consensus of heterogeneous uncertain linear multi-agent systems, " IEEE Trans. Automat. Contr., vol. 56, no. 1, pp. 200-206, Jan. 2011. http://ieeexplore.ieee.org/document/5605658/
    [19]
    H. W. Zhang and F. L. Lewis, "Adaptive cooperative tracking control of higher-order nonlinear systems with unknown dynamics, " Automatica, vol. 48, no. 7, pp. 1432-1439, Jul. 2012.
    [20]
    S. El-Ferik, A. Qureshi, and F. L. Lewis, "Neuro-adaptive cooperative tracking control of unknown higher-order affine nonlinear systems, " Automatica, vol. 50, no. 3, pp. 798-808, Mar. 2014.
    [21]
    W. W. Yu, G. R. Chen, M. Cao, and J. Kurths, "Second-order consensus for multiagent systems with directed topologies and nonlinear dynamics, " IEEE Trans. Syst. Man Cybern. B Cybern., vol. 40, no. 3, pp. 881-891, Jun. 2010. http://ieeexplore.ieee.org/document/5313874/
    [22]
    C. L. P. Chen, G. X. Wen, Y. J. Liu, and F. Y. Wang, "Adaptive consensus control for a class of nonlinear multiagent time-delay systems using neural networks, " IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 6, pp. 1217-1226, Jun. 2014.
    [23]
    L. Wang, W. Feng, M. Chen, and Q. Wang, "Global bounded consensus in heterogeneous multi-agent systems with directed communication graph, " IET Contr. Theory Appl., vol. 9, no. 1, pp. 147-152, Jan. 2015.
    [24]
    Y. Dong and J. Huang, "Cooperative global output regulation for a class of nonlinear multi-agent systems, " IEEE Trans. Automat. Contr., vol. 59, no. 5, pp. 1348-1354, May 2014. http://ieeexplore.ieee.org/document/7403888/
    [25]
    C. R. Wang, J. Sheng, and H. B. Ji, "Leader-following consensus of a class of nonlinear multi-agent systems via dynamic output feedback control, " Trans. Inst. Meas. Contr., vol. 37, no. 2, pp. 154-163, Feb. 2015. doi: 10.1177/0142331214535408
    [26]
    W. Wang, J. S. Huang, C. Y. Wen, and H. J. Fan, "Distributed adaptive control for consensus tracking with application to formation control of nonholonomic mobile robots, " Automatica, vol. 50, no. 4, pp. 1254-1263, Apr. 2014.
    [27]
    J. M. Peng and X. D. Ye, "Distributed adaptive controller for the outputsynchronization of networked systems in semi-strict feedback form, " J. Franklin Inst., vol. 351, no. 1, pp. 412-428, Jan. 2014.
    [28]
    M. Krstic, I. Kanellakopoulos, and P. V. Kokotovic, Nonlinear and Adaptive Control Design. New York: John Wiley & Sons, Inc., 1995.
    [29] D. Swaroop, J. K. Hedrick, P. P. Yip, and J. C. Gerdes, "Dynamic surface control for a class of nonlinear systems," IEEE Trans. Automat. Contr., vol. 45, no. 10, pp. 1893-1899, Oct. 2000.
    [30] D. Wang and J. Huang, "Neural network-based adaptive dynamic surface control for a class of uncertain nonlinear systems in strict-feedback form," IEEE Trans. Neural Netw., vol. 16, no. 1, pp. 195-202, Jan. 2005.
    [31] S. J. Yoo and J. B. Park, "Decentralized adaptive output-feedback control for a class of nonlinear large-scale systems with unknown time-varying delayed interactions," Inform. Sciences, vol. 186, no. 1, pp. 222-238, Mar. 2012.
    [32] S. C. Tong, Y. M. Li, G. Feng, and T. S. Li, "Observer-based adaptive fuzzy backstepping dynamic surface control for a class of MIMO nonlinear systems," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 41, no. 4, pp. 1124-1135, Aug. 2011.
    [33] T. S. Li, D. Wang, G. Feng, and S. C. Tong, "A DSC approach to robust adaptive NN tracking control for strict-feedback nonlinear systems," IEEE Trans. Syst. Man Cybern. B Cybern., vol. 40, no. 3, pp. 915-927, Jun. 2010.
    [34] G. Sun, D. Wang, Z. H. Peng, H. Wang, W. Y. Lan, and M. X. Wang, "Robust adaptive neural control of uncertain pure-feedback nonlinear systems," Int. J. Control, vol. 86, no. 5, pp. 912-922, Apr. 2013.
    [35] Y. M. Li, S. C. Tong, and T. S. Li, "Adaptive fuzzy output feedback dynamic surface control of interconnected nonlinear pure-feedback systems," IEEE Trans. Cybern., vol. 45, no. 1, pp. 138-149, Jan. 2015.
    [36] Y. Yang, D. Yue, and Y. S. Xue, "Decentralized adaptive neural output feedback control of a class of large-scale time-delay systems with input saturation," J. Franklin Inst., vol. 352, no. 5, pp. 2129-2151, May 2015.
    [37] S. J. Yoo, "Distributed consensus tracking for multiple uncertain nonlinear strict-feedback systems under a directed graph," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 4, pp. 666-672, Apr. 2013.
    [38] W. Wang, D. Wang, Z. H. Peng, and T. S. Li, "Prescribed performance consensus of uncertain nonlinear strict-feedback systems with unknown control directions," IEEE Trans. Syst. Man Cybern. Syst., vol. 46, no. 9, pp. 1279-1286, Sep. 2016.
    [39] Y. Zheng, Y. Zhu, and L. Wang, "Consensus of heterogeneous multiagent systems," IET Contr. Theory Appl., vol. 5, no. 16, pp. 1881-1888, Nov. 2011.
    [40] Y. F. Su and J. Huang, "Cooperative output regulation of linear multiagent systems," IEEE Trans. Automat. Contr., vol. 57, no. 4, pp. 1062-1066, Apr. 2012.
    [41] Z. T. Ding, "Consensus output regulation of a class of heterogeneous nonlinear systems," IEEE Trans. Automat. Contr., vol. 58, no. 10, pp. 2648-2653, Oct. 2013.
    [42] X. X. Yin, D. Yue, and S. L. Hu, "Distributed event-triggered control of discrete-time heterogeneous multi-agent systems," J. Franklin Inst., vol. 350, no. 3, pp. 651-669, Apr. 2013.
    [43] S. B. Li, G. Feng, X. Y. Luo, and X. P. Guan, "Output consensus of heterogeneous linear discrete-time multiagent systems with structural uncertainties," IEEE Trans. Cybern., vol. 45, no. 12, pp. 2868-2879, Dec. 2015.
    [44] X. F. Zhang, L. Liu, and G. Feng, "Leader-follower consensus of time-varying nonlinear multi-agent systems," Automatica, vol. 52, pp. 8-14, Feb. 2015.
    [45] Z. H. Qu, Cooperative Control of Dynamical Systems: Applications to Autonomous Vehicles. London: Springer-Verlag, 2009.
    [46] S. S. Ge, C. C. Hang, T. H. Lee, and T. Zhang, Stable Adaptive Neural Network Control. US: Springer, 2002.
    [47] C. Wang, D. J. Hill, S. S. Ge, and G. R. Chen, "An ISS-modular approach for adaptive neural control of pure-feedback systems," Automatica, vol. 42, no. 5, pp. 723-731, May 2006.
    [48] A. M. Zou, Z. G. Hou, and M. Tan, "Adaptive control of a class of nonlinear pure-feedback systems using fuzzy backstepping approach," IEEE Trans. Fuzzy Syst., vol. 16, no. 4, pp. 886-897, Aug. 2008.
    [49] S. C. Tong, Y. M. Li, and Y. J. Liu, "Adaptive fuzzy output feedback decentralized control of pure-feedback nonlinear large-scale systems," Int. J. Robust Nonlinear Control, vol. 24, no. 5, pp. 930-954, Mar. 2014.
    [50] E. Kim and S. Lee, "Output feedback tracking control of MIMO systems using a fuzzy disturbance observer and its application to the speed control of a PM synchronous motor," IEEE Trans. Fuzzy Syst., vol. 13, no. 6, pp. 725-741, Dec. 2005.
    [51] C. Kwan and F. L. Lewis, "Robust backstepping control of nonlinear systems using neural networks," IEEE Trans. Syst. Man Cybern. A Syst. Hum., vol. 30, no. 6, pp. 753-766, Nov. 2000.
    [52] W. He, Y. H. Chen, and Z. Yin, "Adaptive neural network control of an uncertain robot with full-state constraints," IEEE Trans. Cybern., vol. 46, no. 3, pp. 620-629, Mar. 2016.
    [53] W. He, S. Zhang, and S. S. Ge, "Adaptive control of a flexible crane system with the boundary output constraint," IEEE Trans. Ind. Electron., vol. 61, no. 8, pp. 4126-4133, Aug. 2014.
    [54] X. M. Zhang, Q. L. Han, and X. H. Yu, "Survey on recent advances in networked control systems," IEEE Trans. Industr. Inform., vol. 12, no. 5, pp. 1740-1752, Oct. 2016.