IEEE/CAA Journal of Automatica Sinica

A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation.

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: X. Liang, Q. Wu, Y. Zhou, C. Tan, H. Yin, and C. Sun, “Spiking reinforcement learning enhanced by bioinspired event source of multi-dendrite spiking neuron and dynamic thresholds,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 2, pp. 1–12, Feb. 2025. doi: 10.1109/JAS.2024.124551

Spiking Reinforcement Learning Enhanced by Bioinspired Event Source of Multi-dendrite Spiking Neuron and Dynamic Thresholds

doi: 10.1109/JAS.2024.124551
Funds: This work was supported by the National Natural Science Foundation of China (62236002, 62303009, 62206001, 52305001, 62102387, 62206005), the University Synergy Innovation Program of Anhui Province (GXXT-2022-041), and the China Postdoctoral Science Foundation (2023M740013).
Abstract: Deep reinforcement learning (DRL) achieves success through the representational capabilities of deep neural networks (DNNs). Compared to DNNs, spiking neural networks (SNNs), known for their binary spike-based information processing, exhibit more biological characteristics. However, using SNNs to simulate more biologically realistic neuronal dynamics for optimizing decision-making tasks remains a challenge, one that is directly related to information integration and transmission in SNNs. Inspired by the advanced computational power of dendrites in biological neurons, we propose a multi-dendrite spiking neuron (MDSN) model based on the multi-compartment spiking neuron (MCN) model, expanding the dendrite types from two to multiple and deriving an analytical solution for the somatic membrane potential. We apply the MDSN to deep distributional reinforcement learning to enhance its performance on complex decision-making tasks. The proposed model can effectively and adaptively integrate and transmit meaningful information from different sources. It uses a bioinspired event-enhanced dendrite structure to emphasize features and, by utilizing dynamic membrane-potential thresholds, adaptively maintains the homeostasis of the MDSN. Extensive experiments on Atari games show that the proposed model outperforms some state-of-the-art spiking distributional RL models by a significant margin.
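
The abstract describes dendritic integration and adaptive thresholds only at a high level. Below is a minimal, hypothetical sketch (plain NumPy) of how a multi-dendrite spiking neuron with a dynamic firing threshold could be simulated; the class name, constants, and update rules (leaky dendritic and somatic integration, a threshold that jumps after each spike and relaxes back toward a resting value) are illustrative assumptions, not the authors' MDSN equations.

```python
# Hypothetical sketch: multi-dendrite spiking neuron with a dynamic threshold.
# All names and constants are assumptions made for illustration.
import numpy as np

class MultiDendriteNeuron:
    def __init__(self, n_dendrites, tau_dendrite=5.0, tau_soma=10.0,
                 theta0=1.0, theta_jump=0.5, tau_theta=20.0):
        self.w = np.ones(n_dendrites) / n_dendrites  # dendrite-to-soma weights
        self.d = np.zeros(n_dendrites)               # dendritic potentials
        self.v = 0.0                                 # somatic membrane potential
        self.theta0 = theta0                         # resting threshold
        self.theta = theta0                          # dynamic threshold
        self.alpha_d = np.exp(-1.0 / tau_dendrite)   # dendritic leak per step
        self.alpha_v = np.exp(-1.0 / tau_soma)       # somatic leak per step
        self.alpha_t = np.exp(-1.0 / tau_theta)      # threshold decay per step
        self.theta_jump = theta_jump                 # threshold increase on spike

    def step(self, inputs):
        """Advance one time step; `inputs` holds one input current per dendrite."""
        self.d = self.alpha_d * self.d + np.asarray(inputs, dtype=float)
        self.v = self.alpha_v * self.v + float(self.w @ self.d)
        spike = self.v >= self.theta
        if spike:
            self.v = 0.0                     # reset the soma after a spike
            self.theta += self.theta_jump    # raise the threshold (adaptation)
        # The threshold relaxes back toward its resting value (homeostasis).
        self.theta = self.theta0 + self.alpha_t * (self.theta - self.theta0)
        return int(spike)

# Usage example: drive three dendrites with noisy input and record the spike train.
rng = np.random.default_rng(0)
neuron = MultiDendriteNeuron(n_dendrites=3)
spikes = [neuron.step(rng.uniform(0.0, 0.3, size=3)) for _ in range(200)]
print("spike count:", sum(spikes))
```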

     

