Spiking Reinforcement Learning Enhanced by Bioinspired Event Source of Multi-dendrite Spiking Neuron and Dynamic Thresholds

Xingyue Liang; Qiaoyun Wu; Yun Zhou; Chunyu Tan; Hongfu Yin; Changyin Sun

doi:10.1109/JAS.2024.124551

Volume 12 Issue 3

Mar. 2025

IEEE/CAA Journal of Automatica Sinica



X. Liang, Q. Wu, Y. Zhou, C. Tan, H. Yin, and C. Sun, “Spiking reinforcement learning enhanced by bioinspired event source of multi-dendrite spiking neuron and dynamic thresholds,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 3, pp. 618–629, Mar. 2025. doi: 10.1109/JAS.2024.124551


Funds: This work was supported by the National Natural Science Foundation of China (62236002, 62303009, 62206001, 52305001, 62102387, 62206005), the University Synergy Innovation Program of Anhui Province (GXXT-2022-041), and the China Postdoctoral Science Foundation (2023M740013)


Author Bio:
Xingyue Liang received the master's degree in applied mathematics from the School of Mathematical Sciences, Anhui University, in 2020. She is currently pursuing the Ph.D. degree with the School of Artificial Intelligence, Anhui University. Her research interests include spiking neural networks, spiking reinforcement learning, and reinforcement learning.

Qiaoyun Wu (Member, IEEE) is currently working at Anhui University. She received the Ph.D. degree in computer-aided design from Nanjing University of Aeronautics and Astronautics (NUAA) in 2021. Her research interests include computer vision, robotics, and deep learning.

Yun Zhou received the Ph.D. degree in communication and information systems from the Hefei University of Technology, in 2018. From 2018 to 2020, she worked as a Postdoctoral Researcher in the Electronic Engineering and Information Science Department, University of Science and Technology of China. In 2020, she joined the School of Artificial Intelligence at Anhui University. Her research interests include computer vision and reinforcement learning.

Chunyu Tan (Member, IEEE) received the M.S. degree in mathematics from Wuhan University, in 2015, and the Ph.D. degree in computer science from the University of Macau, Macau, China, in 2021. She is currently a Lecturer with the School of Artificial Intelligence, Anhui University. Her research interests include neuromorphic computing, biomedical signal processing, time-frequency analysis, and e-health.

Hongfu Yin is currently pursuing the master's degree with the School of Artificial Intelligence, Anhui University. His research interests include spiking neural networks and computer vision.

Changyin Sun (Senior Member, IEEE) received the B.S. degree in applied mathematics from the College of Mathematics, Sichuan University, in 1996, and the M.S. and Ph.D. degrees in electrical engineering from Southeast University, in 2001 and 2004, respectively. He is currently a Professor with the School of Artificial Intelligence, Anhui University and the School of Automation, Southeast University. His research interests include intelligent control, brain-like computational intelligence, flight control, optimal theory, and embodied intelligence. Dr. Sun is also an Associate Editor of the IEEE Transactions on Neural Networks and Learning Systems, Neural Processing Letters, and IEEE/CAA Journal of Automatica Sinica.
Corresponding author: Changyin Sun, e-mail: cysun@seu.edu.cn
¹ https://www.ieee-jas.net/fileZDHXBEN/journal/article/file/106d1c91-74c6-4597-aaca-cda32c50a614.pdf
Received Date: 2024-04-06
Revised Date: 2024-04-24
Accepted Date: 2024-05-12

Available Online: 2024-11-15

Abstract


Deep reinforcement learning (DRL) owes its success to the representational capabilities of deep neural networks (DNNs). Compared with DNNs, spiking neural networks (SNNs), which process information as binary spikes, exhibit more biological characteristics. However, using SNNs to simulate more biologically realistic neuronal dynamics for optimizing decision-making tasks remains a challenge, one directly related to how information is integrated and transmitted in SNNs. Inspired by the advanced computational power of dendrites in biological neurons, we propose a multi-dendrite spiking neuron (MDSN) model based on multi-compartment spiking neurons (MCN), expanding the dendrite types from two to multiple and deriving the analytical solution of the somatic membrane potential. We apply the MDSN to deep distributional reinforcement learning to enhance its performance on complex decision-making tasks. The proposed model can effectively and adaptively integrate and transmit meaningful information from different sources. It uses a bioinspired event-enhanced dendrite structure to emphasize features, and it uses dynamic membrane potential thresholds to adaptively maintain the homeostasis of the MDSN. Extensive experiments on Atari games show that the proposed model outperforms some state-of-the-art spiking distributional RL models by a significant margin.
- Deep reinforcement learning
- multi-compartment spiking neurons
- spiking neural network
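
The two mechanisms named in the abstract, several dendritic branches feeding one soma and a firing threshold that adapts to activity, can be made concrete with a small toy model. The sketch below is illustrative only and is not the MDSN formulation from the paper, whose equations are not reproduced on this page. The class name `MultiDendriteNeuron`, the helper `simulate`, the leaky-integrator update rules, and every constant are assumptions chosen for readability.

```python
from typing import List, Sequence


class MultiDendriteNeuron:
    """Toy discrete-time neuron with several dendrites and an adaptive threshold.

    Hypothetical illustration: update rules and constants are assumptions,
    not the MDSN equations from the paper.
    """

    def __init__(
        self,
        n_dendrites: int,
        dendrite_decay: float = 0.5,   # leak of each dendritic potential (assumed)
        soma_decay: float = 0.8,       # leak of the somatic potential (assumed)
        base_threshold: float = 1.0,   # resting firing threshold (assumed)
        threshold_jump: float = 0.5,   # threshold increase after a spike (assumed)
        threshold_decay: float = 0.9,  # relaxation rate toward the baseline (assumed)
    ) -> None:
        self.n_dendrites = n_dendrites
        self.dendrite_decay = dendrite_decay
        self.soma_decay = soma_decay
        self.base_threshold = base_threshold
        self.threshold_jump = threshold_jump
        self.threshold_decay = threshold_decay
        self.dendrites = [0.0] * n_dendrites
        self.soma = 0.0
        self.threshold = base_threshold

    def step(self, inputs: Sequence[float], gains: Sequence[float]) -> int:
        """Advance one time step; return 1 if the neuron spikes, else 0."""
        if len(inputs) != self.n_dendrites or len(gains) != self.n_dendrites:
            raise ValueError("inputs and gains must have one entry per dendrite")

        # Each dendrite leaks and integrates its own input source.
        self.dendrites = [
            self.dendrite_decay * d + x for d, x in zip(self.dendrites, inputs)
        ]

        # The soma leaks and adds the gain-weighted dendritic potentials.
        drive = sum(g * d for g, d in zip(gains, self.dendrites))
        self.soma = self.soma_decay * self.soma + drive

        spike = 1 if self.soma >= self.threshold else 0

        # The threshold relaxes toward its baseline on every step ...
        self.threshold = self.base_threshold + self.threshold_decay * (
            self.threshold - self.base_threshold
        )
        if spike:
            # ... and rises after a spike, which damps sustained firing.
            self.soma = 0.0
            self.threshold += self.threshold_jump
        return spike


def simulate(
    neuron: MultiDendriteNeuron,
    inputs: Sequence[float],
    gains: Sequence[float],
    steps: int,
) -> List[int]:
    """Feed the same input vector for `steps` time steps and collect the spikes."""
    return [neuron.step(inputs, gains) for _ in range(steps)]


if __name__ == "__main__":
    x, g = [0.3, 0.3, 0.3], [1.0, 1.0, 1.0]
    adaptive = MultiDendriteNeuron(n_dendrites=3)
    fixed = MultiDendriteNeuron(n_dendrites=3, threshold_jump=0.0)
    print("spikes with adaptive threshold:", sum(simulate(adaptive, x, g, 20)))
    print("spikes with fixed threshold:   ", sum(simulate(fixed, x, g, 20)))
```

Setting `threshold_jump=0.0` turns the same neuron into a fixed-threshold baseline, so the two runs in the usage example differ only in the homeostatic term. Under constant input the adaptive variant fires less often, which is the qualitative effect a dynamic threshold is meant to provide.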




Highlights

Compared to DNNs, utilizing SNNs to simulate neuronal dynamics for optimizing decision tasks is more biologically plausible.
This paper proposes a multi-dendrite spiking neuron (MDSN) model, expanding dendrite types from two to multiple and deriving the analytical solution of the somatic membrane potential.
Applying the MDSN to deep distributional reinforcement learning enhances its performance in executing complex decision-making tasks. The proposed model can effectively and adaptively integrate and transmit meaningful information from different sources.
Bioinspired dynamic thresholds adaptively regulate the membrane potential of the MDSN for meaningful information transfer. Extensive experiments on Atari games show that the proposed model outperforms some state-of-the-art spiking distributional RL models by a significant margin.
