Citation: | X. Liang, Q. Wu, Y. Zhou, C. Tan, H. Yin, and C. Sun, “Spiking reinforcement learning enhanced by bioinspired event source of multi-dendrite spiking neuron and dynamic thresholds,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 2, pp. 1–12, Feb. 2025. doi: 10.1109/JAS.2024.124551 |
[1] |
C. Sun, W. Liu, and L. Dong, “Reinforcement learning with task decomposition for cooperative multiagent systems,” IEEE Trans. Neural Networks Learn. Syst., vol. 32, no. 5, pp. 2054–2065, May 2021. doi: 10.1109/TNNLS.2020.2996209
|
[2] |
W. Liu, W. Cai, K. Jiang, G. Cheng, Y. Wang, J. Wang, J. Cao, L. Xu, C. Mu, and C. Sun, “XuanCe: A comprehensive and unified deep reinforcement learning library,” arXiv preprint arXiv: 2312.16248, 2023.
|
[3] |
W. Liu, L. Dong, D. Niu, and C. Sun, “Efficient exploration for multi-agent reinforcement learning via transferable successor features,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 9, pp. 1673–1686, Sep. 2022. doi: 10.1109/JAS.2022.105809
|
[4] |
W. Maass, “Networks of spiking neurons: The third generation of neural network models,” Neural Networks, vol. 10, no. 9, pp. 1659–1671, Dec. 1997. doi: 10.1016/S0893-6080(97)00011-7
|
[5] |
G. Tang, N. Kumar, and K. P. Michmizos, “Reinforcement co-learning of deep and spiking neural networks for energy-efficient mapless navigation with neuromorphic hardware,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Las Vegas, USA, 2020, pp. 6090–6097.
|
[6] |
W. Tan, D. Patel, and R. Kozma, “Strategy and benchmark for converting deep Q-networks to event-driven spiking neural networks,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 9816–9824.
|
[7] |
A. Mahadevuni and P. Li, “Navigating mobile robots to target in near shortest time using reinforcement learning with spiking neural networks,” in Proc. Int. Joint Conf. Neural Networks, Anchorage, USA, 2017, pp. 2243–2250.
|
[8] |
Z. Bing, C. Meschede, K. Huang, G. Chen, F. Röhrbein, M. Akl, and A. Knoll, “End to end learning of spiking neural network based on R-STDP for a lane keeping vehicle,” in Proc. IEEE Int. Conf. Robotics and Automation, Brisbane, Australia, 2018, pp. 4725–4732.
|
[9] |
T. L. Zhang and B. Xu, “Research advances and perspectives on spiking neural networks,” Chin. J. Comput., vol. 44, no. 9, pp. 1767–1785, Sep. 2021.
|
[10] |
E. M. Izhikevich, “Which model to use for cortical spiking neurons?,” IEEE Trans. Neural Networks, vol. 15, no. 5, pp. 1063–1070, Sep. 2004. doi: 10.1109/TNN.2004.832719
|
[11] |
W. Gerstner and W. M. Kistler, Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge, UK: Cambridge University Press, 2002.
|
[12] |
D. Zhang, T. Zhang, S. Jia, and B. Xu, “Multi-scale dynamic coding improved spiking actor network for reinforcement learning,” in Proc. AAAI Conf. Artificial Intelligence, 2022, pp. 59–67.
|
[13] |
B. A. Richards, T. P. Lillicrap, P. Beaudoin, Y. Bengio, R. Bogacz, A. Christensen, C. Clopath, R. P. Costa, A. de Berker, S. Ganguli, C. J. Gillon, D. Hafner, A. Kepecs, N. Kriegeskorte, P. Latham, G. W. Lindsay, K. D. Miller, R. Naud, C. C. Pack, P. Poirazi, P. Roelfsema, J. Sacramento, A. Saxe, B. Scellier, A. C. Schapiro, W. Senn, G. Wayne, D. Yamins, F. Zenke, J. Zylberberg, D. Therien, and K. P. Kording, “A deep learning framework for neuroscience,” Nat. Neurosci., vol. 22, no. 11, pp. 1761–1770, Oct. 2019. doi: 10.1038/s41593-019-0520-2
|
[14] |
J. J. Letzkus, B. M. Kampa, and G. J. Stuart, “Learning rules for spike timing-dependent plasticity depend on dendritic synapse location,” J. Neurosci., vol. 26, no. 41, pp. 10420–10429, Oct. 2006. doi: 10.1523/JNEUROSCI.2650-06.2006
|
[15] |
S. L. Smith, I. T. Smith, T. Branco, and M. Häusser, “Dendritic spikes enhance stimulus selectivity in cortical neurons in vivo,” Nature, vol. 503, no. 7474, pp. 115–120, Oct. 2013. doi: 10.1038/nature12600
|
[16] |
A. Gidon, T. A. Zolnik, P. Fidzinski, F. Bolduan, A. Papoutsi, P. Poirazi, M. Holtkamp, I. Vida, and M. E. Larkum, “Dendritic action potentials and computation in human layer 2/3 cortical neurons,” Science, vol. 367, no. 6473, pp. 83–87, Jan. 2020. doi: 10.1126/science.aax6239
|
[17] |
Y. Sun, Y. Zeng, F. Zhao, and Z. Zhao, “Multi-compartment neuron and population encoding improved spiking neural network for deep distributional reinforcement learning,” arXiv preprint arXiv: 2301.07275, 2023.
|
[18] |
J. K. Makara and J. C. Magee, “Variable dendritic integration in hippocampal CA3 pyramidal neurons,” Neuron, vol. 80, no. 6, pp. 1438–1450, Dec. 2013. doi: 10.1016/j.neuron.2013.10.033
|
[19] |
N. S. Desai, L. C. Rutherford, and G. G. Turrigiano, “Plasticity in the intrinsic excitability of cortical pyramidal neurons,” Nat. Neurosci., vol. 2, no. 6, pp. 515–520, Jun. 1999. doi: 10.1038/9165
|
[20] |
W. Zhang and D. J. Linden, “The other side of the engram: Experience-driven changes in neuronal intrinsic excitability,” Nat. Rev. Neurosci., vol. 4, no. 11, pp. 885–900, Nov. 2003. doi: 10.1038/nrn1248
|
[21] |
J. Triesch, “Synergies between intrinsic and synaptic plasticity in individual model neurons,” in Proc. 17th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2004, pp. 1417–1424.
|
[22] |
C. Li and Y. Li, “A review on synergistic learning,” IEEE Access, vol. 4, pp. 119–134, Jan. 2016. doi: 10.1109/ACCESS.2015.2509005
|
[23] |
R. Azouz and C. M. Gray, “Dynamic spike threshold reveals a mechanism for synaptic coincidence detection in cortical neurons in vivo,” Proc. Natl. Acad. Sci., vol. 97, no. 14, pp. 8110–8115, Jun. 2000. doi: 10.1073/pnas.130200797
|
[24] |
J. Ding, B. Dong, F. Heide, Y. Ding, Y. Zhou, B. Yin, and X. Yang, “Biologically inspired dynamic thresholds for spiking neural networks,” in Proc. 36th Int. Conf. Neural Information Processing Systems, New Orleans, USA, 2022, p. 441.
|
[25] |
M. Pagkalos, S. Chavlis, and P. Poirazi, “Introducing the dendrify framework for incorporating dendrites to spiking neural networks,” Nat. Commun., vol. 14, no. 1, p. 131, Jan. 2023. doi: 10.1038/s41467-022-35747-8
|
[26] |
D. Yang, L. Zhao, Z. Lin, T. Qin, J. Bian, and T. Liu, “Fully parameterized quantile function for distributional reinforcement learning,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, p. 556.
|
[27] |
D. Patel, H. Hazan, D. J. Saunders, H. T. Siegelmann, and R. Kozma, “Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to Atari breakout game,” Neural Networks, vol. 120, pp. 108–115, Dec. 2019. doi: 10.1016/j.neunet.2019.08.009
|
[28] |
G. Tang, N. Kumar, R. Yoo, and K. P. Michmizos, “Deep reinforcement learning with population-coded spiking neural network for continuous control,” in Proc. 4th Conf. Robot Learning, Cambridge, USA, 2020, pp. 2016–2029.
|
[29] |
Y. Wu, L. Deng, G. Li, J. Zhu, and L. Shi, “Spatio-temporal backpropagation for training high-performance spiking neural networks,” Front. Neurosci., vol. 12, p. 331, May 2018. doi: 10.3389/fnins.2018.00331
|
[30] |
D. Chen, P. Peng, T. Huang, and Y. Tian, “Deep reinforcement learning with spiking Q-learning,” arXiv preprint arXiv: 2201.09754, 2022.
|
[31] |
Y. Sun, Y. Zeng, and Y. Li, “Solving the spike feature information vanishing problem in spiking deep Q network with potential based normalization,” Front. Neurosci., vol. 16, p. 953368, Aug. 2022. doi: 10.3389/fnins.2022.953368
|
[32] |
S. Zhou, X. Li, Y. Chen, S. T. Chandrasekaran, and A. Sanyal, “Temporal-coded deep spiking neural network with easy training and robust performance,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 11143–11151.
|
[33] |
X. Zhou, Z. Song, X. Wu, and R. Yan, “A spiking deep convolutional neural network based on efficient spike timing dependent plasticity,” in Proc. 3rd Int. Conf. Artificial Intelligence and Big Data, Chengdu, China, 2020, pp. 39–45.
|
[34] |
R. Brette and W. Gerstner, “Adaptive exponential integrate-and-fire model as an effective description of neuronal activity,” J. Neurophysiol., vol. 94, no. 5, pp. 3637–3642, Nov. 2005. doi: 10.1152/jn.00686.2005
|
[35] |
M. Sung and Y. Kim, “Training spiking neural networks with an adaptive leaky integrate-and-fire neuron,” in Proc. IEEE Int. Conf. Consumer Electronics-Asia, Seoul, Korea (South), 2020, pp. 1–2.
|
[36] |
T. Kim, S. Hu, J. Kim, J. Y. Kwak, J. Park, S. Lee, I. Kim, J.-K. Park, and Y. Jeong, “Spiking neural network (SNN) with memristor synapses having non-linear weight update,” Front. Comput. Neurosci., vol. 15, p. 646125, Mar. 2021. doi: 10.3389/fncom.2021.646125
|
[37] |
W. Fang, Z. Yu, Y. Chen, T. Masquelier, T. Huang, and Y. Tian, “Incorporating learnable membrane time constant to enhance learning of spiking neural networks,” in Proc. IEEE/CVF Int. Conf. Computer Vision, Montreal, Canada, 2021, pp. 2641–2651.
|
[38] |
Y. Hao, X. Huang, M. Dong, and B. Xu, “A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule,” Neural Networks, vol. 121, pp. 387–395, Jan. 2020. doi: 10.1016/j.neunet.2019.09.007
|
JAS-2024-0505-supp.pdf |