An Interpretable Temporal Convolutional Framework for Granger Causality Analysis

Aoxiang Dong; Andrew Starr; Yifan Zhao

doi:10.1109/JAS.2025.125396

IEEE/CAA Journal of Automatica Sinica

2025, In Press, Accepted Manuscript

A. Dong, A. Starr, and Y. Zhao, “An interpretable temporal convolutional framework for Granger causality analysis,” IEEE/CAA J. Autom. Sinica, 2025. doi: 10.1109/JAS.2025.125396

Abstract

Most existing parametric approaches for detecting linear or nonlinear Granger causality (GC) struggle to estimate appropriate time delays, a critical factor for accurate GC detection. The problem is particularly pronounced in nonlinear complex systems, which are often opaque and consist of numerous components or variables. In this paper, we propose a novel temporal convolutional network (TCN)-based end-to-end GC detection approach called the Interpretable Temporal Convolutional Framework (ITCF). Unlike conventional deep learning models, which act as a “black box” and make the interactions between variables difficult to analyse, the proposed ITCF detects both linear and nonlinear GC and automatically estimates time delays during multivariate time series prediction. Specifically, GC is obtained by employing Least Absolute Shrinkage and Selection Operator (Lasso) regression during the prediction of multivariate time series with the TCN. Time delays are then estimated by interpreting the TCN kernels. We propose a convolutional Hierarchical Group Lasso (cHGL), a hierarchical regularisation approach that exploits the temporal information within each TCN channel to enhance GC detection. Additionally, to the best of our knowledge, this paper is the first to integrate the Iterative Soft-Thresholding Algorithm into the backpropagation of a TCN to optimise the proposed cHGL, which enables causal channel selection and induces sparsity within each TCN channel to remove redundant temporal information, ultimately yielding an end-to-end GC detection framework. Results from four experiments, two on simulated data and two on real data, demonstrate that, compared with state-of-the-art methods, the proposed ITCF offers a more reliable estimation of GC relationships in complex systems featuring intricate dynamics, limited data lengths, or numerous variables.
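
The abstract does not give the exact form of the cHGL penalty or of its optimiser, so the following is only a minimal sketch of the general idea it describes: a gradient step followed by a proximal (soft-thresholding) step over nested groups of kernel weights, after which channels whose weights are exactly zero are read as non-causal. The array layout (one row of `K` first-layer kernel weights per input series, with index 0 as the most distant lag), the nested-group structure, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def group_soft_threshold(v, thresh):
    """Shrink the vector v towards zero by `thresh` in Euclidean norm."""
    norm = np.linalg.norm(v)
    if norm <= thresh:
        return np.zeros_like(v)
    return (1.0 - thresh / norm) * v


def hierarchical_prox(w, thresh):
    """Proximal step for nested groups over one input channel's kernel.

    Assumption: w[0] is the most distant lag. The groups are
    w[:1], w[:2], ..., w[:K], processed from smallest to largest, so
    distant lags are shrunk more often than recent ones.
    """
    w = np.asarray(w, dtype=float).copy()
    for end in range(1, w.size + 1):
        w[:end] = group_soft_threshold(w[:end], thresh)
    return w


def ista_step(W, grad, lr, lam):
    """One proximal-gradient (ISTA-style) update for kernels W of shape (n_inputs, K)."""
    W_new = W - lr * grad  # plain gradient step on the prediction loss
    return np.stack([hierarchical_prox(row, lr * lam) for row in W_new])


def causal_inputs(W, tol=1e-12):
    """Indices of input series whose kernel was not zeroed out entirely."""
    return [j for j in range(W.shape[0]) if np.linalg.norm(W[j]) > tol]


def active_lags(w, tol=1e-12):
    """Lags with non-zero weight, assuming the last kernel position is lag 1."""
    K = len(w)
    return [K - k for k in range(K) if abs(w[k]) > tol]
```

In this sketch, a row of small weights is driven exactly to zero, which corresponds to dropping that input series as a candidate cause, while a row with large weights survives with only its distant, near-zero lags removed. The surviving non-zero kernel positions are then what one would inspect to read off a time delay.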
