A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: P. Huang and X. Luo, “FDTs: A feature disentangled transformer for interpretable squamous cell carcinoma grading,” IEEE/CAA J. Autom. Sinica, 2024.

FDTs: A Feature Disentangled Transformer for Interpretable Squamous Cell Carcinoma Grading




