IEEE/CAA Journal of Automatica Sinica
Citation: L. Hu, S. C. Yang, X. Luo, H. Q. Yuan, K. Sedraoui, and M. C. Zhou, “A distributed framework for large-scale protein-protein interaction data analysis and prediction using MapReduce,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 160–172, Jan. 2022. doi: 10.1109/JAS.2021.1004198