2019, 6(6): 1384-1396.
doi: 10.1109/JAS.2019.1911756
Abstract:
Ribonucleic acid (RNA) hybridization is widely used in popular RNA simulation software in bioinformatics. However, because of the exponential computational complexity of the underlying combinatorial problems, it is challenging to decide within an acceptable time whether a specific RNA hybridization is effective. We introduce a machine learning (ML) based technique to address this problem. The ML algorithms evaluated in the training phase are based on the boosted tree (BT), random forest (RF), decision tree (DT) and logistic regression (LR), and a trained model is obtained for each. Given RNA molecular coding training and testing sets, the trained models are applied to predict the classification of RNA hybridization results. The experimental results show that, under the strong constraint condition, the optimal predictive accuracies are 96.2%, 96.6%, 96.0% and 69.8% for the RF, BT, DT and LR-based approaches, respectively, when compared with traditional representative methods. Furthermore, the average computational efficiency of the RF, BT, DT and LR-based approaches is 208 679, 269 756, 184 333 and 187 458 times higher, respectively, than that of the existing approach. Given an RNA design, the BT-based approach demonstrates high computational efficiency and better predictive accuracy in determining the biological effectiveness of molecular hybridization.
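The train-then-classify workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pairing-based encoding, the synthetic labelling rule, and every function name below are assumptions made for illustration, and only a plain logistic regression (standing in for the LR baseline) is trained, using the Python standard library alone.

```python
import math
import random

# Watson-Crick pairs plus the G-U wobble pair (standard RNA base pairing).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def encode(strand_a, strand_b):
    """Hypothetical encoding (not from the paper): 1.0 where a base of
    strand_a can pair with the antiparallel base of strand_b, else 0.0."""
    return [1.0 if (a, b) in PAIRS else 0.0
            for a, b in zip(strand_a, reversed(strand_b))]


def make_dataset(n, length, min_pairs, rng):
    """Synthetic data only: label 1 when at least min_pairs positions pair."""
    data = []
    for _ in range(n):
        a = "".join(rng.choice("ACGU") for _ in range(length))
        b = "".join(rng.choice("ACGU") for _ in range(length))
        x = encode(a, b)
        data.append((x, 1 if sum(x) >= min_pairs else 0))
    return data


def predict_proba(model, x):
    """Logistic model: sigmoid of the weighted feature sum plus bias."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=200, lr=0.1):
    """Fit logistic regression by stochastic gradient descent on log-loss."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            err = predict_proba((w, b), x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def accuracy(model, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum((predict_proba(model, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = make_dataset(400, 10, 4, rng)
test_set = make_dataset(100, 10, 4, rng)
model = train(train_set)
print(f"held-out accuracy: {accuracy(model, test_set):.3f}")
```

The accuracy printed here reflects only the toy labelling rule and says nothing about the figures reported in the abstract. The tree-based models (BT, RF, DT) would be trained on the same kind of encoded feature vectors and compared on the same held-out set.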
Weijun Zhu, Xiaokai Liu, Mingliang Xu and Huanmei Wu, "Predicting the Results of RNA Molecular Specific Hybridization Using Machine Learning," IEEE/CAA J. Autom. Sinica, vol. 6, no. 6, pp. 1384-1396, Nov. 2019. doi: 10.1109/JAS.2019.1911756.