
IEEE/CAA Journal of Automatica Sinica
Citation: Xiaojun Wang, "Ladle Furnace Temperature Prediction Model Based on Large-scale Data With Random Forest," IEEE/CAA J. Autom. Sinica, vol. 4, no. 4, pp. 770-774, Oct. 2017. doi: 10.1109/JAS.2016.7510247
THE main purpose of ladle furnace treatment is to ensure that the liquid steel has the required temperature when the ladle is taken over by downstream secondary metallurgy units or by a continuous caster. In practice, the liquid steel temperature cannot be measured continuously, which makes precise control difficult. A model for predicting the liquid steel temperature in the ladle furnace has therefore long been an active research topic [1]-[4].
Mechanism models of the liquid steel temperature in the ladle furnace were proposed in earlier works [5]. However, since their parameters are hard to obtain, mechanism models cannot deliver precise online prediction. Moreover, the operating environment of ladle metallurgy is harsh, notably the high temperatures and corrosive slag associated with the process, so some of the parameters must be estimated from experience. It is thus hard to guarantee the precision of mechanism models.
To improve on the precision of mechanism models, intelligent methods have been applied to predicting the liquid steel temperature in the ladle furnace. Sun et al. [6] use a neural network to estimate the liquid steel temperature. Ensemble models can further improve the precision of a single predictor [7], and in recent years ensemble modeling methods have been used to establish liquid steel temperature models.
In [1], Tian and Mao propose a hybrid artificial intelligence technique, an ensemble of extreme learning machines, to develop a liquid steel temperature model, and demonstrate that the ensemble improves precision. In [3], Lv et al. propose another ensemble temperature model in which a pruned Bagging method based on negative correlation learning is applied to predict the unknown parameters and, simultaneously, the undefined function of the liquid steel temperature model.
At present, the precision of most existing temperature models cannot meet the requirements of industrial production, because these models are built on small sample sets. Fortunately, a large sample set for building the liquid steel temperature model has accumulated from the practical production process. The large sample set contains more effective information and thus offers the possibility of boosting the precision of the temperature prediction.
However, the large sample set poses difficulties for these ensemble modeling methods. In [1], the extreme learning machine [8] is selected as the learning algorithm of the sub-models and the modified AdaBoost.RT is employed as the ensemble structure. The learning speed of an extreme learning machine is extremely fast, but the modified AdaBoost.RT is a serial ensemble in which each new sub-model depends on the previously built ones. In [3], the sub-models are pruned one after another, turning Bagging from a parallel ensemble into a serial one. In general, a serial ensemble model is more complex than a single predictor [9].
In this paper, the random forest [10] method is chosen to build the liquid steel temperature prediction model on a large sample set. In the random forest, Bagging [11] is used in tandem with random feature selection. The random forest is a powerful regression method with low complexity that can be trained very quickly: it has a parallel ensemble structure, uses sample subsets, and employs a simple learning algorithm for the sub-models, namely the regression tree [12]. The random forest is expected to fully exploit the large sample set accumulated from the ladle furnace system, improve the precision, and meet the requirements on the root mean square error (RMSE) and the maximum error of the temperature prediction.
The remainder of this paper is organized as follows. In Section Ⅱ, the thermal model of the ladle furnace is given to provide a priori knowledge for the random forest temperature model. In Section Ⅲ, the random forest method is presented. In Section Ⅳ, the experimental investigations are presented, and the random forest temperature model is compared with other ensemble temperature models. In Section Ⅴ, the conclusions are summarized.
The thermal model of the ladle furnace can be divided into the following parts: the heat gain due to arcing, the heat effects of additions, the heat loss through the refractory wall, and the heat loss from the top surface [4], [5].
"The heat ladle furnace gained is from arcing. The submerged arc operation ensures the heat gain of liquid steel. The following expression is employed to describe the change in temperature due to arcing.
$$\Delta T_{\mathrm{arc}} = \eta E \tag{1}$$
where $\eta$ is the thermal efficiency coefficient of arcing and $E$ is the total refining power consumption.
The temperature drop due to heat loss during the heating of the refractory can be described by a one-dimensional transient heat transfer equation for the side and bottom walls. It is calculated with the following assumptions and expression.
Initial condition: $t = 0$, $T_{\mathrm{Wall}} = T_{\mathrm{Preht}}$.

Boundary condition: $t = t$, $T_{\mathrm{Wall}} = T_{\mathrm{LiqSt}}$.

$$\rho C_P \frac{dT_{\mathrm{Wall}}}{dt} = h\,(T_{\mathrm{Wall}} - T_{\infty}) \tag{2}$$
where $T_{\mathrm{Preht}}$ is the temperature of the preheated refractory, $T_{\mathrm{LiqSt}}$ is the temperature of the liquid steel, $T_{\infty}$ is the temperature of the plant environment, $\rho$ is the density of steel (kg/m³), $C_P$ is the specific heat of steel (J/(kg·K)), and $h$ is the heat transfer coefficient (J/(m²·s)). $\Delta T_{\mathrm{Wall}}$ owing to the refractory heat loss can then be calculated.
The temperature change of the liquid steel owing to additions is given by the following expression:
$$\Delta T_{\mathrm{add}} = \sum_{i} W_i Q_i \tag{3}$$
where $i$ designates a specific addition (metal alloy or slag), $W_i$ is the weight of addition $i$ (kg), and $Q_i$ is the temperature effect parameter of addition $i$ (℃/kg).
The heat loss from the top surface is mainly due to radiation from the liquid slag surface and the furnace cover. The radiation loss of the liquid slag surface depends on the slag temperature, the quantity of slag, and the surface area. The quantity of liquid slag is difficult to obtain exactly, and the temperature of the liquid steel (which replaces the slag temperature in the mechanism model) changes continually during the ladle furnace metallurgy process. It is therefore hard to calculate the heat loss from the top surface exactly with the traditional mechanism model. In this paper, the heat loss is obtained by calculating the energy change of the cooling water:
$$\Delta T_{\mathrm{Surf}} = \frac{f_{\mathrm{Surf}}\, C_w Q_w \Delta T_w}{C_P W} \tag{4}$$
where $f_{\mathrm{Surf}}$ is the heat loss coefficient of the slag surface and furnace cover, $C_w$ is the specific heat of the cooling water (J/(kg·K)), $Q_w$ is the flux of the cooling water (L/s), $\Delta T_w$ is the temperature change of the cooling water, and $W$ is the weight of the liquid steel.
The complete thermal model for the prediction of temperature in the ladle furnace can be described as the total effect of the above factors" [4]:
$$\Delta T = \Delta T_{\mathrm{arc}} + \Delta T_{\mathrm{Wall}} + \Delta T_{\mathrm{add}} + \Delta T_{\mathrm{Surf}}. \tag{5}$$
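Taken together, (1)-(5) reduce to simple arithmetic once the component terms are known. The following Python sketch is illustrative only (the paper reports no implementation): the wall term is obtained by a forward-Euler integration of (2), with a damping sign assumed so that the wall temperature relaxes toward the environment rather than diverging, and all parameter values would come from plant measurements.

```python
def delta_t_wall(t_preheat, t_inf, rho, c_p, h, dt=1.0, n_steps=3600):
    """Integrate the lumped wall equation (2) with forward Euler and return
    the accumulated wall temperature change (illustrative sign convention)."""
    t_wall = t_preheat
    for _ in range(n_steps):
        # heat flows from the hot wall to the cooler plant environment
        t_wall += -h * (t_wall - t_inf) / (rho * c_p) * dt
    return t_wall - t_preheat

def predict_delta_t(eta, E, dT_wall, add_weights, add_effects,
                    f_surf, C_w, Q_w, dT_w, C_P, W):
    """Total temperature change of the liquid steel, Eq. (5)."""
    dT_arc = eta * E                                               # Eq. (1)
    dT_add = sum(w * q for w, q in zip(add_weights, add_effects))  # Eq. (3)
    dT_surf = f_surf * C_w * Q_w * dT_w / (C_P * W)                # Eq. (4)
    return dT_arc + dT_wall + dT_add + dT_surf                     # Eq. (5)
```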
A combination of many simple but diverse predictors can reduce complexity and achieve performance that an individual model cannot [10], [13]. Employing the regression tree as the learning algorithm of its sub-models, the random forest is a modified version of Bagging [10] in which Bagging [11] is used in tandem with random feature selection: each tree is grown on a bootstrap replicate of the training set, with a random subset of features considered at each node to determine the split. The random forest generally outperforms plain Bagging, and by the law of large numbers random forests do not overfit [10].
The random forest reduces complexity in three respects (a minimal sketch in Python follows the list):
1) It has a parallel ensemble structure. A parallel ensemble uses a parallel computing procedure, and the construction of a sub-model on each sample subset proceeds with no communication needed from the other CPUs [13].
2) Sub-models are built on sample subsets whose training-sample sizes are dramatically reduced. The procedure for generating the bootstrap replications, i.e., the sample subsets, is straightforward, simple, and quick, which keeps the complexity low.
3) For its simplicity, the regression tree is employed as the learning algorithm of the sub-models.
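The sketch below makes these three points concrete as a minimal random forest regressor in Python built from scikit-learn regression trees; it is an illustration, not the authors' implementation (the paper's experiments were run in MATLAB). `Parallel` realizes the parallel ensemble structure, the bootstrap indices realize the sample subsets, and `max_features` realizes the random feature selection at each split.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.tree import DecisionTreeRegressor

def fit_one_tree(X, y, n_split_features, seed):
    """Grow one regression tree on a bootstrap replicate of (X, y)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))        # bootstrap sample subset
    tree = DecisionTreeRegressor(max_features=n_split_features,  # random feature
                                 random_state=seed)              # selection per split
    return tree.fit(X[idx], y[idx])

def fit_forest(X, y, n_trees=450, n_split_features=2):
    # Parallel ensemble: trees are grown independently, with no
    # communication between workers.
    return Parallel(n_jobs=-1)(
        delayed(fit_one_tree)(X, y, n_split_features, s) for s in range(n_trees))

def predict_forest(trees, X):
    # Simple average of the sub-models; the generalized ensemble weights
    # derived below can replace the uniform average.
    return np.mean([t.predict(X) for t in trees], axis=0)
```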
Thus, we focus on building a random forest temperature prediction model.
To further improve the precision of the temperature prediction, the generalized ensemble method is adopted instead of the simple average; it is defined using the terminology of Perrone and Cooper [14]. The detailed calculation is also shown in [15].
"Consider the ensemble model that consists of P sub-models takes the form:
$$f_E(X) = \sum_{i=1}^{P} \beta_i f_i(X) = g(X) + \sum_{i=1}^{P} \beta_i \varepsilon_i(X) \tag{6}$$
where $\beta_i$ is the weight of the $i$th sub-model, with $\sum_{i=1}^{P} \beta_i = 1$; $g(X)$ is the true function to be estimated; $\varepsilon_i(X) = f_i(X) - g(X)$ is the error of the $i$th sub-model; and $f_i(X)$ is the sub-model.
One considers the $P \times P$ covariance matrix $\Gamma$ with entries as follows:
$$\Gamma_{ij} = E[\varepsilon_i(X)\,\varepsilon_j(X)] \tag{7}$$
where in practice one works with a finite-sample approximation
$$\Gamma_{ij} = \frac{1}{N} \sum_{k=1}^{N} \left[f_i(X_k) - Y_k\right]\left[f_j(X_k) - Y_k\right] \tag{8}$$
and $N$ is the number of training samples. The ensemble error equals:
$$J_E = E\left[\{f_E(X) - g(X)\}^2\right] = E\left[\Big(\sum_{i=1}^{P}\beta_i\varepsilon_i\Big)\Big(\sum_{j=1}^{P}\beta_j\varepsilon_j\Big)\right] \approx \sum_{i=1}^{P}\sum_{j=1}^{P}\beta_i\beta_j\Gamma_{ij} = \beta^{T}\Gamma\beta. \tag{9}$$
An optimal choice of $\beta$ then follows from:
$$\min_{\beta}\ \frac{1}{2}\beta^{T}\Gamma\beta \quad \text{s.t.} \quad \sum_{i=1}^{P}\beta_i = 1. \tag{10}$$
Then, with Lagrange multiplier $\lambda$,
$$\ell(\beta, \lambda) = \frac{1}{2}\beta^{T}\Gamma\beta - \lambda\Big(\Big(\sum_{i=1}^{P}\beta_i\Big) - 1\Big) \tag{12}$$
one obtains the conditions for optimality:
$$\begin{cases} \dfrac{\partial \ell}{\partial \beta} = \Gamma\beta - \lambda 1_v = 0 \\[4pt] \dfrac{\partial \ell}{\partial \lambda} = 1_v^{T}\beta - 1 = 0 \end{cases} \tag{13}$$
with optimal solution
$$\beta = \frac{\Gamma^{-1} 1_v}{1_v^{T}\,\Gamma^{-1}\,1_v} \tag{14}$$

with $1_v = [1, \ldots, 1]^{T}$" [15].
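Equations (8) and (14) translate directly into NumPy. The sketch below is an illustration rather than the authors' code: it estimates $\Gamma$ from the sub-models' training errors and solves for the optimal weights; if $\Gamma$ is ill-conditioned in practice, a small ridge term can be added to its diagonal.

```python
import numpy as np

def ensemble_weights(preds, y):
    """preds: (P, N) sub-model predictions on the training set; y: (N,) targets."""
    errors = preds - y                         # eps_i(X_k) = f_i(X_k) - Y_k
    gamma = errors @ errors.T / y.size         # Eq. (8): finite-sample covariance
    ones = np.ones(preds.shape[0])
    g_inv_ones = np.linalg.solve(gamma, ones)  # Gamma^{-1} 1_v, no explicit inverse
    return g_inv_ones / (ones @ g_inv_ones)    # Eq. (14): weights sum to 1

def ensemble_predict(preds, beta):
    """Eq. (6): weighted combination of the sub-model predictions."""
    return beta @ preds
```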
1714 production samples are used. There are 10 main factors that affect the temperature: the weight of the liquid steel, the ladle number, the ladle state, the temperature of the empty ladle, the refining time, the initial temperature, the refining power consumption, the volume of argon purging, the time interval of temperature testing, and the heat effects of additions.
95% of the data are randomly selected to train the model, and the rest are used for testing. The data are normalized to the range [0, 1]. Each model is constructed from the training set using 10-fold cross-validation.
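This preparation step can be sketched in a few lines of Python with scikit-learn; the arrays below are synthetic stand-ins for the plant data, since the actual samples are not published.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X = rng.random((1714, 10))   # stand-in for the ten factors
y = rng.random(1714)         # stand-in for the measured temperatures

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05,
                                                    random_state=0)
scaler = MinMaxScaler()                # normalize features into [0, 1]
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)      # reuse the training-set scaling

cv = KFold(n_splits=10, shuffle=True)  # 10-fold CV on the training set
```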
To demonstrate the potential and necessity of the random forest for temperature prediction in the ladle furnace on a large sample set, the following temperature models are compared with the random forest temperature model: the modified AdaBoost.RT-based ensemble of extreme learning machines proposed by Tian and Mao in [1], and the pruned Bagging based on negative correlation learning proposed by Lv et al. in [3].
For each of the three temperature models, the modeling process is repeated 20 times and the statistics of the obtained performances are presented. The RMSE and the maximum error are employed as the indices quantifying the precision of the temperature prediction, i.e., the accuracy on the training set and the generalization on the testing set.
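For reference, both indices are a few lines of NumPy:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error (accuracy index)."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def max_error(y_true, y_pred):
    """Maximum absolute prediction error (worst-case index)."""
    return float(np.max(np.abs(np.asarray(y_true) - np.asarray(y_pred))))
```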
All programs are implemented in MATLAB and run on a computer with a Pentium(R) Dual-Core CPU, 2.00 GB of memory, and Microsoft Windows XP Professional.
When the temperature is estimated with the random forest, we find it useful to set the re-sampling rate to 63%. Breiman [10] demonstrates that the random forest is insensitive to the number of features selected to split each node; usually, selecting 1 or 2 features gives near-optimal results, so 2 features are selected in this experiment. Methods to prune back the trees are not considered. The minimum leaf-node size is set to the same value for all regression trees.
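In scikit-learn terms this configuration reads roughly as follows; it is an approximation, since the experiments were run in MATLAB. `max_samples=0.63` mimics the 63% re-sampling rate, and `min_samples_leaf` is a placeholder because the leaf-size value does not survive in the text.

```python
from sklearn.ensemble import RandomForestRegressor

# X_train, y_train as prepared in the earlier sketch.
rf = RandomForestRegressor(
    n_estimators=450,    # number of sub-models (chosen from Fig. 1, see below)
    max_features=2,      # 2 randomly selected features per split
    bootstrap=True,
    max_samples=0.63,    # 63% re-sampling rate
    min_samples_leaf=5,  # placeholder; the paper's value is elided
    n_jobs=-1,           # parallel ensemble construction
)
rf.fit(X_train, y_train)
```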
In the modified AdaBoost.RT temperature model, the structure, the parameters of the extreme learning machine, and the training error are the same as in [1]. In the pruned Bagging temperature model, 80% of the samples are selected from the training set to generate each sample subset, and the number of hidden-layer nodes is 20. Fig. 1 shows the performance of the random forest, the modified AdaBoost.RT, and the pruned Bagging temperature models with different numbers of sub-models.
Fig. 1 shows that the performance of the random forest is sensitive to the number of sub-models. On both the training and testing sets, as the number of sub-models increases, the RMSE and the maximum error of the random forest temperature model decrease. With 450 or more sub-models, the random forest meets the requirements on the RMSE (less than 3 ℃) and the maximum error (less than 10 ℃) of the temperature prediction. Thus, the random forest with 450 sub-models is used for the temperature prediction.
Fig. 1 also shows that, on both the training and testing sets, the RMSE and the maximum error of the modified AdaBoost.RT and the pruned Bagging temperature models decrease as the number of sub-models increases. With fewer than 500 sub-models, the modified AdaBoost.RT temperature model cannot meet the requirements on the RMSE and the maximum error. With more than 400 sub-models, the pruned Bagging temperature model meets the RMSE requirement on both the training and testing sets and the maximum-error requirement on the training set; however, with fewer than 500 sub-models it cannot meet the maximum-error requirement on the testing set. The detailed results of the temperature models based on the random forest, the modified AdaBoost.RT, and the pruned Bagging (each with 450 sub-models) are presented in Table Ⅰ. On both the training and testing sets, the RMSE and the maximum error of the temperature estimated with the random forest are smaller than those of the temperatures estimated with the pruned Bagging and the modified AdaBoost.RT. The curve of the temperature estimated with the random forest (450 sub-models) is shown in Fig. 2, where the real temperature is the true temperature in the practical production process. The predictive curve of the random forest fits the real temperature curve effectively.
Table Ⅰ. Performance of the temperature models with 450 sub-models (℃)

| Temperature model | Training RMSE | Training maximum error | Testing RMSE | Testing maximum error |
|---|---|---|---|---|
| Random forest | 2.3 | 8.9 | 2.8 | 7.6 |
| Modified AdaBoost.RT | 3.4 | 15.1 | 3.7 | 11.2 |
| Pruned Bagging | 4.2 | 17.9 | 4.3 | 13.7 |
To summarize, with the large sample set accumulated from the production process in the ladle furnace, only the temperature model estimated with the random forest meets the requirements of the temperature prediction on both the training and testing sets:
1) The RMSE of the temperature prediction is at most 2.8 ℃.
2) The maximum error of the temperature prediction is at most 8.9 ℃.
The precision, i.e., the accuracy and the generalization, of the random forest based liquid steel temperature model satisfies the demands of industrial production in the ladle furnace.
The main contribution of this paper is to significantly improve the precision of the liquid steel temperature prediction in the ladle furnace with the random forest method on the large sample set accumulated from the production process.
The experiments demonstrate that the random forest outperforms the pruned Bagging and the modified AdaBoost.RT in the precision of the temperature prediction.
On the training set, the RMSE of the random forest temperature model is 2.3 ℃, less than the modified AdaBoost.RT's 3.4 ℃ and the pruned Bagging's 4.2 ℃; the maximum error of the random forest temperature model is 8.9 ℃, less than the modified AdaBoost.RT's 15.1 ℃ and the pruned Bagging's 17.9 ℃.
On the testing set, the RMSE of the random forest temperature model is 2.8 ℃, less than the modified AdaBoost.RT's 3.7 ℃ and the pruned Bagging's 4.3 ℃; the maximum error of the random forest temperature model is 7.6 ℃, less than the modified AdaBoost.RT's 11.2 ℃ and the pruned Bagging's 13.7 ℃.
Only the random forest temperature model meets the requirements on the RMSE (less than 3 ℃) and the maximum error (less than 10 ℃) and satisfies industrial production in the ladle furnace.
[1] H. X. Tian and Z. Z. Mao, "An ensemble ELM based on modified AdaBoost.RT algorithm for predicting the temperature of molten steel in ladle furnace," IEEE Trans. Automat. Sci. Eng., vol. 7, no. 1, pp. 73-80, Jan. 2010.
[2] H. X. Tian, Z. Z. Mao, and A. N. Wang, "A new incremental learning modeling method based on multiple models for temperature prediction of molten steel in LF," ISIJ Int., vol. 49, no. 1, pp. 58-63, Jan. 2009.
[3] W. Lv, Z. Z. Mao, P. Yuan, and M. X. Jia, "Pruned bagging aggregated hybrid prediction models for forecasting the steel temperature in ladle furnace," Steel Res. Int., vol. 85, no. 3, pp. 405-414, Mar. 2014. doi: 10.1002/srin.201200302
[4] H. X. Tian, Z. Z. Mao, and Y. Wang, "Hybrid modeling of molten steel temperature prediction in LF," ISIJ Int., vol. 48, no. 1, pp. 58-62, Jan. 2008.
[5] U. Çamdali and M. Tunç, "Steady state heat transfer of ladle furnace during steel production process," J. Iron Steel Res. Int., vol. 13, no. 3, pp. 18-20, 25, May 2006.
[6] Y. G. Sun, D. X. Wang, and B. S. Tao, "An intelligent ladle furnace control system," in Proc. 3rd World Congress on Intelligent Control and Automation, Hefei, China, 2000, vol. 1, pp. 330-334.
[7] L. K. Hansen and P. Salamon, "Neural network ensembles," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 10, pp. 993-1001, Oct. 1990.
[8] G. B. Huang, H. M. Zhou, X. J. Ding, and R. Zhang, "Extreme learning machine for regression and multiclass classification," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 2, pp. 513-529, Apr. 2012.
[9] E. Tuv, A. Borisov, G. Runger, and K. Torkkola, "Feature selection with ensembles, artificial variables, and redundancy elimination," J. Mach. Learn. Res., vol. 10, pp. 1341-1366, Jul. 2009.
[10] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, Oct. 2001.
[11] L. Breiman, "Out-of-bag estimation," 1996. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.3712&rep=rep1&type=pdf
[12] L. Breiman, J. H. Friedman, C. J. Stone, and R. A. Olshen, Classification and Regression Trees. Boca Raton, FL: CRC Press, 1998.
[13] L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, Aug. 1996.
[14] M. P. Perrone and L. N. Cooper, "When networks disagree: Ensemble methods for hybrid neural networks," in Neural Networks for Speech and Image Processing. London, UK: Chapman & Hall, 1993, pp. 126-142.
[15] J. A. K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, and J. Vandewalle, Least Squares Support Vector Machines. River Edge, NJ: World Scientific, 2002.