2. Computer Science and Technology, Nanjing University of Post and Telecommunications, Nanjing 210003, China;
3. State Grid Smart Grid Research Institute, Beijing 102209, China
I. INTRODUCTION
With the increasingly widespread application of information and communication technology, smart grid has gradually evolved into cyber physical system of deep integration between information space and physical space[1, 2, 3].
Because information system affects operation and control of physical system, all kinds of existing information systems with security risks will naturally be introduced into the physical systems. These security risks and vulnerabilities greatly increase the probability of intrusions and attacks on physical power systems from Internet[4, 5, 6, 7, 8]. Reference [4] presents a security-oriented stochastic risk management technique, which can calculate cyber-physical security indices to measure the security level of the underlying cyber-physical setting. Reference [5$-$8] analyzed all kinds of attacks on power physical systems, designed countermeasures for simultaneous intrusions, model-based attack and other network attacks (e.g., Dos, DDos, replay attack etc). Traditional power outage is caused when safety and reliability of physical systems are destroyed. However, with the increase of complexity in smart grid, all kinds of techniques which attack physical systems through security vulnerabilities of information systems have also increased. Due to dependence of physical power system on information system and control system, vulnerability of control network and complexity of power grid, local and small disturbances of control system by network intrusion attacks can cause collapse of physical systems. Finally, a large area of blackout will be produced. The significant potential security hazard of cyber physical power system (CPPS) is that security of information space will inevitably affect security of physical space so as to form a chain reaction, and eventually lead to the collapse of the physical space[9]. Iran's "Stuxnet" virus crossed the border of information and physical space and led to serious consequences. So, the rapid development of information and communication technology not only improves control capability of power grid, but also gives physical systems to lay security hazards. Timely detection and assessment of various security risks for CPPS is essential for the effective control and solution of security risks in CPPS.
Risk assessment[10] is probability estimation of risks that threats and vulnerabilities cause. In recent years, risk assessment has been widely applied to many aspects of the power system[11, 12, 13, 14]. But security risk assessment for cyber physical power system is also unusual.
However, complexity and dynamic change of CPPS cause the uncertainty of risk assessment. The unsuitable risk assessment methods will greatly affect accuracy of risk assessment. Traditional assessment methods include qualitative assessment, quantitative assessment and the integration of qualitative and quantitative assessment[10]. Due to the complexity, nonlinearity and uncertainty of security risk assessment, these assessment methods have some limitations, subjective arbitrariness and fuzziness. The operation of these methods is more, complex, and, lack of ability, of self-learning. So, many methods based on artificial intelligence and machine learning are introduced into security risk assessment. Reference [15] proposed intelligent information security risk assessment based on a decision tree algorithm. Reference [16] presents risk assessment of information security based on improved wavelet neural network. Information security risk assessment based on grey-analytic network process (G-ANP) was proposed in [17]. Reference [18] applied fuzzy expert system to information security risk assessment for an attendance system. Because of more security risk elements in CPPS, these methods are easy to lead to high computational load, low accuracy and complex operation of security risk assessment. The essential work in the process of security risk assessment is the construction of the risk level classification model. However, these methods can only be used to divide the risk level, which leads to the difficulty in solving the classification function model linkingrisk level and security risk factors.
In order to better solve these problems, combining with attribution reduction and function mining, security risk assessment of cyber physical power system based on rough set and gene expression programming (SRACPPS-RGEP) is proposed in this paper.
This paper makes the following contributions: 1) it presents fast attribution reduction based on binary search algorithm; 2) it proposes SRACPPS-RGEP and 3) it describes simulated experiments that have been done and provides performance analysis results.
The content of this paper is organized as follows. Section II provides fast attribution reduction based on binary search algorithm. Section III introduces SRACPPS-RGEP. Section IV represents experiments and analysis. Finally, conclusions are given in Section V.
II. FAST ATTRIBUTION REDUCTION BASED ON BINARY SEARCH ALGORITHM
A. Security Risk Element of CPPS
For security risk assessment of CPPS, the first thing to do is to analyze security risks of information and physical systems. For the physical systems, Reference [1, 19] noted that security risks include matching degree of local power supply, natural disasters, load changes of zone, transmission line fault, supply reliability of zone, small signal stability, transient stability, voltage stability, protection equipment failures, power communication equipment failures etc. For the information systems, Reference [19] noted that security risks of cyber systems include hardware and software failures, physical environment impact, malicious code, cyber-attack, physical attack, leakage, misrepresentation, repudiation, operational error, poor management, unauthorized use and abuse etc.
In order to better describe security risk element for CPPS, security risk elements' set is defined as follows.
$\textbf{Definition 1. Security risk elements' set.}$ Let the sets be defined as $P=\{\langle p_1 , v_1 \rangle, \ldots, \langle p_k , v_k \rangle , \ldots, \langle p_n , v_n \rangle\}$, $k\in [1, $ $n]$, $I=\{\langle i_1 , v_1 \rangle, \ldots, \langle i_j , v_j \rangle, \ldots, \langle i_m , v_m \rangle\}$, $j\in [1, m]$, $S$ $=$ $\{I, P, D\}$, where $I$ represents collection of security risk $i$ of information systems in CPPS and the corresponding risk values $v$, $P$ represents collection of security risk $p$ of physical systems in CPPS and $v$, and $D$ represents risk level of CPPS. $S$ is called security risk elements' set (SRES) of CPPS.
$\textbf{Definition 2. Security risk decision table.}$ Let decision table be $T=\langle U, C\cup D, V, f\rangle$, where $U$, $C\cup D=R$, $C$ $=\langle I, P\rangle$, $C=\{c_1 , c_2 , \ldots, c_n \}$, $D=\{d_1 , d_2 , \ldots, d_m \}$, and $V=\cup v_r $, $r\in{\bf R}$ respectively represent data collection, all cyber and physical security risk collection, risk level collection, risk value and risk level value collection of SRES. $f:$ $U\times{\bf R}\to V$ represents information function. $\forall, r\in {\bf R}$, $x$ $\in$ $U$, $f(x, r)\in v_r $. $T$ is called security risk decision table.
An example is shown as Table I.
B. FAR-BSA
For complex and high-dimensional data sets, direct analysis is very difficult. So, the dimension reduction is necessary. Existing methods on dimension reduction include principal component analysis (PCA), singular value decomposition (SVD) and rough set (RS) etc., where PCA and SVD inevitably result in the loss of part of the decision information. Dimension reduction based on RS does not change the decision rules of original data[20, 21]. Due to unique and equivalence of the optimal reduction, we just need to solve any optimal reduction. So, fast attribution reduction based on binary search algorithm (FAR-BSA) is proposed in this paper.
Firstly, related definitions of FAR-BSA are given as follows[22].
$\textbf{Definition 3.}$ Let security risk decision table be $T= \langle U$, $C\cup D, V, f\rangle$, where $\forall, P\subseteq {\bf R}$, and $x, y\in U$. $\forall, r\in P$, if and only if $f(x, r)=f(y, r)$, then $x$ and $y$ are indiscernible, and denoted by $U/R$ or $IND(P)=\{(x, y)\in U|\forall, r\in P$, $f(x, r)$ $=$ $f(y, r)\}$.
$\textbf{Definition 4.}$ Let security risk decision table be $T=\langle U, , C\cup D, V, f\rangle$, the positive region $POS_C (D) =\cup \{Y_i \subset U/C: Y_i \subset D\}$.
$\textbf{Definition 5.}$ Let security risk decision table be $T= \langle U, $ $C\cup D, V, f\rangle$. Dependence degree between condition attribution $C$ and decision attribution $D$ is denoted by $r_C (D)$, if $r_C (D)={card(POS_C (D))}/{card(U)}=1$, then $T$ is consistent, where $card\mbox{(}U\mbox{)}$ represents the number of elements in collection $U$.
$\textbf{Definition 6.}$ Let security risk decision table $T=\langle U, $ $C\cup D, V, f\rangle$ be consistent. For $c\subset C$, if $POS_C (D)$ $=$ $POS_{C-\{c\}} (D)$, then condition attribution $c$ is reducible.
$\textbf{Property 1.}$ Let security risk decision table $T=\langle U, $ $C\cup D, V, f\rangle$ be consistent. For $c\subset C$, if $POS_C (D)$ $=$ $POS_{C-\{c\}} (D)$, then $T'=\langle U, C-\{c\}\cup D, V, f\rangle$, which reduces condition attribution $c$ to be consistent.
$\textbf{Proof.}$ From Definition 5, for security risk decision table $T$ $=$ $\langle U, C\cup D, V, f\rangle$,
\begin{equation} \label{eq1} r_C (D)=\frac{card(POS_C (D))}{card(U)}=1 \end{equation} | (1) |
\begin{equation} \label{eq2} r_{C{-\{c\}}} (D)=\frac{card(POS_{C{-\{c\}}} (D))}{card(U)}=\frac{card(POS_C (D))}{card(U)}. \end{equation} | (2) |
$\square$
$\textbf{Definition 7.}$ Let condition attribution set $C'$ be a reduction of security risk decision table $T$. Reduction which contains the least condition attribution in $C'$ is called the optimal reduction of $T$.
$\textbf{Lemma 1.}$ Let security risk decision table $T=\langle U, C \cup D, $ $V, f\rangle $ be consistent. $S\subset C$, if $S$ is not reduction of $T$, then $\forall, S'$ $\subset$ $S$ is not reduction too.
$\textbf{Proof.}$ Supposing that $\forall, S'\subset S\subset C$ is a reduction, then from Property 1, $T$ is still consistent after condition attribution set $C$ removes $S'$. And because of $S'\subset S$, then $(C-S)\subset$ $(C-S')$. From Definition 6 and known conditions, we know that security risk decision table $T$ must be consistent after condition attribution set $C$ removes $S$. Then from Property 1, for $S\subset C$, $S$ is reducible too. This conclusion is inconsistent with the known conditions. So the original proposition is right.
$\square$
The basic steps of FAR-BSA are shown as follows.
$\textbf{Algorithm 1.}$ FAR-BSA ($T$)
$\textbf{Input.} T=\langle U, C\cup D, V, f\rangle, C=\{x_1 , x_2 , \ldots, x_n \}$;
$\textbf{Output.}$ bestReduction;
%\textbf{Algorithm}$ FAR-BSA ($T$)
${\bf Step 1.}$ int $m=n/2$;
${\bf Step 2.}$ Looping through all reduction for including $m$ conditional attribution in $C$. If finding a reduction according to coordination of security risk decision table $T$, then going to Step 3 until finding the optimal reduction, or going to Step 4;
${\bf Step 3.}$ If $m$ equals to 1, then the reduction is the bestReduction. If $m$ equals to $n$, then bestReduction equals to $C$. Otherwise, let $m$ equals to $m/2$, and go to Step 2;
${\bf Step 4.}$ If conditional attribution subset is not a reduction when $m$ equals to $n/2$, then let $m$ equals to $(m+n)/2$, and go to Step 2;
${\bf Step 5.}$ return bestReduction.
In this paper, we suppose that security risk decision table $T$ is coordinated. In FAR-BSA, maximum time complexity of computing conditional attribution combination is ${\rm O}(\log _2^m )$ $(1\le m\le n)$. Time complexity of computing a reduction by coordination of $T$ is ${\rm O}(| U| )$. So, the maximum time complexity of solving the optimal reduction is ${\rm O}(| U| \times \log _2^m )$.
III. SRACPPS-RGEP
Essentially, the risk assessment is to build a nonlinear function among system assets, threats and asset vulnerabilities[16]. In terms of function mining, risk assessment can be seen as a process which mines function model among factors affecting system security and risk level, and determines risk level of the current system based on the function model. Traditional function mining algorithms include regression methods, genetic programming (GP) and gene expression programming (GEP) etc. Regression methods assume that function type is known in advance, then estimate parameters by means of the least squares method[23] or the improved methods, and finally, obtain the function model. These methods depend on a priori knowledge and a lot of subjective factors so that the complex function model is not easy to be built. Moreover, these algorithms have high time complexity and low computation efficiency for complex high-dimensional data. To solve these problems, [24] used GP for mathematical modeling, obtained good experimental results, and avoided the defect of traditional statistical methods. However, the efficiency of function model mined by GP was low. So, a new algorithm which was called GEP was proposed in [25]. Compared with GP, efficiency of complex function mined by GEP was improved by 4-6 times.
Meanwhile, due to complexity of cyber physical power system, factors that affect security of CPPS are massive, complex and diverse. Risk assessment function model directly mined by GEP will lead to higher time complexity. In this paper, security risk assessment for CPPS based on rough set and gene expression programming (SRACPPS-RGEP) is proposed. Firstly, risk factors which affect security of CPPS are reduced by calling Algorithm 1. And then, function model among security risk factors in CPPS is mined by improved GEP.
A. Code of SRACPPS-RGEP
Gene code is an important expression form of SRACPPS-GEP and described as follows.
$\textbf{Definition 8.}$ Let function set be $F=\{+, -, \ast , /\}$, terminal set be $T=\{d_1 , d_{_2 } , \ldots, d_i , \ldots, d_m \}$, where $m$represents the number of security risk elements in CPPS. Then gene which is built according to rules and symbol in [25] is called risk assessment gene (RA-Gene), where head of RA-Gene $h$ is composed of elements of $F$ and $T$, and tail of RA-Gene $t$ is composed of elements of $T$. Moreover, the lengths of $h$ and $t$ follow the equation:
\begin{equation} \label{eq3} length(t)=length(h)\times (n-1)+1, \end{equation} | (3) |
SRACPPS-RGEP adopts linear code of fixed length[26] to represent an individual which is called a chromosome. In the meantime, a chromosome is composed of one or more RA-Genes.
B. Population of SRACPPS-RGEP
In evolution computation, diversity of population directly decides whether the optimal or suboptimal solutions can be obtained. At present, random method is applied to initial population in GEP. But diversity of population is limited. Reference [27] used gene space balance strategy to initialize population and enlarged diversity of initial population. However, the gene space balance strategy cannot dynamically increase diversity of population during the local convergence of GEP. Reference [28] used diversity-guided grading evolution strategy to change diversity of GEP population. But implementation of the algorithm was very complex. In [29], authors dynamically adjusted population size by self-adaptive coefficient. However, the algorithm cannot accurately reflect diversity of the current population. To solve these problems, adjustment of dynamic population based on variable-step (ADP-VS) is proposed in this paper. ADP-VS dynamically adjusts population size of GEP by variable step so as to better increase diversity of the current population and probability of searching the optimal solution.
The basic steps of ADP-VS are shown as follows:
$textbf{Algorithm 2.}$ ADP-VS ()
$\textbf{Input.}$ maximum generation $G_{\max} $, maximum step $S_{\max} $, maximum fitness $F_{\max} $;
$\textbf{Output.}$ NewPopulation;
Begin $\{$
1. ${\rm PopSize} = 500;\quad //$size of initial population
2. $\theta = 0;\alpha = 0.01;\beta = 0.05;\quad //$setting the initial value of related parameters
3. for $(i = 1;i < = {G_{\max }};i + + )\{ $
4. $\quad if(i\% 10 = = 0)\{ $
5. $\quad\quad$ if (maximum fitness value of population does not change in 10 generations){
6. $\quad \quad \quad \Delta = ({F_{\max }} - {F_{\max }}[i])/{F_{\max }};\quad //$computing error between the largest fitness value of current population and the real maximum fitness value
7. $\quad \quad \quad if(\Delta > \beta )\{ \theta = 200;{\rm{PopSize}} + = \theta ;{\rm{\} }}$
8. $\quad \quad \quad elseif(\alpha < \Delta < \beta )$
$\quad \quad \quad \quad \quad \quad \{ \theta = 100;{\rm{PopSize}} + = \theta ;\} $
9. ${\quad \quad \quad \quad \quad else\{ \theta = 0;\} }$
}}}}
End.
In Algorithm 2, $\theta $ represents step when population size changes, $\alpha $ and $\beta $ represent upper limit and lower limit of error between the maximum fitness value of current population and the real maximum fitness value respectively.
C. Description of SRACPPS-RGEP
To better describe SRACPPS-RGEP, the related concepts are given before risk assessment function mining algorithm is introduced.
$\textbf{Definition 9.}$ Let risk elements' set of CPPS be $S=\{\langle I, $ $V\rangle, \langle P, V\rangle, D\}$, and then $D=f(C_1 , \ldots, C_k , \ldots, C_{n+m} )$, $C$ $=$ $I\cup P$ is called risk assessment function (RA-Function).
$\textbf{Definition 10.}$ Let $F_{\rm optimal} $ be the optimal fitness value of risk assessment function based on GEP, $F_{\max} $ be maximum fitness value, then ${F_{\rm optimal} }/{F_{\max} }\times 100, \% $ is called risk assessment function mining success ratio (RAFMSR).
$\textbf{Definition 11.}$ In SRACPPS-RGEP, fitness function $f(i)$ of the $ i$-th individual is expressed by (3).
\begin{equation} \label{eq4} f_i =\sum\limits_{j=1}^n {(M-| V_{ij} -T_j | )}, \end{equation} | (4) |
Meanwhile, genetic operation of function mining based on GEP includes selection based on elitism, mutation, IS/RIS/Gene transposition, one-point/two-point/gene recombination etc.
The process of SRACPPS-RGEP is shown as follows.
$\textbf{Input.}$ $T=\langle U, C\cup D, V, f\rangle$, population size $\textit{PopSize}$, maximum iterations \textit{MaxGen}, maximum fitness value $\textit{MaxFitness}$, mutation probability $P_m $, transposition probability $P_t $, recombination probability $P_r $;
$\textbf{Output.}$ the optimal risk assessment function $\textit{BestRiskAss-Fun.}$
Begin {
1. Call FAR-BSA ($T$);\ \ //reduce security risk decision table
2. Initialization of Population $S;\quad //$initial population according to reduced decision table $T$
3. $gen=0$;
4. While ($fitness
5. $\quad$ GeneticOpertion $(S);\quad //$selection, mutation, transposition, and recombination operators are done
6. $\quad$ Calculating fitness value of every individual in population;
7. $\quad$ Keeping the best individuals;
8. $\quad fitness=Bestfitness;$
9. $\quad gen + + ;{\rm{\} }}$
10. return BestRiskAss-Fun.
D. Application Analysis of SRACPPS-RGEP
From introduction, it is known that security assessment or vulnerability
assessment is considered from the point of physical system or information
system in CPPS. SRACPPS-RGEP takes information system and physical system as
a whole to consider security risk assessment. It is vital to safe and
reliable operation of CPPS. Meanwhile, massive security risk logs obtained
from CPPS are stored in existing power security defense system. Security
risk model is extracted from these security risk logs by SRACPPS-RGEP.
Finally, the model can be a basis of decision-making in safe and reliable
operation of CPPS.
In order to better apply SRACPPS-RGEP to power security defense
system, firstly, interface of security risk logs in CPPS is given;
secondly, SRACPPS-RGEP analyzes security risk logs in CPPS by the
interface, moreover, analysis and computing will be transplanted
to distributed computing platform (such as cloud computing, etc.);
lastly, results will be returned to power security defense system
as a basis for adjustment of operation control strategy. The whole
process is shown in Fig. 1.
In Fig. 1, SRACPPS-RGEP is deployed on the private cloud computing
platform in power grid. The algorithm can analyze security risk log
of CPPS in parallel by interface and return results to power
security defense system.
IV. EXPERIMENTS AND ANALYSES
In order to better verify the feasibility and effectiveness of the
algorithms in this paper, simulation experiments are done in a
laboratory environment. Two experiments are given on the following
platform: Intel i5 2.3, GHz + 2, G + Win7 + Java etc.
The experimental data is mainly from China Southern Grid Provincial
Power Company. Firstly, security risks of cyber and physical system
in the power company are analyzed. Then the corresponding security
risk elements' set is obtained. Lastly, security risk decision table
is built by the security risk elements' set. In the decision table,
the number of condition attribution and decision attribution is 21
and 5 respectively. Decision attribution value includes higher,
high, medium, low and lower. The number of data set is 20. After
quantification and normalization, training data is composed of top
15 elements, and the remaining data is test data.
$\textbf{Experiment 1.}$ For the known security risk decision table
for CPPS, change of the number of condition attribution based on
FAR-BSA is shown in Table II. The optimal attribution reduction
based on FAR-BSA, PCA, SVD, attribution reduction algorithm based
on positive region (AR-PR), attribution reduction algorithm based
on discernable matrix (AR-DM), GEP-ARRS[26] and TS-ARRS
[30] is shown in Table III. Fig. 2 shows comparison of
time-consumption in which the optimal attribution reduction is
obtained respectively based on FAR-BSA, PCA, SVD, AR-PR, AR-DM,
GEP-ARRS and TS-ARRS under the condition that the algorithm runs 5
times.
It is well known that the optimal attribution reduction is not
unique for invariant condition attributions. From Table II, we know
that FAR-BSA is effective for solving an optimal attribution
reduction. After reduction, the number of condition attribution
reduces to about 76.19 %. In all condition attribution, only
five condition attributions including local power matching degree,
small signal stability, voltage stability, operational error, and
unauthorized use and abuse are retained so that complexity of
function mining by GEP will be greatly reduced. Meanwhile, according
to rough set theory, attribution reduction does not change the
ability to decide risk level of CPPS.
From Table III, an optimal reduction based on FAR-BSA, AR-PR,
AR-DM, GEP-ARRS and TS-ARRS includes five condition attributions.
An optimal reduction based on PCA and SVD includes respectively
seven and eight condition attributions, where the optimal
reduction based on FAR-BSA, AR-PR, AR-DM, GEP-ARRS and TS-ARRS
does not result in loss of the original decision information.
While attribution reduction based on PCA and SVD will inevitably
result in partial loss of the original decision information.
Fig. 2 shows that time-consumption for solving the optimal
reduction based on FAR-BSA, AR-PR, AR-DM, GEP-ARRS and TS-ARRS is
less than PCA and SVD. However, in comparison with AR-PR, AR-DM,
GEP-ARRS and TS-ARRS, time-consuming of FAR-BSA reduces to about
38.08 %, 41.35 %, 46.20 %, and 49.62 %,
respectively. This is mainly because that maximum time complexity of
FAR-BSA is ${\rm O}(|U| \times \log _2^m )$ $(1\le m\le \left| C
\right|)$, while time complexity of AR-PR, AR-DM, PCA, SVD, GEP-ARRS
and TS-ARRS is about, ${\rm O}(| C| \times | U| )$, ${\rm O}(| C|
\times \frac{| U| }{2})$, ${\rm O}(| C| ^3)$, ${\rm O}(| C| ^3)$,
${\rm O}(| C|$ $\times$ $| U| \times MaxGen)$, and ${\rm O}(| C|
\times | U| \times n^2)$ respectively. Here $| U| $ represents the
number of elements in collection $U$, $| C| $ represents the number
of elements in collection $C$, $MaxGen$ represents the maximum
generations of GEP and $n$ represents a number of variables in
TS-ARRS. Moreover, due to equivalence of reduction, FAR-BSA only
requires the solution of an optimal reduction. So, for security risk
decision table having a large number of condition attributions in
CPPS, FAR-BSA can not only solve the optimal attribute reduction,
but also make time-consuming to a minimum in comparison with other
attribution reduction algorithms.
$\textbf{Experiment 2.}$ On the basis of Experiment 1, for security
risk decision table in CPPS after reduction, performance of
SRACPPS-RGEP is described in Experiment 2. Parameters of GEP are
shown in Table IV. Fig. 3 shows comparison of difference between
optimal and maximum fitness value by GEP before and after
reduction. Under the condition that SRACPPS-RGEP runs 5 times,
comparison of average time-consuming before and after reduction is
shown in Fig. 4. Fig. 5 describes error between maximum fitness
value of current population and real maximum fitness value with
increase of population size. Figs. 6 and 7 show respectively the
comparison between model value by SRACPPS-RGEP and real value for
training and test data.
From Fig. 3, we know that in comparison with difference between
optimal and maximum fitness value before the reduction, difference
after the reduction has been maximally decreased about
89.81 %. Function mining success ratio has been maximally
reached about 99.86 %. This means that for high-dimensional
data sets, attribution reduction greatly improves success
probability of function mining without changing decision capability
of existing data sets. Fig. 4 shows that attribution reduction
greatly reduces average time-consuming of security risk assessment
function model based on GEP for the same security risk decision
table. Average time-consuming has been maximally dropped about
56.52 % under the same GEP parameters. This is mainly because
that complexity of GEP population has been drastically simplified by
reduction.
Optimal risk assessment function model by SRACPPS-RGEP is shown as
(4).
Fig. 1 Application process of SRACPPS-RGEP.
Fig. 2 Time-consumption for an optimal reduction based on the seven algorithms.
Fig. 3 Comparison of difference between optimal and maximum fitness value before and after reduction.
Fig. 4 Comparison of average time-consumptionof GEP before and after reduction.
Fig. 5 Comparison of error between maximum fitness value of current population and real maximum fitness value.
Fig. 6 Comparison between the values of training data and model.
Fig. 7 Comparison between the values of test data and model.
\begin{align}
\label{eq5}
& f(a, b, c, d, e)=-a+c+b\times \csc(\sin(\tan(b+d)))\notag \\[1mm]
&\qquad + b\times \frac{\sec(c+d-b\times \frac{e}{d})}{d}+a\times
\tan(\frac{c}{d}+d)
\end{align}
(5)
From Fig. 5, we know that with the increase of population size, error between maximum fitness value of current population and real maximum fitness value gradually declines. This is because that adjustment strategy of dynamic population based on variable-step is applied to dynamically increase population size and diversity of individual, thereby increase solution of the optimal individual. Meanwhile, degree of fitness between real value and model value of train data and test data for security risk decision table is shown respectively in Figs. 6 and 7. From Fig. 6, we can see that maximum error between model value and real value of security risk train data by (5) is 0.7441, and minimum error is 0.004. Fig. 7 indicates that maximum error between model value and real value of test data by (5) is 0.3091, and minimum error is 0.0497. It can be seen that the model has high prediction accuracy.
For security risk assessment of CPPS, the higher prediction accuracy of security risk assessment function model based on SRACPPS-RGEP is, the higher accuracy of assessment for security risk level of CPPS is.
V. CONCLUSION
With deep integration of information systems and physical systems in smart grid, various types of security risks of information systems will affect the normal operation of the physical system. Security risks of cyber physical power system need to be timely found and evaluated because handling of these security risks are essential for safe and reliable operation of cyber physical power system. In this paper, for better and faster solution of optimal attribute reduction on security risk decision table for CPPS, fast attribution reduction based on binary search algorithm (FAR-BSA) is firstly proposed. On the basis of FAR-BSA, security risk assessment algorithm for cyber physical power system based on rough set and gene expression programming (SRACPPS-RGEP) is described. Experimental results show that SRACPPS-RGEP can quickly discover optimal security risk function model, and the function model has high predictive accuracy. This will provide good foundation for timely and accurate assessment and prediction of security risk in CPPS in the future.
The existing power information security defense system does not take security assessment of CPPS into account. In the future, the SRACPPS-RGEP can be used as an important part of existing power information security defense system, and provide comprehensive security situation assessment for power system of China from information region to production and control region.
[1] | Zhao Jun-Hua, Wen Fu-Shuan, Xue Yu-Sheng, Li Xue, Dong Zhao-Yang. Cyber physical power systems: architecture, implementation techniques and challenges. Automation of Electric Power Systems, 2010, 34(16): 1-7 (in Chinese) |
[2] | Ju Ping, Qin Chuan, Huang Hua, Wu Feng, Jin Yu-Qing. Research trends of power system modeling geared to smart grid. Automation of Electric Power Systems, 2012, 36(11): 1-6 (in Chinese) |
[3] | Yu Yi-Xin, Luan Wen-Peng. Basic philosophy of smart grid. Journal of Tianjin University, 2011, 44(5): 377-384 (in Chinese) |
[4] | Vellaithurai C, Srivastava A, Zonouz S, Berthier R. CPINDEX: cyberphysical vulnerability assessment for power-grid infrastructures. IEEE Transactions on Smart Grid, 2015, 6(2): 566-575 |
[5] | Hong J H, Liu C C, Govindarasu M. Integrated anomaly detection for cyber security of the substations. IEEE Transactions on Smart Grid, 2014, 5(4): 1643-1653 |
[6] | Sridhar S, Govindarasu M. Model-based attack detection and mitigation for automatic generation control. IEEE Transactions on Smart Grid, 2014, 5(2): 580-591 |
[7] | Zonouz S, Davis C M, Davis K R, Berthier R, Bobba R B, Sanders W H. SOCCA: a security-oriented cyber-physical contingency analysis in power infrastructures. IEEE Transactions on Smart Grid, 2014, 5(1): 3-13 |
[8] | Hu J K, Pota H R, Guo S. Taxonomy of attacks for agent-based smart grids. IEEE Transactions on Parallel and Distributed Systems, 2014, 25(7): 1886-1895 |
[9] | Mei Sheng-Wei, Wang Ying-Ying, Chen Lai-Jun. Overviews and prospects of the cyber security of smart grid from the view of complex network theory. High Voltage Engineering, 2011, 37(3): 672-679 (in Chinese) |
[10] | Feng Deng-Guo, Zhang Yang, Zhang Yu-Qing. Survey of information security risk assessment. Journal of China Institute of Communications, 2004, 25(7): 10-18 (in Chinese) |
[11] | Guo Chuang-Xin, Yu Bin, Guo Jia, Wen Bo-Jian, Zhang Jin-Jiang, Zhang Li, Lu Hai-Bo, Li Bo. Security risk assessment of the IEC61850-based substation automation system. Proceedings of the CSEE, 2014, 34(4): 685-694 (in Chinese) |
[12] | Liu N, Zhang J H, Wu X. Asset analysis of risk assessment for IEC 61850-based power control systems, Part I: methodology. IEEE Transactions on Power Delivery, 2011, 26(2): 869-875 |
[13] | Liu N, Zhang J H, Wu X. Asset analysis of risk assessment for IEC 61850-based power control systems, Part II: application in substation. IEEE Transactions on Power Delivery, 2011, 26(2): 876-881 |
[14] | Bompard E, Gao C W, Napoli R, Russo A, Masera M, Stefanini A. Risk assessment of malicious attacks against power systems. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2009, 39(5): 1074-1085 |
[15] | Zhang Li, Yao Yi-Zhan, Peng Jian-Fen, Chen Hong-Bo, Du Yu-Ge. Intelligent information security risk assessment based on a decision tree algorithm. Journal of Tsinghua University (Science and Technology), 2011, 51(10): 1236-1239 (in Chinese) |
[16] | Zhao Dong-Mei, Liu Jin-Xing, Ma Jian-Feng. Risk assessment of information security based on improved wavelet neural network. Computer Science, 2010, 37(2): 90-93 (in Chinese) |
[17] | Zhao Gang, Wu Tian-Shui. Information security risk assessment based on G-ANP. Journal of Tsinghua University (Science and Technology), 2013, 53(12): 1761-1767 (in Chinese) |
[18] | Chang L Y, Lee Z J. Applying fuzzy expert system to information security risk assessment-a case study on an attendance system. In: Proceedings of the 2013 International Conference on Fuzzy Theory and Its Applications. Taipei, China: IEEE, 2013. 346-351 |
[19] | Wang Bo. Research on Security Risk and Vulnerability Assessment Methods of Complicated Power System [Ph. D. dissertation], Huazhong University of Science and Technology, China, 2011. (in Chinese) |
[20] | Yin Lin-Zi, Yang Chun-Hua, Wang Xiao-Li, Gui Wei-Hua. An incremental algorithm for attribute reduction based on labeled discernibility matrix. Acta Automatica Sinica, 2014, 40(3): 397-404 (in Chinese) |
[21] | Jiang Yun-Liang, Yang Zhang-Xian, Liu Yong. Quick distribution reduction algorithm in inconsistent information system. Acta Automatica Sinica, 2012, 38(3): 382-388 (in Chinese) |
[22] | Pawlak Z. Rough sets. International Journal of Computer and Information Sciences, 1982, 11(5): 341-356 |
[23] | Zeng Qing-Hong, Lu De-Tang. Curve and surface fitting based on moving least-squares methods. Journal of Engineering Graphics, 2004, 25(1): 84-89 (in Chinese) |
[24] | Koza J R. Genetic Programming II: Automatic Discovery of Reusable Programs. Cambridge: MIT Press, 1994. 110-199 |
[25] | Ferreira, C. Gene expression programming: a new adaptive algorithm for solving problems. Complex Systems, 2001, 13(2): 87-129 |
[26] | Deng S, Wang R C, Fu X, Yang L C. Gene expression programming for attribution reduction in rough set. International Journal of Computers and Applications, 2010, 32(2): 226-231 |
[27] | Hu Jian-Jun, Tang Chang-Jie, Duan Lei, Zuo Jie, Peng Jing, Yuan Chang-An. The strategy for diversifying initial population of gene expression programming. Chinese Journal of Computers, 2007, 30(2): 305-310 (in Chinese) |
[28] | Liu Qi-Hong, Tang Chang-Jie, Hu Jian-Jun, Zeng Tao, Liu Ying-Tian, Qiu Jiang-Tao. Gene expression programming based on diversity-guided grading evolution. Journal of Sichuan University (Engineering Science Edition), 2006, 38(6): 108-113 (in Chinese) |
[29] | Deng Song, Wang Ru-Chuan. Classification of distributed GEP-BP based on grid service. Acta Electronica Sinica, 2009, 37(11): 2600-2603 (in Chinese) |
[30] | Header A -R, Wang J, Fukushima M. Tabu search for attribute reduction in rough set theory. Soft Computing, 2007, 12(9): 909-918 |