An improved constraint-inference approach for causality exploration of power system transient stability
Protection and Control of Modern Power Systems volume 8, Article number: 59 (2023)
Abstract
Transient stability is the key aspect of power system dynamic security assessment, and data-driven methods are becoming an alternative means of assessment. Current data-driven methods only construct correlations between variables and neglect causal relationships. They therefore suffer from problems such as poor robustness, which restrict their practical application. This paper introduces an improved constraint-inference approach for causality exploration of power system transient stability. Firstly, a causal structure discovery method for power system transient stability is proposed based on a PC-IGCI algorithm, which addresses the shortcomings caused by Markov equivalence and the large number of variables. Then, a relative average causal effect index is proposed to reveal the relationship between relative intervention strength and causal effects. The results of a case study verify that the proposed method can identify the causal structure between the transient stability variables entirely from data. In addition, the ranking of causal effects between the “cause” and “outcome” variables of transient stability is revealed. This paper provides a new data-mining approach for uncovering the causal mechanisms between variables in power systems and expands the capabilities of data-driven methods in power system applications.
1 Introduction
With the expansion of power grids, the operational mode of power systems is becoming more complex. Power systems face significant challenges in terms of security and stability, with transient stability assessment being a crucial component. The mechanism model analysis method based on reduction theory plays an important role in transient stability assessment. It includes numerical integration methods, direct methods [1,2,3], etc. However, the effectiveness of mechanism modeling methods relies on accurate models and parameters, which are increasingly difficult to achieve in complex power systems. In addition, numerical integration methods are unable to provide the evaluation results of transient stability directly, and thus they still require people to further analyze the simulation results and data.
In recent years, with the widespread installation of measurement devices in power systems and the improvement of data analysis and processing capabilities, analyzing the complex operational behavior of power systems with data-driven methods has become a research hotspot [4,5,6,7,8]. Artificial intelligence models are representative applications of data-driven methods, and can construct complex mappings between input datasets and sample labels. These data-driven models have many advantages, including direct output of power system transient stability assessment results and a significantly increased evaluation speed through offline training and online matching.
However, these mapping relationships are built under the guiding principle of “correlation” with no regard for “causation”. As a result, predictions made by data-driven models suffer from problems such as poor robustness to out-of-distribution datasets [9,10,11,12] and difficulty of interpretation [13,14,15,16]. This is also why such data-driven assessment methods have not been widely used in safety-sensitive engineering scenarios.
In fact, the “causal relationship” between variables is a characterization of the physical problem. It plays an irreplaceable role in revealing the mechanism of events, guiding intervention behavior, and other aspects. It is also an important carrier of knowledge that is easy for humans to understand [17]. To bridge the gaps in existing data-driven methods, exploring causal relationships from data is a significant requirement in security-sensitive scenarios such as power system transient stability analysis.
The theory of causation developed from statistics is concerned with discovering the causal relationships behind data. Since the causal model was proposed at the end of the twentieth century, relevant research has flourished, providing an important means for data analysis. At present, causal theory has achieved great success in many fields such as economics, education, and sociology [18,19,20,21]. Causal inference is mainly concerned with the discovery of causal structure between variables and the evaluation of causal effects. Causal inference usually requires changing the generation mechanism of the target variables, i.e., an intervention, and this is the key point that distinguishes causality from correlation.
Currently, there are two widely accepted causal models: the Rubin causal model (RCM) and the structural causal model (SCM). RCM, also known as the potential outcomes framework [22], mainly studies the average causal effect between two variables, while SCM, proposed by Pearl [23], uses a causal graph to model the causal relationships between variables. In addition to causal effect estimation, SCM mainly focuses on the problem of causal structure discovery. The classical methods of causal structure discovery can be divided into constraint-based and structural-equation-based methods. The PC [24], IC [25] and FCI [26] algorithms, as representatives of constraint-based methods, have the advantages of handling high-dimensional variables and being applicable to both linear and nonlinear problems, but they can only give equivalence classes of possible causal graphs. Methods based on structural equations, such as the LiNGAM [27] and ANM [28] algorithms, make certain assumptions about the form of the structural equations, which allows them to identify complete causal graphs. However, their applicability is limited by the assumed equation form. New theories and methods of causal inference can provide a reference for revealing causal relationships between transient stability variables from data in modern power systems.
In the power industry, some initial attempts have been made to examine causal relationships. In [29], a reverse information entropy causal inference method (RIECI) is proposed that reveals the asymmetric attributes of the causal relationship between highly correlated pairs of variables in power systems, indicating the feasibility of analyzing causal relationships from operating datasets. However, the transient stability assessment of power systems is a high-dimensional and complicated problem, and exploring complex causal relationships between variables with data-driven methods still faces challenges.
The highlights of this paper are:

(1) An improved causal structure discovery method based on a PC-IGCI algorithm is proposed for datasets of power system transient stability assessment. It addresses the Markov equivalence class shortcoming of the existing constraint-based methods.
(2) The relative average causal effect (RACE) index is proposed to quantitatively evaluate the causal effect under unit intervention strength. This reveals the relationship between the relative intervention intensity of the cause variable and the causal effects.
The rest of the paper is arranged as follows: classical causal inference methods are briefly introduced in Sect. 2, while an improved causality exploring method combining causal structure discovery with causal effect evaluation for transient stability assessment is proposed in Sect. 3. Example simulation and verification are provided in Sect. 4. Section 5 offers the conclusion of this paper.
2 Introduction to classical data-driven causal inference methods
2.1 A causal structure discovery method based on PC algorithm
The Peter-Clark (PC) algorithm proposed in [26] is one of the classic methods for causal structure discovery.
A causal network (also known as a structural causality graph) is represented by a directed acyclic graph (DAG) showing the probability dependencies between variables. It can be represented by a triple G = (V, E, P). Here, V = {v_{1}, v_{2}, …, v_{n}} is the set of all nodes in the DAG, and E = {e(v_{i}, v_{j}) | v_{i}, v_{j} ∈ V} is the set of single-directed edges between pairs of nodes, where e(v_{i}, v_{j}) denotes the causal relationship v_{i} → v_{j} between v_{i} and v_{j}. P = {P(v_{i} | pa_{vi}) | v_{i}, pa_{vi} ∈ V} is the set of conditional probabilities, where pa_{vi} denotes the parents of v_{i}.
The PC algorithm for causal structure construction is divided into two stages. The first stage aims to identify the dependencies between nodes and represent them as an undirected graph. The skeleton of the structural causality graph is constructed in this stage, as shown in Fig. 1a. The second stage aims to infer the direction of causal dependencies between nodes, extending the undirected graph to the DAG as shown in Fig. 1b.
2.1.1 Causal skeleton construction
A conditional independence test is the main method used by the PC algorithm to identify causal dependencies between variables.
The hth order sample partial correlation coefficient between any two variables i and j given the conditioning set k can be estimated by:
The estimate is then mapped to an approximately normally distributed statistic through a Fisher Z-transformation, shown as:
For a given significance level \(\alpha \in (0,1)\), the test rule is shown as:
where Φ is the cumulative distribution function of N(0,1). If (3) holds, the hypothesis that variables i and j are independent given the conditioning set k is accepted.
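The test described above can be sketched in a few lines. The version below is a minimal illustration that assumes the textbook form of the three steps (recursive partial correlation, Fisher Z-transformation, comparison against the normal quantile at level α); the function names and the dictionary-of-columns data layout are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(data, i, j, cond=()):
    """Partial correlation of columns i and j given the columns in cond.

    Assumed recursion: remove one conditioning variable h at a time,
    rho_ij|k = (rho_ij|k\\h - rho_ih|k\\h * rho_jh|k\\h)
               / sqrt((1 - rho_ih|k\\h^2) * (1 - rho_jh|k\\h^2)).
    """
    cond = tuple(cond)
    if not cond:
        return pearson(data[i], data[j])
    h, rest = cond[0], cond[1:]
    r_ij = partial_corr(data, i, j, rest)
    r_ih = partial_corr(data, i, h, rest)
    r_jh = partial_corr(data, j, h, rest)
    return (r_ij - r_ih * r_jh) / math.sqrt((1 - r_ih ** 2) * (1 - r_jh ** 2))


def is_independent(data, i, j, cond=(), alpha=0.05):
    """Accept 'i independent of j given cond' when
    sqrt(n - |cond| - 3) * |Z| <= Phi^{-1}(1 - alpha / 2)."""
    cond = tuple(cond)
    n = len(data[i])
    r = partial_corr(data, i, j, cond)
    r = max(min(r, 0.999999), -0.999999)      # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))     # Fisher Z-transformation
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    return stat <= NormalDist().inv_cdf(1 - alpha / 2)
```

The recursion is exponential in the size of the conditioning set, which is acceptable for the low-order tests that dominate a PC run but would normally be replaced by a matrix-inversion formula for large sets.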
The d-separation criterion [24] can be applied to identify the undirected causal graph. If there is no edge between nodes v_{i} and v_{j} in the causal graph, then there must be a set Z that d-separates v_{i} and v_{j}. Starting with an undirected complete graph, and testing one by one whether each subset of V\{v_{i}, v_{j}} d-separates v_{i} and v_{j}, the causal dependence between v_{i} and v_{j} can be inferred. Removing the edges between variables found to be independent then forms the skeleton of the structural causality graph.
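The edge-removal procedure can be illustrated with a short sketch. It assumes that a conditional independence test is available as a callable `indep_test(vi, vj, cond)` (a hypothetical interface); the order-by-order search over neighbour subsets follows the usual PC scheme rather than any implementation detail given in the paper.

```python
from itertools import combinations


def pc_skeleton(nodes, indep_test, max_order=None):
    """Build the undirected skeleton by deleting edges between nodes
    that are found independent given some subset of their neighbours.

    Returns the adjacency sets and the separating set of every removed edge.
    """
    adj = {v: set(nodes) - {v} for v in nodes}   # start from the complete graph
    sepset = {}
    limit = len(nodes) - 2 if max_order is None else max_order
    for order in range(limit + 1):               # size of the conditioning set
        for vi in nodes:
            for vj in list(adj[vi]):
                others = adj[vi] - {vj}
                if len(others) < order:
                    continue
                for cond in combinations(sorted(others), order):
                    if indep_test(vi, vj, cond):
                        adj[vi].discard(vj)
                        adj[vj].discard(vi)
                        sepset[frozenset((vi, vj))] = set(cond)
                        break
    return adj, sepset
```

Recording the separating sets is a design choice that pays off in the next stage, since the orientation of colliders depends on whether the middle node belongs to the separating set of the two outer nodes.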
2.1.2 Causal direction inference
Some causal directions between cause variables and outcome variables can also be inferred from the results of the conditional independence tests. Consider a skeleton of three variables, v_{i} - v_{k} - v_{j}, where v_{i} and v_{j} are not connected by an edge and are independent, denoted as v_{i}⊥v_{j}. If v_{i} and v_{j} are not independent when conditioned on variable v_{k}, then the causal dependency direction between the variables in the undirected graph v_{i} - v_{k} - v_{j} can only be v_{i} → v_{k} ← v_{j}, as shown in Fig. 2a. This is called a V-structure. The V-structure is a special form in structural causality graphs, and is uniquely identifiable in causal direction identification.
If the conditional independence constraint is v_{i}⊥v_{j} | v_{k}, then there are three possible causal dependence directions between variables v_{i}, v_{j}, and v_{k}, as shown in Fig. 2b. Structures 1 and 2 are called “chain” structures, and structure 3 is called a “fork” structure. These causal directions cannot be uniquely identified.
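The distinction between the V-structure and the chain and fork cases can be made concrete with a small sketch. It assumes a skeleton given as adjacency sets and a dictionary of separating sets keyed by unordered node pairs; both data structures and the function name are illustrative assumptions.

```python
from itertools import combinations


def orient_v_structures(adj, sepset):
    """Return directed edges (cause, effect) for every unshielded triple
    vi - vk - vj whose middle node vk is NOT in the separating set of vi, vj."""
    directed = set()
    for vk in adj:
        for vi, vj in combinations(sorted(adj[vk]), 2):
            if vj in adj[vi]:
                continue                      # vi and vj are adjacent: shielded
            if vk not in sepset.get(frozenset((vi, vj)), set()):
                directed.add((vi, vk))        # vi -> vk
                directed.add((vj, vk))        # vj -> vk
    return directed
```

When v_{k} is in the separating set, the triple is a chain or a fork and the function leaves both edges undirected, which is exactly the ambiguity described above.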
2.1.3 The problem of incomplete causal structure graph caused by Markov equivalence
The generated structural causality graph typically contains both directed edges with nonreversible direction and undirected edges with reversible direction. This can only be called a completed partially directed acyclic graph (CPDAG) rather than a Bayesian network (BN).
For example, consider the CPDAG obtained from variables v_{i}, v_{j}, v_{k} and v_{l} using the d-separation criterion, as shown in Fig. 3a. G_{1} contains only a single V-structure \(v_{j} \to v_{l} \leftarrow v_{k}\), and the two undirected edges \(v_{i} - v_{j}\) and \(v_{i} - v_{k}\) cannot be assigned a definite causal direction by the Meek rules [24]. Therefore, G_{2}, G_{3}, and G_{4}, which have the same skeleton and V-structure, are Markov equivalent and form a Markov equivalence class of the Bayesian network.
Markov equivalence classes are a common problem faced by constraint-based methods. Because the direction of edges in the causal network cannot be completely determined, this not only greatly affects the understanding of the causal structure in the data, but also makes it impossible to infer the causal effect between some key variables.
2.2 Causal effect inference method based on ACE
After obtaining the causal relationship network between the variables, causal inference techniques can quantify the degree to which the “cause” affects the “outcome” in each causal direction, i.e., the causal effect.
The relationship between the outcome variable of sample i and whether it receives the intervention is shown as:
where y_{i} is the outcome variable, and y_{1i} and y_{0i} represent the outcomes of sample i with and without the intervention, respectively. D_{i} ∈ {0,1} is the group indicator, i.e., 1 for the treatment group and 0 for the comparison group. (y_{1i} − y_{0i}) represents the causal effect of the intervention on sample i. However, because of individual differences between samples, applying the same intervention affects the outcomes differently. To minimize the impact of these individual differences, the expectation of the causal effect is taken over all samples, giving the average causal effect (ACE), shown as:
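In standard potential-outcome notation, the two relations described in this passage can be written as follows. This is a reconstruction from the definitions above under the usual conventions of the framework, not a verbatim copy of the paper's displayed equations.

$$y_{i} = y_{0i} + \left( y_{1i} - y_{0i} \right) D_{i}$$

$$ACE = E\left( y_{1i} - y_{0i} \right)$$

With D_{i} = 0 the first relation returns y_{0i}, and with D_{i} = 1 it returns y_{1i}, so only one of the two potential outcomes is ever observed for a given sample.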
The challenge in evaluating causal effects lies in the fact that \(y_{1i}\) and \(y_{0i}\) cannot be observed at the same time. This is the “counterfactual” causal inference framework: a single data sample can only be in one of two states, intervened or not intervened. Once data is collected it is an unchangeable record, so how to implement an “intervention” on data is one of the key concerns of data-driven methods.
Matching estimator methods provide a feasible solution to this problem. If sample i belongs to the treatment group, a sample j in the comparison group is found such that the covariates x (features other than \(\left\{ {y_{i} ,D_{i} } \right\}\)) of sample j are as close as possible to those of sample i, i.e., x_{i} ≈ x_{j}. In this case, \(y_{j}\) can be used as the estimation of y_{0i}, i.e., \(\hat{y}_{0i} = y_{j}\).
In practice, if the dimension of x_{i} is very high, it is difficult to find a similar x_{j} to match x_{i} in the high-dimensional space, and there is a great risk of match failure. In addition, the average causal effect only controls the intervention intensity of cause variables by dividing the samples into a treatment group and a comparison group, and cannot quantitatively reveal the relationship between intervention intensity and causal effect.
3 Improved constraint-inference approach for causality exploration of power system transient stability
To address the shortcomings of the existing causal inference methods introduced in Sect. 2, this section proposes an improved constraint-inference approach for causality exploration of power system transient stability. This includes an improved causal structure discovery method based on the PC-IGCI algorithm, and a causal effect inference method based on propensity score matching with nearest-neighbour matching within caliper (PSM-NNC) and the relative average causal effect (RACE) index.
3.1 Improved causal structure discovery method based on PC-IGCI algorithm
To overcome the Markov equivalence problem, causal function models have been proposed that exploit the distribution characteristics imposed on the data by the causal mechanism. The information geometric causal inference (IGCI) algorithm proposed in [30] is a typical representative of the causal function algorithms. It uses the independence between the distribution of the cause variable and the “cause-effect” function mechanism to determine the causal relationship between variables, and has been proven to be reliable in nonlinear causal direction mining problems.
The IGCI causal determination indicators for variables x and y are as follows:
If \(C_{x \to y} < 0\), then x is inferred to cause y;
If \(C_{x \to y} > 0\), then y is inferred to cause x.
In (6), \(D(\cdot \,\|\, \cdot)\) is the relative entropy distance and \(S( \cdot )\) is the differential entropy. The detailed expressions for these quantities can be found in [30].
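To make the decision rule tangible, the sketch below computes a slope-based IGCI score with a uniform reference measure, which is one of the estimators described in the IGCI literature. The paper defers the exact expressions to [30], so this particular estimator, the rescaling to [0, 1] and the function names are assumptions and may differ from what the authors used.

```python
import math


def _rescale(values):
    """Map values linearly onto [0, 1] (uniform reference measure)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def igci_slope(x, y):
    """Slope-based estimate of C_{x->y}: the mean of log|dy/dx| over
    consecutive points after sorting the rescaled samples by x.

    A negative value is read as evidence for x -> y, matching the
    decision rule stated in the text.
    """
    pairs = sorted(zip(_rescale(x), _rescale(y)))
    total, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:               # skip ties, log is undefined there
            total += math.log(abs(dy / dx))
            count += 1
    if count == 0:
        raise ValueError("not enough distinct points to estimate slopes")
    return total / count
```

For a noise-free monotonic mechanism such as y = x^3 on [0, 1] with evenly spread x, the score is negative in the x → y direction and positive in the reverse direction, which is the asymmetry the method relies on.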
However, the IGCI algorithm is mainly used for determining the causal direction between two variables, because the method itself cannot identify the causal graph skeleton from the data. This makes it difficult to apply to the identification of structural causality between high-dimensional variables. In addition, if there is no causal relationship between two variables in the first place, reliable results cannot be obtained using the IGCI method alone.
The improved causal structure discovery method based on PC-IGCI proposed in this paper uses the IGCI method to extend causal discovery on the basis of the CPDAG generated by the PC algorithm: it determines the causal directions of the undirected edges and generates a causal network expressed as a DAG. This overcomes the deficiency of causal structure discovery methods that rely on constraints alone, and improves the conditions for applying IGCI through the prior construction of the causal skeleton, making the causal direction recognition results more reliable.
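The combination step can be sketched as a post-processing pass over the CPDAG. The sketch assumes that the directed and undirected edges of the CPDAG are already available and that a pairwise direction score with the convention "negative means first argument causes second" is passed in as `score`. Flipping an edge that would close a cycle is an added safeguard that the paper does not describe.

```python
def _reachable(edges, src, dst):
    """Depth-first search: is dst reachable from src along directed edges?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(t for s, t in edges if s == node)
    return False


def orient_remaining_edges(undirected_edges, directed_edges, data, score):
    """Orient every undirected CPDAG edge with a pairwise direction score.

    score(x, y) < 0 is read as x -> y, otherwise y -> x.
    """
    result = set(directed_edges)
    for a, b in undirected_edges:
        cause, effect = (a, b) if score(data[a], data[b]) < 0 else (b, a)
        if _reachable(result, effect, cause):
            # Assumption: keep the graph acyclic by reversing the edge.
            cause, effect = effect, cause
        result.add((cause, effect))
    return result
```

Because the score is only evaluated on pairs that the skeleton already connects, the pairwise method is never asked to decide a direction between variables that have no direct dependence.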
3.2 Causal effect inference method based on PSM-NNC method and RACE index
To solve the problem of matching samples between the treatment and comparison groups, which is caused by the high-dimensional characteristics of samples in the evaluation of causal effect, this paper proposes a sample matching method based on the propensity score. In addition, because data-driven transient stability evaluation datasets have a large number of samples, a one-to-many sample matching method can make the evaluation of causal effect more accurate. Here the nearest-neighbour matching within caliper algorithm is introduced. The resulting method is called the PSM-NNC method.
The propensity score is the conditional probability that sample i enters the treatment group, i.e., \(p(x_{i}) = P(D_{i} = 1 \mid x = x_{i})\). The main steps for calculating ACE through PSM-NNC are as follows:

(1) Select covariates x_{i}. Include as many variables as possible that may affect \(\left( {y_{0i} ,y_{1i} } \right)\) and \(D_{i}\).
(2) Estimate the propensity score. Here logistic regression is used to establish a regression model that evaluates the likelihood of each sample receiving the intervention.
(3) Perform propensity score matching. The nearest-neighbour matching within caliper algorithm is introduced, i.e., finding the closest match within the tolerance range γ. This can improve the efficiency of sample matching and the accuracy of causal effect estimation.
(4) Calculate the ACE estimate for the matched samples, as:
$$ACE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {\hat{y}_{1i} - \hat{y}_{0i} } \right)}$$(7)
where N is the sample size, \(\hat{y}_{1i}\) and \(\hat{y}_{0i}\) are the estimates of \(y_{1i}\) and \(y_{0i}\), respectively.
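Steps (3) and (4) can be sketched as follows. The sketch assumes the propensity scores from step (2) are already available (the logistic regression itself is not shown), matches with replacement, and drops samples with no opposite-group neighbour inside the caliper. The parameter names `caliper` and `k`, and the choice to average over matched samples only, are illustrative assumptions.

```python
def ace_psm_nnc(scores, treated, outcomes, caliper=0.05, k=1):
    """Estimate ACE by nearest-neighbour matching within a caliper.

    scores   : propensity score of each sample
    treated  : 1 for the treatment group, 0 for the comparison group
    outcomes : observed outcome y_i of each sample
    k        : number of nearest opposite-group neighbours to average
    Returns (ACE estimate, number of matched samples).
    """
    effects = []
    n = len(scores)
    for i in range(n):
        # Opposite-group samples whose score lies within the caliper.
        candidates = [j for j in range(n)
                      if treated[j] != treated[i]
                      and abs(scores[j] - scores[i]) <= caliper]
        if not candidates:
            continue                           # unmatched sample is discarded
        candidates.sort(key=lambda j: abs(scores[j] - scores[i]))
        nearest = candidates[:k]
        counterfactual = sum(outcomes[j] for j in nearest) / len(nearest)
        if treated[i]:
            y1_hat, y0_hat = outcomes[i], counterfactual
        else:
            y1_hat, y0_hat = counterfactual, outcomes[i]
        effects.append(y1_hat - y0_hat)
    if not effects:
        raise ValueError("no sample could be matched within the caliper")
    return sum(effects) / len(effects), len(effects)
```

Setting `k` above 1 gives the one-to-many matching mentioned in the text, while a tighter `caliper` trades a smaller matched sample for closer matches.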
To reveal the impact of intervention intensity on causal effects, let \(E(C_{i} \mid D_{i} = 0)\) and \(E(C_{i} \mid D_{i} = 1)\) denote the expected values of the cause variable in the comparison and treatment groups, respectively. The difference between them indirectly reflects the average intervention strength on the cause variable, represented by \(Do( \cdot )\), i.e.:
Based on quantitative evaluation of intervention intensity, RACE is proposed for calculating ACE under unit intervention strength. This can help to compare and analyse the causal effect under relative intervention strength of the different cause variables. The calculation equation is shown as:
The RACE index can be used to rank the influence of different cause variables on the transient stability results when interventions of the same relative intensity are applied to them. This provides a basis for explaining the influence of different causes on the transient stability evaluation results from the perspective of causal effect.
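The intervention strength described above is simply the difference between the two group means of the cause variable, and can be computed directly. The ratio in the second function is only an illustration of "effect per unit of strength": the exact normalisation used in the RACE equation cannot be recovered from the text here, so that function and both names are assumptions.

```python
def intervention_strength(cause_values, groups):
    """Do(C) = E(C | D = 1) - E(C | D = 0), estimated by group means."""
    treated = [c for c, d in zip(cause_values, groups) if d == 1]
    control = [c for c, d in zip(cause_values, groups) if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def effect_per_unit_strength(ace, strength):
    """Assumed form of a RACE-like index: ACE divided by the strength
    measure supplied by the caller (absolute or relative)."""
    if strength == 0:
        raise ValueError("intervention strength must be non-zero")
    return ace / strength
```

For instance, group means of 2.27 pu and 1.74 pu give an intervention strength of 0.53 pu, which are the figures quoted for P_{G0} in the case study.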
3.3 Sample set construction of power system stability assessment used for causality inference
Although measured data with unknown generation mechanisms can be directly used to construct sample sets for causal inference, large perturbations rarely occur in actual power systems and there are few examples of transient instability. Therefore, constructing a transient stability assessment sample set by simulation is an effective way to overcome the shortage of measured data.
Many factors affect the transient stability of power systems. They can be classified as operating variables, which change with operating conditions, and structure variables, which are determined by the grid structure and component parameters. Usually, some operating variables need to be specified in the simulation, such as P_{G0}, U_{G0}, P_{L0}, Q_{L0}, T_{f}, and F_{l}. P_{G0} represents the set of generator steady-state active power variables, U_{G0} is the set of initial voltages of generation nodes, P_{L0} and Q_{L0} represent the active and reactive load levels before the disturbance, T_{f} represents the fault duration, and F_{l} represents the location of the fault. The fault location can be expressed by continuous variables such as electrical distance, or by discrete coding.
The samples of other operating variables need to be generated through power flow equations or calculated from dynamic simulation results, such as Q_{G0}, δ_{0}, V_{0}, θ_{0}, SI, UI, and S_{sys}. Q_{G0} represents the set of generator steady-state reactive output variables, δ_{0} represents the initial generator power angle, while V_{0} and θ_{0} represent the pre-disturbance voltage amplitude and phase angle of load and contact nodes, respectively. SI and UI represent the stability margin evaluation and instability evaluation indices of the generator. These quantitatively evaluate the transient stability margin of each generator based on the trajectory information after the disturbance, and the calculation method is described in [31]. S_{sys} is the system stability flag, which takes the value 0 for instability and 1 for stability; the maximum relative power angle difference can be used as the criterion.
If the impact of structure variables on transient stability assessment is not of concern, the structure variables can be specified as constants, such as X_{d}^{’}, T_{j}, Z_{l}, where X_{d}^{’} is the d-axis transient reactance of the generator, T_{j} is the generator inertia time constant, and Z_{l} represents the impedance parameters of transmission lines.
Numerous data samples are generated by perturbing some of the specified operating variables and the structure variables of concern. Each perturbed variable is assumed to follow a Gaussian distribution with expectation \(\mu = \mu^{\text{sp}}\) and standard deviation \(\sigma = \sigma^{\text{sp}}\), where \(\mu^{\text{sp}}\) and \(\sigma^{\text{sp}}\) are the given values of the expectation and standard deviation, respectively. A sample set is constructed by combining a total of m perturbations. The original sample set matrix \({\text{X}}_{{{\text{ori}}}}\) used for transient stability causality inference can then be denoted as:
where \(O_{{{\text{ori}}}}^{{{\text{spec}}}}\) represents the samples of specified operating variables, \(O_{{{\text{ori}}}}^{{{\text{cal}}}}\) represents the calculated samples of operating variables, and \(S_{{{\text{ori}}}}^{{{\text{spec}}}}\) represents the samples of perturbed structure variables.
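The sampling step can be sketched as below. The sketch assumes that the sample set is formed by placing the three blocks side by side for each of the m perturbations. The calculated operating variables would come from power flow and dynamic simulation, which is not reproduced here, so they are passed in as ready-made rows. Function names and the example means and standard deviations are hypothetical.

```python
import random


def sample_specified_variables(spec, m, seed=0):
    """Draw m Gaussian perturbations for each specified variable.

    spec maps a variable name to (mu_sp, sigma_sp).
    Returns a list of m dictionaries, one per perturbation.
    """
    rng = random.Random(seed)                  # seeded for reproducibility
    return [{name: rng.gauss(mu, sigma) for name, (mu, sigma) in spec.items()}
            for _ in range(m)]


def assemble_sample_set(o_spec, o_cal, s_spec):
    """Join specified operating, calculated operating and structure
    variable samples row by row into one sample set."""
    return [{**a, **b, **c} for a, b, c in zip(o_spec, o_cal, s_spec)]


# Hypothetical settings: per-unit active power and fault duration in seconds.
operating_spec = {"P_G0": (1.0, 0.1), "T_f": (0.2, 0.02)}
structure_spec = {"T_j": (10.0, 1.0)}
```

Keeping the specified and calculated blocks separate until the final join mirrors the distinction the text draws between variables that are set before the simulation and those that result from it.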
In a causal network, the observed variables can be divided into an exogenous variable set U and an endogenous variable set V, according to whether they can be determined or influenced by other variables. The variables in V are often the ones that need to be interpreted. Dividing variables into endogenous and exogenous marks them as either explanatory or to be explained, which helps to gain a deeper understanding of the identified structural causal relationships.
4 Case study
To comprehensively test the effectiveness of the causality exploration method proposed in this paper, this section first takes a single-machine infinite-bus (SMIB) system as an example to verify the feasibility of discovering the causal structure and evaluating causal effects from operational datasets. Then the case of WSCC9 is used to verify the capability of discovering high-dimensional causal relationships.
4.1 Causal inference of transient stability in SMIB system
4.1.1 Case settings
The topology of the SMIB system is shown in Fig. 4. The system reference voltage is 230 kV, and the generator uses the classical second-order model. A three-phase short-circuit fault at 50% of the branch from Bus 2 to Bus 3 is applied, with simulation starting at 0 s and a time step of 0.01 s.
Following the sample generation method introduced in Sect. 3.3, the original sample set matrix X_{ori} for transient stability causal inference is constructed. It contains 3 specified operating variables (P_{G0}, U_{G0} and T_{f}), 8 calculated operating variables (Q_{G0}, δ_{0}, V_{03}, \(\theta_{01}\), \(\theta_{02}\), SI, UI, and S_{sys}), and 2 structure variables of concern (T_{j} and X_{d}^{’}). The dataset consists of 5000 samples.
4.1.2 Causal structure discovery test on data sets
The causal structure discovery method described in Sect. 3.1 is applied to X_{ori}, with the significance level set to \(\alpha = 0.05\), to identify the causal structure. The results are characterized as a structural causality graph, denoted as G_{ori}. There is no undirected edge caused by Markov equivalence in G_{ori}, which means that the causal relationships in the graph are determined. The structural causality graph of the SMIB system is shown in Fig. 5.
In Fig. 5, the variables in purple boxes represent exogenous variables, and those in red circles represent endogenous variables. According to the causal structure discovery results based on PC-IGCI, the specified variables are all identified as exogenous variables, which is consistent with the generation mechanism of the dataset.
The oriented edges fall mainly into two categories. One is from exogenous variables to endogenous variables, which reveals the direct causal relationship between the specified operating parameters and the result of transient stability assessment, such as T_{j} → S_{sys} and T_{f} → SI, represented by the blue arrows. The other category is from endogenous variables to endogenous variables, which indicates that the implied structural causal relationships among the transient stability variables can also be identified, such as \(Q_{\text{G0}} \to V_{02} \to S_{\text{sys}}\), represented by the green arrows.
To examine the influence of different operating conditions on causal structure discovery, the causal structures among the transient stability variables are evaluated to determine whether they are stable and widely supported across datasets, or sensitive to changes in the distribution characteristics of the variables caused by changing operating conditions. The causality support rate (CSR) index defined in (11) is put forward to quantitatively evaluate the degree of support for each causal rule over the overall datasets, as:
where \(p_{j}^{(i)}\) indicates whether the jth causal rule exists in graph G_{i}: \(p_{j}^{(i)}\) is set to 1 if it exists and to 0 otherwise. N is the number of datasets, which may be generated by the same or different mechanisms.
The degree of directional asymmetry (DDA) index is proposed in (12). This is calculated based on CSR_{j}. From the perspective of asymmetric causal directions, the support degree of causal directions based on datasets of different generation mechanisms is evaluated.
where \(CSR_{j+}\) represents the forward support rate of the jth causal rule, and \(CSR_{j-}\) represents the reverse support rate of the jth causal rule.
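Both indices can be sketched in a few lines. CSR is taken to be the fraction of graphs that contain a rule, as the description of \(p_{j}^{(i)}\) and N suggests, and DDA is assumed to be the absolute difference between the forward and reverse support rates. The second form is an assumption, chosen because it reproduces the 40%, 30% and 0.1 figures quoted in the discussion of Table 1. Graphs are represented as sets of (cause, outcome) pairs, which is an illustrative choice.

```python
def csr(rule, graphs):
    """Causality support rate: share of graphs that contain the rule."""
    return sum(1 for g in graphs if rule in g) / len(graphs)


def dda(rule, graphs):
    """Degree of directional asymmetry, assumed here to be
    |CSR of cause -> outcome  -  CSR of outcome -> cause|."""
    cause, outcome = rule
    return abs(csr((cause, outcome), graphs) - csr((outcome, cause), graphs))
```

A rule found in the forward direction in every graph and never reversed scores 1, whereas a rule whose direction flips from one dataset to the next scores close to 0 even when the two variables are always connected.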
The variables affecting the operating condition of the power system are taken to be P_{G0}, U_{G0} and T_{f}. Each variable is perturbed separately while the distributions of the other variables are kept fixed. Three perturbation amplitudes of −10%, 5% and 10% relative to the original scenario are considered, so 9 extended datasets are formed, denoted as X_{exti} (i = 1, 2, …, 9). The overall datasets, denoted as X_{all}, comprise the original and extended datasets, so X_{all} contains 10 data subsets.
The reliability of all cause → outcome oriented edges generated by PC-IGCI is assessed using CSR and DDA, and the assessment results are shown in Table 1.
As shown in Table 1, a total of 28 causal relationships are identified across the data subsets, and they are arranged in descending order of the DDA index. The DDA values of relationships No. 1 to 24 are all above 0.6, which illustrates the high support for these causal relationships in the datasets and indicates the relative certainty of the causal direction among the variables. In contrast, for causal relationship No. 28, although the forward support rate reaches 40%, the reverse support rate is also as high as 30%, so the DDA is only 0.1. The degree of causal information asymmetry is low, so this causal relationship cannot be effectively confirmed. For causal relationships No. 25 to 28, the DDA indices are all no more than 0.3, and such causal relationships are regarded as unreliable.
There is a fundamental difference between correlation and causality. For causal relationship No. 27, the Pearson correlation coefficient between θ_{01} and θ_{02} is 0.987, showing an extremely strong correlation. However, the DDA index is only 0.1, because the two variables have similar data generation mechanisms and distribution characteristics. Figure 5 also shows that this causal relationship is not supported in dataset X_{ori}.
To analyze the influence of mixed operational modes on the discovery of causal structure, 5000 samples are drawn at random from X_{all}, and this sampling with replacement is repeated 10 times to generate 10 data subsets. PC-IGCI is applied to each data subset, and the CSR and DDA indices are used to assess the identified causal structures when the data mixes multiple operational modes.
Under these conditions, the first 24 causal structures in Table 1 are effectively identified, with DDA indices greater than 0.8, while causal rules No. 25–27 are not identified in any data subset. For the causal relationship UI → SI, CSR_{+}/CSR_{−} is 30/20, and the DDA index remains 0.1. This shows that even if the distribution of the overall dataset is the same as X_{all}, adopting different data subset partitioning rules leads to different distribution characteristics of the variables, thus affecting the results of causal structure discovery.
4.1.3 Causal effect inference in SMIB
Based on the causal relationships shown in Table 1, X_{all} is designated as the data source for causal effect inference. The cause variables of interest (such as \(T_{\text{j}}\), \(T_{\text{f}}\), \(P_{\text{G0}}\), \(\delta_{0}\)) and the outcome variables of interest (transient stability margin index SI and stability label S_{sys}) are selected to form 7 pairs of causal relationships. In addition, the intervention direction for the cause variables is set to increase, and the sample group division is shown as:
where \(C_{i}^{(j)}\) represents the jth observation data of the ith cause variable. \(D_{i}^{(j)}\) is the corresponding group label for \(C_{i}^{(j)}\), while \(D_{i}^{(j)} = 1\) indicates that the jth observation sample belongs to the treatment group and \(D_{i}^{(j)} = 0\) indicates that the jth observation sample belongs to the comparison group.
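The group division can be sketched as a simple threshold rule. Since the intervention direction is "increase", the sketch assumes that observations above a threshold form the treatment group. The threshold itself is a hypothetical parameter, as is the choice to place observations equal to the threshold in the comparison group.

```python
def assign_groups(cause_values, threshold):
    """Label each observation: 1 (treatment) if above the threshold,
    0 (comparison) otherwise. Ties go to the comparison group (assumption)."""
    return [1 if c > threshold else 0 for c in cause_values]


def group_means(cause_values, groups):
    """Mean of the cause variable in the treatment and comparison groups."""
    treated = [c for c, d in zip(cause_values, groups) if d == 1]
    control = [c for c, d in zip(cause_values, groups) if d == 0]
    return sum(treated) / len(treated), sum(control) / len(control)
```

The two group means returned by `group_means` are the quantities whose difference gives the average intervention strength on the cause variable.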
The causal effect inference results are shown in Table 2. An increase in T_{j} has a positive effect on the transient stability margin of the generator, with an overall causal effect of 1.43 over the entire sample set. That is, when all samples with T_{j} > 20 s are compared with those with T_{j} < 20 s, the average change in the stability index SI resulting from the increase in T_{j} is 1.43, which, given the meaning of SI, implies an increase in transient stability margin. The ACE index represents the “average intervention effect” of the cause variable alone, under a variety of complex combinations of transient stability operating conditions and fault information, while minimizing the impact of other operating condition differences on the response variables.
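The group division and the resulting average effect can be sketched as below. The threshold rule in `assign_groups` and the plain difference in group means in `naive_ace` are assumptions made for illustration. The estimator used in the paper also controls for other operating-condition differences, which this naive version does not, and the data are synthetic.

```python
import numpy as np

def assign_groups(cause, threshold):
    """Label D = 1 (treatment) above the threshold, D = 0 (comparison) otherwise."""
    return (cause > threshold).astype(int)

def naive_ace(outcome, group):
    """Difference between treatment-group and comparison-group mean outcomes."""
    return float(outcome[group == 1].mean() - outcome[group == 0].mean())

# Synthetic example: the outcome rises by 0.3 per unit of the cause variable.
rng = np.random.default_rng(0)
cause = rng.uniform(0.0, 10.0, 10_000)
outcome = 0.3 * cause + 0.1 * rng.normal(size=cause.size)

group = assign_groups(cause, threshold=5.0)
ace = naive_ace(outcome, group)
```

With this synthetic setup the two groups differ by about 5 units in the cause variable on average, so the estimated effect should come out close to 1.5.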
Similarly, the ACE index on T_{j} → S_{sys} is 0.36, indicating that an increase in the inertia leads to an average increase of 36% in the probability of the system remaining stable. Conversely, the causal effect of an increase in the fault duration T_{f} on stability margin is − 0.18. It can be inferred from the dataset as a general conclusion that an increase in fault duration is detrimental to transient stability of power systems.
Taking the initial generation P_{G0} as an example, the expectation value of the samples that receive intervention is 2.27 pu, and that of the samples that do not receive intervention is 1.74 pu. Thus the average intervention strength is 0.53 pu, and the causal effect on the stability index SI is − 2.47. As a result, the probability of power system maintaining transient stability after intervention will decrease by 72%.
From the perspective of effect, interventions on different cause variables can be compared directly even though the variables have different dimensions. Based on data from the treatment and comparison groups, an intervention of 0.53 pu on P_{G0} has a greater causal effect on the transient stability margin of the power system than an intervention of 0.06 s on the fault duration T_{f}. In this sense, the former is the stronger intervention and is more detrimental to transient stability of the power system.
To compare the causal effects of applying the same relative intervention intensity to different cause variables, Table 2 arranges the causal relationships in descending order of the magnitude of RACE. For the causal relationships related to SI, the RACE values of T_{f}, P_{G0}, T_{j}, and δ_{0} are − 30, − 4.67, 0.152, and − 0.148, respectively. This indicates that when the same relative intervention intensity is applied to these four cause variables, T_{f} has the strongest causal effect, with a magnitude much larger than those of the other three.
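As a quick consistency check on the figures quoted for P_{G0}, and assuming that RACE is the ACE divided by the average intervention strength (an interpretation suggested by the reported values, not a definition restated from the paper), the arithmetic is:

\[
\Delta P_{\text{G0}} = 2.27 - 1.74 = 0.53\ \text{pu}, \qquad
\frac{\text{ACE}}{\Delta P_{\text{G0}}} = \frac{-2.47}{0.53} \approx -4.66 ,
\]

which agrees with the reported RACE of − 4.67 up to rounding.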
4.1.4 Causal effect reliability test
This paper conducts two refutation experiments to test the causal relationships: adding random confounding factors and applying placebo interventions. Causal relationships identified from observational datasets are difficult to prove true, but their falsity can be revealed by abnormal behavior of the model in refutation tests. If a causal relationship is correctly identified, the causal effect estimated after adding a random confounding factor should be very close to the original result. The placebo intervention replaces the identified cause variable with an independent random variable and recalculates the causal effect. If the effect decreases greatly or even approaches 0, the original causal relationship is considered relatively reliable. Table 3 shows some of the refutation test results, and indicates that all the causal relationships represented in Fig. 5 have passed the refutation tests.
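The logic of the two refutation tests can be sketched on synthetic data. This is a minimal illustration under stated assumptions: the effect is estimated with a simple regression adjustment (`adjusted_effect`, a hypothetical helper), the placebo is built by permuting the treatment labels, and the data-generating model with a true effect of 2.0 is invented for the example. None of this reproduces the estimator or data used in the paper.

```python
import numpy as np

def adjusted_effect(treatment, outcome, covariates):
    """OLS coefficient of the treatment after adjusting for the covariates."""
    design = np.column_stack([np.ones(len(outcome)), treatment, covariates])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return float(coef[1])

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)                              # observed confounder
d = (z + rng.normal(size=n) > 0).astype(float)      # treatment depends on z
y = 2.0 * d + 1.0 * z + rng.normal(size=n)          # true effect of d is 2.0

# Original estimate, adjusting for the observed confounder.
original = adjusted_effect(d, y, z[:, None])

# Refutation 1: add an irrelevant random covariate; the estimate should barely move.
random_cov = rng.normal(size=n)
with_random = adjusted_effect(d, y, np.column_stack([z, random_cov]))

# Refutation 2: replace the treatment with a placebo (permuted labels);
# the estimated effect should collapse towards zero.
placebo = adjusted_effect(rng.permutation(d), y, z[:, None])
```

A relationship passes when `with_random` stays close to `original` and `placebo` is close to zero. A large shift in the first case, or a persistent effect in the second, would cast doubt on the identified relationship.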
4.1.5 The influence of changing intervention strength on transient stability
Interventions of different strengths are further applied to the cause variables that affect power system transient stability. Observing the resulting changes in the “causal effects” demonstrates a feasible approach for revealing the physical mechanisms of transient stability from power system operating datasets.
First, the samples in the datasets are divided into 5 groups based on different value ranges of the cause variables. The first group is the comparison group, and the other 4 groups are treatment groups ordered by intervention strength from low to high. The variable interval division is shown in Table 4, and the causal effects on \(T_{\text{j}} \to SI\), \(T_{\text{j}} \to S_{\text{sys}}\), \(T_{\text{f}} \to SI\), \(T_{\text{f}} \to S_{\text{sys}}\), \(P_{\text{G0}} \to SI\), and \(P_{\text{G0}} \to S_{\text{sys}}\) under different intervention strengths are shown in Fig. 6.
As can be seen from Fig. 6a and b, when different strengths of intervention are applied to T_{j}, i.e., when \(Do(T_{{\text{j}}} )\) takes values of 2 s, 4 s, 5.9 s, and 7.9 s, the causal effect on the transient stability of the power system gradually increases. On the one hand, it intuitively shows that an increase in T_{j} has a positive causal effect on transient stability, while on the other hand, it shows that as the intensity of intervention on variable T_{j} increases, the causal effect will also change monotonically. Similarly, as shown in Fig. 6c and d, the causal effect of P_{G0} on SI and S_{sys} varies monotonically with the intervention intensity, but in contrast to T_{j}, positive intervention on P_{G0} has a negative causal effect. This implies that the average causal effect based on counterfactual inference can correctly reflect the effect strength of “cause” on “outcome”.
Figure 6e and f show the changes in the causal effect on power system transient stability when interventions of different strengths are applied to T_{f}. As Do(T_{f}) increases from 0.02 s to 0.08 s, the causal effect on SI decreases from − 0.117 to − 3.375, and the causal effect on S_{sys} decreases from − 0.032 to − 0.683. The relationship between the intervention strength on T_{f} and the causal effect is also markedly nonlinear. This can be interpreted as follows: as Do(T_{f}) increases from 0.04 s to 0.06 s, a large number of samples originally considered to be transiently stable in the datasets are inferred to be in a critical unstable state, and thus the probability of instability increases significantly. The stability margin of unstable samples becomes 0 and no longer changes with further increases in intervention strength. Therefore, when Do(T_{f}) increases from 0.06 s to 0.08 s, the causal effect barely changes.
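The grouping by intervention strength used in this subsection can be sketched as follows. The function `effects_by_strength`, the interval edges, and the synthetic outcome (a margin that falls linearly and is clipped at zero, chosen to mimic the saturation described for T_{f}) are all assumptions for illustration. The effect here is a plain difference in group means, not the paper's estimator.

```python
import numpy as np

def effects_by_strength(cause, outcome, edges):
    """Compare each treatment interval with the first (comparison) interval.

    Returns (intervention strength, effect) pairs, where strength is the
    difference in mean cause value and effect the difference in mean outcome.
    """
    group = np.digitize(cause, edges[1:-1])        # 0 = comparison group
    base_c = cause[group == 0].mean()
    base_y = outcome[group == 0].mean()
    rows = []
    for g in range(1, len(edges) - 1):
        mask = group == g
        rows.append((float(cause[mask].mean() - base_c),
                     float(outcome[mask].mean() - base_y)))
    return rows

# Synthetic data: the margin decreases with the cause and saturates at zero.
rng = np.random.default_rng(0)
cause = rng.uniform(0.0, 10.0, 10_000)
outcome = np.maximum(0.0, 3.0 - 0.5 * cause) + 0.05 * rng.normal(size=cause.size)

rows = effects_by_strength(cause, outcome, edges=[0, 2, 4, 6, 8, 10])
strengths = [r[0] for r in rows]
effects = [r[1] for r in rows]
```

In this toy setup the effect grows in magnitude over the first treatment groups and then flattens once the clipped margin has reached zero, which is the same qualitative pattern as a causal effect that stops changing at high intervention strength.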
4.2 Causal inference of transient stability for multi-machine systems
To verify the effectiveness of the proposed method in solving the high-dimensional and complex problem of transient stability in power systems, further analysis is conducted based on the WSCC 9-bus standard test case, as shown in Fig. 7. The same data perturbation method as described in Sect. 3.3 is used to construct the observation datasets, with a sample size of \(S = 10^{4}\). In this case, variables are perturbed with a standard deviation taken as 10% to 15% of the expected value. The fault set contains three faults, which are three-phase short circuits at Bus 5, Bus 7, and Bus 9 at 0 s.
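A dataset of this kind could be generated along the following lines. The sketch assumes Gaussian perturbations around the expected values, which is an assumption, since the perturbation method of Sect. 3.3 is not restated here. The function name `perturb_dataset` and the nominal values are placeholders rather than WSCC 9-bus parameters.

```python
import numpy as np

def perturb_dataset(nominal, ratios, n_samples=10_000, seed=0):
    """Sample each variable around its expected value.

    The standard deviation of each variable is ratio * |expected value|.
    """
    rng = np.random.default_rng(seed)
    nominal = np.asarray(nominal, dtype=float)
    sigma = np.asarray(ratios, dtype=float) * np.abs(nominal)
    return rng.normal(loc=nominal, scale=sigma, size=(n_samples, nominal.size))

# Placeholder expected values and per-variable ratios between 10% and 15%.
nominal = [1.0, 2.0, 5.0]
ratios = [0.10, 0.12, 0.15]
samples = perturb_dataset(nominal, ratios)
```

Each row of `samples` would then serve as one operating condition to be simulated under each fault in the fault set.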
The PCIGCI method is applied to discover the causal structure of the WSCC 9-bus system, as shown in Fig. 8. The causal graph contains a total of 58 causal relationships, and the causalities show significant multi-machine coupling characteristics compared with the SMIB system, revealing a complex causal mechanism of a transient stability problem in multi-machine power systems.
Taking the transient stability margin SI_{2} of generator 2 for further analysis, the sample set is first screened according to the location of the fault, and only those samples with faults occurring at Bus 7 are retained. The causal effects of the fault duration \(T_{\text{f}}\), the inertial time constants \(T_{\text{j1}}\), \(T_{\text{j2}}\), \(T_{\text{j3}}\), and the d-axis transient reactances \(X^{\prime}_{\text{d1}}\), \(X^{\prime}_{\text{d2}}\), \(X^{\prime}_{\text{d3}}\) on SI_{2} are examined. The causal effects ACE and the causal effects under per unit intervention strength RACE are shown in Table 5, and have been arranged in descending order of RACE. As is seen from Table 5, the causal effect of \(T_{\text{j2}}\) on SI_{2} is the largest and the causal effect of \(X^{\prime}_{\text{d3}}\) on SI_{2} is the smallest under the applied unit intervention.
A time-domain simulation model of the standard WSCC 9-bus system is also set up. The fault location is set at Bus 7, and the 7 cause variables listed in Table 5 are positively perturbed to 1.1 times the values in the standard case. The stability margin change indicator \(\Delta SI_{2}\) of generator 2 is shown in Table 6.
Table 6 shows that the ranking of the \(\Delta SI_{2}\) index is the same as that of RACE, indicating that the RACE index can evaluate the causal effect of counterfactual inference on variables with different dimensions. It can therefore provide methodological support for further revealing the physical mechanisms of power system transient stability.
5 Conclusion
The current data-driven transient stability assessment methods mainly focus on constructing correlation relationships between variables. Because they neglect the causal relationships between variables, they face poor robustness and difficulty in interpretation. This restricts engineering application. Combined with the new advances in causal theory, this paper takes the power system transient stability problem as the object, and proposes improved methods for discovering causal structure and inferring causal effects based on operational datasets. The main conclusions are:

(1) An improved causal structure discovery method based on the PCIGCI algorithm for datasets of power system transient stability assessment is proposed. This addresses the shortcomings of Markov equivalence and massive variables. It proves the feasibility of discovering the causal structure based on operational datasets.
(2) The RACE index is proposed to quantitatively evaluate the causal effect under unit intervention. It has the ability to evaluate the causal effect of counterfactual inference on different dimensional variables. RACE can be used to reveal the sorting of the causal effects, and to provide an approach for explaining the influence of different causes on the evaluation results of transient stability.
(3) Exploring the causal relationship between variables of transient stability assessment based on data expands the capabilities of data-driven methods and helps to understand the deeper mechanisms in the power system transient stability problem.
In future research, there exists the potential to develop a stable learning model for transient stability assessment that uses causal relationships. This approach would mitigate the incorporation of irrelevant local features from the data, facilitate the integration of identified causal relationships as constraints during the data-driven model learning phase, and effectively eliminate spurious correlations among variables. As a result, this strategy could effectively curtail overfitting tendencies and bolster the overall robustness of the model.
Availability of data and materials
Please contact the authors for data and material requests.
References
Chen, Y., Shen, C., & Wang, J. (2009). Distributed transient stability simulation of power systems based on a Jacobian-free Newton-GMRES method. IEEE Transactions on Power Systems, 24(1), 146–156.
Magnusson, P. C. (1947). The transient-energy method of calculating stability. Transactions of the American Institute of Electrical Engineers, 66(1), 747–755.
Xue, Y. (1992). Extended equal area criterion revisited. IEEE Transactions on Power Systems, 7(3), 1012–1022.
An, J., Zhang, L., Zhou, Y., & Yu, J. (2022). Transient stability margin prediction under the concept of security region of power systems based on the long short-term memory network and attention mechanism. Frontiers in Energy Research, 10(332), 1–15.
Zhang, R. Y., Wu, J. Y., Li, B. Q., & Shoa, M. (2020). Self-adaptive power system transient stability prediction based on transfer learning. Power System Technology, 44(6), 2196–2205.
Zhou, Z. H., Bu, G. Q., Ma, S. C., Luo, Y., & Han, N. (2021). Assessment and optimization of power system transient stability based on feature-separated neural networks. Power System Technology, 45(9), 3658–3667.
An, J., Yu, J., Li, Z., Zhou, Y., & Mu, G. (2020). A data-driven method for transient stability margin prediction based on security region. Journal of Modern Power Systems and Clean Energy, 8(6), 1060–1069.
Zhang, L., An, J., Zhou, Y. B. (2023) Transient stability evaluation of power system based on temporal convolution and graph attention network. In: Automation of electric power systems (pp. 1–12), Available: http://kns.cnki.net/kcms/detail/32.1180.tp.20221116.1730.010.html
Bagnell, J. A. (2005) Robust supervised learning. In Proceedings of the 20th national conference on artificial intelligence (pp. 714–719), Menlo Park, CA.
Hua, W. H., Niu, G., Sato, I., Sugiyama, M. (2018) Does distributionally robust supervised learning give robust classifiers?. In Proceedings of the 35th international conference on machine learning (pp. 2029–2037), Cambridge MA.
Rahimian, H., Mehrotra, S. (2019) Distributionally robust optimization: A review. [Online]. arXiv preprint arXiv:1908.05659.
Xu, H., Ma, Y., Liu, H. C., Deb, D., Liu, H., Tang, J. L., & Jain, A. K. (2020). Adversarial attacks and defenses in images, graphs and text: a review. International Journal of Automation and Computing, 17, 151–178.
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, Piscataway (pp. 618–626).
Schwab, P., & Karlen, W. (2019). CXPlain: Causal explanations for model interpretation under uncertainty. Advances in Neural Information Processing Systems, 32, 10220–10230.
Madumal, P., Miller, T., Sonenberg, L. (2020). Explainable reinforcement learning through a causal lens. In Proceedings of the 34th AAAI conference on artificial intelligence, Palo Alto, CA (pp. 2493–2500).
Kanamori, K., Takagi, T., Kobayashi, K., Ike, Y., Uemura, K., & Arimura, H. (2021). Ordered counterfactual explanation by mixed-integer linear optimization. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 13, pp. 11564–11574).
Glymour, M., Pearl, J., & Jewell, N. P. (2016). Causal inference in statistics: A primer (pp. 1–160). John Wiley & Sons.
Hu, H., & Kerschberg, L. (2018). Evolving medical ontologies based on causal inference (pp. 954–957). Sanya: ASONAM.
Nugroho, F. A., Ederveen, T. H., Wibowo, A., Boekhorst, J., de Jonge, M. I., & Heskes, T. (2019). Application of a causal discovery model to study the effect of iron supplementation in children with iron deficiency anemia. In 2019 3rd international conference on informatics and computational sciences (ICICoS) (pp. 1–5). IEEE.
Trenberth, K. E. (2012). Framing the way to relate climate extremes to climate change. Climatic Change, 115(2), 283–290.
Nowack, P., Runge, J., Eyring, V., & Haigh, J. D. (2020). Causal networks for climate model evaluation and constrained projections. Nature Communications, 11(1), 1–11.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.
Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect (pp. 23–51). Hachette.
Kalisch, M., & Buehlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. The Journal of Machine Learning Research, 8(1), 613–636.
Verma, T., Pearl, J. (1990). Equivalence and synthesis of causal models. In Proceedings of the 6th annual conference on uncertainty in artificial intelligence, Amsterdam (pp. 255–270).
Spirtes, P., Glymour, C. N., & Scheines, R. (2000). Causation, prediction, and search. MIT Press.
Shimizu, S., Hoyer, P. O., Hyvärinen, A., Kerminen, A., & Jordan, M. (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(10), 2003–2030.
Hoyer, P., Janzing, D., Mooij, J. M., Peters, J., & Schölkopf, B. (2008). Nonlinear causal discovery with additive noise models. Advances in Neural Information Processing Systems, 21, 689–696.
Mu, G., Chen, Q., & Liu, H. B. (2022). A reciprocal information entropy causal inference method for exploring the cause-effect relationship in power system operation data. Proceedings of the CSEE, 42(15), 5406–5416.
Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., & Schölkopf, B. (2012). Information-geometric approach to inferring causal directions. Artificial Intelligence, 182, 1–31.
Gang, M., Wang, Z. H., Han, Y. D., & Mei, H. (1993). A new method for quantitative assessment of the transient stability of power systems—trajectory analysis method. Proceedings of the CSEE, 13(3), 23–30.
Acknowledgements
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No.: 51877034).
Author information
Contributions
All authors contributed to the research, and read and approved the manuscript. JA proposed the initial concept of structural causal relationships and provided technical guidance throughout the research process. YS modeled the simulation system and collected the data. GM proposed a method to calculate the causal effect under unit intervention intensity, which has been successfully applied to transient stability evaluation problems. YZ proposed the concept and index of reliability assessment of causality and constructed the framework of this paper.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhou, Y., An, J., Mu, G. et al. An improved constraintinference approach for causality exploration of power system transient stability. Prot Control Mod Power Syst 8, 59 (2023). https://doi.org/10.1186/s4160102300330w
DOI: https://doi.org/10.1186/s4160102300330w