  • Original research
  • Open access

An improved constraint-inference approach for causality exploration of power system transient stability


Transient stability is a key aspect of power system dynamic security assessment, and data-driven methods are becoming an alternative means of assessment. Current data-driven methods only construct correlations between variables while neglecting causal relationships. Therefore, they face problems such as poor robustness, which restrict their practical application. This paper introduces an improved constraint-inference approach for causality exploration of power system transient stability. Firstly, a causal structure discovery method for power system transient stability is proposed based on a PC-IGCI algorithm, which addresses the shortcomings caused by Markov equivalence and the large number of variables. Then, a relative average causal effect index is proposed to reveal the relationship between relative intervention strength and causal effects. The results of a case study verify that the proposed method can identify the causal structure between the transient stability variables entirely from data. In addition, the ranking of causal effects between the “cause” and “outcome” transient stability variables is revealed. This paper provides a new data mining approach to uncover the causal mechanisms between variables in power systems and to expand the capabilities of data-driven methods in power system applications.

1 Introduction

With the expansion of power grids, the operational mode of power systems is becoming more complex. Power systems face significant challenges in terms of security and stability, with transient stability assessment being a crucial component. The mechanism model analysis method based on reduction theory plays an important role in transient stability assessment. It includes numerical integration methods, direct methods [1,2,3], etc. However, the effectiveness of mechanism modeling methods relies on accurate models and parameters, which are increasingly difficult to achieve in complex power systems. In addition, numerical integration methods are unable to provide the evaluation results of transient stability directly, and thus they still require people to further analyze the simulation results and data.

In recent years, with the widespread installation of measurement devices in power systems and the improvement of data analysis and processing capabilities, analyzing the complex operational behavior of power systems based on data-driven methods has become a research hotspot [4,5,6,7,8]. Artificial intelligence models are representative applications of data-driven methods, which can construct complex mapping between input datasets and sample labels. These data-driven models have many advantages, including direct output of power system transient stability assessment results and significantly increasing the speed of evaluation through offline training and online matching.

However, these mapping relationships are built under the guiding principle of “correlation” with no regard for “causation”. As a result, predictions made by data-driven models suffer from problems such as poor robustness to out-of-distribution datasets [9,10,11,12] and difficulty of interpretation [13,14,15,16]. This is also why such data-driven assessment methods have not been widely used in safety-sensitive engineering scenarios.

In fact, the “causal relationship” between variables is a characterization of the physical problem. It plays an irreplaceable role in revealing the mechanism of events, guiding intervention behavior, and other aspects. It is also an important carrier of knowledge that is easy for humans to understand [17]. To bridge the gaps of existing data-driven methods, exploring causal relationships from data is a significant requirement in security-sensitive scenarios such as power system transient stability analysis.

The theory of causation developed from statistics is concerned with discovering causal relationships behind data. Since the causal model was proposed at the end of the twentieth century, relevant research has flourished, providing an important means for data analysis. At present, causal theory has achieved great success in many fields such as economics, education and sociology [18,19,20,21]. Causal inference is mainly concerned with the discovery of causal structure between variables and the evaluation of causal effects. Causal inference usually requires changing (intervening on) the generation mechanism of the target variables, and this is the key point that distinguishes causality from correlation.

Currently, there are two widely accepted causal models: the Rubin causal model (RCM) and the structural causal model (SCM). RCM, also known as the potential outcomes framework [22], mainly studies the average causal effect between two variables, while SCM, proposed by Pearl [23], uses a causal graph to model the causal relationships between variables. In addition to causal effect estimation, SCM mainly focuses on the problem of causal structure discovery. The classical methods of causal structure discovery can be divided into constraint-based and structural equation-based methods. The PC [24], IC [25] and FCI [26] algorithms, as representatives of constraint-based methods, have the advantages of dealing with high-dimensional variables and being applicable to both linear and nonlinear problems, but they can only give equivalence classes of possible causal graphs. The methods based on structural equations, such as the LiNGAM [27] and ANM [28] algorithms, make certain assumptions about the form of the structural equations and can thereby identify the complete causal graph. However, their applicability is limited by the assumed equation form. New theories and methods of causal inference can provide a reference for revealing causal relationships between transient stability variables from data in modern power systems.

In the power industry, some initial attempts have been made to look at causal relationships. In [29], a reverse information entropy causal inference method (RIECI) is proposed by revealing the asymmetric attributes of the causal relationship between highly correlated pairs of variables in power systems, indicating the feasibility of analyzing causal relationships from operating datasets. However, the transient stability assessment of power systems is a high dimensional and complicated problem. Exploring complex causal relationships between variables based on data-driven methods still faces challenges.

The highlights of this paper are:

  (1) An improved causal structure discovery method based on a PC-IGCI algorithm for datasets of power system transient stability assessment is proposed. It addresses the Markov equivalence class shortcoming of the existing constraint-based methods.

  (2) The relative average causal effect (RACE) index is proposed to quantitatively evaluate the causal effect under unit interventions. This reveals the relationship between the relative intervention intensity of the cause variable and the causal effects.

The rest of the paper is arranged as follows: classical causal inference methods are briefly introduced in Sect. 2, while an improved causality exploring method combining causal structure discovery with causal effect evaluation for transient stability assessment is proposed in Sect. 3. Example simulation and verification are provided in Sect. 4. Section 5 offers the conclusion of this paper.

2 Introduction to classical data-driven causal inference methods

2.1 A causal structure discovery method based on PC algorithm

The Peter-Clark (PC) algorithm proposed in [24] is one of the classic methods for causal structure discovery.

A causal network (also known as a structural causality graph) is represented by a directed acyclic graph (DAG) showing the probability dependencies between variables. It can be represented by a triple G = (V, E, P). Here, V = {v1, v2, …, vn} is the set of all nodes in the DAG, and E = {e(vi, vj) | vi, vj ∈ V} is the set of directed edges between pairs of nodes, where e(vi, vj) denotes the causal relationship vi → vj between vi and vj. P = {P(vi | pa(vi)) | vi ∈ V, pa(vi) ⊆ V} is the set of conditional probabilities, where pa(vi) denotes the set of parent nodes of vi.

The PC algorithm for causal structure construction is divided into two stages. The first stage aims to identify the dependencies between nodes and represent them as an undirected graph. The skeleton of the structural causality graph is constructed in this stage, as shown in Fig. 1a. The second stage aims to infer the direction of causal dependencies between nodes, extending the undirected graph to the DAG as shown in Fig. 1b.

Fig. 1 Two-stage diagrams of the PC algorithm

2.1.1 Causal skeleton construction

A conditional independence test is the main tool used by the PC algorithm to identify causal dependencies between variables.

The sample partial correlation coefficient between any two variables i and j given the set of conditioning variables k can be estimated recursively, for any h ∈ k, by:

$$\rho_{{i,j|{\varvec{k}}}} { = }\frac{{\rho_{{i,j|{\varvec{k}}\backslash h}} - \rho_{{i,h|{\varvec{k}}\backslash h}} \rho_{{j,h|{\varvec{k}}\backslash h}} }}{{\sqrt {\left( {1 - \rho^{2}_{{i,h|{\varvec{k}}\backslash h}} } \right)\left( {1 - \rho^{2}_{{j,h|{\varvec{k}}\backslash h}} } \right)} }}$$

It needs to be transformed into a normal distribution through a Fisher Z transformation, shown as:

$${\text{Z}}\left( {i,j|{\varvec{k}}} \right){ = }\frac{1}{2}{\text{log}}\left( {\frac{{{1 + }\hat{\rho }_{{i,j|{\varvec{k}}}} }}{{{1 - }\hat{\rho }_{{i,j|{\varvec{k}}}} }}} \right)$$

For a given significance level \(\alpha \in (0,1)\), the test rule is shown as:

$$\sqrt {n - \left| {\varvec{k}} \right| - 3} \left| {Z(i,j|{\varvec{k}})} \right| \le \Phi^{ - 1} \left( {1 - \alpha /2} \right)$$

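where Φ is the cumulative distribution function of N(0,1) and n is the number of samples. If (3) holds, the hypothesis that variables i and j are independent given the conditioning set k cannot be rejected, and the two variables are treated as conditionally independent.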
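The test in (1) to (3) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function names (`pearson`, `partial_corr`, `is_independent`) and the dictionary-of-columns data layout are assumptions made here, not part of the original method, and the toy data are built so that a and c depend on each other only through b.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(data, i, j, k=()):
    """Recursive partial correlation of columns i and j given columns k, as in Eq. (1)."""
    if not k:
        return pearson(data[i], data[j])
    h, rest = k[-1], k[:-1]          # peel off one conditioning variable h
    r_ij = partial_corr(data, i, j, rest)
    r_ih = partial_corr(data, i, h, rest)
    r_jh = partial_corr(data, j, h, rest)
    return (r_ij - r_ih * r_jh) / math.sqrt((1 - r_ih ** 2) * (1 - r_jh ** 2))


def is_independent(data, i, j, k=(), alpha=0.05):
    """Fisher Z test of Eqs. (2)-(3): True if independence cannot be rejected."""
    n = len(data[i])
    rho = partial_corr(data, i, j, tuple(k))
    rho = max(min(rho, 0.999999), -0.999999)   # keep the logarithm finite
    z = 0.5 * math.log((1 + rho) / (1 - rho))
    stat = math.sqrt(n - len(k) - 3) * abs(z)
    return stat <= NormalDist().inv_cdf(1 - alpha / 2)


# Toy data (assumed): a = b + e1 and c = b + e2 with mutually orthogonal b, e1, e2.
b = [1, 1, 1, 1, -1, -1, -1, -1] * 10
e1 = [1, 1, -1, -1, 1, 1, -1, -1] * 10
e2 = [1, -1, 1, -1, 1, -1, 1, -1] * 10
data = {"a": [p + q for p, q in zip(b, e1)],
        "b": b,
        "c": [p + q for p, q in zip(b, e2)]}

print(is_independent(data, "a", "c"))          # False: a and c are correlated
print(is_independent(data, "a", "c", ("b",)))  # True: conditioning on b removes it
```

The recursion mirrors (1) directly and is adequate for small conditioning sets; for large sets an implementation based on inverting the correlation matrix would usually be preferred.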

The d-separation criterion [24] can be applied to identify causal undirected graphs. Starting with an undirected complete graph, if there is no edge between nodes vi and vj in the graph, then there must be a set Z that d-separates vi and vj. By testing one by one whether subsets of V\{vi, vj} can d-separate \(v_{i}\) and \(v_{j}\), the causal dependence between vi and vj can be inferred. Removing the nonexistent "edges" between variables then yields the skeleton of the structural causality graph.
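The edge-removal procedure can be illustrated with the following sketch. It assumes the usual PC convention of drawing conditioning sets from the current neighbours of a node and of increasing the conditioning set size step by step; the function name `pc_skeleton` and the callback `indep(a, b, cond)` are hypothetical names introduced here, and any conditional independence test can be plugged in.

```python
from itertools import combinations


def pc_skeleton(nodes, indep):
    """Return (edges, sepsets) from a PC-style skeleton search.

    `indep(a, b, cond)` is a user-supplied conditional independence test.
    """
    adj = {v: set(nodes) - {v} for v in nodes}      # start from the complete graph
    sepsets = {}
    size = 0
    while any(len(adj[v]) - 1 >= size for v in nodes):
        for a in nodes:
            for b in sorted(adj[a]):
                others = sorted(adj[a] - {b})
                for cond in combinations(others, size):
                    if indep(a, b, cond):
                        adj[a].discard(b)           # remove the edge a - b
                        adj[b].discard(a)
                        sepsets[frozenset((a, b))] = set(cond)
                        break
        size += 1
    edges = {frozenset((a, b)) for a in nodes for b in adj[a]}
    return edges, sepsets


# Oracle test for the collider i -> k <- j: only i and j are independent,
# and only while k is not in the conditioning set.
def oracle(a, b, cond):
    return {a, b} == {"i", "j"} and "k" not in cond


edges, sepsets = pc_skeleton(["i", "j", "k"], oracle)
print(sorted(sorted(e) for e in edges))   # [['i', 'k'], ['j', 'k']]
```

The recorded separating sets are kept because the direction inference stage needs them.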

2.1.2 Causal direction inference

Some causal directions between cause variable and outcome variable can also be inferred based on the result of conditional independence. Consider a skeleton of three variables, vi − vk − vj, where vi and vj are not connected by an edge and are independent, denoted as vi ⊥ vj. However, if vi and vj are not independent under the condition of variable vk, then the causal dependency direction between variables in the undirected graph vi − vk − vj can only be vi → vk ← vj as shown in Fig. 2a. This is called a V-structure. The V-structure is a special form in structural causality graphs, and has unique identifiability in causal direction identification.

Fig. 2 Causal relationship between three variables

If the conditional independence constraint is vi ⊥ vj | vk, then there are three possible causal dependence directions between variables vi, vj, and vk, as shown in Fig. 2b. Structures 1 and 2 are called “chain” structures, and structure 3 is called a “fork” structure. Thus, these causal directions cannot be uniquely identified.
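A minimal sketch of the V-structure rule follows. It assumes the skeleton is given as a set of two-element `frozenset` edges and that the separating sets found during skeleton construction are available; the function name `orient_v_structures` is hypothetical.

```python
def orient_v_structures(edges, sepsets):
    """Return directed edges (cause, effect) implied by V-structures."""
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    directed = set()
    for k, nbrs in adj.items():
        for a in nbrs:
            for b in nbrs:
                # unshielded triple a - k - b: a and b are not adjacent
                if a < b and b not in adj[a]:
                    # k outside the separating set means a -> k <- b
                    if k not in sepsets.get(frozenset((a, b)), set()):
                        directed.add((a, k))
                        directed.add((b, k))
    return directed


skeleton = {frozenset(("i", "k")), frozenset(("j", "k"))}

# i and j are independent without conditioning on k: a V-structure is found.
print(sorted(orient_v_structures(skeleton, {frozenset(("i", "j")): set()})))
# [('i', 'k'), ('j', 'k')]

# i and j are independent only given k: chain or fork, nothing can be oriented.
print(sorted(orient_v_structures(skeleton, {frozenset(("i", "j")): {"k"}})))
# []
```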

2.1.3 The problem of incomplete causal structure graph caused by Markov equivalence

The generated structural causality graph typically contains both directed edges with non-reversible direction and undirected edges with reversible direction. This can only be called a completed partially directed acyclic graph (CPDAG) rather than a Bayesian network (BN).

For example, considering the CPDAG obtained from variables vi, vj, vk and vl using the d-separation criterion, as shown in Fig. 3a, G1 contains only a single V-structure \(v_{j} \to v_{l} \leftarrow v_{k}\), and the two undirected edges \(v_{i} - v_{j}\) and \(v_{i} - v_{k}\) cannot be inferred with certain causal direction by the Meek principles [24]. Therefore, G2, G3, and G4, which have the same skeleton and V-structure, are Markov equivalent and form a Markov equivalence class of the Bayesian network.

Fig. 3 Illustrative diagram of Markov equivalence class

Markov equivalence classes are a common problem faced by constraint-based methods. Because the direction of edges in the causal network cannot be completely determined, this not only greatly affects the understanding of the causal structure in the data, but also makes it impossible to infer the causal effect between some key variables.
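The four-variable example can be checked by brute force. The sketch below enumerates every orientation of the undirected edges and keeps those that introduce neither a directed cycle nor a new V-structure; the node labels i, j, k, l stand for vi, vj, vk, vl, and all function names are hypothetical.

```python
from itertools import product


def has_cycle(nodes, arcs):
    """Detect a directed cycle by repeatedly removing nodes without parents."""
    remaining, arcs = set(nodes), set(arcs)
    while remaining:
        roots = {v for v in remaining if not any(b == v for _, b in arcs)}
        if not roots:
            return True
        remaining -= roots
        arcs = {(a, b) for a, b in arcs if a not in roots}
    return False


def v_structures(arcs, skeleton):
    """Colliders (a, c, b) with a < b whose parents a and b are non-adjacent."""
    return {(a, c, b) for a, c in arcs for b, c2 in arcs
            if c == c2 and a < b and frozenset((a, b)) not in skeleton}


def equivalence_class(nodes, directed, undirected):
    """Enumerate DAGs extending a CPDAG without new cycles or V-structures."""
    skeleton = {frozenset(e) for e in list(directed) + list(undirected)}
    target = v_structures(directed, skeleton)
    members = []
    for flips in product((False, True), repeat=len(undirected)):
        arcs = set(directed)
        for (a, b), flip in zip(undirected, flips):
            arcs.add((b, a) if flip else (a, b))
        if not has_cycle(nodes, arcs) and v_structures(arcs, skeleton) == target:
            members.append(sorted(arcs))
    return members


members = equivalence_class(
    ["i", "j", "k", "l"],
    directed=[("j", "l"), ("k", "l")],       # the V-structure j -> l <- k
    undirected=[("i", "j"), ("i", "k")],     # edges left undirected in G1
)
print(len(members))   # 3
```

Of the four possible orientations, only j → i ← k is rejected because it would create a second V-structure, which leaves three Markov equivalent DAGs, in line with G2, G3 and G4.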

2.2 Causal effect inference method based on ACE

After obtaining the causal relationship network between the variables, causal inference techniques can quantify the degree to which the “cause” affects the “outcome” in each causal direction, i.e., the causal effect.

The relationship between the outcome variable of sample i and whether it receives the intervention is shown as:

$$y_{i} = y_{0i} + (y_{1i} - y_{0i} )D_{i}$$

where yi is the outcome variable, and y1i and y0i represent the results of sample i with and without the intervention, respectively. Di ∈ {0,1}, i.e., 1 for the treatment group and 0 for the comparison group. (y1i − y0i) represents the causal effect of sample i receiving the intervention. However, because of individual differences between samples, the impact of applying the same intervention on the results is different. To minimize the impact of individual differences in the samples, the expectation of the causal effect can be taken over all samples, namely, the average causal effect (ACE), shown as:

$$ACE{ = }E\left( {y_{1i} - y_{0i} } \right)$$

The challenge in evaluating causal effects lies in the fact that \(y_{1i}\) and \(y_{0i}\) cannot be observed at the same time. It belongs to the “counterfactual” causal inference framework, i.e., for a single data sample, it can only be in one of the two states of being intervened or not being intervened. In fact, once data is collected, it is an unchangeable record, while how to implement “intervention” on data is also one of the key concerns of data-driven methods.

Matching estimator methods provide a feasible solution to this problem. If sample i belongs to the treatment group, a sample j in the comparison group is found such that the covariates x (features other than \(\left\{ {y_{i} ,D_{i} } \right\}\)) of sample j are as close as possible to those of sample i, i.e., xi ≈ xj. In this case, \(y_{j}\) can be used as the estimation of y0i, i.e., \(\hat{y}_{0i} = y_{j}\).
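As a small illustration, the sketch below imputes the unobserved potential outcome of every sample from its nearest neighbour in the opposite group and averages the differences as in (5). Matching comparison samples to treated ones in the same way, the Euclidean distance, and the name `matching_ace` are assumptions made for this example.

```python
import math


def matching_ace(x, d, y):
    """Average causal effect from one-to-one nearest-neighbour covariate matching."""
    effects = []
    for i in range(len(y)):
        # candidate matches come from the opposite group of sample i
        pool = [j for j in range(len(y)) if d[j] != d[i]]
        j = min(pool, key=lambda m: math.dist(x[i], x[m]))
        y1, y0 = (y[i], y[j]) if d[i] == 1 else (y[j], y[i])
        effects.append(y1 - y0)
    return sum(effects) / len(effects)


# Toy data (assumed): the outcome equals the covariate, plus 2 if treated.
x = [(0.0,), (0.1,), (1.0,), (1.1,)]
d = [1, 0, 1, 0]
y = [2.0, 0.1, 3.0, 1.1]
print(round(matching_ace(x, d, y), 6))   # 1.9
```

The estimate of 1.9 instead of the true effect of 2 shows the bias that remains when matched covariates are close but not identical.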

Practically, it is difficult to find similar xj to match xi in high-dimensional space if the dimension of xi is very high, with a great risk of match failure. In addition, the average causal effect only controls the intervention intensity of cause variables by dividing the samples into a treatment group and a comparison group, but cannot quantitatively reveal the relationship between intervention intensity and causal effect.

3 Improved constraint-inference approach for causality exploration of power system transient stability

To address the shortcomings of existing causal inference methods introduced in Sect. 2, this section proposes an improved constraint-inference approach for causality exploration of power system transient stability. This includes an improved causal structure discovery method based on the PC-IGCI algorithm, and a causal inference method based on propensity score matching with nearest-neighbour matching within caliper (PSM-NNC) and the relative average causal effect (RACE) index.

3.1 Improved causal structure discovery method based on PC-IGCI algorithm

To overcome the Markov equivalence problem, some have proposed causal function models from the perspective of the distribution characteristics of data caused by the causal mechanism. The information geometric causal inference (IGCI) algorithm proposed in [30] is a typical representative of the causal function algorithm. It uses the independence between distribution of the cause variable and the “cause-effect” function mechanism to determine the causal relationship between variables and has been proven to be reliable in non-linear causal direction mining problems.

The IGCI causal determination indicators for variables x and y are as follows:

$$\begin{aligned} C_{x \to y} & = D\left( {p_{X} \left\| {\varepsilon_{X} } \right.} \right) - D\left( {p_{Y} \left\| {\varepsilon_{Y} } \right.} \right) \\ & = \, S(u) - S(v) + S(p_{Y} ) - S(p_{X} ) \\ \end{aligned}$$

If \(C_{x \to y} < 0\), then x is inferred to cause y;

If \(C_{x \to y} > 0\), then y is inferred to cause x.

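In (6), \(D( \cdot \| \cdot )\) is the relative entropy distance, \(S( \cdot )\) is the differential entropy, and u and v are the reference densities associated with \(\varepsilon_{X}\) and \(\varepsilon_{Y}\), respectively. The detailed expressions for these quantities can be found in [30].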
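For intuition, a commonly used slope-based estimator of this indicator can be sketched as follows. It assumes uniform reference densities on [0, 1], in which case S(u) = S(v) = 0 and, for a deterministic relation y = f(x), the indicator in (6) reduces to the average of log|f′(x)| over the data. Whether the paper uses this estimator or an entropy-based one is not stated, and the function names are hypothetical.

```python
import math


def _scale(values):
    """Rescale values to [0, 1] (uniform reference measure); assumes non-constant input."""
    lo, hi = min(values), max(values)
    return [(t - lo) / (hi - lo) for t in values]


def igci_slope(x, y):
    """Slope-based estimate of C_{x->y}; negative values favour x -> y."""
    pairs = sorted(zip(_scale(x), _scale(y)))
    total, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:          # skip ties, where the log is undefined
            total += math.log(abs(dy / dx))
            count += 1
    return total / count


# Toy example (assumed): x evenly spread on [0, 1] and y = x**3.
x = [i / 200 for i in range(201)]
y = [t ** 3 for t in x]
print(igci_slope(x, y) < 0)   # True: x is inferred to cause y
print(igci_slope(y, x) > 0)   # True: the reverse direction is disfavoured
```

For this example the continuous-limit value is log 3 − 2 ≈ −0.90, so the sign rule stated after (6) selects x → y.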

However, the IGCI algorithm is mainly used for determining the causal direction between two variables, because the method itself does not have the ability to identify the causal graph skeleton from the data. This makes it difficult to apply in identification of the structural causality between high-dimensional variables. In addition, if there is no possible causal candidate direction between variables in the first place, it is not possible to obtain reliable results using only the IGCI method.

The improved causal structure discovery method based on PC-IGCI proposed in this paper uses the IGCI method to extend the causal discovery on the basis of the CPDAG generated by the PC algorithm, determining the causal directions of the undirected edges, and generating a causal network expressed as DAG. It solves the deficiency of causal structure discovery methods being only based on constraint, and improves the conditions for the application of IGCI through the prior causal skeleton construction, making the causal direction recognition results more reliable.
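The combination step can be pictured with a short sketch: the directed edges of the CPDAG are kept, and each remaining undirected edge is oriented by the sign of a pairwise indicator. The function name `orient_undirected`, the `score` callback, and the handling of a zero score are assumptions; the source does not describe tie handling or an acyclicity check, so neither is modelled here.

```python
def orient_undirected(directed, undirected, score):
    """Orient each undirected CPDAG edge with a pairwise score C_{a->b}."""
    arcs = set(directed)
    for a, b in undirected:
        # Negative score favours a -> b, otherwise b -> a (sign rule of Eq. 6).
        arcs.add((a, b) if score(a, b) < 0 else (b, a))
    return arcs


# Hypothetical pairwise scores for two undirected edges.
scores = {("x", "y"): -0.3, ("p", "q"): 0.2}
dag = orient_undirected([("a", "x")], [("x", "y"), ("p", "q")],
                        lambda a, b: scores[(a, b)])
print(sorted(dag))   # [('a', 'x'), ('q', 'p'), ('x', 'y')]
```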

3.2 Causal effect inference method based on PSM-NNC method and RACE index

To solve the problem of sample matching between treatment and comparison groups caused by the high-dimensional characteristics of samples in the evaluation of causal effect, this paper proposes a sample matching method based on the propensity score. In addition, because the data-driven transient stability evaluation datasets have a large number of samples, a one-to-many sample matching method can make the evaluation of causal effect more accurate. For this purpose, the nearest-neighbour matching within caliper algorithm is introduced. The combined procedure is called the PSM-NNC method.

The propensity score is the conditional probability that sample i enters the treatment group, i.e., \(p(x_{i} ) = P(D_{i} = 1|x = x_{i} )\). Therefore, the main steps for calculating ACE through the PSM-NNC are as follows:

  (1) Select covariates xi. Include as many variables as possible that may affect \(\left( {y_{0i} ,y_{1i} } \right)\) and \(D_{i}\).

  (2) Estimate the propensity score. Here logistic regression is used to establish a regression model that evaluates the likelihood of each sample receiving the intervention.

  (3) Perform propensity score matching. The nearest-neighbour matching within caliper algorithm is used, i.e., the closest matches are sought within a caliper (maximum allowed propensity score difference) γ. This can improve the efficiency of sample matching and the accuracy of causal effect estimation.

  (4) Calculate the estimated ACE for the matched samples, as:

$$ACE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {\hat{y}_{1i} - \hat{y}_{0i} } \right)}$$

where N is the sample size, \(\hat{y}_{1i}\) and \(\hat{y}_{0i}\) are the estimates of \(y_{1i}\) and \(y_{0i}\), respectively.
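The four steps can be sketched end to end as follows. This is a hedged illustration rather than the implementation used in the paper: the logistic model is fitted by plain gradient descent, up to `k` nearest opposite-group samples within the caliper are averaged (one reading of one-to-many matching), unmatched samples are dropped, and all names and default values are assumptions.

```python
import math


def fit_logistic(x, d, lr=0.1, epochs=2000):
    """Fit P(D=1|x) by gradient descent; returns weights with the bias last."""
    w = [0.0] * (len(x[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, di in zip(x, d):
            z = sum(wk * xk for wk, xk in zip(w, xi)) + w[-1]
            err = 1.0 / (1.0 + math.exp(-z)) - di
            for idx, xk in enumerate(xi):
                grad[idx] += err * xk
            grad[-1] += err
        w = [wk - lr * g / len(x) for wk, g in zip(w, grad)]
    return w


def propensity(w, xi):
    """Propensity score p(x_i) = P(D_i = 1 | x = x_i) under the fitted model."""
    z = sum(wk * xk for wk, xk in zip(w, xi)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))


def psm_nnc_ace(x, d, y, caliper=0.1, k=3):
    """ACE over samples with at least one opposite-group match within the caliper."""
    w = fit_logistic(x, d)
    p = [propensity(w, xi) for xi in x]
    effects = []
    for i in range(len(y)):
        near = sorted((abs(p[i] - p[j]), j) for j in range(len(y))
                      if d[j] != d[i] and abs(p[i] - p[j]) <= caliper)[:k]
        if not near:
            continue                      # unmatched samples are dropped
        y_match = sum(y[j] for _, j in near) / len(near)
        effects.append(y[i] - y_match if d[i] == 1 else y_match - y[i])
    return sum(effects) / len(effects)


# Toy data (assumed): the outcome is 1 without and 3 with the intervention.
x = [(0.0,), (0.2,), (0.4,), (0.6,), (0.8,), (1.0,)]
d = [0, 1, 0, 1, 0, 1]
y = [1.0 + 2.0 * di for di in d]
print(psm_nnc_ace(x, d, y, caliper=0.5))
```

A smaller caliper gives closer matches but discards more samples, so in practice γ trades matching quality against the number of usable samples.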

To reveal the impact of intervention intensity on causal effects, it is assumed that \(E(C_{i} |D_{i} = 0)\) and \(E(C_{i} |D_{i} = 1)\) represent the expected values of the cause variable in the comparison and treatment groups, respectively. The difference between them can indirectly reflect the average intervention strength on the cause variable, represented by \(Do( \cdot )\), i.e.:

$$Do(C_{i} ) = E(C_{i} |D_{i} = 1) - E(C_{i} |D_{i} = 0)$$

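Based on this quantitative evaluation of intervention intensity, RACE is proposed for calculating the ACE under unit relative intervention strength, i.e., the ACE divided by the intervention strength \(Do(C_{i} )\) expressed as a fraction of the comparison-group mean \(E(C_{i} |D_{i} = 0)\). This helps to compare and analyse the causal effects of different cause variables under the same relative intervention strength. The calculation equation is shown as: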

$$RACE = \frac{{E(C_{i} |D_{i} = 0)}}{{Do(C_{i} )}}ACE$$

The RACE index can be used to reveal the sorting of the influence of transient stability results when the intervention of the same relative intensity is applied to different cause variables. This provides a basis for explaining the influence of different causes on the evaluation results of transient stability from the perspective of causal effect.
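A short numerical sketch of (8) and (9) is given below; the function name `race_index` and the toy numbers are assumptions chosen only to make the arithmetic easy to follow.

```python
def race_index(c, d, ace):
    """Relative average causal effect of one cause variable, Eqs. (8)-(9)."""
    treated = [ci for ci, di in zip(c, d) if di == 1]
    control = [ci for ci, di in zip(c, d) if di == 0]
    mean_t = sum(treated) / len(treated)     # E(C | D = 1)
    mean_c = sum(control) / len(control)     # E(C | D = 0)
    do_c = mean_t - mean_c                   # average intervention strength, Eq. (8)
    return mean_c / do_c * ace               # ACE per unit relative intervention, Eq. (9)


# Toy numbers (assumed): the cause variable averages 20 in the comparison
# group and 25 in the treatment group, and the estimated ACE is 1.0.
c = [19.0, 21.0, 24.0, 26.0]
d = [0, 0, 1, 1]
print(race_index(c, d, ace=1.0))   # 4.0
```

Here the intervention raises the cause variable by 5, i.e., by 25% of the comparison-group mean, so an ACE of 1.0 corresponds to a RACE of 1.0 / 0.25 = 4.0.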

3.3 Sample set construction of power system stability assessment used for causality inference

Although measured data with unknown generation mechanisms can be directly used to construct sample sets for causal inference, large perturbations rarely occur in actual power systems and there are few examples of transient instability. Therefore, constructing a transient stability assessment data sample set based on simulation is an effective way to overcome the shortage of measured data.

There are many factors affecting transient stability of power systems. They can be classified as operating variables that change with operating conditions and structure variables determined by the grid structure and component parameters. Usually, some operating variables need to be specified in the simulation, such as PG|0|, UG|0|, PL|0|, QL|0|, Tf, and Fl. PG|0| represents the set of generator steady-state active power variables, UG|0| is the set of initial voltages of generation nodes, PL|0| and QL|0| represent the active and reactive load levels before the disturbance, Tf represents the fault duration, and Fl represents the location of the fault occurrence. The fault location can be expressed by continuous variables such as electrical distance, or by discrete coding.

The samples of other operating variables need to be generated through power flow equations or calculations based on dynamic simulation results data, such as QG|0|, δ|0|, V|0|, θ|0|, SI, UI, and Ssys. QG|0| represents the set of generator steady-state reactive output variables, δ|0| represents the initial generator power angle, while V|0| and θ|0| represent the pre-disturbance voltage amplitude and phase angle of load and contact nodes, respectively. SI and UI represent the stability margin evaluation and instability evaluation indices of the generator. These can quantitatively evaluate the transient stability margin of each generator based on the trajectory information after the disturbance, and the calculation method is described in [31]. Ssys is the system stability flag, which takes the value of 0 or 1, where 0 represents instability, 1 represents stability, and the maximum relative power angle difference can be used as a criterion.

If the impact of structure variables on transient stability assessment is not of concern, then the structure variables can be specified as constants, such as X’d, Tj, Zl, where X’d is the d-axis transient reactance of the generator, Tj is the generator inertia time constant, and Zl is the impedance parameter of the transmission lines.

Numerous data samples are generated by perturbing some specified operating variables and the structure variables of concern. Each variable participating in the perturbation is assumed to follow a Gaussian distribution with expectation \(\mu = \mu^{{{\text{sp}}}}\) and standard deviation \(\sigma = \sigma^{{{\text{sp}}}}\), where \(\mu^{{{\text{sp}}}}\) and \(\sigma^{{{\text{sp}}}}\) are the given values of expectation and standard deviation, respectively. A sample set is constructed by combining a total of m perturbations. Thereby, the original sample set matrix \(X_{{{\text{ori}}}}\) used for transient stability causality inference can be denoted as:

$$X_{{{\text{ori}}}} = [O_{{{\text{ori}}}}^{{{\text{spec}}}} ,O_{{{\text{ori}}}}^{{{\text{cal}}}} ,S_{{{\text{ori}}}}^{{{\text{spec}}}} ]$$

where \(O_{{{\text{ori}}}}^{{{\text{spec}}}}\) represents the samples of specified operating variables, \(O_{{{\text{ori}}}}^{{{\text{cal}}}}\) represents the calculated samples of operating variables, and \(S_{{{\text{ori}}}}^{{{\text{spec}}}}\) represents the samples of perturbed structure variables.
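The Gaussian perturbation of the specified variables can be sketched as below. The variable names and the numerical values of the expectations and standard deviations are placeholders, not settings from the paper, and the calculated operating variables would still have to be obtained from power flow and dynamic simulation, which is outside this sketch.

```python
import random


def perturb_samples(spec, m, seed=0):
    """Draw m samples, each variable from N(mu_sp, sigma_sp)."""
    rng = random.Random(seed)                # fixed seed for reproducibility
    return [{name: rng.gauss(mu, sigma) for name, (mu, sigma) in spec.items()}
            for _ in range(m)]


# Hypothetical specification: name -> (mu_sp, sigma_sp).
spec = {"PG0": (0.8, 0.05), "UG0": (1.0, 0.01), "Tf": (0.10, 0.02), "Tj": (20.0, 2.0)}
samples = perturb_samples(spec, m=5)
print(len(samples), sorted(samples[0]))   # 5 ['PG0', 'Tf', 'Tj', 'UG0']
```

Each returned dictionary corresponds to one row of the specified blocks of the sample set matrix in (10).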

In a causal network, a given observed variable can be divided into an exogenous variable set U and an endogenous variable set V, according to whether it can be determined or influenced by other variables. The variables in V are often the ones that need to be interpreted. Dividing variables into endogenous and exogenous endows them with explained or to be explained attributes. This helps to gain a profound understanding of the identified structural causal relationships.

4 Case study

To comprehensively test the effectiveness of the causality exploring method proposed in this paper, this section first takes a single machine infinite bus (SMIB) system as an example to verify the feasibility of discovering the causal structure and evaluating causal effect from the operational data sets. Then the case of WSCC-9 is used to verify the capability of discovering high-dimensional causal relationships.

4.1 Causal inference of transient stability in SMIB system

4.1.1 Case settings

The topology of the SMIB system is shown in Fig. 4. The system reference voltage is 230 kV, and the generator uses the classical second-order model. A three-phase short-circuit fault at 50% of the branch from Bus 2 to Bus 3 is applied, with simulation starting at 0 s and a time step of 0.01 s.

Fig. 4 Single-line diagram of the SMIB system

Following the sample generation method introduced in Sect. 3.3, the original sample set matrix Xori for transient stability causal inference is constructed. It contains 3 specified operating variables of PG|0|, UG|0| and Tf, 8 calculated operating variables of QG|0|, δ|0|, V|0|3, \(\theta_{\left| 0 \right|1}\), \(\theta_{\left| 0 \right|2}\), SI, UI, and Ssys, and 2 structure variables of concern, Tj and X’d. The dataset consists of 5000 samples.

4.1.2 Causal structure discovery test on data sets

The causal structure discovery method described in Sect. 3.1 is applied to Xori, and the significance level \(\alpha = 0.05\) is set to identify the causal structure. The results are characterized as a structural causality graph, denoted as Gori. There is no undirected edge caused by Markov equivalence in Gori, which means that the causal relationships in the graph are determined. The structural causality graph of the SMIB system is shown in Fig. 5.

Fig. 5 Structural causality graph of SMIB

In Fig. 5, the variables in purple boxes represent exogenous variables, and those in red circles represent endogenous variables. According to the causal structure discovery results based on PC-IGCI, the specified variables are all identified as exogenous variables, which is consistent with the generation mechanism of the dataset.

The oriented edges are mainly divided into two categories. One is from exogenous variables to endogenous variables, which reveals the direct causal relationship between the specified operating parameters and the result of transient stability assessment, such as Tj → Ssys and Tf → SI, represented by the blue arrows. The other category of oriented edges is from endogenous variables to endogenous variables, which indicates that the implied structural causal relationship among the transient stability variables can also be identified, such as \(Q_{{{\text{G}}\left| 0 \right|}} \to V_{\left| 0 \right|2} \to S_{{{\text{sys}}}}\), represented by the green arrows.

To examine the influence of different operating conditions on causal structure discovery, causal structures among the transient stability variables are evaluated to determine whether they are stable and widely supported in different datasets, or are sensitive to changes in operating conditions when the distribution characteristics of variables in the operating dataset change. The causality support rate (CSR) index defined in (11) is thus put forward to quantitatively evaluate the support degree of each causal rule based on the overall datasets, as:

$$CSR_{j} = \frac{1}{N}\sum\limits_{i = 1}^{N} {p_{j}^{(i)} } \times 100\%$$

where \(p_{j}^{(i)}\) represents whether the jth causal rule exists on graph Gi. If it exists, \(p_{j}^{(i)}\) is set to 1, conversely, \(p_{j}^{(i)}\) is set to 0. N represents the number of datasets which are generated by the same or different mechanisms.

The degree of directional asymmetry (DDA) index is proposed in (12). This is calculated based on CSRj. From the perspective of asymmetric causal directions, the support degree of causal directions based on datasets of different generation mechanisms is evaluated.

$$DDA_{j} = CSR_{j + } - CSR_{j - }$$

where \(CSR_{j+}\) represents the forward support rate of the jth causal rule, and \(CSR_{j-}\) represents the reverse support rate of the jth causal rule.
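Both indices can be computed directly from a collection of discovered graphs, as in the sketch below. Each graph is represented as a set of (cause, outcome) pairs, and DDA is taken here as the forward support rate minus the reverse support rate expressed as a fraction; this form is an assumption that is consistent with the values reported for Table 1 (40% forward and 30% reverse support giving 0.1). The function names are hypothetical.

```python
def csr(rule, graphs):
    """Causality support rate of Eq. (11), in percent."""
    return 100.0 * sum(rule in g for g in graphs) / len(graphs)


def dda(rule, graphs):
    """Forward minus reverse support rate as a fraction (assumed form of Eq. 12)."""
    cause, outcome = rule
    return (csr((cause, outcome), graphs) - csr((outcome, cause), graphs)) / 100.0


# Ten hypothetical graphs: 4 support UI -> SI, 3 support SI -> UI, 3 support neither.
graphs = ([{("UI", "SI")}] * 4) + ([{("SI", "UI")}] * 3) + ([set()] * 3)
print(csr(("UI", "SI"), graphs))             # 40.0
print(csr(("SI", "UI"), graphs))             # 30.0
print(round(dda(("UI", "SI"), graphs), 3))   # 0.1
```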

The variables considered to affect the operating condition of the power system are PG|0|, UG|0| and Tf. Each variable is perturbed in turn while the distributions of the other variables are kept fixed. Three perturbation amplitudes of −10%, 5% and 10% relative to the original scenario are considered, so 9 extended datasets are formed and denoted as Xext-i (i = 1, 2, …, 9). The overall dataset, denoted as Xall, comprises the original and extended datasets and therefore contains 10 data subsets.

The reliability of all occurred cause → outcome oriented edges generated by PC-IGCI is assessed using CSR and DDA, and the assessment results are shown in Table 1.

Table 1 CSR and DDA index of causality

As shown in Table 1, a total of 28 causal relationships are identified based on the data subsets, and they are arranged in descending order of the DDA index. The DDA indices of No. 1 to 24 are all above 0.6, which illustrates the high support for these causal relationships in the datasets and indicates the relative certainty of the causal direction among the variables. In contrast, for the causal relationship of No. 28, although the forward support rate reaches 40%, the reverse support rate is also as high as 30%, so the DDA is only 0.1. The degree of causal direction asymmetry is low, so this causal relationship cannot be effectively confirmed. For the causal relationships from No. 25 to 28, the DDA indices are all no more than 0.3, and such causal relationships are regarded as unreliable.

There is a fundamental difference between correlation and causality. If the causal relationship of No. 27 is investigated, the Pearson correlation coefficient between θ|0|1 and θ|0|2 is 0.987, showing an extremely significant correlation. However, the index of causality asymmetry is only 0.1, because the two variables have a similar data generation mechanism and distribution characteristics. Figure 5 also shows that such causal relationships cannot be effectively supported in dataset Xori.

To compare and analyze the influence of different operational modes on the discovery of causal structure, 5000 samples are drawn at random from Xall each time, and the drawn samples are returned to Xall before the next draw. Repeating this 10 times yields 10 data subsets, and PC-IGCI is applied to each of them. The CSR and DDA indices are used to reveal the identified causal structures when the data is mixed with multiple operational modes.
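The resampling scheme can be sketched as follows, under the assumption that each subset is drawn without repetition from the full pool and that the pool is restored before the next subset is drawn. The function name and the toy pool are hypothetical.

```python
import random


def resample_subsets(samples, size, n_subsets, seed=0):
    """Draw n_subsets random subsets of `size` rows; the pool is restored each time."""
    rng = random.Random(seed)
    return [rng.sample(samples, size) for _ in range(n_subsets)]


# Toy pool standing in for the rows of Xall.
pool = list(range(100))
subsets = resample_subsets(pool, size=20, n_subsets=10)
print(len(subsets), len(subsets[0]))   # 10 20
```

Each subset would then be passed to the causal structure discovery step, and the resulting graphs to the CSR and DDA calculation.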

Under these conditions, the first 24 causal relationships in Table 1 are effectively identified, with DDA indices greater than 0.8, while relationships No. 25–27 are not identified in any data subset. For the causal relationship UI → SI, CSR+/CSR− is 30/20, and the DDA index remains 0.1. This shows that even when the overall dataset has the same distribution as Xall, different rules for partitioning it into data subsets lead to different distribution characteristics of the variables, and thus affect the results of causal structure discovery.
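The bookkeeping behind these indices can be sketched as follows. This is a minimal illustration under two assumptions that are not restated in this section: CSR+ and CSR− are taken to be the fractions of data subsets in which the edge is oriented forward and in reverse, and DDA is taken to be their difference, which is consistent with the quoted values (40% and 30% give 0.1, and 30/20 gives 0.1). The function name `csr_dda` and the edge sets are hypothetical.

```python
def csr_dda(edge_sets, cause, outcome):
    """Return (CSR+, CSR-, DDA) for one candidate edge.

    edge_sets: one set of oriented edges (cause, outcome) per data subset.
    Assumption: DDA is the difference between forward and reverse support.
    """
    n_subsets = len(edge_sets)
    forward = sum((cause, outcome) in edges for edges in edge_sets)
    reverse = sum((outcome, cause) in edges for edges in edge_sets)
    csr_pos = forward / n_subsets
    csr_neg = reverse / n_subsets
    return csr_pos, csr_neg, csr_pos - csr_neg


# Hypothetical orientation results from 10 data subsets:
# 3 subsets orient UI -> SI, 2 orient SI -> UI, 5 find no edge.
subsets = [{("UI", "SI")}] * 3 + [{("SI", "UI")}] * 2 + [set()] * 5

csr_pos, csr_neg, dda = csr_dda(subsets, "UI", "SI")
print(csr_pos, csr_neg, round(dda, 2))
```

An edge that is oriented in the same direction by most subsets gets a DDA close to its CSR+, while an edge whose direction flips between subsets gets a DDA near 0.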

4.1.3 Causal effect inference in SMIB

Based on the causal relationships shown in Table 1, Xall is used as the data source for causal inference. The cause variables (such as \(T_{{\text{j}}}\), \(T_{{\text{f}}}\), \(P_{{{\text{G}}\left| 0 \right|}}\), \(\delta_{\left| 0 \right|}\)) and outcome variables (transient stability margin index SI, stability label Ssys) of interest are selected to form 7 pairs of causal relationships. The intervention direction for each cause variable is set to an increase, and the samples are divided into groups as follows:

$$D_{i}^{(j)} = \begin{cases} 1, & C_{i}^{(j)} > u_{i}^{sp} \\ 0, & C_{i}^{(j)} < u_{i}^{sp} \end{cases}$$

where \(C_{i}^{(j)}\) is the jth observation of the ith cause variable, \(u_{i}^{sp}\) is the threshold that separates the two groups for the ith cause variable, and \(D_{i}^{(j)}\) is the group label of \(C_{i}^{(j)}\). \(D_{i}^{(j)} = 1\) indicates that the jth observation sample belongs to the treatment group, and \(D_{i}^{(j)} = 0\) indicates that it belongs to the comparison group.

The causal effect inference results are shown in Table 2. An increase in Tj has a positive effect on the transient stability margin of the generator, with an overall causal effect of 1.43 over the entire sample set. That is, when all samples with Tj > 20 s are compared with those with Tj < 20 s, the average change in the stability index SI resulting from the increase in Tj is 1.43. Given the meaning of SI, this implies an increase in transient stability margin. The ACE index represents the “average intervention effect” of the cause variable alone, evaluated over a variety of complex combinations of operating conditions and fault information, while minimizing the impact of other operating condition differences on the response variable.
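The grouping rule above and a first-order effect estimate can be sketched in a few lines. This is only a naive difference of group means on synthetic data. It does not adjust for other operating conditions as the ACE index is described to do, and the function names, the 20 s threshold applied to synthetic values, and the data-generating formula are all assumptions made for illustration.

```python
import numpy as np


def group_labels(cause, threshold):
    """1 = treatment (cause > threshold), 0 = comparison (cause < threshold).

    Samples exactly equal to the threshold get -1 and are left out,
    because the grouping rule uses strict inequalities.
    """
    cause = np.asarray(cause, dtype=float)
    labels = np.full(cause.shape, -1, dtype=int)
    labels[cause > threshold] = 1
    labels[cause < threshold] = 0
    return labels


def naive_ace(cause, outcome, threshold):
    """Difference of mean outcomes between treatment and comparison groups."""
    outcome = np.asarray(outcome, dtype=float)
    labels = group_labels(cause, threshold)
    return outcome[labels == 1].mean() - outcome[labels == 0].mean()


# Synthetic stand-ins for an inertia-like cause and a margin-like outcome.
rng = np.random.default_rng(0)
tj = rng.uniform(10.0, 30.0, size=2000)
si = 0.1 * tj + rng.normal(0.0, 0.5, size=2000)

ace = naive_ace(tj, si, threshold=20.0)
print(f"naive ACE estimate: {ace:.2f}")
```

With observational data, a plain difference of means is biased whenever other variables differ between the two groups, which is why an adjusted estimator is needed in practice.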

Table 2 Causal effect inference results

Similarly, the ACE index on Tj → Ssys is 0.36, indicating that an increase in the inertia leads to an average increase of 36% in the probability of the system remaining stable. Conversely, the causal effect of an increase in the fault duration Tf on stability margin is − 0.18. It can be inferred from the dataset as a general conclusion that an increase in fault duration is detrimental to transient stability of power systems.

Taking the initial generation PG|0| as an example, the expected value of PG|0| over the samples that receive the intervention is 2.27 pu, and that over the samples that do not is 1.74 pu. Thus the average intervention strength is 0.53 pu, and the causal effect on the stability index SI is − 2.47. Correspondingly, the probability that the power system maintains transient stability decreases by 72% after the intervention.

Interventions on different variables can be compared directly in terms of their effects. Based on data from the treatment and comparison groups, an intervention of 0.53 pu on PG|0| has a greater causal effect on the transient stability margin of the power system than an intervention of 0.06 s on the fault duration Tf. In this sense, the former intervention is stronger than the latter and more detrimental to transient stability, even though the two variables have different dimensions.

To compare the causal effects of applying the same relative intervention intensity to different cause variables, Table 2 arranges the causal relationships in descending order of |RACE|. For the causal relationships related to SI, the RACE values of Tf, PG|0|, Tj, and δ|0| are − 30, − 4.67, 0.152, and − 0.148, respectively. This indicates that when the same relative intervention intensity is applied to these four cause variables, Tf has by far the strongest causal effect.
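As a worked check on the PG|0| figures quoted above, the sketch below assumes that RACE is the ACE divided by the average intervention strength, i.e. the difference between the treatment-group and comparison-group means of the cause variable. The formal definition is given earlier in the paper and takes precedence. The function name `race` is hypothetical.

```python
def race(ace, mean_treated, mean_comparison):
    """ACE per unit of average intervention strength (assumed definition)."""
    strength = mean_treated - mean_comparison
    if strength == 0:
        raise ValueError("intervention strength is zero")
    return ace / strength


# Values quoted in the text for the initial generation PG|0| -> SI.
race_pg = race(ace=-2.47, mean_treated=2.27, mean_comparison=1.74)
print(round(race_pg, 2))  # prints -4.66
```

The result is close to the − 4.67 listed for PG|0|. The small gap is presumably due to rounding of the quoted inputs.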

4.1.4 Causal effect reliability test

Causal relationships identified from observational datasets are difficult to prove true, but their falsity can be revealed by abnormal behavior of the model in refutation tests. This paper therefore conducts two refutation experiments: adding random confounding factors, and applying placebo interventions. If a causal relationship is correctly identified, the causal effect after adding a random confounding factor should be very close to the original result. In the placebo test, the identified cause variable is replaced with an independent random variable and the causal effect is recalculated. If the recalculated effect decreases greatly or even approaches 0, the causal relationship is relatively reliable. Table 3 shows some of the refutation test results, and indicates that all the causal relationships represented in Fig. 5 have passed the refutation tests.
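Both refutation tests can be sketched on synthetic data. The example below swaps in an ordinary least-squares slope as the effect estimator, which is an assumption made for brevity and not necessarily the estimator used in the paper. The variable names and the data-generating slope of −20 are hypothetical.

```python
import numpy as np


def linear_effect(cause, outcome, covariate=None):
    """OLS slope of outcome on cause, optionally adjusting for one covariate."""
    columns = [np.ones_like(cause), cause]
    if covariate is not None:
        columns.append(covariate)
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1]


rng = np.random.default_rng(1)
n = 5000
tf = rng.uniform(0.02, 0.10, size=n)             # synthetic fault durations (s)
si = -20.0 * tf + rng.normal(0.0, 0.2, size=n)   # synthetic stability margin

original = linear_effect(tf, si)

# Test 1: add an independent random "confounder"; the estimate should barely move.
with_random_confounder = linear_effect(tf, si, rng.normal(size=n))

# Test 2: placebo intervention; a shuffled cause is independent of the outcome,
# so the estimated effect should collapse towards 0.
placebo = linear_effect(rng.permutation(tf), si)

print(f"original: {original:.2f}")
print(f"with random confounder: {with_random_confounder:.2f}")
print(f"placebo: {placebo:.2f}")
```

A relationship fails the first test if the estimate shifts noticeably, and fails the second if the placebo effect stays comparable to the original one.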

Table 3 Refutation test results

4.1.5 The influence of changing intervention strength on transient stability

Interventions of different strengths are further applied to the cause variables that affect power system transient stability. Observing how the “causal effects” change demonstrates a feasible approach for revealing the physical mechanisms of transient stability based on power system operating datasets.

First, the samples in datasets are divided into 5 groups based on different range values of the cause variables. The first group is the comparison group, and the other 4 groups are divided into different treatment groups according to the intervention strength from low to high. The variable interval division is shown in Table 4, and the causal effects on \(T_{{\text{j}}} \to SI\), \(T_{{\text{j}}} \to S_{{{\text{sys}}}}\), \(T_{{\text{f}}} \to SI\), \(T_{{\text{f}}} \to S_{{{\text{sys}}}}\), \(P_{{{\text{G}}\left| 0 \right|}} \to SI\), and \(P_{{{\text{G}}\left| 0 \right|}} \to S_{{{\text{sys}}}}\) under different intervention strengths are shown in Fig. 6.

Table 4 Sample grouping with different intervention strengths
Fig. 6 Causal effect changes with different intervention strengths

As can be seen from Fig. 6a and b, when different strengths of intervention are applied to Tj, i.e., when \(Do(T_{{\text{j}}} )\) takes values of 2 s, 4 s, 5.9 s, and 7.9 s, the causal effect on the transient stability of the power system gradually increases. On the one hand, it intuitively shows that an increase in Tj has a positive causal effect on transient stability, while on the other hand, it shows that as the intensity of intervention on variable Tj increases, the causal effect will also change monotonically. Similarly, as shown in Fig. 6c and d, the causal effect of PG|0| on SI and Ssys varies monotonically with the intervention intensity, but in contrast to Tj, positive intervention on PG|0| has a negative causal effect. This implies that the average causal effect based on counterfactual inference can correctly reflect the effect strength of “cause” on “outcome”.

Figure 6e and f show the changes in the causal effect on power system transient stability with different strengths of intervention applied to Tf. As Do(Tf) increases from 0.02 s to 0.08 s, the causal effect on SI decreases from − 0.117 to − 3.375, and the causal effect on Ssys decreases from − 0.032 to − 0.683. The relationship between the intervention strength on Tf and the causal effect is also significantly non-linear. A plausible interpretation is that as Do(Tf) increases from 0.04 s to 0.06 s, a large number of samples that are transiently stable in the datasets are inferred to become critically unstable, so the probability of instability increases significantly. The stability margin of unstable samples becomes 0 and no longer changes with further increases in intervention strength. Therefore, when Do(Tf) increases from 0.06 s to 0.08 s, the causal effect barely changes.
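The grouped comparison used in this subsection can be sketched as follows: samples are split into five intervals of the cause variable, the first interval serves as the comparison group, and each remaining interval is compared against it. The interval edges, the synthetic data and the naive difference-of-means estimator are all assumptions for illustration. The actual intervals are those in Table 4. The synthetic margin is clipped at 0 to mimic the saturation described above.

```python
import numpy as np


def effects_by_strength(cause, outcome, edges):
    """Mean outcome of each treatment interval minus that of the first interval.

    edges: increasing interval boundaries; len(edges) - 1 groups in total.
    """
    cause = np.asarray(cause, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    # Group index 0 is the comparison group; 1, 2, ... are treatment groups.
    group = np.digitize(cause, edges[1:-1])
    baseline = outcome[group == 0].mean()
    n_groups = len(edges) - 1
    return [outcome[group == k].mean() - baseline for k in range(1, n_groups)]


rng = np.random.default_rng(2)
n = 10000
tf = rng.uniform(0.0, 0.10, size=n)                      # synthetic fault durations (s)
margin = np.clip(3.0 - 60.0 * tf, 0.0, None)             # margin saturates at 0
si = margin + rng.normal(0.0, 0.05, size=n)

edges = [0.0, 0.02, 0.04, 0.06, 0.08, 0.10]              # hypothetical intervals
effects = effects_by_strength(tf, si, edges)
for k, effect in enumerate(effects, start=1):
    print(f"treatment group {k}: effect = {effect:.3f}")
```

In this toy setting the effect grows in magnitude over the first treatment groups and then flattens once the margin has saturated, the same qualitative pattern as described for Tf.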

4.2 Causal inference of transient stability for multi-machine systems

To verify the effectiveness of the proposed method in solving the high-dimensional and complex problem of transient stability in power systems, further analysis is conducted based on the WSCC 9-bus standard test case, as shown in Fig. 7. The same data perturbation method as described in Sect. 3.3 is used to construct the observation datasets, with a sample size of \(S = 10^{4}\). In this case, variables are perturbed with a standard deviation taken as 10% to 15% of the expected value. The fault set contains three faults, which are three-phase short circuits at Bus 5, Bus 7, and Bus 9 at 0 s.

Fig. 7 Single-line diagram of WSCC 3-machine 9-bus system

The PC-IGCI method is applied to discover the causal structure of the WSCC 9-bus system, as shown in Fig. 8. The causal graph contains a total of 58 causal relationships, and the causalities show significant multi-machine coupling characteristics compared with the SMIB system, revealing a complex causal mechanism of a transient stability problem in multi-machine power systems.

Fig. 8 Transient stability structure causality of WSCC 9-bus system

The transient stability margin SI2 of generator 2 is taken for further analysis. The sample set is first screened by fault location, and only the samples with faults at Bus 7 are retained. The causal effects of the fault duration \(T_{{\text{f}}}\), the inertia time constants \(T_{{{\text{j}}1}}\), \(T_{{{\text{j2}}}}\), \(T_{{{\text{j}}3}}\), and the d-axis transient reactances \(X^{\prime}_{{{\text{d1}}}}\), \(X^{\prime}_{{{\text{d2}}}}\), \(X^{\prime}_{{{\text{d3}}}}\) on SI2 are examined. The causal effects (ACE) and the causal effects per unit intervention strength (RACE) are shown in Table 5, arranged in descending order of |RACE|. As seen from Table 5, under unit intervention the causal effect of \(T_{{{\text{j2}}}}\) on SI2 is the largest and that of \(X^{\prime}_{{{\text{d3}}}}\) is the smallest.

Table 5 Order of unit causal effects

A time-domain simulation model of the standard WSCC 9-bus system is also set up. The fault location is set at Bus 7, and the 7 cause variables listed in Table 5 are positively perturbed to 1.1 times the values in the standard case. The stability margin change indicator \(\Delta SI_{2}\) of generator 2 is shown in Table 6.

Table 6 Comparison of causal effects and model intervention experiments

Table 6 shows that the ranking of the \(\Delta SI_{2}\) index is the same as that of RACE, indicating that the RACE index can evaluate the causal effects obtained by counterfactual inference for variables of different dimensions. It can thus provide methodological support for further revealing the physical mechanisms of power system transient stability.

5 Conclusion

Current data-driven transient stability assessment methods mainly focus on constructing correlations between variables. Because they neglect the causal relationships between variables, they suffer from poor robustness and are difficult to interpret, which restricts their engineering application. Drawing on recent advances in causal theory, this paper addresses the power system transient stability problem and proposes improved methods for discovering causal structure and inferring causal effects based on operational datasets. The main conclusions are:

(1) An improved causal structure discovery method based on the PC-IGCI algorithm for datasets of power system transient stability assessment is proposed. This addresses the shortcomings of Markov equivalence and massive variables. It proves the feasibility of discovering the causal structure based on operational datasets.

(2) The RACE index is proposed to quantitatively evaluate the causal effect under unit intervention. It has the ability to evaluate the causal effect of counterfactual inference on different dimensional variables. RACE can be used to reveal the sorting of the causal effects, and to provide an approach for explaining the influence of different causes on the evaluation results of transient stability.

(3) Exploring the causal relationship between variables of transient stability assessment based on data expands the capabilities of data-driven methods and helps to understand the deeper mechanisms in the power system transient stability problem.

In future research, there exists the potential to develop a stable learning model for transient stability assessment that uses causal relationships. This approach would mitigate the incorporation of irrelevant local features from the data, facilitate the integration of identified causal relationships as constraints during the data-driven model learning phase, and effectively eliminate spurious correlations among variables. As a result, this strategy could effectively curtail overfitting tendencies and bolster the overall robustness of the model.

Availability of data and materials

Please contact the authors for data and material requests.


  1. Chen, Y., Shen, C., & Wang, J. (2009). Distributed transient stability simulation of power systems based on a Jacobian-Free Newton-GMRES method. IEEE Transactions on Power Systems, 24(1), 146–156.

    Article  Google Scholar 

  2. Magnusson, P. C. (1947). The transient-energy method of calculating stability. Transactions of the American Institute of Electrical Engineers, 66(1), 747–755.

    Article  Google Scholar 

  3. Xue, Y. (1992). Extended equal area criterion revisited. IEEE Transactions on Power Systems, 7(3), 1012–1022.

    Article  Google Scholar 

  4. An, J., Zhang, L., Zhou, Y., & Yu, J. (2022). Transient stability margin prediction under the concept of security region of power systems based on the long short-term memory network and attention mechanism. Frontiers in Energy Research, 10(332), 1–15.

    Google Scholar 

  5. Zhang, R. Y., Wu, J. Y., Li, B. Q., & Shoa, M. (2020). Self-adaptive power system transient stability prediction based on transfer learning. Power System Technology, 44(6), 2196–2205.

    Google Scholar 

  6. Zhou, Z. H., Bu, G. Q., Ma, S. C., Luo, Y., & Han, N. (2021). Assessment and optimization of power system transient stability based on feature-separated neural networks. Power System Technology, 45(9), 3658–3667.

    Google Scholar 

  7. An, J., Yu, J., Li, Z., Zhou, Y., & Mu, G. (2020). A data-driven method for transient stability margin prediction based on security region. Journal of Modern Power Systems and Clean Energy, 8(6), 1060–1069.

    Article  Google Scholar 

  8. Zhang, L., An, J., Zhou, Y. B. (2023) Transient stability evaluation of power system based on temporal convolution and graph attention network. In: Automation of electric power systems (pp. 1–12), Available:

  9. Bagnell, J. A. (2005) Robust supervised learning. In Proceedings of the 20th national conference on artificial intelligence (pp. 714–719), Menlo Park, CA.

  10. Hua, W. H., Niu, G., Sato, I., Sugiyama, M. (2018) Does distributionally robust supervised learning give robust classifiers?. In Proceedings of the 35th international conference on machine learning (pp. 2029–2037), Cambridge MA.

  11. Rahimian, H., Mehrotra, S. (2019) Distributionally robust optimization: A review. [Online]. arXiv preprint arXiv:1908.05659.

  12. Xu, H., Ma, Y., Liu, H. C., Deb, D., Liu, H., Tang, J. L., & Jain, A. K. (2020). Adversarial attacks and defenses in images, graphs and text: a review. International Journal of Automation and Computing, 17, 151–178.

    Article  Google Scholar 

  13. Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, Piscataway (pp. 618–626).

  14. Schwab, P., & Karlen, W. (2019). CXPlain: Causal explanations for model interpretation under uncertainty. Advances in neural information processing systems, 32(10220), 10230.

    Google Scholar 

  15. Madumal, P., Miller, T., Sonenberg, L. (2020). Explainable reinforcement learning through a causal lens. In Proceedings of the 34th AAAI conference on artificial intelligence, Palo Alto, CA (pp. 2493–2500).

  16. Kanamori, K., Takagi, T., Kobayashi, K., Ike, Y., Uemura, K., & Arimura, H. (2021). Ordered counterfactual explanation by mixed-integer linear optimization. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 13, pp. 11564-11574).

  17. Glymour, M., Pearl, J., & Jewell, N. P. (2016). Causal inference in statistics: A primer (pp. 1–160). John Wiley & Sons.

    MATH  Google Scholar 

  18. Hu, H., & Kerschberg, L. (2018). Evolving medical ontologies based on causal inference (pp. 954–957). Sanya: ASONAM.

    Google Scholar 

  19. Nugroho, F. A., Ederveen, T. H., Wibowo, A., Boekhorst, J., de Jonge, M. I., & Heskes, T. (2019). Application of a causal discovery model to study the effect of iron supplementation in children with iron deficiency anemia. In 2019 3rd international conference on informatics and computational sciences (ICICoS) (pp. 1–5). IEEE.

  20. Trenberth, K. E. (2012). Framing the way to relate climate extremes to climate change. Climatic Change, 115(2), 283–290.

    Article  Google Scholar 

  21. Nowack, P., Runge, J., Eyring, V., & Haigh, J. D. (2020). Causal networks for climate model evaluation and constrained projections. Nature Communications, 11(1), 1–11.

    Article  Google Scholar 

  22. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.

    Article  Google Scholar 

  23. Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect (pp. 23–51). Hachette.

    MATH  Google Scholar 

  24. Kalisch, M., & Buehlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. The Journal of Machine Learning Research, 8(1), 613–636.

    MATH  Google Scholar 

  25. Verma, T., Pearl, J. (1990). Equivalence and synthesis of causal models. In Proceedings of the 6th annual conference on uncertainty in artificial intelligence, Amsterdam (pp. 255–270).

  26. Spirtes, P., Glymour, C. N., & Scheines, R. (2000). Causation, prediction, and search. MIT Press.

    MATH  Google Scholar 

  27. Shimizu, S., Hoyer, P. O., Hyvärinen, A., Kerminen, A., & Jordan, M. (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(10), 2003–2030.

    MathSciNet  MATH  Google Scholar 

  28. Hoyer, P., Janzing, D., Mooij, J. M., Peters, J., & Schölkopf, B. (2008). Nonlinear causal discovery with additive noise models. Advances in Neural Information Processing Systems, 21, 689–696.

    MATH  Google Scholar 

  29. Mu, G., Chen, Q., & Liu, H. B. (2022). A reciprocal information entropy causal inference method for exploring the cause-effect relationship in power system operation data. Proceedings of the CSEE, 42(15), 5406–5416.

    Google Scholar 

  30. Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., & Schölkopf, B. (2012). Information-geometric approach to inferring causal directions. Artificial Intelligence, 182, 1–31.

    Article  MathSciNet  MATH  Google Scholar 

  31. Gang, M., Wang, Z. H., Han, Y. D., & Mei, H. (1993). A new method for quantitative assessment of the transient stability of power systems—trajectory analysis method. Proceedings of the CSEE, 13(3), 23–30.

    Google Scholar 

Download references




This work was supported by the National Natural Science Foundation of China (Grant No.: 51877034).

Author information




All authors contributed to the research, and read and approved the manuscript. JA proposed the initial concept of structural causal relationships and provided technical guidance throughout the research. YS modeled the simulation system and collected the data. GM proposed a method to calculate the causal effect under unit intervention intensity, which has been successfully applied to transient stability evaluation problems. YZ proposed the concept and index of reliability assessment of causality and constructed the framework of this paper.

Corresponding author

Correspondence to Jun An.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Zhou, Y., An, J., Mu, G. et al. An improved constraint-inference approach for causality exploration of power system transient stability. Prot Control Mod Power Syst 8, 59 (2023).
